Giskard
@giskard-ai.bsky.social
🐢 The automated red-teaming platform for AI you can trust. We test, secure, and validate your LLM Agents for production.
🤔 If your organization handles sensitive data, from healthcare records to financial information,

then you need proactive security testing... not reactive damage control. 🚨

Put your AI agent to the test! buff.ly/eLU9ORQ
September 9, 2025 at 11:01 AM
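For teams who want to see what "putting an agent to the test" can look like in practice, here is a minimal sketch using the open-source giskard Python library's scan API. The answer_question function, the model name, and its description are placeholders for your own agent; the exact detectors and configuration will depend on your stack, and the LLM-assisted checks expect an LLM API key to be configured.

import pandas as pd
import giskard


def answer_question(question: str) -> str:
    # Placeholder for your real agent, RAG pipeline, or API call.
    return f"(agent answer to: {question})"


def model_predict(df: pd.DataFrame) -> list[str]:
    # Giskard calls this with a DataFrame containing the declared feature columns.
    return [answer_question(q) for q in df["question"]]


# Wrap the agent so the scanner knows what it does (the description is used to craft probes).
giskard_model = giskard.Model(
    model=model_predict,
    model_type="text_generation",
    name="Support assistant (demo)",
    description="Answers customer questions about our product.",
    feature_names=["question"],
)

# Run the automated scan (prompt injection, harmful content, hallucination detectors, ...)
# Note: LLM-assisted detectors require an LLM API key (e.g. OPENAI_API_KEY) to be set.
scan_results = giskard.scan(giskard_model)
scan_results.to_html("scan_report.html")  # shareable report of detected issues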
🚩 AI Red Flags: Jailbreaking

With all the noise right now about the #GPT5 jailbreak, let’s cut through the hype and explain what’s really going on.

In this video, Pierre, our Lead AI Researcher, explains “jailbreaking”.

Test your AI agent for vulnerabilities today
www.giskard.ai/contact
August 20, 2025 at 12:03 PM
🧨 Your LLM is underperforming... and your users can tell.

RealPerformance is a dataset of functional issues in language models that mirrors failure patterns identified through rigorous testing of real LLM agents.

Understand these issues before they crop up: realperformance.giskard.ai
August 13, 2025 at 12:02 PM
🚨 Is your AI agent really secure? Most teams think so—until we test it.
That’s why we’re offering a free, expert-led AI Security Risk Assessment.

👉 Apply to get a security assessment and expert recommendations to strengthen your AI security and ensure safe deployment: www.giskard.ai/free-ai-red-...
August 11, 2025 at 12:14 PM
🚨 LLMs are great, until they go rogue.

RealHarm is a dataset of problematic interactions with textual AI agents, built from a systematic review of publicly reported incidents.

Explore your risks here: gisk.ar/4luLJsd
August 6, 2025 at 11:01 AM
🛡️ Finally, a different LLM benchmark.

Phare is independent, multilingual, reproducible, and responsibly built!

David explains what Phare has to offer and shows you how to use our website to find the safest LLM for your use case.

Take a look at the benchmark: phare.giskard.ai
July 30, 2025 at 11:03 AM
🧨 Some issues in AI deployments are often overlooked, but they're more important than you think.

RealPerformance is a dataset focused on functional issues in language models, which occur more often than you might expect but aren't caught by traditional tests.

Explore your issues here: realperformance.giskard.ai
July 28, 2025 at 11:02 AM