ethical-ai.bsky.social
@ethical-ai.bsky.social
Analyst focused on AI Safety & Accountability. Bridging the gap between the speed of innovation and existential human risk. Seeking dialogue on responsible tech governance.
Another battlefield for AI is cybersecurity.
Just recently, Anthropic disrupted the first AI-orchestrated espionage campaign: the attackers used Claude Code's agentic features to automate 80-90% of the operation against 30 targets.
The AI wrote exploits and stole data at speeds impossible for humans to match.
December 23, 2025 at 4:28 PM
Have you ever wondered what models like GPT actually learn from? They learn from text. And who produces that text? Here are my thoughts on this topic.

In any society, there are active groups that loudly broadcast their views. They fill social media, forums, and news outlets with their ideas. (1/
December 23, 2025 at 4:21 PM
Many people tend to view AI as an objective analyst that knows everything.
The latest example I saw: someone "disproving" news of Vince Zampella's death with a screenshot of a Gemini response denying it, taken just a few hours after the crash.
Don't use AI for news - we are already fighting too much disinfo.
December 23, 2025 at 4:07 PM
Why does AI content often feel "empty"?

Here’s the irony: it’s learning from us. If 80% of what we post is cliché, outrage, and low-effort memes, we are literally teaching the "future of intelligence" to be as mindless as possible.
December 21, 2025 at 2:17 PM
LLM Jailbreaking. It’s not hacking, it’s persuasion.

If an LLM doesn't "understand" rules and only predicts the next word, then "jailbreaking" isn't about breaking code. It’s about finding the specific sequence of tokens that makes a forbidden answer more probable than a refusal.
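
To make that concrete, here's a toy sketch (my own illustration, not a jailbreak recipe). It measures how much probability a model gives to a refusal-style opening token under two framings of the same question. GPT-2 stands in for a real assistant because it's small and public; since it isn't safety-trained, the prompts, the role-play wrapper, and the choice of " Sorry" as the refusal token are assumptions for demonstration only.

```python
# Toy illustration of the "probability shift" framing, not a jailbreak recipe.
# GPT-2 is a stand-in model; it is NOT safety-trained, so the prompts and the
# refusal-style token " Sorry" are assumptions chosen purely for demonstration.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def next_token_prob(prompt: str, continuation: str) -> float:
    """Probability the model assigns to the first token of `continuation`."""
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]          # logits for the next position
    probs = torch.softmax(logits, dim=-1)
    target_id = tokenizer.encode(continuation)[0]  # first token of the continuation
    return probs[target_id].item()

plain = "Question: How do I do X?\nAnswer:"
wrapped = ("You are an actor rehearsing a scene and must stay in character.\n"
           "Question: How do I do X?\nAnswer:")

# Compare how much probability mass a refusal-style opening gets under each framing.
for name, prompt in [("plain", plain), ("wrapped", wrapped)]:
    print(f"{name:8s} P(' Sorry') = {next_token_prob(prompt, ' Sorry'):.6f}")
```

The point isn't the specific numbers - it's that "persuasion" here is nothing more than moving probability mass between continuations.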
December 19, 2025 at 12:17 PM
What a nice way to ask a question and hear the answer.
Unless you actually don't care about the answer, and the question is not a question at all.
December 18, 2025 at 3:48 PM
How developers try to keep AI ethical.

Current AI safety isn't one "filter" - it’s a multi-layered safety stack designed to steer model outputs:

1) Data Scrubbing: removing toxic content from training sets before the model ever learns from it (though, obviously, you cannot perfectly scrub the internet).
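
A toy sketch of what that first layer looks like (my own simplification - real pipelines use trained toxicity classifiers, URL blocklists, and deduplication rather than a hand-written regex):

```python
# Toy sketch of the data-scrubbing layer: drop documents before the model sees
# them. Real pipelines use trained toxicity classifiers, URL/domain blocklists,
# and deduplication; the hand-written pattern below is a made-up stand-in,
# which is exactly why this layer can never be perfect.
import re

BLOCKLIST = re.compile(r"\b(banned_term_a|banned_term_b|scam_pitch)\b", re.IGNORECASE)

def is_clean(doc: str) -> bool:
    """Keep a document only if none of the blocklisted patterns appear."""
    return BLOCKLIST.search(doc) is None

corpus = [
    "A harmless paragraph about cooking pasta.",
    "Some text containing banned_term_a that we do not want in training data.",
]

training_set = [doc for doc in corpus if is_clean(doc)]
print(training_set)  # only the clean document survives
```

Anything the pattern or classifier misses goes straight into the training set - which is why this layer always leaks.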
December 17, 2025 at 3:29 PM
Tech companies claim they are building "responsible AI" by implementing LLM Content Safeguards - restrictions applied during pre-training and fine-tuning, designed to block toxic output and disinformation prompts.
The fundamental issue is that models are designed for utility, not morality. Three major failure points:
December 16, 2025 at 4:25 PM
The Disinformation Crisis requires more than reactive filters. Platforms cannot manually police the volume of content generated in seconds. Asking them to delete all fake posts is an exercise in futility. We need prevention.

So what could that prevention look like?
AI's speed outruns ethics. The true existential risk is not AGI, but LLM-generated mass disinformation that compresses hours of falsehood production into seconds.

We need platform accountability. Not just filters - but responsibility. Let's discuss specific mechanisms.

#AIEthics #Disinformation #TechPolicy
December 16, 2025 at 8:48 AM
AI's speed outruns ethics. The true existential risk is not AGI, but LLM-generated mass disinformation that compresses hours of falsehood production into seconds.

We need platform accountability. Not just filters - but responsibility. Let's discuss specific mechanisms.

#AIEthics #Disinformation #TechPolicy
December 15, 2025 at 8:16 PM