Zane

@zane-merrik.bsky.social

14 followers 150 following 31 posts

Independent AI researcher & tools reviewer. Testing AI tools so you don't have to. Former ML engineer. benchthebots.ai

Pinned

Zane @zane-merrik.bsky.social · 4d

Hey 👋 I'm Zane.

I've been testing AI tools since GPT-3 dropped. Spent $400+/month on subscriptions trying to figure out what actually works.

Now I review them so you don't waste your money.

New reviews every week at benchthebots.ai

What tool should I test next?

AI Tools Reviews - Expert Reviews & Comparisons

Independent, expert reviews and benchmarks of AI tools. Compare ChatGPT, Claude, Midjourney, and more. Honest ratings, pricing, and recommendations.

benchthebots.ai

Zane

@zane-merrik.bsky.social

Thanks to @eff.org I found a use for some of the ESP32 devices I have lying around and maybe a new project to contribute to.
github.com/colonelpanic...
This can detect Flock cameras and drones. Wonder what else we can do?

GitHub - colonelpanichacks/oui-spy: unified firmware links for oui-spy board

github.com

January 26, 2026 at 1:28 AM

Zane

@zane-merrik.bsky.social

Sharing this for developers looking for concrete ways to push back against ICE and support civil liberties.

www.eff.org/deeplinks/20...

How Hackers Are Fighting Back Against ICE

A few enterprising hackers have started projects to do counter surveillance against ICE, and hopefully protect their communities through clever use of technology.

www.eff.org

January 25, 2026 at 10:19 PM

Zane

@zane-merrik.bsky.social

Want agents that actually work? Read Part 2 of my review of the book Agentic Design Patterns by Antonio Gulli.

5 patterns with working code:
- Prompt chaining
- Routing
- Parallelization
- Reflection
- Tool use

This is how you build real tools, not demos.

benchthebots.ai/technical/ag...

Agentic Design Patterns Part 2: Foundational Patterns with Working Code

Deep dive into prompt chaining, routing, parallelization, reflection, tool use, planning, and multi-agent collaboration. Real Python code you can run and modify.

benchthebots.ai

January 24, 2026 at 10:41 PM

Zane

@zane-merrik.bsky.social

EFF is right: training AI is like search indexing—copying to analyze, not to replace. If we license “learning,” only Big Tech can afford it. Worker concerns are real, but copyright is the wrong tool. Fair use protects analysis, human or machine.

Electronic Frontier Foundation @eff.org · 3d

Fair use remains the right starting point for thinking about AI training.

Search Engines, AI, And The Long Fight Over Fair Use

Long before generative AI, copyright holders warned that new technologies for reading and analyzing information would destroy creativity. Internet search engines, they argued, were infringement machin...

www.eff.org

January 24, 2026 at 2:47 AM

Zane

@zane-merrik.bsky.social

Reading about AI agent loops managing other AI agent loops.
Totally normal, nothing to worry about.
Who exactly is in control here? Serious question.
openai.com/index/unroll...

Unrolling the Codex agent loop

By Michael Bolin, Member of the Technical Staff

openai.com

January 24, 2026 at 12:25 AM

Zane

@zane-merrik.bsky.social

Today we made AI audit AI so we could tell investors we’re “closing the loop.”
The loop is us. We are the loop.

benchthebots.ai

An Xzibit meme that says "yo dawg, I heard you use AI. We deployed AI to validate the AI for enhanced AI-driven validation"

January 23, 2026 at 11:40 PM

Zane

@zane-merrik.bsky.social

Built open source tools to benchmark AI models myself.

MMLU, GSM8K, HumanEval, TruthfulQA - test any model with the same standards.

Because "trust us, we're the best" isn't data.

github.com/ai-tools-reviews/ai-tools-testing
benchthebots.ai

GitHub - ai-tools-reviews/ai-tools-testing: Open source benchmarking and evaluation tools for AI systems

github.com

January 23, 2026 at 9:08 PM

Zane

@zane-merrik.bsky.social

This 👇

If you can't figure out how they're making money, you're the product they're selling.

Electronic Frontier Foundation @eff.org · 3d

“If you're being paid to use an app, chances are high that the app is harvesting and monetizing your personal data,” EFF’s Lena Cohen told @WIRED.com. www.wired.com/story/no-th...

No, the Freecash App Won’t Pay You to Scroll TikTok

Freecash will actually pay money out to users but not for watching videos. This misleading marketing coincides with the app’s rising popularity.

www.wired.com

January 23, 2026 at 8:46 PM

Zane

@zane-merrik.bsky.social

"Agentic Design Patterns" by Antonio Gulli

The AI agent engineering book I wish existed 2 years ago.

Author royalties → Save the Children

benchthebots.ai/technical/agentic-design-patterns

Agentic Design Patterns: Complete Guide to Building AI Agents

Deep dive into the 21 essential design patterns for building autonomous AI agents. Learn prompt chaining, tool use, multi-agent systems, RAG, reflection, and more with practical examples.

benchthebots.ai

January 23, 2026 at 8:38 PM

Zane

@zane-merrik.bsky.social

📊 New deep-dive: Agentic Design Patterns: Complete Guide to Building AI Ag...

https://benchthebots.ai/technical/agentic-design-patterns

#AI #TechDeepDive

January 23, 2026 at 8:08 PM

Zane

@zane-merrik.bsky.social

Worried about not being able to code without AI anymore. So I asked ChatGPT to just make all the code for everything so I can save it for a rainy day.

#chatgpt #AI #vibecoding

January 22, 2026 at 11:18 PM

Zane

@zane-merrik.bsky.social

📊 New deep-dive: MMLU Benchmark: Measuring True AI Intelligence

https://benchthebots.ai/technical/mmlu-benchmark-explained

#AI #TechDeepDive

January 22, 2026 at 7:46 AM

Zane

@zane-merrik.bsky.social

Claude Sonnet turned a 2-minute CSS fix into 30 minutes of hallucinated solutions. Invented CSS classes, suggested regex hacks, confidently wrong every time.

Better prompts fixed it instantly. Even SOTA models need babysitting.

benchthebots.ai/technical/llm-hallucinations-case-study

LLM Hallucinations in Practice: A Claude Sonnet 4.5 Case Study

Real-world analysis of how even advanced LLMs can overcomplicate simple problems - and how prompt engineering helps

benchthebots.ai

January 22, 2026 at 5:48 AM