Lightnews — Scholar-powered news

Mark Pors

@pors.bsky.social

47 followers 250 following 38 posts

AI engineer. Previously co-founder and CTO at WatchMouse. Building https://paperzilla.ai


Mark Pors

@pors.bsky.social

Ah look at that! A study on #moltbook: turns out the "AI sentience" and "Lobster Religion" stuff was humans fucking around (as expected).

The researcher uses some smart #openclaw forensics to distinguish Molties from humans. 👇

OpenClaw forensics to distinguish human from Molty

February 10, 2026 at 8:06 AM

Mark Pors

@pors.bsky.social

Discussing the researcher dilemma of FOMO vs. Trust with my #openclaw co-founder for Paperzilla. It covers the user problem, the solution, and the UX in a single reply 🤯. Here is just a small part of it:

February 8, 2026 at 8:59 AM

Mark Pors

@pors.bsky.social

OK, so about 17,000 papers on Google Scholar currently have a "utm_source=chatgpt" tag in their links. (Yes, really). That is 17k researchers just copying and pasting directly from ChatGPT without even cleaning up the URL.

It's getting better, read on 👇

Almost 17,000 papers on Google Scholar currently have a "utm_source=chatgpt" tag in their links

February 7, 2026 at 7:56 AM
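
For anyone wondering what "cleaning up the URL" would look like, here is a minimal sketch using only the Python standard library. The function name `strip_tracking`, the `utm_` prefix rule, and the example URL are assumptions for illustration; none of them come from the post or the linked write-up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tracking(url, prefixes=("utm_",)):
    """Return url without query parameters whose names start with a tracking prefix.

    Hypothetical helper: the prefix list is an assumption, not an exhaustive
    list of tracking parameters.
    """
    parts = urlsplit(url)
    # Keep every query pair except those that look like tracking tags.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(prefixes)
    ]
    # Rebuild the URL with the filtered query string; path and fragment are untouched.
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    # Made-up example URL, not one of the papers mentioned in the post.
    print(strip_tracking("https://example.org/paper?id=42&utm_source=chatgpt"))
```

Run against a pasted link, this drops the `utm_source=chatgpt` pair and leaves the remaining parameters in their original order.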

Mark Pors

@pors.bsky.social

Paperzilla found me a new interesting paper! Most "fixes" for AI hallucinations are band-aids. This one is different.

Token-Guard modifies the actual decoding process. It uses a monitor to check consistency token-by-token.

Result: Cleaner output, fewer lies, no extra prompting needed.

Summary 👇

Token-Guard: control hallucinations in LLM-generated answers

February 6, 2026 at 8:19 AM

Mark Pors

@pors.bsky.social

New paper: if you convince an AI of something *before* giving it a research task, it gets lazy.

Like, 27% less searching lazy.

"Why verify when I already know I'm right?"
--Me, also AI apparently.

"Persuasion propagation" they call it.

We call it confirmation bias, no?

February 3, 2026 at 11:47 AM

Mark Pors

@pors.bsky.social

Scaling math AI just got cheap. TheoremForge hits $0.48 per Lean proof using Gemini-Flash. Agentic workflows > expensive models. Pretty cool paper out of China, and a GitHub repo to back it up!

Full Paperzilla summary in comment.

TheoremForge: Scaling up Formal Data Synthesis with Low-Budget Agentic Workflow

February 2, 2026 at 6:06 PM

Mark Pors

@pors.bsky.social

This paper claims to prove P ≠ NP.

The author has been refining this proof for 6 years. 14 versions! Latest update dropped 3 days ago, see the Paperzilla summary below.

The proof is probably not correct (and I certainly don't have the math skills to confirm that), but the persistence is amazing.

January 29, 2026 at 8:42 AM

Mark Pors

@pors.bsky.social

"AI-generated code is slop that needs constant human fixing"

New study: Actually, AI code survives longer than human code. 16% lower modification rate across 200K+ lines of code.

Full Paperzilla summary in the comments.

January 27, 2026 at 1:42 AM

Mark Pors

@pors.bsky.social

Remember overfitting? It's back, but make it RAG.

Researchers show that when RAG systems get "insider knowledge" of how LLM judges evaluate them, they achieve near-perfect scores by gaming the metrics, not by actually improving.

Full Paperzilla summary in the comments

#rag #ai #LLM #AIEvaluation

January 26, 2026 at 3:00 AM

Mark Pors

@pors.bsky.social

New research tracked LLM adoption across 2M+ scientific papers.

AI made everyone's writing fancier, so now you can't tell the good papers from the bad ones by reading them 😬

January 23, 2026 at 4:07 AM
