Lightnews — Scholar-powered news

Cosmin Stamate

@stamate.bsky.social

⚙️ The Core Idea

They call any layer that can read a separate context plus a query a “contextual layer”.

Stack this layer on top of a normal multilayer perceptron and you get a “contextual block”.

For that block, the context acts exactly like a rank 1 additive patch on the

--- ...
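The post is cut off before it says what the rank 1 patch applies to, so the following is only a sketch under an assumption: that the patch lands on a weight matrix `W` that consumes the contextual layer's output. All names (`rank_one_patch`, `a_query`, `a_context`) and the numbers are hypothetical. It checks one algebraic identity: running the unpatched weights on the context-aware input gives the same result as running patched weights on the query-only input.

```python
def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rank_one_patch(W, a_query, a_context):
    """Return dW with (W + dW) @ a_query == W @ a_context.

    dW = (W @ delta) a_query^T / ||a_query||^2, an outer product, hence rank 1.
    Assumes a_query is not the zero vector.
    """
    delta = [c - q for c, q in zip(a_context, a_query)]
    w_delta = matvec(W, delta)
    norm_sq = sum(q * q for q in a_query)
    return [[u * q / norm_sq for q in a_query] for u in w_delta]


# Hypothetical numbers, chosen only for illustration.
W = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]]
a_query = [1.0, 0.0, 2.0]     # layer output for the query alone
a_context = [1.5, -0.5, 2.0]  # layer output for context plus query

dW = rank_one_patch(W, a_query, a_context)
W_patched = [[w + d for w, d in zip(rw, rd)] for rw, rd in zip(W, dW)]

with_context = matvec(W, a_context)              # original weights, context seen
patched_no_context = matvec(W_patched, a_query)  # patched weights, no context
```

Here `with_context` and `patched_no_context` agree up to floating-point error, which is the sense in which the context behaves like an additive weight update in this sketch.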

July 25, 2025 at 8:13 AM

Cosmin Stamate

@stamate.bsky.social

ICML’s Statement about subversive hidden LLM prompts

We live in a weird timeline…

July 23, 2025 at 1:32 PM

Cosmin Stamate

@stamate.bsky.social

🚨 The era of infinite internet data is ending, so we ask:

👉 What’s the right generative modelling objective when data—not compute—is the bottleneck?

TL;DR:

▶️ Compute-constrained? Train autoregressive models

▶️ Data-constrained? Train diffusion models

Get ready for 🤿 1/n

--- ...

July 23, 2025 at 12:52 PM

Cosmin Stamate

@stamate.bsky.social

🚨 Finding #1: Diffusion models outperform autoregressive models when trained with sufficient compute (i.e., more epochs & parameters).

Across different unique data scales, we observe:

1️⃣ At low compute, Autoregressive models win.
2️⃣ After a certain amount of compute,

--- ...

July 23, 2025 at 12:09 PM

Cosmin Stamate

@stamate.bsky.social

Everyone get your top 1% quality dataset and train 100 epochs right now

---

paper: https://arxiv.org/abs/2507.15857

July 23, 2025 at 11:56 AM

Cosmin Stamate

@stamate.bsky.social

opencode making a pong game in vite+react using (4bit)
qwen/qwen3-235b-a22b-2507 locally, served by lmstudio.
It used like 130GB of RAM, 0 issues with tool calls.

This is completely usable locally now. ...
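As a rough sanity check on the RAM figure, here is a back-of-the-envelope sketch. It assumes exactly 4 bits per weight and 235 billion parameters (read off the model name); real 4-bit formats also store scaling metadata, so the true weight footprint is likely somewhat higher. Attributing the remainder to KV cache and runtime buffers is a guess, not something the post states.

```python
PARAMS = 235e9       # total parameters, from the "235b" in the model name
BITS_PER_WEIGHT = 4  # the "(4bit)" quantization mentioned in the post

weight_bytes = PARAMS * BITS_PER_WEIGHT / 8
weight_gb = weight_bytes / 1e9  # decimal gigabytes

observed_gb = 130  # "like 130GB" from the post
# Assumption: whatever is left over is KV cache, runtime buffers, and overhead.
other_gb = observed_gb - weight_gb

print(f"weights ~{weight_gb:.1f} GB, remainder ~{other_gb:.1f} GB")
```

Under these assumptions the weights alone account for most of the reported memory use.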

Creating Pong game in Vite React project

opencode.ai

July 23, 2025 at 11:51 AM

Cosmin Stamate

@stamate.bsky.social

Anthropic just released a research paper.

Inverse Scaling in Test-Time Compute

This study shows that longer reasoning in Large Reasoning Models (LRMs) can hurt performance—revealing a surprising inverse scaling between reasoning length and accuracy. ...

July 23, 2025 at 10:17 AM

Cosmin Stamate

@stamate.bsky.social

Companies are using fake humans and AI to do interviews now...

July 23, 2025 at 10:13 AM

Cosmin Stamate

@stamate.bsky.social

Best model with an OSI-approved license:

🇨🇳: R1, Qwen3

🇪🇺: Mistral Small

🇺🇸: IBM Granite

July 23, 2025 at 10:11 AM

Reposted by Cosmin Stamate

Jeff Dean

@jeffdean.bsky.social

Thanks for sharing your journey, @moji249.bsky.social! As I said on Twitter, it's really important for people to see that grad school has its ups and downs and that there are real times of struggle. Congratulations on the graduation! 🎉

Taha @moji249.bsky.social · Jun 28

The elephant in the room. There, I said it.

The Dark Side of Academia: Mental Health, Mentorship, and the Unspoken Struggles of an NLP…

Much unhappiness has come into the world because of bewilderment and things left unsaid. — Fyodor Dostoevsky I entered the Master’s in…

medium.com

June 28, 2025 at 5:33 AM

Cosmin Stamate

@stamate.bsky.social

🔄 DeepSeek-R1 is now MIT licensed for clear open access
🔓 Open for the community to leverage model weights & outputs
🛠️ API outputs can now be used for fine-tuning & distillation

January 20, 2025 at 3:08 PM

Cosmin Stamate

@stamate.bsky.social

DeepSeek just published their R1 repo, have fun everyone!!

January 20, 2025 at 12:38 PM

Reposted by Cosmin Stamate

Joel Z Leibo

@jzleibo.bsky.social

Our tutorial on cross-disciplinary insights on alignment is tomorrow

neurips.cc/virtual/2024...

NeurIPS Tutorial: Cross-disciplinary insights into alignment in humans and machines, NeurIPS 2024

neurips.cc

December 10, 2024 at 5:35 AM

Reposted by Cosmin Stamate

Alfredo Canziani

@alfcnz.bsky.social

Random student shows up on Friday evening: prof, I didn’t get xyz. Can you explain again?
Me: it’s Friday evening.
Student: *puppy eyes*
Me: okay, okay, let me fetch some coloured markers. 🎨🖌️🖼️

November 23, 2024 at 2:16 AM

Cosmin Stamate

@stamate.bsky.social

So I can ask ML questions here and people that actually understand and care will reply?

November 19, 2024 at 8:28 PM

Cosmin Stamate

@stamate.bsky.social

Bluesky now has over 10 million users, and I was #1,116,727!

September 19, 2024 at 4:51 PM
