Lightnews — Scholar-powered news

Robert Isaacs

@r-b-i.bsky.social

21 followers 67 following 8 posts

Founder and CEO of Nine Minds. Developer of open source MSP PSA Alga PSA.


Reposted by Robert Isaacs

Ethan Mollick

@emollick.bsky.social

The Llama 4 model that won in LM Arena is different than the released version. I have been comparing the answers from Arena to the released model. They aren't close.

The data is also worth a look, as it shows how LM Arena results can be manipulated to be more pleasing to humans. t.co/rqAey9SMwh

April 8, 2025 at 2:10 AM

Robert Isaacs

@r-b-i.bsky.social

It looks like quantizing the DeepSeek V3/R1 models devastates their performance. I can say that after weeks of using them extensively (V3, then R1). Maybe something about the FP8 training and MoE architecture makes them particularly susceptible.

Always test on full weight, non-distilled DeepSeek.

February 2, 2025 at 4:59 PM

Robert Isaacs

@r-b-i.bsky.social

If you are using OpenRouter for access to DeepSeek, I *highly* suggest you curate the providers in your account settings. I've gotten some garbage responses from some, as if they're hosting distilled R1 as the real thing.

The top 4 in this post have worked well for me.

aider.chat/2025/01/28/d...

Alternative DeepSeek V3 providers

DeepSeek’s API has been experiencing reliability issues. Here are alternative providers you can use.

aider.chat

February 2, 2025 at 4:30 PM

Reposted by Robert Isaacs

The Verge

@theverge.com

Microsoft makes DeepSeek’s R1 model available on Azure AI and GitHub

Microsoft moves quickly to make R1 broadly available.

buff.ly

January 29, 2025 at 8:40 PM

Robert Isaacs

@r-b-i.bsky.social

I find it so interesting that Azure is hosting the DeepSeek R1 model now. On the one hand, they host a lot of models, but on the other, this one has the geopolitical angle, DeepSeek's upending OpenAI's biz model, and OpenAI's contentious relationship with Microsoft.

... and it's currently free. 🤯

January 30, 2025 at 1:31 AM

Reposted by Robert Isaacs

Sung Kim

@sungkim.bsky.social

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

- RL generalizes in rule-based envs, esp. when trained with an outcome-based reward
- SFT tends to memorize the training data and struggles to generalize OOD

January 29, 2025 at 1:43 PM
