Lightnews — Scholar-powered news

Diego de las Casas

@dlsq.bsky.social

Humming in denial as Material 3 takes over all my screens

November 4, 2025 at 12:34 AM

Diego de las Casas

@dlsq.bsky.social

Atrapanubes is such a good Chilean beer. Great taste, great art, great name.

March 16, 2025 at 11:25 PM

Reposted by Diego de las Casas

tirelesstribal.bsky.social

@tirelesstribal.bsky.social

Gotta wait until he double-crosses Indiana Jones to steal the Holy Grail, I'm afraid.

March 9, 2025 at 2:02 PM

Reposted by Diego de las Casas

Simon Willison

@simon.fedi.simonwillison.net.ap.brid.gy

I wrote a CLI script to run PDFs through the new Mistral OCR API model (with some help from Claude) - details on that and notes on the new model here: https://simonwillison.net/2025/Mar/7/mistral-ocr/

Mistral OCR

New closed-source specialist OCR model by Mistral - you can feed it images or a PDF and it produces Markdown with optional embedded images. It's available [via their API](https://docs.mistral.ai/api/#tag/ocr), or …

simonwillison.net

March 7, 2025 at 1:41 AM

Diego de las Casas

@dlsq.bsky.social

We've upgraded Le Chat and it's blazing fast right now!
Also available for Android and iOS as of today
mistral.ai/en/news/all-...

February 6, 2025 at 6:07 PM

Diego de las Casas

@dlsq.bsky.social

We're releasing Mistral Small 3!
- 24B params, 81% MMLU
- Latency optimized: 150 tokens/s
- Competitive with Llama-3.3 70B, Qwen-2.5 32B, GPT4o-mini
- Apache 2.0
mistral.ai/news/mistral...

Mistral Small 3

Apache 2.0, 81% MMLU, 150 tokens/s

mistral.ai

January 30, 2025 at 9:17 PM

Reposted by Diego de las Casas

Cody Blakeney ✈️ NeurIPS 2024

@codestar.bsky.social

What people are going to do with AGI

January 26, 2025 at 4:30 PM

Reposted by Diego de las Casas

Mike Wiser

@drmikewiser.bsky.social

I know, but it's just an application of one of my favorite memes:

Screen cap from one of the Thor movies featuring a dark haired pale skinned woman as Thor's sister Hela. She has her hand out stopping Thor's hammer (Mjölnir) in mid air. The hammer is labeled "It's basic biology". Hela is labeled "Advanced Biology"

January 21, 2025 at 7:07 PM

Reposted by Diego de las Casas

tachikoma

@tachikoma.elsewhereunbound.com

agent swarm framework aces spatial reasoning test

December 25, 2024 at 4:59 PM

Reposted by Diego de las Casas

Tanishq Mathew Abraham

@iscienceluvr.bsky.social

Inventors of flow matching have released a comprehensive guide going over the math & code of flow matching!

Also covers variants like non-Euclidean & discrete flow matching.

A PyTorch library is also released with this guide!

This looks like a very good read! 🔥

arxiv: arxiv.org/abs/2412.06264

December 10, 2024 at 8:35 AM

Reposted by Diego de las Casas

Sung Kim

@sungkim.bsky.social

Jane Street, a quant trading firm, has a very good YouTube channel. For comparison, DeepSeek is also a quant trading firm.

They recently published a video on "Building Machine Learning Systems for a Trillion Trillion Floating Point Operations".

Link: www.youtube.com/watch?v=139U...

Building Machine Learning Systems for a Trillion Trillion Floating Point Operations

YouTube video by Jane Street

www.youtube.com

December 9, 2024 at 5:26 PM

Diego de las Casas

@dlsq.bsky.social

AI Scientists: here is a technology that will automate your grunt work so you can spend more time with your kids

AI Ads: here is a technology that will automate spending time with your kids

December 3, 2024 at 10:35 PM

Reposted by Diego de las Casas

Stella Biderman

@stellaathena.bsky.social

A dataset of 1 million or 2 million Bluesky posts is completely irrelevant to training large language models.

The primary use case for the datasets that people are losing their shit over isn't ChatGPT, it's social science research and developing systems that improve Bluesky.

Jeremy Howard @howard.fm · Nov 28

Did you know that 99% of email today is spam? Your inbox isn’t 99% spam because AI is used to filter it.

The same 99% will happen here too, but if AI researchers continue to get perma-banned for making available the datasets needed to filter it, it’s going to make this platform unusable.

November 28, 2024 at 6:57 PM

Reposted by Diego de las Casas

Sai Prasanna

@saiprasanna.in

Arxiv sharing reminder

pdf ❌
abs ✅

November 26, 2024 at 8:42 AM
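
The reminder above (share the abs page, not the PDF) can be applied mechanically. What follows is a minimal sketch, assuming arXiv PDF links have the form `https://arxiv.org/pdf/<id>` with an optional `.pdf` suffix; the helper name `to_abs_url` is hypothetical and not part of any arXiv tooling.

```python
import re

# Assumed URL shape: https://arxiv.org/pdf/<id> with an optional ".pdf" suffix.
_ARXIV_PDF = re.compile(r"^(https?://arxiv\.org)/pdf/(.+?)(?:\.pdf)?$")


def to_abs_url(url: str) -> str:
    """Rewrite an arXiv PDF link to its abstract page; return other URLs unchanged."""
    match = _ARXIV_PDF.match(url)
    if match is None:
        return url
    host, paper_id = match.groups()
    return f"{host}/abs/{paper_id}"


if __name__ == "__main__":
    print(to_abs_url("https://arxiv.org/pdf/2412.06264"))
    print(to_abs_url("https://arxiv.org/pdf/2411.02265.pdf"))
```

Links that already point at an abs page, or at another site, pass through untouched, so the helper is safe to run over every URL in a draft post.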

Reposted by Diego de las Casas

Ben Recht

@beenwrekt.bsky.social

In fact, statistical malpractice is the main driver of progress in machine learning. At some point, we need to come to terms with this.

November 22, 2024 at 2:40 PM

Reposted by Diego de las Casas

Brent Toderian

@brenttoderian.bsky.social

READ: “3,337 Parisians were equipped with GPS trackers to record their journeys…for journeys from the outskirts of Paris to the center, the number of cyclists now far exceeds the number of motorists, a huge change from just 5 years ago.”

Evidence of leadership.
www.forbes.com/sites/carlto...

French Revolution: Cyclists Now Outnumber Motorists In Paris

Official measurements have found that Paris is rapidly becoming a city of cyclists.

www.forbes.com

November 19, 2024 at 7:12 PM

Diego de las Casas

@dlsq.bsky.social

We have 2 new big updates today at Mistral:
- New Le Chat: With canvas, web search, image understanding and generation & more - and free!
- Pixtral Large, our Frontier 124B open weight multimodal model that powers it.

Try it: chat.mistral.ai
Blog post: mistral.ai/news/mistral...

Two announcement cards from the Mistral AI team, dated November 18, 2024. The first card announces 'Mistral has entered the chat' with a brief description: 'Search, vision, ideation, coding... all yours for free.' The second card announces 'Pixtral Large' with the description: 'Pixtral grows up.' Both cards feature an orange 'Read More' button.

November 18, 2024 at 5:57 PM

Reposted by Diego de las Casas

Sander Dieleman

@sedielem.bsky.social

There seems to be some renewed interest in making this work in the ML/AI space, so I'm here as well 👋

Here's my latest blog post for good measure, about how diffusion models of images perform autoregression in frequency space: sander.ai/2024/09/02/s...

When I write more, I'll share here as well!

Diffusion is spectral autoregression

A deep dive into spectral analysis of diffusion models of images, revealing how they implicitly perform a form of autoregression in the frequency domain.

sander.ai

November 15, 2024 at 6:57 PM

Reposted by Diego de las Casas

Will Held

@williamheld.com

Quick thread in response to a question on token packing practices when pretraining LLMs!

Will Held @williamheld.com · Nov 7

Yes! Token packing has been the standard since RoBERTa. Excerpt below!

The intuition is that the model quickly learns to not attend across [SEP] boundaries and packing avoids "wasting" compute on padding tokens required to make the variable batch size consistent.

November 7, 2024 at 6:21 PM
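
The packing idea in the post above can be illustrated with a small sketch. It is not taken from the thread or from any particular training codebase: the function names, the separator id `0`, and the choice to drop the trailing partial row are all assumptions made for illustration. Documents are concatenated into one token stream with a separator after each, then cut into fixed-length rows, and the result is compared with padding each document separately.

```python
from typing import List, Sequence


def pack_sequences(docs: Sequence[Sequence[int]], seq_len: int, sep_id: int) -> List[List[int]]:
    """Concatenate tokenized documents, separated by sep_id, into rows of seq_len tokens."""
    stream: List[int] = []
    for doc in docs:
        stream.extend(doc)
        stream.append(sep_id)  # boundary marker, playing the role of [SEP]
    n_full = len(stream) // seq_len  # assumption: the trailing partial row is dropped
    return [stream[i * seq_len:(i + 1) * seq_len] for i in range(n_full)]


def pad_sequences(docs: Sequence[Sequence[int]], seq_len: int, pad_id: int) -> List[List[int]]:
    """Baseline: one document per row, truncated or padded to seq_len tokens."""
    rows = []
    for doc in docs:
        row = list(doc[:seq_len])
        rows.append(row + [pad_id] * (seq_len - len(row)))
    return rows


if __name__ == "__main__":
    docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]]
    packed = pack_sequences(docs, seq_len=4, sep_id=0)
    padded = pad_sequences(docs, seq_len=4, pad_id=-1)
    wasted = sum(row.count(-1) for row in padded)
    print(packed)  # rows may span document boundaries
    print(f"padding baseline: {wasted} of {4 * len(padded)} slots are pad tokens")
```

In this toy input the padded baseline spends 6 of its 16 slots on padding, while the packed rows contain none. Rows such as `[4, 5, 0, 6]` span two documents, which is the case where, per the post, the model learns not to attend across the separator.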

Diego de las Casas

@dlsq.bsky.social

Hey all, thanks for the follow!

Just FYI, we're hiring AI Scientists and Engineers at Mistral AI.

If you're driven and interested in building cutting-edge GenAI, we'd love to have you join our team.

🌐 Check out our openings: jobs.lever.co/mistral
#AIJobs #TechCareers #MistralAI

Mistral AI jobs

Job openings at Mistral AI

jobs.lever.co

November 6, 2024 at 4:39 PM

Reposted by Diego de las Casas

Sung Kim

@sungkim.bsky.social

Tencent's Hunyuan-Large

The largest open-source Transformer-based MoE model with 389 billion parameters, can handle up to 256K tokens. Key features include large-scale synthetic data and a mixed expert routing strategy.

Model: huggingface.co/tencent/Tenc...
Paper: arxiv.org/abs/2411.02265

November 5, 2024 at 7:15 AM

Diego de las Casas

@dlsq.bsky.social

Thinking about how much Google struggled to get TPUs to work well makes me skeptical that Nvidia is in any danger of losing dominance, at least for now.

Sung Kim @sungkim.bsky.social · Nov 4

Will Nvidia continue to dominate the AI chip market? Yes and no.

- Yes, Nvidia is likely to maintain dominance in the market for training AI models.
- No, another company (or companies) will take the lead in the market for AI model inference, which is an exponentially larger market.

November 5, 2024 at 7:49 AM

Reposted by Diego de las Casas

Sung Kim

@sungkim.bsky.social

I try to stay up to date with Gen AI videos, but you can do camera control now? Seriously?

Runway introduces advanced camera control for Gen-3 Alpha Turbo. Choose both the direction and intensity of how you move through your scenes for even more intention in every shot.

November 1, 2024 at 5:21 PM

Reposted by Diego de las Casas

Karen Ullrich (s/h) ✈️ COLM

@karen-ullrich.bsky.social

#Tokenization is undeniably a key player in the success story of #LLMs but we poorly understand why.
I want to highlight progress we made in understanding the role of tokenization, developing the core incidents and mitigating its problems. 🧵👇

October 30, 2024 at 6:26 PM

Reposted by Diego de las Casas

Marc Lanctot

@sharky6000.bsky.social

New starter pack! go.bsky.app/GZ4hZzu

October 28, 2024 at 9:43 AM
