Tal Schuster
talschuster.bsky.social
Research Scientist at Google DeepMind working on Gemini and Adaptive Compute LLMs
Reposted by Tal Schuster
🥁Introducing Gemini 2.5, our most intelligent model with impressive capabilities in advanced reasoning and coding.

Now integrating thinking capabilities, 2.5 Pro Experimental is our most performant Gemini model yet. It’s #1 on the LM Arena leaderboard. 🥇
March 25, 2025 at 5:25 PM
Reposted by Tal Schuster
Making LLMs run efficiently can feel scary, but scaling isn’t magic, it’s math! We wanted to demystify the “systems view” of LLMs and wrote a little textbook called “How To Scale Your Model” which we’re releasing today. 1/n
February 4, 2025 at 6:54 PM
Reposted by Tal Schuster
*Relaxed Recursive Transformers*
by @talschuster.bsky.social et al.

Converts pre-trained transformers to a more efficient version by turning blocks of layers into a single layer which is iterated. Lots of interesting tricks!

arxiv.org/abs/2410.20672
December 18, 2024 at 10:28 AM
Will be at NeurIPS. Reach out if you're interested in discussing adaptive compute in LLMs or other topics.
December 10, 2024 at 6:43 AM
New Gemini model grabs first place in all domains.

Happy one year anniversary Gemini team!
December 6, 2024 at 6:49 PM
Reposted by Tal Schuster
Google just released a new Gemini model - gemini-exp-1206

I upgraded my llm-gemini plugin to support it and then got the best result yet for my "Generate an SVG of a pelican riding a bicycle" benchmark

simonwillison.net/2024/Dec/6/g...
December 6, 2024 at 6:08 PM
Dear algorithm,
I would like to view:
70% new ML and LLM research and cool results.
10% funny videos with cute animals.
5% sports (but no spoilers if I'm planning to watch the replay later).
5% travel and life hacks.
5% general tech.
5% random.
Regards,
November 29, 2024 at 1:59 PM