Lightnews — Scholar-powered news

Kuzman Ganchev

@ganchev.bsky.social

Research Scientist at Google DeepMind (formerly at Google Research). UPenn graduate.

Reposted by Kuzman Ganchev

Ethan Mollick

@emollick.bsky.social

Study in Nature: “Across 30 out of 32 evaluation axes from the specialist physician perspective & 25 out of 26 evaluation axes from the patient-actor perspective, AMIE [Google Medical LLM] was rated superior to PCPs [primary care docs] while being non-inferior on the rest.”

(& AMIE is an older LLM)

May 4, 2025 at 1:27 PM

Reposted by Kuzman Ganchev

Gus

@gusthema.bsky.social

Gemma 3 explained: Longer context, image support, and a new 1B model. → goo.gle/4lV8iaw

Other key enhancements:
🔸 Best model that fits in a single consumer GPU or TPU host
🔸 KV-cache memory reduction with 5-to-1 interleaved attention
🔸 And more!

Read the blog for the full details on Gemma 3.

Gemma explained: What’s new in Gemma 3 - Google Developers Blog

Google's Gemma 3 model includes vision-language support and architectural changes for resource-friendly multimodal language models.

April 30, 2025 at 9:46 PM

Kuzman Ganchev

@ganchev.bsky.social

There's a link to a really nice interactive viewer for a sample of the data (it will only make sense after you read the post). There are some examples that I would have expected (where something is implied but not directly stated), but also a surprising number that are only loosely topical.

Tyler Chang @tylerachang.bsky.social · Dec 13

We scaled training data attribution (TDA) methods ~1000x to find influential pretraining examples for thousands of queries in an 8B-parameter LLM over the entire 160B-token C4 corpus!
medium.com/people-ai-re...

December 17, 2024 at 4:12 PM

Reposted by Kuzman Ganchev

Andreas Steiner

@andreaspsteiner.bsky.social

Want to get started using PaliGemma 2?

🎤 developers.googleblog.com/en/introduci...
🤗 huggingface.co/blog/paligem...
💾 kaggle.com/models/googl...
🔧 github.com/google-resea...

7/7

December 5, 2024 at 6:19 PM

Kuzman Ganchev

@ganchev.bsky.social

Wanted to share that Varun Godbole recently released a prompting playbook. The title says prompt tuning, but it covers text prompts, not soft prompts.

github.com/varungodbole...

GitHub - varungodbole/prompt-tuning-playbook: A playbook for effectively prompting post-trained LLMs

November 11, 2024 at 3:51 PM

Reposted by Kuzman Ganchev

Jacob Eisenstein

@jacobeisenstein.bsky.social

I’m pretty excited about this one!

ALTA is A Language for Transformer Analysis.

Because ALTA programs can be compiled to transformer weights, it provides constructive proofs of transformer expressivity. It also offers new analytic tools for *learnability*.

arxiv.org/abs/2410.18077