Marcin Junczys-Dowmunt (Marian NMT)
@marian-nmt.bsky.social
NLP. NMT. Main author of Marian NMT. Research Scientist at Microsoft Translator.

https://marian-nmt.github.io
Pinned
This place needs bookmarks.
Still no bookmarks?
February 5, 2025 at 7:03 AM
Hi, the Microsoft Translator research team is looking for an intern for the summer. If you are a PhD student in Machine Translation, Natural Language Processing, or a related field, check it out: aka.ms/mtintern
Search Jobs | Microsoft Careers
aka.ms
January 28, 2025 at 5:55 PM
Reposted by Marcin Junczys-Dowmunt (Marian NMT)
Just 10 days after o1's public debut, we’re thrilled to unveil the open-source version of the technique behind its success: scaling test-time compute

By giving models more "time to think," Llama 1B outperforms Llama 8B in math—beating a model 8x its size. The full recipe is open-source!
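The "more time to think" idea above can be illustrated with the simplest form of test-time compute scaling: sample many candidate answers from the same small model and aggregate them, e.g. by majority vote. This is a minimal sketch, not the full open-source recipe the post refers to; the `sample_answer` function is a hypothetical stand-in for a stochastic model call.

```python
import random
from collections import Counter

def sample_answer(problem: str, rng: random.Random) -> str:
    # Hypothetical stand-in for one stochastic LLM generation.
    # Here we simulate a solver that returns the right answer 60% of
    # the time and a random wrong digit otherwise.
    return "42" if rng.random() < 0.6 else str(rng.randint(0, 9))

def majority_vote(problem: str, n: int, seed: int = 0) -> str:
    """Scale test-time compute: draw n samples, return the most common answer.

    A weak model sampled many times can beat a single greedy pass,
    which is the intuition behind small models outperforming larger
    ones when given more inference-time budget.
    """
    rng = random.Random(seed)
    answers = [sample_answer(problem, rng) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```

With 64 samples the correct answer dominates the vote even though each individual sample is only right 60% of the time; real recipes replace majority voting with reward-model-guided search over reasoning traces.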
December 16, 2024 at 9:42 PM
Rant: Apparently every vector-based sentence alignment tool insists on having an unusable file-based API.
December 16, 2024 at 9:48 PM
Reposted by Marcin Junczys-Dowmunt (Marian NMT)
Wrote up some notes on Microsoft's new Phi-4 LLM. They trained it on a LOT of synthetic data, and the details of how and why they did that are really interesting.
https://simonwillison.net/2024/Dec/15/phi-4-technical-report/
Phi-4 Technical Report
Phi-4 is the latest LLM from Microsoft Research. It has 14B parameters and claims to be a big leap forward in the overall Phi series. From [Introducing Phi-4: Microsoft’s Newest …
simonwillison.net
December 16, 2024 at 12:21 AM
Reposted by Marcin Junczys-Dowmunt (Marian NMT)
It's messier, but I think this one slaps the point home a bit stronger by adding the giant squid footage. I think unique weather, like lightning sprites, would make the point just as well.
December 14, 2024 at 8:25 PM
Reposted by Marcin Junczys-Dowmunt (Marian NMT)
the anthropomorphizing in this LLM scheming paper is through the roof and the interpretations are wild, but still a cute set of experiments and a fun skim, showing some interesting behaviors.

arxiv.org/abs/2412.04984
Frontier Models are Capable of In-context Scheming
Frontier models are increasingly trained and deployed as autonomous agents. One safety concern is that AI agents might covertly pursue misaligned goals, hiding their true capabilities and objectives - ...
arxiv.org
December 13, 2024 at 9:36 AM
Reposted by Marcin Junczys-Dowmunt (Marian NMT)
🚀 Introducing the Byte Latent Transformer (BLT) – An LLM architecture that scales better than Llama 3 using patches instead of tokens 🤯
Paper 📄 dl.fbaipublicfiles.com/blt/BLT__Pat...
Code 🛠️ github.com/facebookrese...
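To make "patches instead of tokens" concrete: rather than a fixed subword vocabulary, BLT groups raw bytes into variable-length patches. The sketch below uses a toy boundary rule (break at whitespace or a length cap); the actual BLT sizes patches with a learned next-byte entropy model, so this is only an illustrative stand-in.

```python
def to_patches(data: bytes, max_patch: int = 4) -> list[bytes]:
    """Toy byte patching: start a new patch at whitespace or at max_patch bytes.

    BLT itself decides boundaries with a small entropy model (new patch
    where the next byte is hard to predict); this rule just shows the
    shape of the idea: variable-length byte groups, no token vocabulary.
    """
    patches: list[bytes] = []
    cur = bytearray()
    for b in data:
        cur.append(b)
        if b in b" \n" or len(cur) >= max_patch:
            patches.append(bytes(cur))
            cur = bytearray()
    if cur:
        patches.append(bytes(cur))
    return patches
```

Because patching is lossless (concatenating the patches reproduces the input bytes exactly), the model can operate on fewer, larger units without ever committing to a tokenizer.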
December 13, 2024 at 4:53 PM
So... no edit button, huh?
December 13, 2024 at 11:33 PM
This place needs bookmarks.
December 13, 2024 at 6:28 PM
Hi!
December 11, 2024 at 8:13 AM