Recently, @primeintellect.bsky.social announced that they have finished their 10B distributed training run, carried out across the world.
what is it exactly?
🧵
Deadline: November 25
www.ed.ac.uk/studying/pos...
If you are passionate about:
- adaptive tokenization and memory in foundation models
- modular deep learning
- computational typology
please message me or meet me at #NeurIPS2024!
Piotr Nawrot!
A repo & notebook on sparse attention for efficient LLM inference: github.com/PiotrNawrot/...
This will also feature in my #NeurIPS 2024 tutorial "Dynamic Sparsity in ML" with André Martins: dynamic-sparsity.github.io. Stay tuned!
👉 github.com/sordonia/pg_mb…
Part of "Dynamic Sparsity in ML" tut#neurips202424, feedback welcome and join for discussions! 😊