@syhw.bsky.social
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution arxiv.org/abs/2502.18449 by Yuxiang, Sida, and the whole team!
Get started with your favorite model here github.com/facebookrese...
The recent DeepSeek-R1 release has demonstrated the immense potential of reinforcement learning (RL) in enhancing the general reasoning capabilities of large language models (LLMs). While DeepSeek-R1 ...
February 26, 2025 at 7:00 PM
EnCodec running in ffmpeg www.youtube.com/watch?v=5wlN...
ZML x FFmpeg
YouTube video by Steeve Morin
January 2, 2025 at 9:37 AM
Reposted
Thrilled to announce our new work TestGenEval, a benchmark that measures unit test generation and test completion capabilities. This work was done in collaboration with the FAIR CodeGen team.

Preprint: arxiv.org/abs/2410.00752
Leaderboard: testgeneval.github.io/leaderboard....
December 19, 2024 at 8:59 PM
That's a wrap #neurips2024
December 16, 2024 at 4:55 PM
Don't transform the code, code the transform! By Chris Cummins at #neurips2024
December 15, 2024 at 10:37 PM
Just gave a talk on "Grounding LLMs in Code Execution" at the NeurIPS Hacker-Cup AI Competition, here are the slides docs.google.com/presentation...
[NeurIPS HackerCup 2024] Grounding LLMs in Code Execution, by Gabriel Synnaeve (Meta, FAIR)
December 14, 2024 at 7:11 PM
Gonna be at NeurIPS starting tomorrow afternoon. See you there, in particular if you want to talk about codegen and (post-)LLM research!
December 9, 2024 at 11:03 PM
> Quality is free, but only to those willing to pay heavily for it.

> The major problems of our work are not so much technological as sociological in nature.

> Get the best people (cut out the deadwood), and make them happy. Turn them loose.
November 29, 2024 at 9:28 AM
Reposted
It's Sunday morning, so taking a minute for a nerdy thread (on math, tokenizers, and LLMs) about the work of our intern Garreth

By adding a few lines of code to the base Llama 3 tokenizer, he got a free boost in arithmetic performance 😮

[thread]
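The thread itself is elided here, so purely as an illustration: one well-known few-lines tokenizer tweak for arithmetic is to pre-split long digit runs into right-aligned groups of three before tokenization, so token boundaries line up with place value (thousands, millions, ...). A minimal sketch of that idea, as a pre-tokenization pass; this is a hypothetical example, not necessarily the change described in the thread:

```python
import re

def group_digits_r2l(text: str, group: int = 3) -> str:
    """Split long digit runs into right-aligned groups of `group` digits,
    separated by spaces, e.g. "1234567" -> "1 234 567", so that each group
    corresponds to a place-value chunk before the text reaches the tokenizer."""
    def split_run(m: re.Match) -> str:
        s = m.group(0)
        # Size of the leftmost group: the remainder, or a full group if it divides evenly.
        first = len(s) % group or group
        parts = [s[:first]] + [
            s[first + i : first + i + group]
            for i in range(0, len(s) - first, group)
        ]
        return " ".join(parts)
    # Apply only to runs of digits; everything else is left untouched.
    return re.sub(r"\d+", split_run, text)
```

Grouping from the right (rather than the left) matters because it keeps the same digit pattern aligned with the same magnitude across numbers of different lengths, which is what makes carrying and addition more regular for the model.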
November 24, 2024 at 11:05 AM