Lightnews — Scholar-powered news

wellecks.bsky.social

@wellecks.bsky.social

Will future SWE agents be computer-use agents?

We explore this shift in Programming with Pixels: an agent environment where agents interact with VS Code to perform a wide variety of software engineering tasks

Code/agent environment: github.com/Programmingw...

Homepage: programmingwithpixels.com

Pranjal @pranjal2041.bsky.social · Feb 26

What if AI agents did software engineering like humans—seeing the screen & using any developer tool?

Introducing Programming with Pixels: an SWE environment where agents control VSCode via screen perception, typing & clicking to tackle diverse tasks.

programmingwithpixels.com

🧵

February 27, 2025 at 1:10 AM

Reposted

Pranjal

@pranjal2041.bsky.social

What if AI agents did software engineering like humans—seeing the screen & using any developer tool?

Introducing Programming with Pixels: an SWE environment where agents control VSCode via screen perception, typing & clicking to tackle diverse tasks.

programmingwithpixels.com

🧵

February 26, 2025 at 5:17 PM

wellecks.bsky.social

@wellecks.bsky.social

Big fan of this effort! Also check out our work on Inference Scaling Laws:

paper: arxiv.org/abs/2408.00724
code: github.com/thu-wyz/infe...

We study compute-optimal inference, develop a tree search with process reward models (REBASE), and find that smaller models often outperform larger ones

December 19, 2024 at 7:27 PM

wellecks.bsky.social

@wellecks.bsky.social

Check out our new work on grounding code generation with formal verification!

AlphaVerus generates Rust code that is provably correct via a new combination of tree search and refinement, along with a self-improvement loop that improves its capabilities over time

Pranjal @pranjal2041.bsky.social · Dec 10

LLMs often generate incorrect code.

Instead, what if they can prove code correctness?

Presenting AlphaVerus: A self-reinforcing method that automatically learns to generate correct code using inference-time search and verifier feedback.

🌐 : alphaverus.github.io

🧵

December 19, 2024 at 7:12 PM

Reposted

Matthew Finlayson

@mattf.nl

I’m proud of this tikz drawing I made today for our upcoming NeurIPS tutorial on decoding (our paper: arxiv.org/abs/2406.16838)

[Image: A diagram of how beam search works. The graphic is a tree with “Taylor swift is” at the root and possible continuations branching off.]

November 14, 2024 at 5:02 AM

Reposted

Akari Asai

@akariasai.bsky.social

1/ Introducing ᴏᴘᴇɴꜱᴄʜᴏʟᴀʀ: a retrieval-augmented LM to help scientists synthesize knowledge 📚
@uwnlp.bsky.social & Ai2
With open models & 45M-paper datastores, it outperforms proprietary systems & matches human experts.
Try out our demo!
openscholar.allen.ai

November 19, 2024 at 4:30 PM

wellecks.bsky.social

@wellecks.bsky.social

I was honored to give a talk at Simons Institute on inference-time algorithms and meta-generation!

simons.berkeley.edu/talks/sean-w...

It was a sneak-preview subset of our NeurIPS tutorial:
cmu-l3.github.io/neurips2024-...

November 21, 2024 at 9:44 PM
