Akash Swamy
akashswamy.bsky.social
Product Manager @ Graphcore | My opinions are my own | I talk about AI, LLMs, Engineering, Products and of course Accelerators
GitHub’s Copilot has just undergone a substantial upgrade. It now features Agent mode, with multi-file edits using various LLMs. While the pricing seems competitive compared to its closest competitor, Cursor, it’s unclear what the monthly editing limit is.

github.blog/news-insight...
GitHub Copilot: The agent awakens
Introducing agent mode for GitHub Copilot in VS Code, announcing the general availability of Copilot Edits, and providing a first look at our SWE agent.
github.blog
February 8, 2025 at 12:12 PM
Reposted by Akash Swamy
I haven’t seen o3 yet & have been critical of benchmarks for AI but they did test against some of the hardest & best

On GPQA, PhDs with access to the internet got 34% outside their specialty, up to 81% inside. o3 is 87%.

Frontier Math went from the best AI at 2% to 25%

Some other big ones, too
December 21, 2024 at 6:27 AM
OpenAI’s o3 model surpassed expectations on the ARC-AGI benchmark with impressive reasoning skills. Not AGI (we still don’t know what that is), but a big leap. Fingers crossed for o3/o3-mini public access in the future.
#openai

arcprize.org/blog/oai-o3-...
OpenAI o3 Breakthrough High Score on ARC-AGI-Pub
OpenAI o3 scores 75.7% on ARC-AGI public leaderboard.
arcprize.org
December 20, 2024 at 9:07 PM
Spotify users have a delightful surprise this year in the form of a NotebookLM Wrapped podcast. Spotify Wrapped has always been an excellent summary of my listening trends, but this time you can actually listen to two AI-generated podcasters presenting it to you.
#spotify #notebooklm
December 4, 2024 at 2:35 PM
Reposted by Akash Swamy
We just updated the OLMo repo at github.com/allenai/OLMo!
There are now several training configs that together reproduce the training runs that led to the final OLMo 2 models.
In particular, all the training data is available, tokenized and shuffled exactly as we trained on it!
GitHub - allenai/OLMo: Modeling, training, eval, and inference code for OLMo
Modeling, training, eval, and inference code for OLMo - allenai/OLMo
github.com
December 2, 2024 at 8:13 PM
Interesting developments last week on small language models (SLMs). The trend is clear: models are getting better in FLOP efficiency and reasoning capability while keeping parameter counts small. Agentic workflows could become cheaper and better with these developments.
#llm #ai
allenai.org/blog/olmo2
OLMo 2: The best fully open language model to date | Ai2
Our next generation of fully-open base and instruct models sit at the Pareto frontier of performance and training efficiency.
allenai.org
November 30, 2024 at 11:30 AM
It’s uncertain whether the scaling laws will continue to hold, but we might witness numerous intriguing techniques emerging in the application layer.

bair.berkeley.edu/blog/2024/02...

#llm #compoundai
The Shift from Models to Compound AI Systems
The BAIR Blog
bair.berkeley.edu
November 26, 2024 at 10:41 AM
An amazing source for comparing LLM inference frameworks, inference hosting costs, and serverless options. llm.extractum.io
#llm #llmops #serverless #inference
LLM Explorer: A Curated Large Language Model Directory. LLM List. 38371 Open-Source Language Models.
Browse 38371 open-source large and small language models conveniently grouped into various categories and llm lists complete with benchmarks and analytics.
llm.extractum.io
November 25, 2024 at 4:40 PM