Spandan Karma Mishra
@spandyie.bsky.social
AI @ PANW, Statistical learning, Bayesian, Rock climber, history buff & Nepali
Reposted by Spandan Karma Mishra
Huh. Looks like Plato was right.

A new paper shows all language models converge on the same "universal geometry" of meaning. Researchers can translate between ANY model's embeddings without seeing the original text.

Implications for philosophy and vector databases alike. arxiv.org/pdf/2505.12540
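For intuition, here's a minimal numpy sketch of the classical *supervised* version of embedding translation: aligning two spaces with an orthogonal Procrustes map. Note this baseline needs paired vectors; the paper's striking claim is that translation works without pairs or the original text, and all names here are illustrative, not from the paper.

```python
import numpy as np

# Toy supervised baseline for embedding translation (orthogonal
# Procrustes). NOT the paper's unsupervised method; just shows that a
# simple map can align two spaces when pairs are available.
rng = np.random.default_rng(0)
A = rng.normal(size=(1000, 64))                  # embeddings from "model A"
Q, _ = np.linalg.qr(rng.normal(size=(64, 64)))   # hidden rotation between spaces
B = A @ Q + 0.01 * rng.normal(size=(1000, 64))   # noisy "model B" embeddings

# W = argmin over orthogonal W of ||A W - B||_F, solved via SVD of A^T B
U, _, Vt = np.linalg.svd(A.T @ B)
W = U @ Vt
print(np.linalg.norm(A @ W - B) / np.linalg.norm(B))  # ~0.01: spaces align
```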
May 23, 2025 at 2:44 AM
Reposted by Spandan Karma Mishra
AI Agents vs. Agentic #AI: A Conceptual Taxonomy, Applications and Challenges (preprint) arxiv.org/abs/2505.10468
May 17, 2025 at 1:55 PM
Reposted by Spandan Karma Mishra
BREAKING NEWS: The White House has begun the process of looking for a new secretary of defense, according to a U.S. official who was not authorized to speak publicly.
The White House has begun process of looking for new secretary of defense
www.npr.org
April 21, 2025 at 5:25 PM
Reposted by Spandan Karma Mishra
They show LMs can synthesize their own thoughts for more data-efficient pretraining, bootstrapping their capabilities on limited, task-agnostic data. They call this new paradigm “reasoning to learn”.
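A hedged sketch of the loop shape this describes: the LM synthesizes a "thought" about the upcoming corpus text, then the training loss is computed on the real tokens conditioned on that thought. `model.sample` and `model.nll` are hypothetical stand-ins, not the paper's API.

```python
# Hedged sketch of a "reasoning to learn" pretraining step.
# `model.sample` / `model.nll` are hypothetical stand-ins.

def pretrain_step(model, prev_text: str, next_text: str) -> float:
    # The LM writes its own latent reasoning about the context.
    thought = model.sample(prompt=prev_text + "\n<thought>")
    # Only the real corpus tokens contribute to the training loss.
    return model.nll(context=prev_text + thought, target=next_text)
```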
March 27, 2025 at 3:55 AM
Reposted by Spandan Karma Mishra
PapersChat – Chat with Research Papers

PapersChat provides an agentic AI interface for querying papers, retrieving insights from ArXiv & PubMed, and structuring responses efficiently.

github.com/AstraBert/Pa...
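Under the hood, agents like this sit on top of plain retrieval APIs. A minimal sketch of querying arXiv directly, using the public arXiv Atom API rather than PapersChat's own code:

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Query the public arXiv Atom API (the kind of retrieval an agentic
# paper-chat tool wraps); search terms here are arbitrary examples.
query = urllib.parse.urlencode(
    {"search_query": "all:test-time scaling", "max_results": 3}
)
with urllib.request.urlopen(f"http://export.arxiv.org/api/query?{query}") as r:
    feed = ET.fromstring(r.read())

ns = {"atom": "http://www.w3.org/2005/Atom"}
for entry in feed.findall("atom:entry", ns):
    print(entry.find("atom:title", ns).text.strip())
```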
March 10, 2025 at 4:47 AM
Reposted by Spandan Karma Mishra
French Senator Claude Malhuret:

"Washington has become Nero’s court, with an incendiary emperor, submissive courtiers and a jester high on ketamine... We were at war with a dictator, we are now at war with a dictator backed by a traitor."
March 5, 2025 at 3:47 PM
Reposted by Spandan Karma Mishra
A few words on DeepSeek's new releases. Links are:
- github.com/deepseek-ai/...
- github.com/deepseek-ai/...
- github.com/deepseek-ai/...
and the Ultra-Scale Playbook at huggingface.co/spaces/nanot...
February 27, 2025 at 1:41 PM
Reposted by Spandan Karma Mishra
Just read the s1: Simple Test-Time Scaling paper. Super interesting approach to improving reasoning models!

TL;DR:
1. SFT on 1k curated examples w/ reasoning traces.
2. Control response length w/ budget forcing:
"Wait" tokens → longer reasoning/self-correction.
"Final Answer:" → enforce stopping.
February 7, 2025 at 2:00 PM
Reposted by Spandan Karma Mishra
Maybe a hot take, but what about the following advice to the next gen:
Don't get an AI degree; the curriculum will be outdated before you graduate. Instead, study math, stats, or physics as your foundation, and stay current with AI through code-focused books, blogs, and papers.
February 9, 2025 at 3:36 PM
Reposted by Spandan Karma Mishra
Bison should be allowed to roam free and cattle should be restricted to private land.
All abandoned barbed wire should be removed from public land.
The money today being wasted on public lands grazing should go into building wildlife overpasses and installing wildlife safe guide fencing.
February 7, 2025 at 4:46 PM
Reposted by Spandan Karma Mishra
Not one VC would ever fund a startup to do the kind of hardcore optimization work that DeepSeek did.

Every VC firm should be asking themselves why.
January 28, 2025 at 5:00 AM
Reposted by Spandan Karma Mishra
Finally finally finally some scaling curves for imitation learning in the large-scale-data regime: arxiv.org/abs/2411.04434
Scaling Laws for Pre-training Agents and World Models
The performance of embodied agents has been shown to improve by increasing model parameters, dataset size, and compute. This has been demonstrated in domains from robotics to video games, when generat...
arxiv.org
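Scaling curves in this literature are usually fit with a saturating power law, L(N) = a·N^(−α) + c. A sketch of fitting that form with scipy; the functional form is the standard one, but the data points below are made up for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

# Fit the standard saturating power law to (synthetic) loss-vs-scale
# points; the numbers are illustrative, not from the paper.
def power_law(N, a, alpha, c):
    return a * N ** (-alpha) + c

N = np.array([1e6, 1e7, 1e8, 1e9, 1e10])
loss = power_law(N, a=50.0, alpha=0.3, c=1.8) + 0.01 * np.random.randn(5)

params, _ = curve_fit(power_law, N, loss, p0=[10.0, 0.2, 1.0])
print("fitted a, alpha, c:", params)
```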
January 20, 2025 at 2:48 PM
Reposted by Spandan Karma Mishra
And here's a great project from a reader who trained a tokenizer from scratch on Nepali: github.com/rasbt/LLMs-f...
GPT2-Nepali (Pretrained from scratch) · rasbt LLMs-from-scratch · Discussion #485
Hi everyone! 👋 I’m excited to share my recent project: GPT2-Nepali, a GPT-2 model pretrained from scratch for the Nepali language. This project builds upon the GPT-2 model training code detailed in...
github.com
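For context, training a BPE tokenizer from scratch is a few lines with the Hugging Face `tokenizers` library. A minimal sketch, with a placeholder corpus path; this is not the linked project's exact code.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Train a GPT-2-sized BPE tokenizer from scratch; the corpus file is a
# placeholder, not the linked project's data.
tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

trainer = trainers.BpeTrainer(
    vocab_size=50257,                              # GPT-2-sized vocabulary
    special_tokens=["<|endoftext|>", "[UNK]"],
)
tokenizer.train(files=["nepali_corpus.txt"], trainer=trainer)
tokenizer.save("gpt2-nepali-tokenizer.json")

print(tokenizer.encode("नमस्ते संसार").tokens)
```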
January 19, 2025 at 4:37 PM
Reposted by Spandan Karma Mishra
Nice and fresh content to understand how Large Language Models work: arxiv.org/abs/2501.09223 #LLM #NLP
Foundations of Large Language Models
This is a book about large language models. As indicated by the title, it primarily focuses on foundational concepts rather than comprehensive coverage of all cutting-edge technologies. The book is st...
arxiv.org
January 19, 2025 at 3:17 PM
Reposted by Spandan Karma Mishra
This is a wonderfully simple blog on how tensors flow through a transformer model.

Covering:
- Tokenize
- Embed
- Positional Encoding
- Decoder
- Multi-Head Attention
- Add and normalize
- Feed-Forward
- Model Head
- Cross-Attention

Blog:
Mastering Tensor Dimensions in Transformers
A Blog post by Hafedh Hichri on Hugging Face
buff.ly
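The gist of the blog in runnable form: a PyTorch shape walk-through of the same pipeline. The sizes (batch 2, sequence 8, d_model 64, 4 heads) are arbitrary toy values, and modules are instantiated inline just to show dimensions.

```python
import torch
import torch.nn as nn

# Toy shape walk-through of a transformer block; sizes are arbitrary.
B, T, d_model, n_heads, vocab = 2, 8, 64, 4, 1000

tokens = torch.randint(0, vocab, (B, T))               # (B, T)
x = nn.Embedding(vocab, d_model)(tokens)               # (B, T, d_model)
x = x + torch.randn(1, T, d_model)                     # + positional encoding

attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
attn_out, _ = attn(x, x, x)                            # (B, T, d_model)
x = nn.LayerNorm(d_model)(x + attn_out)                # add & normalize

ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                    nn.Linear(4 * d_model, d_model))
x = nn.LayerNorm(d_model)(x + ffn(x))                  # feed-forward + add & norm

logits = nn.Linear(d_model, vocab)(x)                  # model head
print(logits.shape)                                    # torch.Size([2, 8, 1000])
```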
January 14, 2025 at 1:00 PM
Reposted by Spandan Karma Mishra
Free Our Feeds! What is it? @freeourfeeds.com

F.O.F. is an independent group with the goal of running THIS👇 social network totally outside of Bluesky.

It's not us. It's a fully independent version of the network. All the same users and posts. Running cooperatively with us and others.
January 13, 2025 at 9:03 PM
Reposted by Spandan Karma Mishra
If you’re an AI startup, or interviewing w/ one, ask:

What are you the best in the world at?

Do you offer a service, formula, or delivery method you invented?

Is there something you do that’s patentable or a unique user experience?

Have you identified and isolated a market segment?

If not, walk
January 5, 2025 at 10:33 PM
Happy New Year 2025!
January 1, 2025 at 6:53 PM
Reposted by Spandan Karma Mishra
Very interesting paper by Ananda Theertha Suresh et al.

For categorical/Gaussian distributions, they derive the rate at which a sample is forgotten to be 1/k after k rounds of recursive training (hence 𝐦𝐨𝐝𝐞𝐥 𝐜𝐨𝐥𝐥𝐚𝐩𝐬𝐞 happens more slowly than intuitively expected)
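A toy simulation of the recursive-training setup the paper analyzes: fit a Gaussian, resample from the fit, refit, repeat. This only illustrates the setup drifting over rounds; the 1/k forgetting rate itself is the paper's analytical result, not something this snippet proves.

```python
import numpy as np

# Recursive training on a Gaussian: each round refits on the previous
# round's own samples. Illustrative only; not the paper's analysis.
rng = np.random.default_rng(42)
n, k = 1000, 20
samples = rng.normal(loc=0.0, scale=1.0, size=n)   # round-0 "real" data

for round_ in range(1, k + 1):
    mu, sigma = samples.mean(), samples.std()
    samples = rng.normal(mu, sigma, size=n)        # train on own outputs
    if round_ in (1, 5, 10, 20):
        print(f"round {round_:2d}: mu={mu:+.3f} sigma={sigma:.3f}")
```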
December 27, 2024 at 11:35 PM
Reposted by Spandan Karma Mishra
Releasing a dataset of 40 million Bluesky posts!

Collected using the Firehose API, I hope people do some cool ML with it.

Anonymized with a data removal mechanism and includes text, language predictions, and image data.

#ai #ml #NLP

huggingface.co/datasets/Ara...
Aranym/40-million-bluesky-posts · Datasets at Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
huggingface.co
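Getting started is a two-liner with `datasets` streaming. A sketch using the repo name from the card; the column names below ("text") are a guess from the post's description, so check `ds.features` or the dataset card for the real schema.

```python
from datasets import load_dataset

# Stream a few posts without downloading the full 40M-row dataset.
# Column names are guessed from the post; verify against the card.
ds = load_dataset("Aranym/40-million-bluesky-posts",
                  split="train", streaming=True)
for i, post in enumerate(ds):
    print(post.get("text", post))   # fall back to the raw record
    if i == 4:
        break
```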
December 17, 2024 at 3:25 PM
Reposted by Spandan Karma Mishra
A short list of tips for keeping a clean, organized ML codebase for new researchers: eugenevinitsky.com/posts/quick-...
Eugene Vinitsky
eugenevinitsky.com
December 18, 2024 at 8:00 PM
Reposted by Spandan Karma Mishra
Hey all, I've been a bit quiet the last couple of weeks as I am recovering from an accident & injury.

Unfortunately, I couldn’t write my yearly AI research review this year, but here’s at least a list of bookmarked papers you might find useful: magazine.sebastianraschka.com/p/llm-resear...
LLM Research Papers: The 2024 List
A curated list of interesting LLM-related research papers from 2024, shared for those looking for something to read over the holidays.
magazine.sebastianraschka.com
December 22, 2024 at 2:02 PM
Reposted by Spandan Karma Mishra
New work from my team at Anthropic in collaboration with Redwood Research. I think this is plausibly the most important AGI safety result of the year. Cross-posting the thread below:
December 18, 2024 at 5:47 PM
Reposted by Spandan Karma Mishra
LLMs might secretly be world models of the internet!

By treating LLMs as simulators that can predict "what would happen if I click this?" the authors built an AI that can navigate websites by imagining outcomes before taking action, performing 33% better than baseline. arxiv.org/pdf/2411.06559
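A hedged sketch of that simulate-before-acting loop: imagine each candidate click's outcome with the LLM, score the imagined page against the task, then act on the best one. Here `llm` stands for any text-completion function, and the prompts are illustrative, not the paper's actual prompts or scoring scheme.

```python
# Hedged sketch of LLM-as-world-model action selection; `llm` is a
# hypothetical text-completion function.

def choose_action(llm, page: str, task: str, candidate_actions: list[str]) -> str:
    scored = []
    for action in candidate_actions:
        # World-model step: imagine the page that would result.
        imagined = llm(f"Page:\n{page}\nIf the user does: {action}\n"
                       "Describe the resulting page:")
        # Value step: score the imagined outcome against the task.
        score = float(llm(f"Task: {task}\nImagined page:\n{imagined}\n"
                          "Progress toward the task, 0-10? Answer a number:"))
        scored.append((score, action))
    return max(scored)[1]   # act on the best imagined outcome
```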
December 3, 2024 at 2:00 AM