A new paper argues that language models converge on the same "universal geometry" of meaning. The authors translate between different models' embeddings without paired data and without seeing the original text.
Implications for philosophy and vector databases alike. arxiv.org/pdf/2505.12540
PapersChat provides an agentic AI interface for querying papers, retrieving insights from ArXiv & PubMed, and structuring responses efficiently.
github.com/AstraBert/Pa...
https://github.com/matiasmolinas/evolving-agents
https://news.ycombinator.com/item?id=43310963
"Washington has become Nero’s court, with an incendiary emperor, submissive courtiers and a jester high on ketamine... We were at war with a dictator, we are now at war with a dictator backed by a traitor."
- github.com/deepseek-ai/...
- github.com/deepseek-ai/...
- github.com/deepseek-ai/...
and the Ultra-Scale Playbook at huggingface.co/spaces/nanot...
TL;DR:
1. SFT on 1k curated examples w/ reasoning traces.
2. Control response length w/ budget forcing:
"Wait" tokens → longer reasoning/self-correction.
"Final Answer:" → enforce stopping.
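A minimal sketch of the budget-forcing loop in step 2, in Python. It is an illustration under stated assumptions, not the paper's implementation: `next_token`, `END_OF_THINKING`, `min_tokens`, `max_tokens`, and `toy_model` are hypothetical names, and a real setup would call an actual model's decoding step instead of the toy function.

```python
from typing import Callable, List

# Hypothetical end-of-reasoning marker; the real delimiter depends on the model.
END_OF_THINKING = "<end_think>"


def budget_forced_reasoning(
    next_token: Callable[[List[str]], str],
    min_tokens: int,
    max_tokens: int,
) -> List[str]:
    """Decode reasoning tokens while enforcing a lower and upper budget."""
    tokens: List[str] = []
    while len(tokens) < max_tokens:
        tok = next_token(tokens)
        if tok == END_OF_THINKING:
            if len(tokens) < min_tokens:
                # Model tried to stop too early: suppress the stop marker
                # and append "Wait" to push it into further reasoning.
                tokens.append("Wait")
                continue
            break  # Budget satisfied, let the model stop.
        tokens.append(tok)
    # Enforce stopping: the answer phase always starts here, whether the
    # model stopped on its own or ran out of budget.
    tokens.append("Final Answer:")
    return tokens


def toy_model(tokens: List[str]) -> str:
    """Stand-in for a model: tries to stop after two tokens since the last 'Wait'."""
    since_wait = 0
    for t in reversed(tokens):
        if t == "Wait":
            break
        since_wait += 1
    return END_OF_THINKING if since_wait >= 2 else "think"


trace = budget_forced_reasoning(toy_model, min_tokens=5, max_tokens=8)
print(trace)
```

Raising `min_tokens` produces more "Wait" insertions and longer reasoning; lowering `max_tokens` cuts reasoning off early and jumps straight to "Final Answer:".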
Don't get an AI degree; the curriculum will be outdated before you graduate. Instead, study math, stats, or physics as your foundation, and stay current with AI through code-focused books, blogs, and papers.
All abandoned barbed wire should be removed from public land.
The money today being wasted on public lands grazing should go into building wildlife overpasses and installing wildlife safe guide fencing.
Every VC firm should be asking themselves why.
Covering:
- Tokenize
- Embed
- Positional Encoding
- Decoder
- Multi-Head Attention
- Add and normalize
- Feed-Forward
- Model Head
- Cross-Attention
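To make one item in the list concrete, here is a rough Python sketch of a single attention head (scaled dot-product attention with a causal mask, as a decoder would use). It is not taken from the blog: the function names and the tiny input are assumptions, and real multi-head attention would run several such heads on learned projections and concatenate the results.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix, causal: bool = True) -> Matrix:
    """Scaled dot-product attention for one head, one sequence."""
    d = len(q[0])
    out: Matrix = []
    for i, qi in enumerate(q):
        # Similarity of query i with every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        if causal:
            # Position i may only look at positions <= i.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        # Output row is the attention-weighted average of the value rows.
        out.append(
            [sum(weights[j] * v[j][c] for j in range(len(v))) for c in range(len(v[0]))]
        )
    return out


# Two tokens with 2-dimensional embeddings, used as Q, K, and V alike.
x = [[1.0, 0.0], [0.0, 1.0]]
print(attention(x, x, x))
```

With the causal mask on, the first token can only attend to itself, so its output row equals its own value vector.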
Blog:
F.O.F. is an independent group with the goal of running THIS👇 social network totally outside of Bluesky.
It's not us. It's a fully independent version of the network. All the same users and posts. Running cooperatively with us and others.
What are you the best in the world at?
Do you offer a service, formula, or delivery method you invented?
Is there something you do that’s patentable or a unique user experience?
Have you identified and isolated a market segment?
If not, walk
For categorical/Gaussian distributions, they derive the rate at which a sample is forgotten to be 1/k after k rounds of recursive training (hence model collapse happens more slowly than intuitively expected)
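A toy Python simulation of the categorical case, to get a feel for how slowly samples disappear. This is only a sketch under an assumption not stated in the post: each round of "training" is modeled as refitting the empirical distribution and drawing a fresh dataset of the same size from it. The function name and parameters are hypothetical, and the code does not attempt to reproduce the paper's derivation.

```python
import random
from typing import List


def surviving_samples(n: int, rounds: int, seed: int = 0) -> List[int]:
    """Count how many of the n original samples are still present per round."""
    rng = random.Random(seed)
    population = list(range(n))  # every original sample gets a unique id
    counts = [len(set(population))]
    for _ in range(rounds):
        # "Train" on the current data and sample a new dataset from the fit:
        # for a categorical model this is resampling with replacement.
        population = [rng.choice(population) for _ in range(n)]
        counts.append(len(set(population)))
    return counts


counts = surviving_samples(n=200, rounds=20)
for k, c in enumerate(counts):
    print(f"round {k:2d}: {c} of 200 original samples survive")
```

Once a sample drops out it can never come back, so the count only goes down; plotting it against the round number shows how gradual the decay is.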
Collected using the Firehose API, I hope people do some cool ML with it.
Anonymized with a data removal mechanism and includes text, language predictions, and image data.
#ai #ml #NLP
huggingface.co/datasets/Ara...
Unfortunately, I couldn’t write my yearly AI research review this year, but here’s at least a list of bookmarked papers you might find useful: magazine.sebastianraschka.com/p/llm-resear...
By treating LLMs as simulators that can predict "what would happen if I click this?" the authors built an AI that can navigate websites by imagining outcomes before taking action, performing 33% better than baseline. arxiv.org/pdf/2411.06559
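The "imagine before you click" idea can be sketched as a small planning loop in Python. Everything here is a hypothetical stand-in: in the paper an LLM plays the role of the simulator and the scorer, while this toy uses a lookup table and a keyword check, and the names `plan_with_simulation`, `toy_simulate`, and `toy_score` are made up for illustration.

```python
from typing import Callable, Dict, List, Tuple


def plan_with_simulation(
    state: str,
    actions: List[str],
    simulate: Callable[[str, str], str],
    score: Callable[[str], float],
) -> str:
    """Pick the action whose imagined outcome scores highest, without executing any."""
    return max(actions, key=lambda a: score(simulate(state, a)))


# Toy "world model": predicted page description for each (page, action) pair.
PREDICTED: Dict[Tuple[str, str], str] = {
    ("home", "click 'About'"): "company history page",
    ("home", "click 'Cart'"): "cart page with checkout button",
    ("home", "click 'Search'"): "search results list",
}


def toy_simulate(state: str, action: str) -> str:
    """Stand-in for asking an LLM what would happen after this action."""
    return PREDICTED.get((state, action), "unknown page")


def toy_score(outcome: str) -> float:
    """Stand-in for judging progress toward the goal (here: reaching checkout)."""
    return 1.0 if "checkout" in outcome else 0.0


candidates = ["click 'About'", "click 'Cart'", "click 'Search'"]
best = plan_with_simulation("home", candidates, toy_simulate, toy_score)
print(best)
```

Only the chosen action would then be executed on the real site, which is what lets the agent avoid costly or irreversible clicks.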