Diffbot
@diffbot.bsky.social
AI that finds facts

diffbot.com
Between scant repo examples, FastMCP's irritating vector resemblance to FastAPI, and 180 degree overhauls on every MCP spec release, it's impossible to vibe code your way to a working server.
October 13, 2025 at 11:05 PM
The solution is to reinforce the use of knowledge tool calls for every query in post-training. By consistently grounding responses in citable sources, even the occasional quirk or hallucination is explainable.
October 9, 2025 at 1:56 AM
This phenomenon can sneak into production environments in non-obvious ways. If enough token predictions point to the right answer, it's all too easy to skip the tool call and generate a structured response that still validates against the schema.
October 9, 2025 at 1:56 AM
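The failure mode above can be sketched in a few lines. Everything here is invented for illustration (the schema check, the response shapes, the "sources" field are not Diffbot's actual formats); the point is that a schema sees only shape, never provenance.

```python
# Sketch: schema validation cannot detect a skipped tool call.
# The "grounded" and "hallucinated" responses below are invented
# examples; only their shape is checked, not where the answer came from.

def validates(response: dict) -> bool:
    """Minimal stand-in for a JSON-schema check on a structured answer."""
    return (
        isinstance(response.get("answer"), str)
        and isinstance(response.get("confidence"), float)
        and 0.0 <= response["confidence"] <= 1.0
    )

# Response built from an actual knowledge-tool call (hypothetical).
grounded = {
    "answer": "Paris",
    "confidence": 0.98,
    "sources": ["https://example.org/fact"],
}

# Response generated straight from token predictions, no tool call made.
hallucinated = {"answer": "Paris", "confidence": 0.97}

# Both pass: the schema sees well-formed JSON either way.
assert validates(grounded) and validates(hallucinated)
```

Catching the second case requires checking for evidence of the tool call itself (e.g. attached sources), not just a valid structure.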
89,886 developers are building their own Perplexity on-prem with Diffbot LLM —

huggingface.co/diffbot/Llam...
diffbot/Llama-3.1-Diffbot-Small-2412 · Hugging Face
January 30, 2025 at 3:08 AM
The model isn't the moat. Perplexity can be recreated as a side project. #DeepSeek proved this. We proved this.

Download Diffbot LLM. Run it off your own GPU. Congrats, your on-prem #AI is smarter than #Perplexity.
January 30, 2025 at 3:08 AM
2. We used the profits from our primary business to train Diffbot LLM. Perplexity raised $915M to train theirs.

3. We open sourced Diffbot LLM. Perplexity chose to keep theirs secret.
January 30, 2025 at 3:08 AM
Let's be frank: the score difference is insignificant. And we'll probably play SimpleQA tag for a while.

What IS significant is how we got here vs. Perplexity.

1. Diffbot LLM is a side project. Sonar is Perplexity's entire business.
January 30, 2025 at 3:08 AM
...so I set it up to run the 4,000-question eval on Diffbot LLM overnight and went to bed.

The next morning, we beat Sonar Pro.
January 30, 2025 at 3:08 AM
While I was working on my talk last week, Perplexity released the Sonar Pro API with special emphasis on its factuality benchmark F1 score of 0.858, handily beating other internet-connected LLMs like Gemini-2.0-flash.

The SimpleQA benchmark they used is open source and LLM-judged...
January 30, 2025 at 3:08 AM
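For readers unfamiliar with the metric: as I understand SimpleQA, an LLM judge labels each answer correct, incorrect, or not attempted, and the reported F1 is the harmonic mean of overall accuracy and accuracy over attempted questions. A minimal sketch under that reading (the counts below are made up):

```python
# Sketch of a SimpleQA-style F1, per my reading of the benchmark:
# the judge buckets each answer as correct / incorrect / not attempted.

def simpleqa_f1(correct: int, incorrect: int, not_attempted: int) -> float:
    total = correct + incorrect + not_attempted
    attempted = correct + incorrect
    overall = correct / total              # correct over all questions
    given_attempted = correct / attempted  # correct over attempted only
    # Harmonic mean rewards both answering and answering correctly.
    return 2 * overall * given_attempted / (overall + given_attempted)

# Made-up counts for a 4,000-question run.
f1 = simpleqa_f1(correct=3400, incorrect=500, not_attempted=100)
```

Note the metric penalizes confident wrong answers more than abstentions: moving a question from "incorrect" to "not attempted" raises correct-given-attempted without changing overall accuracy.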
A demo is also available at diffy.chat.

We look forward to building a future of grounded AI with you all.
January 9, 2025 at 9:47 PM
Diffbot LLM's lighter footprint puts on-prem hosting well within reach.

And we are excited to share that we are releasing Diffbot LLM as open source on #GitHub, with weights available for download on #HuggingFace.

github.com/diffbot/diff...
GitHub - diffbot/diffbot-llm-inference: Diffbot LLM Inference Server
January 9, 2025 at 9:47 PM
At Diffbot, we believe that general purpose reasoning will eventually be distilled down to ~1B parameters.

Knowledge is best retrieved at inference, outside of model weights.
January 9, 2025 at 9:47 PM
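The "knowledge outside the weights" idea can be sketched as a toy retrieval step at inference time. The knowledge store, query normalization, and abstention logic below are all invented stand-ins, not Diffbot's implementation:

```python
# Toy sketch: a small model handles reasoning and formatting while facts
# come from a store queried at inference time, not from the weights.

KNOWLEDGE_STORE = {  # stand-in for a knowledge graph / web index
    "diffbot founder": ("Mike Tung", "https://en.wikipedia.org/wiki/Diffbot"),
}

def answer(query: str) -> str:
    fact = KNOWLEDGE_STORE.get(query.lower().strip())
    if fact is None:
        return "I don't know."           # no retrieval hit: abstain, don't guess
    text, source = fact
    return f"{text} [source: {source}]"  # the fact arrives with its attribution
```

Because the fact travels with its source, updating the store updates the answers; no retraining of the model is needed.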
The benefit of full source attribution goes two ways.

Not only does it give publishers credit; it also makes every fact independently verifiable.
January 9, 2025 at 9:47 PM
Every response from Diffbot LLM draws on the results of real-time expert web searches and queries to the Diffbot Knowledge Graph.

Naturally, this means Diffbot LLM always provides full attribution to its cited sources.
January 9, 2025 at 9:47 PM
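The verifiability claim reduces to a simple invariant: every claim in a response must carry a source a reader can follow. The response structure below is invented for illustration, not Diffbot LLM's actual output format:

```python
# Sketch: full attribution as a checkable invariant on a response.
# The claim/source structure here is hypothetical.

response = {
    "claims": [
        {"text": "Diffbot is headquartered in California.",
         "source": "https://example.org/a"},
        {"text": "Diffbot maintains a Knowledge Graph.",
         "source": "https://example.org/b"},
    ]
}

def fully_attributed(resp: dict) -> bool:
    """Every claim must cite at least one non-empty source URL."""
    return all(claim.get("source") for claim in resp["claims"])

assert fully_attributed(response)
```

A consumer can then fact-check each claim against its cited page, which is exactly the two-way benefit described above: publishers get credit, and readers get a verification trail.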