Francisco Jorge
franciscojorge.bsky.social
ml | genai | platform engineering
Reposted by Francisco Jorge
This is so cliché but it is also so true: every time I grow in life or career, it's because I found myself in a situation I wasn't ready for or comfortable with.

I love routine, but have to keep reminding myself to break it from time to time.
November 14, 2025 at 7:57 PM
Reposted by Francisco Jorge
Many people are sleeping on, or even making fun of, this plot in the GPT 5.1 release. This is a crucial plot for anyone serving a thinking model in real-world use cases. Latency to an answer is a huge cause of user churn, and not thinking enough is a fast track to bad model output.
November 13, 2025 at 7:18 PM
Reposted by Francisco Jorge
Why Greatness Cannot Be Planned

Both the English and Japanese editions have now found a home in the Sakana AI library ✨ @sakanaai.bsky.social
August 26, 2025 at 8:58 AM
Reposted by Francisco Jorge
For technical domains especially, getting non-DSes involved in analyzing outputs is vital. It’s hard to build anything good without it bc v1s almost always have major fail modes. Finding the appropriate system design—let alone optimizing—requires a tight coupling of output analysis and system design
Can non-data scientists write AI Evals? The answer is nuanced and not just "Yes". @eugeneyan.com and I discuss this in the context of the "analyze-measure-improve" cycle from our course.

Links to more resources in the reply
August 26, 2025 at 4:56 AM
Reposted by Francisco Jorge
It’s been a long time since I went back to Asimov. I forgot that Asimov predicted AI sycophancy
Isaac Asimov, “I, Robot” (1950)
August 7, 2025 at 11:03 AM
Reposted by Francisco Jorge
This is the single best piece I've read on "replacing coders with AI." It fully dispels the myth from the perspective of a software engineer and does so in a calm, reasonable way.
colton.dev/blog/curing-...
No, AI is not Making Engineers 10x as Productive
Curing Your AI 10x Engineer Imposter Syndrome
colton.dev
August 6, 2025 at 10:22 PM
Reposted by Francisco Jorge
Context Engineering for AI Agents: Lessons from Building Manus, by the Manus AI folks

Or... how to ensure your context takes advantage of KV-cache to save cost. For example, the cached input tokens cost 0.30 USD/MTok, while uncached ones cost 3 USD/MTok—a 10x difference.

manus.im/blog/Context...
Context Engineering for AI Agents: Lessons from Building Manus
This post shares the local optima Manus arrived at through our own "SGD". If you're building your own AI agent, we hope these principles help you converge faster.
manus.im
July 19, 2025 at 3:09 PM
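To make the pricing in the post above concrete, here is a minimal arithmetic sketch. It uses only the two per-MTok input prices quoted in the post; the function name, the 100,000-token context, and the 90% cache-hit split are hypothetical values chosen for illustration, and output-token costs are ignored.

```python
# Input prices quoted in the post (USD per million tokens).
CACHED_USD_PER_MTOK = 0.30
UNCACHED_USD_PER_MTOK = 3.00


def input_cost_usd(cached_tokens: int, uncached_tokens: int) -> float:
    """Input-side cost of one request, split by cache hit vs. miss.

    Hypothetical helper for illustration; not part of any real API.
    """
    cached = cached_tokens * CACHED_USD_PER_MTOK / 1_000_000
    uncached = uncached_tokens * UNCACHED_USD_PER_MTOK / 1_000_000
    return cached + uncached


# Assumed scenario: a 100,000-token context where 90% of the prefix is reused.
mostly_cached = input_cost_usd(cached_tokens=90_000, uncached_tokens=10_000)
no_cache = input_cost_usd(cached_tokens=0, uncached_tokens=100_000)

print(f"{mostly_cached:.3f}")  # 0.057
print(f"{no_cache:.3f}")       # 0.300
```

Under these assumed numbers the mostly cached request costs roughly a fifth of the uncached one, which is why keeping the prompt prefix stable across agent steps matters for cost.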
Reposted by Francisco Jorge
The first mass-produced robotaxi, the Zeekr RT, is nearly here and will be joining Waymo's fleet soon. Here I go into what I think it costs and what that means for us. open.substack.com/pub/itcanthi...
The First Mass-Produced Robotaxi Is Here
Some thoughts on Waymo's new Zeekr RT and what it could mean for the autonomous car industry
open.substack.com
July 7, 2025 at 2:31 PM
Reposted by Francisco Jorge
I really liked this blog post by @jeremiahdjohns.bsky.social: open.substack.com/pub/infinite...

"Persistence beats talent" is honestly quite inspiring
Persistence Beats Talent
And how to think about achieving success
open.substack.com
June 20, 2025 at 3:00 PM
Reposted by Francisco Jorge
It’s wild when the people who control the only feed most people see accuse us of being the echo chamber. People are scared and in pain—they deserve spaces to talk without top-down interference. That’s what we’re building.
June 12, 2025 at 10:21 PM
Reposted by Francisco Jorge
1M public domain books now available digitally, through our Institutional Data Initiative at Harvard.
Today we released Institutional Books 1.0, a 242B token dataset from Harvard Library's collections, refined for accuracy and usability. 🧵
June 12, 2025 at 9:34 PM
Reposted by Francisco Jorge
Vibe governing: bad idea

Probably also illegal
June 6, 2025 at 10:26 PM
Reposted by Francisco Jorge
Trends – Artificial Intelligence by Mary Meeker and Bond Cap (www.bondcap.com)

www.bondcap.com/report/tai/0
May 31, 2025 at 2:24 PM
Reposted by Francisco Jorge
Unsurprisingly, the most important paper you'll read today. github.com/deepseek-ai/...
April 30, 2025 at 6:22 PM
Reposted by Francisco Jorge
So on one level this post is pretty misleading -- it's a collection of very different cases where the AI was being used in very different ways -- but holy shit this Gemini example is wild and unhinged: gemini.google.com/share/6d141b...
April 18, 2025 at 7:53 PM
Reposted by Francisco Jorge
This is a great list, things that “the best engineers I know” do, stuff like:

- understanding things deeply, reading the actual source
- being willing to help other people
- status doesn’t matter, good ideas come from anywhere

endler.dev/2025/best-pr...
The Best Programmers I Know | Matthias Endler
I have met a lot of developers in my life. Late…
endler.dev
April 13, 2025 at 3:57 PM
Reposted by Francisco Jorge
My god it's weird watching a diffusion text generation model go (story generation)-- Dream 7B. hkunlp.github.io/blog/2025/dr...
April 3, 2025 at 8:53 AM
Reposted by Francisco Jorge
Good insights on sync engines, both as enabler of local-first software and in a more traditional server-centric setting
April 2, 2025 at 4:37 AM
Reposted by Francisco Jorge
ah the code is wrong
wait it's the test
no no, the code
ummm, is it both?
no, definitely test
hang on what is this test even doing?
*gets up and makes tea*
March 28, 2025 at 6:52 PM
Reposted by Francisco Jorge
Can we understand the mechanisms of a frontier AI model?

📝 Blog post: www.anthropic.com/research/tra...
🧪 "Biology" paper: transformer-circuits.pub/2025/attribu...
⚙️ Methods paper: transformer-circuits.pub/2025/attribu...

Featuring basic multi-step reasoning, planning, introspection and more!
On the Biology of a Large Language Model
transformer-circuits.pub
March 27, 2025 at 6:18 PM
Reposted by Francisco Jorge
We all want LLMs to collaborate with humans to help them achieve their goals. But LLMs are not trained to collaborate, they are trained to imitate. Can we teach LM agents to help humans by first making them help each other?

arxiv.org/abs/2503.14481
Don't lie to your friends: Learning what you know from collaborative self-play
To be helpful assistants, AI agents must be aware of their own capabilities and limitations. This includes knowing when to answer from parametric knowledge versus using tools, when to trust tool outpu...
arxiv.org
March 24, 2025 at 3:39 PM
Reposted by Francisco Jorge
Vibe coding on projects you don't care deeply about is pretty fun, but it's unnerving and unpleasant to not understand all the details of a project you care about deeply
March 21, 2025 at 3:15 AM
Those innie/outie dynamics between Mark and Gemma in that last scene were crazyyyy
March 22, 2025 at 2:04 PM
Reposted by Francisco Jorge
This is a very tidy little RL paper for reasoning. Their GRPO changes:
1 Two different clip hyperparams, so positive clipping can uplift more unexpected tokens
2 Dynamic sampling -- remove samples w flat reward in batch
3 Per token loss
4 Managing too long generations in loss
dapo-sia.github.io
March 17, 2025 at 10:13 PM
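A rough sketch of the first three changes listed in the post above, in plain Python with no ML framework. This is an illustration of the ideas as the post describes them, not the paper's implementation: the function names are hypothetical, the default clip values (0.2 and 0.28) are assumed for illustration, and the handling of overly long generations (item 4) is not covered.

```python
from typing import List, Sequence


def keep_group(rewards: Sequence[float]) -> bool:
    """Dynamic sampling: keep a prompt's sample group only if rewards differ.

    A group with flat reward yields zero advantage for every sample,
    so it contributes no learning signal.
    """
    return max(rewards) > min(rewards)


def clipped_token_loss(
    ratios: List[List[float]],   # ratios[i][t]: new/old policy prob ratio per token
    advantages: List[float],     # advantages[i]: one advantage per sample
    eps_low: float = 0.2,        # assumed lower clip range
    eps_high: float = 0.28,      # assumed, larger upper clip range
) -> float:
    """Clipped surrogate loss with separate clip bounds, averaged per token."""
    total = 0.0
    n_tokens = 0
    for sample_ratios, adv in zip(ratios, advantages):
        for r in sample_ratios:
            clipped = min(max(r, 1.0 - eps_low), 1.0 + eps_high)
            total += min(r * adv, clipped * adv)
            n_tokens += 1
    # Dividing by the total token count (not averaging per sample first)
    # gives every token equal weight, so long samples are not down-weighted.
    return -total / n_tokens


# Flat-reward groups are filtered out before computing the loss.
print(keep_group([1.0, 1.0, 1.0]))  # False
print(keep_group([0.0, 1.0]))       # True
```

With a positive advantage, a ratio above `1 + eps_high` is capped at the upper bound, so raising `eps_high` alone lets unlikely tokens be pushed up further before clipping kicks in, while the lower bound stays unchanged.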