Steven Fortney
steven-fortney.bsky.social
Applied Science @Uber. PhD in Financial Economics @Yale ’21.
Reposted by Steven Fortney
ReAG - Reasoning Augmented Generation

- No chunking, splitting, or vectorizing BS
- Stateless, no vector DBs etc.
- Supports any model (deepseek, o3-mini et al)
- Reasoning traces
- Metadata filtering
- Typescript, Python support
February 4, 2025 at 8:17 AM
Reposted by Steven Fortney
DeepSeek is part of a quant trading firm, which probably operates out of the fanciest office imaginable, but why am I picturing this?
January 25, 2025 at 6:46 PM
Reposted by Steven Fortney
Apparently, you can run DeepSeek-V3 locally, provided that you have 8 M4 Pro 64GB Mac minis.

~5 tok/sec.
December 27, 2024 at 3:03 AM
Reposted by Steven Fortney
I haven’t seen o3 yet & have been critical of benchmarks for AI but they did test against some of the hardest & best

On GPQA, PhDs with access to the internet got 34% outside their specialty, up to 81% inside. o3 is 87%.

Frontier Math went from the best AI at 2% to 25%

Some other big ones, too
December 21, 2024 at 6:27 AM
Reposted by Steven Fortney
An Alternative to Test-Time Scaling by @kalomaze.bsky.social

Exploring conditional computation and dynamic depth in language models.

rentry.org/conditional_...
An Alternative to Test-Time Scaling
Exploring conditional computation and dynamic depth in language models. Contents Conditional Computation Width vs Depth O1 / Test-Time Scaling SMoE Dropout MoEUT Next Steps Conditional Computation I...
rentry.org
December 20, 2024 at 4:21 AM
Reposted by Steven Fortney
Genesis project

A generative physics engine able to generate 4D dynamical worlds powered by a physics simulation platform designed for general-purpose robotics and physical AI applications.
December 18, 2024 at 11:54 PM
Reposted by Steven Fortney
Introducing MASt3R-SLAM, the first real-time monocular dense SLAM with MASt3R as a foundation.

Easy to use like DUSt3R/MASt3R, from an uncalibrated RGB video it recovers accurate, globally consistent poses & a dense map.

With @ericdexheimer.bsky.social* @ajdavison.bsky.social (*Equal Contribution)
December 16, 2024 at 3:43 PM
Reposted by Steven Fortney
They hypothesize that there exist key "forking tokens," such that re-sampling the system at those specific tokens, but not others, leads to very different outcomes.

An example would be that a simple punctuation mark, or just a single token, can prompt an LLM to produce a different response.
December 15, 2024 at 10:23 PM
Reposted by Steven Fortney
Meta's SPDL: Faster AI model training with thread-based data loading. This framework-agnostic data loading solution uses multi-threading to achieve high throughput in a regular Python interpreter.

Blog: ai.meta.com/blog/spdl-fa...
Repo: github.com/facebookrese...
December 10, 2024 at 2:35 AM
Reposted by Steven Fortney
Jane Street, a quant trading firm, has a very good YouTube channel. For comparison, DeepSeek is also a quant trading firm.

They recently published a video on "Building Machine Learning Systems for a Trillion Trillion Floating Point Operations".

Link: www.youtube.com/watch?v=139U...
Building Machine Learning Systems for a Trillion Trillion Floating Point Operations
YouTube video by Jane Street
www.youtube.com
December 9, 2024 at 5:26 PM
Reposted by Steven Fortney
How are Kernel Smoothing in statistics, Data-Adaptive Filters in image processing, and Attention in Machine Learning related?

My goal is not to argue who should get credit for what, but to show a progression of closely related ideas over time and across neighboring fields.

1/n
December 8, 2024 at 9:45 PM
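One way to see the link the post above asks about, as a minimal sketch rather than the thread's own derivation: a Nadaraya-Watson kernel smoother with a Gaussian kernel gives the same weighted average as softmax dot-product attention when the keys all have the same norm, because the query-only and key-only factors of the Gaussian cancel in the normalization. The function names, the `bandwidth` and `scale` parameters, and the unit-norm assumption on the keys are illustrative choices, not taken from the post.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def nadaraya_watson(queries, keys, values, bandwidth=1.0):
    """Gaussian-kernel smoother: weights decay with squared distance to each key."""
    # Pairwise squared distances, shape (n_queries, n_keys).
    d2 = ((queries[:, None, :] - keys[None, :, :]) ** 2).sum(axis=-1)
    weights = np.exp(-d2 / (2.0 * bandwidth**2))
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ values


def dot_product_attention(queries, keys, values, scale=1.0):
    """Softmax attention: weights are softmax(scale * q . k) over the keys."""
    return softmax(scale * (queries @ keys.T), axis=-1) @ values


# Illustrative data (assumption): keys are normalized to unit length so the
# ||k||^2 term is constant across keys and drops out of the normalization.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 3))
k = rng.normal(size=(6, 3))
k = k / np.linalg.norm(k, axis=1, keepdims=True)
v = rng.normal(size=(6, 2))
h = 0.7

smoothed = nadaraya_watson(q, k, v, bandwidth=h)
attended = dot_product_attention(q, k, v, scale=1.0 / h**2)
print(np.allclose(smoothed, attended))
```

With keys of unequal norm the two differ by a per-key factor of exp(-||k||²/(2h²)), so the equivalence here holds only under the equal-norm assumption stated above.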
Reposted by Steven Fortney
Real footage of a synthetic control model
December 8, 2024 at 4:46 AM
Reposted by Steven Fortney
Introducing 🧞Genie 2 🧞 - our most capable large-scale foundation world model, which can generate a diverse array of consistent worlds, playable for up to a minute. We believe Genie 2 could unlock the next wave of capabilities for embodied agents 🧠.
December 4, 2024 at 4:01 PM
Reposted by Steven Fortney
America has three functional high capacity institutions left

The Federal Reserve
The Southern District of New York
and The Delaware Court of Chancery
Elon Musk’s pay package from Tesla, worth more than $50 billion, cannot be reinstated, a Delaware judge ordered. The judge said she would not reverse her decision to strike down the enormous compensation package, which helped make Musk the richest person in the world. nyti.ms/4gfDcGX
December 3, 2024 at 2:14 AM
Reposted by Steven Fortney
1. The conventional explanations for food deserts (that these places are too poor or too rural to generate enough spending on groceries, or too Black to overcome racist corporate redlining) fail to grapple with a key fact: food deserts didn’t used to exist. My new piece in The Atlantic.
The Mystery of Food Deserts
They didn’t materialize around the country for no reason. Something happened.
www.theatlantic.com
December 1, 2024 at 2:06 PM
Bump
The Gaussian is a nice bumpy shape, but sometimes we hope for a smooth (i.e. C∞) function like the Gaussian that is 𝒂𝒍𝒔𝒐 compactly supported.

One such class of functions is called "Bump functions"

1/6
November 30, 2024 at 4:46 PM
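For reference, the textbook example of such a function (the post above does not say which example the thread goes on to use, so treat this as an illustration) is:

```latex
\psi(x) =
\begin{cases}
  \exp\!\left(-\dfrac{1}{1 - x^{2}}\right), & |x| < 1, \\[1ex]
  0, & |x| \ge 1 .
\end{cases}
```

It is infinitely differentiable everywhere, including at x = ±1 where every derivative tends to zero, and it vanishes outside [-1, 1], so it is both C∞ and compactly supported.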
Reposted by Steven Fortney
My idea for Econ seminars: speakers can go for as long as they want and talk about whatever they want. But we change the norm so that the audience can leave whenever they want and it’s nbd. Let supply and demand for attention determine seminar length/structure, etc.
November 21, 2024 at 2:04 AM
Reposted by Steven Fortney
Being logged into wandb on your phone is a recipe for misery
November 20, 2024 at 4:09 AM
Reposted by Steven Fortney
🌶️(?) take: Agents are somehow hot right now because people realized that LLM output can be interpreted as a DSL which directs side effects in the world (e.g. tool calls) rather than just returning text in a chat/autocomplete sense. What are the open challenges? A 🧵... [1/11]
November 19, 2024 at 9:32 AM