Lightnews — Scholar-powered news

Reposted by Steven Fortney

Sung Kim

@sungkim.bsky.social

ReAG - Reasoning Augmented Generation

- No chunking, splitting, vectorizing bs
- Stateless, no vector DBs etc.
- Supports any model (deepseek, o3-mini et al)
- Reasoning traces
- Metadata filtering
- Typescript, Python support

February 4, 2025 at 8:17 AM

Reposted by Steven Fortney

Sung Kim

@sungkim.bsky.social

DeepSeek is part of a quant trading firm, which probably operates out of the fanciest office imaginable, but why am I picturing this?

January 25, 2025 at 6:46 PM

Reposted by Steven Fortney

Sung Kim

@sungkim.bsky.social

Apparently, you can run DeepSeek-V3 locally, provided that you have 8 M4 Pro 64GB Mac minis.

~5 tok/sec.

December 27, 2024 at 3:03 AM

Reposted by Steven Fortney

Ethan Mollick

@emollick.bsky.social

I haven’t seen o3 yet & have been critical of benchmarks for AI but they did test against some of the hardest & best

On GPQA, PhDs with access to the internet got 34% outside their specialty, up to 81% inside. o3 is 87%.

Frontier Math went from the best AI at 2% to 25%

Some other big ones, too

December 21, 2024 at 6:27 AM

Reposted by Steven Fortney

Sung Kim

@sungkim.bsky.social

An Alternative to Test-Time Scaling by @kalomaze.bsky.social

Exploring conditional computation and dynamic depth in language models.

rentry.org/conditional_...

An Alternative to Test-Time Scaling

Exploring conditional computation and dynamic depth in language models. Contents: Conditional Computation, Width vs Depth, O1 / Test-Time Scaling, SMoE Dropout, MoEUT, Next Steps. Conditional Computation I...

rentry.org

December 20, 2024 at 4:21 AM

Reposted by Steven Fortney

Sung Kim

@sungkim.bsky.social

Genesis project

A generative physics engine able to generate 4D dynamical worlds powered by a physics simulation platform designed for general-purpose robotics and physical AI applications.

December 18, 2024 at 11:54 PM

Reposted by Steven Fortney

Riku Murai

@rmurai0610.bsky.social

Introducing MASt3R-SLAM, the first real-time monocular dense SLAM with MASt3R as a foundation.

Easy to use like DUSt3R/MASt3R, from an uncalibrated RGB video it recovers accurate, globally consistent poses & a dense map.

With @ericdexheimer.bsky.social* @ajdavison.bsky.social (*Equal Contribution)

December 16, 2024 at 3:43 PM

Reposted by Steven Fortney

Sung Kim

@sungkim.bsky.social

They hypothesize that there exist key "forking tokens," such that re-sampling the system at those specific tokens, but not others, leads to very different outcomes.

An example would be that a simple punctuation mark, or just a single token, can prompt an LLM to produce a different response.

December 15, 2024 at 10:23 PM

Reposted by Steven Fortney

Sung Kim

@sungkim.bsky.social

Meta's SPDL: Faster AI model training with thread-based data loading. This framework-agnostic data loading solution uses multi-threading to achieve high throughput in a regular Python interpreter.

Blog: ai.meta.com/blog/spdl-fa...
Repo: github.com/facebookrese...
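
As a rough illustration of thread-based data loading in general, here is a minimal sketch using only Python's standard library. It does not use SPDL's API: `load_sample` and `load_batches` are hypothetical names, and the `time.sleep` call stands in for I/O or native decoding work that releases the GIL, which is the situation where threads can help in a regular interpreter.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_sample(index: int) -> int:
    """Hypothetical loader; the sleep simulates I/O that releases the GIL."""
    time.sleep(0.01)
    return index * index


def load_batches(indices, batch_size=4, num_threads=4):
    """Yield fixed-size batches while samples load concurrently on a thread pool."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        batch = []
        # pool.map returns results in input order, even if threads finish out of order.
        for sample in pool.map(load_sample, indices):
            batch.append(sample)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:
            # Emit the final, possibly smaller, batch.
            yield batch
```

A training loop would iterate over `load_batches(range(n))`, so loading the next samples overlaps with work on the current batch.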

December 10, 2024 at 2:35 AM

Reposted by Steven Fortney

Sung Kim

@sungkim.bsky.social

Jane Street, a quant trading firm, has a very good YouTube channel. For comparison, DeepSeek is also a quant trading firm.

They recently published a video on "Building Machine Learning Systems for a Trillion Trillion Floating Point Operations".

Link: www.youtube.com/watch?v=139U...

Building Machine Learning Systems for a Trillion Trillion Floating Point Operations

YouTube video by Jane Street

www.youtube.com

December 9, 2024 at 5:26 PM

Reposted by Steven Fortney

Peyman Milanfar

@docmilanfar.bsky.social

How are Kernel Smoothing in statistics, Data-Adaptive Filters in image processing, and Attention in Machine Learning related?

My goal is not to argue who should get credit for what, but to show a progression of closely related ideas over time and across neighboring fields.

1/n
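
One textbook way to see the link, offered as a sketch and not necessarily the argument the thread goes on to make: a Nadaraya-Watson kernel smoother with a Gaussian kernel computes the same weighted average as softmax attention whose scores are negative scaled squared distances. The function names and the `bandwidth` parameter below are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kernel_smooth(query, xs, ys, bandwidth=1.0):
    """Nadaraya-Watson estimate: Gaussian-kernel-weighted average of ys."""
    weights = [math.exp(-((query - x) ** 2) / (2 * bandwidth ** 2)) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def attention_smooth(query, xs, ys, bandwidth=1.0):
    """Same estimate written as attention: softmax over scores, then a weighted sum of values."""
    scores = [-((query - x) ** 2) / (2 * bandwidth ** 2) for x in xs]
    weights = softmax(scores)
    return sum(w * y for w, y in zip(weights, ys))
```

Here `xs` play the role of keys, `ys` the values, and `query` the query; the two functions agree up to floating-point rounding.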

December 8, 2024 at 9:45 PM

Reposted by Steven Fortney

Khoa

@khoavuumn.bsky.social

Real footage of a synthetic control model

December 8, 2024 at 4:46 AM

Reposted by Steven Fortney

Jack Parker-Holder

@jparkerholder.bsky.social

Introducing 🧞Genie 2 🧞 - our most capable large-scale foundation world model, which can generate a diverse array of consistent worlds, playable for up to a minute. We believe Genie 2 could unlock the next wave of capabilities for embodied agents 🧠.

December 4, 2024 at 4:01 PM

Reposted by Steven Fortney

Joey Politano🏳️‍🌈

@josephpolitano.bsky.social

America has three functional high capacity institutions left

The Federal Reserve
The Southern District of New York
and The Delaware Court of Chancery

The New York Times @nytimes.com · Dec 3

Elon Musk’s pay package from Tesla, worth more than $50 billion, cannot be reinstated, a Delaware judge ordered. The judge said she would not reverse her decision to strike down the enormous compensation package, which helped make Musk the richest person in the world. nyti.ms/4gfDcGX

Elon Musk reaches into a pocket of his black suit jacket while walking outside. Headline reads: "Elon Musk’s $50 Billion Tesla Pay Can’t Be Reinstated, Delaware Judge Rules." Photo credit: Jonathan Ernst/Reuters.

December 3, 2024 at 2:14 AM

Reposted by Steven Fortney

Stacy Mitchell

@stacyfmitchell.bsky.social

1. The conventional explanation for food deserts—that these places are too poor or too rural to generate enough spending on groceries, or too Black to overcome racist corporate redlining — fails to grapple with a key fact: food deserts didn’t used to exist. My new piece in The Atlantic.

The Mystery of Food Deserts

They didn’t materialize around the country for no reason. Something happened.

www.theatlantic.com

December 1, 2024 at 2:06 PM

Steven Fortney

@steven-fortney.bsky.social

Bump

Peyman Milanfar @docmilanfar.bsky.social · Nov 29

The Gaussian is a nice bumpy shape, but sometimes we hope for a smooth (i.e. C∞) function like the Gaussian that is 𝒂𝒍𝒔𝒐 compactly supported.

One such class of functions is called "Bump functions"

1/6
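
For concreteness, the standard example of a bump function is exp(-1 / (1 - x^2)) for |x| < 1 and 0 elsewhere. The sketch below evaluates that textbook form; it is assumed here for illustration and may not be the exact function the thread uses.

```python
import math


def bump(x: float) -> float:
    """Standard bump function: exp(-1 / (1 - x^2)) on (-1, 1), zero elsewhere."""
    if abs(x) >= 1.0:
        # Outside the open interval the function is exactly zero (compact support).
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))
```

The function peaks at x = 0 with value exp(-1), and all of its derivatives vanish at x = ±1, which is what lets it join the zero function smoothly.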

November 30, 2024 at 4:46 PM

Reposted by Steven Fortney

Jesse Bruhn

@jessebruhn.bsky.social

My idea for Econ seminars: speakers can go for as long as they want and talk about whatever they want. But we change the norm so that the audience can leave whenever they want and it’s nbd. Let supply and demand for attention determine seminar length/structure, etc.

November 21, 2024 at 2:04 AM

Reposted by Steven Fortney

Eugene Vinitsky 🍒

@eugenevinitsky.bsky.social

Being logged into wandb on your phone is a recipe for misery

November 20, 2024 at 4:09 AM

Reposted by Steven Fortney

Edward Grefenstette

@egrefen.bsky.social

🌶️(?) take: Agents are somehow hot right now because people realized that LLM output can be interpreted as a DSL which directs side effects in the world (e.g. tool calls) rather than just returning text in a chat/autocomplete sense. What are the open challenges? A 🧵... [1/11]

November 19, 2024 at 9:32 AM
