Nick Vincent
@nickmvincent.bsky.social
Studying people and computers (https://www.nickmvincent.com/)
Blogging about data and steering AI (https://dataleverage.substack.com/)
Also just saw this paper this week and was similarly quite excited and thinking along similar lines!
October 10, 2025 at 2:44 PM
Follow-up, tying together "AI as ranking chunks of human records" with "eval leverage" and "dataset details as quality signals": dataleverage.substack.com/p/how-do-we-...

And related, "eval leverage": dataleverage.substack.com/p/evaluation...
How do we know our AI output is good? Double checks, bar charts, vibes, and training data.
Connecting evaluation and dataset documentation via the lens of "AI as ranking".
dataleverage.substack.com
August 8, 2025 at 10:31 PM
(1) ongoing challenges in benchmarking, (2) challenges in communicating benchmarks to the public, (3) dataset documentation, and (4) post-hoc dataset "reverse engineering"

The original post: dataleverage.substack.com/p/selling-ag...
August 8, 2025 at 10:31 PM
who paid that Dr for a verified attestation with provenance can use the attestation as a quality signal: a promise to consumers about the exact nature of the evaluation. A "9/10 dentists recommend" for a chatbot.

More generally, I think there are interesting connections between current discourse &
August 8, 2025 at 10:31 PM
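(Not from the thread, just to make the idea concrete: a minimal sketch of what a transactable, verifiable attestation record could look like. Every field name, the hashing step, and the example values are my own assumptions; the posts don't specify a schema.)

```python
# Illustrative sketch of a verified attestation as a transactable quality signal.
# All names, fields, and values are hypothetical assumptions for illustration only.
from dataclasses import dataclass, asdict
import hashlib
import json


@dataclass
class Attestation:
    evaluator_id: str          # e.g., a verified identity for the reviewing physician
    evaluator_credential: str  # provenance of the evaluator's qualification
    model_version: str         # which AI system/version was evaluated
    eval_protocol: str         # exact nature of the evaluation (what was checked, how)
    sample_size: int           # how many responses were reviewed
    approval_rate: float       # e.g., 0.9 -> "9/10 dentists recommend"
    signature: str = ""        # placeholder for a cryptographic signature

    def digest(self) -> str:
        """Content hash over the attested fields, for tamper-evidence."""
        payload = {k: v for k, v in asdict(self).items() if k != "signature"}
        return hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest()


# Usage: an AI developer publishes the attestation alongside a product claim,
# and consumers or auditors can check the published record against this digest.
att = Attestation(
    evaluator_id="dr-jane-doe",
    evaluator_credential="state-medical-board-license-0000",
    model_version="chatbot-2025-08",
    eval_protocol="thumbs up/down on sampled medical responses",
    sample_size=200,
    approval_rate=0.9,
)
print(att.digest())
```

The point is only that the object being sold is the signed record of who evaluated what, and how, rather than the underlying medical knowledge itself, which may leak anyway.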
For some types of info, we can maybe treat it as open and focus on selling convenient/"nice" packages (à la Wikimedia Enterprise)

But attestations provide another object to transact over. Valuable info (a Dr giving thumbs up/down on medical responses) may leak, but the AI developer
August 8, 2025 at 10:31 PM
So in a post-AI world, to help people transact over work that produces information, we likely need:
- individual property-ish rights over info (not a great way to go, IMO)
- rights that enable collective bargaining (good!)
- or...
August 8, 2025 at 10:31 PM
The core challenge: many inputs into AI are information, and thus hard to design efficient markets for. Info is hard to exclude (pre-training data remains very hard to exclude, but even post-training data may be hard to exclude without sufficient effort)
August 8, 2025 at 10:31 PM
It looks like some skepticism was warranted (not much progress towards this vision yet). I do think "dataset details as quality signals" is still possible though, and could play a key role in addressing looming information economics challenges.
August 8, 2025 at 10:31 PM
Finally, I recently shared a preprint that relates deeply to the above ideas, on Collective Bargaining for Information: arxiv.org/abs/2506.10272, and have a blog post on this as well: dataleverage.substack.com/p/on-ai-driv...
On AI-driven Job Apocalypses and Collective Bargaining for Information
Reacting to a fresh wave of discussion about AI's impact on the economy and power concentration, and reiterating the potential role of collective bargaining.
dataleverage.substack.com
June 24, 2025 at 12:33 PM
These blog posts expand on attentional agency:
- genAI as ranking chunks of info: dataleverage.substack.com/p/google-and...
- utility of AI stems from people: dataleverage.substack.com/p/each-insta...
- connection to evals: dataleverage.substack.com/p/how-do-we-...
Each Instance of "AI Utility" Stems from Some Human Act(s) of Information Recording and Ranking
It's ranking information all the way down.
dataleverage.substack.com
June 24, 2025 at 12:33 PM
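(A toy sketch of the "AI utility as ranking chunks of human-recorded information" framing from the posts above. The overlap-based scoring and the example records are purely illustrative assumptions, not anything the blog posts propose.)

```python
# Toy sketch of the "AI as ranking chunks of human-recorded information" framing.
# Scoring by token overlap is purely illustrative; real systems score differently.
from collections import Counter


def score(query: str, chunk: str) -> float:
    """Crude relevance score: count of query tokens that appear in the chunk."""
    q_tokens = Counter(query.lower().split())
    c_tokens = set(chunk.lower().split())
    return sum(count for tok, count in q_tokens.items() if tok in c_tokens)


def rank_chunks(query: str, human_records: list[str]) -> list[str]:
    """Return chunks of human-recorded information, best-first."""
    return sorted(human_records, key=lambda ch: score(query, ch), reverse=True)


records = [
    "A clinician's note on treating seasonal allergies.",
    "A forum answer about fixing a bike chain.",
    "An encyclopedia passage on allergy medications.",
]
print(rank_chunks("allergy treatment options", records)[0])
```

Any real system scores and orders chunks far more cleverly, but the framing is the same: the value surfaced to a user traces back to information some person recorded and to acts of ranking it.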