Lightnews — Scholar-powered news

Ted

@edwardbenson.bsky.social

Have we created a word yet for groups of agents that are constrained within nodes of a rigid workflow?

If you zoomed into a workflow node, you'd say: "oh yeah, this is agentic computing"

But if you zoom out, you'd say: "this is a classic workflow engine"

January 14, 2025 at 3:24 PM

Ted

@edwardbenson.bsky.social

"Jagged Frontier" is the term that I've been looking for over the past two years of AI growth.

It's difficult from any one point on the frontier to make strong inferences about adjacent points.

E.g. AI can draft top-quality research memos, but still struggles at cocktail party chat.

January 10, 2025 at 6:43 PM

Ted

@edwardbenson.bsky.social

I (non-ironically) love that one of the top posts on HN right now is about making "beautiful" API keys..

..in which the first key in this pic is an example of "ugly", while the second key in this pic is an example of "beautiful"

January 10, 2025 at 2:31 PM

Ted

@edwardbenson.bsky.social

👀👀 "The last book I wrote, I’m happy if humans read it, but I mostly wrote it for the AIs. And my next book I’m writing even more for the AIs." -Tyler Cowen

January 9, 2025 at 8:04 PM

Ted

@edwardbenson.bsky.social

The first big red "Destroy the LLM!" button will be because of national security fears, not evil AGI fears.

If a government trained an LLM on its intelligence materials, the model weights would be the most sensitive asset in its possession.

The opposite of compartmentalized information.

January 9, 2025 at 4:19 PM

Ted

@edwardbenson.bsky.social

What will the first AI Morris Worm be?

It's bound to happen..

December 20, 2024 at 8:16 PM

Ted

@edwardbenson.bsky.social

Informed shoulder shrugs in the small room often precede confident parrots in the big room.

December 14, 2024 at 7:58 PM

Ted

@edwardbenson.bsky.social

I guess the つなみ was a っなみ.

December 5, 2024 at 7:52 PM

Ted

@edwardbenson.bsky.social

We've got to get the LLMs back in the office -- they're barely working hard remotely!

December 5, 2024 at 7:45 PM

Ted

@edwardbenson.bsky.social

Strategies for agent automation are different when you're thinking in data SETS rather than POINTS.

E.g., with an imperfect agent, you can:

- Maximize shots on goal, then detect which went in
- Maximize shot opportunities, then just take the best K

December 5, 2024 at 6:33 PM
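
A minimal Python sketch of the two set-level strategies in the post above. Everything here is an assumption for illustration: `run_agent` is a hypothetical stand-in for one imperfect agent attempt (it just returns a random quality score), and `passes` and `score` are hypothetical caller-supplied checks, not part of any real agent framework.

```python
import random
from typing import Callable, List, Tuple

# One attempt: (output text, quality score in [0, 1)).
Attempt = Tuple[str, float]


def run_agent(task: str, rng: random.Random) -> Attempt:
    """Hypothetical imperfect agent: quality is random, standing in for real variance."""
    quality = rng.random()
    return (f"result for {task}", quality)


def shots_then_detect(
    task: str, n: int, passes: Callable[[Attempt], bool], seed: int = 0
) -> List[Attempt]:
    """Strategy 1: take n shots at the same task, then keep only those a detector accepts."""
    rng = random.Random(seed)
    attempts = [run_agent(task, rng) for _ in range(n)]
    return [a for a in attempts if passes(a)]


def opportunities_then_top_k(
    tasks: List[str], k: int, score: Callable[[Attempt], float], seed: int = 0
) -> List[Attempt]:
    """Strategy 2: take one shot at each of many tasks, then keep the k best by score."""
    rng = random.Random(seed)
    attempts = [run_agent(t, rng) for t in tasks]
    return sorted(attempts, key=score, reverse=True)[:k]


if __name__ == "__main__":
    kept = shots_then_detect("draft email", n=20, passes=lambda a: a[1] >= 0.5)
    best = opportunities_then_top_k(
        [f"lead {i}" for i in range(10)], k=3, score=lambda a: a[1]
    )
    print(len(kept), "attempts passed;", len(best), "best leads kept")
```

The first strategy depends on having a reliable detector. The second only needs a way to rank attempts relative to each other, which is often easier to build.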

Ted

@edwardbenson.bsky.social

Google has a ways to go...

ChatGPT is better at GSheets formula help than the in-app Gemini.

December 5, 2024 at 5:28 PM

Ted

@edwardbenson.bsky.social

Wowzers.

T-3 years to my dream of playing Mario Kart inside of the Google Maps dataset.

deepmind.google/discover/blo...

Genie 2: A large-scale foundation world model

Generating unlimited diverse training environments for future general agents

deepmind.google

December 4, 2024 at 6:44 PM

Ted

@edwardbenson.bsky.social

That ten minute dance scene in Wicked captured the entire range of high school emotions better than anything that’s ever been filmed.

December 4, 2024 at 1:54 AM

Ted

@edwardbenson.bsky.social

The era of GreaseMonkey returns!

github.com/steel-dev/st...

GitHub - steel-dev/steel-browser: 🚧 Open Source Browser API for AI Agents & Apps. Steel Browser is a batteries-included browser instance that lets you build automate the web without worrying about inf...

github.com

December 2, 2024 at 1:16 PM

Ted

@edwardbenson.bsky.social

November 26, 2024 at 9:21 PM

Ted

@edwardbenson.bsky.social

November 26, 2024 at 9:11 PM

Ted

@edwardbenson.bsky.social

November 26, 2024 at 9:06 PM

Ted

@edwardbenson.bsky.social

Hats off to the team that built Shopify's "Collaborators" feature.

Fine-grained permissions with OAuth is a tornado of pain even for developers.

Shopify nailed it -- and for non-developers no less!

November 26, 2024 at 8:01 PM

Ted

@edwardbenson.bsky.social

True alignment would be LLMs making us listen to a 15 minute story about weekends at grandma’s house before giving us the recipe for blueberry cobbler.

- The Recipe Blog Lobby

November 24, 2024 at 7:25 PM

Ted

@edwardbenson.bsky.social

Teams working on OCR and Translation must be in such an odd spot right now...

LLMs don't yet universally outperform those task-specific models.. but it's pretty clear that they're on a path to.

.. So does Big Tech just freeze those products in place to wait?

November 24, 2024 at 4:38 PM

Ted

@edwardbenson.bsky.social

There's an interesting experiment waiting to be done w/ LLMs and OCR.

When OCRing a full page of text w/ an LLM, it can go off the rails, and when it does, it usually stays off the rails.

Feels like an interesting substrate to create experiments to study hallucination.

November 24, 2024 at 3:50 PM

Ted

@edwardbenson.bsky.social

AI coworkers will interact on software timescales (immediate results), but also human timescales (I'll let you know by end-of-day)

We've been doing "agent progress bar" experiments at @everpilotapp that let you know what's happening & also invite you to collab with the agent.

November 20, 2024 at 5:15 PM

Ted

@edwardbenson.bsky.social

Forming a patent troll company filled with ML engineers and designers feels like an oddly high ROI endeavor at this moment in time.

I hope that’s not happening right now..

November 20, 2024 at 2:44 PM

Ted

@edwardbenson.bsky.social

The starter packs and feeds are awesome.

I think the final step in empowering users would be tags on posts that work with feeds.

I could auth a 3rd party service to tag posts for me, then up/down rank them in my own feed.

Keep $USER but not their rants about $TOPIC

November 20, 2024 at 2:21 PM
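
A rough Python sketch of the tag-based ranking idea in the post above, under loud assumptions: `Post`, `rank_feed`, and the tag names are all hypothetical, and no real Bluesky API is used. It assumes a third-party service has already attached tags to each post and that the user supplies a weight per tag.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Post:
    author: str
    text: str
    # Hypothetical: labels attached by a third-party tagging service.
    tags: Set[str] = field(default_factory=set)


def rank_feed(
    posts: List[Post], tag_weights: Dict[str, float], followed: Set[str]
) -> List[Post]:
    """Keep posts from followed authors, drop net-negative ones, sort the rest by tag score."""
    scored = []
    for post in posts:
        if post.author not in followed:
            continue
        # Untagged or unknown tags contribute 0, so they neither help nor hurt.
        score = sum(tag_weights.get(tag, 0.0) for tag in post.tags)
        if score < 0:
            continue  # "Keep $USER but not their rants about $TOPIC"
        scored.append((score, post))
    # Python's sort is stable, so posts with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [post for _, post in scored]


if __name__ == "__main__":
    feed = [
        Post("alice", "New paper on agents", {"ai"}),
        Post("alice", "Long rant", {"politics"}),
        Post("bob", "Lunch photo"),
    ]
    weights = {"ai": 1.0, "politics": -1.0}
    for post in rank_feed(feed, weights, followed={"alice", "bob"}):
        print(post.author, "-", post.text)
```

The weights live with the user rather than the tagging service, so the same tags could rank a post up in one feed and down in another.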

Ted

@edwardbenson.bsky.social

I will probably regret this, but..

Here is an agent that can negotiate prices, make package deals, and actually sell you candy online:

hawke.bot

Who wants to be the first human in history to buy something from an AI street hawker?

Blog post about it:

edwardbenson.com/2024/11/the-...

November 19, 2024 at 4:13 PM
