Derek Abdine
@dabdine.bsky.social
CEO furl.ai. Previously CTO @censys, Head of Labs @rapid7
Fair enough. Way more conspiracy theories on X these days than there used to be, though. Also had about 11 bots follow me yesterday after a single tweet. One created a meme coin for my startup. Maybe dead internet theory is actually real…
January 3, 2025 at 1:55 AM
Perfectly describes the current state of Xhitter
January 3, 2025 at 1:50 AM
They’re waiting for you, Gordon. In the tessssssst chamberrrrrr.
December 14, 2024 at 2:43 AM
I keep forgetting they’re still doing this
December 14, 2024 at 1:57 AM
Manual. YMMV with prepared stuff like AutoGPT, but base LLMs at a fundamental level are just token emitters, so you have to string them together with other stuff to make them useful. Like a brain without a body.
December 13, 2024 at 8:57 PM
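A minimal sketch of that stringing-together, assuming the OpenAI Python SDK and a toy get_time tool (the tool and wiring here are illustrative, not furl's code): the model only ever emits tokens; the loop around it is the body.

```
# Minimal sketch: the model only emits tokens (including "please call this
# tool" tokens); the surrounding loop is what actually does anything.
from datetime import datetime, timezone
from openai import OpenAI

client = OpenAI()

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_time",
        "description": "Return the current UTC time as an ISO-8601 string.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

def get_time() -> str:
    return datetime.now(timezone.utc).isoformat()

messages = [{"role": "user", "content": "What time is it right now (UTC)?"}]
resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=TOOLS)
msg = resp.choices[0].message

if msg.tool_calls:  # the model *asked* for the tool; our code runs it
    messages.append(msg)
    for call in msg.tool_calls:
        messages.append({"role": "tool", "tool_call_id": call.id, "content": get_time()})
    resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=TOOLS)

print(resp.choices[0].message.content)
```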
Another fun thought: I could give furl an agent that knows how furl itself is designed, its code framework, etc., and make it self-generate new agents and tools in case it can’t accomplish a task itself. Even an agent/tool to (re)train its own model.
December 13, 2024 at 7:55 PM
- Anthropic released a computer-use model, which seems to rely on tool calls combined with image processing (which already existed).

To name a few. In other words, innovation now seems to be on price per token and specific applications rather than on overall accuracy of base models.
December 13, 2024 at 7:38 PM
AI layer to research details about software (vendor website, docs, etc.) the way a human could, but without it taking forever. Useful for remediation to have all the details about a particular software/package/whatever available when deciding what to do.
December 13, 2024 at 7:27 PM
This setup is used as the backing AI to furl.ai’s autonomous patching. We expose it all as a REST API internally to our other services, which rely on our AI layer to gen the scripts/instructions/research details on software for us (software inventory info databases suck, so we also use our 1/2
December 13, 2024 at 7:27 PM
For executing scripts we basically just boot a clean macOS / Windows / Linux (RHEL, Ubuntu) host, ship the script, execute it, and return stdout/stderr. Lots of ways to do that (some cheaper than others). 2/2
December 13, 2024 at 7:23 PM
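One of the “lots of ways”: ship the generated script to a throwaway host over SSH and capture stdout/stderr. A sketch using paramiko; the hostname, user, key path, and script here are placeholders.

```
import paramiko

def run_on_runner(host: str, user: str, key_path: str, script: str):
    """Copy a generated script to a clean runner host, execute it, and
    return (stdout, stderr, exit_code). One approach among many."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=user, key_filename=key_path)
    try:
        sftp = client.open_sftp()
        with sftp.open("/tmp/task.sh", "w") as f:
            f.write(script.encode())
        sftp.chmod("/tmp/task.sh", 0o755)
        _, stdout, stderr = client.exec_command("bash /tmp/task.sh", timeout=300)
        out, err = stdout.read().decode(), stderr.read().decode()
        return out, err, stdout.channel.recv_exit_status()
    finally:
        client.close()

out, err, code = run_on_runner("runner-ubuntu.internal", "runner",
                               "/home/ops/.ssh/id_ed25519",
                               "#!/bin/bash\napt-get --version")
```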
Nope, those tools were built by us in-house. You can use ScraperAPI or other headless-browser scraping services for content extraction (note: this is a slightly dumb way to do it; there are more intelligent ways to extract text from websites). 1/2
December 13, 2024 at 7:23 PM
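For reference, the “slightly dumb” version of content extraction looks something like this (requests + BeautifulSoup; the user-agent string is made up). Smarter approaches render JavaScript and strip boilerplate properly.

```
import requests
from bs4 import BeautifulSoup

def web_scrape(url: str) -> str:
    # Fetch raw HTML and flatten it to whitespace-normalized text.
    html = requests.get(url, timeout=30,
                        headers={"User-Agent": "research-bot/0.1"}).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()  # drop non-content elements
    return " ".join(soup.get_text(separator=" ").split())
```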
to use with the web_scrape tool. If we find that it isn't doing that well enough, we can make a google_search agent (agents have a system prompt, samples, their own model, etc. that tools don't have; tools are just functions) that is specialized for this task. 5/5
December 13, 2024 at 5:53 PM
The research_from_internet tool actually calls our "internet_researcher" agent, which itself has web_scrape and search_google tools. The former uses services to extract text from rendered websites; the latter uses Google's Custom Search API. internet_researcher must also gen search terms 4/5
December 13, 2024 at 5:53 PM
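A sketch of what a search_google tool could look like against Google's Custom Search JSON API (the endpoint and params are the real API; the key and engine-id values would be yours):

```
import requests

def search_google(query: str, api_key: str, cx: str, n: int = 5) -> list[dict]:
    # Google Custom Search JSON API: returns title/link/snippet per result.
    r = requests.get(
        "https://www.googleapis.com/customsearch/v1",
        params={"key": api_key, "cx": cx, "q": query, "num": n},
        timeout=30,
    )
    r.raise_for_status()
    return [{"title": i["title"], "link": i["link"], "snippet": i.get("snippet", "")}
            for i in r.json().get("items", [])]
```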
For example, the "upgrade_script_developer" agent uses OpenAI's base gpt-4o model, but itself knows about two tools: execute_script_on_runner and research_from_internet. The execute_script_on_runner tool runs an LLM-generated script on a host and simply returns the response. 3/5
December 13, 2024 at 5:53 PM
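The two tool names above come straight from the thread; how they're declared to gpt-4o isn't shown, but with OpenAI function calling it would look roughly like this (the schemas are guesses):

```
# Hypothetical tool schemas for the upgrade_script_developer agent.
UPGRADE_DEV_TOOLS = [
    {"type": "function", "function": {
        "name": "execute_script_on_runner",
        "description": "Run a script on a clean host; returns stdout/stderr.",
        "parameters": {"type": "object", "properties": {
            "os": {"type": "string", "enum": ["macos", "windows", "rhel", "ubuntu"]},
            "script": {"type": "string"}},
            "required": ["os", "script"]},
    }},
    {"type": "function", "function": {
        "name": "research_from_internet",
        "description": "Delegate a question to the internet_researcher agent.",
        "parameters": {"type": "object", "properties": {
            "question": {"type": "string"}},
            "required": ["question"]},
    }},
]
```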
with its own system prompt and tool knowledge. Each agent can be configured to use its own model if we want (though we don't right now). When we build out a new agent, we can make the agent use other agents to achieve its goal. 2/5
December 13, 2024 at 5:53 PM
We use OpenAI's base models with RAG (later, fine-tuned) essentially. So, in this case gpt-4o. Our "cognition" framework (which follows the NVIDIA blog post) contains agents and tools. Agents know about tools. Agents can be tools themselves. So basically each agent is the specialist 1/5
December 13, 2024 at 5:53 PM
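The agents-know-tools / agents-can-be-tools idea, reduced to a skeleton (the names come from the thread; the class design itself is an assumption, not furl's code):

```
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    name: str
    system_prompt: str
    model: str = "gpt-4o"  # each agent can pin its own model
    tools: dict[str, Callable[..., str]] = field(default_factory=dict)

    def run(self, task: str) -> str:
        # Real version: chat loop over self.model with tool-call dispatch.
        raise NotImplementedError

    def as_tool(self) -> Callable[[str], str]:
        # An agent exposed as a plain function is just another tool.
        return self.run

researcher = Agent("internet_researcher", "You research software on the web.",
                   tools={"web_scrape": lambda url: "",
                          "search_google": lambda q: ""})
developer = Agent("upgrade_script_developer", "You write upgrade scripts.",
                  tools={"research_from_internet": researcher.as_tool()})
```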
Right now we just use OpenAI, though our design allows us to plug any LLM in (we have support for Gemini, Azure OpenAI, Grok, and Anthropic). Very few support tool calls, and for those that do, I still haven’t seen accuracy or reliability as high as OpenAI’s. Tool calls can be added to any LLM tho.
December 13, 2024 at 5:41 PM
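The “can be added to any LLM” part works by pushing the tool protocol into the prompt and parsing the reply yourself. A rough sketch, where complete() stands in for any text-in/text-out model call:

```
import json

PROTOCOL = """You have one tool:
  web_scrape(url) -> page text
Reply with ONLY JSON: {"tool": "web_scrape", "args": {"url": "..."}}
or {"final": "<your answer>"}."""

def agent_step(complete, history: str) -> dict:
    raw = complete(PROTOCOL + "\n\n" + history)
    try:
        return json.loads(raw)  # either a tool request or a final answer
    except json.JSONDecodeError:
        return {"final": raw}   # model broke protocol; treat text as the answer
```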
More or less implement the components here, though the agent graph is not detailed:

developer.nvidia.com/blog/introdu...
Introduction to LLM Agents | NVIDIA Technical Blog
December 13, 2024 at 5:38 PM
Haven’t written a guide, but open to doing that. LangGraph may be the closest framework to what we’ve built.

Most of what we have now is the culmination of trial & error + arXiv papers + blog posts + security/scanning backgrounds + some major conceptual contributions from our former chief of AI
December 13, 2024 at 5:34 PM
Definitely is. I’ve found accuracy improves greatly as you add more “specialists” that work in concert with each other (i.e., a true multi-agent architecture), not just tools and not just prompt engineering. Accuracy scales fairly well and much faster than with prompt tweaks alone.
December 13, 2024 at 5:25 AM
Dunno. I’ve built one that uses agents to reason through creating upgrade scripts; it works by giving it access to search Google, scrape content from websites, and execute stuff in a sandbox. If it fails it’ll correct itself and try again. Knowing when to stop is key, tho not hard for narrow use cases
December 13, 2024 at 4:12 AM
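The shape of that loop, with the stop condition made explicit (generate_script and run_sandboxed stand in for the LLM call and the sandbox runner; the attempt bound is arbitrary):

```
MAX_ATTEMPTS = 5

def develop_upgrade_script(task, generate_script, run_sandboxed):
    feedback = ""
    for attempt in range(MAX_ATTEMPTS):
        script = generate_script(task, feedback)
        out, err, code = run_sandboxed(script)
        if code == 0:
            return script  # it works: stop
        # Feed the failure back so the next attempt can self-correct.
        feedback = f"Attempt {attempt + 1} failed (exit {code}):\n{err[-2000:]}"
    return None  # knowing when to stop: give up after a fixed budget
```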
Yep. Basically run the original request and response through a “critic” which attempts to refute hallucinated bullshit. LLMs are pretty damn good at text extraction, so you are sort of leaning on that to provide some level of error correction.
December 13, 2024 at 3:46 AM
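A sketch of that critic pass; llm() stands in for whatever model call you use, and the prompt wording is illustrative:

```
CRITIC_PROMPT = """You are a strict fact-checker. Given a REQUEST, SOURCE TEXT,
and DRAFT ANSWER, remove or flag any claim in the answer that is not supported
by the source text. Return only the corrected answer."""

def criticize(llm, request: str, source: str, draft: str) -> str:
    # Second pass over the original request + response to refute hallucinations.
    return llm(f"{CRITIC_PROMPT}\n\nREQUEST:\n{request}\n\n"
               f"SOURCE TEXT:\n{source}\n\nDRAFT ANSWER:\n{draft}")
```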