Labels such as "facts", "observations", and "assertions" take on new meanings when we begin to consider time. Click 👇 to watch the full episode! 🎙️
youtu.be/VpFVAE3L1nk?
Spotify: open.spotify.com/episode/1udV...
Apple: podcasts.apple.com/us/podcast/a...
You want layers of tools aligned in a graph that you can tune, debug, and update in isolation.
Today on How AI Is Built, we are talking to one of the OGs of search: Trey Grainger, the author of AI-Powered Search.
www.youtube.com/watch?v=6IQq...
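To make "layers of tools aligned in a graph" concrete, here's a minimal sketch (the stage names and toy logic are mine, not from the episode): every layer is a plain function you can test, tune, and swap in isolation.

```python
from typing import Callable, Dict, List

# Each stage is a plain function: tune, debug, or replace it in isolation.
Stage = Callable[[Dict], Dict]

def retrieve(ctx: Dict) -> Dict:
    # stand-in for lexical or vector retrieval
    ctx["candidates"] = [d for d in ctx["corpus"] if ctx["query"] in d]
    return ctx

def dedupe(ctx: Dict) -> Dict:
    ctx["candidates"] = list(dict.fromkeys(ctx["candidates"]))
    return ctx

def rank(ctx: Dict) -> Dict:
    # stand-in for a real ranking model; shorter matches first
    ctx["results"] = sorted(ctx["candidates"], key=len)
    return ctx

def run(pipeline: List[Stage], ctx: Dict) -> Dict:
    for stage in pipeline:
        ctx = stage(ctx)
    return ctx

ctx = {"query": "shoe", "corpus": ["running shoes", "shoe care", "hats", "shoe care"]}
print(run([retrieve, dedupe, rank], ctx)["results"])  # ['shoe care', 'running shoes']
```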
Throw everything in a vector database and hope something good comes out.
Throw all ranking signals into one big ML model and hope it produces something good.
You don’t want to create this witch’s cauldron.
New episode on @howaiisbuilt.fm
The three contexts of search, layered architectures and much more!
Spotify: open.spotify.com/episode/6eyT...
Apple: podcasts.apple.com/us/podcast/c...
The reality is that it's easy to build, it's easy to get up and running, but it's really hard to get right.
And if you don't have a good setup, it's near impossible to debug.
One of the reasons it's really hard is chunking.
So, use an AI model to train an AI model.
The big labs like Cohere and OpenAI already use “synthetic data” - AI-generated data that mimics real-world patterns.
The LLMs you use are already trained on it.
youtu.be/thqgKG5lZ8Q
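The core loop behind that is simple. A rough sketch (the `generate` call is a stand-in for whatever LLM client you use, not a real API):

```python
# Sketch: use an LLM to synthesize (query, passage) training pairs
# for a retriever or embedding model. `generate` is hypothetical.
def generate(prompt: str) -> str:
    return "placeholder query"  # swap in a real LLM call here

def synthesize_pairs(passages: list[str]) -> list[dict]:
    pairs = []
    for passage in passages:
        query = generate(f"Write a search query this passage answers:\n{passage}")
        pairs.append({"query": query, "positive": passage})
    return pairs

print(synthesize_pairs(["BM25 ranks documents by term frequency and IDF."]))
```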
Spotify: open.spotify.com/episode/3LAJ...
Apple: podcasts.apple.com/us/podcast/a...
What's your take on agentic RAG?
youtu.be/Z9Z820HadIA
Different questions need different approaches.
➡️ 𝗤𝘂𝗲𝗿𝘆-𝗕𝗮𝘀𝗲𝗱 𝗙𝗹𝗲𝘅𝗶𝗯𝗶𝗹𝗶𝘁𝘆:
- Structured data? Use SQL
- Context-rich query? Use vector search
- Date-specific? Apply filters first
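A toy version of that routing logic (regex rules as a stand-in for a real intent classifier; the category names are made up):

```python
import re

def route_query(query: str) -> str:
    # Date-specific? Narrow by time before anything else.
    if re.search(r"\b(19|20)\d{2}\b|last (week|month|year)", query, re.I):
        return "filter_then_search"
    # Aggregation over structured data? Send it to SQL.
    if re.search(r"\b(count|sum|average|top \d+)\b", query, re.I):
        return "sql"
    # Everything else: context-rich natural language -> vector search.
    return "vector_search"

print(route_query("top 10 customers by revenue"))   # sql
print(route_query("news from last month"))          # filter_then_search
print(route_query("how do I debug flaky tests?"))   # vector_search
```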
open.spotify.com/episode/4CXX...
ParadeDB is building an open-source PostgreSQL extension to enable search within your database.
Today on How AI Is Built, I am talking to @philippemnoel.bsky.social , the founder and CEO of @paradedb.bsky.social.
youtu.be/RPjGuOcrTsQ
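Roughly what that looks like in practice (a hedged sketch: pg_search's operators and DDL have changed across versions, so verify against the current ParadeDB docs; the table and column names here are made up):

```python
import psycopg2

# Query an existing Postgres table with pg_search's full-text match.
# Assumed syntax: the @@@ operator comes from the pg_search extension.
conn = psycopg2.connect("dbname=mydb")
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT id, description
        FROM items
        WHERE description @@@ 'running shoes'
        LIMIT 10;
    """)
    for row in cur.fetchall():
        print(row)
conn.close()
```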
On top of that, they have to build ETL pipelines.
Get data normalized.
Worry about race conditions.
All in all, when you want to do search on top of your existing database, you are forced to build distributed systems.
#ai
Check out the episode with Max.
Links to Spotify, Apple in the thread.
You can find the episode down below!
♻️ Repost this if you know someone struggling with RAG ♻️
Youtube: youtu.be/RtJY6sIQqcY
We want to put the blame on them.
But often it’s our fault.
Many knowledge bases have:
→ Temporal Inconsistencies
- Multiple versions from different time periods
- Historical information without timeline context
>>
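One cheap guard against the version problem, as a sketch (field names like `doc_id` and `updated_at` are assumptions about your schema): keep only the newest version of each document before chunking.

```python
# Drop stale document versions before chunking and indexing.
docs = [
    {"doc_id": "pricing", "updated_at": "2023-01-10", "text": "Plan costs $10."},
    {"doc_id": "pricing", "updated_at": "2024-06-01", "text": "Plan costs $15."},
    {"doc_id": "faq",     "updated_at": "2024-02-20", "text": "..."},
]

latest = {}
for d in docs:
    cur = latest.get(d["doc_id"])
    if cur is None or d["updated_at"] > cur["updated_at"]:  # ISO dates sort lexically
        latest[d["doc_id"]] = d

print([d["text"] for d in latest.values()])  # only the 2024 pricing survives
```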
We do not look at full documents anymore, but at bits and pieces.
So we have to be extra careful.
Today on @howaiisbuilt.fm we talk to Max Buckley.
Max works at Google and has built a lot of interesting stuff with LLMs to improve knowledge bases for RAG.
>>
It is very costly in terms of storage and compute. We have to keep our indexes in memory to achieve a low enough latency for search.
What we are talking about today works for everything, works out of domain, and is one of the most efficient approaches out there.
>>
But vector search has a lot of downsides.
Vector search is not robust out of domain.
Different types of queries need different embedding models with different vector indices.
>>
Today we are back continuing our series on search on @howaiisbuilt.fm with @taidesu.bsky.social.
We talk about BM25, how it works, what makes it great, and how you can tailor it to your use case.
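For reference, this is roughly the computation, with the two knobs you tune: k1 (term-frequency saturation) and b (length normalization). A minimal sketch; production implementations like Lucene differ in details such as IDF smoothing.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    # k1 controls term-frequency saturation, b controls length normalization
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)          # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)   # smoothed IDF
        f = tf[term]
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

# toy corpus: documents as token lists
corpus = [
    ["cheap", "running", "shoes"],
    ["running", "a", "marathon"],
    ["shoes", "on", "sale"],
]
for d in corpus:
    print(d, round(bm25_score(["running", "shoes"], d, corpus), 3))
```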
Text embeddings are far from perfect.
They struggle with long documents.
>>
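One standard mitigation is splitting long documents into overlapping windows before embedding. A minimal sketch (the window sizes here are arbitrary; tune them to your model):

```python
def chunk(tokens: list, size: int = 256, overlap: int = 32) -> list:
    # Overlapping windows so no sentence is stranded at a hard boundary.
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

print(len(chunk(list(range(1000)))))  # 1000 tokens -> 5 windows
```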
The data is too large to be stored on a single node.
We often need to handle 10k to 50k QPS.
Indexes are very slow to build, but we still want to search the fresh data.
>>
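The usual answer to the first two problems is sharding with scatter-gather: every shard returns its local top-k and a coordinator merges them. A toy sketch (shard contents and scores are made up):

```python
import heapq

shards = [
    [("doc1", 0.9), ("doc4", 0.4)],
    [("doc2", 0.8), ("doc5", 0.3)],
    [("doc3", 0.7), ("doc6", 0.2)],
]

def local_top_k(shard, k):
    return heapq.nlargest(k, shard, key=lambda hit: hit[1])

def scatter_gather(shards, k=3):
    # fan out to every shard, then merge the partial results
    hits = [hit for shard in shards for hit in local_top_k(shard, k)]
    return heapq.nlargest(k, hits, key=lambda hit: hit[1])

print(scatter_gather(shards))  # [('doc1', 0.9), ('doc2', 0.8), ('doc3', 0.7)]
```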
Catch the episode on:
- Youtube: youtu.be/3PEARAf7HEc (now in 4K :D)
- Spotify: open.spotify.com/episode/5lCl...
- Apple: podcasts.apple.com/us/podcast/v...
Every performance optimization comes with tradeoffs in either functionality, flexibility, or cost.
When building search systems, there's a seductive idea that we can optimize everything: fast results, high relevancy, and low costs.
But that’s not the reality.