1. Build a simple benchmark to evaluate the performance of your models
2. How a single in-context examples allowed 4o-mini to out perform 4o
3. How to simple improve model quality, and latency at the same time.
Check it out!
www.limai.io/blog/example
1. Build a simple benchmark to evaluate the performance of your models
2. How a single in-context examples allowed 4o-mini to out perform 4o
3. How to simple improve model quality, and latency at the same time.
Check it out!
www.limai.io/blog/example
Check out our full notebook for the code, results, and how we caught hallucinated outputs: github.com/limai-io/de...
Or let’s chat! DM me or email bruno@limai.io to discuss how we can help build robust pipelines for your business. 🚀
Check out our full notebook for the code, results, and how we caught hallucinated outputs: github.com/limai-io/de...
Or let’s chat! DM me or email bruno@limai.io to discuss how we can help build robust pipelines for your business. 🚀
Vision-based models are powerful, but validation frameworks are critical for reliable results.
💡 If you’re building data pipelines, combine extraction with validation to ensure accuracy and trust.
Vision-based models are powerful, but validation frameworks are critical for reliable results.
💡 If you’re building data pipelines, combine extraction with validation to ensure accuracy and trust.
✅ Vision models like Gemini handled layouts flexibly.
✅ Validation caught hallucinations and ensured data accuracy.
✅ Trustworthiness increased for complex documents like utility bills.
✅ Vision models like Gemini handled layouts flexibly.
✅ Validation caught hallucinations and ensured data accuracy.
✅ Trustworthiness increased for complex documents like utility bills.
• Extract raw text using a PDF reader.
• Validate each extracted value (e.g., “160.69 €”) by searching for it in the raw text.
• Flag values that don’t match as potential hallucinations.
• Extract raw text using a PDF reader.
• Validate each extracted value (e.g., “160.69 €”) by searching for it in the raw text.
• Flag values that don’t match as potential hallucinations.
1️⃣ Vision-based extraction to handle complex layouts.
2️⃣ Instructor-powered validation to cross-check extracted values against raw text from PDFs.
This ensured data was grounded in reality, not hallucinated.
1️⃣ Vision-based extraction to handle complex layouts.
2️⃣ Instructor-powered validation to cross-check extracted values against raw text from PDFs.
This ensured data was grounded in reality, not hallucinated.
E.g., instead of extracting "2.983 kW" for contracted power, the model returned "2.0 kW"—a made-up value. 😬
How do we prevent this?
E.g., instead of extracting "2.983 kW" for contracted power, the model returned "2.0 kW"—a made-up value. 😬
How do we prevent this?
These models handle complex layouts, tables, and multimodal inputs natively—far beyond what rule-based parsing can achieve. But they also have challenges.
These models handle complex layouts, tables, and multimodal inputs natively—far beyond what rule-based parsing can achieve. But they also have challenges.
We’re offering free consulting calls this month to help businesses optimize their AI strategies.
📩 bruno@limai.io or DM me!
We’re offering free consulting calls this month to help businesses optimize their AI strategies.
📩 bruno@limai.io or DM me!
✅ How we built a simple test dataset to evaluate accuracy.
✅ Why adding examples worked so well (and why you should try it).
✅ How this influenced our product's UX/UI strategy.
✅ How we built a simple test dataset to evaluate accuracy.
✅ Why adding examples worked so well (and why you should try it).
✅ How this influenced our product's UX/UI strategy.
• With a small model plus the example, accuracy leaped from 61% to 97%.
• We achieved this without fine-tuning or complex parsing techniques.
• With a small model plus the example, accuracy leaped from 61% to 97%.
• We achieved this without fine-tuning or complex parsing techniques.