The same happens to LLMs. They are not thinking machines, under any definition of thinking we might agree on. The reduction to next-token prediction (NTP) shows that clearly.
The sooner we understand the limits of LLMs, the sooner we'll learn to deploy them properly.
I guess the intrinsic variance is what dominates here. We can only know so much about an embryo by just looking at it.
1. A good open model 🍞
2. A properly tuned RAG pipeline 🍒
And you'll be cooking a five star AI system ⭐ ⭐ ⭐ ⭐ ⭐ (rough sketch below)
See you at the AIAI 2025 conference, where we will be presenting this work, done at @bsc-cns.bsky.social and @hpai.bsky.social
buff.ly/3Xa9gFc
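For the curious, here is a minimal sketch of what such a pipeline can look like: embed the documents, retrieve the top-k most similar ones, and ground the model's answer in them. The `embed` and `generate` functions are hypothetical placeholders for your embedding model and open LLM, not anything from the paper.

```python
# Minimal RAG sketch, assuming hypothetical `embed` and `generate`
# functions standing in for an embedding model and an open LLM.
import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    """Placeholder: return one embedding vector per input text."""
    raise NotImplementedError

def generate(prompt: str) -> str:
    """Placeholder: call the open model with `prompt` and return its reply."""
    raise NotImplementedError

def rag_answer(question: str, docs: list[str], k: int = 3) -> str:
    """Retrieve the k most similar docs and ground the answer in them."""
    doc_vecs = embed(docs)
    q_vec = embed([question])[0]
    # Cosine similarity between the question and every document.
    sims = doc_vecs @ q_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-9
    )
    # Keep the top-k documents, best match first, as the grounding context.
    context = "\n\n".join(docs[i] for i in np.argsort(sims)[-k:][::-1])
    prompt = f"Answer using only this context:\n{context}\n\nQ: {question}\nA:"
    return generate(prompt)
```

The "properly tuned" part of the recipe lives in the knobs this sketch hides: chunking, k, the embedding model, and the prompt template.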
I'm an empiricist, so we attack the metrics by developing adversarial benchmarks that expose model shortcuts. Plus, it's a lot of fun to show how fragile LLMs can be.
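As an illustration of the idea (not our actual benchmark), here is a minimal sketch of one common probe: shuffle the answer options of a multiple-choice item and check whether the model's answer, mapped back to the original labels, stays put. A flip suggests the model learned a positional shortcut rather than the task. `query_model` is a hypothetical stand-in for whatever LLM call you use.

```python
# Minimal adversarial-benchmark probe: a robust model should be
# invariant to answer-option order on a multiple-choice item.
import random

def query_model(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its chosen letter."""
    raise NotImplementedError

def shuffle_options(question: str, options: list[str]) -> tuple[str, dict]:
    """Shuffle the options; return the prompt and a new-to-original letter map."""
    letters = "ABCD"
    order = list(range(len(options)))
    random.shuffle(order)
    remap = {letters[new]: letters[old] for new, old in enumerate(order)}
    body = "\n".join(f"{letters[i]}) {options[j]}" for i, j in enumerate(order))
    return f"{question}\n{body}\nAnswer with a single letter.", remap

def is_shortcut_sensitive(question: str, options: list[str], trials: int = 5) -> bool:
    """Flag the item if the model's (remapped) answer changes across shuffles."""
    answers = set()
    for _ in range(trials):
        prompt, remap = shuffle_options(question, options)
        answers.add(remap.get(query_model(prompt).strip(), "?"))
    return len(answers) > 1
```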