Dario
dariogargas.bsky.social
Senior AI researcher at BSC. Random thinker at home.
Biology needs to be reduced to fundamental components so that mysticism and religion do not corrupt its might.
The same applies to LLMs. These are not thinking machines under any definition of thinking we may agree on, and the reduction to next-token prediction (NTP) clearly shows that.
May 21, 2025 at 8:22 AM
Mostly agree here, though I would rather use the word mimic than persuade. Persuade entails a purpose, which I'm not sure LLMs have. That is, does a mathematical function have a purpose?
May 21, 2025 at 8:11 AM
Exactly! The most effective control measure, RAG, is still a patch that can provide no technical guarantee. Just a strong bias that models may not follow.

The sooner we understand the limits of LLMs, the sooner we'll learn to deploy them properly.
May 21, 2025 at 8:08 AM
Though both data sources have the same origin (visual inspection of embryo change), I'd expect features found by humans and features found by a neural net to be complementary.

I guess the intrinsic variance is what dominates here. We can only know so much about an embryo by just looking at it.
April 11, 2025 at 2:35 PM
The recipe is simple 🧑‍🍳 :
1. A good open model 🍞
2. A properly tuned RAG pipeline 🍒

And you will be cooking a five-star AI system ⭐ ⭐ ⭐ ⭐ ⭐

See you at the AIAI 2025 conference, where we will be presenting this work, done at @bsc-cns.bsky.social and @hpai.bsky.social
April 4, 2025 at 2:35 PM
Disclaimer: Only text questions were used to evaluate LLMs, unlike students. Students' scores were computed under the assumption that all questions were answered, which may not be the case.
buff.ly/3Xa9gFc
March 3, 2025 at 3:35 PM
I like the cell one 👍

I'm an empiricist, so we attack metrics by developing adversarial benchmarks that expose model shortcuts. Plus, it's a lot of fun to show how fragile LLMs can be.
February 23, 2025 at 6:26 PM
While writing a paper I consistently learn general insights that are too general or not tested enough to be sold as paper contributions, but are great for conversation :)
February 23, 2025 at 9:13 AM
Remarkable effort. Questionable motivation.
February 19, 2025 at 8:45 PM