manuelsh.github.io
See my latest post: manuelsh.github.io/blog/2025/ra...
manuelsh.github.io/blog/2025/un...
#LLM #AI #ML
manuelsh.github.io/blog/2025/la...
According to benchmarks, the best model (Gemini 2.0 Flash-001) has a 0.7% hallucination rate. Of course, this depends on the task, context, etc.; real-world rates can be lower or higher. (1/3)
New post: manuelsh.github.io/blog/2025/datasets-for-advancing-Theoretical-Physics/
manuelsh.github.io/blog/2025/Se...
This establishes a difficult-to-beat benchmark for the efficiency of intelligence.
I hope the folks at NeurIPS publish it soon.
Practical tips, key takeaways, and insights all in one place! 🚀
Dive in:
manuelsh.github.io/blog/2025/NI...