Ilia
ilia-i.bsky.social
AI/ML Engineer

🇪🇺
OpenAI released an eval comparing LLMs to human experts across a broad range of fields. Going by OpenAI's own results, Claude seems to be surpassing GPT-5

Full study here: cdn.openai.com/pdf/d5eb7428...
September 26, 2025 at 12:34 PM
With the previous DeepSeek version, I was able to get around the censorship simply by prompting in Bulgarian (a language with fairly low representation), even getting it to criticize the Chinese leadership. DeepSeek-V3.1 no longer indulges me.
August 21, 2025 at 9:05 AM
I'm usually not impressed by demos I can't interact with, but Genie 3 is... wow youtu.be/PDKhUknuQDg?...

#genie3
Genie 3: Creating dynamic worlds that you can navigate in real-time
YouTube video by Google DeepMind
youtu.be
August 6, 2025 at 5:34 PM
Gave uv a try. Sorry pip, you served me well but I'm not looking back. The speed difference is absolutely wild

#python
July 10, 2025 at 3:10 PM
Is "context engineering" a new term taking over from "prompt engineering"? Or is it more about multi-agent management vs single-prompting? Not sure, but I welcome it either way
June 26, 2025 at 8:04 AM
I'm not an Android developer, but I built a dumb app idea for something I couldn't find in the store, and I've actually been using it. Yay for vibe coding!
May 8, 2025 at 1:51 PM
#Llama 4 was hyped up, especially by the open source community, but it does seem underwhelming. Looks like #Meta went all in on scaling the number of parameters and it's not yielding the results they hoped for
April 17, 2025 at 5:07 AM
After playing a bit with #gemini, I have to say I'm impressed by its multimodal capabilities. First time I've felt that a model is both easy to experiment with and useful with video input, without the need for additional preprocessing
March 21, 2025 at 8:28 PM
Finally got around to reading the #DeepSeek paper. I found the part about FP8 mixed-precision training the most fascinating - sounds like the authors had to jump through several hoops to make it work. I'm sure #nvidia is taking notes for their future GPUs.
DeepSeek-V3 Technical Report
arxiv.org
February 3, 2025 at 5:46 PM
Why is it such a pain to get Nvidia drivers running properly on Ubuntu... After battling with drivers and breaking my system several times already 🤬 I'm starting to consider switching OS
January 20, 2025 at 9:44 AM
Been trying to set up structured output with a locally hosted model on vLLM. Shout-out to xgrammar (github.com/mlc-ai/xgram...) for succeeding where other libraries have failed
January 7, 2025 at 2:37 PM
What does 2025 hold for the field of AI?

I'm thinking incremental improvements - multimodal models that combine audio+text+images, better video generation, and models that keep getting smaller and more optimized. I don't buy the promise of AGI.
January 2, 2025 at 7:03 AM