Florian Hölzl
florianhoelzl.bsky.social
@florianhoelzl.bsky.social
PhD Student
@tum-aim-lab.bsky.social @hpi.bsky.social
🇪🇺
For a bachelor's project I think this can be fair. You don’t have much expertise yet, and it’s more about showing your motivation and drive. Don’t be too hard on yourself.
November 19, 2025 at 9:34 PM
Congratulations @g-k.ai!
November 3, 2025 at 2:03 PM
I can also recommend @jbhuang0604.bsky.social for DL content.
October 9, 2025 at 8:25 PM
„revolutionized vision models“ 👀
September 22, 2025 at 10:16 AM
Great update!
August 28, 2025 at 12:48 PM
I guess they are using MCP jan.ai/docs/mcp-exa...
But I agree, a straightforward how-to would be beneficial.
Exa Search MCP - Jan
Connect Jan to real-time web search with Exa's AI-powered search engine.
jan.ai
August 12, 2025 at 11:23 AM
NeurIPS reviewer comments?
August 7, 2025 at 7:01 AM
In this case, I think it’s actually a productive exchange though. At least for me as a reader, I can read both (well-structured) posts and draw my own conclusions. Without both posts, that would have taken me more time.
June 15, 2025 at 10:08 AM
The map also tells you the answer for Germany 🐰
April 20, 2025 at 7:29 PM
Reposted by Florian Hölzl
The reason I got interested in this is that I didn't understand why red plus blue makes violet/purple.
Violet corresponds to the shortest visible wavelengths -- shorter than both blue and red.
The explanation is that the L/red cone cells are also activated by short wavelengths for some reason.
Colours correspond to infinite-dimensional vectors, since there are infinitely many wavelengths of light.

But humans can only perceive a three-dimensional projection of colour (red, green, & blue).

What's interesting is that it's *not* an orthogonal projection. Here's a plot of the basis vectors.
April 19, 2025 at 10:20 PM
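The non-orthogonal projection described in the reposted colour thread can be sketched numerically. This is only an illustration under loud assumptions: the wavelength grid, the Gaussian shapes, and the peak and width values in `CONES` are hypothetical stand-ins, not measured cone sensitivities, and single-peaked Gaussians do not model the short-wavelength response of the L/red cones mentioned in the thread. The sketch only shows that a spectrum (a long vector) is reduced to three numbers, and that the three basis vectors overlap instead of being orthogonal.

```python
import math

# Coarse sampling of the visible range in nanometres (assumed grid).
WAVELENGTHS = list(range(380, 701, 5))

# Hypothetical (peak, width) pairs in nm for Gaussian stand-ins of the
# S, M and L cone sensitivity curves. Illustrative values, not real data.
CONES = {"S": (445.0, 25.0), "M": (540.0, 40.0), "L": (565.0, 45.0)}


def sensitivity(peak, width):
    """Sample a Gaussian sensitivity curve on the wavelength grid."""
    return [math.exp(-0.5 * ((w - peak) / width) ** 2) for w in WAVELENGTHS]


# One basis vector per cone type, each as long as the wavelength grid.
BASIS = {name: sensitivity(peak, width) for name, (peak, width) in CONES.items()}


def dot(u, v):
    """Plain dot product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def project(spectrum):
    """Reduce a sampled spectrum to three cone responses (S, M, L)."""
    return tuple(dot(spectrum, BASIS[name]) for name in ("S", "M", "L"))


def cosine(u, v):
    """Cosine similarity; 0 would mean the two vectors are orthogonal."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


# Overlap between pairs of basis vectors. Nonzero values mean the
# projection is not an orthogonal one.
ml_overlap = cosine(BASIS["M"], BASIS["L"])
sl_overlap = cosine(BASIS["S"], BASIS["L"])

# A flat spectrum (equal power at every sampled wavelength) as an example input.
flat_response = project([1.0] * len(WAVELENGTHS))
```

With these made-up curves, the M and L vectors overlap heavily while S and L overlap only slightly; the real curves differ in detail, but the heavy M/L overlap is the part this sketch is meant to illustrate.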
They discuss this a bit lower down in the README by looking at performance on HellaSwag - but there is no proper analysis. I also haven’t evaluated it myself. It is still a small model though. And the top solutions use quite a few engineering tricks as well (layer-wise LR) - but I like its accessibility.
April 7, 2025 at 4:31 PM
Modded nanoGPT’s FineWeb subset could be a good starting point. Currently the goal is to reduce compute time, mostly by lowering the number of tokens needed - improving loss at a constant token budget would be interesting as well. Both might be the same though.
github.com
April 7, 2025 at 4:24 PM
Reposted by Florian Hölzl
The rise of straightformer is near …
March 6, 2025 at 5:57 AM
Great work, can’t wait to try it!
February 11, 2025 at 5:20 PM
First time trying it, and I’m really liking it so far!
February 8, 2025 at 10:09 AM
I would like to learn a little bit more about this. Do you have any paper recommendations on this topic?
February 5, 2025 at 9:09 AM
Super interesting! Did you look at individual layers here as well?
January 3, 2025 at 8:20 PM