Sonnet 4.5 can for sure. But Haiku 4.5 cannot consistently.
There are only 3 instructions and it can't follow them consistently.
My whole skill can fit in a post, so I'll attach it in the replies.
I'm making Python plugins using this strategy because data-driven is the way!
github.com/jack-michaud...
"Shifting Work Patterns with Generative AI" by Eleanor Wiske Dillon, Sonia Jaffe, Nicole Immorlica, Christopher T. Stanton
arxiv.org/abs/2504.11436
RAG systems excel on academic benchmarks - but are they robust to variations in linguistic style?
We find RAG systems are brittle. Small shifts in phrasing trigger cascading errors, driven by the complexity of the RAG pipeline 🧵
But this highlights something fundamental about today's language model architectures...
"Our results suggest that building robust AI systems is challenging even with extremely superhuman systems in some of the most tractable settings, and highlight two key gaps: efficient generalization in defenses, and diversity in training."
The National in Scotland continuing to do some good work.
It can do many styles.
Prompt it with the HWRIT keyword, give it some short text, a handwriting style, and some ink and paper types.
More examples and download links in 🧵
We ran a field experiment on X/Twitter (N=1,256) using LLMs to rerank content in real-time, adjusting exposure to polarizing posts. Result: Algorithmic ranking impacts feelings toward the political outgroup! 🧵⬇️
adding to my list of enthralling ASMR channels