Michelle Hawley
@michellehawley.bsky.social
Editorial Director at Simpler Media Group, managing VKTR.com.
There’s Behemoth (2 trillion params), Scout (can handle entire books) and Maverick (for fast enterprise tasks).

Not subtle. Not boring. Not fully transparent either.
April 30, 2025 at 8:55 PM
Meta's Llama 4 models have been out for a couple weeks now, and honestly, there's still a lot to unpack.
April 30, 2025 at 8:55 PM
According to Google's shared benchmarks, Gemini 2.5 performs better than OpenAI's GPT-4.5, Claude 3.7 Sonnet, Grok 3 Beta and DeepSeek R1 in areas like:
🔹 Reasoning & knowledge
🔹 Code editing
🔹 Visual reasoning
🔹 Image understanding
🔹 Long context
🔹 Multilingual performance
March 28, 2025 at 4:52 PM
#Google just dropped Gemini 2.5, its latest move to take over the enterprise AI space. What it really has going for it is that Google is just so ubiquitous.

So many of us use Google Workspace that having a tool that's built-in and easy to access is more convenient than turning to something else.
March 28, 2025 at 4:52 PM
✈️ I'm prepping to attend the Adobe Summit next week in Las Vegas. I think AI is going to be a can't-get-away-from topic this year, so excited to see what new ideas, lessons and innovations #Adobe and other brands plan to share.
March 12, 2025 at 9:03 PM
Here are a few things to know:

*Automated scores* (MMLU, ROUGE, BLEU) don't guarantee real-world performance. These tests can still struggle with reasoning, accuracy & bias.

*Manual evaluation* is good at catching bias & nuance, but it's very hard to scale.
February 21, 2025 at 3:12 PM
AI model benchmarks can be misleading, like the benchmarks DeepSeek lists for its models (shown below), which many people dispute.
February 21, 2025 at 3:12 PM