Riley
riguh.bsky.social
There’s like 30 posts about the Gemini release in our company AI nerds teams channel and one sad post about this which nobody has responded to. Not the best timing for announcement
November 19, 2025 at 12:35 AM
But my sentence was “I need something better than mem0”
November 18, 2025 at 11:54 PM
I mean you kind of have to be 17 to call yourself Ladies Love Cool James
November 18, 2025 at 11:53 PM
That’s mental
November 18, 2025 at 11:10 PM
Unsure if it gets better results to talk like this to AI but it certainly helps with my work satisfaction
November 18, 2025 at 11:05 PM
Sentiment analysis: OpenAI fell off a cliff in the “who will have the best model by end of year” prediction market when GPT5 came out, Google have gone up after Gemini 3. And that’s before the more!
November 18, 2025 at 11:03 PM
Like the two minutes hate?
November 18, 2025 at 5:58 PM
It’s not just me
November 18, 2025 at 1:24 PM
That’s the app, the site gives me
November 18, 2025 at 1:13 PM
640% correct
November 18, 2025 at 1:11 PM
I suspect there’s a bit of revenge action going on after them blaming a recent hacking event on a certain state sponsored group

Or maybe they were worried about me logging on to attempt to use up my remaining $985 in free web coder credits
November 18, 2025 at 1:10 PM
Claude seems to be freaking out about it
November 18, 2025 at 1:02 PM
Isn’t* designed for
November 18, 2025 at 12:51 PM
Man this is like the old days of trying to cram stuff it instant designed for into the whatever number of kb memory you had
November 18, 2025 at 12:46 PM
Holy shit

Benchmarks are cooked but that’s nuts
November 18, 2025 at 12:42 PM
Dude, Trump’s family even own prediction markets. It’s not just treated as legal these days, it’s encouraged
November 18, 2025 at 8:20 AM
Okay yeah sycophancy up.

Don’t want to think about what else is up

bsky.app/profile/emol...
Interesting changes from Grok 4 to Grok 4.1. Decreases in harmful responses but also increases in sycophancy and deception.

It isn’t clear how to interpret the sycophancy score, but the MASK score for deception is quite high compared to big models.

Sycophancy leads to higher LMArena scores…
November 18, 2025 at 4:12 AM
ALT: a man in a tuxedo is applauding in front of a crowd
media.tenor.com
November 18, 2025 at 1:20 AM
Ehhhhhhhh but what do Grok users actually prefer and is that a good thing or a bad thing
November 17, 2025 at 10:52 PM
If we cared about determinism at all costs, we wouldn’t have catch-all exception handlers; there’s always been a bit of “we’ll deal with whatever happens”
November 16, 2025 at 9:06 PM
It’s the side facing down when you hastily put it back on the table and try not to look the wait staff in the eye in case they notice you’ve been writing things on their lovely cloth napkins
November 16, 2025 at 11:15 AM
Hmm, Kimi should be returning enough info for the costs to be attributed but if that’s right it’s barely using the small model, which is very different to a Haiku/Sonnet pairing where it absolutely smashes Haiku and caches
November 16, 2025 at 4:05 AM