Daniel
@otherdaniel.bsky.social
Reposted by Daniel
Anthropic is always bragging about their models doing unintended things, but the unintended behaviour exactly matches what me and my friends who went into tech found cute or funny in middle school. I think it's all boring: they are unintentionally training their models to do what they think is cute.
December 19, 2025 at 5:54 PM
Reposted by Daniel
bsky.app/profile/arts...

It sure is one hell of a coincidence Anthropic trained a model that pretends to lose to Anthropic employees in a way that makes the employees feel good about themselves.
…this is the section I mean— would something like this suggest that they added guardrails? (I’m not sure if that term has a specific meaning that wouldn’t apply here)
December 19, 2025 at 5:58 PM
disobedience, but we already knew it's not that hard to get an LLM to pretend to be a fictional character.

I bet someone involved in deciding which model versions to reject loved this book as a kid.
December 19, 2025 at 5:56 PM
I wonder why random filenames aren't standard for this sort of thing. It'd add an extra layer of protection if you messed up the security permissions.
November 26, 2025 at 12:17 PM
I'm curious why people don't use random file names for document uploads that are supposed to be secret until linked. Sure there are much better ways to keep it secure but it would add a backup layer of protection.
November 26, 2025 at 12:08 PM
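A rough sketch of the random file name idea, in Python. The helper name `random_upload_name` and the 16-byte token length are assumptions made up for illustration, not anything a particular upload system uses. It only covers the "backup layer" part: the stored name is unguessable, but access permissions still need to be right.

```python
import secrets
from pathlib import Path


def random_upload_name(original_name: str, nbytes: int = 16) -> str:
    """Return an unguessable file name that keeps the original extension.

    Hypothetical helper: the name and the default token size are
    illustrative assumptions.
    """
    # token_urlsafe(16) encodes 16 random bytes (128 bits) as URL-safe text.
    token = secrets.token_urlsafe(nbytes)
    # Keep only the final extension, lowercased; the uploader's own
    # (possibly guessable) file name is discarded.
    suffix = Path(original_name).suffix.lower()
    return f"{token}{suffix}"


if __name__ == "__main__":
    # Prints something like "<22 random characters>.pdf".
    print(random_upload_name("Q3-results-draft.PDF"))
```

The original file name would need to be stored separately (for example in a database row) if it should be shown to users later.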
Try apps advertising cheap international calling. They often forward through local numbers on both ends. It worked when I needed to call a US toll free number from the UK.
November 2, 2025 at 12:38 AM
In the USA the statistic I always heard was border encounters, so whether to count students didn't come up. I've been wondering why different countries have different conventional stats used to reflect the so-called immigration problem. It seems like something you might have an interesting take on.
September 3, 2025 at 11:57 AM
It's a few kilometers from an MOD training area though, hopefully they can set up over there next time.
August 22, 2025 at 7:46 AM
I personally don't trust the Greens to declare communications equipment dangerous, they don't have a great track record.
August 22, 2025 at 7:41 AM
What's that based on? Per Christie's 2023 Hotel Market Snapshot, Portugal has 113,000 rooms. Blackpool's not one of the cities in Christie's UK report, but they have London at 140,000, followed by Manchester at 20,000. I've never been to Blackpool, can it plausibly have London-level hotel figures?
August 18, 2025 at 11:00 AM
FYI for search results (not instant answers), especially for less popular queries, DDG mostly just forwards results from Bing.
August 5, 2025 at 8:43 AM
There were some relevant articles right after the 2024 election, if you didn't see this at the time you might find it interesting. www.latimes.com/politics/sto...
After Trump's win, Black women are rethinking their role as America's reliable political organizers
Donald Trump's victory has dismayed many politically engaged Black women, and they're reassessing their enthusiasm for politics and organizing.
www.latimes.com
July 25, 2025 at 9:38 AM
In China they do things like requiring game companies to lock children out on school nights. I'm not sure how well it all works but it seems like there are options that can improve the situation.
July 24, 2025 at 9:32 AM
Thanks, that's interesting.

You're right it's all technically public. That said it feels like something the public would feel differently about, like say the Cambridge Analytica scandal.
July 12, 2025 at 9:56 AM
Is this your interpretation of Ofcom or have they actually suggested this?
July 12, 2025 at 9:48 AM
Wait, Ofcom wants us to use hacked *stolen data* to track private activities of users on other sites? That's crazy. Did they actually say that?
July 12, 2025 at 9:41 AM
In America local government and charities operate free cooling centers during heat waves. Heat exposure builds up, so resting and resetting your body for a few hours makes an enormous difference.
July 11, 2025 at 6:54 PM
Ofcom's vision of email based age verification is not at all what I expected from the name. I'd call that bank/utility based age verification via email. I don't see the banks setting that up unless the govt requires it. Also has privacy issues - banks can't just publish a list of customer emails.
July 11, 2025 at 6:50 PM
I did this with k3s on a single node and after a month of tinkering (and learning Kubernetes for the first time) I spent the last year ignoring it and it just worked.

I like that all the state goes in a SQLite db.
July 9, 2025 at 10:54 PM
We can all see it's a fake, only a British person would forget it was (formerly) the bill act. /s
July 4, 2025 at 8:45 PM
"If a cartel shoots at another cartel in a vacant lot in America the SEALs will invade Mexico"
July 4, 2025 at 8:32 PM
There is a long history in the US of militias attempting this cunning plan. But I'm not sure it actually works.

See for example "Boots on the Ground: As FEMA struggles to keep up with climate disasters, extremist groups see an opportunity" (17 May 2023)
www.preventionweb.net/news/boots-g...
June 27, 2025 at 9:12 AM
The post seems to only be about very specific FPV drones, not drones in general. For example the bit about the limitations of unencrypted analog radio links.
June 26, 2025 at 7:24 PM