I agree emergent misalignment was very important this year. But wasn't the headline already known? E.g., benign fine-tuning has always had a chance of removing safeguards, right?