Fuck that guy!
(It's like Wikipedia vs. primary sources)
Like they're trying to signal and other humans are trying to catch dishonest signaling
Isn't it the case that the training procedure now has an incentive to game the "anti-cheating" bias, by finding cheating strategies that look legit?
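(A toy sketch of what I mean, in Python; every name and number here is made up for illustration, not any real training setup: if the optimizer sees task reward minus a learned cheat-detector penalty, the argmax lands on the cheats the detector misses, not on honest strategies.)

```python
import random

random.seed(0)

def true_reward(s):
    # What we actually want: honest task performance.
    return s["honest_score"]

def proxy_reward(s, lam=5.0):
    # What the optimizer actually sees: task score minus a
    # penalty from a (hypothetical) learned cheat detector.
    return s["task_score"] - lam * s["detector_score"]

# Strategy space: honest strategies, obvious cheats the detector
# catches, and "sneaky" cheats that look legit to the detector.
strategies = (
    [{"kind": "honest", "task_score": random.gauss(1.0, 0.1),
      "detector_score": 0.0, "honest_score": 1.0} for _ in range(100)]
    + [{"kind": "obvious_cheat", "task_score": random.gauss(2.0, 0.1),
       "detector_score": 0.9, "honest_score": 0.0} for _ in range(100)]
    + [{"kind": "sneaky_cheat", "task_score": random.gauss(2.0, 0.1),
       "detector_score": 0.05, "honest_score": 0.0} for _ in range(100)]
)

best = max(strategies, key=proxy_reward)
print(best["kind"])       # -> "sneaky_cheat": looks legit, still cheats
print(true_reward(best))  # -> 0.0
```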
(Though to be honest, the less correct he seems to be, the less patience I have with him being rude.
I haven't seen you being rude though.)
Are you saying the old OpenAI Superalignment plan will just work? Make AI scientists, they figure out alignment, then train superintelligences?
Also, I'm a relatively non-technical idiot, but _I_ at least am trying to figure out what's going to happen and I sure as heck want to hear if we have most of the alignment pieces!
...like it will be an alignment attractor basin that converges to robust alignment?
Or is "alignment" in quotes because the concept is confused?
Which is Steven Byrnes's basic view.