FleetingBits
@fleetingbits.bsky.social
Are base models the dreams of an LLM?
or, the data labelers don't check the 20-line code fragment that the model spit out
hmmm
December 2, 2024 at 4:47 AM
And, then the finding ends up being something like "our data labelers don't check the references" so we get bad labels : X
uhh, ok
December 2, 2024 at 4:44 AM
this paper is referenced, which should have examples of reward hacking and the authors are high quality authors (Ethan Perez!)
December 2, 2024 at 4:44 AM
I think in your example - people would demand their money back right now if the agent failed.
I'm not sure how many agents will be described as agents in the future - or just as services - in which case, all of this is moot.
December 2, 2024 at 12:22 AM
I'm starting to really wonder if the issue at OpenAI with safety is that the safety advocates don't understand what it's like to work in a "control function" at a company.
It's pretty brutal work.
November 14, 2024 at 9:36 AM