Rishub Jain
shubadubadub.bsky.social
Works at Google DeepMind on Safe+Ethical AI
Importantly, the best type of rater assistance depends heavily on how much raters over-rely on the assistant. In our slice of data where humans > AI, showing only directly quoted evidence helps more than showing that evidence alongside the AI’s reasoning, judgments, and confidence.
December 24, 2024 at 12:01 AM
Hybridization can also enable impactful rater assistance. Prior HCI work has shown that achieving complementarity can be hard in settings where AI > humans. Our hybridization identifies a slice of data where humans > AI. Here, rater assistance helps!
December 24, 2024 at 12:01 AM
Hybridization, which combines judgments from human raters and AI raters who each work in isolation, can be a useful technique for achieving complementarity.

We’ve found that confidence-based hybridization (using the AI’s rating when the AI is confident, and the human’s rating otherwise) achieves complementarity!
December 24, 2024 at 12:01 AM
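
A minimal sketch of the confidence-based hybridization rule described in the post above, not the authors' implementation. The function name `hybridize`, the per-item confidence scores, and the `threshold` value of 0.9 are all assumptions made for illustration; the posts do not say how confidence is measured or where the cutoff sits.

```python
from typing import List, Sequence


def hybridize(
    ai_ratings: Sequence[int],
    ai_confidences: Sequence[float],
    human_ratings: Sequence[int],
    threshold: float = 0.9,  # hypothetical cutoff, not taken from the source
) -> List[int]:
    """Pick the AI rating when the AI is confident, else the human rating."""
    if not (len(ai_ratings) == len(ai_confidences) == len(human_ratings)):
        raise ValueError("all inputs must have the same length")

    combined = []
    for ai, conf, human in zip(ai_ratings, ai_confidences, human_ratings):
        # Humans and the AI rated each item in isolation; we only choose
        # between the two finished ratings here.
        combined.append(ai if conf >= threshold else human)
    return combined


# Toy data: the AI is confident on the first and third items only.
example = hybridize(
    ai_ratings=[1, 0, 1],
    ai_confidences=[0.95, 0.50, 0.90],
    human_ratings=[0, 1, 0],
)
```

In this toy example the second item falls back to the human rating because the AI's confidence (0.50) is below the assumed threshold. The items routed to humans form the kind of slice where, per the thread, rater assistance can then be studied.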
How do we ensure humans can still effectively oversee increasingly powerful AI systems? In our blog, we argue that achieving Human-AI complementarity is an underexplored yet vital piece of this puzzle! It’s hard, but we achieved it.

🧵(1/10)
December 24, 2024 at 12:01 AM