We compared notable LLMs such as InstructGPT, ChatGPT, GPT4, PaLM2 (text-bison), and Falcon-180B. They excel at presenting climate information, but there's room for improvement in the epistemic qualities of their answers.
October 6, 2023 at 5:28 PM
This is a tough task for human raters. Our study finds that AI can effectively assist human raters, offering promising avenues for scalable oversight on difficult problems like this.
October 6, 2023 at 5:27 PM