Borhane Blili-Hamelin
borhane.bsky.social
He, him | Data Scientist @ TD Bank | Improving AI Governance | Views my own

https://borhane.xyz
Excited to have our mini paper on the ETHICS benchmark at the @neuripsconf.bsky.social #EvalEval workshop next week! We draw on moral theory, empirical research, and prompt evaluation to argue that the benchmark lacks validity. Stay tuned for future work on the practical consequences for evals.
December 2, 2024 at 4:59 PM