Ashutosh Adhikari
yourstrulyash.bsky.social
PhD student at the University of Edinburgh.
I will be at EMNLP next week presenting this work on November 7th! Reach out to me with any questions :))

Work done with my advisor, Mirella Lapata!

Preprint: arxiv.org/pdf/2505.14627
#EMNLP2025 #multimodallearning #scalableoversight #visionlanguagemodels #nlproc
November 1, 2025 at 7:30 PM
Unlike previous work on debate, where models are assigned an answer to argue for, we only instruct the models to argue for opinions they believe to be true. This is not only more efficient but also allows us to extract reasoning data that can update their beliefs.
November 1, 2025 at 7:30 PM
RQ3: Where do debate or consultancy fail?

Our analysis shows that judges benefit when the experts argue for diverse opinions!

The red quadrant marks cases where the judge is persuaded more often than they should be (i.e. the experts are deceptive).
November 1, 2025 at 7:30 PM
RQ2: Can debate be used as a reliable mechanism for yielding quality reasoning data?

Yes! We show that reasoning data obtained from debate, in a completely unsupervised manner, imbues the expert vision-language models with reasoning ability.
November 1, 2025 at 7:30 PM