japneetsingh.bsky.social
@japneetsingh.bsky.social
PhD candidate @Purdue. I work on problems in information theory as well as ranking and preference learning
Hi, you tagged the wrong person.
November 19, 2025 at 10:06 PM
Reposted
Ge et al. show that they violate some basic axioms from social choice theory: Pareto optimality and pairwise majority consistency (true for any nondecreasing and convex loss function, not just Bradley-Terry-Luce). arxiv.org/abs/2405.14758 3/3.
Axioms for AI Alignment from Human Feedback
In the context of reinforcement learning from human feedback (RLHF), the reward function is generally derived from maximum likelihood estimation of a random utility model based on pairwise comparisons...
arxiv.org
November 19, 2024 at 3:36 PM
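Roughly, Pareto optimality here means that if every annotator prefers response a to response b, the aggregated reward should also score a above b; pairwise majority consistency (loosely) asks the learned ranking to agree with the pairwise majority preferences whenever those are themselves consistent with a single ranking. A minimal sketch of what a Pareto-optimality check could look like; the `prefs`/`reward` structures and the toy numbers are illustrative assumptions, not from the paper:

```python
import itertools
import numpy as np

def pareto_violations(prefs, reward):
    """Return pairs (a, b) where every annotator prefers a to b
    but the learned reward does not score a strictly above b.

    prefs[k][a][b] = 1 if annotator k prefers alternative a to b.
    reward[a]      = learned scalar reward for alternative a.
    """
    n = len(reward)
    violations = []
    for a, b in itertools.permutations(range(n), 2):
        if all(p[a][b] == 1 for p in prefs) and not reward[a] > reward[b]:
            violations.append((a, b))
    return violations

# Toy example: 3 annotators, 3 alternatives; all annotators prefer 0 over 2,
# but the learned reward scores alternative 2 above 0 -> Pareto violation.
prefs = [np.array([[0, 1, 1], [0, 0, 1], [0, 0, 0]]),
         np.array([[0, 0, 1], [1, 0, 1], [0, 0, 0]]),
         np.array([[0, 1, 1], [0, 0, 0], [0, 1, 0]])]
reward = np.array([0.2, 0.5, 0.4])
print(pareto_violations(prefs, reward))  # -> [(0, 2)]
```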
Reposted
The first is PRO, which post-trains an LLM directly on preference data. It uses general ranked lists rather than just pairwise comparisons, similar to the Plackett-Luce approach in the appendix of the DPO paper: arxiv.org/abs/2306.17492 2/3.
Preference Ranking Optimization for Human Alignment
Large language models (LLMs) often contain misleading content, emphasizing the need to align them with human values to ensure secure AI systems. Reinforcement learning from human feedback (RLHF) has b...
arxiv.org
November 18, 2024 at 2:47 PM
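The Plackett-Luce model mentioned above generalizes the pairwise Bradley-Terry likelihood to a full ranked list: the probability of a ranking is a product of softmax terms, where the item placed at each position competes against all items not yet placed. A minimal numerical sketch of that listwise negative log-likelihood; the scores and the example ranking are made up for illustration, not taken from the PRO paper:

```python
import numpy as np

def plackett_luce_nll(scores, ranking):
    """Negative log-likelihood of an observed ranking under Plackett-Luce.

    scores[i] = model score (e.g. reward or log-policy ratio) for response i.
    ranking   = indices of responses ordered from most to least preferred.
    At each position, the chosen response competes (softmax) against all
    responses not yet placed; with a list of length 2 this reduces to the
    pairwise Bradley-Terry log-loss.
    """
    nll = 0.0
    remaining = list(ranking)
    for choice in ranking[:-1]:
        logits = np.array([scores[i] for i in remaining])
        log_probs = logits - np.log(np.sum(np.exp(logits)))  # log-softmax
        nll -= log_probs[remaining.index(choice)]
        remaining.remove(choice)
    return nll

# Toy example: 4 candidate responses, annotated preference order 2 > 0 > 3 > 1.
scores = np.array([1.3, -0.2, 2.1, 0.4])
print(plackett_luce_nll(scores, [2, 0, 3, 1]))
```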