Probably the most important paper I've been part of: link.springer.com/article/10.1...
This practical guide reviews the 'why' and 'how' of multi-objective reinforcement learning.
The latest example is "Where reinforcement learning meets process control" which I apparently co-authored with Kumar and Yu. #ADIPEC 1/n
Demonstration-Guided Multi-Objective Reinforcement Learning
Junlin Lu, Patrick Mannion, Karl Mason
https://openreview.net/forum?id=FQAgFgkaFG
#reinforcement #demonstrations #objective
As a side benefit, we get this great, accessible video introduction to MORL: youtu.be/VEXRuhJDkoA
I will not review for Springer again until this matter is satisfactorily resolved.
Proof: different articles present at the specified journal/volume/page number, and their titles exist nowhere on any searchable repository.
Take this as a warning to not use LMs to generate your references!
Congrats on being the fastest scammy conference organiser of all time. Inviting me to a conference unrelated to the topic of my paper is highly questionable, but you did it within a day of publication so at least you’re fast.
Kindly remove me from your mailing list.
Regards,
Peter
After months in copy-editing hell, Haddie Harland's review of AI apology research is now available: link.springer.com/article/10.1...
This is a must-read for anyone interested in how AI systems can effectively and appropriately use apologies to facilitate human interaction 1/2
Multi-Objective Reinforcement Learning for Large Language Model Optimization: Visionary Perspective
https://arxiv.org/abs/2509.21613
uqtmiller.github.io/recruitment/
This summer was one of my hardest, mentally. 🌥️ Between ...
1/n
Huawei proposes an RL framework that decouples search planning from answer generation, using dual-reward alignment and Pareto optimization.
📝 arxiv.org/abs/2508.20368