Concordia, as always.
1. Reevaluating Policy Gradient Methods for Imperfect-Information Games
2. Could not record the tutorial "Tutorial on General Evaluation of AI Agents".
No more negativity.
1. Multi-Actor Generative Artificial Intelligence as a Game Engine
2. Game of Thoughts: Iterative Reasoning in Game-Theoretic Domains with Large Language Models
3. Soft Condorcet Optimization for Ranking of General Agents
52. Quantifying the Self-Interest Level of Markov Social Dilemmas
53. Jackpot! Alignment as a Maximal Lottery
54. Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search
Let's stop here.
47. Re-Evaluating Open-Ended Evaluation of Large Language Models
48. The Decrypto Benchmark for Multi-Agent Reasoning and ToM
49. Robust Autonomy Emerges from Self-Play
50. LLM-Mediated Guidance of MARL Systems
… Games via Meta-Learning
43. Deviation Ratings: A General, Clone Invariant Rating Method
44. Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning
45. Expected Return Symmetries
40. Deep mechanism design: Learning social and economic policies for human benefit
41. Meta-Learning in Self-Play Regret Minimization
37. Remembering the Markov Property in Cooperative MARL
38. Constrained Exploitability Descent: An Offline Reinforcement Learning Method for Finding Mixed-Strategy Nash Equilibrium
33. Improving Transformer World Models for Data-Efficient RL
34. MASTER: A Multi-Agent System with LLM Specialized MCTS
35. The Yōkai Learning Environment: Tracking Beliefs Over Space and Time
29. Evolution of Societies via Reinforcement Learning
30. ADIOS: Antibody Development via Opponent Shaping
31. Bootstrapping Task Spaces for Self-Improvement
25. Code World Models for General Game Playing
26. Evolution Strategies at Scale: LLM Finetuning Beyond Reinforcement Learning
27. Modeling Others’ Minds as Code
… Transfers
22. Nash Policy Gradient: A Policy Gradient Method with Iteratively Refined Regularization for Finding Nash Equilibria
23. Opponent Shaping in LLM Agents
… Games with Generalized Fictitious Cross-Play
19. SPICE: Self-Play In Corpus Environments Improves Reasoning
20. Aligning Individual and Collective Objectives in Multi-Agent Cooperation
… attention-aware inverse planning
16. A Variational Approach to Mutual Information-Based Coordination for Multi-Agent Reinforcement Learning
17. Monte Carlo Tree Diffusion for System 2 Planning
… Multi-Agent RL
12. Hypernetworks That Evolve Themselves
13. Partner Modelling Emerges in Recurrent Agents (But Only When It Matters)
14. Robust and Diverse Multi-Agent Learning via Rational Policy Gradient
… coherent framework for multi-agent learning
9. Social World Model-Augmented Mechanism Design Policy Learning
10. Evaluating Cooperation with Novel Partners in Unknown Environments Using Unsupervised Environment Design
… Differential Equations
5. An Efficient End-to-End Training Approach for Zero-Shot Human-AI Coordination
6. Terra Nova: A Comprehensive Challenge Environment for Intelligent Agents
7. Imagined Autocurricula
1. Adaptively Coordinating with Novel Partners via Learned Latent Strategies
2. Generative Emergent Communication: Large Language Model is a Collective World Model
3. Superhuman AI for Stratego Using Self-Play Reinforcement Learning and Test-Time Search