Aaron Dharna
aadharna.bsky.social
PhD Student at UBC under Dr. Jeff Clune. Interested in RL, Generative Modeling, and Games (both for AI and for me)
FMSPs represent a new direction for open-ended strategy discovery in AI. We anticipate they can lead to a richer exploration of creative, diverse, and robust solutions across various domains, from language-based tasks to traditional RL.
July 10, 2025 at 6:17 PM
We also explore FMSPs in an AI safety domain, Gandalf. An attacker LLM writes code (prompts and extraction functions) to jailbreak a secret from GPT-4o-mini, while a defender LLM searches for system prompts & I/O guards (e.g., double-checking GPT’s response) to increase protection.
July 10, 2025 at 6:17 PM
We evaluate FMSPs in Car Tag, an asymmetric continuous-control game (see gifs above). FMSP variants write code-based policies (go left, Q-learning, etc.). Below are PCA plots of policy embeddings showing that QDSP has the highest QD-Score vs. the other FMSPs and a non-LLM baseline.
July 10, 2025 at 6:17 PM
QDSP introduces a novel "dimensionless" MAP-Elites! Policies (Q-Learning, MCTS, etc.) are clustered via a pretrained model and are added to the archive if they're sufficiently new OR outperform the most similar policy (analogous to filling/improving a cell in MAP-Elites)
July 10, 2025 at 6:17 PM
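A minimal sketch of the archive rule described in the QDSP post, not the paper's implementation. It assumes each policy already has a non-zero embedding vector from some pretrained model, uses cosine similarity as the similarity measure, and treats "sufficiently new" as falling below a hypothetical `novelty_threshold`. Replacing the most similar entry when it is outperformed is also an assumption, chosen by analogy with improving a cell in MAP-Elites.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def update_archive(archive, embedding, score, novelty_threshold=0.8):
    """Try to insert a policy into the archive (a list of dicts).

    Returns "added" if the policy is sufficiently new, "replaced" if it
    outperforms its most similar archived policy, otherwise "rejected".
    The threshold value is an illustrative assumption.
    """
    entry = {"embedding": embedding, "score": score}
    if not archive:
        archive.append(entry)
        return "added"

    # Find the archived policy most similar to the candidate.
    nearest_idx, nearest_sim = max(
        ((i, cosine_similarity(e["embedding"], embedding))
         for i, e in enumerate(archive)),
        key=lambda pair: pair[1],
    )

    if nearest_sim < novelty_threshold:
        # Sufficiently new: analogous to filling an empty cell.
        archive.append(entry)
        return "added"
    if score > archive[nearest_idx]["score"]:
        # Better than its nearest neighbour: analogous to improving a cell.
        archive[nearest_idx] = entry
        return "replaced"
    return "rejected"
```

Because the archive is keyed on similarity rather than on hand-chosen behavior dimensions, no grid has to be defined up front; the threshold alone controls how finely the archive partitions policy space.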
We introduce a family of FMSP approaches with the same general structure (see Fig.). Harnessing open-endedness, the FM looks at the history of strategies tried so far (implemented in code), their scores, and creates new strategies to try
July 10, 2025 at 6:17 PM
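The general structure described in the post above can be sketched as a simple loop. This is an illustrative assumption rather than the paper's code: `propose` is a hypothetical stand-in for the foundation model call, `evaluate` stands in for playing the game, and the sketch shows only one side of the self-play setup.

```python
def fmsp_loop(propose, evaluate, seed_strategy, iterations):
    """Grow a history of (strategy, score) pairs.

    propose(history) -> new strategy; stands in for the foundation model
    reading past strategies and their scores (hypothetical interface).
    evaluate(strategy) -> score; stands in for running the strategy.
    """
    history = [(seed_strategy, evaluate(seed_strategy))]
    for _ in range(iterations):
        candidate = propose(history)
        history.append((candidate, evaluate(candidate)))
    return history


# Toy usage: "strategies" are integers and the best possible one is 3.
def toy_propose(history):
    best_strategy, _ = max(history, key=lambda pair: pair[1])
    return best_strategy + 1  # naive proposal: step past the current best


def toy_evaluate(strategy):
    return -abs(strategy - 3)


history = fmsp_loop(toy_propose, toy_evaluate, seed_strategy=0, iterations=5)
best = max(history, key=lambda pair: pair[1])
```

In a real system the strategies would be programs and `propose` would prompt a model with the history; the integer toy only makes the control flow concrete.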
Really excited to share my recent work combining open-ended foundation model innovation with the competitive dynamics of self-play!! arxiv.org/abs/2507.06466
July 10, 2025 at 6:17 PM