1. An Abstraction Generator proposes reasoning strategies.
2. A Solution Generator uses a proposed strategy to produce an answer.
The Abstraction Generator is rewarded with the Solution Generator's average success rate, which pushes it toward useful abstractions.
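
A rough sketch of that reward signal, not the actual RLAD implementation: `abstraction_reward`, `toy_solve`, `toy_is_correct`, and `num_samples` are hypothetical names, and the toy solver stands in for a real LLM call.

```python
from typing import Callable


def abstraction_reward(
    problem: str,
    abstraction: str,
    solve: Callable[[str, str], str],
    is_correct: Callable[[str, str], bool],
    num_samples: int = 4,
) -> float:
    """Average success rate of the solver when given this abstraction.

    Assumption: success is a 0/1 check on each sampled answer, and the
    reward is the plain mean over num_samples attempts.
    """
    successes = 0
    for _ in range(num_samples):
        answer = solve(problem, abstraction)  # second player produces an answer
        if is_correct(problem, answer):
            successes += 1
    return successes / num_samples


# Toy stand-ins (hypothetical): a real setup would sample from an LLM.
def toy_solve(problem: str, abstraction: str) -> str:
    # The "solver" only succeeds when the hint mentions addition.
    return "4" if "add" in abstraction else "5"


def toy_is_correct(problem: str, answer: str) -> bool:
    return answer == "4"


good = abstraction_reward("2 + 2 = ?", "add the two numbers", toy_solve, toy_is_correct)
bad = abstraction_reward("2 + 2 = ?", "guess", toy_solve, toy_is_correct)
```

Here the helpful hint scores higher than the unhelpful one, which is the kind of pressure that would steer the first player toward abstractions the solver can actually use.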
Introducing RLAD, a two-player RL framework for LLMs to discover 'reasoning abstractions'—natural language hints that encode procedural knowledge for structured exploration in reasoning.🧵⬇️