Lightnews — Scholar-powered news

Turan Orujlu

@turanorujlu.bsky.social

PhD Student @unituebingen.bsky.social. Interested in intuitive physics, world models, causality, and reinforcement learning.


Turan Orujlu

@turanorujlu.bsky.social

We also tested CPM's usefulness as a world model for a model-based RL agent. The agent's task was to move an object to a target location. Our CPM-based agent (red) broadly outperformed baselines (especially in the challenging "Unobserved" setting), achieving higher mean rewards.

July 22, 2025 at 6:59 PM

Turan Orujlu

@turanorujlu.bsky.social

We tested our model in a simple physics environment with 'Observed' & 'Unobserved' settings. The plots show CPM (red) has higher prediction accuracy (H@1) than GNN & Modular (separate transition MLP per slot) baselines. The performance gap widens over longer prediction horizons.

July 22, 2025 at 6:59 PM
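
For readers unfamiliar with H@1 (Hits at 1), here is a minimal sketch of how such a ranking metric is commonly computed. It assumes predictions and targets are compared as embedding vectors under squared Euclidean distance; the function name `hits_at_1` and the toy data are illustrative assumptions, not taken from the posts or the paper.

```python
def hits_at_1(predicted, targets):
    """Fraction of predictions whose nearest target is their own target.

    Assumption: squared Euclidean distance between embedding vectors;
    the posts do not say which distance the evaluation uses.
    """
    if len(predicted) != len(targets) or not predicted:
        raise ValueError("need equally sized, non-empty lists")
    hits = 0
    for i, pred in enumerate(predicted):
        # Distance from this prediction to every candidate target.
        dists = [sum((p - t) ** 2 for p, t in zip(pred, tgt)) for tgt in targets]
        nearest = min(range(len(targets)), key=dists.__getitem__)
        hits += nearest == i  # a hit when the true target ranks first
    return hits / len(predicted)


# Toy data (made up): the third prediction lands closer to the wrong target.
predicted = [[0.1, 0.0], [0.9, 1.1], [2.0, 0.4]]
targets = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
score = hits_at_1(predicted, targets)
```

In this toy case two of the three predictions rank their own target first, so the score is 2/3.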

Turan Orujlu

@turanorujlu.bsky.social

How does the CPM build its causal graph? We treat causal discovery as a multi-agent RL problem. As shown in the Causal MDP, controller agents make sequential decisions to add edges to the graph, determining which objects interact.

July 22, 2025 at 6:59 PM
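
The edge-by-edge construction described above can be sketched roughly as follows. This is only an illustration of sequential add-edge decisions: the posts do not describe the controllers' policies, so a hand-written proximity rule (`proximity_policy`, a hypothetical name) stands in for the learned agents, and `build_graph` is likewise an assumed helper.

```python
from itertools import permutations


def build_graph(positions, policy):
    """Build a directed adjacency matrix one edge decision at a time."""
    n = len(positions)
    adjacency = [[0] * n for _ in range(n)]
    # Visit every ordered pair; each decision can see the graph built so far.
    for src, dst in permutations(range(n), 2):
        if policy(positions, adjacency, src, dst):
            adjacency[src][dst] = 1
    return adjacency


def proximity_policy(positions, adjacency, src, dst, radius=1.5):
    """Stand-in for a learned controller: connect objects that are close.

    Assumption: a fixed distance threshold, chosen only for illustration.
    """
    dx = positions[src][0] - positions[dst][0]
    dy = positions[src][1] - positions[dst][1]
    return dx * dx + dy * dy <= radius * radius


# Toy scene (made up): two nearby objects and one far away.
positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
graph = build_graph(positions, proximity_policy)
```

Here only the two nearby objects end up connected, in both directions; in the actual model, a learned policy would replace the distance rule.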

Turan Orujlu

@turanorujlu.bsky.social

Our model (see diagram) has an object-centric vision encoder that instantiates object representations and an action encoder that produces force representations. The core of the architecture is the CPM, which acts as a dynamic transition function that uses a causal graph to predict object dynamics.

July 22, 2025 at 6:59 PM
