Jorge Mendez-Mendez
jmendezm.bsky.social
Embodied lifelong learning (compositionality, RL, TAMP, robotics). Assistant Professor at Stony Brook ECE. Postdoc at MIT CSAIL, PhD from GRASP lab at Penn.
https://jorge-a-mendez.github.io
The verdict? LLMs can't replace TAMP systems yet, especially for long-horizon, geometrically constrained tasks.

They are more promising as fast "idea generators", where a formal planner verifies their output.

Paper: arxiv.org/abs/2510.001...
Code & Data: github.com/jorge-a-mend...
October 2, 2025 at 1:12 PM
Another counter-intuitive insight: giving the LLM more information can hurt performance!

When the prompt included geometric details, the LLM made more PDDL errors. The extra information seems to "distract" the model from the logical constraints. These systems remain highly sensitive to prompt engineering!
October 2, 2025 at 1:12 PM
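One way to catch the kind of PDDL errors mentioned above, sketched as a toy linter. This is a minimal illustration, not the paper's pipeline: the predicate set and `lint_pddl_goal` helper are hypothetical, and real validation would use a full PDDL parser or a plan validator.

```python
import re

# Hypothetical domain vocabulary; a real system would read this from the
# domain file rather than hard-coding it.
KNOWN_PREDICATES = {"on", "holding", "clear", "at"}

def lint_pddl_goal(text):
    """Cheap sanity checks on an LLM-generated PDDL goal expression."""
    errors = []
    # Structural check: every "(" must have a matching ")".
    if text.count("(") != text.count(")"):
        errors.append("unbalanced parentheses")
    # Vocabulary check: flag predicates the domain doesn't define.
    for pred in re.findall(r"\(\s*([a-zA-Z_-]+)", text):
        if pred not in KNOWN_PREDICATES | {"and", "or", "not"}:
            errors.append(f"unknown predicate: {pred}")
    return errors
```

Running the linter before handing the goal to a planner turns a silent planning failure into an error message the LLM can be re-prompted with.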
Key finding: LLMs can solve many TAMP problems but lag behind engineered planners.

"Thinking" isn't always better! Non-reasoning LLMs outperformed reasoning ones.

Why? It's more efficient for the LLM to generate plans quickly and have a formal TAMP system verify and correct them.
October 2, 2025 at 1:12 PM
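The "LLM proposes, planner verifies" loop above can be sketched in a few lines. This is a toy illustration under stated assumptions: `llm_propose_plan` stands in for an actual LLM call, and the verifier is a simple symbolic precondition check, not a real TAMP/PDDLStream planner with geometric reasoning.

```python
def llm_propose_plan(goal):
    # Stand-in for a fast, non-reasoning LLM call that drafts a plan.
    return [("pick", "block_a"), ("place", "block_a", "table_2")]

def verify(plan, state):
    """Simulate the plan symbolically; return (ok, failing_step_index)."""
    holding = state.get("holding")
    for i, action in enumerate(plan):
        if action[0] == "pick":
            if holding is not None:
                return False, i  # can't pick while already holding something
            holding = action[1]
        elif action[0] == "place":
            if holding != action[1]:
                return False, i  # not holding the object to be placed
            holding = None
    return True, None

def plan_with_llm(goal, state, max_attempts=3):
    for _ in range(max_attempts):
        plan = llm_propose_plan(goal)
        ok, step = verify(plan, state)
        if ok:
            return plan
        # In the setting described above, a formal TAMP system would repair
        # or replan from the failing step; this sketch simply retries.
    return None
```

The division of labor is the point: the LLM is cheap and fast but unreliable, while the verifier is sound, so invalid proposals are caught before execution.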
Can Large Language Models solve complex robotics problems with intricate geometric constraints? 🤖

Excited to share our preprint, "A Systematic Study of Large Language Models for Task and Motion Planning With PDDLStream"!

16 LLM planners built on Gemini, evaluated on 4,950 TAMP problems.

🧵👇
October 2, 2025 at 1:12 PM