Get for VS and JetBrains: https://linktr.ee/refactai
Try refact.ai Agent — SOTA on SWE-bench Verified — in your IDE, today:
• VS Code: marketplace.visualstudio.com/items?itemNa...
• JetBrains: plugins.jetbrains.com/plugin/20647...
Try refact.ai Agent — SOTA on SWE-bench Verified — in your IDE, today:
• VS Code: marketplace.visualstudio.com/items?itemNa...
• JetBrains: plugins.jetbrains.com/plugin/20647...
We build for real-world results, not just leaderboards.
Delegate your everyday programming tasks to our AI Agent, preview every step, and guide the process whenever you like ❤️
We build for real-world results, not just leaderboards.
Delegate your everyday programming tasks to our AI Agent, preview every step, and guide the process whenever you like ❤️
• Made tools more tolerant of the model’s uncertainty
• Renamed them for clarity: definition()→search_symbol_definition(), etc.
• Reduced chat compression
❌Dropped multi-step planning
• & more
• Made tools more tolerant of the model’s uncertainty
• Renamed them for clarity: definition()→search_symbol_definition(), etc.
• Reduced chat compression
❌Dropped multi-step planning
• & more
It analyzes the debug_script() report, brainstorms the solution, and applies fixes directly — no patches or diffs.
One mandatory call per task, lean and focused.
It analyzes the debug_script() report, brainstorms the solution, and applies fixes directly — no patches or diffs.
One mandatory call per task, lean and focused.
🛡️We added automatic guardrails:
A script runs static checks on model outputs. If it detects Agent going off track, it injects mid-run helper messages (as from a “user”) to nudge it back in the right direction.
🛡️We added automatic guardrails:
A script runs static checks on model outputs. If it detects Agent going off track, it injects mid-run helper messages (as from a “user”) to nudge it back in the right direction.
It uses pdb to debug, modify, and generate scripts, gathering:
1. Which files are affected
2. What caused the failure
3. How it might be fixed.
We forced at least 1 and up to 3 calls per task.
It uses pdb to debug, modify, and generate scripts, gathering:
1. Which files are affected
2. What caused the failure
3. How it might be fixed.
We forced at least 1 and up to 3 calls per task.
Models:
• Orchestration: Claude 3.7
• debug_script(): Claude 3.7 + o4-mini
• strategic_planning(): o3
• Temp: 0 for Claude
For each benchmark task, our AI Agent made one multi-step run to produce a single, correct final solution.
Models:
• Orchestration: Claude 3.7
• debug_script(): Claude 3.7 + o4-mini
• strategic_planning(): o3
• Temp: 0 for Claude
For each benchmark task, our AI Agent made one multi-step run to produce a single, correct final solution.
You can run it end-to-end and reproduce our Agent’s approach and 69.8% score.
➡️ github.com/smallcloudai...
You can run it end-to-end and reproduce our Agent’s approach and 69.8% score.
➡️ github.com/smallcloudai...
• VS Code: marketplace.visualstudio.com/items?itemNa...
• JetBrains: plugins.jetbrains.com/plugin/20647...
• VS Code: marketplace.visualstudio.com/items?itemNa...
• JetBrains: plugins.jetbrains.com/plugin/20647...
• Automates repetitive tasks
• Understands large codebases
• Integrates with GitHub, Docker, PostgreSQL, & more +
1000+ dev tools via MCP
• Learns from every interaction
• Automates repetitive tasks
• Understands large codebases
• Integrates with GitHub, Docker, PostgreSQL, & more +
1000+ dev tools via MCP
• Learns from every interaction
🧠It uses deep_analysis() for reasoning in complex tasks: Solution generation→Critique→Refinement.
Refact.ai decides when tools are needed, creating custom strategies instead of following scripts.
🧠It uses deep_analysis() for reasoning in complex tasks: Solution generation→Critique→Refinement.
Refact.ai decides when tools are needed, creating custom strategies instead of following scripts.
1. Understand the problem
2. Investigate the repo
3. Create & run the problem reproduction script
4. Plan & implement changes (applying reasoning)
5. Test & evaluate changes (incl. optional reasoning)
6. Repeat steps 4 and 5 until the problem is solved.
1. Understand the problem
2. Investigate the repo
3. Create & run the problem reproduction script
4. Plan & implement changes (applying reasoning)
5. Test & evaluate changes (incl. optional reasoning)
6. Repeat steps 4 and 5 until the problem is solved.
AI Agent completes the entire dev workflow on its own: plans, executes, tests, self-corrects, and delivers a production-ready result.
For each task, it makes 1️⃣ multi-step run to generate a single correct solution through thoughtful iteration.
AI Agent completes the entire dev workflow on its own: plans, executes, tests, self-corrects, and delivers a production-ready result.
For each task, it makes 1️⃣ multi-step run to generate a single correct solution through thoughtful iteration.
Empower every dev with an Autonomous AI Agent that amplifies their capabilities & helps achieve 10x more.
✨Refact.ai is open-source: we believe coding tools should be transparent, customizable, and community-driven — building the future of programming together:
github.com/smallcloudai
Empower every dev with an Autonomous AI Agent that amplifies their capabilities & helps achieve 10x more.
✨Refact.ai is open-source: we believe coding tools should be transparent, customizable, and community-driven — building the future of programming together:
github.com/smallcloudai