Get for VS and JetBrains: https://linktr.ee/refactai
🔹 SWE-bench Verified → 70.4% solved
🔹 SWE-bench Lite → 60% solved
Soon: new score with Claude 4 Sonnet 🙂‍↕️
Full tech breakdown: refact.ai/blog/2025/op...
Our open-source SWE-bench pipeline [GH]: github.com/smallcloudai...
Score: 69.8%, 349/500 tasks solved.
Key tech behind the run:
• debug_script() sub-agent using pdb
• strategic_planning() tool powered by o3
• Automated guardrails that course-correct mid-run
🧵
Our approach: fully autonomous Agent, no user intervention needed. Just assign a task & let AI handle it end-to-end
• Claude 3.7 Sonnet — core model
• deep_analysis() tool with o4-mini — reasoning
refact.ai/blog/2025/so...
🧵