Leveraging RL with our reward mechanism, we push Qwen2.5-Coder 7B to performance on par with much larger LLMs (>400B) on the BIRD dataset! 🤯
Model: huggingface.co/simone-papic...
Paper: huggingface.co/papers/2504....
Details 👇
1/3 🧵