https://www.linkedin.com/in/takerunakao/
After repeated experiments with SFT, GRPO, model merging, and MoE on the medium-sized Qwen3-32B model, we ultimately surpassed the base model's score. It was a very exciting challenge!