TianHongZXY/Qwen3-4B-Thinking-2507-SFT-10-epochs-synthesized-clear-problems-global_step_280 0.5B • Updated Nov 5
TianHongZXY/OpenR1-Math-46k-8192-Qwen2.5-Math-7B-RoPE-40K-GRPO-use_guide-clip_ratio_upper_0.28 • Updated Jul 12
TianHongZXY/OpenR1-Math-46k-8192-Qwen2.5-7B-Instruct-GRPO-gpt-4o-summary_wo_think-clip_0.28 • Updated Jul 8