🏆 G-Pass@k Leaderboard 🏆

GPassK: Are Your LLMs Capable of Stable Reasoning?

paper code data

📝 Notes

  1. Models labeled with 🌍 are closed-source; all others are open-source.
  2. Models labeled with 🧮 are mathematics-specialized models.
  3. Models labeled with 💡 are o1-like models with long chain-of-thought (CoT) reasoning.

🤗 Acknowledgement

Thanks to EvalPlus for sharing the leaderboard template.