1. Yang, L. et al. ProBench: Benchmarking large language models in competitive programming. Preprint at https://doi.org/10.48550/arXiv.2502.20868 (2025).
2. Islam, M. A., Ali, M. E. & Parvez, M. R. MapCoder: Multi-agent code generation for competitive problem solving. Preprint at https://doi.org/10.48550/arXiv.2405.11403 (2024).
3. Hossain, M. S., Tabassum, A., Arefin, M. F. & Zaman, T. S. LLM-ProS: Analyzing large language models’ performance in competitive problem solving. Preprint at https://doi.org/10.48550/arXiv.2502.04355 (2025).
4. Zheng, Z. et al. LiveCodeBench Pro: How do olympiad medalists judge LLMs in competitive programming? Preprint at https://doi.org/10.48550/arXiv.2506.11928 (2025).
5. Li, X. et al. Humanity’s last code exam: Can advanced LLMs conquer human’s hardest code competition? Preprint at https://doi.org/10.48550/arXiv.2506.12713 (2025).
6. Liu, F. et al. FastFixer: An efficient and effective approach for repairing programming assignments. Preprint at https://doi.org/10.48550/arXiv.2410.21285 (2024).
7. Wei, M. et al. Evaluating and improving large language models for competitive program generation. Preprint at https://doi.org/10.48550/arXiv.2506.22954 (2025).
8. Quan, S. et al. CodeElo: Benchmarking competition-level code generation of LLMs with human-comparable Elo ratings. Preprint at https://doi.org/10.48550/arXiv.2501.01257 (2025).