Chinese artificial intelligence (AI) company DeepSeek recently launched DeepSeekMath-V2, a mathematical reasoning model that sets new performance benchmarks in AI-based problem-solving. The model, now open-sourced on Hugging Face and GitHub, introduces a self-verification framework designed to ensure not only correct answers but also logically sound, verifiable proofs.
The model achieved gold-medal-level results on the 2025 International Mathematical Olympiad (IMO) and the 2024 Chinese Mathematical Olympiad (CMO).
It also scored 118 out of 120 points on the highly competitive 2024 Putnam exam, well above the highest human score of 90.
The model's strength was further demonstrated on IMO-ProofBench, where it surpassed models such as Google DeepMind's Gemini Deep Think. The framework pits two large language models against each other: one acts as a "prover" that generates mathematical proofs, while the other serves as a "reviewer" that examines the reasoning.
According to the DeepSeek team, this mechanism addresses a critical limitation of current AI systems: a correct final answer does not guarantee a correct reasoning process.
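The prover and reviewer arrangement can be illustrated with a minimal Python sketch. This is not DeepSeek's implementation: the names `Review`, `prove_with_review`, `toy_prover` and `toy_reviewer`, the 0-to-1 score scale, and the retry loop are all assumptions made for illustration, and simple functions stand in for the two language models.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Review:
    score: float   # 0.0 = flawed, 1.0 = rigorous (the scale is an assumption)
    feedback: str  # issues the reviewer found in the reasoning


# Hypothetical interfaces: a prover maps (problem, feedback) to a proof,
# and a reviewer maps (problem, proof) to a Review.
Prover = Callable[[str, Optional[str]], str]
Reviewer = Callable[[str, str], Review]


def prove_with_review(problem: str, prover: Prover, reviewer: Reviewer,
                      max_rounds: int = 3,
                      accept_at: float = 1.0) -> Tuple[str, Review]:
    """Ask the prover for a proof, let the reviewer examine it, and retry
    with the reviewer's feedback until the proof is accepted."""
    feedback: Optional[str] = None
    proof, review = "", Review(0.0, "no proof attempted")
    for _ in range(max_rounds):
        proof = prover(problem, feedback)
        review = reviewer(problem, proof)
        if review.score >= accept_at:
            break
        feedback = review.feedback
    return proof, review


def toy_prover(problem: str, feedback: Optional[str]) -> str:
    # First attempt: a true statement backed only by one example.
    if feedback is None:
        return "The sum is even because 2 + 4 = 6."
    # Revised attempt after feedback: a general argument.
    return "Let a = 2m and b = 2n. Then a + b = 2(m + n), which is even."


def toy_reviewer(problem: str, proof: str) -> Review:
    # Stand-in check: accept only the general argument.
    if "2(m + n)" in proof:
        return Review(1.0, "")
    return Review(0.0, "A single example does not prove the general claim.")


problem = "Show that the sum of two even integers is even."
proof, review = prove_with_review(problem, toy_prover, toy_reviewer)
print(review.score, proof)
```

In this toy run the first attempt reaches a true conclusion through invalid reasoning, which is exactly the case a check on the final answer alone would miss; the reviewer rejects it, and the revised proof is accepted in the second round.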
DeepSeek says this breakthrough establishes self-verifying mathematical reasoning as a viable and promising path towards developing more robust and reliable mathematical AI systems.