A Chinese artificial intelligence system has solved a mathematical problem, posed by an American mathematician in 2014, that had remained open for a decade.
A team from Peking University developed a semi-autonomous system that reviewed decades of mathematical research to arrive at the solution, then verified the results itself with almost no human intervention.
The solved problem is an "algebraic conjecture" that was formulated by Professor Dan Anderson of the University of Iowa before his death in 2022.
The researchers, led by mathematician Dong Bin, explained that their system solved this open problem in commutative algebra (a branch of abstract algebra that studies commutative rings and their ideals) and proved the result automatically. They describe the achievement as a concrete example of how mathematical research can be automated using artificial intelligence.
What distinguishes this system is its speed compared to humans: it can perform complex mathematical tasks that previously required the collaboration of experts from multiple disciplines. However, the biggest challenge the researchers faced was that mathematical proofs require absolute precision, while proofs produced by large language models are often unreliable because of their tendency to "hallucinate", that is, to fabricate inaccurate information. The Chinese team therefore designed a system that combines two agents: one that performs natural language reasoning, and another that formalizes and verifies the results.
The new system begins with a reasoning agent called Rethlas, which uses a mathematical theorem search engine called Matlas to explore solution strategies. When Rethlas arrives at a candidate proof, a second agent called Archon, using another search engine called LeanSearch, transforms that proof into a project that can be checked by the interactive theorem prover Lean 4. Lean is not just a proof checker; it is a complete programming language whose library contains hundreds of thousands of mathematical theorems and definitions.
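The division of labor described above can be pictured as a propose, formalize, and check loop. The Python sketch below is purely illustrative and is not the team's code: the function names, the feedback step, and the toy stand-ins for the reasoning agent, the formalizer, and the proof checker are all assumptions made for this example, and no real call to Rethlas, Archon, or Lean is involved.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Attempt:
    informal_proof: str   # natural-language proof from the reasoning agent
    formal_project: str   # formalized version handed to the checker
    verified: bool        # True only if the checker accepted it
    rounds: int           # number of propose/check rounds used


def run_pipeline(
    propose: Callable[[str], str],
    formalize: Callable[[str], str],
    check: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> Optional[Attempt]:
    """Loop until the checker accepts a proof or the round budget runs out.

    Feeding the checker's message back to the proposer is an assumption
    of this sketch, not a detail confirmed by the article.
    """
    feedback = ""
    for round_number in range(1, max_rounds + 1):
        informal = propose(feedback)
        formal = formalize(informal)
        accepted, message = check(formal)
        if accepted:
            return Attempt(informal, formal, True, round_number)
        feedback = message
    return None


# Hypothetical toy stand-ins so the sketch runs on its own.
def toy_propose(feedback: str) -> str:
    # First draft has a gap; after feedback the "agent" repairs it.
    return "proof with gap" if not feedback else "complete proof"


def toy_formalize(informal: str) -> str:
    return f"-- formalized: {informal}"


def toy_check(formal: str) -> Tuple[bool, str]:
    # A real checker would compile a Lean project; this one scans a string.
    if "gap" in formal:
        return False, "unresolved step"
    return True, ""


result = run_pipeline(toy_propose, toy_formalize, toy_check)
```

The point of the design is that nothing counts as solved until the checker accepts it, so an unreliable natural-language proof cannot slip through on the reasoning agent's word alone.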
The Chinese system required only 80 hours of runtime to solve Anderson's conjecture, without any need for mathematical judgment from a human operator. However, the researchers noted that the process could be accelerated if a human mathematician guided the Archon system.
The team asserts that this work represents a promising model for the future, where informal and formal reasoning systems work together to produce verifiable results, while greatly reducing human effort.
