Another interesting example of modern computer assistance in mathematics, again involving the site: Problem #707 (), previously marked as "open", is now "disproved" - with the disproof formalized in Lean : . But the path towards that disproof was quite unusual, not fitting neatly into any of the standard narratives about AI in mathematics: * The authors' initial proof used some numerical computer experiments, but did not get much help from LLMs initially, with even the code generated by hand. * The authors then found a disproof by conventional human arguments (and also obtained some stronger results of a similar nature), and did not turn up any prior solution to this problem in their literature searches. Even the modern AI deep research tools did not locate any hits. * Nevertheless, as part of the process of writing the paper, they found a solution to the problem (slightly different from theirs) had been obtained by Hall, thirty years before Erdos even posed the problem! * Perturbed by this, the authors then decided to formalize both their result and Hall's result in Lean. (1/3)
I am increasingly of the opinion that the most productive near-term adoptions of AI in mathematics will primarily come not from applying the most powerful models to the most challenging problems (although we will see a few isolated examples of progress along those lines, especially when large amounts of computational resources and expert attention are applied), but from using medium-powered tools to accelerate and scale up more mundane and time-consuming, but still essential, research tasks, using the accumulated human experience with (and understanding of) such tasks to guide, verify, and safely incorporate the AI output into one's workflows. In such use cases, the output of the AI tool could also have been produced (with increased expenditure of time and attention) by a human expert - but this is actually a feature rather than a bug, as it allows for the AI output to be readily and reliably assessed, confirmed, and converted to a format that such experts are already comfortable working with. An example of such a mundane task is literature review: locating relevant prior literature on a given problem. If the problem already has a commonly agreed upon name, as well a well-established community of researchers working on it, then existing web search and bibliographic search tools are already more than adequate to find both past and current literature on the problem: in particular, the citation graph between the literature will be dense enough that one can start with one key paper in the subject and perform both forward and backward citation searches to obtain a reasonably complete picture of the current state of knowledge on the problem. (1/4)
I was able to use an extended conversation with an AI https://chatgpt.com/share/68ded9b1-37dc-800e-b04c-97095c70eb29 to help answer a MathOverflow question . I had already conducted a theoretical analysis suggesting that the answer to this question was negative, but needed some numerical parameters verifying certain inequalities in order to conclusively build a counterexample. Initially I sought to ask AI to supply Python code to search for a counterexample that I could run and adjust myself, but found that the run time was infeasible and the initial choice of parameters would have made the search doomed to failure anyway. I then switched strategies and instead engaged in a step by step conversation with the AI where it would perform heuristic calculations to locate feasible choices of parameters. Eventually, the AI was able to produce parameters which I could then verify separately (admittedly using Python code supplied by the same AI, but this was a simple 29-line program that I could visually inspect to do what was asked, and also provided numerical values in line with previous heuristic predictions). Here, the AI tool use was a significant time saver - doing the same task unassisted would likely have required multiple hours of manual code and debugging (the AI was able to use the provided context to spot several mathematical mistakes in my requests, and fix them before generating code). Indeed I would have been very unlikely to even attempt this numerical search without AI assistance (and would have sought a theoretical asymptotic analysis instead).
Some loosely organized thoughts on the current Zeitgeist. They were inspired by the response to my recent meta-project mentioned in my previous post , where within 24 hours I became aware of a large number of ongoing small-scale collaborative math projects with their own modest but active community (now listed at ); but they are from the perspective of a human rather than a mathematician. As a crude first approximation, one can think of human society as the interaction between entities at four different scales: 1. Individual humans 2. Small organized groups of humans (e.g., close or extended family; friends; local social or religious organizations; informal sports clubs; small businesses and non-profits; ad hoc collaborations on small projects; small online communities) 3. Large organized groups of humans (e.g., large companies; governments; global institutions; professional sports clubs; large political parties or movements; large social media sites) 4. Large complex systems (e.g., the global economy; the environment; the geopolitical climate; popular culture and "viral" topics; the collective state of science and technology). (1/5)
#SLMath and #ICARM are hosting a joint hybrid workshop on Lean for PDEs, Oct 6-9. Registration is now open:
Any complex project tends to have both explicitly stated and implicitly unstated target goals. For instance, a Lean formalization project may have as its explicit goal the task of obtaining a formal proof of some mathematical claim X; but there are often unstated goals, such as also formalizing key subclaims and definitions X_1, X_2, ... to X in a fashion that would be suitable for upstreaming to the Mathlib library; learning how to use various collaboration tools and distribute tasks; organically discovering insights to the finer structure of the proof of X that might not be emphasized in previous informal proofs; giving real-world training and experience to novice formalizers; and more generally building a community of humans expert in the art of formalization. In the past, it has generally not been necessary to state these implicit goals because of a strong empirical correlation between the achievement of these goals and the achievement of the explicit goals. In the example of the formalization project, pretty much any human-centric effort to accomplish the explicit goal will end up naturally also achieving most of the implicit goals stated above. So the explicit goal effectively becomes a viable proxy for the broader range of actual goals. (1/2)
Earlier this year, the second #AIMO (artificial intelligence mathematical olympiad) concluded, with the winning team solving 34/50 in the final set of math problems (that had been selected to be harder for AI than the first AIMO). The competition was restricted to open source models and run with a limited amoutn of compute. The AIMO has now conducted a retest of these problems both for the top two teams from that competition (NemoSkills and imagination research), as well as OpenAI's o3 model, both with comparable levels of compute resources, and with high resources. Unsurprisingly, the high resource models did better, with the high resource o3 model scoring as high as 47/50, or even 50/50 if given two tries at each question. On the other hand, the gap between the open source models and the commercial models for a fixed amount of compute was relatively slight. More details of this experiment are available at
Thomas Bloom and I are launching a crowdsourced project to link up Thomas's erdosproblems.com site with the #OEIS, by systematically calculating the various integer sequences associated with the Erdos problems and crosschecking them against the OEIS database:
The Simons-Laufer Mathematical Sciences institute, or #SLMath (formerly the Mathematical Sciences Research Institute, or MSRI) has recently restructured its program formats, and is now announcing three new research initiatives, whose applications open on Sep 1 2025: * AxIOM (Accelerating Innovation in Mathematics) is a new, month-long research program at SLMath, designed to accelerate innovation and introduce transformative ideas into the mathematical sciences. Programs begin in Spring 2027. * PROOF (Promoting Research Opportunities and Open Forums) is a two-week summer program designed to provide research opportunities for U.S.-based mathematicians, statisticians, and their collaborators in the U.S. and abroad, whose ongoing research may have been impacted by factors such as heavy teaching loads, professional isolation, limited access to funding, heavy administrative duties, personal obligations, or other constraints. Programs begin June-July 2026. * Lasting Alliance Through Team Immersion and Collaborative Exploration (LATTICE) is a yearlong program which provides opportunities for U.S. mathematicians to conduct collaborative research on topics at the forefront of the mathematical and statistical sciences. Programs begin June-July 2026. (Disclosure: I am vice-chair of the board of trustees at SLMath.)
Recently, the erdosproblems.com site run by Thomas Bloom has enabled a discussion forum for each of the problems on the site. As a result of some discussions I had with Stijn Cambie and Vjeko Kovac on this forum, we were able to come up with a short elementary solution to a previously open problem on the site: . In fact the proof is sufficiently short and elementary that it was straightforward to formalize in Lean!