NeurIPS Tutorial on Machine Learning for Theorem Proving - 2023

Details

Title: NeurIPS Tutorial on Machine Learning for Theorem Proving
Author(s): Azerbayev, Zhangir and First, Emily and Jiang, Albert Q. and Yang, Kaiyu and Anandkumar, Anima and Goodman, Noah and Sanchez-Stern, Alex and Song, Dawn and Welleck, Sean
Link(s): https://neurips.cc/virtual/2023/tutorial/73946

Rough Notes

They have LeanCopilot, a Copilot-style assistant for the Lean proof assistant.

Aim: go from a theorem statement to a proof, with both represented as programs.

Key property: generating a proof is hard, but checking a proof is easy. There is no single ground-truth proof to generate; any proof that passes the checker is acceptable.

In the real world, proof assistants such as Lean, Isabelle, and Coq are used by human mathematicians.
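To make this concrete, here is a toy Lean 4 example (my own, not from the tutorial): the theorem statement and its tactic proof are both programs, and the Lean kernel mechanically checks that the proof is valid.

  theorem add_comm_example (a b : Nat) : a + b = b + a := by
    -- Rewrite with the library lemma Nat.add_comm; the remaining goal
    -- `b + a = b + a` is closed by reflexivity automatically.
    rw [Nat.add_comm]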

Assume we generate proofs step by step; each proof step is called a tactic. Proof search then starts from the current proof state and generates candidate tactics. We can finetune a model on existing data (valid Lean proofs) to obtain a tactic generator. Given a tactic generator, we can use it within a search algorithm such as depth-first search (DFS). A straightforward improvement is best-first search, which orders states by a cumulative score (e.g. log-likelihood) from the tactic generator. There is also HyperTree Proof Search, inspired by Monte-Carlo Tree Search (MCTS).
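A minimal sketch of best-first proof search under some assumptions: tactic_generator(state) returns (tactic, log_prob) candidates, apply_tactic(state, tactic) returns the next proof state or None on failure, and is_proved(state) tests for QED. All three are hypothetical callables, not part of any particular library, and proof states are assumed hashable.

  import heapq

  def best_first_search(initial_state, tactic_generator, apply_tactic,
                        is_proved, max_expansions=1000):
      """Expand the proof state with the highest cumulative log-likelihood first.

      tactic_generator, apply_tactic, is_proved are hypothetical callables
      supplied by the caller; proof states are assumed hashable.
      """
      # Heap entries: (negated cumulative log-prob, tie-breaker, state, tactics so far)
      frontier = [(0.0, 0, initial_state, [])]
      counter = 1
      visited = set()
      for _ in range(max_expansions):
          if not frontier:
              break
          neg_score, _, state, proof = heapq.heappop(frontier)
          if is_proved(state):
              return proof                       # tactic sequence closing all goals
          if state in visited:
              continue
          visited.add(state)
          for tactic, log_prob in tactic_generator(state):
              next_state = apply_tactic(state, tactic)
              if next_state is None:             # tactic failed in the proof assistant
                  continue
              heapq.heappush(frontier, (neg_score - log_prob, counter,
                                        next_state, proof + [tactic]))
              counter += 1
      return None                                # search budget exhausted

Depth-first search corresponds to always expanding the most recently generated state; best-first instead orders the whole frontier by the generator's cumulative score.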

Now, we consider premise selection, which is a different problem from proof search: retrieving relevant lemmas or definitions from a large library to use in the proof. See e.g. ReProver, MagnusHammer.
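A rough sketch of retrieval-based premise selection, in the spirit of the dense retrievers used by systems like ReProver; embed is a hypothetical encoder mapping a goal or premise statement to a vector, and the ranking is plain cosine similarity.

  import numpy as np

  def select_premises(goal, premises, embed, k=5):
      """Rank library premises by cosine similarity to the current goal.

      `premises` maps premise names to their statements; `embed` is a
      hypothetical text encoder (e.g. a finetuned transformer).
      """
      goal_vec = embed(goal)
      goal_vec = goal_vec / np.linalg.norm(goal_vec)
      scored = []
      for name, statement in premises.items():
          vec = embed(statement)
          vec = vec / np.linalg.norm(vec)
          scored.append((float(goal_vec @ vec), name))
      scored.sort(reverse=True)
      return [name for _, name in scored[:k]]    # top-k premises for the tactic generator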

We can model the formal theorem proving process as a Markov Decision Process (MDP), sketched in code after the list, where:

  • States: Goal(s) to prove.
  • Actions: Tactics (1 step in the proof).
  • Rewards: 1 for QED (all goals closed), 0 otherwise.
  • Transitions: Deterministic.
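A minimal sketch of this MDP as a search environment. ProofState and assistant.run_tactic are illustrative names, not a real API; run_tactic is assumed to return the resulting list of goals, or None if the tactic fails.

  from dataclasses import dataclass
  from typing import List, Optional

  @dataclass(frozen=True)
  class ProofState:
      goals: tuple                   # remaining goals; an empty tuple means QED

  def step(assistant, state: ProofState, tactic: str):
      """One deterministic MDP transition: apply a tactic to the current goals."""
      # `assistant.run_tactic` is a hypothetical proof-assistant wrapper.
      new_goals: Optional[List[str]] = assistant.run_tactic(state.goals, tactic)
      if new_goals is None:          # tactic failed; remain in the same state
          return state, 0.0, False
      next_state = ProofState(goals=tuple(new_goals))
      done = len(next_state.goals) == 0
      reward = 1.0 if done else 0.0  # sparse reward: 1 only at QED
      return next_state, reward, done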

Beyond formal statements, for which there is not much data, we would want to use informal statements, e.g. natural language (for which there is far more data), as much as possible.
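As a toy illustration of the informal/formal gap (my own example, not from the tutorial), here is the same statement in natural language and in Lean 4:

  -- Informal: "For every natural number n, n + 0 = n."
  theorem n_add_zero (n : Nat) : n + 0 = n := by
    -- `n + 0` reduces to `n` by the definition of addition,
    -- so reflexivity closes the goal.
    rfl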

[Notes skipped for the formal software verification section.]
