Yanasse: Finding New Proofs from Deep Vision's Analogies, Part 1

arXiv cs.AI / 4/21/2026


Key Points

  • Project Yanasse proposes a workflow to discover new theorem proofs by transferring proof-strategy patterns between mathematically distant domains, rather than substituting symbols directly.
  • The system analyzes tactic usage across 27 top-level Mathlib areas (217,133 proof states), selects candidate tactics via z-scores, and performs analogy matching with a GPU-accelerated engine for an NP-hard matching problem, run on Apple’s MPS backend.
  • An AI reasoning agent then semantically adapts the selected Lean 4 tactic invocation patterns to the target theorem, aiming for strategy-level transfer.
  • In the first study applying the method from Probability to Representation Theory, the approach produced 4 Lean-verified proofs from 10 attempts (40%) with zero `sorry` declarations.
  • A key insight is that tactic schemas split into a “head” (domain-gated, hard to transfer) and a “modifier” (domain-general, often transferable); the matching engine itself is domain-independent, with only the relation extractor being domain-specific.
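The z-score selection step in the second bullet can be sketched as follows. This is a hypothetical reconstruction, not the project's code: the function names (`z_scores`, `candidate_tactics`), the toy usage rates, and the 1.5 threshold are all invented for illustration; the real system works over per-area tactic counts drawn from 217,133 Mathlib proof states.

```python
import statistics

def z_scores(rates):
    """rates: area -> usage rate of one tactic across Mathlib areas.
    Returns area -> z-score of that area's rate against the tactic's
    mean rate over all areas."""
    mean = statistics.fmean(rates.values())
    std = statistics.pstdev(rates.values())
    if std == 0:
        return {area: 0.0 for area in rates}
    return {area: (r - mean) / std for area, r in rates.items()}

def candidate_tactics(usage, source, target, z_min=1.5):
    """usage: tactic -> (area -> rate). Keep tactics that are
    over-represented in `source` (z >= z_min) yet absent from
    `target` -- the 'heavily used there, unseen here' candidates."""
    picks = []
    for tactic, rates in usage.items():
        if (z_scores(rates)[source] >= z_min
                and rates.get(target, 0.0) == 0.0):
            picks.append(tactic)
    return picks

# Toy rates (fraction of proof states using each tactic, per area):
usage = {
    "filter_upwards": {"Probability": 0.12, "RepresentationTheory": 0.0,
                       "Topology": 0.01, "Algebra": 0.0},
    "ring": {"Probability": 0.03, "RepresentationTheory": 0.04,
             "Topology": 0.02, "Algebra": 0.05},
}
```

On these toy numbers, `candidate_tactics(usage, "Probability", "RepresentationTheory")` picks out `filter_upwards`, the very tactic whose transfer the study examines.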

Abstract

Project Yanasse presents a method for discovering new proofs of theorems in one area of mathematics by transferring proof-strategy patterns (e.g., Lean 4 tactic invocation patterns) from a structurally distant area. The system extracts tactic usage distributions across 27 top-level areas of Mathlib (217,133 proof states), computes z-scores to identify tactics that are heavily used in a source area but rare or absent in a target area, matches source and target proof states via GPU-accelerated NP-hard analogy matching (running on a MacBook Air via Apple's MPS backend), and then asks an AI reasoning agent to semantically adapt, rather than symbol-substitute, the source tactic invocation pattern to the target theorem. In this first part of the study, the method is applied to the pair Probability -> Representation Theory, producing 4 Lean-verified new proofs out of 10 attempts (40%). The proofs compile with zero `sorry` declarations. The key finding is that tactic schemas decompose into a head (domain-gated, rarely transfers) and a modifier (domain-general, often transfers): the head of `filter_upwards` fails in representation theory (there is no Filter structure), but its `[LIST] with ω` modifier transfers cleanly as `ext1` + `simp [LIST]` + `rfl`. Crucially, the underlying matching engine, `deep_vision_lib.py`, is entirely domain-independent: the same optimization code for the NP-hard matching problem that matches chess positions by analogy also matches Lean proof states by analogy, without knowing which domain it is processing. Only the relation extractor is domain-specific.
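To make the head/modifier split concrete, here is a schematic Lean 4 sketch. It is not one of the paper's verified proofs; the example lemma and all names in it (`f`, `g`, `h`, `n`) are invented for illustration, under the assumption that Mathlib's `ext1` and `simp` behave as usual.

```lean
-- Source-style schema (Probability): the head `filter_upwards`
-- requires a Filter structure; the `[LIST] with ω` part is the modifier:
--   filter_upwards [h₁, h₂] with ω hω₁ hω₂
--
-- Target-style schema (no Filter available): the modifier's shape of
-- "fix a point, then discharge with the hypothesis list" survives as
-- `ext1` + `simp [LIST]`, with `rfl` closing any trivial residue.
example (f g : ℕ → ℕ) (h : ∀ n, f n = g n) : f = g := by
  ext1 n        -- plays the role of `with ω`: fix a point
  simp [h]      -- plays the role of `[LIST]`: use the hypothesis list
```

The point of the sketch is that the modifier's structure, a list of facts plus a bound point, is meaningful in both domains even though the head that introduced it is not.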
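The claim that the matching engine is domain-agnostic can be illustrated with a toy sketch. This is my own reconstruction, not the paper's `deep_vision_lib.py`: treat each domain as a bag of relation triples produced by its extractor, and search for the entity mapping that preserves the most triples. The function name `best_analogy` and the toy relations are hypothetical, and the real engine solves this NP-hard problem with GPU acceleration rather than the brute-force enumeration shown here.

```python
from itertools import permutations

def best_analogy(src_rels, tgt_rels):
    """Domain-independent analogy matching, brute force.
    src_rels / tgt_rels: lists of (relation, entity_a, entity_b)
    triples emitted by a domain-specific relation extractor.
    Returns the source->target entity mapping that preserves the most
    triples, plus its score. The code never inspects what the entities
    are: chess pieces and Lean proof-state terms are treated alike."""
    src_ents = sorted({x for _, a, b in src_rels for x in (a, b)})
    tgt_ents = sorted({x for _, a, b in tgt_rels for x in (a, b)})
    tgt_set = set(tgt_rels)
    best, best_score = {}, -1
    # Enumerate injective mappings (exponential; the point where the
    # real engine substitutes a GPU-accelerated search).
    for perm in permutations(tgt_ents, len(src_ents)):
        m = dict(zip(src_ents, perm))
        score = sum((r, m[a], m[b]) in tgt_set for r, a, b in src_rels)
        if score > best_score:
            best, best_score = m, score
    return best, best_score
```

For example, mapping a two-step dependency chain `a -> b -> c` onto `x -> y -> z` recovers the structure-preserving alignment `{a: x, b: y, c: z}` regardless of what the entities denote.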