
NANOZK: Layerwise Zero-Knowledge Proofs for Verifiable Large Language Model Inference

arXiv cs.AI / 3/20/2026

💬 Opinion · Developer Stack & Infrastructure · Models & Research

Key Points

  • NANOZK is a zero-knowledge proof system that enables users to cryptographically verify that LLM outputs come from a specific model.
  • The approach decomposes transformer inference into independent layers, producing constant-size proofs per layer and enabling parallel proving regardless of model width.
  • It uses lookup-table approximations for softmax, GELU, and LayerNorm with zero measurable accuracy loss, plus Fisher information-guided verification for handling very deep models when full proving is impractical.
  • For transformer models up to depth d=128, NANOZK achieves 5.5 KB layer proofs and 24 ms verification time, with 70x smaller proofs and 5.7x faster proving than EZKL, while preserving formal soundness (epsilon < 1e-37).
  • Lookup approximations preserve perplexity exactly, enabling verifiable inference without compromising model quality.
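The lookup-table idea behind the third point can be illustrated outside of any proof system: precompute a non-arithmetic function's values on a quantized grid, so that each evaluation becomes a single table access, which is the kind of operation ZK circuits handle efficiently via lookup arguments. A minimal Python sketch for GELU (the grid scale and clamping range here are illustrative choices, not the paper's parameters):

```python
import math

def gelu(x: float) -> float:
    """Exact GELU via the Gaussian CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Illustrative fixed-point grid: step 1/256 over [-8, 8].
SCALE = 256
LO, HI = -8 * SCALE, 8 * SCALE
TABLE = [gelu(q / SCALE) for q in range(LO, HI + 1)]

def gelu_lookup(x: float) -> float:
    """Evaluate GELU with one table access on the quantized input."""
    q = max(LO, min(HI, round(x * SCALE)))
    return TABLE[q - LO]
```

With a step of 1/256 and GELU's bounded slope, the worst-case error of this toy table is on the order of 1e-3; the paper's claim is that at its chosen precision the approximation error is below any measurable effect on perplexity.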

Abstract

When users query proprietary LLM APIs, they receive outputs with no cryptographic assurance that the claimed model was actually used. Service providers could substitute cheaper models, apply aggressive quantization, or return cached responses, all undetectable by users paying premium prices for frontier capabilities. We present NANOZK, a zero-knowledge proof system that makes LLM inference verifiable: users can cryptographically confirm that outputs correspond to the computation of a specific model. Our approach exploits the fact that transformer inference naturally decomposes into independent layer computations, enabling a layerwise proof framework where each layer generates a constant-size proof regardless of model width. This decomposition sidesteps the scalability barrier facing monolithic approaches and enables parallel proving. We develop lookup-table approximations for non-arithmetic operations (softmax, GELU, LayerNorm) that introduce zero measurable accuracy loss, and introduce Fisher information-guided verification for scenarios where proving all layers is impractical. On transformer models up to d=128, NANOZK generates constant-size layer proofs of 5.5 KB (2.1 KB attention + 3.5 KB MLP) with 24 ms verification time. Compared to EZKL, NANOZK achieves 70x smaller proofs and 5.7x faster proving time at d=128, while maintaining formal soundness guarantees (epsilon < 1e-37). Lookup approximations preserve model perplexity exactly, enabling verification without quality compromise.
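The layerwise structure described in the abstract can be pictured with an ordinary hash-commitment chain: each layer contributes one fixed-size artifact binding its input and output, so the per-layer artifact stays constant-size regardless of layer width, and adjacent layers are linked by requiring layer i's output to equal layer i+1's input. The sketch below uses plain SHA-256 and is only a structural analogy for the decomposition; it is not a zero-knowledge proof and reveals the activations it commits to:

```python
import hashlib
import json

def layer_commitment(idx: int, inp: list, out: list) -> str:
    """Fixed-size (32-byte) digest binding one layer's input and output."""
    blob = json.dumps({"layer": idx, "in": inp, "out": out}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

def prove_chain(activations: list) -> list:
    """One commitment per layer; activations[i] is layer i's input."""
    return [
        layer_commitment(i, activations[i], activations[i + 1])
        for i in range(len(activations) - 1)
    ]

def verify_chain(activations: list, commitments: list) -> bool:
    """Recompute each layer's commitment and compare with the prover's."""
    return commitments == prove_chain(activations)
```

Because each digest is 32 bytes whether the layer is 4-wide or 4096-wide, the chain's per-layer cost is independent of model width, which is the property the paper's per-layer SNARKs achieve while also hiding the weights and activations.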