Structural Generalization on SLOG without Hand-Written Rules

arXiv cs.AI / 4/30/2026


Key Points

  • The paper addresses structural generalization in semantic parsing, where models must apply learned compositional rules to new structural combinations without relying on hand-written algebraic rules.
  • It proposes a neural cellular automaton (NCA) with a discrete bottleneck that learns compositional operations purely from data via local iterations, avoiding hand-crafted compositional rules.
  • On the SLOG benchmark, the method reaches 100% type-exact match on 11 out of 17 structural generalization categories, including cases where AM-Parser performs very poorly (0–74%).
  • The authors find that all failures reduce to exactly two mechanisms: wh-extraction contexts combined with reduced verb types not seen together in training, and modifiers appearing on the subject side of verbs.
  • By analyzing CCG structural features, the study shows that intermediate scores arise from mixing structurally distinct patterns rather than from partial generalization: successes align with directed operations covered during training, while failures correspond to operations absent from it.
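The "mixture, not partial generalization" finding above can be illustrated with a small worked example. The data here is hypothetical (pattern names and counts are invented for illustration), but it shows how an aggregate score like 41.4% can emerge from sub-patterns that are each all-or-nothing when results are grouped by structural feature:

```python
from collections import defaultdict

# Hypothetical per-instance results: (CCG sub-pattern label, correct?).
# Each sub-pattern is all-or-nothing, yet the aggregate looks "partial".
results = [("pattern_A", True)] * 414 + [("pattern_B", False)] * 586

by_pattern = defaultdict(list)
for pattern, ok in results:
    by_pattern[pattern].append(ok)

overall = sum(ok for _, ok in results) / len(results)
per_pattern = {p: sum(v) / len(v) for p, v in by_pattern.items()}

print(f"overall accuracy: {overall:.1%}")  # 41.4%
print(per_pattern)  # each sub-pattern is exactly 1.0 or 0.0
```

Decomposing by sub-pattern in this way is what lets the authors attribute every failure to a specific missing operation rather than to noisy, graded behavior.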

Abstract

Structural generalization in semantic parsing requires systems to apply learned compositional rules to novel structural combinations. Existing approaches either rely on hand-written algebraic rules (AM-Parser) or fail to generalize structurally (Transformer-based models). We present an alternative requiring no hand-written compositional rules, based on a neural cellular automaton (NCA) with a discrete bottleneck: all compositional rules are learned from data through local iteration. On the SLOG benchmark, the system achieves 100% type-exact match on 11 of 17 structural generalization categories, including three where AM-Parser scores 0 to 74%, with an overall standard deviation of 0.2 across 10 seeds (vs. AM-Parser's 4.3). Analysis reveals that all 5,539 failure instances reduce to exactly two mechanisms: novel combinations of wh-extraction context with reduced verb types, and modifiers appearing on the subject side of verbs. When we decompose results by CCG structural features, each sub-pattern either succeeds on all instances or fails on all. Intermediate scores (e.g., 41.4%) are mixtures of structurally distinct CCG patterns, not partial generalization. All failures correspond to directed operations absent from training; all successes correspond to operations already covered.
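The abstract's core mechanism, a cell grid updated by a shared local rule whose states pass through a discrete bottleneck each step, can be sketched as follows. This is a minimal illustration under assumed choices, not the paper's implementation: the weights here are random stand-ins for learned parameters, the bottleneck is nearest-neighbor snapping to a codebook, and all sizes and names (`N_CELLS`, `nca_step`, etc.) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

N_CELLS, STATE_DIM, N_CODES = 8, 16, 32  # hypothetical sizes

# Random parameters stand in for weights that would be learned from data.
W = rng.normal(0.0, 0.1, (3 * STATE_DIM, STATE_DIM))
codebook = rng.normal(0.0, 1.0, (N_CODES, STATE_DIM))

def discretize(states):
    """Discrete bottleneck: snap each cell state to its nearest codebook vector."""
    dists = ((states[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return codebook[dists.argmin(axis=1)]

def nca_step(states):
    """One local update: each cell reads only its left/self/right neighbors."""
    left = np.roll(states, 1, axis=0)
    right = np.roll(states, -1, axis=0)
    neighborhood = np.concatenate([left, states, right], axis=1)
    return discretize(np.tanh(neighborhood @ W))

states = rng.normal(size=(N_CELLS, STATE_DIM))
for _ in range(5):  # iterate the local rule; composition emerges over steps
    states = nca_step(states)

# After each step, every cell state lies exactly on a codebook entry.
on_codebook = all(
    np.any(np.all(np.isclose(s, codebook), axis=1)) for s in states
)
print(on_codebook)
```

The design point this sketch captures is that there is no global, hand-written composition operator: each cell sees only its immediate neighbors, and the discrete bottleneck forces the iterated updates to settle on a finite vocabulary of intermediate states, which is where learned compositional operations would live.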
