Do Neurons Dream of Primitive Operators? Wake-Sleep Compression Rediscovers Schank's Event Semantics

arXiv cs.AI / 3/30/2026

Opinion | Ideas & Deep Analysis | Models & Research

Key Points

  • The paper tests whether primitive event operators like those in Schank’s conceptual dependency theory can be discovered automatically through compression (Minimum Description Length, MDL) pressure rather than hand-coded from linguistic intuition.
  • It adapts DreamCoder’s wake-sleep library learning to events given as before/after world-state pairs: the wake phase finds operator compositions that explain each event, and the sleep phase extracts recurring patterns as new operators under MDL pressure, starting from four generic primitives.
  • On synthetic data, the discovered operators reach a Bayesian MDL within 4% of Schank’s hand-coded primitives while explaining 100% of events, versus 81% for Schank’s primitives.
  • On commonsense data from the ATOMIC knowledge graph, Schank’s original primitives explain only 10% of events, while the learned library explains 100%; the dominant discovered operators are mental and emotional state changes (CHANGE_wants, CHANGE_feels, CHANGE_is) rather than physical actions.
  • The authors present this as the first empirical evidence that event primitives can be derived from compression pressure. They argue that Schank’s core primitives are information-theoretically justified but that the full inventory is substantially richer, with mental and emotional operators dominating in naturalistic data.
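To make the wake phase concrete, here is a minimal Python sketch of explaining one event from its before/after world states. It is an illustration under several assumptions, not the paper's implementation: the fact encoding `(attribute, entity, value)`, the functions `explain` and `run`, and the `UNSET` primitive are all hypothetical, and a greedy state diff stands in for DreamCoder-style program search. Only the operator naming pattern (for example MOVE_PROP_has, CHANGE_location) follows the paper.

```python
from typing import FrozenSet, List, Tuple

Fact = Tuple[str, str, str]   # (attribute, entity, value), a hypothetical encoding
State = FrozenSet[Fact]
Op = Tuple[str, ...]          # (primitive, attribute, *arguments)


def explain(before: State, after: State) -> List[Op]:
    """Greedily explain a before/after pair with generic primitives (toy wake step)."""
    removed = sorted(before - after)
    added = sorted(after - before)
    program: List[Op] = []
    for attr, entity, value in removed:
        # Same attribute and entity, different value: the entity's property changed.
        change = next((a for a in added if a[0] == attr and a[1] == entity), None)
        # Same attribute and value, different entity: the value moved to a new holder.
        move = next(
            (a for a in added if a[0] == attr and a[2] == value and a[1] != entity),
            None,
        )
        if change is not None:
            program.append(("CHANGE", attr, entity, value, change[2]))
            added.remove(change)
        elif move is not None:
            program.append(("MOVE_PROP", attr, value, entity, move[1]))
            added.remove(move)
        else:
            program.append(("UNSET", attr, entity, value))  # hypothetical primitive
    # Facts that appear without a matching removal are plain assertions.
    program.extend(("SET", attr, entity, value) for attr, entity, value in added)
    return program


def run(program: List[Op], state: State) -> State:
    """Apply a program to a state, to check that it really explains the event."""
    facts = set(state)
    for name, attr, *args in program:
        if name == "CHANGE":
            entity, old, new = args
            facts.discard((attr, entity, old))
            facts.add((attr, entity, new))
        elif name == "MOVE_PROP":
            value, src, dst = args
            facts.discard((attr, src, value))
            facts.add((attr, dst, value))
        elif name == "UNSET":
            entity, value = args
            facts.discard((attr, entity, value))
        elif name == "SET":
            entity, value = args
            facts.add((attr, entity, value))
    return frozenset(facts)


# A toy "mail" event: the letter changes owner and location.
before = frozenset({("has", "alice", "letter"), ("location", "letter", "alice_home")})
after = frozenset({("has", "bob", "letter"), ("location", "letter", "bob_home")})
program = explain(before, after)
signatures = ["_".join(op[:2]) for op in program]
```

In this toy case `signatures` is `["MOVE_PROP_has", "CHANGE_location"]`, which lines up with the paper's reading of "mail" as ATRANS plus PTRANS, and `run(program, before)` reproduces `after`.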

Abstract

We show that they do. Schank's conceptual dependency theory proposed that all events decompose into primitive operations -- ATRANS, PTRANS, MTRANS, and others -- hand-coded from linguistic intuition. Can the same primitives be discovered automatically through compression pressure alone? We adapt DreamCoder's wake-sleep library learning to event state transformations. Given events as before/after world state pairs, our system finds operator compositions explaining each event (wake), then extracts recurring patterns as new operators optimized under Minimum Description Length (sleep). Starting from four generic primitives, it discovers operators mapping directly to Schank's: MOVE_PROP_has = ATRANS, CHANGE_location = PTRANS, SET_knows = MTRANS, SET_consumed = INGEST, plus compound operators ("mail" = ATRANS + PTRANS) and novel emotional state operators absent from Schank's taxonomy. We validate on synthetic events and real-world commonsense data from the ATOMIC knowledge graph. On synthetic data, discovered operators achieve Bayesian MDL within 4% of Schank's hand-coded primitives while explaining 100% of events vs. Schank's 81%. On ATOMIC, results are more dramatic: Schank's primitives explain only 10% of naturalistic events, while the discovered library explains 100%. Dominant operators are not physical-action primitives but mental and emotional state changes -- CHANGE_wants (20%), CHANGE_feels (18%), CHANGE_is (18%) -- none in Schank's original taxonomy. These results provide the first empirical evidence that event primitives can be derived from compression pressure, that Schank's core primitives are information-theoretically justified, and that the complete inventory is substantially richer than proposed -- with mental/emotional operators dominating in naturalistic data.
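The sleep phase described in the abstract can be sketched in Python as a single compression step: propose the most frequent adjacent operator pair as a new compound operator, and keep it only if the total description length drops. This is a rough illustration under stated assumptions. The functions `description_length`, `rewrite`, and `sleep_step` are hypothetical names, and the crude two-part code used here (a uniform cost per symbol, plus one end-of-program marker per event) is a stand-in for the Bayesian MDL the paper reports, not a reproduction of it.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Program = List[str]                   # an event explained as a sequence of operator names
Library = Dict[str, Tuple[str, ...]]  # operator name -> definition body


def description_length(programs: List[Program], library: Library) -> float:
    """Crude two-part code: bits for library definitions plus bits for the corpus."""
    # One extra symbol in the alphabet serves as the end-of-program marker.
    bits_per_symbol = math.log2(len(library) + 1)
    library_symbols = sum(len(body) for body in library.values())
    corpus_symbols = sum(len(p) + 1 for p in programs)
    return (library_symbols + corpus_symbols) * bits_per_symbol


def rewrite(program: Program, pair: Tuple[str, str], name: str) -> Program:
    """Replace each adjacent occurrence of `pair` with the compound operator `name`."""
    out, i = [], 0
    while i < len(program):
        if tuple(program[i:i + 2]) == pair:
            out.append(name)
            i += 2
        else:
            out.append(program[i])
            i += 1
    return out


def sleep_step(programs: List[Program], library: Library):
    """Add the most frequent adjacent pair as a new operator if it shortens the code."""
    pairs = Counter(tuple(p[i:i + 2]) for p in programs for i in range(len(p) - 1))
    if not pairs:
        return programs, library
    best_pair, _ = pairs.most_common(1)[0]
    name = "+".join(best_pair)
    new_library = {**library, name: best_pair}
    new_programs = [rewrite(p, best_pair, name) for p in programs]
    if description_length(new_programs, new_library) < description_length(programs, library):
        return new_programs, new_library
    return programs, library  # the abstraction does not pay for itself


base = ["MOVE_PROP_has", "CHANGE_location", "SET_knows"]
library = {op: (op,) for op in base}
corpus = (
    [["MOVE_PROP_has", "CHANGE_location"] for _ in range(10)]  # ten "mail"-like events
    + [["SET_knows"], ["SET_knows"], ["CHANGE_location"], ["CHANGE_location"]]
)
compressed, learned = sleep_step(corpus, library)
```

With ten "mail"-like events in the corpus, the compound operator `MOVE_PROP_has+CHANGE_location` is accepted because the shorter programs outweigh the cost of a larger library. With only one such event the same candidate is rejected, which is the sense in which compression pressure decides what counts as a primitive in this sketch.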