Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling

arXiv cs.AI / 4/6/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper evaluates whether linguistic graph representations can complement neural language models in a neuro-symbolic setting, using an ensemble of a pretrained Transformer and ground-truth graphs from one of seven formalisms (see the sketch after this list).
  • It finds that semantic constituency structures deliver the strongest overall gains in language modeling performance, outperforming syntactic constituency structures as well as syntactic and semantic dependency structures.
  • The reported benefits vary markedly by part-of-speech class, indicating that a graph formalism's usefulness is not uniform across linguistic categories.
  • The authors conclude that the results reveal promising directions for neuro-symbolic language modeling and call for future work that systematically quantifies how different formalisms and design choices affect outcomes.
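
To make the ensemble setup concrete, here is a minimal sketch of one way a pretrained Transformer and a graph-conditioned predictor could be combined, assuming a simple late-fusion mixture of their next-token distributions. The paper's actual architecture may differ; the function name `ensemble_next_token_logprobs` and the mixture weight `alpha` are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def ensemble_next_token_logprobs(
    transformer_logits: torch.Tensor,  # (vocab_size,) from the pretrained LM
    graph_logits: torch.Tensor,        # (vocab_size,) from a graph-conditioned predictor
    alpha: float = 0.5,                # mixture weight; hypothetical hyperparameter
) -> torch.Tensor:
    """Late-fusion mixture: p(w) = alpha * p_lm(w) + (1 - alpha) * p_graph(w)."""
    p_lm = F.softmax(transformer_logits, dim=-1)
    p_graph = F.softmax(graph_logits, dim=-1)
    # Return log-probabilities so the result plugs directly into
    # standard perplexity / cross-entropy evaluation.
    return torch.log(alpha * p_lm + (1 - alpha) * p_graph)
```

Under this kind of fusion, the graph component can sharpen the distribution wherever the symbolic structure is informative, while the Transformer term keeps coverage on the rest of the vocabulary.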

Abstract

We examine the extent to which, in principle, linguistic graph representations can complement and improve neural language modeling. With an ensemble setup consisting of a pretrained Transformer and ground-truth graphs from one of 7 different formalisms, we find that, overall, semantic constituency structures are most useful to language modeling performance, outpacing syntactic constituency structures as well as syntactic and semantic dependency structures. Further, effects vary greatly depending on part-of-speech class. In sum, our findings point to promising tendencies in neuro-symbolic language modeling and invite future research quantifying the design choices made by different formalisms.
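
Since the abstract reports that effects vary greatly by part-of-speech class, a natural way to surface this is to aggregate token-level log-probabilities by POS tag. The helper below is a minimal sketch under that assumption; `perplexity_by_pos` and its inputs are illustrative names, not the paper's evaluation code.

```python
import math
from collections import defaultdict

def perplexity_by_pos(token_logprobs, pos_tags):
    """Aggregate token-level natural-log probabilities into per-POS perplexities.

    token_logprobs: list of log p(token), one entry per token
    pos_tags:       list of POS tags aligned with token_logprobs
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lp, tag in zip(token_logprobs, pos_tags):
        sums[tag] += lp
        counts[tag] += 1
    # Perplexity per class: exp of the negative mean log-probability.
    return {tag: math.exp(-sums[tag] / counts[tag]) for tag in sums}
```

Comparing these per-class perplexities between the ensemble and a Transformer-only baseline would show which categories a given formalism actually helps.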