Current LLMs still cannot 'talk much' about grammar modules: Evidence from syntax

arXiv cs.CL / 3/23/2026


Key Points

  • The study investigates how LLMs discuss grammar modules by translating 44 generative-syntax terms into Arabic and comparing human translations with ChatGPT-5 outputs.
  • It employs a qualitative analytical and comparative approach to assess translations across terms drawn from generative syntax literature and the authors' field experience.
  • Results show that only 25% of ChatGPT translations were accurate, 38.6% were inaccurate, and 36.4% were partially correct, indicating substantial limitations in core syntax translation.
  • The findings highlight several semantic and syntactic challenges that hamper LLMs' ability to encode the core properties of grammar terms.
  • The paper proposes actionable strategies, notably closer collaboration between AI specialists and linguists to improve LLM translation performance.
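As a quick sanity check on the reported figures, the percentages can be converted back into term counts. The sketch below assumes each of the 44 terms falls into exactly one category. The counts it produces are inferred from the percentages and are not stated in the paper; the names `TOTAL_TERMS`, `reported_pct`, and `implied_counts` are illustrative only.

```python
# Sketch: recover the term counts implied by the reported percentages.
# Assumption: each of the 44 terms belongs to exactly one category.
# The counts are inferred here, not reported by the authors.

TOTAL_TERMS = 44

# Percentages as reported in the summary above.
reported_pct = {
    "accurate": 25.0,
    "partially correct": 36.4,
    "inaccurate": 38.6,
}


def implied_counts(total, percentages):
    """Round each percentage of `total` to the nearest whole term."""
    return {label: round(total * pct / 100) for label, pct in percentages.items()}


counts = implied_counts(TOTAL_TERMS, reported_pct)

# Recompute the percentages from the inferred counts, to one decimal place.
recomputed_pct = {
    label: round(100 * n / TOTAL_TERMS, 1) for label, n in counts.items()
}

for label, n in counts.items():
    print(f"{label}: {n} of {TOTAL_TERMS} terms ({recomputed_pct[label]}%)")
```

Under that assumption the inferred counts sum to 44 and reproduce the reported percentages after rounding, so the three figures are internally consistent.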

Abstract

We aim to examine the extent to which Large Language Models (LLMs) can 'talk much' about grammar modules, providing evidence from core syntax properties translated by ChatGPT into Arabic. We collected 44 terms from previous works on generative syntax, including books and journal articles, as well as from our experience in the field. These terms were translated by humans, and then by ChatGPT-5. We then analyzed and compared both sets of translations using an analytical and comparative approach. The findings reveal that LLMs still cannot 'talk much' about the core syntax properties embedded in the terms under study, which involve several syntactic and semantic challenges: only 25% of ChatGPT translations were accurate, while 38.6% were inaccurate and 36.4% were partially correct, which we consider appropriate. Based on these findings, a set of actionable strategies was proposed, the most notable of which is close collaboration between AI specialists and linguists to improve LLMs' working mechanism for accurate, or at least appropriate, translation.
