ConlangCrafter: Constructing Languages with a Multi-Hop LLM Pipeline

arXiv cs.CL / 4/20/2026

Key Points

  • The paper presents ConlangCrafter, an end-to-end multi-hop LLM pipeline for generating constructed languages (conlangs) that splits the process into modular stages: phonology, morphology, syntax, lexicon generation, and translation.
  • It uses LLM “metalinguistic reasoning” at each stage, adding randomness to improve diversity while applying self-refinement feedback to maintain consistency in the evolving language description.
  • The authors introduce a scalable evaluation framework with metrics focused on both consistency and typological diversity, enabling systematic comparison across generated conlangs.
  • Experiments with automatic and manual evaluations suggest ConlangCrafter can produce coherent, varied conlangs without requiring human linguistic expertise.
  • Overall, the work positions modern LLMs as computational creativity tools for language design, combining generation, refinement, and evaluation into a unified workflow.
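The staged generate-then-refine loop summarized in the points above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the `LLM` callable interface, the numeric random hint, the `max_refinements` cap, and the `stub_llm` stand-in are all assumptions made so the example runs offline.

```python
import random
from typing import Callable

# Stage names follow the paper's abstract; everything else here is hypothetical.
STAGES = ["phonology", "morphology", "syntax", "lexicon", "translation"]

# Assumed interface: an LLM is any function from a prompt string to a reply string.
LLM = Callable[[str], str]


def run_pipeline(llm: LLM, seed: int = 0, max_refinements: int = 2) -> dict[str, str]:
    """Build a language description stage by stage, critiquing each draft."""
    rng = random.Random(seed)
    spec: dict[str, str] = {}
    for stage in STAGES:
        # Randomness injection (assumed form): a random hint added to the prompt.
        hint = rng.randint(0, 9999)
        context = "\n".join(f"{name}: {text}" for name, text in spec.items())
        draft = llm(f"DESIGN {stage} | hint={hint} | so far:\n{context}")
        # Self-refinement: ask for a critique, revise until it passes or the cap is hit.
        for _ in range(max_refinements):
            critique = llm(f"CRITIQUE {stage} | so far:\n{context}\nDraft:\n{draft}")
            if critique.strip().upper().startswith("OK"):
                break
            draft = llm(f"REVISE {stage} | critique: {critique}\nDraft:\n{draft}")
        spec[stage] = draft
    return spec


def stub_llm(prompt: str) -> str:
    """Offline stand-in for a real model: approves every draft it is asked to critique."""
    if prompt.startswith("CRITIQUE"):
        return "OK"
    return f"<{prompt.split()[1]} description>"


if __name__ == "__main__":
    for stage, text in run_pipeline(stub_llm).items():
        print(stage, "->", text)
```

Passing each stage the accumulated description is what makes the pipeline multi-hop: later stages see, and can be checked against, the decisions made earlier.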

Abstract

Constructed languages (conlangs) such as Esperanto and Quenya have played diverse roles in art, philosophy, and international communication. Meanwhile, foundation models have revolutionized creative generation in text, images, and beyond. In this work, we leverage modern LLMs as computational creativity aids for end-to-end conlang creation. We introduce ConlangCrafter, a multi-hop pipeline that decomposes language design into modular stages -- phonology, morphology, syntax, lexicon generation, and translation. At each stage, our method leverages LLMs' metalinguistic reasoning capabilities, injecting randomness to encourage diversity and leveraging self-refinement feedback to encourage consistency in the emerging language description. We construct a novel, scalable evaluation framework for this task, evaluating metrics measuring consistency and typological diversity. Automatic and manual evaluations demonstrate ConlangCrafter's ability to produce coherent and varied conlangs without human linguistic expertise.
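The abstract does not spell out how typological diversity is scored. One plausible way to quantify it, shown here purely as an assumption and not as the paper's metric, is the mean pairwise Hamming distance between categorical feature profiles of the generated languages. The feature names (`word_order`, `tone`) and the function names are hypothetical.

```python
from itertools import combinations


def hamming(a: dict[str, str], b: dict[str, str]) -> float:
    """Fraction of shared features on which two languages disagree."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] != b[k] for k in keys) / len(keys)


def mean_pairwise_diversity(langs: list[dict[str, str]]) -> float:
    """Average disagreement over all unordered pairs of languages (0 = identical)."""
    pairs = list(combinations(langs, 2))
    if not pairs:
        return 0.0
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical feature profiles for three generated conlangs.
    sample = [
        {"word_order": "SOV", "tone": "yes"},
        {"word_order": "SVO", "tone": "yes"},
        {"word_order": "SOV", "tone": "no"},
    ]
    print(mean_pairwise_diversity(sample))
```

A score near 0 would indicate that the pipeline keeps producing typologically similar languages, while a score near 1 would indicate that the languages disagree on almost every feature.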