Learning Hierarchical and Geometry-Aware Graph Representations for Text-to-CAD

arXiv cs.AI / 4/14/2026

Key Points

  • The paper targets text-to-CAD generation by addressing failure modes of prior systems that decode directly into executable code without modeling assembly hierarchy or geometric constraints.
  • It introduces a hierarchical, geometry-aware graph as an intermediate representation: multi-level parts and components are nodes, and explicit geometric constraints are edges. The aim is to shrink the search space and limit cascading errors.
  • The proposed framework first predicts assembly structure and constraints, then conditions action sequencing and final code generation on that prediction to improve geometric fidelity and constraint satisfaction.
  • It adds a structure-aware progressive curriculum learning strategy that builds graded tasks through controlled structural edits, probes the model's capability boundary, and synthesizes boundary examples for iterative training.
  • The authors build a 12K dataset of instructions, decomposition graphs, action sequences, and bpy code, and propose graph- and constraint-oriented evaluation metrics. They report consistent improvements over existing methods in both geometric fidelity and constraint satisfaction.
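
To make the graph idea above concrete, here is a minimal sketch of a hierarchy-plus-constraints representation and of one way such a graph could drive action ordering. It is not the paper's implementation: the names `PartNode`, `ConstraintEdge`, `flatten`, and `build_order`, the "source is built before target" reading of an edge, and the table example are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PartNode:
    """A part or sub-assembly; children express the multi-level hierarchy."""
    name: str
    children: list["PartNode"] = field(default_factory=list)


@dataclass(frozen=True)
class ConstraintEdge:
    """A geometric constraint between two named parts (hypothetical schema).

    Assumption: `source` must exist before `target` can be placed.
    """
    source: str
    target: str
    kind: str


def flatten(root: PartNode) -> list[str]:
    """Return part names depth-first, each parent before its children."""
    order = [root.name]
    for child in root.children:
        order.extend(flatten(child))
    return order


def build_order(root: PartNode, constraints: list[ConstraintEdge]) -> list[str]:
    """Order parts so every constraint source precedes its target.

    Ties are broken by hierarchy order, so the result is deterministic.
    """
    names = flatten(root)
    pending = {name: set() for name in names}
    for edge in constraints:
        pending[edge.target].add(edge.source)

    order = []
    while pending:
        ready = [n for n in names if n in pending and not pending[n]]
        if not ready:
            raise ValueError("constraints form a cycle")
        chosen = ready[0]
        order.append(chosen)
        del pending[chosen]
        for deps in pending.values():
            deps.discard(chosen)
    return order


# Illustrative assembly: a table whose legs are positioned relative to the top.
table = PartNode("table", [
    PartNode("leg_assembly", [PartNode("leg_1"), PartNode("leg_2")]),
    PartNode("top"),
])
constraints = [
    ConstraintEdge("top", "leg_1", "contact"),
    ConstraintEdge("top", "leg_2", "contact"),
]
order = build_order(table, constraints)
print(order)
```

In this sketch the constraint edges pull `top` ahead of both legs even though the hierarchy lists the leg assembly first, which is the kind of ordering information a direct text-to-code decoder would have to infer implicitly.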

Abstract

Text-to-CAD code generation is a long-horizon task that translates textual instructions into long sequences of interdependent operations. Existing methods typically decode text directly into executable code (e.g., bpy) without explicitly modeling assembly hierarchy or geometric constraints, which enlarges the search space, accumulates local errors, and often causes cascading failures in complex assemblies. To address this issue, we propose a hierarchical and geometry-aware graph as an intermediate representation. The graph models multi-level parts and components as nodes and encodes explicit geometric constraints as edges. Instead of mapping text directly to code, our framework first predicts structure and constraints, then conditions action sequencing and code generation, thereby improving geometric fidelity and constraint satisfaction. We further introduce a structure-aware progressive curriculum learning strategy that constructs graded tasks through controlled structural edits, explores the model's capability boundary, and synthesizes boundary examples for iterative training. In addition, we build a 12K dataset with instructions, decomposition graphs, action sequences, and bpy code, together with graph- and constraint-oriented evaluation metrics. Extensive experiments show that our method consistently outperforms existing approaches in both geometric fidelity and accurate satisfaction of geometric constraints.
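
The abstract does not define its constraint-oriented metrics, so the following is only a sketch of what a constraint satisfaction rate could look like under simple assumptions. The `DistanceConstraint` type, the use of part centers, the tolerance value, and the rule that a constraint on a missing part counts as violated are all hypothetical choices, not taken from the paper.

```python
import math
from dataclasses import dataclass

Point = tuple[float, float, float]


@dataclass(frozen=True)
class DistanceConstraint:
    """Hypothetical constraint: two part centers lie `distance` apart."""
    part_a: str
    part_b: str
    distance: float


def satisfaction_rate(
    centers: dict[str, Point],
    constraints: list[DistanceConstraint],
    tol: float = 1e-3,
) -> float:
    """Fraction of constraints met by the generated part centers.

    A constraint that references a missing part counts as violated.
    Returns 1.0 when there are no constraints to check.
    """
    if not constraints:
        return 1.0
    satisfied = 0
    for c in constraints:
        if c.part_a not in centers or c.part_b not in centers:
            continue
        actual = math.dist(centers[c.part_a], centers[c.part_b])
        if abs(actual - c.distance) <= tol:
            satisfied += 1
    return satisfied / len(constraints)


# Illustrative generated geometry: one constraint holds, one is off by a
# unit, and one refers to a part that was never generated.
centers = {
    "top": (0.0, 0.0, 1.0),
    "leg_1": (0.0, 0.0, 0.0),
    "leg_2": (3.0, 0.0, 0.0),
}
checks = [
    DistanceConstraint("top", "leg_1", 1.0),
    DistanceConstraint("leg_1", "leg_2", 4.0),
    DistanceConstraint("top", "leg_3", 1.0),
]
rate = satisfaction_rate(centers, checks)
print(f"constraint satisfaction rate: {rate:.3f}")
```

A metric of this shape scores an output on whether the stated relations between parts hold, independently of how closely the overall shape matches a reference.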