Structural Feature Engineering for Generative Engine Optimization: How Content Structure Shapes Citation Behavior

arXiv cs.CL / 4/1/2026


Key Points

  • The paper argues that AI-powered search engines change how users discover information, making source citation behavior a key determinant of content visibility.
  • It introduces GEO-SFE, a generative engine optimization framework that applies structural feature engineering at macro (document architecture), meso (chunking), and micro (visual emphasis) levels.
  • The authors build architecture-aware optimization strategies and predictive models to improve citation probability while preserving semantic integrity.
  • Experiments across six mainstream generative engines report consistent improvements in citation rate (17.3%) and subjective quality (18.5%), which the authors present as evidence of the framework's effectiveness and generalizability.
  • The work positions structural optimization as a foundational, data-driven component of GEO for LLM-powered information ecosystems.

Abstract

The proliferation of AI-powered search engines has shifted information discovery from traditional link-based retrieval to direct answer generation with selective source citation, creating new challenges for content visibility. While existing Generative Engine Optimization (GEO) approaches focus primarily on semantic content modification, the role of structural features in influencing citation behavior remains underexplored. In this paper, we propose GEO-SFE, a systematic framework for structural feature engineering in generative engine optimization. Our approach decomposes content structure into three hierarchical levels: macro-structure (document architecture), meso-structure (information chunking), and micro-structure (visual emphasis), and models their impact on citation probability across different generative engine architectures. We develop architecture-aware optimization strategies and predictive models that preserve semantic integrity while improving structural effectiveness. Experimental evaluation across six mainstream generative engines demonstrates consistent improvements in citation rate (17.3 percent) and subjective quality (18.5 percent), validating the effectiveness and generalizability of the proposed framework. This work establishes structural optimization as a foundational component of GEO, providing a data-driven methodology for enhancing content visibility in LLM-powered information ecosystems.
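
To make the three structural levels concrete, here is a minimal sketch of how features at each level might be measured on a Markdown document. The abstract does not list GEO-SFE's actual features, so every name below (StructuralFeatures, extract_structural_features, and the individual fields) is a hypothetical illustration rather than the authors' method: heading counts stand in for macro-structure, blank-line-separated blocks for meso-structure, and bold spans for micro-structure.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only; the paper's real feature set is not given in the abstract.
HEADING = re.compile(r"^(#{1,6})\s+\S")
LIST_ITEM = re.compile(r"^\s*(?:[-*+]|\d+\.)\s+")
BOLD = re.compile(r"\*\*[^*\n]+\*\*")


@dataclass
class StructuralFeatures:
    heading_count: int       # macro: how many section headings
    max_heading_depth: int   # macro: deepest heading level used
    chunk_count: int         # meso: non-heading blocks separated by blank lines
    mean_chunk_words: float  # meso: average words per chunk
    list_item_count: int     # meso: bullet or numbered list items
    bold_span_count: int     # micro: **bold** emphasis spans


def extract_structural_features(markdown_text: str) -> StructuralFeatures:
    """Compute simple proxy features for the three structural levels (assumed, not from the paper)."""
    lines = markdown_text.splitlines()

    # Macro: document architecture, approximated by the heading outline.
    depths = [len(m.group(1)) for m in map(HEADING.match, lines) if m]

    # Meso: information chunking, approximated by blank-line-separated blocks.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", markdown_text) if b.strip()]
    chunks = [b for b in blocks if not HEADING.match(b)]
    word_counts = [len(c.split()) for c in chunks]
    mean_words = sum(word_counts) / len(word_counts) if word_counts else 0.0
    list_items = sum(1 for line in lines if LIST_ITEM.match(line))

    # Micro: visual emphasis, approximated by bold spans.
    bold_spans = BOLD.findall(markdown_text)

    return StructuralFeatures(
        heading_count=len(depths),
        max_heading_depth=max(depths, default=0),
        chunk_count=len(chunks),
        mean_chunk_words=mean_words,
        list_item_count=list_items,
        bold_span_count=len(bold_spans),
    )


sample = (
    "# Title\n\n"
    "Intro paragraph with **key term** here.\n\n"
    "## Section\n\n"
    "- item one\n- item two\n"
)
features = extract_structural_features(sample)
print(features)
```

Features like these could serve as inputs to a citation-probability model of the kind the abstract mentions, but which features matter, and for which engine, is an empirical question the sketch does not answer.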