AI Navigate

FactorEngine: A Program-level Knowledge-Infused Factor Mining Framework for Quantitative Investment

arXiv cs.AI / 3/18/2026

Key Points

  • FactorEngine casts factors as Turing-complete programs, so that mined factors are directly executable and auditable in quantitative investment.
  • The framework introduces three separations to boost effectiveness and efficiency: logic revision vs. parameter optimization, LLM-guided directional search vs. Bayesian hyperparameter search, and LLM usage vs. local computation.
  • A knowledge-infused bootstrapping module converts unstructured financial reports into executable factor programs through a closed-loop multi-agent pipeline for extraction, verification, and code generation.
  • An experience knowledge base enables trajectory-aware refinement by learning from past successes and failures to improve future factor discovery.
  • In backtests on real-world OHLCV data, FactorEngine (FE) produces factors with stronger predictive stability (IC/ICIR, Rank IC/ICIR) and better portfolio performance (AR/Sharpe) than baselines; the authors claim state-of-the-art results.
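
The predictive metrics named in the last key point can be illustrated with a small sketch. It assumes the conventional definitions (IC as the per-day cross-sectional Pearson correlation between factor values and next-period returns, Rank IC as the same correlation on ranks, and ICIR as the mean of the daily ICs divided by their sample standard deviation); the paper's exact definitions may differ. The function names and the toy data are hypothetical and are not taken from FactorEngine.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom > 0 else 0.0


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def information_ratio(series):
    """Mean divided by sample standard deviation; NaN when it is undefined."""
    if len(series) < 2:
        return math.nan
    spread = stdev(series)
    return mean(series) / spread if spread > 0 else math.nan


def ic_report(factor_by_day, returns_by_day):
    """Summarize daily IC and Rank IC; each day is one cross-section of assets."""
    pairs = list(zip(factor_by_day, returns_by_day))
    ics = [pearson(f, r) for f, r in pairs]
    rank_ics = [pearson(average_ranks(f), average_ranks(r)) for f, r in pairs]
    return {
        "IC": mean(ics),
        "ICIR": information_ratio(ics),
        "RankIC": mean(rank_ics),
        "RankICIR": information_ratio(rank_ics),
    }


# Toy data (made up): two days, four assets per day.
factor_by_day = [[0.1, 0.4, 0.2, 0.9], [0.3, 0.1, 0.8, 0.5]]
returns_by_day = [[0.00, 0.02, 0.01, 0.03], [0.01, -0.01, 0.02, 0.00]]
print(ic_report(factor_by_day, returns_by_day))
```

A high mean IC with a low ICIR indicates a signal that is strong on average but unstable from day to day, which is why the two are usually reported together.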

Abstract

We study alpha factor mining (the automated discovery of predictive signals from noisy, non-stationary market data) under a practical requirement that mined factors be directly executable and auditable, and that the discovery process remain computationally tractable at scale. Existing symbolic approaches are limited by bounded expressiveness, while neural forecasters often trade interpretability for performance and remain vulnerable to regime shifts and overfitting. We introduce FactorEngine (FE), a program-level factor discovery framework that casts factors as Turing-complete code and improves both effectiveness and efficiency via three separations: (i) logic revision vs. parameter optimization, (ii) LLM-guided directional search vs. Bayesian hyperparameter search, and (iii) LLM usage vs. local computation. FE further incorporates a knowledge-infused bootstrapping module that transforms unstructured financial reports into executable factor programs through a closed-loop multi-agent extraction-verification-code-generation pipeline, and an experience knowledge base that supports trajectory-aware refinement (including learning from failures). Across extensive backtests on real-world OHLCV data, FE produces factors with substantially stronger predictive stability and portfolio impact than baseline methods, for example higher IC/ICIR (and Rank IC/ICIR) and improved AR/Sharpe, achieving state-of-the-art predictive and portfolio performance.
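
The separation between logic revision and parameter optimization can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `factor_program`, `score`, and `local_parameter_search` are hypothetical names, the momentum logic is a stand-in for whatever an LLM might propose, and an exhaustive grid search is swapped in for the Bayesian hyperparameter search the abstract mentions. The point it shows is that the factor's logic stays fixed while its parameters are tuned locally, with no LLM call inside the tuning loop.

```python
import itertools
import math


def factor_program(closes, window, skip):
    """Hypothetical factor logic: return over `window` bars, ignoring the last `skip` bars.

    In the framing above, revising this function body would be the LLM's job;
    `window` and `skip` are the parameters left to local optimization.
    """
    end = len(closes) - 1 - skip
    if end - window < 0:
        raise ValueError("not enough bars for this window/skip combination")
    return closes[end] / closes[end - window] - 1.0


def pearson(x, y):
    """Pearson correlation (0.0 if either input is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / denom if denom > 0 else 0.0


def score(params, universe, forward_returns):
    """Cross-sectional correlation between factor values and forward returns."""
    values = [factor_program(closes, **params) for closes in universe]
    return pearson(values, forward_returns)


def local_parameter_search(universe, forward_returns, grid):
    """Grid search over parameters only (a stand-in for Bayesian search)."""
    names = list(grid)
    candidates = [
        dict(zip(names, combo))
        for combo in itertools.product(*(grid[name] for name in names))
    ]
    return max(candidates, key=lambda p: score(p, universe, forward_returns))


# Toy data (made up): close prices for three assets and their next-period returns.
universe = [
    [10.0, 11.0, 12.0, 13.0, 14.0],
    [10.0, 10.0, 10.0, 9.0, 8.0],
    [10.0, 12.0, 11.0, 12.0, 11.0],
]
forward_returns = [0.02, -0.01, 0.0]
grid = {"window": [1, 2], "skip": [0, 1]}
print(local_parameter_search(universe, forward_returns, grid))
```

Keeping the parameter loop free of LLM calls is what makes each candidate cheap to evaluate; only a change to the body of `factor_program` would require going back to the model.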