Hubble: An LLM-Driven Agentic Framework for Safe and Automated Alpha Factor Discovery

arXiv cs.AI · April 14, 2026


Key Points

  • Hubble is proposed as an LLM-driven, closed-loop agentic framework for automated discovery of predictive alpha factors in quantitative finance, addressing the large search space and low signal-to-noise ratios.
  • The approach uses an LLM to propose candidates under a domain-specific operator language, then executes them within an AST-based sandbox to enforce deterministic safety constraints and improve interpretability.
  • Candidate factors are scored through a rigorous statistical pipeline, including cross-sectional RankIC, annualized Information Ratio, and portfolio turnover.
  • An evolutionary feedback loop returns top-performing factors and structured error diagnostics to the LLM for iterative refinement across multiple generation rounds.
  • In experiments on 30 U.S. equities over 752 trading days, Hubble evaluated 181 syntactically valid factors from 122 candidates across three rounds, reaching a peak composite score of 0.827 with full computational stability.
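The AST-based sandbox described above can be illustrated with a short sketch. The paper's actual operator language and safety checks are not detailed in this summary, so the whitelisted operator names (`ts_mean`, `rank`, `delta`, etc.) and node set below are illustrative assumptions: an LLM-proposed factor expression is parsed into an Abstract Syntax Tree and rejected unless every node belongs to a small, deterministic whitelist.

```python
import ast

# Hypothetical whitelist of domain operators; the paper's actual
# operator language is not specified in this summary.
ALLOWED_FUNCS = {"ts_mean", "ts_std", "rank", "delta", "corr"}

# Only pure arithmetic expressions over whitelisted calls are permitted:
# no attribute access, subscripts, lambdas, or imports can appear.
ALLOWED_NODES = (
    ast.Expression, ast.Call, ast.Name, ast.Load, ast.Constant,
    ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div,
    ast.UnaryOp, ast.USub,
)

def validate_factor(expr: str) -> bool:
    """Return True iff expr parses and uses only whitelisted AST nodes
    and operator names — a deterministic pre-execution safety gate."""
    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            return False
        if isinstance(node, ast.Call):
            # Calls must be direct names drawn from the operator language.
            if not (isinstance(node.func, ast.Name)
                    and node.func.id in ALLOWED_FUNCS):
                return False
    return True
```

Because validation happens on the syntax tree before any evaluation, arbitrary code proposed by the LLM (e.g. attribute access or `__import__`) is rejected without ever being executed, which is one plausible reading of the "deterministic safety constraints" the framework enforces.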

Abstract

Discovering predictive alpha factors in quantitative finance remains a formidable challenge due to the vast combinatorial search space and inherently low signal-to-noise ratios in financial data. Existing automated methods, particularly genetic programming, often produce complex, uninterpretable formulas prone to overfitting. We introduce Hubble, a closed-loop factor mining framework that leverages Large Language Models (LLMs) as intelligent search heuristics, constrained by a domain-specific operator language and an Abstract Syntax Tree (AST)-based execution sandbox. The framework evaluates candidate factors through a rigorous statistical pipeline encompassing cross-sectional Rank Information Coefficient (RankIC), annualized Information Ratio, and portfolio turnover. An evolutionary feedback mechanism returns top-performing factors and structured error diagnostics to the LLM, enabling iterative refinement across multiple generation rounds. In experiments conducted on a panel of 30 U.S. equities over 752 trading days, the system evaluated 181 syntactically valid factors from 122 unique candidates across three rounds, achieving a peak composite score of 0.827 with 100% computational stability. Our results demonstrate that combining LLM-driven generation with deterministic safety constraints yields an effective, interpretable, and reproducible approach to automated factor discovery.
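The evaluation pipeline the abstract describes rests on two standard statistics, which can be sketched as follows. This is a minimal NumPy illustration of the general definitions, not the paper's implementation: the daily cross-sectional RankIC is the Spearman correlation between factor values and next-period returns across the stock universe, and the annualized Information Ratio scales the mean-to-volatility ratio of the daily IC series by the square root of an assumed 252 trading days. (The rank helper below ignores ties for brevity.)

```python
import numpy as np

def _rank(x: np.ndarray) -> np.ndarray:
    # Ordinal ranks (0..n-1); ties are broken arbitrarily — illustrative only.
    return np.argsort(np.argsort(x)).astype(float)

def daily_rank_ic(factor: np.ndarray, fwd_ret: np.ndarray) -> float:
    """Cross-sectional RankIC for one day: Spearman correlation between
    the factor values and forward returns over the stock universe."""
    return float(np.corrcoef(_rank(factor), _rank(fwd_ret))[0, 1])

def annualized_ir(daily_ics, periods_per_year: int = 252) -> float:
    """Annualized Information Ratio of a daily IC series:
    mean(IC) / std(IC) * sqrt(252)."""
    ics = np.asarray(daily_ics, dtype=float)
    return float(ics.mean() / ics.std(ddof=1) * np.sqrt(periods_per_year))
```

A factor that perfectly orders next-day returns yields a daily RankIC of 1.0; in practice the pipeline would aggregate such daily ICs over the 752-day panel and combine them with turnover into the composite score the paper reports.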