AI Navigate

Can Blindfolded LLMs Still Trade? An Anonymization-First Framework for Portfolio Optimization

arXiv cs.LG / 3/19/2026

Key Points

  • The paper introduces BlindTrade, an anonymization-first framework that removes tickers and company names so that LLM trading agents cannot exploit memorized ticker associations; it also addresses survivorship bias from flawed backtesting.
  • Four LLM agents output scores along with reasoning; embeddings of the reasoning are used to construct a graph, and a PPO-DSR policy makes the trading decisions.
  • On 2025 YTD through 2025-08-01, the method achieves a Sharpe ratio of 1.40 ± 0.22 across 20 seeds and uses negative control experiments to validate signal legitimacy.
  • Extending the evaluation to 2024–2025 shows market-regime dependence: the approach excels in volatile conditions but exhibits reduced alpha in trending bull markets.
  • The work highlights the importance of signal validation and anonymization for trustworthy multi-agent trading systems, helping distinguish genuine market signals from memorized data.
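
The anonymization step described in the key points can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the `ASSET_001`-style alias format, the seeded shuffle, and the example tickers and company names are all hypothetical and chosen only for illustration.

```python
import random
import re


def build_alias_map(tickers, seed=0):
    """Map each ticker to an opaque alias such as ASSET_001 (hypothetical format).

    The tickers are shuffled with a fixed seed so that alias numbers do not
    reveal alphabetical order while the mapping stays reproducible.
    """
    shuffled = sorted(tickers)
    random.Random(seed).shuffle(shuffled)
    return {ticker: f"ASSET_{i:03d}" for i, ticker in enumerate(shuffled, start=1)}


def anonymize_text(text, alias_map, company_names):
    """Replace tickers and company names in `text` with their aliases.

    `company_names` maps a company name to its ticker. Matching is
    case-sensitive and whole-word only, which is a simplifying assumption.
    """
    replacements = dict(alias_map)
    for name, ticker in company_names.items():
        replacements[name] = alias_map[ticker]
    # Replace longer keys first so a full company name wins over a shorter
    # key that happens to be contained in it.
    for key in sorted(replacements, key=len, reverse=True):
        pattern = r"\b" + re.escape(key) + r"\b"
        text = re.sub(pattern, replacements[key], text)
    return text


# Made-up identifiers, used only to show the call pattern.
aliases = build_alias_map(["ACME", "GLOBX"])
names = {"Acme Corp": "ACME", "Globex Inc": "GLOBX"}
blind = anonymize_text("Acme Corp (ACME) beat estimates while GLOBX lagged.", aliases, names)
print(blind)
```

A real pipeline would also have to scrub indirect identifiers (product names, executives, distinctive price levels), which this sketch does not attempt.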

Abstract

For LLM trading agents to be genuinely trustworthy, they must demonstrate understanding of market dynamics rather than exploitation of memorized ticker associations. Building responsible multi-agent systems demands rigorous signal validation: proving that predictions reflect legitimate patterns, not pre-trained recall. We address two sources of spurious performance: memorization bias from ticker-specific pre-training, and survivorship bias from flawed backtesting. Our approach is to blindfold the agents by anonymizing all identifiers, and then to verify whether meaningful signals persist. BlindTrade anonymizes tickers and company names, and four LLM agents output scores along with reasoning. We construct a GNN graph from reasoning embeddings and trade using a PPO-DSR policy. On 2025 YTD (through 2025-08-01), we achieve a Sharpe ratio of 1.40 ± 0.22 across 20 seeds and validate signal legitimacy through negative control experiments. To assess robustness beyond a single out-of-sample (OOS) window, we additionally evaluate an extended period (2024–2025), revealing market-regime dependency: the policy excels in volatile conditions but shows reduced alpha in trending bull markets.
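
A figure such as "Sharpe 1.40 ± 0.22 across 20 seeds" can be read as a per-seed Sharpe ratio summarized over runs. The Python sketch below shows one common way to compute such a summary. It rests on assumptions the abstract does not state: daily returns, annualization with 252 trading days, a zero risk-free rate by default, and "±" meaning the sample standard deviation across seeds (rather than a standard error). The function names are hypothetical.

```python
import math
import statistics


def sharpe_ratio(daily_returns, periods_per_year=252, risk_free_daily=0.0):
    """Annualized Sharpe ratio of a series of per-period returns."""
    excess = [r - risk_free_daily for r in daily_returns]
    spread = statistics.stdev(excess)  # sample standard deviation
    if spread == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return statistics.fmean(excess) / spread * math.sqrt(periods_per_year)


def summarize_across_seeds(returns_by_seed):
    """Return (mean, sample std) of the per-seed Sharpe ratios.

    `returns_by_seed` maps a seed to the return series from the run
    trained with that seed; at least two seeds are required.
    """
    sharpes = [sharpe_ratio(series) for series in returns_by_seed.values()]
    return statistics.fmean(sharpes), statistics.stdev(sharpes)


# Toy return series (not data from the paper), only to show the call pattern.
runs = {
    0: [0.010, -0.010, 0.020, 0.000],
    1: [0.005, 0.000, -0.004, 0.012],
}
mean_sharpe, std_sharpe = summarize_across_seeds(runs)
print(f"Sharpe {mean_sharpe:.2f} ± {std_sharpe:.2f} across {len(runs)} seeds")
```

Reporting the spread across seeds matters here because a single run can look strong by chance; the same summary applied to a negative-control run (for example, one trained on shuffled signals) gives a baseline to compare against.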