Versioned Late Materialization for Ultra-Long Sequence Training in Recommendation Systems at Scale
arXiv cs.AI / 4/29/2026
Opinion · Developer Stack & Infrastructure · Industry & Market Moves · Models & Research
Key Points
- The paper argues that conventional “Fat Row” sequence pre-materialization in deep learning recommendation systems causes major storage and I/O bottlenecks as sequence length scales to ultra-long user histories, since each training row carries a redundant, fully materialized copy of the user's history.
- It proposes a “versioned late materialization” approach that stores user interaction histories once in a normalized, immutable layer and reconstructs sequences on the fly during training using lightweight, versioned pointers (see the first sketch after this list).
- The method includes an O2O (online-to-offline) consistency protocol designed to prevent leakage of future data across both streaming and batch training workflows (sketched in the second example below).
- To maintain high throughput despite just-in-time reconstruction, the system uses read-optimized immutable storage, multi-dimensional projection pushdown for different model tenants, and pipelined I/O prefetching with data-affinity optimizations (projection pushdown is sketched in the third example below).
- Deployed on production DLRMs, the approach reduces data-infrastructure resource usage, enables aggressive sequence-length scaling that improves model quality, and supports architectures such as HSTU and ULTRA-HSTU.
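To make the core idea concrete, here is a minimal Python sketch of versioned late materialization. It is an illustration only, not the paper's implementation; the names (`EventLog`, `SequenceRef`, `materialize`) are hypothetical. The point is that each user's history lives once in an append-only log, and a training row carries only a `(user_id, version)` pointer that is resolved into a sequence at batch-assembly time.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical illustration of versioned late materialization,
# not the paper's actual implementation.

@dataclass
class EventLog:
    """Normalized, immutable layer: each user's interactions are stored
    once, in append-only order, so storage does not grow with sequence
    length times the number of training rows."""
    events: Dict[int, List[Tuple[int, dict]]] = field(default_factory=dict)

    def append(self, user_id: int, event: dict) -> int:
        log = self.events.setdefault(user_id, [])
        version = len(log)  # monotonically increasing per-user version
        log.append((version, event))
        return version

@dataclass(frozen=True)
class SequenceRef:
    """Lightweight, versioned pointer stored in a training row in place
    of the fully materialized ("fat row") sequence."""
    user_id: int
    version: int  # last event visible to this training example

def materialize(log: EventLog, ref: SequenceRef, max_len: int) -> List[dict]:
    """Late materialization: rebuild the sequence at training time,
    keeping only events at or before the pointer's version."""
    history = log.events.get(ref.user_id, [])
    visible = [e for v, e in history if v <= ref.version]
    return visible[-max_len:]
```

Under this layout, per-row storage cost is a constant-size pointer rather than a copy of the whole sequence; the trade, as the paper frames it, is a reconstruction read at training time, which the throughput optimizations below are meant to hide.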
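The O2O consistency idea can be sketched the same way (again with hypothetical names): the version pointer is frozen at the moment the label is observed online, and the offline batch job replays that frozen pointer instead of reading whatever is in the log at batch-build time, so neither path can see interactions that occurred after the label.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical sketch of an O2O-style consistency rule: the version
# pointer is captured when the label is logged online, and the offline
# batch job filters by that same frozen version, preventing future
# data leakage in both paths.

@dataclass(frozen=True)
class TrainingExample:
    user_id: int
    label: float
    history_version: int  # frozen at the moment the label was logged

def emit_example_online(user_id: int, label: float,
                        current_version: int) -> TrainingExample:
    # Streaming path: capture the per-user version *now*, before any
    # later interactions are appended to the log.
    return TrainingExample(user_id, label, history_version=current_version)

def rebuild_offline(example: TrainingExample,
                    events: List[Tuple[int, dict]]) -> List[dict]:
    # Batch path: filter by the frozen version rather than by "all
    # events known at batch-build time", which would leak the future.
    return [e for v, e in events if v <= example.history_version]
```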
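Projection pushdown can be illustrated with a small sketch as well (the tenant names and fields here are made up): each model tenant declares the event fields it needs, and the reader projects events down to those fields at the storage layer, so just-in-time reconstruction never deserializes columns a given model will not use.

```python
from typing import Dict, List

# Hypothetical sketch of multi-tenant projection pushdown: each model
# ("tenant") registers the event fields it consumes, and the reader
# drops everything else before events leave the storage layer, cutting
# I/O and deserialization during just-in-time reconstruction.

TENANT_PROJECTIONS: Dict[str, List[str]] = {
    "ranking_model":   ["item_id", "action", "timestamp"],
    "retrieval_model": ["item_id", "timestamp"],
}

def read_with_pushdown(events: List[dict], tenant: str) -> List[dict]:
    cols = TENANT_PROJECTIONS[tenant]
    # Project each event down to the tenant's columns at read time,
    # instead of materializing the full row and trimming it later.
    return [{c: e[c] for c in cols if c in e} for e in events]
```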