BASIS: Balanced Activation Sketching with Invariant Scalars for "Ghost Backpropagation"
arXiv cs.LG · April 21, 2026
Key Points
- The paper introduces BASIS (Balanced Activation Sketching with Invariant Scalars), a new “ghost backpropagation” method that aims to reduce the activation-memory bottleneck that makes exact backpropagation scale as O(L * B * N).
- BASIS preserves exact gradient flow for activations (dX) while computing weight updates (dW) using highly compressed rank-R (sketched) tensors, reducing backward compute and memory requirements to about O(L * R * N).
- To address the instability that sketched gradients introduce, BASIS adds Balanced Hashing, which eliminates off-diagonal collision variance, and Invariant Scalars, which preserve the exact continuous energy norm of the spatial geometry via a controlled bias-variance tradeoff.
- Empirically, training a GPT-style model for 50,000 steps shows BASIS matches or slightly improves exact backprop validation loss (6.575 vs. 6.616) at R=32, and still converges smoothly even at extreme compression (R=1), suggesting strong robustness and an implicit regularization effect.
- The authors release the implementation on GitHub, enabling direct experimentation with BASIS in deep and GPT-like architectures.
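The core idea in the second point can be illustrated with a generic sketching scheme. The snippet below is a minimal sketch under assumptions: it uses a plain scaled Gaussian sketch `S` (with `E[SᵀS] = I`) rather than BASIS's Balanced Hashing or Invariant Scalars, and all dimensions and variable names are illustrative, not from the paper. It shows how a rank-R compression of the stored activations yields an unbiased estimate of the weight gradient `dW = Xᵀ dY` of a linear layer while shrinking the stored activation tensor from O(B · D) to O(R · D).

```python
import numpy as np

rng = np.random.default_rng(0)
B, D_in, D_out, R = 256, 64, 32, 8  # batch size, layer dims, sketch rank (illustrative)

X = rng.standard_normal((B, D_in))    # layer input saved for the backward pass
dY = rng.standard_normal((B, D_out))  # upstream gradient arriving at this layer

# Exact weight gradient: dW = X^T dY. This requires keeping all B activation rows.
dW_exact = X.T @ dY

# Sketched variant: store only the rank-R sketches S @ X and apply S to dY.
# With Gaussian entries scaled by 1/sqrt(R), E[S^T S] = I, so
# dW_sketch = X^T S^T S dY is an unbiased estimate of dW_exact.
S = rng.standard_normal((R, B)) / np.sqrt(R)
X_sk, dY_sk = S @ X, S @ dY           # O(R * D) memory instead of O(B * D)
dW_sketch = X_sk.T @ dY_sk

rel_err = np.linalg.norm(dW_sketch - dW_exact) / np.linalg.norm(dW_exact)
```

At small R the per-step estimate is noisy (nonzero `rel_err`); the paper's contribution is precisely the variance-control machinery (Balanced Hashing, Invariant Scalars) that keeps training stable despite this, while the activation-gradient path `dX` remains exact.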