The DMA Streaming Framework: Kernel-Level Buffer Orchestration for High-Performance AI Data Paths
arXiv cs.AI / 3/12/2026
📰 NewsDeveloper Stack & InfrastructureIdeas & Deep Analysis
Key Points
- dmaplane is a Linux kernel module that exposes a stable UAPI at /dev/dmaplane to explicitly manage DMA buffer lifecycles and orchestration for AI data paths.
- It provides ring-based command channels, DMA buffer lifecycle management, dma-buf export for cross-device sharing, a kernel RDMA engine, NUMA-aware allocation/verification, credit-based flow control, low-overhead observability, and GPU memory integration via PCIe BAR pinning.
- The paper evaluates orchestration sensitivity with NUMA cross-node penalties at DRAM scale, completion-safe flow control under sustained RDMA load, and GPU BAR mapping tiers versus cudaMemcpy.
- It also demonstrates end-to-end disaggregated inference by transferring KV-cache chunks between two machines via RDMA WRITE WITH IMMEDIATE and reconstructing tensor views on the receiver, using Soft-RoCE for measurements.
Related Articles

The programming passion is melting
Dev.to

Maximize Developer Revenue with Monetzly's Innovative API for AI Conversations
Dev.to
Co-Activation Pattern Detection for Prompt Injection: A Mechanistic Interpretability Approach Using Sparse Autoencoders
Reddit r/LocalLLaMA

Nvidia GTC 2026: Jensen Huang Bets $1 Trillion on the Age of the AI Factory
Dev.to

Nvidia GTC 2026: Jensen Huang Eyes $1 Trillion in Orders as the AI Infrastructure Race Hits Warp Speed
Dev.to