The DMA Streaming Framework: Kernel-Level Buffer Orchestration for High-Performance AI Data Paths
arXiv cs.AI / 3/12/2026
📰 NewsDeveloper Stack & InfrastructureIdeas & Deep Analysis
Key Points
- dmaplane is a Linux kernel module that exposes a stable UAPI at /dev/dmaplane to explicitly manage DMA buffer lifecycles and orchestration for AI data paths.
- It provides ring-based command channels, DMA buffer lifecycle management, dma-buf export for cross-device sharing, a kernel RDMA engine, NUMA-aware allocation/verification, credit-based flow control, low-overhead observability, and GPU memory integration via PCIe BAR pinning.
- The paper evaluates orchestration sensitivity with NUMA cross-node penalties at DRAM scale, completion-safe flow control under sustained RDMA load, and GPU BAR mapping tiers versus cudaMemcpy.
- It also demonstrates end-to-end disaggregated inference by transferring KV-cache chunks between two machines via RDMA WRITE WITH IMMEDIATE and reconstructing tensor views on the receiver, using Soft-RoCE for measurements.
Related Articles
I Was Wrong About AI Coding Assistants. Here's What Changed My Mind (and What I Built About It).
Dev.to
I Built the Most Feature-Complete MCP Server for Obsidian — Here's How
Dev.to
A supervisor or "manager" Al agent is the wrong way to control Al
Reddit r/artificial
FeatherOps: Fast fp8 matmul on RDNA3 without native fp8
Reddit r/LocalLLaMA
What Rotifer Protocol Is Not: Positioning Beyond the AGI Hype
Dev.to