SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GPU Kernels Against Hardware Limits
arXiv cs.LG / 3/20/2026
Key Points
- SOL-ExecBench is a benchmark of 235 CUDA kernel optimization problems drawn from 124 production and emerging AI models spanning language, diffusion, vision, audio, video, and hybrid architectures. It targets NVIDIA Blackwell GPUs.
- It evaluates forward and backward workloads across BF16, FP8, and NVFP4, including kernels whose best performance is expected to depend on Blackwell-specific capabilities.
- The benchmark measures performance against analytically derived Speed-of-Light (SOL) bounds computed by SOLAR, providing a hardware-grounded target rather than a traditional software baseline.
- It outputs a SOL Score that quantifies how much of the gap to the hardware SOL bound a candidate kernel closes, enabling objective comparison of kernel efficiency.
- A sandboxed harness with GPU clock locking, L2 cache clearing, isolated subprocess execution, and static-analysis checks is provided to guard against reward-hacking by agentic optimizers.
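
The summary does not give the SOL Score formula or describe how SOLAR derives its bounds, so the following is only a minimal sketch of the general idea, under stated assumptions. It uses a simple roofline-style lower bound (the larger of compute time and memory time) as a stand-in for a SOL bound, and a hypothetical gap-closure ratio measured from an assumed reference time. The function names, the reference time, the clamping to [0, 1], and the hardware numbers are all illustrative placeholders, not values or definitions from the paper.

```python
def roofline_sol_time(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Roofline-style lower bound on runtime, in seconds.

    Assumption: a kernel can run no faster than the slower of its
    compute-limited and memory-limited times. This is a stand-in,
    not SOLAR's actual derivation.
    """
    compute_time = flops / peak_flops          # seconds if compute-bound
    memory_time = bytes_moved / peak_bandwidth  # seconds if bandwidth-bound
    return max(compute_time, memory_time)


def gap_closure_score(t_candidate, t_reference, t_sol):
    """Fraction of the reference-to-SOL gap closed by the candidate.

    Hypothetical formulation: 0.0 means no better than the reference,
    1.0 means the candidate reaches the bound. Clamped to [0, 1].
    """
    if t_reference <= t_sol:
        raise ValueError("reference time must be slower than the SOL bound")
    closed = (t_reference - t_candidate) / (t_reference - t_sol)
    return max(0.0, min(1.0, closed))


# Placeholder workload and hardware numbers (not real Blackwell specs).
t_sol = roofline_sol_time(
    flops=2e12,           # floating-point operations in the kernel
    bytes_moved=4e9,      # bytes read and written to device memory
    peak_flops=1e15,      # assumed peak throughput, FLOP/s
    peak_bandwidth=4e12,  # assumed peak memory bandwidth, bytes/s
)
score = gap_closure_score(t_candidate=4e-3, t_reference=10e-3, t_sol=t_sol)
print(f"SOL bound: {t_sol * 1e3:.2f} ms, gap closed: {score:.2f}")
```

In this sketch the kernel is compute-bound, because the compute time exceeds the memory time, and a candidate that runs in 4 ms against a 10 ms reference closes about three quarters of the gap to the bound.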