MobileKernelBench: Can LLMs Write Efficient Kernels for Mobile Devices?

arXiv cs.LG / 3/13/2026


Key Points

MobileKernelBench is introduced as a comprehensive benchmark for evaluating LLM-generated mobile kernels. It prioritizes operator diversity and cross-framework interoperability, and pairs the benchmark with an automated pipeline that bridges the host-device gap for on-device verification.
The evaluation on the CPU backend of Mobile Neural Network (MNN) reveals current LLMs struggle with mobile frameworks, exhibiting high compilation failure rates (>54%) and minimal performance gains due to hallucinations and data scarcity.
The authors propose the Mobile Kernel Agent (MoKA), a multi-agent system with repository-aware reasoning and a plan-and-execute paradigm to improve results.
Validated on MobileKernelBench, MoKA achieves 93.7% compilation success and enables 27.4% of generated kernels to deliver measurable speedups over native libraries.

Abstract

Large language models (LLMs) have demonstrated remarkable capabilities in code generation, yet their potential for generating kernels specifically for mobile devices remains largely unexplored. In this work, we extend the scope of automated kernel generation to the mobile domain to investigate the central question: Can LLMs write efficient kernels for mobile devices? To enable systematic investigation, we introduce MobileKernelBench, a comprehensive evaluation framework comprising a benchmark prioritizing operator diversity and cross-framework interoperability, coupled with an automated pipeline that bridges the host-device gap for on-device verification. Leveraging this framework, we conduct extensive evaluation on the CPU backend of Mobile Neural Network (MNN), revealing that current LLMs struggle with the engineering complexity and data scarcity inherent to mobile frameworks; standard models and even fine-tuned variants exhibit high compilation failure rates (over 54%) and negligible performance gains due to hallucinations and a lack of domain-specific grounding. To overcome these limitations, we propose the Mobile Kernel Agent (MoKA), a multi-agent system equipped with repository-aware reasoning and a plan-and-execute paradigm. Validated on MobileKernelBench, MoKA achieves state-of-the-art performance, boosting compilation success to 93.7% and enabling 27.4% of generated kernels to deliver measurable speedups over native libraries.
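
The two headline numbers in the abstract (compilation success and the share of kernels that beat the native library) are simple ratios over per-kernel outcomes. The Python sketch below shows one plausible way such ratios could be computed. It is an illustration only: the `KernelResult` record, its field names, the `summarize` helper, and the `min_speedup` threshold are all hypothetical, and the paper's exact metric definitions (for example, whether the denominator is all tasks or only compiled kernels) are not stated in this summary.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KernelResult:
    """Outcome of one generated kernel (hypothetical record layout)."""
    compiled: bool
    correct: bool = False                 # passed on-device output check
    baseline_ms: Optional[float] = None   # native-library latency
    generated_ms: Optional[float] = None  # generated-kernel latency


def summarize(results: List[KernelResult], min_speedup: float = 1.0) -> dict:
    """Return compile and speedup rates over ALL tasks (an assumption)."""
    total = len(results)
    if total == 0:
        raise ValueError("no results to summarize")

    compiled = sum(1 for r in results if r.compiled)

    # A kernel counts as a speedup only if it compiled, produced correct
    # output, and ran faster than the baseline by more than min_speedup.
    faster = sum(
        1
        for r in results
        if r.compiled
        and r.correct
        and r.baseline_ms is not None
        and r.generated_ms is not None
        and r.generated_ms > 0
        and r.baseline_ms / r.generated_ms > min_speedup
    )

    return {
        "compile_rate": compiled / total,
        "speedup_rate": faster / total,
    }


if __name__ == "__main__":
    demo = [
        KernelResult(True, True, baseline_ms=2.0, generated_ms=1.0),
        KernelResult(True, True, baseline_ms=1.0, generated_ms=2.0),
        KernelResult(False),
    ]
    print(summarize(demo))
```

Counting a kernel as a speedup only when it is also correct is a design choice made here for the sketch; a fast kernel that produces wrong output would otherwise inflate the speedup rate.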

