You can do CUDA inference on an Apple Silicon Mac with PCI Passthrough

Reddit r/LocalLLaMA / 5/9/2026


Key Points

  • A developer describes adapting QEMU on macOS to pass a GPU through to a Linux VM using PCI passthrough on Apple Silicon Macs.
  • The write-up highlights key challenges encountered during the virtualization and GPU passthrough process, along with performance benchmarks.
  • While the post is centered on gaming, it also includes AI benchmark results, indicating the approach can support AI inference workflows that expect CUDA.
  • The article points to a practical eGPU/VM setup path for users who want to run CUDA-oriented workloads on Apple Silicon, even though CUDA is not available natively on macOS for these machines.

I have been working on a project to adapt QEMU, running on macOS, so that it can pass a GPU through to a Linux VM. I wrote this post to walk through some of the interesting challenges involved, along with benchmarks. The post focuses heavily on gaming, but it includes AI benchmarks as well.
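
For anyone trying a setup like this, a quick first check is whether the Linux guest actually sees the passed-through GPU before running any CUDA workload. The following is only a minimal sketch and is not taken from the post: it assumes an NVIDIA GPU with the vendor driver and the `nvidia-smi` tool installed inside the guest, and the helper names `parse_gpu_lines` and `list_guest_gpus` are hypothetical.

```python
import shutil
import subprocess

# Standard nvidia-smi query: one CSV line per GPU, e.g. "<name>, <memory>".
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"]


def parse_gpu_lines(text):
    """Turn 'name, memory' CSV lines into a list of (name, memory) tuples."""
    gpus = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last comma so a comma inside the GPU name is preserved.
        name, _, memory = line.rpartition(",")
        gpus.append((name.strip(), memory.strip()))
    return gpus


def list_guest_gpus():
    """Return the GPUs visible in this guest, or [] if none can be queried."""
    # Assumption: a missing nvidia-smi means no NVIDIA driver is installed.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    gpus = list_guest_gpus()
    if not gpus:
        print("No NVIDIA GPU visible in this guest")
    for name, memory in gpus:
        print(f"{name}: {memory}")
```

An empty result from inside the VM would suggest that either the passthrough or the guest driver is not working yet, which is worth ruling out before looking at the inference stack itself.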

submitted by /u/scottjgo