I have been working on a project to adapt QEMU, running on macOS, to support passing a GPU through to a Linux VM. I wrote this post to walk through some of the interesting challenges involved, along with benchmarks. The post focuses heavily on gaming, but it includes AI benchmarks as well.
You can do CUDA inference on an Apple Silicon Mac with PCI Passthrough
Reddit r/LocalLLaMA / 5/9/2026
💬 Opinion, Developer Stack & Infrastructure, Signals & Early Trends, Tools & Practical Usage
Key Points
- A developer describes adapting QEMU on macOS to pass a GPU through to a Linux VM using PCI passthrough on Apple Silicon Macs.
- The write-up highlights key challenges encountered during the virtualization and GPU passthrough process, along with performance benchmarks.
- While the post is centered on gaming, it also includes AI benchmark results, indicating the approach can support AI inference workflows that expect CUDA.
- The article points to a practical eGPU/VM setup path for users who want to run CUDA-oriented workloads on an Apple Silicon Mac, even though CUDA is not available natively on macOS on that hardware.
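
As a rough illustration of what "the guest sees the GPU" means in practice, the sketch below checks, from inside the Linux VM, whether any PCI device reports NVIDIA's vendor ID through sysfs. It is not taken from the post: the function name `find_nvidia_devices` is hypothetical, and the script assumes an NVIDIA card (since the workloads in question expect CUDA) and a standard Linux sysfs layout under `/sys/bus/pci/devices`. It only confirms that the device is enumerated, not that a driver or CUDA is working.

```python
from pathlib import Path

# PCI vendor ID assigned to NVIDIA. Assumption: the passed-through GPU is an NVIDIA card.
NVIDIA_VENDOR_ID = "0x10de"


def find_nvidia_devices(sysfs_root="/sys/bus/pci/devices"):
    """Return the PCI addresses of devices under sysfs_root whose vendor is NVIDIA.

    Hypothetical helper for illustration; sysfs_root is a parameter so the
    function can be pointed at a fake directory tree for testing.
    """
    root = Path(sysfs_root)
    if not root.is_dir():
        return []
    found = []
    for dev in sorted(root.iterdir()):
        try:
            # Each device directory exposes a "vendor" file such as "0x10de".
            vendor = (dev / "vendor").read_text().strip().lower()
        except OSError:
            continue  # Skip entries without a readable vendor file.
        if vendor == NVIDIA_VENDOR_ID:
            found.append(dev.name)
    return found


if __name__ == "__main__":
    devices = find_nvidia_devices()
    if devices:
        print("NVIDIA PCI devices visible in this guest:", ", ".join(devices))
    else:
        print("No NVIDIA PCI device visible in this guest")
```

If the list comes back empty inside the VM, the passthrough itself is the first thing to investigate, before any driver or CUDA toolkit configuration.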

