AI Navigate

We all had p2p wrong with vllm so I rtfm

Reddit r/LocalLLaMA / 3/17/2026

💬 Opinion · Developer Stack & Infrastructure · Tools & Practical Usage

Key Points

  • vLLM relies on NCCL and will attempt P2P assuming NVLink, which can cause hangs if NVLink is not present.
  • To enable P2P over PCIe, set VLLM_SKIP_P2P_CHECK=1 and NCCL_P2P_LEVEL=SYS (assuming proper IOMMU) to allow the needed cross-NUMA/PCIe traffic.
  • The NCCL_P2P_LEVEL values LOC, NVL, PIX, PXB, PHB, SYS define when P2P is used based on topology, as described in the linked NVIDIA docs.
  • Be aware that on Sapphire Rapids PCIe P2P is limited to Gen4 due to NTB limitations.

So either you have a pro GPU (non-GeForce) or a P2P-enabled driver, but no NVLink bridge, and when you try vLLM it hangs....

In fact vLLM relies on NCCL under the hood, and NCCL will try P2P assuming it has NVLink. So even if your GPUs can do P2P over PCIe, the NVLink path still fails.

That's why everywhere you see NCCL_P2P_DISABLE=1 suggested as the workaround (disabling P2P entirely).

So how can you use P2P over PCIe? By telling NCCL which level of P2P is OK: https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html#nccl-p2p-level
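Before picking a level, it helps to see what your topology actually looks like. `nvidia-smi topo -m` prints the connection matrix between GPU pairs, using the same path-type names the NCCL docs use:

```shell
# Print the GPU-to-GPU connection matrix; each cell shows the path type
# between a pair of GPUs (NV# = NVLink, PIX/PXB = PCIe switch(es),
# PHB = same NUMA node via the host bridge, SYS = crosses the SMP
# interconnect, e.g. QPI/UPI between sockets).
nvidia-smi topo -m
```

If the matrix shows SYS between your GPUs, that is the level NCCL needs to be allowed to use.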

By adding VLLM_SKIP_P2P_CHECK=1 NCCL_P2P_LEVEL=SYS (assuming your IOMMU is set up properly), you tell NCCL that whatever interconnect it needs to cross on your motherboard is fine.
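Putting it together, a launch would look something like this sketch. The model name and tensor-parallel size are placeholders for your own setup; the two env vars are the real ones from the post:

```shell
# Sketch: launch vLLM with PCIe P2P allowed all the way across NUMA nodes.
# VLLM_SKIP_P2P_CHECK=1 stops vLLM's own P2P probe from bailing out;
# NCCL_P2P_LEVEL=SYS lets NCCL use P2P even across the SMP interconnect.
# "my-model" and --tensor-parallel-size 2 are placeholders, not a recipe.
VLLM_SKIP_P2P_CHECK=1 NCCL_P2P_LEVEL=SYS \
  vllm serve my-model --tensor-parallel-size 2
```

If you want to confirm what NCCL actually decided, the standard NCCL_DEBUG=INFO env var makes it log its P2P choices at init.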

Note: on Sapphire Rapids, PCIe P2P is limited to Gen4 due to NTB limitations.

Here are the accepted values for NCCL_P2P_LEVEL:

  • LOC: Never use P2P (always disabled)
  • NVL: Use P2P when GPUs are connected through NVLink
  • PIX: Use P2P when GPUs are on the same PCI switch
  • PXB: Use P2P when GPUs are connected through PCI switches (potentially multiple hops)
  • PHB: Use P2P when GPUs are on the same NUMA node; traffic will go through the CPU
  • SYS: Use P2P between NUMA nodes, potentially crossing the SMP interconnect (e.g. QPI/UPI)
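One way to read this list: the levels form an ordering from closest to farthest, and NCCL_P2P_LEVEL sets the maximum "distance" at which P2P is still used. Here is a tiny illustrative sketch of that rule in Python (my own model of the docs above, not NCCL source code):

```python
# Illustrative sketch, not NCCL's actual implementation: levels ordered
# from closest path to farthest, per the NVIDIA docs. NCCL_P2P_LEVEL
# sets the farthest path type at which P2P is still allowed.
LEVELS = ["LOC", "NVL", "PIX", "PXB", "PHB", "SYS"]

def p2p_allowed(configured_level: str, path_type: str) -> bool:
    """True if P2P would be used for a GPU pair whose path is
    `path_type`, given NCCL_P2P_LEVEL=`configured_level`.
    LOC effectively means "never": no GPU pair is at LOC distance."""
    return LEVELS.index(path_type) <= LEVELS.index(configured_level)

print(p2p_allowed("SYS", "SYS"))  # cross-NUMA pair, SYS allows it: True
print(p2p_allowed("PXB", "SYS"))  # cross-NUMA pair, PXB forbids it: False
```

This is why SYS is the value you want for cross-NUMA PCIe P2P: anything stricter makes NCCL fall back to staging through host memory for those pairs.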
submitted by /u/Opteron67