Convergence of Byzantine-Resilient Gradient Tracking via Probabilistic Edge Dropout

arXiv cs.LG / 4/2/2026

Key Points

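This paper proposes GT-PD, a gradient tracking method combined with probabilistic edge dropout, for distributed optimization over networks that contain Byzantine agents (agents that may send arbitrary adversarial messages).
GT-PD uses two defense layers: (1) input clipping via a self-centered projection around the receiving agent, and (2) fully decentralized probabilistic dropout based on a dual-metric trust score in the decision and tracking channels. Together they aim to bound adversarial perturbations while preserving the doubly stochastic mixing structure, which is often lost under robust aggregation.
When Byzantine agents are completely isolated ($p_b=0$), GT-PD converges linearly to a neighborhood determined solely by the stochastic gradient variance. Under partial isolation ($p_b>0$), GT-PD-L (leaky integration), which controls the accumulation of tracking errors, converges linearly to a bounded neighborhood that depends on the gradient variance and the clipping-to-leak ratio.
The paper also shows that under two-tier dropout with $p_h=1$, isolating Byzantine agents adds no variance to the consensus dynamics of the honest agents. In MNIST experiments under Sign Flip, ALIE, and Inner Product Manipulation attacks, GT-PD-L is reported to outperform coordinate-wise trimmed mean by up to 4.3 percentage points.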
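The first defense layer, clipping each incoming message to a ball around the receiver, can be illustrated with a minimal sketch. The function name `clip_to_ball`, the use of the Euclidean norm, and the plain-list vector representation are assumptions made for illustration; the summary does not specify the norm or any implementation details.

```python
import math

def clip_to_ball(own, incoming, tau):
    """Project `incoming` onto the ball of radius `tau` centered at `own`.

    Hypothetical helper: assumes Euclidean norm and equal-length lists.
    """
    # Offset of the received message from the receiver's own state.
    diff = [m - x for m, x in zip(incoming, own)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm <= tau:
        # Already inside the ball: accept the message unchanged.
        return list(incoming)
    # Outside the ball: shrink the offset so its length is exactly tau.
    scale = tau / norm
    return [x + scale * d for x, d in zip(own, diff)]

# A far-away message is pulled back to distance tau from the receiver.
clipped = clip_to_ball([0.0, 0.0], [3.0, 4.0], 1.0)
```

Whatever a Byzantine neighbor sends, the clipped message differs from the receiver's own state by at most `tau`, which is one way to read the claim that adversarial perturbations stay bounded.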

Abstract

We study distributed optimization over networks with Byzantine agents that may send arbitrary adversarial messages. We propose Gradient Tracking with Probabilistic Edge Dropout (GT-PD), a stochastic gradient tracking method that preserves the convergence properties of gradient tracking under adversarial communication. GT-PD combines two complementary defense layers: a universal self-centered projection that clips each incoming message to a ball of radius $\tau$ around the receiving agent, and a fully decentralized probabilistic dropout rule driven by a dual-metric trust score in the decision and tracking channels. This design bounds adversarial perturbations while preserving the doubly stochastic mixing structure, a property often lost under robust aggregation in decentralized settings. Under complete Byzantine isolation ($p_b=0$), GT-PD converges linearly to a neighborhood determined solely by stochastic gradient variance. For partial isolation ($p_b>0$), we introduce Gradient Tracking with Probabilistic Edge Dropout and Leaky Integration (GT-PD-L), which uses a leaky integrator to control the accumulation of tracking errors caused by persistent perturbations and achieves linear convergence to a bounded neighborhood determined by the stochastic variance and the clipping-to-leak ratio. We further show that under two-tier dropout with $p_h=1$, isolating Byzantine agents introduces no additional variance into the honest consensus dynamics. Experiments on MNIST under Sign Flip, ALIE, and Inner Product Manipulation attacks show that GT-PD-L outperforms coordinate-wise trimmed mean by up to 4.3 percentage points under stealth attacks.
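To see how dropping edges can leave the mixing matrix doubly stochastic, here is a minimal sketch under assumptions that are not taken from the abstract: the base matrix `W` is symmetric, both endpoints of an edge drop it together, and the dropped weight is returned to the two self-weights. The name `dropout_mixing` and the `keep_prob` table are hypothetical, and the trust score that would set those probabilities is not modeled.

```python
import random

def dropout_mixing(W, keep_prob, rng):
    """Return a copy of W with randomly dropped edges.

    Hypothetical sketch: assumes W is symmetric and doubly stochastic,
    and that keep_prob[i][j] == keep_prob[j][i] for every edge.
    """
    n = len(W)
    out = [row[:] for row in W]
    for i in range(n):
        for j in range(i + 1, n):
            # Keep edge (i, j) with probability keep_prob[i][j].
            if W[i][j] > 0 and rng.random() >= keep_prob[i][j]:
                # Move the dropped weight onto both diagonals so every
                # row and column still sums to 1.
                out[i][i] += W[i][j]
                out[j][j] += W[j][i]
                out[i][j] = 0.0
                out[j][i] = 0.0
    return out

# Three agents; agent 2 plays the Byzantine role.
W = [[0.5, 0.25, 0.25],
     [0.25, 0.5, 0.25],
     [0.25, 0.25, 0.5]]
# Honest-honest edges are always kept (p_h = 1) and edges touching the
# Byzantine agent are always dropped (p_b = 0).
keep = [[1.0, 1.0, 0.0],
        [1.0, 1.0, 0.0],
        [0.0, 0.0, 1.0]]
mixed = dropout_mixing(W, keep, random.Random(0))
# mixed == [[0.75, 0.25, 0.0], [0.25, 0.75, 0.0], [0.0, 0.0, 1.0]]
```

In this toy case the honest block of the result is itself doubly stochastic and involves no randomness, which matches the spirit of the statement that isolation with $p_h=1$ adds no variance to the honest consensus dynamics. It is only an illustration, not the paper's proof or its actual dropout rule.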

