Information Plane Analysis of Binary Neural Networks

arXiv cs.LG / 5/6/2026


Key Points

  • The paper applies information plane (IP) analysis to binary neural networks (BNNs), focusing on how mutual information (MI) can be estimated reliably despite high-dimensional deterministic representations.
  • It analyzes the finite-sample behavior of the plug-in entropy estimator and derives conditions on sample size (N) and representation dimensionality (D) under which MI estimates remain trustworthy.
  • Outside the reliable regime, empirical MI estimates saturate at \(\log_2 N\), making IP trajectories largely uninformative for interpreting training dynamics (a short derivation of this ceiling follows the list).
  • Using 375 trained BNNs, the study examines whether late-stage compression phases occur and how compressed representations relate to generalization.
  • The findings indicate that late-stage compression often appears, but compressed latent representations do not consistently improve generalization; the link is strongly dependent on task, architecture, and regularization.
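
To make the \(\log_2 N\) ceiling explicit, the counting argument behind it is short (the notation here is ours, for illustration; the paper's exact formulation may differ). For a deterministic network the representation \(T\) is a function of the input \(X\), so \(I(X;T) = H(T)\), and a plug-in estimate built from \(N\) samples can place probability mass on at most \(N\) distinct activation patterns:

\[
\hat{I}(X;T) \;=\; \hat{H}(T) \;=\; -\sum_{t \in \mathcal{T}_N} \hat{p}(t)\,\log_2 \hat{p}(t) \;\le\; \log_2 |\mathcal{T}_N| \;\le\; \log_2 N,
\]

where \(\mathcal{T}_N\) is the set of distinct patterns observed among the \(N\) samples. When \(2^D \gg N\), almost every sample produces a unique pattern, so \(|\mathcal{T}_N| \approx N\) and the estimate sits at the ceiling regardless of what training is doing.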

Abstract

Information plane (IP) analysis has been proposed as a way to study the training dynamics of deep neural networks through the mutual information (MI) between inputs, representations, and targets. However, its statistical validity is often compromised by the difficulty of estimating MI from samples of high-dimensional, deterministic representations. In this work, we perform IP analyses on binary neural networks (BNNs), whose activations are discrete and whose MI is finite. We characterise the finite-sample behaviour of the plug-in entropy estimator and identify regimes of sample size N and representation dimensionality D under which MI estimates are reliable. Outside these regimes, we show that empirical MI estimates saturate at \(\log_2 N\), rendering IP trajectories uninformative. Restricting attention to the reliable regime, we train 375 BNNs to investigate the existence of late-stage compression phases and the relationship between compressed representations and generalisation performance. Our results show that, while late-stage compression is frequently observed, compressed latent representations do not consistently correlate with improved generalisation performance. Instead, the relationship between compression and generalisation is highly dependent on task, architecture, and regularisation.
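
As a concrete illustration, below is a minimal Python sketch of a plug-in (maximum-likelihood) MI estimator for binary activations. The function names and the random stand-in data are our own illustrative choices, not code from the paper; the point is that once \(2^D\) far exceeds \(N\), nearly every sample maps to a unique activation pattern and the estimate pins to \(\log_2 N\).

```python
import numpy as np

def plugin_entropy(patterns: np.ndarray) -> float:
    """Plug-in (maximum-likelihood) entropy estimate, in bits, of the
    empirical distribution over the distinct rows of `patterns`."""
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p_hat = counts / counts.sum()
    return float(-np.sum(p_hat * np.log2(p_hat)))

def plugin_mi_input_repr(patterns: np.ndarray) -> float:
    """For a deterministic encoder, H(T | X) = 0, so the plug-in
    estimate of I(X; T) reduces to the plug-in entropy of T."""
    return plugin_entropy(patterns)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N = 1024  # sample size
    for D in (8, 16, 64):  # representation dimensionality
        # Random stand-in for the binarized activations of one hidden layer.
        T = rng.integers(0, 2, size=(N, D))
        print(f"D={D:3d}  I_hat = {plugin_mi_input_repr(T):6.2f} bits   "
              f"log2(N) = {np.log2(N):.2f} bits")
```

With D = 8, the 256 possible patterns are well covered by the 1,024 samples and the estimate is meaningful; with D = 16 or D = 64, almost every observed pattern is unique and the estimate collapses to \(\log_2 N = 10\) bits, exactly the saturation regime the paper flags as uninformative.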