Photonic convolutional neural network with pre-trained in-situ training

arXiv cs.LG / 4/6/2026

Key Points

The paper presents a fully photonic convolutional neural network (PCNN) that performs MNIST image classification entirely in the optical domain and reports 94% test accuracy without frequent O/E/O conversions.
It maintains coherent optical processing using Mach-Zehnder interferometer (MZI) meshes, wavelength-division multiplexed (WDM) pooling, and microring resonator-based nonlinearities, with max pooling implemented on silicon photonics.
To address difficulties training physical phase shifter parameters, the authors use a hybrid training approach combining an exact differentiable digital twin for ex-situ backpropagation and in-situ fine-tuning with the SPSA algorithm.
The evaluation shows strong robustness to thermal crosstalk, with only 0.43% accuracy degradation under severe coupling conditions.
The system is claimed to deliver 100–242× better energy efficiency than electronic GPUs for single-image inference, highlighting potential advantages for reducing energy bottlenecks in neural inference.
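As a rough illustration of the MZI building block named above, the sketch below models a single 2x2 Mach-Zehnder interferometer as two ideal 50:50 beam splitters with an internal phase shifter `theta` and an external phase shifter `phi`. The matrix convention, the function names `mzi` and `apply`, and the lossless-component assumption are choices made for this example and are not taken from the paper.

```python
import cmath
import math


def matmul2(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mzi(theta, phi):
    """Transfer matrix of an idealized, lossless MZI (assumed convention).

    Light passes the external phase shifter (phi), a 50:50 beam splitter,
    the internal phase shifter (theta), then a second 50:50 beam splitter.
    """
    s = 1 / math.sqrt(2)
    splitter = [[s, 1j * s], [1j * s, s]]
    inner = [[cmath.exp(1j * theta), 0], [0, 1]]
    outer = [[cmath.exp(1j * phi), 0], [0, 1]]
    return matmul2(splitter, matmul2(inner, matmul2(splitter, outer)))


def apply(u, x):
    """Apply a 2x2 transfer matrix to a pair of input field amplitudes."""
    return [u[0][0] * x[0] + u[0][1] * x[1],
            u[1][0] * x[0] + u[1][1] * x[1]]


# Send unit power into the top port and read the power at both outputs.
out = apply(mzi(theta=0.7, phi=0.2), [1.0, 0.0])
powers = [abs(field) ** 2 for field in out]
cross = [abs(field) ** 2 for field in apply(mzi(0.0, 0.0), [1.0, 0.0])]
```

Because every factor is unitary, the total output power equals the input power for any phase setting, and in this convention `theta = 0` routes all light to the opposite port. A mesh composes many such blocks to realise a larger unitary weight matrix; the phases are the trainable parameters.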

Abstract

Photonic computing is a computing paradigm that has great potential to overcome the energy bottlenecks of the electronic von Neumann architecture. Throughput and power consumption are fundamental limitations of complementary metal-oxide-semiconductor (CMOS) chips, even as convolutional neural networks (CNNs) are revolutionising machine learning, computer vision, and other image-based applications. In this work, we propose and validate a fully photonic convolutional neural network (PCNN) that performs MNIST image classification entirely in the optical domain, achieving 94 percent test accuracy. Unlike existing architectures that rely on frequent intermediate optical-to-electrical-to-optical (O/E/O) conversions, our system maintains coherent processing using Mach-Zehnder interferometer (MZI) meshes, wavelength-division multiplexed (WDM) pooling, and microring resonator-based nonlinearities. The max pooling unit is implemented fully on silicon photonics and requires no opto-electrical or electrical conversions. To overcome the challenges of training physical phase shifter parameters, we introduce a hybrid training methodology that deploys a mathematically exact differentiable digital twin for ex-situ backpropagation, followed by in-situ fine-tuning via the simultaneous perturbation stochastic approximation (SPSA) algorithm. Our evaluation demonstrates significant robustness to thermal crosstalk (only 0.43 percent accuracy degradation at severe coupling) and achieves 100 to 242 times better energy efficiency than state-of-the-art electronic GPUs for single-image inference.
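To make the in-situ fine-tuning step more concrete, here is a minimal sketch of generic SPSA, which estimates a gradient from only two loss measurements per step regardless of how many parameters are tuned. That property is what makes it attractive when the loss can only be measured on hardware. The quadratic `toy_loss`, the target phases, the gain constants, and the function name `spsa_minimize` are all assumptions for illustration; the paper's actual loss, hyperparameters, and hardware interface are not given in the abstract.

```python
import random


def spsa_minimize(loss, theta, steps=200, a=0.1, c=0.1, seed=0):
    """Minimize loss(theta) with basic SPSA (two loss evaluations per step)."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, steps + 1):
        # Decaying gains with the commonly used SPSA exponents (assumed here).
        a_k = a / k ** 0.602
        c_k = c / k ** 0.101
        # Perturb every parameter at once with a random +/-1 direction.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Per-parameter gradient estimate: diff / (2 * c_k * delta_i).
        theta = [t - a_k * diff / (2.0 * c_k * d)
                 for t, d in zip(theta, delta)]
    return theta


def toy_loss(phases):
    """Hypothetical stand-in for a loss measured on the photonic chip."""
    target = [0.3, -1.2, 0.8]
    return sum((p - t) ** 2 for p, t in zip(phases, target))


start = [0.0, 0.0, 0.0]
tuned = spsa_minimize(toy_loss, start)
```

In a hybrid scheme like the one described, `start` would plausibly hold the phases obtained from ex-situ backpropagation on the digital twin, so SPSA only has to correct the residual mismatch between the model and the fabricated device.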