Scalable Multi-Task Learning through Spiking Neural Networks with Adaptive Task-Switching Policy for Intelligent Autonomous Agents

arXiv cs.RO / 4/20/2026


Key Points

  • The paper targets scalable multi-task training for resource-constrained autonomous agents, where task interference often degrades RL-based multi-task performance.
  • It proposes SwitchMT, combining a Deep Spiking Q-Network with active dendrites and a dueling architecture that uses task-specific context signals to form specialized sub-networks.
  • SwitchMT improves over prior SNN-based RL approaches by introducing an adaptive task-switching policy that depends on both reward signals and internal network dynamics, rather than fixed intervals.
  • Experiments on multiple Atari games (Pong, Breakout, Enduro) and longer episodes show competitive results versus the state of the art, indicating better handling of task interference without increasing network complexity.
  • The method is positioned as enabling low-power, energy-efficient multi-task intelligent agents by leveraging spiking computation while improving training scalability and effectiveness.
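To make the architecture in the key points concrete, here is a minimal numpy sketch of a forward pass combining the three pieces the summary names: a simplified spiking (leaky integrate-and-fire) layer, active-dendrite gating driven by a task-context signal, and a dueling Q-head. All dimensions, weights, and function names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def lif_forward(x, W, T=10, v_th=1.0, decay=0.5):
    """Simplified leaky integrate-and-fire layer: integrate the input
    current over T timesteps, emit a spike when the membrane potential
    crosses v_th, and return the mean spike rate per hidden unit."""
    v = np.zeros(W.shape[1])
    spikes = np.zeros(W.shape[1])
    for _ in range(T):
        v = decay * v + x @ W          # leaky integration of input current
        fired = v >= v_th
        spikes += fired
        v[fired] = 0.0                 # reset membrane after a spike
    return spikes / T

def active_dendrite_gate(context, dendrites):
    """Each hidden unit has several dendritic segments; the segment most
    aligned with the task context gates that unit, so different tasks
    activate different sub-networks."""
    acts = dendrites @ context                 # (hidden, segments)
    best = acts.max(axis=1)                    # strongest segment per unit
    return 1.0 / (1.0 + np.exp(-best))         # sigmoid gate in (0, 1)

def dueling_q(h, W_v, W_a):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    v = h @ W_v                                # scalar state value
    a = h @ W_a                                # per-action advantages
    return v + a - a.mean()

# Toy dimensions; all weights random, for illustration only.
obs_dim, hid, n_actions, ctx_dim, n_seg = 8, 16, 4, 3, 5
W_in = rng.normal(size=(obs_dim, hid))
dendrites = rng.normal(size=(hid, n_seg, ctx_dim))
W_v = rng.normal(size=(hid, 1))
W_a = rng.normal(size=(hid, n_actions))

obs = rng.normal(size=obs_dim)
task_context = np.eye(ctx_dim)[0]              # one-hot task identity signal

rates = lif_forward(obs, W_in)                 # spike rates from the SNN layer
h = rates * active_dendrite_gate(task_context, dendrites)
q = dueling_q(h, W_v, W_a)                     # one Q-value per action
```

The task-context vector only modulates the gates, so switching tasks re-routes activity through a different subset of units without adding parameters per task, which is the mechanism the paper credits for reducing interference.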

Abstract

Training resource-constrained autonomous agents on multiple tasks simultaneously is crucial for adapting to diverse real-world environments. Recent works employ reinforcement learning (RL) approaches, but they still suffer from sub-optimal multi-task performance due to task interference. State-of-the-art works employ Spiking Neural Networks (SNNs) to improve RL-based multi-task learning and enable low-power/energy operations through network enhancements and spike-driven data stream processing. However, they rely on fixed task-switching intervals during training, limiting their performance and scalability. To address this, we propose SwitchMT, a novel methodology that employs adaptive task-switching for effective, scalable, and simultaneous multi-task learning. SwitchMT employs the following key ideas: (1) leveraging a Deep Spiking Q-Network with active dendrites and a dueling structure that utilizes task-specific context signals to create specialized sub-networks; and (2) devising an adaptive task-switching policy that leverages both rewards and the internal dynamics of the network parameters. Experimental results demonstrate that SwitchMT achieves competitive scores in multiple Atari games (i.e., Pong: -8.8, Breakout: 5.6, and Enduro: 355.2) and longer game episodes as compared to the state-of-the-art. These results also highlight the effectiveness of the SwitchMT methodology in addressing task interference without increasing network complexity, enabling intelligent autonomous agents with scalable multi-task learning capabilities.
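The abstract says the switching policy depends on both rewards and the internal dynamics of the network parameters, but the summary does not spell out the exact criterion. A plausible sketch, under the assumption that "plateaued reward plus small parameter updates" triggers a switch, might look like this (class name, window sizes, and thresholds are all hypothetical):

```python
import collections
import numpy as np

class AdaptiveTaskSwitcher:
    """Hypothetical sketch: advance to the next task when the reward
    trend flattens AND recent parameter updates are small, rather than
    switching on a fixed step count."""

    def __init__(self, n_tasks, window=20, reward_eps=0.01, grad_eps=1e-3):
        self.n_tasks = n_tasks
        self.task = 0
        self.rewards = collections.deque(maxlen=window)
        self.grad_norms = collections.deque(maxlen=window)
        self.reward_eps = reward_eps
        self.grad_eps = grad_eps

    def update(self, reward, grad_norm):
        """Record one training step's reward and update magnitude,
        and return the (possibly new) active task index."""
        self.rewards.append(reward)
        self.grad_norms.append(grad_norm)
        if len(self.rewards) < self.rewards.maxlen:
            return self.task                      # not enough history yet
        half = self.rewards.maxlen // 2
        r = np.array(self.rewards)
        improving = r[half:].mean() - r[:half].mean() > self.reward_eps
        settling = np.mean(self.grad_norms) < self.grad_eps
        if not improving and settling:            # plateaued on this task
            self.task = (self.task + 1) % self.n_tasks
            self.rewards.clear()
            self.grad_norms.clear()
        return self.task

switcher = AdaptiveTaskSwitcher(n_tasks=3)
for step in range(25):
    # Flat rewards and tiny updates should eventually trigger a switch.
    task = switcher.update(reward=0.0, grad_norm=1e-4)
```

The design point matches the paper's motivation: a fixed interval either cuts training short on hard tasks or wastes steps on converged ones, whereas a plateau-based trigger adapts the dwell time per task automatically.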