A Dual-Stream Transformer Architecture for Illumination-Invariant TIR-LiDAR Person Tracking
arXiv cs.RO / 4/2/2026
Key Points
- The paper introduces a dual-stream Transformer architecture for all-weather person tracking that combines Thermal-Infrared (TIR) and LiDAR/depth sensing, targeting conditions where RGB-D tracking fails, such as darkness and backlighting.
- It relies on sensors already standard on SLAM-capable robots (LiDAR and TIR cameras), yielding a practical TIR-D tracking system for autonomous mobile robots that need to follow a person reliably.
- A key bottleneck is the scarcity of annotated multi-modal TIR-D datasets. The authors address it with a sequential knowledge transfer method that carries structural priors from a model trained on large-scale thermal data into the TIR-D domain.
- The method uses a “Fine-grained Differential Learning Rate Strategy” to preserve pre-trained feature extraction while adapting quickly to the geometric depth cues needed for tracking.
- Experiments report improved performance over RGB-transfer and single-modality baselines, including an Average Overlap (AO) of 0.700 and a Success Rate (SR) of 58.7%.
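
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Average Overlap and Success Rate are commonly computed in single-object tracking benchmarks: AO as the mean intersection-over-union (IoU) between predicted and ground-truth boxes across frames, and SR as the fraction of frames whose IoU exceeds a threshold. The summary does not state which threshold or evaluation protocol the paper uses, so the `threshold` parameter, the `(x, y, width, height)` box format, and all function names here are assumptions for illustration, not the authors' evaluation code.

```python
from typing import Sequence, Tuple

# Assumed box format: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping region (zero if disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def per_frame_overlaps(preds: Sequence[Box], gts: Sequence[Box]) -> list:
    """IoU for each frame; predictions and ground truth must align."""
    if len(preds) != len(gts) or len(preds) == 0:
        raise ValueError("need equally long, non-empty box sequences")
    return [iou(p, g) for p, g in zip(preds, gts)]


def average_overlap(preds: Sequence[Box], gts: Sequence[Box]) -> float:
    """AO: mean IoU over all frames."""
    overlaps = per_frame_overlaps(preds, gts)
    return sum(overlaps) / len(overlaps)


def success_rate(preds: Sequence[Box], gts: Sequence[Box],
                 threshold: float = 0.5) -> float:
    """SR: fraction of frames with IoU above `threshold` (assumed value)."""
    overlaps = per_frame_overlaps(preds, gts)
    return sum(o > threshold for o in overlaps) / len(overlaps)


if __name__ == "__main__":
    # Toy three-frame sequence: perfect match, half-shifted box, total miss.
    gts = [(0, 0, 10, 10)] * 3
    preds = [(0, 0, 10, 10), (5, 0, 10, 10), (20, 20, 10, 10)]
    print(f"AO = {average_overlap(preds, gts):.3f}")
    print(f"SR = {success_rate(preds, gts, threshold=0.5):.3f}")
```

Because SR depends on the chosen threshold, an SR figure is only comparable across papers when the threshold and benchmark protocol match.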