Learning Spatial Structure from Pre-Beamforming Per-Antenna Range-Doppler Radar Data via Visibility-Aware Cross-Modal Supervision

arXiv cs.CV / 4/3/2026

Key Points

  • The paper examines whether automotive radar models can learn meaningful spatial structure directly from pre-beamforming per-antenna range-Doppler (RD) measurements, bypassing the explicit angle-domain beamforming stage.
  • Using a 6-TX x 8-RX commodity automotive radar with an A/B chirp-sequence FMCW (CS-FMCW) scheme, in which the effective transmit aperture varies between chirps (single-TX vs. multi-TX), the authors analyze how chirp-dependent transmit configurations affect spatial recoverability.
  • A dual-chirp, shared-weight encoder is trained end-to-end on pre-beamforming per-antenna RD tensors and evaluated via bird's-eye-view (BEV) occupancy, used as a geometry-focused probe rather than as a performance-driven objective (a minimal architecture sketch follows this list).
  • Supervision is visibility-aware and cross-modal: LiDAR-derived labels account for the radar field-of-view and for occlusion-aware LiDAR observability via ray-based visibility modeling (see the visibility sketch after the abstract).
  • Chirp ablations (A-only, B-only, A+B) and range-band analyses, compared against physics-aligned baselines, support the conclusion that spatial structure can be recovered without hand-crafted signal-processing stages or explicit angle-domain construction.

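The paper does not include code; the following is a minimal PyTorch sketch of what a dual-chirp, shared-weight encoder over pre-beamforming per-antenna RD tensors could look like. All names, shapes, and layer choices (`RDEncoder`, the real/imaginary channel stacking, the naive RD-to-BEV resampling) are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch (assumptions, not the authors' code): a shared-weight encoder
# applied to chirp-A and chirp-B per-antenna range-Doppler (RD) tensors, fused
# and decoded into a BEV occupancy logit map used as a geometric probe.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RDEncoder(nn.Module):
    """Encoder over one chirp's per-antenna RD tensor.

    Assumed input layout: (batch, 2 * n_ant, range_bins, doppler_bins), with
    the complex RD spectrum stacked as real/imaginary channels per virtual
    antenna (48 = 6 TX x 8 RX) -- no beamforming applied beforehand.
    """
    def __init__(self, n_ant: int = 48, width: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * n_ant, width, 3, padding=1), nn.BatchNorm2d(width), nn.ReLU(),
            nn.Conv2d(width, width, 3, stride=2, padding=1), nn.BatchNorm2d(width), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.BatchNorm2d(width), nn.ReLU(),
        )

    def forward(self, rd: torch.Tensor) -> torch.Tensor:
        return self.net(rd)

class DualChirpBEVProbe(nn.Module):
    """One RDEncoder instance reused for both chirps (weight sharing);
    features are concatenated and decoded into BEV occupancy logits."""
    def __init__(self, n_ant: int = 48, width: int = 64, bev_hw=(128, 128)):
        super().__init__()
        self.encoder = RDEncoder(n_ant, width)   # shared across chirps A and B
        self.bev_hw = bev_hw
        self.head = nn.Sequential(
            nn.Conv2d(2 * width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, 1, 1),              # per-cell occupancy logits
        )

    def forward(self, rd_a=None, rd_b=None):
        # Chirp ablations: pass A only, B only (rd_a=None), or both (A+B).
        feats = [self.encoder(x) for x in (rd_a, rd_b) if x is not None]
        if len(feats) == 1:
            feats = feats * 2                    # keep the head's input width fixed
        f = torch.cat(feats, dim=1)
        # Naive resampling from the RD feature grid to the BEV grid; the
        # paper's actual RD-to-BEV mapping is not specified here.
        f = F.interpolate(f, size=self.bev_hw, mode="bilinear", align_corners=False)
        return self.head(f)

if __name__ == "__main__":
    rd_a = torch.randn(2, 96, 256, 64)  # 48 antennas x (re, im), 256 range x 64 Doppler bins
    rd_b = torch.randn(2, 96, 256, 64)
    model = DualChirpBEVProbe()
    print(model(rd_a, rd_b).shape)      # torch.Size([2, 1, 128, 128])
```

The single `RDEncoder` reused for both chirps is what "shared-weight" refers to; calling the model with A only, B only, or A+B mirrors the paper's chirp ablations.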
Abstract

Automotive radar perception pipelines commonly construct angle-domain representations via beamforming before applying learning-based models. This work instead investigates a representational question: can meaningful spatial structure be learned directly from pre-beamforming per-antenna range-Doppler (RD) measurements? Experiments are conducted on a 6-TX x 8-RX (48 virtual antennas) commodity automotive radar employing an A/B chirp-sequence frequency-modulated continuous-wave (CS-FMCW) transmit scheme, in which the effective transmit aperture varies between chirps (single-TX vs. multi-TX), enabling controlled analysis of chirp-dependent transmit configurations. We operate on pre-beamforming per-antenna RD tensors using a dual-chirp shared-weight encoder trained in an end-to-end, fully data-driven manner, and evaluate spatial recoverability using bird's-eye-view (BEV) occupancy as a geometric probe rather than a performance-driven objective. Supervision is visibility-aware and cross-modal, derived from LiDAR with explicit modeling of the radar field-of-view and occlusion-aware LiDAR observability via ray-based visibility. Through chirp ablations (A-only, B-only, A+B), range-band analysis, and physics-aligned baselines, we assess how transmit configurations affect geometric recoverability. The results indicate that spatial structure can be learned directly from pre-beamforming per-antenna RD tensors without explicit angle-domain construction or hand-crafted signal-processing stages.
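
To make the visibility-aware supervision concrete: a common way to build an occlusion-aware observability mask is to bin LiDAR returns by azimuth and mark BEV cells as observed only out to the first return along each ray. Below is an illustrative NumPy sketch under assumed conventions (polar BEV binning, a single first-return depth per ray, a hypothetical `slack` margin); the paper's ray-based visibility model may differ in its details.

```python
# Illustrative sketch (assumed conventions, not the authors' exact model):
# derive a polar BEV observability mask from LiDAR by casting rays in the
# ground plane, so only cells the LiDAR could actually observe contribute
# to the occupancy labels.
import numpy as np

def bev_visibility_mask(points_xy, fov_deg=120.0, max_range=50.0,
                        n_rays=240, n_range_bins=100, slack=1.0):
    """For each azimuth ray inside the sensor FoV, cells are 'observed' out
    to the first LiDAR return (plus a small slack); cells behind the first
    return are treated as occluded and should be ignored in the loss.

    points_xy : (N, 2) LiDAR returns in the sensor frame (x forward, y left).
    Returns a (n_rays, n_range_bins) boolean mask in polar BEV coordinates.
    """
    az = np.degrees(np.arctan2(points_xy[:, 1], points_xy[:, 0]))
    rng = np.linalg.norm(points_xy, axis=1)
    half = fov_deg / 2.0
    keep = (np.abs(az) <= half) & (rng <= max_range)
    az, rng = az[keep], rng[keep]

    ray_idx = ((az + half) / fov_deg * n_rays).astype(int).clip(0, n_rays - 1)
    first_hit = np.full(n_rays, max_range)   # rays with no return see to max_range
    np.minimum.at(first_hit, ray_idx, rng)   # nearest return per ray

    bin_centers = (np.arange(n_range_bins) + 0.5) * (max_range / n_range_bins)
    # A cell is visible if it lies at or just beyond the first return.
    return bin_centers[None, :] <= (first_hit[:, None] + slack)
```

Intersecting such a mask with the radar field-of-view and dropping masked-out cells from the occupancy loss yields labels that only penalize the model where the LiDAR could actually verify occupancy, which is the visibility-aware, cross-modal supervision the abstract describes.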