Cross-Modal Generation: From Commodity WiFi to High-Fidelity mmWave and RFID Sensing

arXiv cs.LG / 4/21/2026

📰 NewsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper proposes RF-CMG, a diffusion-based cross-modal generative method that uses abundant WiFi data to synthesize high-fidelity RF signals for data-scarce modalities like mmWave and RFID.
RF-CMG decouples cross-modal generation into “high-frequency guidance” and “low-frequency constraints,” learning target high-frequency distributions from limited target data while preserving physical structure through progressive low-frequency consistency.
It introduces two key modules: a Modality-Guided Embedding (MGE) to steer the reverse diffusion process toward the target modality’s high-frequency characteristics, and a Low-Frequency Modality Consistency (LFMC) to reduce structural bias from the source modality during inference.
Experiments report improved synthesis quality for both RFID and mmWave compared with several existing generative models, and the authors validate usefulness of generated data in gesture recognition, including analysis of how synthetic-data ratios affect downstream results.

Abstract

AIGC has shown remarkable success in CV and NLP, and has recently demonstrated promising potential in the wireless domain. However, significant data imbalance exists across RF modalities, with abundant WiFi data but scarce mmWave and RFID data due to high acquisition cost. This makes it difficult to train high-quality generative models for these data-scarce modalities. In this work, we propose RF-CMG, a diffusion-based cross-modal generative method that leverages data-rich WiFi signals to synthesize high-fidelity RF data for scarce modalities including mmWave and RFID. The key insight of RF-CMG is to decouple cross-modal generation into high-frequency guidance and low-frequency constraint, which respectively learn high-frequency distribution from limited target modality data and preserve the underlying physical structure via low-frequency constraints during generation. On this basis, we introduce a Modality-Guided Embedding (MGE) module to steer the reverse diffusion trajectory toward the target high-frequency distribution, and a Low-Frequency Modality Consistency (LFMC) module to progressively enforce low-frequency constraints to suppress the accumulation of source-modality structural biases during inference, enabling high-quality target-modality generation. Performance comparison with several prevalent generative models demonstrates that RF-CMG achieves superior performance in synthesizing RFID and mmWave signals. We further showcase the effectiveness of the data generated by RF-CMG in gesture recognition tasks, and analyze the impact of the proportion of synthetic data on downstream performance.

Every time a new model comes out, the old one is obsolete of course

Reddit r/LocalLLaMA

We built it during the NVIDIA DGX Spark Full-Stack AI Hackathon — and it ended up winning 1st place overall 🏆

Dev.to

Stop Losing Progress: Setting Up a Pro Jupyter Workflow in VS Code (No More Colab Timeouts!)

Dev.to

Building AgentOS: Why I’m Building the AWS Lambda for Insurance Claims

Dev.to

Where we are. In a year, everything has changed. Kimi - Minimax - Qwen - Gemma - GLM

Reddit r/LocalLLaMA

Cross-Modal Generation: From Commodity WiFi to High-Fidelity mmWave and RFID Sensing

Key Points

Abstract

Related Articles

Every time a new model comes out, the old one is obsolete of course

We built it during the NVIDIA DGX Spark Full-Stack AI Hackathon — and it ended up winning 1st place overall 🏆

Stop Losing Progress: Setting Up a Pro Jupyter Workflow in VS Code (No More Colab Timeouts!)

Building AgentOS: Why I’m Building the AWS Lambda for Insurance Claims

Where we are. In a year, everything has changed. Kimi - Minimax - Qwen - Gemma - GLM

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer