PiLoT: Neural Pixel-to-3D Registration for UAV-based Ego and Target Geo-localization

arXiv cs.CV / 3/24/2026


Key Points

  • PiLoT is a unified UAV localization framework that directly registers a live video stream to a geo-referenced 3D map to estimate both ego-pose and target geo-localization, reducing reliance on GNSS and separate active sensors.
  • The system improves real-time performance and accuracy using a Dual-Thread Engine that separates map rendering from the localization core to keep latency low while avoiding drift.
  • It introduces a large synthetic training dataset with precise geometric annotations (camera pose and depth maps) to train a lightweight network that generalizes in a zero-shot way from simulation to real-world data.
  • A Joint Neural-Guided Stochastic-Gradient Optimizer (JNGO) is proposed to maintain robust convergence under aggressive UAV motion.
  • Experiments on multiple public and newly collected benchmarks report state-of-the-art performance while achieving over 25 FPS on an NVIDIA Jetson Orin, with code and dataset released on GitHub.
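The Dual-Thread Engine described in the key points can be pictured as a producer-consumer pattern: a rendering thread keeps producing geo-referenced map views while the localization thread always consumes the freshest one, so slow rendering never blocks pose updates. The sketch below is a minimal illustration of that idea only; all class and variable names are hypothetical, not taken from the PiLoT codebase.

```python
import threading
import queue
import time

class DualThreadEngine:
    """Toy sketch of a dual-thread design: a render thread produces map
    views while the localization thread consumes the most recent one."""

    def __init__(self):
        # maxsize=1: the localization thread always sees the freshest render;
        # stale renders are dropped rather than queued up (bounds latency).
        self.render_queue = queue.Queue(maxsize=1)
        self.poses = []
        self.stop = threading.Event()

    def render_thread(self):
        frame_id = 0
        while not self.stop.is_set():
            view = f"rendered_view_{frame_id}"  # placeholder for a 3D-map render
            try:
                self.render_queue.put_nowait(view)
            except queue.Full:
                pass  # drop the stale render instead of blocking
            frame_id += 1
            time.sleep(0.001)  # simulate rendering cost

    def localization_thread(self, n_frames):
        while len(self.poses) < n_frames:
            view = self.render_queue.get()  # latest available map render
            # placeholder: register the live frame against `view` to get a pose
            self.poses.append(("pose_for", view))

    def run(self, n_frames=5):
        renderer = threading.Thread(target=self.render_thread, daemon=True)
        localizer = threading.Thread(target=self.localization_thread,
                                     args=(n_frames,))
        renderer.start()
        localizer.start()
        localizer.join()
        self.stop.set()
        return self.poses

poses = DualThreadEngine().run(5)
print(len(poses))
```

The single-slot queue is the key design choice here: it turns the rendering thread into a best-effort producer, which is one simple way to keep the localization loop's latency independent of rendering speed.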

Abstract

We present PiLoT, a unified framework that tackles UAV-based ego and target geo-localization. Conventional approaches rely on decoupled pipelines that fuse GNSS and Visual-Inertial Odometry (VIO) for ego-pose estimation, and active sensors such as laser rangefinders for target localization. However, these methods are susceptible to failure in GNSS-denied environments and incur substantial hardware costs and complexity. PiLoT breaks this paradigm by directly registering the live video stream against a geo-referenced 3D map. To achieve robust, accurate, and real-time performance, we introduce three key contributions: 1) a Dual-Thread Engine that decouples map rendering from the core localization thread, ensuring low latency while maintaining drift-free accuracy; 2) a large-scale synthetic dataset with precise geometric annotations (camera poses, depth maps), which enables the training of a lightweight network that generalizes in a zero-shot manner from simulation to real data; and 3) a Joint Neural-Guided Stochastic-Gradient Optimizer (JNGO) that achieves robust convergence even under aggressive motion. Evaluations on a comprehensive set of public and newly collected benchmarks show that PiLoT outperforms state-of-the-art methods while running at over 25 FPS on the NVIDIA Jetson Orin platform. Our code and dataset are available at: https://github.com/Choyaa/PiLoT.
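The general idea behind a neural-guided stochastic-gradient optimizer can be illustrated on a toy problem: a network supplies a coarse pose estimate, which stochastic gradient steps on a reprojection-style loss then refine. The sketch below is not the paper's JNGO formulation; the 2D translation-only setup, the loss, and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

true_t = np.array([2.0, -1.0])               # ground-truth 2D camera offset
map_pts = rng.uniform(-5, 5, size=(200, 2))  # geo-referenced landmarks
obs_pts = map_pts + true_t                   # observed landmark positions

def refine_pose(t_init, lr=0.2, steps=100, batch=32):
    """Refine a coarse pose by SGD on a mean-squared reprojection error,
    using a random minibatch of correspondences at each step."""
    t = t_init.copy()
    for _ in range(steps):
        idx = rng.choice(len(map_pts), size=batch, replace=False)
        residual = (map_pts[idx] + t) - obs_pts[idx]  # reprojection-style error
        grad = 2.0 * residual.mean(axis=0)            # gradient of the MSE loss
        t -= lr * grad
    return t

# A coarse "network" prediction plays the role of the neural guidance:
# it places the optimizer inside the basin of convergence.
t_neural = true_t + rng.normal(scale=0.5, size=2)
t_refined = refine_pose(t_neural)
print(np.abs(t_refined - true_t).max() < 1e-3)
```

The division of labor shown here, a learned model for a good initialization and stochastic gradient steps for metric accuracy, is a common way to get convergence under large motion, since the gradient-based refinement alone would fail far from the optimum.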