UAIV

Multi-scenario · Multi-modal · Multi-condition · Real-world UAV Intelligence
Jiening Zhang¹, Yangming Zhang¹, Pengwei Yang¹, Yupeng Gao¹, Xi Wu¹, Junyu Liu¹,
Guoqing Wang¹,²*, Tianyu Li¹*, Yang Yang¹,²
¹ School of Computer Science and Engineering, University of Electronic Science and Technology of China
² Sichuan Artificial Intelligence Institute
* Corresponding authors
📄 Paper (Coming) 💻 GitHub 📦 Download Dataset

✨ Visual Teaser

Figure: sample gallery (two general scenes, dark, IR, mining, pollution).

🔥 Highlights

🚀 17B+ annotations
🌐 RGB / IR cross-modal alignment
🌧 Multi-weather paired data (rain, snow, fog)
🕒 Multi-temporal alignment (change detection & Re-ID)
✈ Flexible UAV acquisition (multi-altitude, multi-route)
🧠 Unified multi-task data paradigm
🏙 Real-world urban governance deployment

📖 Overview

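UAIV is a large-scale low-altitude multimodal dataset designed for urban intelligence and fine-grained governance analysis, with a strong focus on scene understanding, spatio-temporal reasoning, and physics-aware image restoration. Unlike conventional UAV datasets, UAIV emphasizes RGB/IR cross-modal alignment, paired multi-weather data, multi-temporal alignment, flexible multi-altitude and multi-route acquisition, and a unified multi-task data paradigm.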

The dataset is built upon a self-operated UAV acquisition system, covering long-term, large-scale real urban environments with strong diversity in scene types (urban/rural/industrial/natural), illumination (day/night), weather (rain/snow/fog/haze), and flight altitudes.

17B+ Annotations · RGB+IR Paired Modalities · 4+ Weather Conditions · Multi-altitude Flexible Routes

🧠 Dataset Pipeline


UAV acquisition → multi-modal alignment → annotation → multi-task learning


🧩 Task Definitions

1. Scene Understanding

Tasks: Scene classification, semantic/instance segmentation, object counting, OCR, environment understanding.

UAIV emphasizes context-rich, governance-oriented perception. Each sample corresponds to a real urban governance scenario, such as illegal construction, open burning, environmental pollution, or infrastructure monitoring. Aligned multi-modal signals (RGB, IR, metadata) enable joint reasoning about object presence, scene semantics, and environmental status.

Figure: scene understanding examples, including agricultural, commercial, burning, forest, counting, dark, IR, excavator, fire, farmland, excavation, lake, mining, pollution, OCR, and river pollution scenes.
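
The aligned RGB, IR, and metadata signals described above can be indexed per capture before training. The following is a minimal sketch, not the official loader: the folder layout (`rgb/`, `ir/`, `meta/`), the shared file stem used as a sample ID, and the names `AlignedSample` and `index_samples` are all assumptions made for illustration. Check the released subset for the actual directory structure.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Optional

# Assumed layout, NOT the official UAIV structure:
#   rgb/<id>.jpg, ir/<id>.jpg, meta/<id>.json
MODALITIES = ("rgb", "ir", "meta")


@dataclass
class AlignedSample:
    """One capture with its aligned modalities (hypothetical structure)."""
    sample_id: str
    rgb: Optional[str] = None
    ir: Optional[str] = None
    meta: Optional[str] = None

    def is_complete(self) -> bool:
        # A sample is usable for joint reasoning only if all modalities exist.
        return None not in (self.rgb, self.ir, self.meta)


def index_samples(paths: Iterable[str]) -> List[AlignedSample]:
    """Group file paths by shared stem so each sample carries all modalities."""
    by_id: Dict[str, AlignedSample] = {}
    for raw in paths:
        p = PurePosixPath(raw)
        modality = p.parent.name
        if modality not in MODALITIES:
            continue  # skip files outside the assumed modality folders
        sample = by_id.setdefault(p.stem, AlignedSample(sample_id=p.stem))
        setattr(sample, modality, raw)
    return [by_id[key] for key in sorted(by_id)]


files = [
    "rgb/0001.jpg", "ir/0001.jpg", "meta/0001.json",
    "rgb/0002.jpg", "meta/0002.json",  # IR frame missing for this capture
]
samples = index_samples(files)
complete = [s.sample_id for s in samples if s.is_complete()]
```

Filtering on `is_complete()` keeps only captures where every modality is present, which is one simple way to guarantee that a batch can be used for cross-modal tasks.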

2. Spatio-Temporal Learning

Tasks: Change detection, cross-time Re-ID.

UAIV provides strictly aligned multi-temporal data across different time spans, flight routes, and environmental conditions. For change detection, this alignment helps separate true semantic changes from appearance variations caused by illumination or weather. Cross-time Re-ID enables identity-consistent learning under viewpoint and altitude shifts.

Figure: change detection and Re-ID samples.
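
One way to check whether a model separates true changes from appearance variations is to score its binary change mask against the annotation. The sketch below computes precision, recall, F1, and IoU for the "changed" class. The mask format (nested lists of 0/1) and the function name `change_metrics` are assumptions for illustration; the official benchmark protocol has not been released yet.

```python
from typing import Dict, Sequence

Mask = Sequence[Sequence[int]]  # 1 = changed pixel, 0 = unchanged


def change_metrics(pred: Mask, target: Mask) -> Dict[str, float]:
    """Pixel-level precision, recall, F1, and IoU for the changed class."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1   # real change, correctly flagged
            elif p and not t:
                fp += 1   # flagged, but only appearance differed
            elif t and not p:
                fn += 1   # real change that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy 2x3 masks: one false alarm (top-left) and one missed change (bottom-right).
target = [[0, 1, 1],
          [0, 0, 1]]
pred = [[1, 1, 1],
        [0, 0, 0]]
metrics = change_metrics(pred, target)
```

A model that reacts to illumination or weather differences produces false positives, which lowers precision and IoU even when recall stays high.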

3. Physics-Aware Image Restoration

Tasks: Rain/snow/fog removal, cross-weather translation, weather-aware degradation modeling.

UAIV provides real-world paired and unpaired weather-degraded data (rain, snow, fog, haze) captured under consistent UAV trajectories. The data preserve physical consistency in illumination variation, atmospheric scattering, and sensor response differences, which supports recovering intrinsic scene properties rather than only translating appearance.

Figure: weather examples (fog, haze, original, rain, snow).
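
To make the atmospheric scattering term concrete, the sketch below uses the standard scattering model common in the dehazing literature, I = J·t + A·(1 − t) with transmission t = exp(−β·d). This is a general textbook formulation, not a description of how UAIV data were produced or of any UAIV baseline. The per-pixel depth values, the scattering coefficient `beta`, and the airlight `A` are hypothetical inputs chosen for illustration.

```python
import math
from typing import List


def transmission(depth_m: float, beta: float) -> float:
    """Fraction of scene radiance that survives scattering over depth_m."""
    return math.exp(-beta * depth_m)


def add_haze(clear: List[float], depth: List[float],
             beta: float, airlight: float) -> List[float]:
    """Forward model: I = J * t + A * (1 - t), applied per pixel."""
    hazy = []
    for j, d in zip(clear, depth):
        t = transmission(d, beta)
        hazy.append(j * t + airlight * (1.0 - t))
    return hazy


def remove_haze(hazy: List[float], depth: List[float], beta: float,
                airlight: float, t_min: float = 0.1) -> List[float]:
    """Inverse model: J = (I - A) / max(t, t_min) + A."""
    restored = []
    for i, d in zip(hazy, depth):
        # Clamp t so that distant pixels do not amplify noise without bound.
        t = max(transmission(d, beta), t_min)
        restored.append((i - airlight) / t + airlight)
    return restored


# Hypothetical intensities in [0, 1] and depths in metres.
clear = [0.2, 0.5, 0.8]
depth = [50.0, 100.0, 200.0]
beta, airlight = 0.01, 0.9

hazy = add_haze(clear, depth, beta, airlight)
restored = remove_haze(hazy, depth, beta, airlight)
```

With known depth and airlight the inversion recovers the clear values exactly as long as the transmission stays above `t_min`. In practice both quantities must be estimated, which is where real paired captures along consistent trajectories become useful as supervision.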

🚀 Applications & Real‑World Impact

UAIV has been deployed in real-world urban governance systems using Xuzhou (Jiangsu Province, China) as a pilot region, achieving full coverage of the Huaihai Economic Zone. Based on UAIV, we developed:

The system supports automated analysis for illegal construction detection, environmental anomaly monitoring, and infrastructure status assessment. Reported benefits include improved semantic understanding under complex conditions, higher event detection accuracy, lower manual inspection costs, and faster decision-making.

📢 User Feedback: “UAIV provides strong coverage of complex real-world scenarios, robust representation of multi-scale objects, and high-quality semantic consistency — a reliable foundation for large-scale model training and deployment.”


📊 Dataset Description (Detailed)

UAIV is a large-scale multimodal dataset designed for urban fine-grained governance and low-altitude remote sensing intelligence.

🧪 Benchmark (Coming Soon)

Standardized benchmarks and baselines for scene understanding, change detection, and image restoration will be released in future versions.


📥 Data Access

Current status: Partially released · Open subset (~30%) available now.
Full dataset will be released in future versions.

👉 ScienceDB (official release): https://www.scidb.cn/detail?dataSetId=203705443be44f7882bb9ddfd7d401da
👉 GitHub: https://github.com/JennyZhang0810/LowAltitude-Multimodal-Dataset
👉 Project Page: https://jennyzhang0810.github.io/LowAltitude-Multimodal-Dataset/

📌 The dataset is officially released on ScienceDB. Please use the ScienceDB link for data download.


🙏 Acknowledgements

This dataset was designed and led by the author, covering the full pipeline of data acquisition, organization, and annotation system design. We sincerely thank:


📄 MIT License · UAIV Project · Open for Research & Industrial Collaboration

© 2025 UAIV Team | Low-Altitude Urban Intelligence Dataset