A Domain-Specific Language for LLM-Driven Trigger Generation in Multimodal Data Collection

arXiv cs.LG / 4/16/2026

Key Points

  • The paper argues that multimodal data collection should be active and selective rather than rely on passive, continuous logging, which incurs high storage costs and captures irrelevant data.
  • It introduces a declarative, DSL-based framework where users express intent in natural language and an LLM converts it into verifiable DSL programs defining conditional sensor triggers.
  • The approach supports heterogeneous sensors (e.g., cameras, LiDAR, and system telemetry) through composable trigger definitions that can be deployed on edge devices.
  • Experiments on vehicular and robotic perception tasks show improved generation consistency and reduced execution latency versus unconstrained LLM code generation while keeping detection performance comparable.
  • The DSL abstraction is designed to enable modular composition and concurrent deployment under resource constraints for real-time multimodal systems.
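
To make the idea of composable trigger definitions concrete, here is a minimal Python sketch. It is not the paper's DSL, whose syntax is not shown in this summary. The `Trigger` class, the `above` and `below` helpers, and the sensor keys such as `camera.pedestrian_conf` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

# One snapshot of sensor-derived values; the keys are hypothetical.
Reading = Mapping[str, float]


@dataclass(frozen=True)
class Trigger:
    """A named boolean condition over a sensor reading (illustrative only)."""
    name: str
    predicate: Callable[[Reading], bool]

    def __call__(self, reading: Reading) -> bool:
        return self.predicate(reading)

    def __and__(self, other: "Trigger") -> "Trigger":
        # Fires only when both sub-triggers fire.
        return Trigger(f"({self.name} AND {other.name})",
                       lambda r: self(r) and other(r))

    def __or__(self, other: "Trigger") -> "Trigger":
        # Fires when either sub-trigger fires.
        return Trigger(f"({self.name} OR {other.name})",
                       lambda r: self(r) or other(r))


def above(key: str, threshold: float) -> Trigger:
    # A missing key never fires the trigger.
    return Trigger(f"{key} > {threshold}",
                   lambda r: r.get(key, float("-inf")) > threshold)


def below(key: str, threshold: float) -> Trigger:
    # A missing key never fires the trigger.
    return Trigger(f"{key} < {threshold}",
                   lambda r: r.get(key, float("inf")) < threshold)


# Compose camera, LiDAR, and telemetry conditions into one recording trigger.
pedestrian_near = above("camera.pedestrian_conf", 0.8) & below("lidar.min_range_m", 5.0)
hard_brake = below("telemetry.accel_mps2", -4.0)
record = pedestrian_near | hard_brake

sample = {"camera.pedestrian_conf": 0.9, "lidar.min_range_m": 3.0}
should_record = record(sample)
```

Because each trigger is a small value built from other triggers, conditions on different sensors can be defined separately and combined later, which is the property the key points describe as modular composition.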

Abstract

Data-driven systems depend on task-relevant data, yet data collection pipelines remain passive and indiscriminate. Continuous logging of multimodal sensor streams incurs high storage costs and captures irrelevant data. This paper proposes a declarative framework for intent-driven, on-device data collection that enables selective collection of multimodal sensor data based on high-level user requests. The framework combines natural language interaction with a formally specified domain-specific language (DSL). Large language models translate user-defined requirements into verifiable and composable DSL programs that define conditional triggers across heterogeneous sensors, including cameras, LiDAR, and system telemetry. Empirical evaluation on vehicular and robotic perception tasks shows that the DSL-based approach achieves higher generation consistency and lower execution latency than unconstrained code generation while maintaining comparable detection performance. The structured abstraction supports modular trigger composition and concurrent deployment on resource-constrained edge platforms. This approach replaces passive logging with a verifiable, intent-driven mechanism for multimodal data collection in real-time systems.
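
One reason a constrained DSL can be "verifiable" where free-form generated code is not: a program in a small, fixed grammar can be checked structurally before it is deployed. The sketch below illustrates that idea under stated assumptions. The JSON encoding, the operator sets, and the `ALLOWED_FIELDS` whitelist are hypothetical and are not taken from the paper.

```python
import json

# Hypothetical whitelist of sensor fields and operators a trigger may use.
ALLOWED_FIELDS = {"camera.pedestrian_conf", "lidar.min_range_m", "telemetry.speed_mps"}
COMPARATORS = {">", "<", ">=", "<="}
COMBINATORS = {"and", "or"}


def validate(node, path="root"):
    """Return a list of error strings; an empty list means the program is well-formed."""
    if not isinstance(node, dict):
        return [f"{path}: expected an object"]
    op = node.get("op")
    if not isinstance(op, str):
        return [f"{path}: missing or non-string 'op'"]
    if op in COMBINATORS:
        args = node.get("args")
        if not isinstance(args, list) or len(args) < 2:
            return [f"{path}: '{op}' needs at least two arguments"]
        errors = []
        for i, child in enumerate(args):
            errors.extend(validate(child, f"{path}.args[{i}]"))
        return errors
    if op in COMPARATORS:
        errors = []
        field = node.get("field")
        if not isinstance(field, str) or field not in ALLOWED_FIELDS:
            errors.append(f"{path}: unknown field {field!r}")
        value = node.get("value")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{path}: value must be a number")
        return errors
    return [f"{path}: unknown operator {op!r}"]


# Stand-in for text returned by an LLM; a real system would receive this at runtime.
generated = (
    '{"op": "and", "args": ['
    '{"op": ">", "field": "camera.pedestrian_conf", "value": 0.8},'
    '{"op": "<", "field": "lidar.min_range_m", "value": 5}]}'
)
errors = validate(json.loads(generated))
```

A program that references an unknown sensor or an unsupported operator is rejected with a path to the offending node, so malformed LLM output can be caught before it reaches an edge device.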