Grammar of the Wave: Towards Explainable Multivariate Time Series Event Detection via Neuro-Symbolic VLM Agents

arXiv cs.LG / 3/13/2026



Key Points

The paper introduces Knowledge-Guided TSED, a setting in which a model grounds a natural-language event description to intervals in multivariate signals using little or no labeled data.
It proposes the Event Logic Tree (ELT), a knowledge representation that links linguistic descriptions to time series data by modeling the intrinsic temporal-logic structure of events.
It presents a neuro-symbolic VLM agent framework that iteratively instantiates primitives from signal visualizations and composes them under ELT constraints, producing both detected intervals and explanations in the form of instantiated trees.
It releases a benchmark built on real-world time series data with expert knowledge and annotations. Experiments and human evaluation show the method outperforms supervised fine-tuning baselines and existing zero-shot LLM/VLM time series reasoning frameworks, and that ELT helps mitigate VLM hallucination.

Abstract

Time Series Event Detection (TSED) has long been an important task with critical applications across many high-stakes domains. Unlike statistical anomalies, events are defined by semantics with complex internal structures, which are difficult to learn inductively from scarce labeled data in real-world settings. In light of this, we introduce Knowledge-Guided TSED, a new setting where a model is given a natural-language event description and must ground it to intervals in multivariate signals with little or no training data. To tackle this challenge, we introduce Event Logic Tree (ELT), a novel knowledge representation framework to bridge linguistic descriptions and physical time series data via modeling the intrinsic temporal-logic structures of events. Based on ELT, we present a neuro-symbolic VLM agent framework that iteratively instantiates primitives from signal visualizations and composes them under ELT constraints, producing both detected intervals and faithful explanations in the form of instantiated trees. To validate the effectiveness of our approach, we release a benchmark based on real-world time series data with expert knowledge and annotations. Experiments and human evaluation demonstrate the superiority of our method compared to supervised fine-tuning baselines and existing zero-shot time series reasoning frameworks based on LLMs/VLMs. We also show that ELT is critical in mitigating VLMs' inherent hallucination in matching signal morphology with event semantics.
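
To make the idea of composing primitives under temporal-logic constraints more concrete, here is a minimal Python sketch. It is not the paper's implementation: the class names (`Primitive`, `Then`, `During`), the two operators, the `max_gap` parameter, and the example signals are all hypothetical, and the candidate intervals that a VLM would read off signal plots are simply hard-coded.

```python
from dataclasses import dataclass
from typing import List, Tuple

# An interval is (start, end) in sample indices, start inclusive, end exclusive.
Interval = Tuple[int, int]


@dataclass
class Primitive:
    """Leaf node: a named signal pattern with candidate intervals.

    Assumption: in a real system a VLM would propose these candidates
    from a plot of the signal; here they are supplied by hand.
    """
    name: str
    candidates: List[Interval]

    def instantiate(self) -> List[Interval]:
        return sorted(self.candidates)


@dataclass
class Then:
    """Hypothetical operator: each child starts within max_gap after the previous one ends."""
    children: list
    max_gap: int

    def instantiate(self) -> List[Interval]:
        results = self.children[0].instantiate()
        for child in self.children[1:]:
            nxt = child.instantiate()
            results = [
                (s, e2)
                for (s, e) in results
                for (s2, e2) in nxt
                if 0 <= s2 - e <= self.max_gap
            ]
        return sorted(set(results))


@dataclass
class During:
    """Hypothetical operator: all children co-occur; yields their intersection."""
    children: list

    def instantiate(self) -> List[Interval]:
        results = self.children[0].instantiate()
        for child in self.children[1:]:
            nxt = child.instantiate()
            results = [
                (max(s, s2), min(e, e2))
                for (s, e) in results
                for (s2, e2) in nxt
                if max(s, s2) < min(e, e2)
            ]
        return sorted(set(results))


# Illustrative event: "a pressure drop followed shortly by a temperature rise,
# while vibration is elevated". All names and numbers are made up.
pressure_drop = Primitive("pressure_drop", [(10, 20), (50, 60)])
temp_rise = Primitive("temp_rise", [(22, 30), (90, 100)])
vibration = Primitive("vibration_high", [(25, 40)])

event = During([Then([pressure_drop, temp_rise], max_gap=5), vibration])
detected = event.instantiate()
print(detected)
```

In this toy setup, only candidates that satisfy every constraint in the tree survive, which is one plausible way a symbolic structure could filter out spurious pattern matches proposed by a vision-language model.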

