Towards Automatic Soccer Commentary Generation with Knowledge-Enhanced Visual Reasoning

arXiv cs.AI / 4/2/2026



Key Points

The paper argues that end-to-end automatic soccer commentary generation falls short of real live televised commentary because its output contains anonymous entities and context-dependent errors and lacks statistical insight.
It introduces GameSight, a two-stage system that first performs knowledge-enhanced visual reasoning to align anonymous entity mentions (players and teams) using fine-grained visual and contextual analysis.
GameSight then refines the entity-aligned commentary by injecting external historical statistics and iteratively updating an internal game state to improve factuality and relevance.
Reported results show an 18.5% improvement in player alignment accuracy on the SN-Caption-test-align dataset versus Gemini 2.5-pro, along with gains in segment-level accuracy, commentary quality, and game-level contextual relevance.
The work positions this approach as a step toward more informative, human-centric AI sports experiences and links to a demo page.
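
The two-stage flow described in the points above can be pictured with a toy sketch. It is not the authors' implementation: `GameState`, `align_entities`, `refine`, the placeholder format `[PLAYER]`/`[TEAM]`, and all sample data are hypothetical, and the visual reasoning of stage 1 is replaced by a ready-made lookup table.

```python
from dataclasses import dataclass, field


@dataclass
class GameState:
    """Hypothetical running game state; the real state format is not described here."""
    score: dict = field(default_factory=dict)   # team name -> goals so far
    events: list = field(default_factory=list)  # (player, event_type) history

    def update(self, team, player, event_type):
        self.events.append((player, event_type))
        if event_type == "goal":
            self.score[team] = self.score.get(team, 0) + 1


def align_entities(anonymous_text, resolved):
    """Stage 1 stand-in: swap placeholders for names.
    In the described system the names would come from visual reasoning."""
    for placeholder, name in resolved.items():
        anonymous_text = anonymous_text.replace(placeholder, name)
    return anonymous_text


def refine(aligned_text, segment, state, history_stats):
    """Stage 2 stand-in: update the game state, then append state and external stats."""
    state.update(segment["team"], segment["player"], segment["event"])
    parts = [aligned_text]
    if segment["event"] == "goal":
        score = ", ".join(f"{t} {g}" for t, g in sorted(state.score.items()))
        parts.append(f"Score: {score}.")
    if segment["player"] in history_stats:
        parts.append(history_stats[segment["player"]])
    return " ".join(parts)


def generate_commentary(segments, history_stats):
    """Run both stages over the segments in order, carrying one state across them."""
    state = GameState()
    return [
        refine(align_entities(s["text"], s["resolved"]), s, state, history_stats)
        for s in segments
    ]


# Made-up sample data, for illustration only.
example_stats = {"Player A": "Player A has 3 goals this season."}
example_segments = [
    {"text": "[PLAYER] from [TEAM] scores.", "team": "Team X", "player": "Player A",
     "event": "goal", "resolved": {"[PLAYER]": "Player A", "[TEAM]": "Team X"}},
    {"text": "[PLAYER] from [TEAM] scores.", "team": "Team X", "player": "Player B",
     "event": "goal", "resolved": {"[PLAYER]": "Player B", "[TEAM]": "Team X"}},
]

for line in generate_commentary(example_segments, example_stats):
    print(line)
```

The second segment reports a score of 2 only because the state from the first segment is carried forward, which is the role the iteratively updated game state plays in the description above.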

Abstract

Soccer commentary plays a crucial role in enhancing the soccer game viewing experience for audiences. Previous studies in automatic soccer commentary generation typically adopt an end-to-end method to generate anonymous live text commentary. Such generated commentary is insufficient in the context of real-world live televised commentary, as it contains anonymous entities and context-dependent errors and lacks statistical insight into game events. To bridge the gap, we propose GameSight, a two-stage model that addresses soccer commentary generation as a knowledge-enhanced visual reasoning task, enabling knowledgeable, live-televised-style commentary with accurate reference to entities (players and teams). GameSight starts by performing visual reasoning to align anonymous entities with fine-grained visual and contextual analysis. Subsequently, the entity-aligned commentary is refined with knowledge by incorporating external historical statistics and iteratively updated internal game state information. Consequently, GameSight improves the player alignment accuracy by 18.5% on the SN-Caption-test-align dataset compared to Gemini 2.5-pro. Combined with further knowledge enhancement, GameSight outperforms in segment-level accuracy and commentary quality, as well as game-level contextual relevance and structural composition. We believe that our work paves the way for a more informative and engaging human-centric experience with AI sports applications. Demo Page: https://gamesight2025.github.io/gamesight2025
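
To make the player alignment figure concrete, here is a minimal sketch of how such an accuracy could be computed, assuming it is the fraction of anonymous mentions whose predicted name matches the reference name. That definition, the function names, and the sample numbers are assumptions, not taken from the paper. The abstract also does not say whether the 18.5% gain is absolute (percentage points) or relative, so the sketch shows both readings.

```python
def alignment_accuracy(predicted, reference):
    """Fraction of mentions whose predicted player name equals the reference name."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


def absolute_gain(system_acc, baseline_acc):
    """Difference in accuracy, as a fraction (multiply by 100 for percentage points)."""
    return system_acc - baseline_acc


def relative_gain(system_acc, baseline_acc):
    """Improvement as a fraction of the baseline accuracy."""
    return (system_acc - baseline_acc) / baseline_acc


# Made-up example: four mentions, three aligned correctly.
predicted = ["Player A", "Player B", "Player C", "Player D"]
reference = ["Player A", "Player B", "Player X", "Player D"]
system_acc = alignment_accuracy(predicted, reference)  # 0.75
baseline_acc = 0.5                                     # hypothetical baseline

print(absolute_gain(system_acc, baseline_acc))  # 0.25, i.e. 25 percentage points
print(relative_gain(system_acc, baseline_acc))  # 0.5, i.e. 50% relative
```

The same pair of accuracies gives very different headline numbers under the two readings, which is why the definition matters when comparing against a baseline.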
