
Click-to-Ask: An AI Live Streaming Assistant with Offline Copywriting and Online Interactive QA

arXiv cs.CV / 3/20/2026


Key Points

  • Click-to-Ask is an AI-driven live streaming assistant that combines offline processing of multimodal product information with an online module for real-time viewer interactions.
  • The offline component converts diverse inputs into structured product data and generates compliant promotional copywriting.
  • The online component lets the streamer click a viewer question during the broadcast and answers it in real time, drawing on the structured product data and an event-level historical memory maintained in a streaming architecture.
  • On the authors' collected dataset of TikTok live stream frames, the method achieves a Question Recognition Accuracy of 0.913 and a Response Quality score of 0.876, which the authors present as evidence of practical potential.
  • According to the authors, the system reduces promotional preparation time, increases content engagement, and enables prompt responses to viewer questions in live streaming commerce.
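
The split between the offline and online components can be pictured as a small data-flow sketch. Everything in it is a hypothetical illustration rather than the paper's implementation: `ProductRecord`, `build_record`, and `answer_from_record` are assumed names, the "key: value" input format is invented for the example, and the keyword lookup merely stands in for whatever model the system actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class ProductRecord:
    """Hypothetical structured product data produced by the offline step."""
    name: str
    attributes: dict[str, str] = field(default_factory=dict)


def build_record(name: str, raw_facts: list[str]) -> ProductRecord:
    """Offline step (assumed): turn 'key: value' strings into a structured record."""
    attributes = {}
    for fact in raw_facts:
        key, sep, value = fact.partition(":")
        if sep:  # ignore lines that are not in 'key: value' form
            attributes[key.strip().lower()] = value.strip()
    return ProductRecord(name=name, attributes=attributes)


def answer_from_record(record: ProductRecord, question: str) -> str:
    """Online step (assumed): answer a clicked question from the structured data."""
    lowered = question.lower()
    for key, value in record.attributes.items():
        if key in lowered:
            return f"{record.name} {key}: {value}"
    return "No matching product detail found."


# Offline: prepare the record once, before the broadcast.
record = build_record("Trail Bottle", ["Price: $19", "Material: steel", "bad line"])

# Online: the streamer clicks a viewer question during the broadcast.
reply = answer_from_record(record, "What is the price?")
```

The point of the sketch is only the ordering: the expensive structuring work happens once before the stream, so the online step reduces to a fast lookup against prepared data.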

Abstract

Live streaming commerce has become a prominent form of broadcasting in the modern era. To facilitate more efficient and convenient product promotions for streamers, we present Click-to-Ask, an AI-driven assistant for live streaming commerce with complementary offline and online components. The offline module processes diverse multimodal product information, transforming complex inputs into structured product data and generating compliant promotional copywriting. During live broadcasts, the online module enables real-time responses to viewer inquiries by allowing streamers to click on questions and leveraging both the structured product information generated by the offline module and an event-level historical memory maintained in a streaming architecture. This system significantly reduces the time needed for promotional preparation, enhances content engagement, and enables prompt interaction with audience inquiries, ultimately improving the effectiveness of live streaming commerce. On our collected dataset of TikTok live stream frames, the proposed method achieves a Question Recognition Accuracy of 0.913 and a Response Quality score of 0.876, demonstrating considerable potential for practical application. The video demonstration can be viewed here: https://www.youtube.com/shorts/mWIXK-SWhiE.
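
One way to read "event-level historical memory maintained in a streaming architecture" is as a bounded buffer of recent stream events that is consulted whenever the streamer clicks a question. The sketch below works under that assumption and is not taken from the paper: `Event`, `EventMemory`, `on_click`, and the "question"/"answer" labels are all hypothetical names chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One stream event; the 'kind' labels are assumed, not from the paper."""
    timestamp: float
    kind: str  # e.g. "question" or "answer"
    text: str


class EventMemory:
    """Bounded memory: once full, the oldest event is dropped automatically."""

    def __init__(self, max_events: int = 100) -> None:
        self._events: deque[Event] = deque(maxlen=max_events)

    def add(self, event: Event) -> None:
        self._events.append(event)

    def recent(self, kind: str, limit: int = 3) -> list[Event]:
        """Return up to `limit` most recent events of the given kind, oldest first."""
        matches = [e for e in self._events if e.kind == kind]
        return matches[-limit:]


def on_click(memory: EventMemory, question: str, timestamp: float) -> dict:
    """Assemble the context for a clicked question, then record the click itself."""
    context = [e.text for e in memory.recent("answer")]
    memory.add(Event(timestamp, "question", question))
    return {"question": question, "context": context}


memory = EventMemory(max_events=2)
memory.add(Event(1.0, "answer", "Ships in 3 days."))
request = on_click(memory, "Does it ship abroad?", 2.0)
```

Here `request` would be handed to the answering model together with the offline product data; capping the buffer keeps the per-click cost constant however long the broadcast runs.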