SiriusHelper: An LLM Agent-Based Operations Assistant for Big Data Platforms

arXiv cs.AI / 5/4/2026


Key Points

  • SiriusHelper is an LLM agent-based operations assistant designed for in-production big data platforms to provide faster, actionable guidance and reduce user operational burden.
  • It improves real-world effectiveness by identifying user intent and routing requests to specialized expert workflows, including domain troubleshooting such as SQL execution diagnosis.
  • The system enhances troubleshooting quality and speed using DeepSearch-driven multi-hop retrieval paired with a priority-based hierarchical knowledge base to avoid context overload.
  • It lowers maintenance cost and expert workload through automated ticket understanding and SOP distillation: it extracts standard operating procedures (SOPs) from escalated tickets to continuously enrich the knowledge base.
  • In experiments and an online deployment on Tencent’s Big Data platform, SiriusHelper reportedly outperforms baseline alternatives and reduces online ticket volume by 20.8%.
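
The intent-identification and routing step described in the key points can be illustrated with a minimal Python sketch. The paper summary does not give implementation details, so every name here (classify_intent, ROUTES, the two handlers) is hypothetical, and a simple keyword matcher stands in for the LLM-based intent classifier.

```python
from typing import Callable, Dict

# Hypothetical sketch: a keyword matcher stands in for the LLM intent classifier.
SQL_SYMPTOMS = ("fail", "error", "slow", "timeout")


def classify_intent(query: str) -> str:
    """Return a coarse intent label for a user query (assumed label set)."""
    q = query.lower()
    if "sql" in q and any(word in q for word in SQL_SYMPTOMS):
        return "sql_diagnosis"
    return "general_consultation"


def sql_diagnosis_workflow(query: str) -> str:
    # Placeholder for a dedicated expert workflow (e.g. SQL execution diagnosis).
    return f"[sql_diagnosis] collecting execution details for: {query}"


def general_consultation(query: str) -> str:
    # Placeholder for the general retrieval-augmented answering path.
    return f"[general_consultation] searching knowledge base for: {query}"


# Hypothetical routing table: intent label -> handler.
ROUTES: Dict[str, Callable[[str], str]] = {
    "sql_diagnosis": sql_diagnosis_workflow,
    "general_consultation": general_consultation,
}


def route(query: str) -> str:
    """Identify the intent, then dispatch to the matching handler."""
    intent = classify_intent(query)
    # Unknown labels fall back to general consultation.
    handler = ROUTES.get(intent, general_consultation)
    return handler(query)
```

Keeping the routing table separate from the classifier means a new expert workflow can be registered without touching the classification logic, which is one plausible way to grow scenario coverage over time.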

Abstract

Big data platforms are widely used in modern enterprises, and an in-production intelligent assistant is increasingly important to help users quickly find actionable guidance and reduce operational burden. While recent LLM+RAG assistants provide a natural interface, they face practical challenges in real deployments: limited scenario coverage across both general consultation and domain-specific troubleshooting workflows, inefficient knowledge access due to inadequate multi-hop retrieval and flat knowledge organization, and high maintenance cost because escalated tickets are unstructured and hard to convert into assistant improvements and reusable SOPs. In this paper, we present SiriusHelper, a deployed intelligent assistant for big data platforms. SiriusHelper serves as a unified online assistant that automatically identifies user intent and routes queries to the right handling path, including dedicated expert workflows for specialized scenarios (e.g., SQL execution diagnosis). To support complex troubleshooting, SiriusHelper combines a DeepSearch-driven mechanism with a priority-based hierarchical knowledge base to enable multi-hop retrieval without context overload, thus improving answer reliability and latency. To reduce expert overhead, SiriusHelper further introduces automated ticket understanding and SOP distillation: it diagnoses the assistant failure reason (e.g., missing knowledge or wrong routing) and extracts domain-specific SOPs to continuously enrich the knowledge base. Experiments and online deployment on the Tencent Big Data platform show that SiriusHelper outperforms representative alternatives and reduces online ticket volume by 20.8%.
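
To make the idea of multi-hop retrieval over a priority-based knowledge base more concrete, here is a rough Python sketch. It is not the paper's DeepSearch mechanism: the Entry fields, the "lower number means higher priority" convention, the word-overlap matching, and the character budget are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    title: str
    text: str
    priority: int  # assumption: lower number = higher priority
    links: List[str] = field(default_factory=list)  # titles of related entries


def retrieve(kb: Dict[str, Entry], query: str,
             max_hops: int = 2, budget_chars: int = 200) -> List[str]:
    """Collect entry titles hop by hop, highest priority first, within a size budget."""
    terms = set(query.lower().split())
    # Hop 0: entries whose text shares at least one whole word with the query.
    frontier = [e for e in kb.values() if terms & set(e.text.lower().split())]
    selected: List[str] = []
    seen = set()
    used = 0
    for _ in range(max_hops + 1):
        next_frontier: List[Entry] = []
        for entry in sorted(frontier, key=lambda e: e.priority):
            if entry.title in seen:
                continue
            seen.add(entry.title)
            # Skip entries that would push the context past the budget.
            if used + len(entry.text) > budget_chars:
                continue
            selected.append(entry.title)
            used += len(entry.text)
            # Follow links from accepted entries on the next hop.
            next_frontier.extend(kb[t] for t in entry.links if t in kb)
        frontier = next_frontier
        if not frontier:
            break
    return selected


# Hypothetical toy knowledge base: an SOP, a linked reference doc, and a long FAQ.
EXAMPLE_KB = {
    "sop_oom": Entry("sop_oom", "executor oom during shuffle: raise memory overhead",
                     0, ["doc_memory"]),
    "doc_memory": Entry("doc_memory",
                        "memory overhead is set per executor in job config", 1),
    "faq_misc": Entry("faq_misc", "general notes about shuffle " + "x" * 500, 2),
}
```

With this toy data, retrieve(EXAMPLE_KB, "shuffle oom") accepts the high-priority SOP first, skips the long FAQ because it exceeds the budget, and then reaches the linked memory document on the second hop. That is the intended effect of combining priorities with a budget: related knowledge is still reachable, but low-priority bulk does not crowd the context.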