How people use Copilot for Health

arXiv cs.AI / 4/20/2026


Key Points

  • The study analyzes 500,000+ de-identified health conversations with Microsoft Copilot (Jan 2026 onward) to understand what users ask conversational AI about in healthcare contexts.
  • Researchers built a privacy-preserving LLM-based hierarchical intent taxonomy with 12 primary categories, validated via expert human annotation, and used it to cluster and characterize recurring health themes.
  • A key finding is that nearly 1 in 5 conversations involve personal symptom assessment or condition discussions, and even the largest “general information” group is heavily tied to specific treatments and conditions.
  • Usage patterns differ by audience, time of day, and device: many queries are about others (caregiving), symptom and emotional health questions rise in the evening/night, mobile skews to personal health, while desktop skews to professional/academic work.
  • A significant portion of requests focus on navigating healthcare systems (finding providers, understanding insurance), indicating friction in existing care delivery and the need for platform-specific design and safety for health AI.

Abstract

We analyze over 500,000 de-identified health-related conversations with Microsoft Copilot from January 2026 to characterize what people ask conversational AI about health. We develop a hierarchical intent taxonomy of 12 primary categories using privacy-preserving LLM-based classification validated against expert human annotation, and apply LLM-driven topic clustering to surface prevalent themes within each intent. Using this taxonomy, we characterize the intents and topics behind health queries, identify who these queries are about, and analyze how usage varies by device and time of day. Five findings stand out. First, nearly one in five conversations involves personal symptom assessment or condition discussion, and even the dominant general information category (40%) is concentrated on specific treatments and conditions, suggesting that this is a lower bound on personal health intent. Second, one in seven of these personal health queries concerns someone other than the user, such as a child, a parent, or a partner, suggesting that conversational AI can be a caregiving tool, not just a personal one. Third, personal queries about symptoms and emotional health increase markedly in the evening and nighttime hours, when traditional healthcare is most limited. Fourth, usage diverges sharply by device: mobile concentrates on personal health concerns, while desktop is dominated by professional and academic work. Fifth, a substantial share of queries focuses on navigating healthcare systems, such as finding providers and understanding insurance, highlighting friction in existing healthcare delivery. These patterns have direct implications for platform-specific design, safety considerations, and the responsible development of health AI.
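The classification pipeline the abstract describes, a fixed taxonomy of primary intents with topic clusters beneath each one, applied conversation by conversation, can be sketched as below. The category names, keywords, and keyword matcher are all illustrative stand-ins: the study itself uses a 12-category taxonomy and a privacy-preserving LLM classifier, not keyword matching.

```python
# Illustrative sketch of hierarchical intent classification over a
# two-level taxonomy (primary intent -> topic cluster). Everything here
# is hypothetical: the real study classifies with an LLM, and its
# taxonomy has 12 primary categories.

TAXONOMY: dict[str, dict[str, list[str]]] = {
    "general_information": {
        "treatments": ["medication", "side effect", "dosage"],
        "conditions": ["what is", "causes of"],
    },
    "personal_health": {
        "symptom_assessment": ["my symptoms", "should i worry"],
        "emotional_health": ["anxious", "stressed", "can't sleep"],
    },
    "system_navigation": {
        "finding_providers": ["find a doctor", "specialist near"],
        "insurance": ["insurance", "coverage", "copay"],
    },
}

def classify(query: str) -> tuple[str, str]:
    """Return (primary_intent, topic) for a query, or ('other', 'other')."""
    q = query.lower()
    for intent, topics in TAXONOMY.items():
        for topic, keywords in topics.items():
            if any(k in q for k in keywords):
                return intent, topic
    return "other", "other"

print(classify("Does my insurance cover a copay for this visit?"))
# → ('system_navigation', 'insurance')
```

In the paper's setup, the same two-level structure lets aggregate statistics (e.g. the 40% general-information share, or the one-in-five personal-health share) be computed at the primary level while topic clusters characterize recurring themes within each intent.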