Learning Selective LLM Autonomy from Copilot Feedback in Enterprise Customer Support Workflows

arXiv cs.CL / 4/28/2026

Key Points

  • The paper describes a deployed system that automates complete enterprise customer support workflows within a BPM platform using a selective (high-confidence) autonomy strategy.
  • It scales quickly: a newly introduced process reaches selective automation within two weeks, using supervision that is already generated at scale, namely per-case UI interaction traces and low-overhead copilot feedback in which operators either accept a suggestion or provide a correction.
  • The approach uses a staged deployment pipeline that trains a next-UI-action policy, learns a critic calibrated via copilot feedback to manage abstention, and then runs background automation only for steps deemed reliable.
  • In operation, the system lets a single operator supervise multiple concurrent sessions, only interrupting when uncertainty is detected, and it includes monitoring plus safe fallbacks to protect production quality.
  • In production results, the system automated 45% of support sessions and reduced average handling time by 39% without degrading support quality.
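
The selective execution loop described in the key points can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the names `run_session`, `policy`, `critic`, `apply_action`, and `ask_operator`, the 0.9 threshold, and the toy workflow are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for the UI state: the actions taken so far.
State = Tuple[str, ...]


@dataclass
class Step:
    action: str
    confidence: float
    executed_by: str  # "auto" or "operator"


def run_session(
    state: State,
    policy: Callable[[State], Optional[str]],
    critic: Callable[[State, str], float],
    apply_action: Callable[[State, str], State],
    ask_operator: Callable[[State, str], str],
    threshold: float = 0.9,  # assumed value, not from the paper
    max_steps: int = 50,
) -> List[Step]:
    """Run high-confidence steps automatically; defer the rest to an operator."""
    log: List[Step] = []
    for _ in range(max_steps):
        proposed = policy(state)
        if proposed is None:  # the policy signals that the workflow is finished
            break
        confidence = critic(state, proposed)
        if confidence >= threshold:
            action, who = proposed, "auto"
        else:
            # Abstain: the operator accepts or corrects the suggestion.
            action, who = ask_operator(state, proposed), "operator"
        # Resume from the updated UI state, whoever acted.
        state = apply_action(state, action)
        log.append(Step(action, confidence, who))
    return log


# Toy workflow with made-up actions and critic scores.
SCRIPT = ["open_case", "fill_refund_form", "submit"]
SCORES = {"open_case": 0.98, "fill_refund_form": 0.60, "submit": 0.95}

log = run_session(
    state=(),
    policy=lambda s: SCRIPT[len(s)] if len(s) < len(SCRIPT) else None,
    critic=lambda s, a: SCORES[a],
    apply_action=lambda s, a: s + (a,),
    ask_operator=lambda s, suggested: suggested,  # operator accepts as-is
)
print([(step.action, step.executed_by) for step in log])
```

In this toy run only the low-scoring `fill_refund_form` step is deferred to the operator. The other two steps run in the background, which is what would let one operator oversee several sessions at once.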

Abstract

We present a deployed system that automates end-to-end customer support workflows inside an enterprise Business Process Management (BPM) platform. The approach is scalable in production and reaches selective automation within two weeks for a new process, leveraging supervision already generated at scale: structured per-case UI interaction traces and low-overhead copilot feedback, where operators either accept a suggestion or provide a correction. A staged deployment pipeline trains a next-UI-action policy, learns a critic from copilot feedback to calibrate abstention, and executes only high-confidence steps in the background while deferring uncertain decisions to operators and resuming from the updated UI state. This setup lets one operator supervise multiple concurrent sessions and be interrupted only when the system is uncertain. The system operates on a schema-driven view of the BPM interface and includes monitoring and safe fallbacks for production. In production, it automated 45% of sessions and reduced average handling time by 39% without degrading support quality.
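
The abstract does not say how copilot feedback is turned into an abstention rule. One plausible reading, sketched below purely as an assumption, is to pick the lowest critic-score threshold at which the operator acceptance rate among suggestions scoring at or above it still meets a target. The function name `pick_threshold`, the target value, and the sample data are hypothetical.

```python
from typing import List, Optional, Tuple


def pick_threshold(
    feedback: List[Tuple[float, bool]],
    target_precision: float = 0.95,  # assumed target, not from the paper
) -> Optional[float]:
    """Return the lowest score threshold whose acceptance rate meets the target.

    `feedback` holds (critic_score, accepted) pairs, where `accepted` is True
    if the operator accepted the suggestion and False if they corrected it.
    Returns None when no threshold reaches the target.
    """
    ranked = sorted(feedback, key=lambda pair: pair[0], reverse=True)
    best: Optional[float] = None
    accepted = 0
    for count, (score, ok) in enumerate(ranked, start=1):
        accepted += ok
        # Evaluate a tied group of scores only once, at its last member.
        if count < len(ranked) and ranked[count][0] == score:
            continue
        if accepted / count >= target_precision:
            best = score
    return best


# Made-up feedback: higher critic scores are accepted more often.
sample = [
    (0.99, True), (0.95, True), (0.90, True),
    (0.80, False), (0.70, True), (0.60, False),
]
threshold = pick_threshold(sample, target_precision=0.9)
coverage = sum(score >= threshold for score, _ in sample) / len(sample)
print(threshold, coverage)
```

On this sample the chosen threshold is 0.90, which would automate half of the suggestions. A stricter target trades coverage for fewer wrong background actions, the same trade-off behind the reported 45% automation rate.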