
PivotAttack: Rethinking the Search Trajectory in Hard-Label Text Attacks via Pivot Words

arXiv cs.CL / 3/12/2026


Key Points

  • PivotAttack introduces an inside-out, query-efficient attack framework that uses a Multi-Armed Bandit to identify Pivot Sets—combinatorial token groups that anchor predictions—and perturb them to induce label flips.
  • The approach captures inter-word dependencies and significantly reduces query costs compared to traditional outside-in methods.
  • Experiments show PivotAttack achieves higher attack success rates and better query efficiency across traditional models and Large Language Models, beating state-of-the-art baselines.
  • The work provides a scalable method for evaluating robustness and has implications for NLP security research and defense design.

Abstract

Existing hard-label text attacks often rely on inefficient "outside-in" strategies that traverse vast search spaces. We propose PivotAttack, a query-efficient "inside-out" framework. It employs a Multi-Armed Bandit algorithm to identify Pivot Sets (combinatorial token groups acting as prediction anchors) and strategically perturbs them to induce label flips. This approach captures inter-word dependencies and minimizes query costs. Extensive experiments across traditional models and Large Language Models demonstrate that PivotAttack consistently outperforms state-of-the-art baselines in both Attack Success Rate and query efficiency.
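To make the "inside-out" idea concrete, here is a minimal sketch of how a bandit could rank token positions as candidate pivots under a hard-label (label-only) black box. This is an illustrative simplification, not the paper's algorithm: it uses standard UCB1 over single token positions (the paper targets combinatorial Pivot Sets), and the toy classifier, synonym table, and reward design are all assumptions introduced for the example.

```python
import math
import random

random.seed(0)

# Toy hard-label "classifier": a stand-in for the black box.
# It returns only a label (no scores), as in the hard-label setting.
def classify(tokens):
    return 1 if ("great" in tokens and "plot" in tokens) else 0

def pivot_attack(tokens, synonyms, budget=200, c=1.4):
    """Hypothetical UCB1-style bandit over token positions.

    Each position is an arm; an arm 'pays off' when perturbing that
    position flips the hard label. Positions whose perturbations flip
    the label often are, in effect, the prediction anchors (pivots).
    """
    orig = classify(tokens)           # one query to get the original label
    n = len(tokens)
    pulls = [0] * n                   # times each position was tried
    wins = [0.0] * n                  # flips observed per position
    queries, t = 0, 0
    while queries < budget:
        t += 1
        # UCB1 arm selection: untried arms first, then mean + exploration bonus.
        scores = [
            (wins[i] / pulls[i] + c * math.sqrt(math.log(t) / pulls[i]))
            if pulls[i] else float("inf")
            for i in range(n)
        ]
        i = max(range(n), key=lambda k: scores[k])
        cand = list(tokens)
        cand[i] = random.choice(synonyms.get(tokens[i], [tokens[i]]))
        queries += 1
        flipped = classify(cand) != orig
        pulls[i] += 1
        wins[i] += 1.0 if flipped else 0.0
        if flipped:
            return cand, queries      # adversarial example found
    return None, queries              # budget exhausted

# Usage with a toy sentence and a hypothetical synonym table.
tokens = "a great plot overall".split()
synonyms = {"great": ["fine", "decent"], "plot": ["story", "storyline"]}
adversarial, used = pivot_attack(tokens, synonyms)
```

Because the attacker sees only labels, the bandit's reward signal is exactly the label flip itself; positions that never affect the label quickly stop being explored, which is where the query savings over exhaustive "outside-in" search would come from.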