PROMPT2BOX: Uncovering Entailment Structure among LLM Prompts

arXiv cs.CL / 3/24/2026


Key Points

  • The paper highlights a limitation of using vector embeddings for prompt analysis: they mainly reflect topical similarity and can miss important differences in prompt specificity and difficulty.
  • PROMPT2BOX is introduced as a box-embedding approach that uses a trained encoder to represent prompts so that both semantic similarity and specificity relations are preserved.
  • The authors train the encoder on a combination of existing and synthesized datasets, so the embedding space learns specificity orderings between prompts, such as one prompt being “more specific than” another.
  • They develop a dimension-reduction method for box embeddings to support visualization and more reliable dataset comparisons.
  • Experiments show PROMPT2BOX improves prompt specificity capture over vector baselines and, in hierarchical clustering across 17 LLMs, detects 8.9% more weaknesses with a ~33% stronger correlation between hierarchical depth and instruction specificity.

Abstract

To discover the weaknesses of LLMs, researchers often embed prompts into a vector space and cluster them to extract insightful patterns. However, vector embeddings primarily capture topical similarity. As a result, prompts that share a topic but differ in specificity, and consequently in difficulty, are often represented similarly, making fine-grained weakness analysis difficult. To address this limitation, we propose PROMPT2BOX, which embeds prompts into a box embedding space using a trained encoder. The encoder, trained on existing and synthesized datasets, outputs box embeddings that capture not only semantic similarity but also specificity relations between prompts (e.g., "writing an adventure story" is more specific than "writing a story"). We further develop a novel dimension reduction technique for box embeddings to facilitate dataset visualization and comparison. Our experiments demonstrate that box embeddings consistently capture prompt specificity better than vector baselines. On the downstream task of creating hierarchical clustering trees for 17 LLMs from the UltraFeedback dataset, PROMPT2BOX can identify 8.9% more LLM weaknesses than vector baselines and achieves an approximately 33% stronger correlation between hierarchical depth and instruction specificity.
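
To make the idea of a specificity relation in box space concrete, here is a minimal sketch of how containment between axis-aligned boxes can encode "more specific than". It is an illustration under stated assumptions, not the paper's method: the `Box` class, the `containment` score, and the hand-picked 2-D coordinates are all hypothetical, and the boxes are hard-edged, whereas PROMPT2BOX obtains its boxes from a trained encoder whose exact scoring function is not described here.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its lower and upper corners (hypothetical helper)."""
    lo: tuple[float, ...]
    hi: tuple[float, ...]

    def volume(self) -> float:
        # Negative side lengths mean the box is empty, so clamp them to zero.
        return prod(max(h - l, 0.0) for l, h in zip(self.lo, self.hi))


def intersection(a: Box, b: Box) -> Box:
    """Overlap of two boxes; it has zero volume if they are disjoint."""
    lo = tuple(max(x, y) for x, y in zip(a.lo, b.lo))
    hi = tuple(min(x, y) for x, y in zip(a.hi, b.hi))
    return Box(lo, hi)


def containment(specific: Box, general: Box) -> float:
    """Fraction of `specific`'s volume that lies inside `general`.

    A score near 1.0 suggests `specific` is a special case of `general`.
    The score is asymmetric, which a symmetric vector similarity cannot express.
    """
    vol = specific.volume()
    if vol == 0.0:
        return 0.0
    return intersection(specific, general).volume() / vol


# Hand-picked boxes for illustration only (assumed, not learned).
story = Box(lo=(0.0, 0.0), hi=(4.0, 4.0))      # "writing a story"
adventure = Box(lo=(1.0, 1.0), hi=(2.0, 3.0))  # "writing an adventure story"
poem = Box(lo=(5.0, 0.0), hi=(7.0, 2.0))       # an unrelated prompt

print(containment(adventure, story))  # 1.0: the adventure box sits inside the story box
print(containment(story, adventure))  # 0.125: the reverse direction does not hold
print(containment(poem, story))       # 0.0: disjoint boxes
```

The asymmetry is the point of the sketch: the narrower prompt's box lies inside the broader prompt's box, so the score depends on the direction of comparison, while a disjoint box scores zero in both directions.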