To Write or to Automate Linguistic Prompts, That Is the Question
arXiv cs.CL / 3/27/2026
Key Points
- The study presents a first systematic comparison of hand-crafted expert zero-shot prompts versus automatic prompt optimization using DSPy signatures, including GEPA-optimized variants, across translation, terminology insertion, and language quality assessment (LQA).
- Results are highly task-dependent: terminology insertion shows mostly no statistically meaningful quality difference between optimized and manual prompts, while translation and LQA exhibit different winners depending on the model configuration.
- For translation, different prompt approaches outperform on different models, suggesting no universal prompting strategy for all linguistic tasks.
- In LQA, expert prompts tend to achieve stronger error detection, but GEPA optimization improves model characterization, indicating distinct strengths between manual expertise and automated search.
- Overall, GEPA can elevate minimal DSPy signatures, and most expert-versus-optimized comparisons show no statistically significant difference. The work also highlights an asymmetric setup: GEPA relies on programmatic search over gold train/validation splits, whereas expert prompts can be developed without labeled data through iterative refinement (a minimal sketch of this setup follows the list).
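
The sketch below is illustrative only, not the paper's actual configuration: it shows what a "minimal DSPy signature" for the translation task might look like and how a GEPA compilation call consumes gold splits. It assumes a recent DSPy release that ships `dspy.GEPA`; the model identifiers, field names, toy metric, and tiny dataset are all hypothetical.

```python
# Illustrative sketch, not the paper's setup: minimal DSPy translation
# signature plus a GEPA compilation call over a gold train/val split.
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # any supported model id

class Translate(dspy.Signature):
    """Translate the source text into the target language."""
    source_text: str = dspy.InputField()
    target_language: str = dspy.InputField()
    translation: str = dspy.OutputField()

# The "minimal DSPy signature" baseline: no hand-written prompt, just fields.
program = dspy.Predict(Translate)

# Tiny gold split standing in for the labeled data GEPA searches over.
train_examples = [
    dspy.Example(source_text="Guten Morgen", target_language="English",
                 translation="Good morning").with_inputs("source_text", "target_language"),
]
val_examples = list(train_examples)

def quality_metric(gold, pred, trace=None, pred_name=None, pred_trace=None):
    # Placeholder score; a real setup would use a translation-quality
    # metric such as chrF or COMET against the gold reference.
    return float(pred.translation.strip() == gold.translation.strip())

# GEPA rewrites the instruction text via reflective search over the gold
# splits: this is the labeled-data dependence contrasted with expert
# prompts, which can be refined without gold data.
optimizer = dspy.GEPA(metric=quality_metric, auto="light",
                      reflection_lm=dspy.LM("openai/gpt-4o"))
optimized_program = optimizer.compile(program, trainset=train_examples,
                                      valset=val_examples)
```

The terminology-insertion and LQA tasks would follow the same pattern with different input/output fields and metrics; only the signature and the scoring function change.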