RESCORE: LLM-Driven Simulation Recovery in Control Systems Research Papers

arXiv cs.AI / 4/7/2026


Key Points

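The paper defines "Paper to Simulation Recoverability" to address the loss of reproducibility caused by missing parameters and ambiguous implementation details when reconstructing numerical simulations from published control systems research papers.
The authors curate a benchmark of 500 IEEE CDC papers and propose RESCORE, an LLM agent framework with three components: Analyzer, Coder, and Verifier.
RESCORE improves the fidelity of code reconstruction through iterative execution feedback and visual comparison, achieving a higher recovery rate than single-pass generation.
It recovers task-coherent simulations for 40.7% of benchmark instances, and the authors report an estimated 10x speedup over manual replication, which substantially reduces verification time and effort.
The authors plan to release the benchmark and agents to advance automated paper replication in the community.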

Abstract

Reconstructing numerical simulations from control systems research papers is often hindered by underspecified parameters and ambiguous implementation details. We define the task of Paper to Simulation Recoverability: the ability of an automated system to generate executable code that faithfully reproduces a paper's results. We curate a benchmark of 500 papers from the IEEE Conference on Decision and Control (CDC) and propose RESCORE, a three-component LLM agentic framework consisting of an Analyzer, a Coder, and a Verifier. RESCORE uses iterative execution feedback and visual comparison to improve reconstruction fidelity. Our method successfully recovers task-coherent simulations for 40.7% of benchmark instances, outperforming single-pass generation. Notably, the RESCORE automated pipeline achieves an estimated 10x speedup over manual human replication, drastically cutting the time and effort required to verify published control methodologies. We will release our benchmark and agents to foster community progress in automated research replication.
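The abstract names the three roles and the iterative execution feedback but gives no implementation details, so the following is only a minimal sketch of what such a loop could look like, not the authors' code. Every name here (`recover_simulation`, `Attempt`, the `toy_*` functions) is hypothetical. The LLM calls are replaced by plain Python stand-ins, and the Verifier checks a single number instead of the visual comparison the paper describes.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    code: str       # candidate simulation source
    ok: bool        # whether verification passed
    feedback: str   # error or mismatch message fed to the next round


def recover_simulation(
    paper_text: str,
    analyze: Callable[[str], str],
    write_code: Callable[[str, str], str],
    verify: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> List[Attempt]:
    """Analyze once, then alternate coding and verifying until success."""
    spec = analyze(paper_text)
    feedback = ""
    history: List[Attempt] = []
    for _ in range(max_rounds):
        code = write_code(spec, feedback)
        ok, feedback = verify(code)
        history.append(Attempt(code, ok, feedback))
        if ok:
            break
    return history


# --- Toy stand-ins (assumptions; a real system would call an LLM here) ---

def toy_analyze(paper_text: str) -> str:
    return "simulate x[k+1] = 0.5 * x[k] from x[0] = 1 for 3 steps"


def toy_write_code(spec: str, feedback: str) -> str:
    body = "x = 1.0\nfor _ in range(3):\n    x = a * x\nresult = x\n"
    if not feedback:
        return body              # first draft forgets to define 'a'
    return "a = 0.5\n" + body    # revised draft after seeing the error


def toy_verify(code: str) -> Tuple[bool, str]:
    namespace: dict = {}
    try:
        exec(code, namespace)    # run the candidate; never do this with untrusted code
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    result = namespace.get("result")
    if result is None or abs(result - 0.125) > 1e-9:
        return False, "result does not match the reported value"
    return True, ""


history = recover_simulation("paper text", toy_analyze, toy_write_code, toy_verify)
for i, attempt in enumerate(history, start=1):
    print(i, attempt.ok, attempt.feedback)
```

In this toy run the first draft fails at execution time, the error message is passed back to the coder stand-in, and the second draft passes. A real Verifier would presumably need sandboxed execution and a comparison against the paper's figures, both of which are outside the scope of this sketch.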

