PAI: Fast, Accurate, and Full Benchmark Performance Projection with AI

arXiv cs.AI / 3/23/2026



Key Points

PAI is a hierarchical LSTM-based model that accurately predicts full benchmark performance without relying on traditional cycle-accurate simulation or instruction-wise encoding.
It uses a trace of microarchitecture-independent features from program execution to forecast performance metrics.
On SPEC CPU 2017, PAI achieves an average IPC prediction error of 9.35% while processing the entire suite in about 2 minutes 57 seconds, three orders of magnitude faster than prior approaches.
This technique addresses prior ML-based limitations (speed and accuracy) and enables faster pre-silicon power-performance analysis and competitive benchmarking.
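The two headline numbers can be made concrete with a little arithmetic. The sketch below assumes that "average IPC prediction error" means a mean absolute percentage error across benchmarks and that "three orders of magnitude" means roughly a 1000x factor; the source states neither definition explicitly, and the IPC values are made-up placeholders, not results from the paper.

```python
def mean_abs_pct_error(predicted, measured):
    """Average of |predicted - measured| / measured over benchmarks, in percent."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("need two non-empty lists of equal length")
    total = sum(abs(p - m) / m for p, m in zip(predicted, measured))
    return 100.0 * total / len(measured)


# Hypothetical IPC values for three benchmarks (placeholders, not from the paper).
measured_ipc = [1.00, 2.00, 0.50]
predicted_ipc = [1.10, 1.80, 0.55]
error_pct = mean_abs_pct_error(predicted_ipc, measured_ipc)

# Runtime reported for the whole SPEC CPU 2017 suite: 2 min 57 s.
pai_seconds = 2 * 60 + 57

# Assumption: "three orders of magnitude" read as about a 1000x factor.
baseline_seconds = pai_seconds * 1000
baseline_hours = baseline_seconds / 3600

print(f"mean absolute IPC error: {error_pct:.2f}%")
print(f"PAI: {pai_seconds} s, implied baseline: about {baseline_hours:.0f} hours")
```

Under that reading, an approach three orders of magnitude slower would need on the order of two days for the same suite, which gives a sense of scale for the reported speedup.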

Abstract

The exponential increase in complex IPs within modern SoCs, driven by Moore's Law, has created a pressing need for fast and accurate hardware-software power-performance analysis. Traditional performance simulators (such as cycle-accurate simulators) are often too slow to simulate full benchmarks within a reasonable timeframe; they require considerable effort for development, maintenance, and extension; and they are prone to errors, making pre-silicon performance projections and competitive analysis increasingly challenging. Prior attempts at addressing this challenge using machine learning fall short because they are either slow, inaccurate, or unable to predict the performance of full benchmarks. To address these limitations, we present PAI, the first technique to accurately predict full benchmark performance without relying on detailed simulation or instruction-wise encoding. At the heart of PAI is a hierarchical Long Short-Term Memory (LSTM)-based model that takes a trace of microarchitecture-independent features from a program execution and predicts performance metrics. We present the detailed design, implementation, and evaluation of PAI. Our initial experiments show that PAI can achieve an average IPC prediction error of 9.35% on the SPEC CPU 2017 benchmark suite while taking only 2 min 57 sec for the entire suite. This prediction error is comparable to that of prior state-of-the-art techniques while requiring three orders of magnitude less time.
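To illustrate what a hierarchical LSTM over a feature trace could look like, here is a minimal pure-Python sketch. It is not PAI's architecture, which the abstract does not detail: the two-level split, the fixed `window` size, the feature and hidden sizes, and the linear readout are all assumptions, and the class names are hypothetical. The weights are random and untrained, so the output only shows data flow and is not a meaningful IPC estimate.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Plain LSTM cell with randomly initialized (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        width = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.w = {g: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                      for _ in range(hidden_size)] for g in "ifog"}
        self.b = {g: [0.0] * hidden_size for g in "ifog"}

    def _gate(self, name, xh, act):
        return [act(sum(w * v for w, v in zip(row, xh)) + b)
                for row, b in zip(self.w[name], self.b[name])]

    def step(self, x, h, c):
        xh = list(x) + list(h)
        i = self._gate("i", xh, sigmoid)
        f = self._gate("f", xh, sigmoid)
        o = self._gate("o", xh, sigmoid)
        g = self._gate("g", xh, math.tanh)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def run(self, sequence):
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            h, c = self.step(x, h, c)
        return h  # final hidden state summarizes the sequence


class HierarchicalLSTM:
    """Lower LSTM summarizes each window; upper LSTM reads the summaries."""

    def __init__(self, feature_size, hidden_size, window, seed=0):
        rng = random.Random(seed)
        self.window = window  # assumed chunk length, not from the paper
        self.lower = LSTMCell(feature_size, hidden_size, rng)
        self.upper = LSTMCell(hidden_size, hidden_size, rng)
        self.readout = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]

    def predict(self, trace):
        windows = [trace[k:k + self.window]
                   for k in range(0, len(trace), self.window)]
        summaries = [self.lower.run(w) for w in windows]
        top = self.upper.run(summaries)
        # Linear readout to a single scalar (standing in for a metric such as IPC).
        return sum(w * h for w, h in zip(self.readout, top))


# Synthetic trace: 50 time steps of 4 made-up features each.
data_rng = random.Random(1)
trace = [[data_rng.random() for _ in range(4)] for _ in range(50)]
model = HierarchicalLSTM(feature_size=4, hidden_size=8, window=10)
prediction = model.predict(trace)
print(prediction)
```

The point of the hierarchy in this sketch is that the upper LSTM sees one summary vector per window rather than every trace entry, so the sequence it processes is shorter by a factor of `window`. Whether PAI's speed comes from a similar reduction is not stated in the abstract.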

