Abstract
Pretokenization is a crucial sequential pass in byte-level BPE tokenizers, yet little work has been done to optimize it for edge-side inference. Our proposed implementation, Peek2, serves as a drop-in replacement for the cl100k-like pretokenizers used in GPT-4, Llama-3, and Qwen-2.5. After breaking down and analyzing the logic of the original cl100k pretokenizer, we introduce a new pretokenization algorithm with linear time complexity and constant, trivial memory usage, suited to edge scenarios. Test results show that it increases microbenchmark throughput by up to $2.48\times$ and delivers a $1.14\times$ improvement in end-to-end throughput across the entire byte-level BPE encoding process, depending on the dataset, while producing results identical to those of the baseline regex-based pretokenizer.