Evolve: A Persistent Knowledge Lifecycle for Small Language Models

arXiv cs.LG / April 28, 2026


Key Points

  • Evolve proposes a persistent knowledge lifecycle for small local language models by pairing a 2B model with a teacher-compiled, semantically coherent knowledge store that is updated and consolidated over time.
  • Instead of fragment retrieval at query time, it stages new knowledge sections when acquired, consolidates them offline via teacher-mediated merging (“sleep consolidation”), and refreshes sections inline when they expire.
  • Experiments on 750 benchmark queries (specialist questions, NaturalQuestions, TriviaQA) show accuracy rising from a 20–33% baseline to 60–84% (+40–52 percentage points) while cutting teacher model invocations by more than 50% through cross-query knowledge reuse.
  • Consolidation compresses the store by 31–33.5% across the three benchmarks without sacrificing accuracy, and section-based retrieval outperforms chunk-based retrieval by 5–9 percentage points across all lifecycle conditions.
  • The system supports two generation modes—“suppress” (strict section-only, auditable) and “augment” (section-supplemented)—over the same underlying knowledge lifecycle.
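The staged/consolidated/refreshed lifecycle above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the class and method names (`KnowledgeStore`, `stage`, `consolidate`, `retrieve`) and the hour-long expiry window are assumptions, and the teacher calls are stand-in callables.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Section:
    """A semantically coherent knowledge section (hypothetical schema)."""
    topic: str
    text: str
    staged: bool = True  # newly acquired sections start staged
    expires_at: float = field(default_factory=lambda: time.time() + 3600)

class KnowledgeStore:
    """Sketch of Evolve's lifecycle: stage on acquisition, consolidate
    offline via teacher-mediated merging, refresh inline on expiry."""

    def __init__(self, teacher_merge, teacher_compile):
        self.sections = {}                      # topic -> list[Section]
        self.teacher_merge = teacher_merge      # teacher call: merge section texts
        self.teacher_compile = teacher_compile  # teacher call: recompile a section
        self.teacher_calls = 0

    def stage(self, topic, text):
        # New knowledge is staged immediately; no teacher invocation yet.
        self.sections.setdefault(topic, []).append(Section(topic, text))

    def consolidate(self):
        # Offline "sleep consolidation": merge staged sections per topic.
        for topic, secs in self.sections.items():
            if len(secs) > 1 or any(s.staged for s in secs):
                merged = self.teacher_merge([s.text for s in secs])
                self.teacher_calls += 1
                self.sections[topic] = [Section(topic, merged, staged=False)]

    def retrieve(self, topic):
        # Inline refresh: recompile an expired section before serving it.
        secs = self.sections.get(topic)
        if not secs:
            return None
        sec = secs[0]
        if time.time() > sec.expires_at:
            sec.text = self.teacher_compile(topic)
            self.teacher_calls += 1
            sec.expires_at = time.time() + 3600
        return sec.text
```

Because merging and recompilation are the only operations that invoke a teacher, repeated retrieval of a consolidated section costs nothing extra, which is the cross-query reuse the paper credits for the >50% drop in teacher calls.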

Abstract

Evolve pairs a small local language model with a persistent, teacher-compiled knowledge store -- refined through sleep consolidation and usage-driven refresh -- to deliver substantial accuracy gains over the model's parametric baseline while amortizing teacher costs through cross-query knowledge reuse. Rather than retrieving document fragments at query time, Evolve constructs a store of semantically coherent sections compiled by teacher models at natural conceptual boundaries; new sections are staged on acquisition, consolidated offline through teacher-mediated merging, and refreshed inline when expired. A 2B-parameter local model handles classification and generation; large teacher models are invoked only for knowledge operations. Across 750 benchmark queries spanning custom specialist questions, NaturalQuestions, and TriviaQA, the 2B model augmented by Evolve improves from 20-33% baseline accuracy to 60-84% (+40-52pp) while reducing teacher invocations by over 50% through reuse. Consolidation compresses the knowledge store by 31-33.5% across three independent benchmarks while preserving accuracy; section-based retrieval outperforms chunk-based retrieval by 5-9pp across every lifecycle condition. The architecture supports two generation modes over the same lifecycle -- suppress (strict section-only grounding, auditable) and augment (section-supplemented responses).
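The two generation modes differ only in how the retrieved section is framed for the local model. A minimal sketch, assuming a plain prompt-assembly step (the function name and prompt wording are illustrative, not taken from the paper):

```python
def build_prompt(query: str, section: str, mode: str) -> str:
    """Assemble the local model's prompt under one of Evolve's two
    generation modes (hypothetical prompt templates)."""
    if mode == "suppress":
        # Strict section-only grounding: the model must answer from the
        # section alone, so every answer is auditable against the store.
        return ("Answer ONLY from the section below; "
                "reply 'unknown' if it does not contain the answer.\n"
                f"Section: {section}\nQuestion: {query}")
    if mode == "augment":
        # Section-supplemented: the section augments, rather than
        # replaces, the model's own parametric knowledge.
        return ("Use the section below together with what you already know.\n"
                f"Section: {section}\nQuestion: {query}")
    raise ValueError(f"unknown mode: {mode}")
```

Both modes share the same underlying store and lifecycle; only this final framing step changes, which is why the paper can report them over identical knowledge states.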