I tried building a memory-first AI… and ended up discovering smaller models can beat larger ones

Reddit r/artificial / 3/31/2026

💬 Opinion · Ideas & Deep Analysis · Tools & Practical Usage · Models & Research

Key Points

  • The article reports a small experiment using a “Seed” architecture discovery approach, focused on identifying memory-first AI model structures rather than simply scaling model size.
  • Across several intent classification datasets (Banking77, CLINC150, HWU64, MASSIVE), the author finds that smaller dynamically discovered/distilled models can match or even beat larger static/large baselines on key metrics.
  • Results show clear cases where dynamic seed distillation delivers higher accuracy with roughly 4–5× fewer parameters (e.g., Banking77), while other datasets show mixed results where added size does not automatically translate into gains.
  • A consistent pattern emerges: the best strategy is not “bigger is better,” but “find the smallest model that still wins” through intelligent structure compression/search.
  • The overall takeaway is a practical research insight into model architecture search/compression methods that can improve efficiency (smaller models and faster inference) without sacrificing task quality.
| Dataset | Model | Acc | F1 | Δ vs Log | Δ vs Static | Avg Params | Peak Params | Steps | Infer ms | Size |
|---|---|---|---|---|---|---|---|---|---|---|
| Banking77-20 | Logistic TF-IDF | 92.37% | 0.9230 | +0.00pp | +0.76pp | 64,940 | 64,940 | 0.00M | 0.473 | 1.000x |
| Banking77-20 | Static Seed | 91.61% | 0.9164 | -0.76pp | +0.00pp | 52,052 | 52,052 | 94.56M | 0.264 | 0.801x |
| Banking77-20 | Dynamic Seed Distill | 93.53% | 0.9357 | +1.17pp | +1.92pp | 12,648 | 16,881 | 70.46M | 0.232 | 0.195x |
| CLINC150 | Logistic TF-IDF | 97.00% | 0.9701 | +0.00pp | +1.78pp | 41,020 | 41,020 | 0.00M | 0.000 | 1.000x |
| CLINC150 | Static Seed | 95.22% | 0.9521 | -1.78pp | +0.00pp | 52,052 | 52,052 | 66.80M | 0.302 | 1.269x |
| CLINC150 | Dynamic Seed | 94.78% | 0.9485 | -2.22pp | -0.44pp | 10,092 | 10,136 | 28.41M | 0.324 | 0.246x |
| CLINC150 | Dynamic Seed Distill | 95.44% | 0.9544 | -1.56pp | +0.22pp | 9,956 | 9,956 | 32.69M | 0.255 | 0.243x |
| HWU64 | Logistic TF-IDF | 87.94% | 0.8725 | +0.00pp | +0.81pp | 42,260 | 42,260 | 0.00M | 0.000 | 1.000x |
| HWU64 | Static Seed | 87.13% | 0.8674 | -0.81pp | +0.00pp | 52,052 | 52,052 | 146.61M | 0.300 | 1.232x |
| HWU64 | Dynamic Seed | 86.63% | 0.8595 | -1.31pp | -0.50pp | 12,573 | 17,565 | 62.54M | 0.334 | 0.297x |
| HWU64 | Dynamic Seed Distill | 87.23% | 0.8686 | -0.71pp | +0.10pp | 13,117 | 17,575 | 62.86M | 0.340 | 0.310x |
| MASSIVE-20 | Logistic TF-IDF | 86.06% | 0.7324 | +0.00pp | -1.92pp | 74,760 | 74,760 | 0.00M | 0.000 | 1.000x |
| MASSIVE-20 | Static Seed | 87.98% | 0.8411 | +1.92pp | +0.00pp | 52,052 | 52,052 | 129.26M | 0.247 | 0.696x |
| MASSIVE-20 | Dynamic Seed | 86.94% | 0.7364 | +0.88pp | -1.04pp | 11,595 | 17,565 | 47.62M | 0.257 | 0.155x |
| MASSIVE-20 | Dynamic Seed Distill | 86.45% | 0.7380 | +0.39pp | -1.53pp | 11,851 | 19,263 | 51.90M | 0.442 | 0.159x |

Built a small experiment around Seed (architecture discovery)

Tested across 4 intent datasets:

Banking77
CLINC150
HWU64
MASSIVE

Results surprised me.

On Banking77:

Logistic TF-IDF: 92.37%
Dynamic Seed (distilled): 93.53%

At ~5x smaller (12.6k vs 64.9k params)
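For reference, a Logistic TF-IDF baseline like the one quoted above is usually only a few lines of scikit-learn. This is a generic sketch, not the author's exact setup — the toy texts, labels, and hyperparameters are my own illustrative assumptions:

```python
# Generic TF-IDF + logistic regression intent-classification baseline.
# Illustrative only: the texts/labels are toy stand-ins for Banking77 data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["how do I reset my card pin", "my transfer never arrived",
         "reset the pin on my card", "where is my incoming transfer"]
labels = ["change_pin", "transfer_not_received",
          "change_pin", "transfer_not_received"]

baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # word + bigram features
    LogisticRegression(max_iter=1000),
)
baseline.fit(texts, labels)
print(baseline.predict(["I want to change my pin"])[0])
```

Such a pipeline is a strong, cheap reference point, which is what makes the distilled model beating it at ~5x fewer parameters notable.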

Across the others:

CLINC150 / HWU64 → not always higher accuracy,
but ~4–5x smaller models with competitive performance
MASSIVE → quality + size wins consistently

Key pattern:

Dynamic Seed finds much smaller architectures
that stay competitive — and sometimes outperform strong baselines

This isn’t about bigger models.
It’s about:
finding the smallest model that still wins

Traditional approach:
scale size → hope for gains

Seed:
search structure → compress intelligently
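I don't know Seed's internals, but the "find the smallest model that still wins" selection step can be sketched generically: among evaluated candidates, keep the smallest one whose accuracy stays within a tolerance of the best. The function name, tolerance, and numbers below are all illustrative assumptions, not the author's code:

```python
# Hypothetical sketch of "find the smallest model that still wins":
# pick the smallest candidate whose accuracy is within `tol` percentage
# points of the best candidate seen.

def smallest_model_that_wins(candidates, tol=0.5):
    """candidates: iterable of (param_count, accuracy_pct) pairs."""
    best_acc = max(acc for _, acc in candidates)
    viable = [(p, a) for p, a in candidates if a >= best_acc - tol]
    return min(viable, key=lambda pa: pa[0])   # smallest viable model

# Numbers loosely echoing the Banking77 rows above.
candidates = [
    (64_940, 92.37),   # logistic TF-IDF baseline
    (52_052, 91.61),   # static seed
    (12_648, 93.53),   # dynamic seed distill
]
print(smallest_model_that_wins(candidates))    # (12648, 93.53)
```

The interesting work is in generating the candidates (the structure search itself); the selection criterion is the easy part.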

Some takeaways:

Static models often lose
Dynamic discovery consistently improves efficiency
Distillation helps stabilize small models
Structure matters more than uniform scaling
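On the distillation point: the standard recipe is temperature-softened KL distillation, where the small model is trained to match the large model's softened output distribution (Hinton et al.'s formulation). A minimal NumPy sketch of that loss — my generic version, not necessarily what this experiment used:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax; higher T flattens the distribution."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()                       # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) at temperature T, scaled by T^2."""
    p = softmax(teacher_logits, T)        # soft teacher targets
    q = softmax(student_logits, T)        # student predictions
    return float(np.sum(p * (np.log(p) - np.log(q + 1e-12)))) * T * T

# The loss vanishes when the student matches the teacher exactly:
print(distill_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))  # ≈ 0.0
```

The soft targets carry the teacher's inter-class similarity structure, which is plausibly why distillation stabilizes very small discovered architectures.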

This is the direction behind Seed AutoArch:
automatically discovering efficient models for real tasks
Not AGI
Not “we solved NLU”
But a real signal that:

structure > scale

What do you guys make of this?

submitted by /u/califalcon