Once-for-All Channel Mixers (HYPER-TINYPW): Generative Compression for TinyML
arXiv cs.LG / 3/27/2026
Key Points
- Introduces HYPER-TINYPW, a "compression-as-generation" method that stores compact per-layer codes in flash and uses a shared tiny MLP to generate most 1x1 pointwise (PW) mixer weights at load time for MCU deployment (a minimal sketch of this idea follows after this list).
- By caching the generated PW kernels and running inference with standard integer operators, the approach preserves commodity microcontroller runtimes and keeps steady-state latency and energy comparable to INT8 separable-CNN baselines (see the second sketch below).
- Enforces a shared latent basis across layers to remove cross-layer redundancy, while keeping PW1 in INT8 to stabilize morphology-sensitive mixing in the network's early stages.
- Reports strong flash/memory tradeoffs on ECG benchmarks (Apnea-ECG, PTB-XL, MIT-BIH), achieving about a 6.31× reduction in stored bytes (~225 kB) while retaining at least ~95% of the large model's macro-F1, and performing better under tight 32–64 kB budgets.
- Demonstrates broader applicability beyond ECG by transferring to TinyML audio, reaching 96.2% test accuracy on Speech Commands, suggesting the technique fits other embedded sensing/speech settings where repeated linear mixers dominate storage.
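Below is a minimal, illustrative sketch (not the authors' code) of the compression-as-generation step: one tiny MLP, shared across layers, decodes a short per-layer code into that layer's 1x1 pointwise kernel at load time. The class and parameter names (`PWGenerator`, `code_dim`, `hidden_dim`) and all sizes are assumptions for illustration; the paper's actual generator architecture and code dimensionality may differ.

```python
# Minimal sketch of "compression-as-generation": flash stores only short
# per-layer codes plus one small shared MLP; at load time the MLP expands
# each code into that layer's 1x1 pointwise (PW) kernel.
import torch
import torch.nn as nn

class PWGenerator(nn.Module):
    """Shared tiny MLP that decodes a per-layer code into a 1x1 PW kernel."""
    def __init__(self, code_dim: int, hidden_dim: int, c_in: int, c_out: int):
        super().__init__()
        self.c_in, self.c_out = c_in, c_out
        self.net = nn.Sequential(
            nn.Linear(code_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, c_out * c_in),
        )

    def forward(self, code: torch.Tensor) -> torch.Tensor:
        # (code_dim,) -> (c_out, c_in, 1, 1), i.e. a standard 1x1 conv weight
        return self.net(code).view(self.c_out, self.c_in, 1, 1)

# One generator is shared by all layers; only the codes are per-layer.
gen = PWGenerator(code_dim=16, hidden_dim=64, c_in=64, c_out=64)
layer_codes = [torch.randn(16) for _ in range(8)]   # what would sit in flash
pw_kernels = [gen(z) for z in layer_codes]          # materialized at load time
print(pw_kernels[0].shape)                          # torch.Size([64, 64, 1, 1])
```

In this reading, only the per-layer codes and the shared generator weights occupy flash; the materialized kernels exist in RAM after loading.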
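And a sketch of the caching step, assuming a straightforward per-tensor symmetric INT8 quantization of each generated kernel so that steady-state inference can run on standard integer convolution operators; `quantize_kernel_int8` is a hypothetical helper, not an API from the paper or from any MCU runtime.

```python
# Sketch: quantize each generated float kernel once (per-tensor, symmetric
# INT8) and cache the result, so inference uses only integer operators.
import torch

def quantize_kernel_int8(w: torch.Tensor):
    """Per-tensor symmetric INT8 quantization: w ≈ scale * w_q."""
    scale = w.abs().max().clamp(min=1e-8) / 127.0
    w_q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
    return w_q, scale.item()

# A kernel such as one produced by the shared generator at load time.
pw_kernel = torch.randn(64, 64, 1, 1)
w_q, scale = quantize_kernel_int8(pw_kernel)
print(w_q.dtype, w_q.shape, scale)  # torch.int8 torch.Size([64, 64, 1, 1]) ...
```

After this one-time step, neither the float kernels nor the generator MLP is needed again at inference time, which is consistent with the reported parity in steady-state latency and energy.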