I made a 35% REAP of 397B with potentially usable quality in 96GB GPU

Reddit r/LocalLLaMA / 4/5/2026

💬 Opinion · Signals & Early Trends · Tools & Practical Usage · Models & Research

Key Points

  • The post's author reports producing a REAP-compressed version of a 397B-parameter model at a 35% compression setting, while retaining what they describe as potentially usable quality.
  • The resulting model is said to fit and run on a 96GB GPU setup, which makes it more practical for local/consumer-grade hardware than the full-size 397B model.
  • A Hugging Face link is provided to the released artifact (Qwen3.5-397B-A17B-REAP35), enabling others to test, benchmark, and fine-tune the compression result.
  • The focus is on practical viability of weight compression/efficiency techniques (REAP) rather than a new training method or official product announcement.
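
As a rough sanity check on the 96GB claim, the weight-only footprint can be estimated from the parameter count, the fraction removed, and the bits stored per weight. The sketch below is illustrative only: the function names are hypothetical, it assumes for simplicity that "35%" means 35% of all parameters are removed (the post summary does not say this), and the bits-per-weight values are placeholders because the summary does not state the quantization used. It also ignores KV cache, activations, and runtime overhead.

```python
def estimate_weight_gb(total_params_b: float, pruned_fraction: float,
                       bits_per_weight: float) -> float:
    """Rough weight-only footprint in GB (1 GB = 10^9 bytes).

    total_params_b  : parameter count in billions (e.g. 397 for a 397B model)
    pruned_fraction : assumed fraction of parameters removed (0.35 here)
    bits_per_weight : assumed average storage per weight after quantization
    """
    if not 0.0 <= pruned_fraction < 1.0:
        raise ValueError("pruned_fraction must be in [0, 1)")
    if bits_per_weight <= 0:
        raise ValueError("bits_per_weight must be positive")
    remaining_params = total_params_b * 1e9 * (1.0 - pruned_fraction)
    return remaining_params * bits_per_weight / 8 / 1e9


def fits_in_vram(weight_gb: float, vram_gb: float) -> bool:
    """True if the weights alone fit; real usage needs extra headroom."""
    return weight_gb <= vram_gb


if __name__ == "__main__":
    # Placeholder quantization levels, not taken from the post.
    for bits in (2.0, 3.0, 4.0):
        gb = estimate_weight_gb(397, 0.35, bits)
        print(f"{bits} bits/weight: ~{gb:.1f} GB, fits in 96 GB: {fits_in_vram(gb, 96)}")
```

Under these assumptions the remaining weights total about 258B parameters, so whether they fit in 96GB depends heavily on the quantization level; substitute the actual figures from the model card when testing the release.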