ServiceNow-AI/SuperApriel-15B-Instruct · Hugging Face

Reddit r/LocalLLaMA / 2026/4/22

📰 News · Tools & Practical Usage · Models & Research

Key points

  • ServiceNow-AI/SuperApriel-15B-Instruct is a 15B-parameter token-mixer supernet derived from Apriel-1.6, released as a single checkpoint.

A 15B-parameter token-mixer supernet with 8 optimized deployment presets spanning 1.0× to 10.7× decode throughput at 32K sequence length, all from a single checkpoint. Derived from Apriel-1.6 through stochastic distillation and targeted supervised fine-tuning.

  • Model Size: 15B parameters
  • Layers: 48 decoder layers, each with 4 mixer variants
  • Context Length: 262K positions (runtime dependent)
  • Languages: English (best performance)

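The spec above says all 48 decoder layers each carry 4 mixer variants, and a preset is just a per-layer choice among them. A minimal sketch of that idea, where the preset names, tiling patterns, and helper function are illustrative assumptions (only the four mixer types and the 48-layer depth come from the model card):

```python
# Hypothetical sketch: one checkpoint, many deployment presets.
# Only the mixer codes (FA, SWA, GDN, KDA) and the 48-layer depth
# come from the model card; everything else here is illustrative.

MIXERS = {"FA", "SWA", "GDN", "KDA"}
NUM_LAYERS = 48

def make_placement(pattern):
    """Tile a short mixer pattern across all 48 decoder layers."""
    assert set(pattern) <= MIXERS
    return [pattern[i % len(pattern)] for i in range(NUM_LAYERS)]

# Two illustrative presets: quality-leaning (all full attention) vs.
# throughput-leaning (mostly linear-time mixers, sparse full attention).
PRESETS = {
    "quality": make_placement(["FA"]),
    "fast":    make_placement(["GDN", "GDN", "GDN", "FA"]),
}

placement = PRESETS["fast"]
print(placement.count("FA"))  # 12 of 48 layers keep full attention
```

Because every layer already contains all four mixers, switching presets is just a different per-layer selection at load time, not a new set of weights.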
Highlights

  • Flexible deployment from a single checkpoint: multiple presets trading throughput for quality
  • Four mixer types per layer: Full Attention (FA), Sliding Window Attention (SWA), Gated DeltaNet (GDN), Kimi Delta Attention (KDA)
  • Instruction-tuned: targeted SFT with multiple Pareto-optimal placements
  • Speculative decoding support: use all-attention as target with efficient placements as drafts from the same checkpoint
submitted by /u/jacek2023