Qwen 3 baseline: 36 trillion tokens. Qwen 3.5 is described as having a *significantly larger scale of visual-text tokens* than Qwen 3. Multimodal factor: the transition from text-only training to native visual-text (multimodal) training increases total token volume, since image-text pairs are encoded as additional token streams alongside richer data representations. **Conservative estimate: 42–48 trillion tokens.** Reasoning: this range stays conservative while avoiding speculative overestimation. **Sources:** [link]
Token Estimate for Qwen 3.5-397B. Based on official sources only :)
Reddit r/LocalLLaMA / 4/20/2026
Key Points
- The article estimates that Qwen 3.5-397B was trained on about 42–48 trillion tokens, starting from an assumed baseline of 36T tokens for Qwen 3.
- It attributes the increase mainly to the shift from text-only training to native multimodal (visual-text) training, which encodes image-text pairs and adds extra token streams.
- The estimated range is described as conservative, reflecting an implied growth of roughly 17–33% over the 36T figure rather than speculative extrapolation.
- It cites official Qwen blog posts for Qwen 3 and Qwen 3.5 as the basis for the estimate.
- The core takeaway is that multimodal training can substantially raise effective token volume even at the same overall model size class.
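The growth implied by the 42–48T estimate over the assumed 36T baseline can be checked with quick arithmetic (a minimal sketch using only the figures quoted in the post):

```python
# Implied growth of the 42-48T token estimate over the 36T baseline.
# All figures are in trillions of tokens, taken from the post above.
baseline_t = 36.0        # assumed Qwen 3 training-token baseline
estimate_low_t = 42.0    # conservative lower bound for Qwen 3.5
estimate_high_t = 48.0   # conservative upper bound for Qwen 3.5

growth_low = (estimate_low_t / baseline_t - 1) * 100    # ~16.7%
growth_high = (estimate_high_t / baseline_t - 1) * 100  # ~33.3%
print(f"Implied growth: {growth_low:.1f}%-{growth_high:.1f}%")
```

This is just a sanity check on the summary's percentages; it does not model how image-text encoding actually inflates token counts.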
Related Articles
Runtime security for AI agents: risk scoring, policy enforcement, and rollback for production agent pipeline [P]
Reddit r/MachineLearning

Anthropic Won't Fix the MCP Vulnerability — Here's How to Protect Your Server
Dev.to

Vercel Hack: Why You Need to Rotate Your "Non-Sensitive" Environment Variables Today
Dev.to

Researchers gave 1,222 people AI assistants, then took them away after 10 minutes. Performance crashed below the control group and people stopped trying. UCLA, MIT, Oxford, and Carnegie Mellon call it the "boiling frog" effect.
Reddit r/artificial

The Brutal Truth About Learning AI Agents: What I Learned from Building 17 Broken Versions
Dev.to