Token Estimate for Qwen 3.5-397B. Based on official source only :)

Reddit r/LocalLLaMA / 4/20/2026


Key Points

  • The article estimates that Qwen 3.5-397B was trained on about 42–48 trillion tokens, starting from an assumed baseline of 36T tokens for Qwen 3.
  • It attributes the increase mainly to the shift from text-only training to native multimodal (visual-text) training, which encodes image-text pairs and adds extra token streams.
  • The estimated range is described as conservative, reflecting an implied 15–30% growth over the 36T figure rather than speculative extrapolation.
  • It cites official Qwen blog posts for Qwen 3 and Qwen 3.5 as the basis for the estimate.
  • The core takeaway is that multimodal training can substantially raise effective token volume even at the same overall model size class.

Qwen 3 Baseline: 36 trillion tokens

Qwen 3.5 Description: Described as having a *significantly larger scale of visual-text tokens* compared to Qwen 3.

Multimodal Factor: Transition from text-only training to native visual-text (multimodal) training increases total token volume due to image-text pair encoding and richer data representation.
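
As a rough illustration of why that happens, here is a minimal sketch; the patch size, merge factor, and resolution below are placeholder assumptions, not Qwen's actual vision tokenizer settings:

```python
# Illustrative only: placeholder ViT-style patch settings, not Qwen's real values.
def visual_token_count(height: int, width: int, patch: int = 14, merge: int = 2) -> int:
    """Visual tokens a ViT-style encoder would emit for one image."""
    patches = (height // patch) * (width // patch)
    return patches // (merge * merge)  # 2x2 patch merging shrinks the stream

caption_tokens = 40                          # short caption, text tokens only
image_tokens = visual_token_count(448, 448)  # 256 visual tokens with these settings
print(caption_tokens, caption_tokens + image_tokens)  # 40 text-only vs 296 multimodal
```

Even a modest share of image-text pairs in the mix pushes the effective token count well past a text-only corpus of the same size.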

**Conservative Estimate: 42–48 trillion tokens**

Reasoning:
A “significant” increase over 36T reasonably implies a ~15–30% expansion, accounting for:

  • Added visual token streams
  • Multimodal alignment overhead
  • Broader dataset diversity

This range stays conservative while avoiding speculative overestimation.
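
For anyone who wants to sanity-check the arithmetic, here is a quick back-of-envelope; the multipliers are just the 15–30% assumption above, not official figures:

```python
# Back-of-envelope check of the 15-30% expansion assumption over the 36T baseline.
baseline = 36e12                # Qwen 3 pretraining tokens
low, high = 1.15, 1.30          # assumed "significant but conservative" growth
print(f"{baseline * low / 1e12:.1f}T - {baseline * high / 1e12:.1f}T")
# -> 41.4T - 46.8T, i.e. roughly the 42-48T ballpark quoted above
```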

**Sources:** Official Qwen blog posts for Qwen 3 and Qwen 3.5.

submitted by /u/9r4n4y