Other companies are slowly moving away from open weights: not releasing base models, delaying open-weight distribution, not releasing their top models (this one I think is fair, but still), and I've also noticed they've stopped publishing research (the old Gemma and Qwen releases had detailed papers about the models' training and characteristics; now that's been replaced by blog posts and model cards)

Kimi (no base model for Kimi k2.5), GLM (no base model for GLM 5 and 5.1), MiniMax (delayed open weights and a problematic license for m2.7), and Qwen (Qwen 3.5 397B was open weight, 3.6 is not)
Meanwhile, DeepSeek keeps publishing mind-blowing research every month, releases their base models, releases the open weights as soon as the model is officially launched, and explains the model's training and architecture in detail with a launch paper
They are extremely important in the field and are the ones pushing the technology and efficiency forward
Unfortunately they don't release small models, but we can't have everything, can we?

