Been running the new model all evening across different quants on coding tasks with OpenCode. Used oMLX and LM Studio, with the recommended settings for precise tasks (temp 0.6, top-k 20, etc) and the OpenCode agent. So far my finding is that the model goes into infinite reasoning loops more often than 3.5, and I sometimes see failed tool calls. The latter could be parser bugs, but the former is the model itself.
It’s OK on basic apps, but it really struggles to make progress on anything more complex, like a simple 3D game, even when the context is nearly empty, as if it is trying to be super defensive and keeps rechecking itself.
Does anyone else have similar observations?
Edit: forgot to mention I tried 8-bit MLX, Q6_K_XL, Q8_XL, and BF16; all had this problem.
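If anyone wants to compare notes on the looping more objectively, here is a rough sketch of how a saved reasoning trace could be flagged as looping. It is only a heuristic I'd assume is good enough for eyeballing results: the function name `looks_like_loop` and the thresholds (`ngram=8`, `min_repeats=3`) are made up for illustration and are not part of OpenCode, LM Studio, or oMLX.

```python
from collections import Counter


def looks_like_loop(text: str, ngram: int = 8, min_repeats: int = 3) -> bool:
    """Heuristic: True if any run of `ngram` consecutive words
    occurs at least `min_repeats` times in the reasoning text.

    Both thresholds are arbitrary assumptions; tune them per model.
    """
    words = text.split()
    if len(words) < ngram:
        return False  # too short to contain even one n-gram
    # Count every sliding window of `ngram` words.
    counts = Counter(
        tuple(words[i:i + ngram]) for i in range(len(words) - ngram + 1)
    )
    return max(counts.values()) >= min_repeats


if __name__ == "__main__":
    trace = "Wait, let me recheck the collision code before moving on. " * 4
    print(looks_like_loop(trace))
```

Paste the model's reasoning output into `trace` (or read it from a log file) and count how many runs get flagged per quant. It won't catch loops where the model rephrases itself each time, only near-verbatim repetition.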