I'm running the raw version straight from the MiniMax release on Hugging Face (https://huggingface.co/MiniMaxAI/MiniMax-M2.7) on 3 RTX Pro 6000s with vLLM, so no quantization. And I'm not going to lie, something feels off about it.
Same workloads in our coding environment, including our reusable evals for problem solving in our codebase, and it's very inconsistent. Our human reviewers are scoring its output lower than 2.5 on some tasks.
It's also not uncommon for it to make a spelling error or drop a space. For example, `const variable = something` will come out as `constvariable =something`, and then it has to go back and fix it.
Anyone else experiencing any weirdness with the model? I've redownloaded straight from the HF repo twice and it's the same result.
Sampling params:

```
--override-generation-config '{
  "temperature": 1.0,
  "top_p": 0.95,
  "top_k": 40,
  "repetition_penalty": 1.15,
  "max_tokens": 16384
}'
```
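For anyone trying to reproduce, the full launch looks roughly like this. The tensor-parallel split across the 3 GPUs is my assumption about a sensible topology, not something verified for this model (some architectures need the head count divisible by the TP size, in which case you'd adjust):

```shell
# Sketch of the serve command; parallelism flags are illustrative
# and depend on your GPU topology and the model's head count.
vllm serve MiniMaxAI/MiniMax-M2.7 \
  --tensor-parallel-size 3 \
  --override-generation-config '{
    "temperature": 1.0,
    "top_p": 0.95,
    "top_k": 40,
    "repetition_penalty": 1.15,
    "max_tokens": 16384
  }'
```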