Kimi bad at tool calling? [D]

Reddit r/MachineLearning / 4/30/2026

Key Points

  • A Reddit user testing Kimi 2.5 via AWS Bedrock reports that while it performs well on simple tasks, it frequently makes incorrect tool calls (about 5 out of 10 attempts), suggesting weakness in tool-calling reliability.
  • The user contrasts Kimi’s tool-calling performance with Claude and OpenAI models, which they say handle tool calling more accurately.
  • They question whether the issue might stem from AWS Bedrock integration rather than the model itself, noting they haven’t tried Kimi’s official API.
  • The post also raises a broader skepticism about how good Chinese models are, based on this perceived tool-calling problem.
  • The author invites other people to share whether they have experienced similar behavior and what they think the root cause is.

So I've tried using Kimi 2.5 in a personal project through AWS Bedrock. For simple tasks it does quite well. But when it comes to tool calling, the model doesn't seem that great: it hallucinates tool calls roughly 5 out of 10 times, from what I've noticed. On the other hand, Claude and OpenAI models are really good at tool calling. Has anyone else faced this issue, or is this a Bedrock problem? I haven't tried the official Kimi API, but under the hood the model is the same.
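
One way to put a number on "hallucinated tool calls" is to check every call the model emits against the tools that were actually declared, then count the failures. Below is a minimal Python sketch of that idea, not how the project in question works. Everything in it is an assumption for illustration: the `TOOLS` registry, the tool names, and the shape of a call (a dict with `name` and `input` keys, already parsed from the provider response).

```python
from typing import Any

# Hypothetical tool registry: tool name -> {argument name: accepted type(s)}.
# These tools are made up for illustration only.
TOOLS: dict[str, dict[str, Any]] = {
    "get_weather": {"city": str},
    "add_numbers": {"a": (int, float), "b": (int, float)},
}


def check_tool_call(call: dict[str, Any]) -> list[str]:
    """Return the problems found in one tool call; an empty list means it looks valid."""
    name = call.get("name")
    if not isinstance(name, str) or name not in TOOLS:
        # The model named a tool that was never declared.
        return [f"unknown tool: {name!r}"]

    args = call.get("input")
    if not isinstance(args, dict):
        return ["arguments are not a JSON object"]

    spec = TOOLS[name]
    problems: list[str] = []
    for arg, expected in spec.items():
        if arg not in args:
            problems.append(f"missing argument: {arg}")
        elif not isinstance(args[arg], expected):
            problems.append(f"wrong type for argument: {arg}")
    for arg in args:
        if arg not in spec:
            # The model invented an argument the tool does not accept.
            problems.append(f"unexpected argument: {arg}")
    return problems


def failure_rate(calls: list[dict[str, Any]]) -> float:
    """Fraction of calls with at least one problem (0.0 for an empty list)."""
    if not calls:
        return 0.0
    bad = sum(1 for call in calls if check_tool_call(call))
    return bad / len(calls)
```

Running the same prompts through Bedrock and through the official API and comparing `failure_rate` on the logged calls would help separate a model problem from an integration problem. Note that this only catches structurally invalid calls; a well-formed call to the wrong tool still needs a manual check.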

Are the Chinese models really as good as we think they are?

submitted by /u/Ok_Firefighter261