DeepSeek Flash seems like a very good replacement for Haiku at the very least

Reddit r/LocalLLaMA / 4/25/2026


Key Points

  • The author describes using Haiku in a chat system primarily for tool calling and summarization, especially when dealing with complex tool input schemas.
  • They ran evaluations comparing DeepSeek v4 Flash to Haiku and report that Flash outperforms Haiku with only minor prompting changes.
  • The author characterizes Flash as particularly proactive and accurate in making many tool calls, giving an impression of strong intelligence.
  • They note that benchmarks may place Flash in a higher tier (e.g., Sonnet-level), but their own evidence comes only from comparisons against Haiku.
  • The author emphasizes that Flash appears cheaper than Haiku in pricing, suggesting it could serve as a practical replacement.

We have a chat system that we use Haiku for, because it is mostly about tool calling and summarising the tool results. But we have many tools with pretty complex input schemas, and models like Gemma didn't cut it, so we went with Haiku. Haiku is pretty good.

I ran the evals for DeepSeek v4 Flash today against Haiku, and Flash beats it pretty handily with just a few prompting changes. Flash is very proactive: it makes many tool calls, very accurately, and somehow gives the feeling of a very smart and intelligent model. I know that, going by the benchmarks, it is probably a Sonnet-level model, but if you look at the pricing, it is cheaper than Haiku. And I don't have any evals comparing it to Sonnet, so I can only judge it against Haiku.
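For anyone wanting to run a similar comparison, here is a minimal sketch of how a tool-calling eval could be scored. It is not the harness used in the post: the function names, the case format, the tool names, and the `model_a` / `model_b` outputs are all hypothetical, and a call only counts as correct when the tool name and every argument match exactly.

```python
import json
from collections import Counter


def _key(call):
    # Canonical form: tool name plus arguments serialised with sorted keys,
    # so the order of arguments does not affect the comparison.
    return (call["name"], json.dumps(call["arguments"], sort_keys=True))


def score_tool_calls(expected, actual):
    """Return (precision, recall) of the actual tool calls against the expected ones."""
    exp = Counter(_key(c) for c in expected)
    act = Counter(_key(c) for c in actual)
    matched = sum((exp & act).values())  # multiset intersection
    precision = matched / sum(act.values()) if act else 0.0
    recall = matched / sum(exp.values()) if exp else 0.0
    return precision, recall


def mean_recall(cases, outputs):
    """Average recall over all cases; `outputs` maps case id -> list of tool calls."""
    scores = [score_tool_calls(c["expected"], outputs.get(c["id"], []))[1] for c in cases]
    return sum(scores) / len(scores)


# Hypothetical eval cases and recorded model outputs (made up for illustration).
cases = [
    {"id": "c1", "expected": [
        {"name": "search_orders", "arguments": {"customer_id": 7, "status": "open"}},
    ]},
    {"id": "c2", "expected": [
        {"name": "get_invoice", "arguments": {"invoice_id": "A-1"}},
        {"name": "send_email", "arguments": {"to": "x@example.com"}},
    ]},
]

model_a = {
    "c1": [{"name": "search_orders", "arguments": {"status": "open", "customer_id": 7}}],
    "c2": [{"name": "get_invoice", "arguments": {"invoice_id": "A-1"}}],  # misses send_email
}
model_b = {
    "c1": [{"name": "search_orders", "arguments": {"customer_id": 7, "status": "open"}}],
    "c2": [
        {"name": "get_invoice", "arguments": {"invoice_id": "A-1"}},
        {"name": "send_email", "arguments": {"to": "x@example.com"}},
    ],
}

print("model_a mean recall:", mean_recall(cases, model_a))
print("model_b mean recall:", mean_recall(cases, model_b))
```

Exact matching is strict on purpose: with complex input schemas, a single wrong field usually means a failed call. Free-text arguments would probably need a looser comparison.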

submitted by /u/cant-find-user-name