Building a chatbot with ASR [P]

Reddit r/MachineLearning / 4/10/2026



Key Points

A startup-focused discussion explores how to add speech-to-text (ASR) to a chatbot as part of an MVP while managing tight budget constraints.
The author is seeking guidance on choosing an initial ASR approach (e.g., Whisper, Parakeet) and the trade-offs among architectures.
A key requirement is avoiding external APIs due to security and compliance, motivating consideration of self-hosted/self-contained deployment options.
Contributors are asked to weigh performance and deployment complexity, with the author willing to handle the deployment challenges.
The post frames the problem as selecting a practical starting point that balances cost, compliance, and engineering effort for near-term launch.

I’ve been working on building a chatbot, and one of the features I want to include is speech-to-text. Since I’m part of a startup, budget is definitely a constraint. At the same time, due to security and compliance requirements, I’d prefer to avoid relying on external APIs.

For an MVP or pilot launch, I’m trying to figure out which ASR approach or architecture would make the most sense to start with. I’ve been looking into options like Whisper, Parakeet, etc., but I’m a bit unsure about the best starting point given my constraints.

I’d really appreciate any suggestions or insights from people who’ve worked on something similar, especially around the trade-offs between self-hosted models and APIs, performance, and ease of deployment (I’m ready to take on the deployment challenge).
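On the performance side of that trade-off, candidates such as Whisper and Parakeet are usually compared by word error rate (WER) on a small set of in-domain recordings. Below is a minimal sketch, assuming the reference transcripts and each model's output are already available as plain strings. The function name `wer` and the sample sentences are illustrative only; they are not taken from any ASR library or from real model output.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up strings for illustration, not real model output.
    reference = "turn on the lights"
    candidates = {
        "model_a": "turn on the lights",
        "model_b": "turn the light",
    }
    for name, hypothesis in candidates.items():
        print(f"{name}: WER = {wer(reference, hypothesis):.2f}")
```

Averaging this over a few dozen recordings that resemble real user audio (accents, background noise, domain vocabulary) tends to be more informative for an MVP decision than public benchmark numbers. Text normalization here is only lowercasing and whitespace splitting; punctuation and number formatting would need handling before the scores are comparable across models.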

submitted by /u/Excellent-Couple-394