Built a multi-model AI platform with real-time WebRTC voice, persistent cross-model memory, and a full generation suite - free account gets 1 min voice/month

Reddit r/artificial / 4/25/2026


Key Points

  • The article describes the launch of AskSary, a multi-model AI platform offering real-time two-way voice chat using OpenAI’s WebRTC API with near-zero latency and multiple voice options.
  • It highlights persistent cross-model memory so conversations can continue seamlessly when switching between models (e.g., from Claude on mobile to GPT-5.2 on desktop).
  • The platform includes a broad generation suite spanning RAG (up to 500MB per document with unlimited uploads), image generation, video generation (via multiple providers), music creation, and 3D model tooling.
  • Users can access various AI models through smart auto-routing or manual selection, and the system provides proactive personalization by referencing previous sessions at login.
  • A free account tier is positioned for tryouts by granting 1 minute of real-time voice per month without requiring a credit card.

https://reddit.com/link/1sutga7/video/ktd3pxcam7xg1/player

I've been building AskSary - a multi-model AI platform - for the past few months, and just shipped real-time two-way voice chat powered by OpenAI's Realtime API over WebRTC.
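
For anyone curious how a Realtime-over-WebRTC session is typically wired up: the sketch below shows the documented browser-side flow (mint an ephemeral key server-side, create an `RTCPeerConnection`, send the SDP offer to OpenAI's realtime endpoint). Function names like `startVoiceSession` are illustrative, not AskSary's actual code, and the model string is an assumption.

```javascript
// Hypothetical browser-side sketch of an OpenAI Realtime WebRTC session.
// The ephemeral key must be minted by your server (POST /v1/realtime/sessions)
// so the real API key never reaches the client.

function realtimeUrl(model) {
  // The Realtime API accepts a raw SDP offer at this endpoint.
  return `https://api.openai.com/v1/realtime?model=${encodeURIComponent(model)}`;
}

async function startVoiceSession(ephemeralKey, model) {
  const pc = new RTCPeerConnection();

  // Play the model's audio as soon as the remote track arrives.
  pc.ontrack = (e) => {
    const audio = new Audio();
    audio.srcObject = e.streams[0];
    audio.play();
  };

  // Send the user's microphone upstream.
  const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
  pc.addTrack(mic.getTracks()[0], mic);

  // Transcripts, tool calls, etc. flow over a data channel.
  const events = pc.createDataChannel("oai-events");

  // Standard WebRTC offer/answer, with OpenAI as the answering peer.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  const resp = await fetch(realtimeUrl(model), {
    method: "POST",
    headers: {
      Authorization: `Bearer ${ephemeralKey}`,
      "Content-Type": "application/sdp",
    },
    body: offer.sdp,
  });
  await pc.setRemoteDescription({ type: "answer", sdp: await resp.text() });

  return { pc, events };
}
```

Because audio rides the peer connection itself rather than a request/response loop, latency is bounded mostly by the network round-trip, which is where the "near-zero latency" feel comes from.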

The visualization reacts to your voice in real time: 180 radial frequency bars orbit a glowing orb, 280 particles drift across a full-screen canvas, aurora sweeps and ripple waves fire on voice peaks, and the whole thing color-shifts from cool blue (listening) to warm violet (speaking). Near-zero latency, 8 voice options.
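
The geometry behind a radial-bar visualizer like this is simple to sketch: each of N bars gets an angle around the orb, and its length is driven by one bin of a Web Audio `AnalyserNode`'s frequency data. This is my own illustrative version, not AskSary's code; the helper names and color endpoints are assumptions.

```javascript
// Map one frame of Uint8Array frequency data (0..255 per bin) to N radial
// bars around a center point. Each bar runs from the orb's inner radius
// outward, scaled by that bin's level.
function radialBars(freqData, { cx, cy, innerRadius, maxLength, count = 180 }) {
  const bars = [];
  const step = Math.floor(freqData.length / count) || 1;
  for (let i = 0; i < count; i++) {
    const angle = (2 * Math.PI * i) / count;
    const level = freqData[i * step] / 255; // normalize bin to 0..1
    const len = innerRadius + level * maxLength;
    bars.push({
      x1: cx + Math.cos(angle) * innerRadius,
      y1: cy + Math.sin(angle) * innerRadius,
      x2: cx + Math.cos(angle) * len,
      y2: cy + Math.sin(angle) * len,
    });
  }
  return bars;
}

// Linear blend from cool blue (t = 0, listening) to warm violet (t = 1,
// speaking); the exact endpoint colors here are illustrative.
function stateColor(t) {
  const lerp = (a, b) => Math.round(a + (b - a) * t);
  return `rgb(${lerp(70, 170)}, ${lerp(140, 80)}, ${lerp(255, 230)})`;
}
```

Each animation frame you'd call `analyser.getByteFrequencyData(freqData)`, recompute the bars, and stroke them on the canvas with the current state color.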

Anyone with a free account at asksary.com gets 1 minute of real-time voice every month to try it out - no credit card needed.

The platform also has a lot more built around it if you're curious:

Models - GPT-5-Nano, GPT-5.2, GPT-5.2 Pro, O1 Reasoning, Claude Sonnet 4.6, Gemini 2.5 Flash, Gemini 3.1 Pro, Gemini Ultra, Grok 4, DeepSeek V3, DeepSeek R1 - with smart auto-routing or manual selection

Memory and context - Persistent cross-model memory. Start on mobile with Claude, switch to GPT-5.2 on desktop and it already knows the conversation. Plus proactive personalization: on every login the chatbot reads your previous sessions and opens with a message asking if you want to continue - before you type anything.
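
One common way to get this kind of cross-model continuity (an assumption about the design, not AskSary's actual implementation) is to persist the conversation in a provider-neutral format and have a thin adapter per provider render that same history into each API's request shape:

```javascript
// Provider-neutral conversation store: messages are saved once per thread,
// regardless of which model produced or will consume them.
class ConversationStore {
  constructor() {
    this.threads = new Map();
  }
  append(threadId, role, content) {
    if (!this.threads.has(threadId)) this.threads.set(threadId, []);
    this.threads.get(threadId).push({ role, content });
  }
  history(threadId) {
    return this.threads.get(threadId) || [];
  }
}

// Adapters translate the neutral history into each provider's request shape.
// Switching models mid-thread just means picking a different adapter over
// the same stored messages. (Shapes simplified for illustration.)
const adapters = {
  openai: (msgs) => ({ messages: msgs }),
  anthropic: (msgs) => ({
    // Anthropic takes the system prompt as a separate top-level field.
    system: msgs.filter((m) => m.role === "system").map((m) => m.content).join("\n"),
    messages: msgs.filter((m) => m.role !== "system"),
  }),
};

function buildRequest(store, threadId, provider) {
  return adapters[provider](store.history(threadId));
}
```

With the store persisted server-side and keyed by user + thread, the desktop client picks up exactly where the mobile client left off, and the "proactive personalization" greeting is just a generation pass over the most recent threads at login.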

RAG - Upload docs up to 500 MB each, unlimited uploads, chat with them across any model via OpenAI Vector Store
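
For readers unfamiliar with the Vector Store flow: OpenAI's public REST API uploads the raw file, then attaches it to a vector store, with chunking and embedding handled server-side. The sketch below uses those documented endpoints; the function name and parameters are illustrative, not AskSary's code.

```javascript
// Hedged sketch of wiring one document into an OpenAI Vector Store over REST.
// Endpoint paths are from OpenAI's public API; everything else is illustrative.
const API = "https://api.openai.com/v1";

async function uploadToVectorStore(apiKey, vectorStoreId, file) {
  // 1. Upload the raw file with purpose "assistants".
  const form = new FormData();
  form.append("purpose", "assistants");
  form.append("file", file);
  const uploaded = await fetch(`${API}/files`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  }).then((r) => r.json());

  // 2. Attach it to the vector store; chunking and embedding run server-side,
  // so any model on the platform can then retrieve from the same store.
  return fetch(`${API}/vector_stores/${vectorStoreId}/files`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ file_id: uploaded.id }),
  }).then((r) => r.json());
}
```

Since retrieval happens against the store rather than inside any one model's context, the same uploaded documents are queryable no matter which model the chat is routed to.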

Generation - GPT-Image-1, Nano Banana Pro + Flux editor with visual history, Video Studio (Luma, Veo 3.1, Kling), Music Studio with ElevenLabs and in-chat visualizer, 3D Model Studio with STL export (coming soon)

Builder tools - Vision to Code, Web Architect, Game Engine, Code Lab with SQL Architect / Bug Buster / Git Guru and more

Voice and audio - Real-time chat, Podcast Mode (two AI voices, downloadable MP3), Voiceover, Voice Notes, Voice Tuner

Productivity - Slides, Docs, Pro Writer, Social tools, Business Suite, CV Creator, Daily Briefing, Market Watch

Platform - 30+ live wallpapers, Custom Agents, Folder org, Smart search, Media Gallery, 26 languages + RTL, fully customizable UI

Happy to answer questions about the WebRTC implementation or anything else. Would love to hear what you think of the voice visualization.

submitted by /u/Beneficial-Cow-7408