Built a multi-model AI platform with real-time WebRTC voice, persistent cross-model memory, and a full generation suite - free account gets 1 min voice/month

Reddit r/artificial / 4/25/2026


Key Points

  • The article describes the launch of AskSary, a multi-model AI platform offering real-time two-way voice chat using OpenAI’s WebRTC API with near-zero latency and multiple voice options.
  • It highlights persistent cross-model memory so conversations can continue seamlessly when switching between models (e.g., from Claude on mobile to GPT-5.2 on desktop).
  • The platform includes a broad generation suite spanning RAG (up to 500MB per document with unlimited uploads), image generation, video generation (via multiple providers), music creation, and 3D model tooling.
  • Users can access various AI models through smart auto-routing or manual selection, and the system provides proactive personalization by referencing previous sessions at login.
  • A free account tier is positioned for tryouts by granting 1 minute of real-time voice per month without requiring a credit card.

https://reddit.com/link/1sutga7/video/ktd3pxcam7xg1/player

I've been building AskSary - a multi-model AI platform - for the past few months, and just shipped real-time two-way voice chat powered by OpenAI's Realtime API over WebRTC.
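
For anyone curious how a Realtime-over-WebRTC session is typically wired up: the sketch below shows the documented browser-side flow (mint an ephemeral key server-side, create an `RTCPeerConnection`, send the SDP offer to OpenAI's realtime endpoint). Function names like `startVoiceSession` are illustrative, not AskSary's actual code, and the model string is an assumption.

```javascript
// Hypothetical browser-side sketch of an OpenAI Realtime WebRTC session.
// The ephemeral key must be minted by your server (POST /v1/realtime/sessions)
// so the real API key never reaches the client.

function realtimeUrl(model) {
  // The Realtime API accepts a raw SDP offer at this endpoint.
  return `https://api.openai.com/v1/realtime?model=${encodeURIComponent(model)}`;
}

async function startVoiceSession(ephemeralKey, model) {
  const pc = new RTCPeerConnection();

  // Play the model's audio as soon as the remote track arrives.
  pc.ontrack = (e) => {
    const audio = new Audio();
    audio.srcObject = e.streams[0];
    audio.play();
  };

  // Send the user's microphone upstream.
  const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
  pc.addTrack(mic.getTracks()[0], mic);

  // Transcripts, tool calls, etc. flow over a data channel.
  const events = pc.createDataChannel("oai-events");

  // Standard WebRTC offer/answer, with OpenAI as the answering peer.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  const resp = await fetch(realtimeUrl(model), {
    method: "POST",
    headers: {
      Authorization: `Bearer ${ephemeralKey}`,
      "Content-Type": "application/sdp",
    },
    body: offer.sdp,
  });
  await pc.setRemoteDescription({ type: "answer", sdp: await resp.text() });

  return { pc, events };
}
```

Because audio rides the peer connection itself rather than a request/response loop, latency is bounded mostly by the network round-trip, which is where the "near-zero latency" feel comes from.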

The visualization reacts to your voice in real time: 180 radial frequency bars orbit a glowing orb, 280 particles drift across a full-screen canvas, aurora sweeps and ripple waves fire on voice peaks, and the whole thing color-shifts from cool blue (listening) to warm violet (speaking). Near-zero latency, 8 voice options.
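
The geometry behind a radial-bar visualizer like this is simple to sketch: each of N bars gets an angle around the orb, and its length is driven by one bin of a Web Audio `AnalyserNode`'s frequency data. This is my own illustrative version, not AskSary's code; the helper names and color endpoints are assumptions.

```javascript
// Map one frame of Uint8Array frequency data (0..255 per bin) to N radial
// bars around a center point. Each bar runs from the orb's inner radius
// outward, scaled by that bin's level.
function radialBars(freqData, { cx, cy, innerRadius, maxLength, count = 180 }) {
  const bars = [];
  const step = Math.floor(freqData.length / count) || 1;
  for (let i = 0; i < count; i++) {
    const angle = (2 * Math.PI * i) / count;
    const level = freqData[i * step] / 255; // normalize bin to 0..1
    const len = innerRadius + level * maxLength;
    bars.push({
      x1: cx + Math.cos(angle) * innerRadius,
      y1: cy + Math.sin(angle) * innerRadius,
      x2: cx + Math.cos(angle) * len,
      y2: cy + Math.sin(angle) * len,
    });
  }
  return bars;
}

// Linear blend from cool blue (t = 0, listening) to warm violet (t = 1,
// speaking); the exact endpoint colors here are illustrative.
function stateColor(t) {
  const lerp = (a, b) => Math.round(a + (b - a) * t);
  return `rgb(${lerp(70, 170)}, ${lerp(140, 80)}, ${lerp(255, 230)})`;
}
```

Each animation frame you'd call `analyser.getByteFrequencyData(freqData)`, recompute the bars, and stroke them on the canvas with the current state color.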

Anyone with a free account at asksary.com gets 1 minute of real-time voice every month to try it out - no credit card needed.

The platform also has a lot more built around it if you're curious:

Models - GPT-5-Nano, GPT-5.2, GPT-5.2 Pro, O1 Reasoning, Claude Sonnet 4.6, Gemini 2.5 Flash, Gemini 3.1 Pro, Gemini Ultra, Grok 4, DeepSeek V3, DeepSeek R1 - with smart auto-routing or manual selection

Memory and context - Persistent cross-model memory. Start on mobile with Claude, switch to GPT-5.2 on desktop and it already knows the conversation. Plus proactive personalization: on every login the chatbot reads your previous sessions and opens with a message asking if you want to continue - before you type anything.
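
One common way to get this kind of cross-model continuity (an assumption about the design, not AskSary's actual implementation) is to persist the conversation in a provider-neutral format and have a thin adapter per provider render that same history into each API's request shape:

```javascript
// Provider-neutral conversation store: messages are saved once per thread,
// regardless of which model produced or will consume them.
class ConversationStore {
  constructor() {
    this.threads = new Map();
  }
  append(threadId, role, content) {
    if (!this.threads.has(threadId)) this.threads.set(threadId, []);
    this.threads.get(threadId).push({ role, content });
  }
  history(threadId) {
    return this.threads.get(threadId) || [];
  }
}

// Adapters translate the neutral history into each provider's request shape.
// Switching models mid-thread just means picking a different adapter over
// the same stored messages. (Shapes simplified for illustration.)
const adapters = {
  openai: (msgs) => ({ messages: msgs }),
  anthropic: (msgs) => ({
    // Anthropic takes the system prompt as a separate top-level field.
    system: msgs.filter((m) => m.role === "system").map((m) => m.content).join("\n"),
    messages: msgs.filter((m) => m.role !== "system"),
  }),
};

function buildRequest(store, threadId, provider) {
  return adapters[provider](store.history(threadId));
}
```

With the store persisted server-side and keyed by user + thread, the desktop client picks up exactly where the mobile client left off, and the "proactive personalization" greeting is just a generation pass over the most recent threads at login.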

RAG - Upload docs up to 500 MB each, unlimited uploads, chat with them across any model via OpenAI Vector Store
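
For readers unfamiliar with the Vector Store flow: OpenAI's public REST API uploads the raw file, then attaches it to a vector store, with chunking and embedding handled server-side. The sketch below uses those documented endpoints; the function name and parameters are illustrative, not AskSary's code.

```javascript
// Hedged sketch of wiring one document into an OpenAI Vector Store over REST.
// Endpoint paths are from OpenAI's public API; everything else is illustrative.
const API = "https://api.openai.com/v1";

async function uploadToVectorStore(apiKey, vectorStoreId, file) {
  // 1. Upload the raw file with purpose "assistants".
  const form = new FormData();
  form.append("purpose", "assistants");
  form.append("file", file);
  const uploaded = await fetch(`${API}/files`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  }).then((r) => r.json());

  // 2. Attach it to the vector store; chunking and embedding run server-side,
  // so any model on the platform can then retrieve from the same store.
  return fetch(`${API}/vector_stores/${vectorStoreId}/files`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ file_id: uploaded.id }),
  }).then((r) => r.json());
}
```

Since retrieval happens against the store rather than inside any one model's context, the same uploaded documents are queryable no matter which model the chat is routed to.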

Generation - GPT-Image-1, Nano Banana Pro + Flux editor with visual history, Video Studio (Luma, Veo 3.1, Kling), Music Studio with ElevenLabs and in-chat visualizer, 3D Model Studio with STL export (coming soon)

Builder tools - Vision to Code, Web Architect, Game Engine, Code Lab with SQL Architect / Bug Buster / Git Guru and more

Voice and audio - Real-time chat, Podcast Mode (two AI voices, downloadable MP3), Voiceover, Voice Notes, Voice Tuner

Productivity - Slides, Docs, Pro Writer, Social tools, Business Suite, CV Creator, Daily Briefing, Market Watch

Platform - 30+ live wallpapers, Custom Agents, Folder org, Smart search, Media Gallery, 26 languages + RTL, fully customizable UI

Happy to answer questions about the WebRTC implementation or anything else. Would love to hear what you think of the voice visualization.

submitted by /u/Beneficial-Cow-7408