Streaming Speech-to-Text API

WebSocket-based streaming speech recognition for instant transcription. Ideal for live broadcasting, voice assistants, and real-time captioning, with support for multiple languages.
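Streaming recognition generally works by sending audio in small consecutive frames over the open connection instead of uploading a whole file. The sketch below shows only that framing step. The 16 kHz sample rate, 16-bit mono PCM encoding, 100 ms frame size, and the name `pcm_frames` are assumptions for illustration, not documented DolphinVoice parameters; check the API documentation for the audio format the service actually expects.

```python
from typing import Iterator


def pcm_frames(pcm: bytes, sample_rate: int = 16000, frame_ms: int = 100,
               bytes_per_sample: int = 2) -> Iterator[bytes]:
    """Yield consecutive frames of mono PCM audio, each covering frame_ms.

    All defaults are illustrative assumptions, not values taken from the
    service's documentation. The last frame may be shorter than the rest.
    """
    frame_bytes = sample_rate * frame_ms // 1000 * bytes_per_sample
    if frame_bytes <= 0:
        raise ValueError("frame size must be positive")
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


# One second of 16-bit silence at 16 kHz, split into 100 ms frames.
one_second = bytes(16000 * 2)
frames = list(pcm_frames(one_second))
print(len(frames), len(frames[0]))
```

In a real client, each frame would typically be sent as one binary WebSocket message, paced at roughly the rate the audio is captured.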

Advanced Features

Speaker Diarization: Differentiate speakers in a single audio channel using voiceprint information.

Smart Formatting: Improve readability with additional formatting. When enabled, dates, times, and numbers are displayed in conventional formats.

Filler Word Removal: Filter out filler words to improve the readability of transcribed speech.
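To make the effect of Filler Word Removal concrete, here is a toy client-side illustration. It is not the service's method, which runs server-side and is not described on this page; the filler list and the function name `remove_fillers` are assumptions chosen for the example.

```python
import re

# Hypothetical filler list for illustration only.
FILLERS = ("um", "uh", "er", "hmm")

# Match a whole filler word, an optional trailing comma or period,
# and any whitespace that follows it.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(FILLERS) + r")\b[,.]?\s*", re.IGNORECASE
)


def remove_fillers(text: str) -> str:
    """Drop standalone filler words and tidy the leftover spacing."""
    cleaned = _FILLER_RE.sub("", text)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(remove_fillers("Um, I think, uh, we should start"))
```

The word-boundary anchors keep the pattern from touching words that merely contain a filler, such as "never" or "summer".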

Speech-to-text with speaker diarization and instant response

Lower Cost

Save up to 80% with DolphinVoice compared with other providers.

Lower Latency

Get interim results in real time, followed by the final transcript upon completion, with endpointing latency as low as 500 ms.
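A client that receives both interim and final results usually keeps finalized text and overwrites the in-progress text as each new interim result arrives. The sketch below shows that pattern. The message shape (a dict with `text` and `is_final` keys) and the class name `TranscriptBuffer` are hypothetical; the real field names are defined by the API documentation.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptBuffer:
    """Holds finalized text plus the latest interim hypothesis."""
    committed: list[str] = field(default_factory=list)
    pending: str = ""

    def apply(self, message: dict) -> None:
        # "text" and "is_final" are assumed field names, not the real API.
        text = message.get("text", "")
        if message.get("is_final"):
            if text:
                self.committed.append(text)
            self.pending = ""      # the interim text is now superseded
        else:
            self.pending = text    # each interim result replaces the last

    def render(self) -> str:
        parts = self.committed + ([self.pending] if self.pending else [])
        return " ".join(parts)


buffer = TranscriptBuffer()
for msg in [
    {"text": "hello", "is_final": False},
    {"text": "hello wor", "is_final": False},
    {"text": "Hello world.", "is_final": True},
    {"text": "how are", "is_final": False},
]:
    buffer.apply(msg)
print(buffer.render())
```

Replacing rather than appending interim text matters because an interim hypothesis can be revised as more audio arrives.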

Higher Accuracy

High recognition accuracy, with code-switching support for Chinese-English and Japanese-English speech.

Features

Multi-Domain Support

Use models optimized for specific domains, such as call centers, for enhanced accuracy.

Smart Punctuation & Formatting

Automatic punctuation prediction and text format optimization generate natural, readable transcripts.

Custom Vocabulary

Boost accuracy for proper nouns like names, places, and organizations with custom hot words.

Speaker Diarization

Distinguish between speakers through voiceprint information.

Filler Word Removal

Filter out filler words to improve the readability of transcribed speech.

Use Cases

Live Captioning

Provide real-time subtitles for live events like seminars to enhance the viewer experience.

Voice Assistant

Enable voice input for various scenarios such as in-car navigation and chat applications, maximizing hands-free operation.

Call Centers

Transcribe customer service calls in real time, making it easier to record and analyze customer needs and improve service quality.

Meeting Minutes

Transcribe meetings in real time and quickly generate meeting minutes with speaker labels and timestamps.

Medical Documentation

Improve the efficiency of medical documentation through real-time speech recognition, reducing paperwork for healthcare professionals.

Education & Training

Enhance learning engagement with live captions for training sessions, helping students grasp concepts more effectively.

Voice Commands

Enable instant voice control for smart home devices and IoT applications with responsive command recognition.

Legal Documentation

Perform real-time transcription in the courtroom to ensure the accuracy and integrity of court transcripts.
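Several of the use cases above, Meeting Minutes in particular, combine speaker labels with timestamps. As a rough sketch of that post-processing step, the example below merges consecutive segments from the same speaker into timestamped lines. The `Segment` shape, its field names, and the output layout are assumptions for illustration; the service's actual result format is defined in its documentation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical diarized result: who spoke, when, and what."""
    speaker: str
    start_s: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Render whole seconds as HH:MM:SS."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def to_minutes(segments: list[Segment]) -> list[str]:
    """Merge consecutive segments from one speaker into a single line."""
    lines: list[str] = []
    previous_speaker = None
    for seg in segments:
        if seg.speaker == previous_speaker:
            lines[-1] += " " + seg.text
        else:
            stamp = format_timestamp(seg.start_s)
            lines.append(f"[{stamp}] {seg.speaker}: {seg.text}")
            previous_speaker = seg.speaker
    return lines


minutes = to_minutes([
    Segment("Speaker 1", 0.0, "Let's begin."),
    Segment("Speaker 1", 2.5, "First item."),
    Segment("Speaker 2", 65.2, "Agreed."),
])
print("\n".join(minutes))
```

Each line carries the start time of the speaker's turn, so the minutes stay compact while remaining easy to check against the recording.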

Powering the Most Innovative Teams

HopeRun
