Streaming Speech-to-Text API

WebSocket-based streaming speech recognition for instant transcription. Ideal for live broadcasting, voice assistants, and real-time captioning, with support for multiple languages.

Advanced Features

Speaker Diarization: Distinguishes speakers within a single audio channel using voiceprint information.

Smart Formatting: Improves readability by applying additional formatting. When enabled, dates, times, and numbers are displayed in conventional formats.

Filler Word Removal: Filters out filler words to make transcripts of spoken language easier to read.

Speech-to-text with speaker diarization and instant response

Lower Cost

Save up to 80% with DolphinVoice compared with other providers.

Lower Latency

Returns interim results in real time and the final transcript upon completion, with endpointing latency as low as 500 ms.
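
One way a client might combine interim and final results into a single live display is sketched below. This is only an illustration: the message shape (a JSON object with `type` set to `"interim"` or `"final"` and a `text` field) and the `TranscriptAssembler` name are assumptions made for this example, not the documented DolphinVoice message format.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class TranscriptAssembler:
    """Keeps committed final segments plus the latest interim hypothesis."""

    finals: List[str] = field(default_factory=list)
    interim: str = ""

    def feed(self, raw_message: str) -> str:
        # Hypothetical message shape: {"type": "interim" | "final", "text": "..."}
        msg = json.loads(raw_message)
        if msg["type"] == "final":
            # A final result replaces whatever interim text was on screen.
            self.finals.append(msg["text"])
            self.interim = ""
        elif msg["type"] == "interim":
            # Each interim result overwrites the previous interim result.
            self.interim = msg["text"]
        return self.display()

    def display(self) -> str:
        parts = self.finals + ([self.interim] if self.interim else [])
        return " ".join(parts)
```

Each call to `feed` returns the text to show at that moment: interim text is provisional and gets overwritten, while final text is appended and never changes afterwards.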

Higher Accuracy

High recognition accuracy, with code-switching support for Chinese-English and Japanese-English speech.

  • Multi-Domain Support

    Models optimized for call centers provide enhanced accuracy in that domain.

  • Smart Punctuation & Formatting

    Automatic punctuation prediction and text format optimization to generate natural, readable transcripts.

  • Custom Vocabulary

    Boost accuracy for proper nouns like names, places, and organizations with custom hot words.

  • Speaker Diarization

    Distinguishes between speakers in a single audio channel using voiceprint information.

  • Filler Word Removal

    Filters out filler words to make transcripts of spoken language easier to read.
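
To make the effect of filler word removal concrete, here is a rough client-side sketch of the idea. The service applies its own filtering on the server; the word list, the regular expression, and the `remove_fillers` name below are assumptions chosen for illustration and do not reflect how DolphinVoice implements the feature.

```python
import re

# Illustrative filler list (an assumption, not the service's actual list).
FILLERS = ("um", "uh", "er", "hmm")

# Match a whole filler word, an optional trailing comma or period,
# and any whitespace that follows it.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(FILLERS) + r")\b[,.]?\s*",
    re.IGNORECASE,
)


def remove_fillers(text: str) -> str:
    """Return text with the filler words above stripped out."""
    cleaned = _FILLER_RE.sub("", text)
    # Collapse any doubled spaces left behind and trim the ends.
    return re.sub(r"\s{2,}", " ", cleaned).strip()
```

For example, `remove_fillers("Um, I think, uh, we should go")` yields `"I think, we should go"`. The word-boundary anchors keep words that merely contain a filler, such as "other", intact.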

Use Cases

Live Captioning

Provide real-time subtitles for live events like seminars to enhance the viewer experience.

Voice Assistant

Enable voice input for various scenarios such as in-car navigation and chat applications, maximizing hands-free operation.

Call Centers

Transcribe customer service calls in real-time, making it easier to record and analyze customer needs and improve service quality.

Meeting Minutes

Transcribe meetings in real time and quickly generate meeting minutes with speaker labels and timestamps.

Medical Documentation

Improve the efficiency of medical documentation through real-time speech recognition, reducing paperwork for healthcare professionals.

Education & Training

Enhance learning engagement with live captions for training sessions, helping students grasp concepts more effectively.

Voice Commands

Enable instant voice control for smart home devices and IoT applications with responsive command recognition.

Legal Documentation

Perform real-time transcription in the courtroom to ensure the accuracy and integrity of court transcripts.

Start Building

Sign up and get started in minutes!