OpenAI Whisper is a speech recognition system that converts spoken language into written text. The model is an encoder-decoder transformer: it splits audio into 30-second segments, encodes each one, and has the decoder predict the corresponding text. Trained on 680,000 hours of multilingual data from diverse sources, it supports transcription in 99 languages and translation into English. The training data covers varied accents, acoustic conditions, and domain-specific terminology, which helps it stay accurate in noisy environments. Whisper has a reported accuracy rate of 92%, and it comes in five model sizes that trade accuracy for speed and lower resource requirements.
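To make the 30-second segmenting step concrete, here is a minimal sketch in plain Python. It is not Whisper's own code: the function name `split_into_windows` is hypothetical, the 16 kHz sampling rate is an assumption not stated in the description above, and the reference implementation may advance its window differently (for example, based on predicted timestamps). The sketch only shows the basic idea of cutting a signal into fixed-length windows and padding the last one with silence.

```python
from typing import List

SAMPLE_RATE = 16_000   # assumed sampling rate in Hz (not stated in the description)
WINDOW_SECONDS = 30    # segment length named in the description


def split_into_windows(samples: List[float],
                       sample_rate: int = SAMPLE_RATE,
                       window_seconds: int = WINDOW_SECONDS) -> List[List[float]]:
    """Cut a mono signal into fixed-length windows, zero-padding the last one.

    Hypothetical helper for illustration only; not part of any Whisper API.
    """
    window_size = sample_rate * window_seconds
    windows = []
    for start in range(0, len(samples), window_size):
        chunk = samples[start:start + window_size]
        # Pad the final chunk with silence so every window has the same length.
        chunk = chunk + [0.0] * (window_size - len(chunk))
        windows.append(chunk)
    return windows


if __name__ == "__main__":
    # 70 seconds of placeholder audio (a constant value, not real speech).
    audio = [0.1] * (70 * SAMPLE_RATE)
    windows = split_into_windows(audio)
    print(len(windows), len(windows[0]))  # prints "3 480000"
```

Under these assumptions, a 70-second clip yields three windows: two full ones and a third holding 10 seconds of audio followed by 20 seconds of padding. Each window would then be passed to the encoder separately.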
OpenAI Whisper Features