Free AI Audio Transcription
Convert speech to text in 99 languages, right in your browser. Drop in an audio or video file, get a transcript with timestamps, and download it as .txt, .srt, or .vtt. No upload. No sign-up. No file-size limit, though recordings are capped at 60 minutes.
Drop an audio or video file, or click to browse. Works best with files under 30 minutes on most browsers. Files are capped at 60 minutes, so split longer ones first with our Audio Splitter.
Don't have a file? Record one with our voice recorder to test how transcription works.
Runs 100% in your browser. Audio stays on your device. The Whisper AI model downloads once (~40 MB) from a public CDN, then runs locally for every transcription. We can't access your audio because it never leaves your computer.
Free to use. Keep this tab open while transcription runs; we'll chime if you switch tabs. The model is cached after the first download. Need translation? Use the dedicated Audio Translator.
Free, private AI audio transcription — how it works
SnipSound's transcription tool uses OpenAI's open-source Whisper speech-recognition model running entirely in your browser via WebAssembly. The first time you click Transcribe, your browser downloads a ~40 MB model file from a public CDN; after that, every transcription is fully local. Your audio file never gets uploaded to any server — not ours, not OpenAI's, not anyone's.
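The download-once behaviour described above can be pictured as a cache-first lookup. The sketch below is illustrative only and is not SnipSound's actual code: `ModelStore`, `MemoryStore`, `loadModelOnce`, and the injected `download` function are hypothetical names, and a real page would more likely persist the file with the browser's Cache API or IndexedDB than with an in-memory map.

```typescript
// Hypothetical storage interface; in a browser it could wrap the Cache API or IndexedDB.
interface ModelStore {
  get(key: string): Promise<Uint8Array | undefined>;
  put(key: string, bytes: Uint8Array): Promise<void>;
}

// In-memory stand-in so the sketch runs anywhere.
class MemoryStore implements ModelStore {
  private items = new Map<string, Uint8Array>();

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.items.get(key);
  }

  async put(key: string, bytes: Uint8Array): Promise<void> {
    this.items.set(key, bytes);
  }
}

// The network call is injected so the caching logic can be exercised offline.
type Downloader = (url: string) => Promise<Uint8Array>;

// Return the stored model if present; otherwise download it once and store it.
async function loadModelOnce(
  url: string,
  store: ModelStore,
  download: Downloader
): Promise<Uint8Array> {
  const cached = await store.get(url);
  if (cached !== undefined) {
    return cached; // later transcriptions take this path: no network involved
  }
  const bytes = await download(url); // first run only
  await store.put(url, bytes);
  return bytes;
}
```

Only the model file passes through `download` in this sketch. The audio itself never appears in it, which matches the claim that nothing is uploaded.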
What it's good for
- Transcribing podcast interviews, meeting recordings, voice memos, lectures, or any clear-speech audio.
- Generating subtitles for video — .srt and .vtt downloads with accurate timestamps that drop into YouTube, Vimeo, or any editor.
- Quick rough transcripts for journalists, researchers, students, and content creators who don't want to pay for Otter or Rev.
- Privacy-sensitive audio you don't want on a third-party server — therapy notes, confidential interviews, internal meetings.
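For the subtitle use case above, the two download formats differ mainly in their timestamp punctuation and header: SRT numbers each cue and writes milliseconds after a comma, while WebVTT starts with a `WEBVTT` line and uses a period. A minimal sketch of that conversion follows, assuming the transcript arrives as segments with start and end times in seconds. The `Segment` shape and the function names are hypothetical, not the tool's real internals.

```typescript
// One transcript segment; times are in seconds. This shape is an assumption.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Format seconds as HH:MM:SS<sep>mmm. SRT uses "," before the milliseconds, WebVTT uses ".".
function formatTimestamp(seconds: number, msSeparator: "," | "."): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${msSeparator}${pad(ms, 3)}`;
}

// SRT: numbered cues separated by a blank line.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) => {
      const range = `${formatTimestamp(seg.start, ",")} --> ${formatTimestamp(seg.end, ",")}`;
      return `${i + 1}\n${range}\n${seg.text.trim()}\n`;
    })
    .join("\n");
}

// WebVTT: a "WEBVTT" header, a blank line, then cues separated by blank lines.
function toVtt(segments: Segment[]): string {
  const cues = segments.map((seg) => {
    const range = `${formatTimestamp(seg.start, ".")} --> ${formatTimestamp(seg.end, ".")}`;
    return `${range}\n${seg.text.trim()}\n`;
  });
  return "WEBVTT\n\n" + cues.join("\n");
}
```

For example, `formatTimestamp(3661.5, ",")` yields `01:01:01,500`, the form an SRT cue expects.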
What it's not so good for
- Heavy background noise, music behind the voice, or multiple overlapping speakers. Whisper-tiny struggles with all of these.
- Heavy accents or non-mainstream dialects — bigger Whisper models handle these better but are too heavy for a browser.
- Speaker diarization ("who said what"), which Whisper does not provide at any model size.
- Files longer than 60 minutes — we cap input length to keep browser RAM under control.
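Some rough arithmetic shows why a length cap helps with memory. Whisper models take 16 kHz mono audio as input. The sketch below additionally assumes the decoded samples are held as 32-bit floats, which is common in browsers but is an assumption here, not a documented detail of this tool. The constant and function names are hypothetical; the 30- and 60-minute thresholds come from the limits stated on this page.

```typescript
const SAMPLE_RATE_HZ = 16000;    // Whisper's expected input rate (mono)
const BYTES_PER_SAMPLE = 4;      // assumption: samples stored as 32-bit floats
const RECOMMENDED_MAX_MIN = 30;  // "best with files under 30 min"
const HARD_CAP_MIN = 60;         // the stated input cap

// Size of the decoded audio buffer alone, excluding the model and the original file.
function decodedAudioBytes(durationMinutes: number): number {
  return durationMinutes * 60 * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE;
}

type DurationVerdict = "ok" | "over-recommended" | "too-long";

// Classify a recording against the page's recommended and hard limits.
function checkDuration(durationMinutes: number): DurationVerdict {
  if (durationMinutes > HARD_CAP_MIN) return "too-long";
  if (durationMinutes > RECOMMENDED_MAX_MIN) return "over-recommended";
  return "ok";
}
```

Under these assumptions, `decodedAudioBytes(60)` is 230,400,000 bytes, roughly 230 MB of raw samples before the model allocates anything. A 30-minute file needs about half that.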
Translate audio to English
Tick "Translate to English" and Whisper renders non-English audio in any supported language as English text. Spanish podcast → English transcript. Mandarin interview → English notes. If translation is your primary need, use the dedicated Audio Translator tool.