Free Speech to Text Online

Convert speech to text with AI-powered transcription. Upload audio files, record with your microphone, or paste a URL. 100+ languages, 10+ models, 98%+ accuracy.

Works with public audio and video files. DRM-protected content is not supported.

Private transcript

Chat with transcript

Unlock with Pro →

MP3, WAV, M4A, FLAC, MP4, MKV, MOV, WebM - up to 2 GB

Multi-file upload with Pro

Turn speech into text in real time. The AI corrects itself automatically as you speak, and accuracy improves the longer you talk.

Test your microphone first

10 free minutes per day, 600 free minutes with sign-up, no credit card, encrypted

Sign up →

How speech to text works →

1. Voice recording

Upload an audio or video file, paste a URL, or record speech from your microphone.

2. AI converts speech to text

Choose from 10+ AI models. Speaker detection and automatic language detection are included.

3. Export your transcript

Download in 6 formats. Share transcript links with audio playback.

Supported speech to text formats

MP3 WAV M4A FLAC OGG MP4 MKV MOV WebM AVI

Speech to text models

Select the AI model that fits your needs, or let us choose the best one for you.

Speech to text in 100+ languages

English Spanish French German Japanese Arabic Hindi Portuguese Russian Korean All languages →

Speech to text converter

Start free →

Frequently asked questions

Speech to text (also called speech recognition or ASR) automatically converts spoken audio into written words. STT.ai runs your recording through an AI model that listens to the audio and produces editable text with timestamps and speaker labels, so no typing is needed.

An acoustic model turns the sound waveform into phonemes, then a language model assembles them into the most probable words and punctuation. STT.ai does this on GPUs with models such as Whisper Large V3 and NVIDIA Canary, so a one-hour recording is typically done in 2-3 minutes.
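To put the quoted turnaround in perspective, the speed-up over real time can be worked out directly. This is only a back-of-the-envelope sketch in Python; the function name is illustrative and not part of any STT.ai tooling.

```python
def speedup_over_real_time(audio_minutes: float, processing_minutes: float) -> float:
    """Return how many times faster than real time the audio was processed."""
    if processing_minutes <= 0:
        raise ValueError("processing_minutes must be positive")
    return audio_minutes / processing_minutes


# A one-hour recording finished in 2 to 3 minutes:
fastest = speedup_over_real_time(60, 2)
slowest = speedup_over_real_time(60, 3)
print(f"{slowest:.0f}x to {fastest:.0f}x faster than real time")
```

For the figures above this works out to roughly 20 to 30 times faster than real time.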

On clean speech our best models reach 95-97% accuracy (a 3-5% Word Error Rate on benchmarks). Accuracy drops with background noise, heavy accents, crosstalk, or low-bitrate audio — using a decent microphone and a quiet room makes the biggest difference.
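Word Error Rate is commonly computed as the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal illustration of that definition, not STT.ai's benchmark code; the normalisation (lowercasing and whitespace splitting) is an assumption, and real benchmarks usually normalise punctuation and numbers as well.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# 20 reference words with one misheard word ("signed" heard as "sign")
reference = ("please send the signed contract to the finance team before friday "
             "so that we can close the quarter on time")
hypothesis = ("please send the sign contract to the finance team before friday "
              "so that we can close the quarter on time")
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}, accuracy: {1 - wer:.1%}")  # WER: 5.0%, accuracy: 95.0%
```

One wrong word in twenty gives a 5% WER, which is the lower end of the 95-97% accuracy range quoted above.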

Yes, you can convert speech to text in real time. Speak into your microphone and STT.ai streams the transcript live via the live-transcription tool. You can also upload a finished recording for batch transcription if you don't need it word-by-word as you talk.

STT.ai recognizes 100+ languages and auto-detects the spoken language for most audio. You can also set the language manually for a small accuracy lift, and mixed-language recordings are handled by switching mid-clip.

Yes, speech to text can identify who is speaking. Speaker diarization labels each voice (Speaker 1, Speaker 2, ...) and you can rename them in the editor. This works across every supported model and language.

STT.ai accepts 20+ formats including MP3, WAV, M4A, FLAC, OGG, MP4, MKV, MOV, WebM, and AVI. Transcripts can be exported as TXT, SRT, VTT, DOCX, JSON, or PDF.
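To show what a subtitle export involves, here is a small sketch that renders timestamped, speaker-labelled segments as SRT cues. The segment structure (`start`, `end`, `speaker`, `text`) is an assumption made for illustration and is not STT.ai's actual JSON schema; only the SRT cue layout itself (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) is standard.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp style used by SRT."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render segments (start, end, speaker, text) as numbered SRT cues."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['speaker']}: {seg['text']}\n"
        )
    # A blank line separates consecutive cues
    return "\n".join(cues)


# Hypothetical segments, as a transcript with timestamps and speaker labels might provide
segments = [
    {"start": 0.0, "end": 2.5, "speaker": "Speaker 1", "text": "Hello, thanks for joining."},
    {"start": 2.5, "end": 5.04, "speaker": "Speaker 2", "text": "Happy to be here."},
]
print(to_srt(segments))
```

VTT uses nearly the same cue layout, with a `WEBVTT` header line and a period instead of a comma before the milliseconds.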

Speech to text transcribes WHAT was said into words; voice recognition (speaker identification) determines WHO said it. STT.ai does both — transcription plus speaker diarization — but the terms describe different tasks.

Yes, your audio is private. Audio is processed and then deleted by default. Pro plans add client-side encryption so transcripts are unreadable without your key, even to STT.ai, and your data is never used for model training without explicit opt-in.

Yes, developers can add speech to text via an API. STT.ai has a REST API with Python and Node.js SDKs plus an MCP server for Claude and Cursor. The free API tier includes 100 minutes/month, with per-second billing beyond that.
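As a rough sketch of how a monthly free allowance combined with per-second billing might play out, the Python below totals recording durations and subtracts the 100 free minutes. The price per second is a placeholder parameter, since no rate is given here, and the assumption that durations are counted in whole seconds is mine rather than a documented rule.

```python
FREE_SECONDS_PER_MONTH = 100 * 60  # 100 free minutes per month, as stated above


def billable_seconds(durations_seconds: list[int]) -> int:
    """Seconds left to bill once the monthly free allowance is used up."""
    total = sum(durations_seconds)
    return max(0, total - FREE_SECONDS_PER_MONTH)


def estimated_cost(durations_seconds: list[int], price_per_second: float) -> float:
    """Monthly cost; price_per_second is a placeholder, not a published rate."""
    return billable_seconds(durations_seconds) * price_per_second


# Three recordings in one month: 60 min, 30 min, and 12 min 34 s.
month = [3600, 1800, 754]
print(billable_seconds(month))  # 154
```

With per-second billing, only the 154 seconds beyond the allowance would be charged, rather than a rounded-up three minutes.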

Yes, you can edit the text after transcription. Every transcript opens in a built-in editor where you can fix misheard words, rename speakers, adjust timestamps, and add notes. Edits persist across every export format.

What is speech to text?

How does speech to text work?

Is STT.ai speech to text free?

How accurate is speech to text?

Can I convert speech to text in real time?

What languages does speech to text support?

Does speech to text identify who is speaking?

What audio and video formats can I convert to text?

Is speech to text the same as voice recognition?

Is my audio private when I use speech to text?

Can developers add speech to text via an API?

Can I edit the text after speech to text?