Drag and drop any audio or video file (MP3, WAV, MP4, and 20+ more formats). Record from your microphone in real time. Or paste a link from YouTube, Vimeo, TikTok, and 1,300+ other platforms.

2. AI transcribes with your choice of model

Choose from 10+ AI models, including Whisper, NVIDIA Canary (#1 in accuracy), and Moonshine. The language is detected automatically from 100+ options. Speaker diarization identifies who said what.

3. Export, share, or integrate

Download as TXT, SRT, VTT, DOCX, JSON, or PDF. Share via link. Use our API to integrate transcription into your application. Perfect for subtitles, meeting notes, podcasts, and more.
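To make the SRT export concrete, here is a minimal sketch of how timed transcript segments could be turned into SRT text. The field names `start`, `end` (in seconds), and `text` are assumptions for illustration; the page does not document the exact shape of a transcript segment.

```python
def format_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text from segments with 'start', 'end' (seconds) and 'text'."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{format_srt_time(seg['start'])} --> {format_srt_time(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + "\n"


# Hypothetical sample data; real transcripts may use different field names.
sample = [
    {"start": 0.0, "end": 2.5, "text": "Welcome to the meeting."},
    {"start": 2.5, "end": 5.04, "text": "Let's get started."},
]
print(segments_to_srt(sample))
```

Each SRT cue is a running number, a timing line, and the cue text, with a blank line between cues.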

Popular use cases

All use cases →

Class Notes & Study Guides

Legal


Everything you need for audio and video

70+ free AI-powered tools

Speech to Text

Transcribe audio and video files

Live Transcription

Real-time microphone transcription

YouTube

Extract subtitles from any video

Subtitle Editor

Edit SRT and VTT files online

Noise Remover

Remove background noise from audio

Audio Converter

MP3, WAV, FLAC, OGG, AAC, and more

Vocal Remover

Isolate or remove vocals

Audio Trimmer

Cut and trim audio files

Subtitle Converter

SRT, VTT, SSA, SBV formats


Text to Speech

Convert text to natural-sounding speech

Subtitle Translator

Translate subtitles into 100+ languages

View all 70+ tools →

100+

Languages

70+

Free tools

1,300+

Supported platforms

Export formats

Developer API

Integrate speech to text into your application in a few minutes. RESTful API with real-time WebSocket streaming.

REST + WebSocket: upload files and stream in real time

Multiple models: Whisper, Canary, Enhanced, and more

Speaker diarization: automatically detect who said what

Flexible output: JSON, TXT, SRT, VTT with word-level timestamps

Documentation API Playground

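import os

import requests

# Read the API key from the environment (the variable name is an assumption).
API_KEY = os.environ["STT_API_KEY"]

# Upload the audio file and request a diarized JSON transcript.
with open("meeting.mp3", "rb") as audio:
    response = requests.post(
        "https://api.stt.ai/v1/transcribe",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": audio},
        data={
            "model": "large-v3-turbo",
            "language": "auto",
            "diarize": "true",
            "response_format": "json",
        },
    )
response.raise_for_status()  # Fail early on HTTP errors.

result = response.json()
# Print each segment prefixed with its speaker label.
for seg in result["segments"]:
    print(f"{seg['speaker']}: {seg['text']}")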

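import { readFile } from "node:fs/promises";

// Read the API key from the environment (the variable name is an assumption).
const API_KEY = process.env.STT_API_KEY;

// Node's built-in FormData expects a Blob, not a read stream.
const audio = new Blob([await readFile("meeting.mp3")]);

const form = new FormData();
form.append("file", audio, "meeting.mp3");
form.append("model", "large-v3-turbo");
form.append("language", "auto");
form.append("diarize", "true");

const res = await fetch("https://api.stt.ai/v1/transcribe", {
  method: "POST",
  headers: { Authorization: `Bearer ${API_KEY}` },
  body: form,
});

// Print each segment prefixed with its speaker label.
const { segments } = await res.json();
segments.forEach(s =>
  console.log(`${s.speaker}: ${s.text}`)
);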
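Diarized output usually arrives as many short segments, so a common follow-up step is merging consecutive segments from the same speaker into one turn. The sketch below assumes only the `speaker` and `text` fields that the examples above read from `result["segments"]`; the sample data is hypothetical.

```python
from itertools import groupby


def merge_speaker_turns(segments: list[dict]) -> list[dict]:
    """Merge consecutive segments from the same speaker into one turn."""
    turns = []
    for speaker, group in groupby(segments, key=lambda seg: seg["speaker"]):
        text = " ".join(seg["text"].strip() for seg in group)
        turns.append({"speaker": speaker, "text": text})
    return turns


# Hypothetical sample in the shape the examples above iterate over.
segments = [
    {"speaker": "Speaker 1", "text": "Hi everyone."},
    {"speaker": "Speaker 1", "text": "Thanks for joining."},
    {"speaker": "Speaker 2", "text": "Happy to be here."},
    {"speaker": "Speaker 1", "text": "Let's begin."},
]
for turn in merge_speaker_turns(segments):
    print(f"{turn['speaker']}: {turn['text']}")
```

`groupby` only merges adjacent items, so a speaker who talks again later starts a new turn, which is what meeting notes normally want.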

Switching from another speech-to-text service?

STT.ai vs Otter.ai STT.ai vs TurboScribe STT.ai vs Fireflies STT.ai vs Rev Compare all →

Simple, transparent pricing

Start free. Scale as you grow.

Free

$0/month

600 min/month

Languages
Export
API access

Starter

$9/month

3,000 min/month

Languages
All models
All export formats

Popular

Pro

$19/month

7,500 min/month

Private transcripts
Unlimited team seats
Priority

Business

$39/month

20,000 min/month

Everything in Pro
Unlimited AI chat

View all plans and pricing →
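For comparing tiers, the monthly price and included minutes listed above can be reduced to an effective price per hour of audio. This is a rough sketch that assumes the whole monthly allowance is used; it ignores overage or annual billing, which this page does not describe.

```python
# Monthly price in USD and included minutes, taken from the plan list above.
plans = {
    "$0 plan": (0, 600),
    "$9 plan": (9, 3_000),
    "$19 plan": (19, 7_500),
    "$39 plan": (39, 20_000),
}


def cost_per_hour(price_usd: float, minutes: int) -> float:
    """Effective price of one hour of audio if the whole allowance is used."""
    return price_usd / minutes * 60


for name, (price, minutes) in plans.items():
    print(f"{name}: ${cost_per_hour(price, minutes):.3f} per audio hour")
```

Under that assumption the $9 tier works out to about $0.18 per audio hour, the $19 tier to about $0.152, and the $39 tier to about $0.117.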

Supported languages

All 100+ languages →

English Spanish French German Japanese Chinese Arabic Hindi Portuguese Russian Korean Italian Turkish Dutch Polish +85 more

Transcribe

Upload your first file for free. No credit card, no sign-up. 600 minutes per month on the free plan.

Frequently asked questions

Speech to text runs in your browser: paste a URL, upload a file, or record from your mic. STT.ai picks the AI model and returns the transcript in under 5 minutes. Export as TXT, SRT, VTT, DOCX, JSON, or PDF.

Yes. Every visitor gets 600 free minutes per month on STT.ai, and speech to text draws on them like any other workflow. Paid plans starting at $5/month unlock longer files, private transcripts, and priority queueing.

Speech to text runs on the same AI models as the rest of STT.ai. Our best models reach 95-97% accuracy on clean speech (3-5% Word Error Rate on benchmarks). Switch models on the fly if the first pass falls below your target.
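Word Error Rate is the number of word substitutions, deletions, and insertions needed to turn the transcript into a reference text, divided by the number of words in the reference. The sketch below computes it with a word-level edit distance. It is an illustration of the metric, not STT.ai's benchmark code, and the lowercasing and whitespace splitting are simplifying assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] is the edit distance between the ref words seen so far and hyp[:j].
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


reference = "the quick brown fox jumps over the lazy dog today"
hypothesis = "the quick brown fox jumped over the lazy dog"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.0%}, word accuracy: {1 - wer:.0%}")
```

Word accuracy is taken here as 1 minus WER, which is how a 3-5% WER corresponds to 95-97% accuracy.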

Speech to text can run on any of STT.ai's 10+ models: STT.ai Enhanced (most accurate), Whisper Large V3 (99 languages), NVIDIA Canary (#1 WER on supported languages), Whisper Turbo (fast), Moonshine (lightweight), and more.

Yes. Every transcript exports as SRT or VTT — works with YouTube, Vimeo, TikTok, VLC, and every major video player. The burn-subtitles tool overlays them onto video as hardsubs.
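SRT and VTT are close relatives: WebVTT adds a `WEBVTT` header and uses a dot instead of a comma before the milliseconds. A minimal conversion sketch, assuming plain SRT input without styling tags or positioning:

```python
import re

# Matches SRT timestamps such as 00:01:02,500 and captures the parts around the comma.
SRT_TIMESTAMP = re.compile(r"(\d{2,}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert plain SRT subtitle text to WebVTT."""
    lines = []
    for line in srt_text.replace("\r\n", "\n").strip().split("\n"):
        # Only rewrite timing lines, so commas inside cue text are left alone.
        if "-->" in line:
            line = SRT_TIMESTAMP.sub(r"\1.\2", line)
        lines.append(line)
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"


srt = (
    "1\n00:00:00,000 --> 00:00:02,500\nWelcome, everyone.\n\n"
    "2\n00:00:02,500 --> 00:00:05,040\nLet's get started.\n"
)
print(srt_to_vtt(srt))
```

The SRT cue numbers are kept because WebVTT accepts them as optional cue identifiers.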

Yes. Speaker diarization automatically labels each voice (Speaker 1, Speaker 2, ...) and you can rename them in the built-in editor. Works across all models and languages.

Most speech to text jobs finish in under 5 minutes. A 1-hour audio file typically completes in 2-3 minutes with our fastest models. Speed depends on chosen model and current GPU load.

Speech to text accepts 20+ formats, including MP3, WAV, M4A, FLAC, OGG, MP4, MKV, MOV, WebM, and AVI. Output to TXT, SRT, VTT, DOCX, JSON, or PDF.

Yes. Audio files submitted to speech to text are processed and deleted by default. Pro plans add client-side encryption — even if STT.ai's database is breached, your transcripts are unreadable without your key. Data is never used for model training without explicit opt-in.

Yes. STT.ai offers a REST API with Python and Node.js SDKs, plus an MCP server for Claude and Cursor — all usable for speech to text workflows. Free API tier includes 100 minutes/month.

Yes. Every transcript opens in the built-in editor where you can correct words, rename speakers, adjust timestamps, and add notes. All changes save automatically.

Every transcript gets a unique shareable URL. Export to DOCX or PDF for email. Pro plans add password-protected and permanent links — useful for client work.

STT.ai handles 1,300+ platforms including YouTube, Vimeo, TikTok, SoundCloud, Zoom, Google Meet, podcast hosts, and more. URL transcription works with publicly-available content only — DRM-protected sources can't be transcribed.

Free AI Speech to Text

Speech-to-Text Models

How STT.ai Works

1. Upload, record, or paste a URL

2. AI transcribes with your choice of model

3. Export, share, or integrate

Popular use cases

Everything you need for audio and video

Developer API

Simple, transparent pricing

Supported languages

Transcribe

Frequently asked questions

How does speech to text work on STT.ai?

Is speech to text free?

How accurate is speech to text?

What AI models can I use for speech to text?

Can I get subtitles from speech to text?

Does speech to text detect different speakers?

How long does speech to text take?

What input formats does speech to text support?

Is my audio private when I use speech to text?

Is there a speech to text API?

Can I edit a speech to text transcript after?

How do I share what speech to text produces?

What other platforms work beyond speech to text?