Drag and drop any audio or video file (MP3, WAV, MP4, and more; 20+ formats). Record from your microphone in real time. Or paste a link from YouTube, Vimeo, TikTok, or any of 1,300+ platforms.

2. AI Transcribes with Your Choice of Model

Choose from 10+ AI models, including Whisper, NVIDIA Canary (#1 accuracy), and Moonshine. Auto-detect the language from 100+ options. Speaker diarization identifies who said what.

3. Export, Share, or Integrate

Download as TXT, SRT, VTT, DOCX, JSON, or PDF. Share via link. Use our API to integrate transcription into your app. Perfect for subtitles, meeting notes, podcasts, and more.

Use Cases

All Use Cases →

Meetings

Meeting notes and action items

Class notes and study guides

Legal

Depositions

Everything You Need for Audio and Video

70+ free AI-powered tools

Speech to Text

Transcribe audio and video files

Live Transcription

Real-time microphone transcription

YouTube Transcripts

Extract subtitles from any video

Subtitle Editor

Edit SRT and VTT files online

Noise Remover

Remove background noise from audio

Audio Converter

MP3, WAV, FLAC, OGG, AAC, and more

Vocal Remover

Isolate or remove vocals

Audio Trimmer

Cut and trim audio files

Subtitle Converter

SRT, VTT, SSA, SBV, and similar formats

Translate subtitles into 100+ languages

View all 70+ tools →

100+

Supported Languages

70+

Free Tools

1,300+

Supported Platforms

Export

Developer API

Integrate speech to text into your app in minutes, using a RESTful API with real-time WebSocket streaming.

REST + WebSocket: upload files and stream in real time

Multiple models: Whisper, Canary, Enhanced, and more

Speaker diarization: automatically detect who said what

Output formats: JSON, TXT, SRT, and VTT with word-level timestamps
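
To make the word-level timestamps concrete, here is a minimal Python sketch that groups timestamped words into SRT cues. The word dictionaries (`word`, `start`, `end`, with times in seconds) and both helper names are hypothetical: the response schema is not documented on this page, so the field names would need adapting to whatever the API actually returns.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: list[dict], max_words: int = 7) -> str:
    """Group timestamped words into numbered SRT cues of at most max_words words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = srt_timestamp(chunk[0]["start"])
        end = srt_timestamp(chunk[-1]["end"])
        text = " ".join(w["word"] for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)


# Hypothetical word-level output; real field names may differ.
words = [
    {"word": "Hello", "start": 0.0, "end": 0.4},
    {"word": "everyone", "start": 0.45, "end": 1.1},
    {"word": "welcome", "start": 1.3, "end": 1.8},
    {"word": "back", "start": 1.85, "end": 2.2},
]
print(words_to_srt(words, max_words=2))
```

Splitting on a fixed word count keeps the sketch short; a real subtitle pipeline would more likely break cues on pauses or punctuation.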

Docs Playground

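import os

import requests

# Read the API key from the environment (the variable name is illustrative).
API_KEY = os.environ["STT_API_KEY"]

# Open the file in a context manager so the handle is closed after the upload.
with open("meeting.mp3", "rb") as audio:
    response = requests.post(
        "https://api.stt.ai/v1/transcribe",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": audio},
        data={
            "model": "large-v3-turbo",
            "language": "auto",
            "diarize": "true",
            "response_format": "json",
        },
    )

# Raise on HTTP errors instead of parsing an error body as a transcript.
response.raise_for_status()

# Print each diarized segment as "speaker: text".
result = response.json()
for seg in result["segments"]:
    print(f"{seg['speaker']}: {seg['text']}")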

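// Node.js 18+ as an ES module, so top-level await is available.
import { readFile } from "node:fs/promises";

// Read the API key from the environment (the variable name is illustrative).
const API_KEY = process.env.STT_API_KEY;

// The built-in FormData expects a Blob or File, not a read stream.
const form = new FormData();
form.append("file", new Blob([await readFile("meeting.mp3")]), "meeting.mp3");
form.append("model", "large-v3-turbo");
form.append("language", "auto");
form.append("diarize", "true");

const res = await fetch("https://api.stt.ai/v1/transcribe", {
  method: "POST",
  headers: { Authorization: `Bearer ${API_KEY}` },
  body: form,
});
if (!res.ok) throw new Error(`Transcription failed: HTTP ${res.status}`);

// Print each diarized segment as "speaker: text".
const { segments } = await res.json();
segments.forEach((s) => console.log(`${s.speaker}: ${s.text}`));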
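
Building on the `segments` list used in the samples above, here is a short Python sketch of one possible post-processing step: merging consecutive segments from the same speaker into readable turns and optionally renaming speaker labels. It relies only on the `speaker` and `text` fields shown above; the sample data and the helper name `merge_by_speaker` are illustrative assumptions, not part of the API.

```python
def merge_by_speaker(segments, names=None):
    """Merge consecutive segments that share a speaker into (speaker, text) turns.

    names optionally maps raw speaker labels to display names.
    """
    names = names or {}
    turns = []
    for seg in segments:
        speaker = names.get(seg["speaker"], seg["speaker"])
        text = seg["text"].strip()
        if turns and turns[-1][0] == speaker:
            # Same speaker is still talking: extend the current turn.
            turns[-1] = (speaker, f"{turns[-1][1]} {text}")
        else:
            turns.append((speaker, text))
    return turns


# Illustrative data in the shape printed by the examples above.
segments = [
    {"speaker": "Speaker 1", "text": "Thanks for joining."},
    {"speaker": "Speaker 1", "text": "Let's start with the roadmap."},
    {"speaker": "Speaker 2", "text": "Sounds good."},
]

for speaker, text in merge_by_speaker(segments, names={"Speaker 1": "Alice"}):
    print(f"{speaker}: {text}")
```

Returning plain tuples keeps the result easy to feed into a TXT export or a meeting-notes template.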

Switching from another speech-to-text service?

STT.ai vs Otter.ai
STT.ai vs TurboScribe
STT.ai vs Fireflies
STT.ai vs Rev
Compare all →

Simple, Transparent Pricing

Start free. Scale as you grow.

Free

$0/month

600 min/month

Languages
TXT and SRT export
API access

Starter

$9/month

3,000 min/month

100+ languages
All AI models
All export formats

Most Popular

Pro

$19/month

7,500 min/month

Private transcripts
Unlimited team workspaces
Priority processing

Business

$39/month

20,000 min/month

Everything in Pro
50K min storage
Unlimited AI chat

View all plans and pricing →

Supported Languages

100+ languages →

English, Spanish, French, German, Japanese, Chinese, Arabic, Hindi, Portuguese, Russian, Korean, Italian, Turkish, Dutch, Polish, +85 more

Get Started

Upload your first file for free. No credit card required, no sign-up required. The free plan includes 600 minutes per month.

Frequently Asked Questions

How does speech to text work on STT.ai?

Speech to text runs in your browser: paste a URL, upload a file, or record from your mic. STT.ai picks the AI model and typically returns the transcript in under 5 minutes. Export as TXT, SRT, VTT, DOCX, JSON, or PDF.

Is speech to text free?

Yes. Every visitor gets 600 free minutes per month on STT.ai, usable for speech to text the same as for any other workflow. Paid plans starting at $5/month unlock longer files, private transcripts, and priority queueing.

How accurate is speech to text?

Speech to text runs on the same AI models as the rest of STT.ai. Our best models reach 95-97% accuracy on clean speech (a 3-5% Word Error Rate on benchmarks). You can switch models on the fly if the first pass falls below your target.

What AI models can I use for speech to text?

Speech to text can run on any of STT.ai's 10+ models, including STT.ai Enhanced (most accurate), Whisper Large V3 (99 languages), NVIDIA Canary (#1 WER on supported languages), Whisper Turbo (fast), and Moonshine (lightweight).

Can I get subtitles from speech to text?

Yes. Every transcript exports as SRT or VTT, which work with YouTube, Vimeo, TikTok, VLC, and every major video player. The burn-subtitles tool overlays them onto video as hardsubs.

Does speech to text detect different speakers?

Yes. Speaker diarization automatically labels each voice (Speaker 1, Speaker 2, ...), and you can rename the speakers in the built-in editor. It works across all models and languages.

How long does speech to text take?

Most speech to text jobs finish in under 5 minutes. A 1-hour audio file typically completes in 2-3 minutes with our fastest models. Speed depends on the chosen model and current GPU load.

What input formats does speech to text support?

Speech to text accepts 20+ formats, including MP3, WAV, M4A, FLAC, OGG, MP4, MKV, MOV, WebM, and AVI. Output is available as TXT, SRT, VTT, DOCX, JSON, or PDF.

Is my audio private when I use speech to text?

Yes. Audio files submitted to speech to text are processed and deleted by default. Pro plans add client-side encryption, so even if STT.ai's database is breached, your transcripts are unreadable without your key. Data is never used for model training without explicit opt-in.

Is there a speech to text API?

Yes. STT.ai offers a REST API with Python and Node.js SDKs, plus an MCP server for Claude and Cursor, all usable for speech to text workflows. The free API tier includes 100 minutes/month.

Can I edit a speech to text transcript after?

Yes. Every transcript opens in the built-in editor, where you can correct words, rename speakers, adjust timestamps, and add notes. All changes save automatically.

How do I share what speech to text produces?

Every transcript gets a unique shareable URL. Export to DOCX or PDF for email. Pro plans add password-protected and permanent links, which are useful for client work.

What other platforms work beyond speech to text?

STT.ai handles 1,300+ platforms, including YouTube, Vimeo, TikTok, SoundCloud, Zoom, Google Meet, podcast hosts, and more. URL transcription works with publicly available content only; DRM-protected sources can't be transcribed.
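
The accuracy answer in the FAQ equates 95-97% accuracy with a 3-5% Word Error Rate. As a hedged illustration of how that metric is usually computed, the Python sketch below uses the standard word-level edit-distance definition. It is not STT.ai's benchmark code, the function name `word_error_rate` is made up for this example, and real benchmarks normally apply more text normalization than lowercasing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# A 20-word reference with one word misrecognized in the hypothesis.
reference = "the quick brown fox jumps over the lazy dog " * 2 + "and rests"
hypothesis = reference.replace("lazy", "hazy", 1)

wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}, accuracy: {1 - wer:.1%}")
```

One wrong word in a 20-word reference gives a 5% WER, which corresponds to the 95% end of the quoted accuracy range.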