Drag and drop any audio or video file (MP3, WAV, MP4, and 20+ formats). Record from your microphone in real time. Or paste a link from YouTube, Vimeo, TikTok, and 1,300+ platforms.

2. AI transcribes with your choice of model

Choose from 10+ AI models, including Whisper, NVIDIA Canary (#1 accuracy), and Moonshine. Automatically detect the language from 100+ options. Speaker diarization identifies who said what.

3. Export, share, or integrate

Download as TXT, SRT, VTT, DOCX, JSON, or PDF. Share via link. Use our API to integrate transcription into your application. Perfect for subtitles, meeting notes, podcasts, and more.

Popular use cases

All use cases →

Meetings

Meeting notes and action items

Podcast

Transcripts and show notes

Subtitles

SRT, VTT, and more

Medical

Secure transcription

Lectures

Class notes & study guides

Legal

Depositions and court

Everything you need for audio and video

70+ free AI-powered tools

Speech to text

Transcribe audio and video files

Live Transcription

Real-time microphone transcription

YouTube video

Extract subtitles from any video

Subtitle editor

Edit SRT and VTT files online

Noise removal

Remove background noise from audio

Audio Converter

MP3, WAV, FLAC, OGG, AAC, and more

Voice Remover

Isolate vocals or remove them

Audio trimmer

Cut and trim audio files

Subtitle converter

SRT, VTT, SSA, SBV formats

Meeting minutes

Extract action items

Text to speech

Convert text to natural speech

Subtitle translator

Translate subtitles into 100+ languages

View all tools →

100+

Supported languages

70+

Free tools

1,300+

Supported platforms

Export

Developer-First API

Integrate speech-to-text into your app in minutes. RESTful API with real-time WebSocket streaming.

REST + WebSocket: file upload and real-time streaming

Multiple models: Whisper, Canary, Enhanced, and more

Diarization: automatically detects who said what

Flexible output: JSON, TXT, SRT, VTT with word-level timestamps

API documentation Playground

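# Python: transcribe a local file with speaker diarization
import os

import requests

# Assumes the key is stored in an environment variable; the name is our choice.
API_KEY = os.environ["STT_API_KEY"]

with open("meeting.mp3", "rb") as audio:  # close the file once uploaded
    response = requests.post(
        "https://api.stt.ai/v1/transcribe",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": audio},
        data={
            "model": "large-v3-turbo",
            "language": "auto",
            "diarize": "true",
            "response_format": "json",
        },
    )
response.raise_for_status()  # fail loudly on HTTP errors

# Print each diarized segment as "speaker: text"
result = response.json()
for seg in result["segments"]:
    print(f"{seg['speaker']}: {seg['text']}")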

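// Node.js 18+: same request using the built-in fetch, FormData, and Blob
import { readFile } from "node:fs/promises";

// Assumes the key is stored in an environment variable; the name is our choice.
const API_KEY = process.env.STT_API_KEY;

// The built-in FormData takes a Blob rather than a read stream
const form = new FormData();
form.append("file", new Blob([await readFile("meeting.mp3")]), "meeting.mp3");
form.append("model", "large-v3-turbo");
form.append("language", "auto");
form.append("diarize", "true");

const res = await fetch("https://api.stt.ai/v1/transcribe", {
  method: "POST",
  headers: { Authorization: `Bearer ${API_KEY}` },
  body: form,
});
if (!res.ok) throw new Error(`Transcription failed: HTTP ${res.status}`);

// Print each diarized segment as "speaker: text"
const { segments } = await res.json();
segments.forEach(s =>
  console.log(`${s.speaker}: ${s.text}`)
);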
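
The samples above print each segment as "speaker: text". If you request JSON and want subtitles on your side, the conversion to SRT could look roughly like the sketch below. It assumes each segment also carries `start` and `end` times in seconds; those two field names are hypothetical and are not confirmed by the samples above, which only show `speaker` and `text`. The helper names are ours as well.

```python
from typing import Iterable, Mapping


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Mapping]) -> str:
    """Build SRT text from diarized segments.

    Assumes "start" and "end" keys in seconds (hypothetical field names)
    next to the "speaker" and "text" keys used in the samples above.
    """
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['speaker']}: {seg['text'].strip()}\n"
        )
    # Each cue ends with a newline, so joining with "\n" leaves the blank
    # line that SRT expects between cues.
    return "\n".join(cues)


# Sample data shaped like the assumed response, not real API output.
sample = [
    {"speaker": "Speaker 1", "start": 0.0, "end": 2.5, "text": "Welcome, everyone."},
    {"speaker": "Speaker 2", "start": 2.5, "end": 61.04, "text": "Thanks for having me."},
]
print(segments_to_srt(sample))
```

In practice the built-in SRT and VTT exports cover this case; a local conversion is mainly useful when you post-process the JSON first, for example to rename speakers.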

Switching from another speech-to-text service?

STT.ai vs Otter.ai
STT.ai vs TurboScribe
STT.ai vs Fireflies
STT.ai vs Rev
View all →

Simple, transparent pricing

Start free, then scale as you grow.

Free

$0/mo

600 min/month

5 languages
TXT and SRT export
API access

Starter

$9/mo

3,000 min/month

100+ languages
All AI models
All export formats

Most popular

Pro

$19/mo

7,500 min/month

Private transcripts
Unlimited teams
Priority processing

Business

$39/mo

20,000 min/month

Everything in Pro
50K min storage
Unlimited AI chat

View all plans and pricing →

Supported languages

All 100+ languages →

English Spanish French German Japanese Chinese Arabic Hindi Portuguese Russian Korean Italian Turkish Dutch Polish +85 more

Ready to transcribe?

Upload your first file for free. No credit card, no sign-up. 600 minutes per month on the free plan.

Start transcribing

Frequently asked questions

How does speech to text work on STT.ai?

Speech to text runs in your browser: paste a URL, upload a file, or record from your mic. STT.ai picks the AI model and returns the transcript in under 5 minutes. Export as TXT, SRT, VTT, DOCX, JSON, or PDF.

Is speech to text free?

Yes. Every visitor gets 600 free minutes/month on STT.ai, and they can be used for speech to text like any other workflow. Paid plans starting at $5/month unlock longer files, private transcripts, and priority queueing.

How accurate is speech to text?

Speech to text runs on the same AI models as the rest of STT.ai. Our best models reach 95-97% accuracy on clean speech (3-5% Word Error Rate on benchmarks). Switch models on the fly if the first pass is below your target.

What AI models can I use for speech to text?

Speech to text can run on any of STT.ai's 10+ models: STT.ai Enhanced (most accurate), Whisper Large V3 (99 languages), NVIDIA Canary (#1 WER on supported languages), Whisper Turbo (fast), Moonshine (lightweight), and more.

Can I get subtitles from speech to text?

Yes. Every transcript exports as SRT or VTT, which work with YouTube, Vimeo, TikTok, VLC, and every major video player. The burn-subtitles tool overlays them onto video as hardsubs.

Does speech to text detect different speakers?

Yes. Speaker diarization automatically labels each voice (Speaker 1, Speaker 2, ...) and you can rename them in the built-in editor. Works across all models and languages.

How long does speech to text take?

Most speech to text jobs finish in under 5 minutes. A 1-hour audio file typically completes in 2-3 minutes with our fastest models. Speed depends on the chosen model and current GPU load.

What input formats does speech to text support?

Speech to text accepts 20+ formats, including MP3, WAV, M4A, FLAC, OGG, MP4, MKV, MOV, WebM, and AVI. Output to TXT, SRT, VTT, DOCX, JSON, or PDF.

Is my audio private when I use speech to text?

Yes. Audio files submitted to speech to text are processed and deleted by default. Pro plans add client-side encryption: even if STT.ai's database is breached, your transcripts are unreadable without your key. Data is never used for model training without explicit opt-in.

Is there a speech to text API?

Yes. STT.ai offers a REST API with Python and Node.js SDKs, plus an MCP server for Claude and Cursor, all usable for speech to text workflows. Free API tier includes 100 minutes/month.

Can I edit a speech to text transcript after?

Yes. Every transcript opens in the built-in editor where you can correct words, rename speakers, adjust timestamps, and add notes. All changes save automatically.

How do I share what speech to text produces?

Every transcript gets a unique shareable URL. Export to DOCX or PDF for email. Pro plans add password-protected and permanent links, useful for client work.

What other platforms work beyond speech to text?

STT.ai handles 1,300+ platforms including YouTube, Vimeo, TikTok, SoundCloud, Zoom, Google Meet, podcast hosts, and more. URL transcription works with publicly available content only; DRM-protected sources can't be transcribed.
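
The accuracy answer in the FAQ quotes 95-97% accuracy as a 3-5% Word Error Rate. For readers who want to check a transcript against their own reference text, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The sketch below illustrates that standard definition under simple assumptions (lowercasing and whitespace tokenization); it is not STT.ai's benchmarking code, and the function name is ours.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Made-up example: one substituted word in a 20-word reference.
reference = (
    "please send the quarterly report to the finance team before friday "
    "so we can review it at the monday meeting"
)
hypothesis = reference.replace("finance", "financial")
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

One wrong word in twenty corresponds to the 5% end of the quoted range. Real benchmarks usually also normalize punctuation and numerals before scoring, which this sketch leaves out.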