Export Formats

Download your transcript in whatever format you need. STT.ai supports six export formats, each optimized for a different workflow.

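Supported Export Formats

After transcribing your audio or video, you can download the result in any of the formats below. Every format contains the full transcript text; the timed formats also include word-level or segment-level timestamps.

TXT (Plain Text)

.txt

A simple plain-text transcript with no formatting. Best for copying into documents, emails, or other applications. Includes speaker labels when speaker detection is enabled.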

Free plan

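SRT (SubRip Subtitles)

.srt

The most widely supported subtitle format. Contains sequence numbers, timestamps, and text. Compatible with YouTube, Vimeo, VLC, Premiere Pro, Final Cut, and nearly every video player and editor.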

Free plan
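
The SRT layout described above (a sequence number, a time range, and the cue text, followed by a blank line) can be sketched in a few lines of Python. This is a minimal illustration rather than STT.ai's own exporter: the Segment class and both helper names are hypothetical, and it assumes segment times are given in seconds.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timed chunk of transcript text (hypothetical shape, for illustration)."""
    start: float  # seconds from the start of the media
    end: float    # seconds from the start of the media
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm (comma before milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as SRT cues: index, time range, text, separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        cues.append(f"{index}\n{time_range}\n{seg.text}\n")
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [
        Segment(0.0, 2.5, "Speaker 1: Welcome to the meeting."),
        Segment(2.5, 5.0, "Speaker 2: Thanks for having me."),
    ]
    print(to_srt(demo))
```

Working in integer milliseconds before splitting into hours, minutes, and seconds avoids rounding drift that can appear when each field is computed from a float separately.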

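VTT (WebVTT)

.vtt

The Web Video Text Tracks format, the standard for HTML5 video captions. Supports styling, positioning, and metadata. Used by web browsers, streaming platforms, and modern video players.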

Basic plan+
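
For basic cues, WebVTT differs from SRT in two visible ways: the file starts with a WEBVTT header line, and timestamps use a period instead of a comma before the milliseconds. The sketch below converts SRT text to WebVTT on that basis. It is an illustration only (the function name is hypothetical), it assumes well-formed SRT input with two-digit hour fields, and it does not handle styling or positioning.

```python
import re

# Matches an SRT time range such as "00:00:01,000 --> 00:00:02,500".
_SRT_TIME_RANGE = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT subtitle text to WebVTT.

    Only the time-range lines are rewritten, so commas inside the cue
    text itself are left untouched.
    """
    body = _SRT_TIME_RANGE.sub(r"\1.\2 --> \3.\4", srt_text.strip())
    # WebVTT requires the "WEBVTT" header followed by a blank line.
    return "WEBVTT\n\n" + body + "\n"


if __name__ == "__main__":
    sample = "1\n00:00:00,000 --> 00:00:02,500\nHello, world\n"
    print(srt_to_vtt(sample))
```

The numeric cue indexes from SRT are kept as they are, since WebVTT allows an optional identifier line before each cue's timing line.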

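DOCX (Word Document)

.docx

A formatted Word document with proper headings, timestamps, and speaker labels. Suited to meeting minutes, reports, and documents that need further editing in Microsoft Word or Google Docs.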

Basic plan+

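JSON (Structured Data)

.json

A machine-readable structured format containing word-level timestamps, confidence scores, speaker IDs, and segment data. Suited to developers building on top of STT.ai or importing data into other systems.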

Basic plan+
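
One common use of word-level confidence scores is flagging words that may need manual review. The sketch below shows the idea in Python. The JSON shape and every field name in it ("segments", "words", "speaker", "confidence", and so on) are assumptions made for illustration, not STT.ai's documented schema, so check a real export and adjust the keys before relying on it.

```python
import json

# Hypothetical export shape: these field names are assumptions for
# illustration, not STT.ai's documented schema.
SAMPLE_EXPORT = """
{
  "segments": [
    {
      "speaker": "Speaker 1",
      "start": 0.0,
      "end": 1.8,
      "words": [
        {"word": "Quarterly", "start": 0.0, "end": 0.6, "confidence": 0.98},
        {"word": "revenue", "start": 0.6, "end": 1.1, "confidence": 0.62},
        {"word": "rose", "start": 1.1, "end": 1.8, "confidence": 0.91}
      ]
    }
  ]
}
"""


def low_confidence_words(export: dict, threshold: float = 0.8) -> list[dict]:
    """Return words whose confidence is below the threshold, with speaker and start time."""
    flagged = []
    for segment in export.get("segments", []):
        for word in segment.get("words", []):
            if word["confidence"] < threshold:
                flagged.append(
                    {
                        "speaker": segment.get("speaker"),
                        "word": word["word"],
                        "start": word["start"],
                        "confidence": word["confidence"],
                    }
                )
    return flagged


if __name__ == "__main__":
    for hit in low_confidence_words(json.loads(SAMPLE_EXPORT)):
        print(f'{hit["start"]:.2f}s {hit["speaker"]}: {hit["word"]} ({hit["confidence"]:.2f})')
```

The 0.8 threshold is arbitrary; a lower value produces a shorter review list, a higher one catches more borderline words.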

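PDF (Portable Document)

.pdf

A professionally formatted PDF with timestamps, speaker labels, and STT.ai branding. Suited to sharing with clients, archiving records, or printing. The layout is optimized for readability.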

Basic plan+

Format Comparison

Feature	TXT	SRT	VTT	DOCX	JSON	PDF
Plain text	✓	✓	✓	✓	✓	✓
Timestamps	✗	✓	✓	✓	✓	✓
Speaker labels	✓	✓	✓	✓	✓	✓
Word-level timing	✗	✗	✗	✗	✓	✗
Confidence scores	✗	✗	✗	✗	✓	✗
Video player compatible	✗	✓	✓	✗	✗	✗
Editable	✓	✓	✓	✓	✓	✗
Machine-readable	✗	✗	✗	✗	✓	✗

Which Format Should You Choose?

For subtitles

Use SRT for maximum compatibility or VTT for web-based video players. SRT works with YouTube, Vimeo, Premiere Pro, Final Cut, and DaVinci Resolve.

For documents and reports

Use DOCX for editable documents or PDF for sharing and archiving. Both include formatted timestamps and speaker labels.

For developers and integrations

Use JSON for the richest data including word-level timestamps, confidence scores, and speaker IDs. Ideal for building custom applications.

For quick copy and paste

Use TXT for a simple plain text transcript you can paste anywhere: emails, notes, chat, or any text field.

Batch Export

Need to export multiple transcripts at once? STT.ai supports batch export from your transcript library. Select multiple transcripts, choose your format, and download them all in a single ZIP file. Available on all paid plans.

API Export

Developers can retrieve transcripts in any format via the STT.ai API. Simply specify the desired format in your API request and receive the formatted output directly. The JSON format includes the most detailed data including word-level timestamps and confidence scores.

Transcribe and Export in Any Format

Upload audio or video. Choose an export format. Download instantly.

Frequently Asked Questions

How does transcript export work on STT.ai?

Everything runs in your browser: paste a URL, upload a file, or record from your mic. STT.ai picks the AI model and returns the transcript in under 5 minutes. Export as TXT, SRT, VTT, DOCX, JSON, or PDF.

Is exporting free?

Yes. Every visitor gets 600 free minutes/month on STT.ai, usable for exports the same as for any other workflow. Paid plans starting at $5/month unlock longer files, private transcripts, and priority queueing.

How accurate are exported transcripts?

Exports use the same AI models as the rest of STT.ai. Our best models reach 95-97% accuracy on clean speech (3-5% Word Error Rate on benchmarks). Switch models on the fly if the first pass is below your target.

Which AI models can I use?

Any of STT.ai's 10+ models, including STT.ai Enhanced (most accurate), Whisper Large V3 (99 languages), NVIDIA Canary (#1 WER on supported languages), Whisper Turbo (fast), Moonshine (lightweight), and more.

Can I export subtitles?

Yes. Every transcript exports as SRT or VTT, which work with YouTube, Vimeo, TikTok, VLC, and every major video player. The burn-subtitles tool overlays them onto video as hardsubs.

Does STT.ai detect different speakers?

Yes. Speaker diarization automatically labels each voice (Speaker 1, Speaker 2, ...) and you can rename them in the built-in editor. Works across all models and languages.

How long does a transcription take?

Most jobs finish in under 5 minutes. A 1-hour audio file typically completes in 2-3 minutes with our fastest models. Speed depends on the chosen model and current GPU load.

What input formats are supported?

STT.ai accepts 20+ formats, including MP3, WAV, M4A, FLAC, OGG, MP4, MKV, MOV, WebM, and AVI. Output to TXT, SRT, VTT, DOCX, JSON, or PDF.

Is my audio private?

Yes. Submitted audio files are processed and deleted by default. Pro plans add client-side encryption: even if STT.ai's database is breached, your transcripts are unreadable without your key. Data is never used for model training without explicit opt-in.

Is there an API?

Yes. STT.ai offers a REST API with Python and Node.js SDKs, plus an MCP server for Claude and Cursor, all usable for export workflows. The free API tier includes 100 minutes/month.

Can I edit a transcript afterward?

Yes. Every transcript opens in the built-in editor where you can correct words, rename speakers, adjust timestamps, and add notes. All changes save automatically.

How do I share a transcript?

Every transcript gets a unique shareable URL. Export to DOCX or PDF for email. Pro plans add password-protected and permanent links, which are useful for client work.

Which platforms can STT.ai transcribe from?

STT.ai handles 1,300+ platforms including YouTube, Vimeo, TikTok, SoundCloud, Zoom, Google Meet, podcast hosts, and more. URL transcription works with publicly available content only; DRM-protected sources can't be transcribed.