Free Online Video to Text

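Convert video to text with AI-powered transcription. Upload a video file, record through your microphone, or paste a URL. Supports 100+ languages and 10+ models, with accuracy above 98%.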

Works with publicly available audio and video. DRM-protected content is not supported.

Supported upload formats: MP3, WAV, M4A, FLAC, MP4, MKV, MOV, and WebM, up to 2 GB per file.
Real-time mode: instant results
Enhanced mode: Whisper, more accurate
Public links: 24 hours, text only · Signed-in links: 7 days, with audio · Pro: private links

Live speech to text. The AI corrects the text automatically, and accuracy improves as the speech gets longer.


Sign up for free to get 600 minutes per month, or upgrade for unlimited transcription.

10 minutes free per day · 600 minutes free with sign-up · No credit card required · Encrypted

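1. Upload a video

Upload an MP4, MKV, MOV, WebM, or AVI file. The audio is extracted automatically.

2. AI transcribes the video

The AI extracts and transcribes the audio track, adding speaker labels and timestamps.

3. Export and subtitles

Download subtitles in SRT or VTT format, or export the transcript as TXT, DOCX, or PDF.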

Supported video formats

Video-to-text use cases


Frequently asked questions

Upload your video file or paste a video URL. STT.ai extracts the audio track automatically — no separate demux step — runs it through your chosen AI model, and returns the transcript plus SRT/VTT subtitles.

MP4, MKV, MOV, WebM, AVI, and other common containers are all supported. You don't need to extract the audio yourself — upload the video as-is.

Yes. Export the transcript as SRT or VTT for upload to YouTube, Vimeo, or any player, and the burn-subtitles tool can hardcode captions directly onto the video. MKV and MP4 also support attaching soft-subtitle tracks without re-encoding.
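For readers who want to attach an exported SRT file as a soft-subtitle track on their own machine, the sketch below builds a command for ffmpeg, a separate open-source tool. Using ffmpeg here is an assumption for illustration and not a description of how STT.ai does it; the function name and file names are hypothetical. The video and audio streams are copied rather than re-encoded, and only the subtitle codec depends on the container.

```python
from pathlib import Path


def soft_subtitle_command(video: str, subtitles: str, output: str) -> list[str]:
    """Build an ffmpeg command that muxes an SRT file in as a subtitle track.

    Video and audio are stream-copied (no re-encoding). MP4 stores text
    subtitles as mov_text, while MKV can hold SRT text directly.
    """
    suffix = Path(output).suffix.lower()
    if suffix == ".mp4":
        subtitle_codec = "mov_text"
    elif suffix == ".mkv":
        subtitle_codec = "srt"
    else:
        raise ValueError(f"unsupported output container: {suffix}")
    return [
        "ffmpeg",
        "-i", video,         # input 0: the original video
        "-i", subtitles,     # input 1: the exported SRT file
        "-c", "copy",        # copy video and audio as-is
        "-c:s", subtitle_codec,  # convert only the subtitle stream
        output,
    ]


# Hypothetical file names; pass the list to subprocess.run() to execute it.
print(" ".join(soft_subtitle_command("talk.mp4", "talk.srt", "talk_subbed.mp4")))
```

Returning an argument list instead of a shell string avoids quoting problems with file names that contain spaces.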

Yes. STT.ai includes 600 free minutes per month — about ten hours of video. Paid plans starting at $5/month add larger files, longer videos, and private transcripts.

Accuracy depends on the audio track inside the video — higher-bitrate audio (256 kbps+) transcribes better than heavily compressed soundtracks. Our best models reach 93-95% on clean dialogue.

Files up to 2 GB are supported on every plan. Free users get up to one hour of video per file; paid plans extend that to 8+ hours. For huge raw camera files, compress to H.264/AAC or use a URL upload.
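One way to compress a large camera file to H.264/AAC before uploading is with ffmpeg, sketched below. This is an assumption about tooling, not an STT.ai feature; the function name, file names, and default settings are hypothetical. The audio bitrate defaults to 256 kbps so the soundtrack is not compressed heavily, and the CRF value trades video quality for size (lower means larger files).

```python
def compress_command(
    source: str,
    output: str,
    crf: int = 23,
    audio_bitrate: str = "256k",
) -> list[str]:
    """Build an ffmpeg command that re-encodes video to H.264 and audio to AAC.

    crf is the x264 constant rate factor (0 to 51 for 8-bit video);
    23 is the encoder's default.
    """
    if not 0 <= crf <= 51:
        raise ValueError("crf must be between 0 and 51")
    return [
        "ffmpeg",
        "-i", source,
        "-c:v", "libx264",   # H.264 video encoder
        "-crf", str(crf),
        "-preset", "medium",
        "-c:a", "aac",       # AAC audio encoder
        "-b:a", audio_bitrate,
        output,
    ]


# Hypothetical file names; pass the list to subprocess.run() to execute it.
print(" ".join(compress_command("raw_camera.mov", "upload.mp4")))
```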

Yes. Paste a public video URL from any of 1,300+ supported platforms and STT.ai fetches the video and extracts its audio automatically. DRM-protected or private videos must be downloaded manually first.

Yes. Speaker diarization labels each voice (Speaker 1, Speaker 2, ...), and you can rename the labels in the editor, which is useful for interviews, panels, and multi-host video.

Yes. 100+ languages with auto-detection. You can also translate the finished transcript or subtitles into other languages with the subtitle-translator tool for a wider audience.

Export to SRT or VTT for subtitles, plus TXT, DOCX, PDF, or JSON for articles, show notes, and archives. JSON keeps machine-readable timestamps and speaker labels.
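To illustrate how machine-readable timestamps and speaker labels can be turned into subtitles, here is a minimal sketch that converts a list of segments to SRT. The field names (`start`, `end`, `speaker`, `text`) are assumptions for illustration and not STT.ai's documented JSON schema, so check an actual export before relying on them.

```python
import json


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        speaker = seg.get("speaker")
        text = f"{speaker}: {seg['text']}" if speaker else seg["text"]
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        cues.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(cues)


# Hypothetical export data, using the assumed field names.
example = json.loads(
    '[{"start": 0.0, "end": 2.5, "speaker": "Speaker 1", "text": "Welcome back."},'
    ' {"start": 2.5, "end": 5.0, "speaker": "Speaker 2", "text": "Thanks for having me."}]'
)
print(segments_to_srt(example))
```

Times are converted to whole milliseconds before splitting into fields, which avoids floating-point drift in the millisecond part.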

Yes. Video and the extracted audio are processed and deleted by default, and Pro plans add client-side encryption so transcripts are unreadable without your key. Nothing is used for training without explicit opt-in.

Most videos finish in a few minutes; a one-hour video typically takes 3-5 minutes depending on the model and current GPU load. Long videos queue and email you when they're done.