Security & Privacy

How STT.ai protects your audio and transcripts. Client-side encrypted storage means even we can't read your data.

Client-Side Encrypted Storage

When you enable Privacy Mode, your transcripts are encrypted in your browser before they are stored on our servers. The encryption key is derived from your password; we never see it, store it, or have access to it.

This means that even if our servers were compromised, your stored transcripts would be unreadable. Only you can decrypt them.

Audit the encryption code yourself (open-source, MIT license)

How Client-Side Encrypted Storage Works

You upload audio

Your audio file is sent to our GPU for transcription. The audio is processed in memory and immediately deleted after transcription — never stored on disk.

Transcript returned to your browser

The raw transcript (text, timestamps, speakers) is sent back to your browser over HTTPS (TLS 1.3, encrypted in transit).

Your browser encrypts the transcript

Your browser encrypts the transcript with AES-256-GCM, using a key derived from your password via PBKDF2 (100,000 iterations). The key never leaves your browser. We never see it.

Encrypted blob stored on our servers

We store only the encrypted data. It looks like random bytes to us. We cannot decrypt it. Our database admins cannot read it. If our servers are breached, your data is safe.

Only you can decrypt

When you view your transcript, your browser derives the key from your password again and decrypts locally. Nobody else — including STT.ai staff — can read your transcripts.
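The encrypt and decrypt steps above can be sketched with the browser's Web Crypto API. This is a minimal illustration under stated assumptions, not the actual STT.ai library: the function names `deriveKey`, `encryptTranscript`, and `decryptTranscript` are hypothetical, and the salt is passed in as a plain string (the Technical Details table names the user's email address).

```typescript
// Minimal sketch with hypothetical function names; not the actual STT.ai code.
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Derive a non-extractable AES-256-GCM key from a password (PBKDF2, SHA-256).
async function deriveKey(password: string, salt: string): Promise<CryptoKey> {
  const material = await crypto.subtle.importKey(
    "raw",
    encoder.encode(password),
    "PBKDF2",
    false,
    ["deriveKey"],
  );
  return crypto.subtle.deriveKey(
    { name: "PBKDF2", hash: "SHA-256", salt: encoder.encode(salt), iterations: 100_000 },
    material,
    { name: "AES-GCM", length: 256 },
    false, // the key cannot be exported from the browser's crypto layer
    ["encrypt", "decrypt"],
  );
}

// Encrypt a transcript with a fresh random 12-byte IV.
async function encryptTranscript(key: CryptoKey, transcript: string) {
  const iv = crypto.getRandomValues(new Uint8Array(12));
  const encrypted = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv },
    key,
    encoder.encode(transcript),
  );
  return { iv, ciphertext: new Uint8Array(encrypted) };
}

// Decrypt locally. Returns null on any failure, which is what happens when
// the GCM authentication check fails (wrong password or modified data).
async function decryptTranscript(
  key: CryptoKey,
  iv: BufferSource,
  ciphertext: BufferSource,
): Promise<string | null> {
  try {
    const plain = await crypto.subtle.decrypt({ name: "AES-GCM", iv }, key, ciphertext);
    return decoder.decode(plain);
  } catch {
    return null;
  }
}
```

Because AES-GCM is authenticated, a key derived from the wrong password does not produce garbage text; decryption is rejected outright, which the sketch surfaces as `null`.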

Technical Details

Encryption algorithm	AES-256-GCM (authenticated encryption)
Key derivation	PBKDF2 with SHA-256, 100,000 iterations
Key salt	User's email address (unique per user)
IV (nonce)	Random 12 bytes per encryption (never reused)
Key storage	Never stored — derived from password on each session
Transport encryption	TLS 1.3 (HTTPS)
Audio retention	Deleted immediately after processing (never stored on disk)
Implementation	Web Crypto API (browser-native, no external libraries)
Source code	github.com/sttaigit/stt-encryption (MIT license)
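
The table lists a fresh 12-byte IV for every encryption. The IV is not secret, but it is needed again for decryption, so it has to be kept alongside the ciphertext. This page does not say how the stored blob is laid out; the sketch below assumes a simple IV-then-ciphertext layout, and `packBlob` and `unpackBlob` are hypothetical helper names.

```typescript
// Assumed layout (not confirmed by this page): [12-byte IV][ciphertext + 16-byte GCM tag]
const IV_LENGTH = 12; // bytes, matching the IV size in the table
const TAG_LENGTH = 16; // bytes, the default AES-GCM authentication tag size

// Concatenate the IV and ciphertext into a single blob for upload.
function packBlob(iv: Uint8Array, ciphertext: Uint8Array): Uint8Array {
  if (iv.length !== IV_LENGTH) {
    throw new Error(`IV must be ${IV_LENGTH} bytes`);
  }
  const blob = new Uint8Array(IV_LENGTH + ciphertext.length);
  blob.set(iv, 0);
  blob.set(ciphertext, IV_LENGTH);
  return blob;
}

// Split a downloaded blob back into its IV and ciphertext.
function unpackBlob(blob: Uint8Array): { iv: Uint8Array; ciphertext: Uint8Array } {
  if (blob.length < IV_LENGTH + TAG_LENGTH) {
    throw new Error("Blob is too short to contain an IV and a GCM tag");
  }
  return {
    iv: blob.slice(0, IV_LENGTH),
    ciphertext: blob.slice(IV_LENGTH),
  };
}
```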

What We Can and Can't See

We CANNOT see

Your transcript text
Speaker names or labels
Timestamps or word-level data
Your encryption key
Your audio (deleted after processing)

We CAN see

File name and size (metadata)
Audio duration
Language detected
Model used
Timestamp of transcription
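
One way to picture the two lists above is as a split of each transcription result into plaintext metadata and a private payload that is encrypted before upload. The sketch below is illustrative only: the interface and field names are hypothetical and do not describe STT.ai's actual schema.

```typescript
// Hypothetical record shapes; field names are illustrative only.
interface ServerVisibleMetadata {
  fileName: string;
  fileSizeBytes: number;
  durationSeconds: number;
  language: string;
  model: string;
  transcribedAt: string; // ISO 8601 timestamp
}

interface PrivatePayload {
  text: string;
  speakers: string[];
  words: { word: string; start: number; end: number }[];
}

interface TranscriptionResult extends ServerVisibleMetadata, PrivatePayload {}

// Separate what is stored in the clear from what must be encrypted first.
function splitForStorage(result: TranscriptionResult): {
  metadata: ServerVisibleMetadata;
  plaintextToEncrypt: string;
} {
  const { text, speakers, words, ...metadata } = result;
  return {
    metadata,
    // This string is what the browser would encrypt before uploading.
    plaintextToEncrypt: JSON.stringify({ text, speakers, words }),
  };
}
```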

Privacy Mode Trade-offs

Client-side encrypted storage is opt-in because it limits some features. With encryption enabled:

Works with encryption

Viewing your transcripts
Exporting (TXT, SRT, VTT, etc.)
Downloading
Editing (decrypted in browser)

Not available with encryption

Server-side search across transcripts
AI summaries (server can't read data)
Sharing via link (recipient needs key)
Team workspace collaboration
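
Server-side search is unavailable because the server holds only ciphertext, so any search has to run in the browser after decryption. The sketch below shows that idea in its simplest form. It is a hypothetical illustration, not a feature described on this page, and `DecryptedTranscript` and `searchLocally` are made-up names.

```typescript
// Hypothetical shape of a transcript that has already been decrypted in the browser.
interface DecryptedTranscript {
  id: string;
  text: string;
}

// Case-insensitive substring search over locally decrypted transcripts.
// Returns the ids of matching transcripts; nothing is sent to a server.
function searchLocally(transcripts: DecryptedTranscript[], query: string): string[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [];
  return transcripts
    .filter((t) => t.text.toLowerCase().includes(needle))
    .map((t) => t.id);
}
```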

Data Handling (All Users)

Even without Privacy Mode enabled, we follow strict data handling practices:

Audio files are never stored permanently. They are processed in GPU memory and deleted immediately after transcription completes.
Your data is never used for training unless you explicitly opt in via Voice Lab. Paid plan data is never used.
All traffic is encrypted in transit via TLS 1.3 (HTTPS).
You can delete all your data at any time from Privacy Settings.
We don't sell your data. Ever. To anyone. For any reason.

Open-Source Encryption

Our encryption library is fully open-source under the MIT license. Audit it yourself. Verify that we're doing what we say. No trust required — just math.

View the source on GitHub: github.com/sttaigit/stt-encryption

Ready to transcribe securely?

Upload your first file free. Client-side encryption included on all plans.


Frequently Asked Questions

How do I transcribe audio?

Upload your audio or video file to STT.ai. Select your preferred AI model and options, then click Transcribe. Your transcript will be ready in minutes. Export as TXT, SRT, VTT, DOCX, JSON, or PDF.

Is transcription free?

Yes! STT.ai offers 600 free minutes per month for all users. No signup required for your first transcription. Paid plans with more minutes and features start at $5/month.

How accurate is the transcription?

Accuracy depends on the AI model you choose and audio quality. Our best models achieve a 5-7% Word Error Rate on benchmarks, meaning 93-95%+ accuracy. Clear audio with minimal background noise produces the best results.

What AI models can I use?

STT.ai offers 10+ models including Whisper Large V3, NVIDIA Canary, and more. You can compare results from different models on the same file.

Can I get subtitles and captions?

Yes. After transcribing, export your transcript as SRT or VTT subtitle files. These work with YouTube, Vimeo, and all major video platforms.

Does it detect different speakers?

Yes. STT.ai automatically identifies and labels different speakers using AI speaker diarization. Works across all models and languages.

How long does transcription take?

Most files are transcribed in under 5 minutes. A 1-hour audio file typically takes 2-3 minutes with our fastest models.

What file formats are supported?

STT.ai supports 20+ audio and video formats including MP3, WAV, M4A, FLAC, OGG, MP4, MKV, MOV, WebM, and AVI. Export as TXT, SRT, VTT, DOCX, JSON, or PDF.

Is my audio data kept private?

Yes. Audio files are processed and deleted after transcription. Your data is never used for training. Client-side encryption is free on all plans. Learn about our security.

Can I access transcription via API?

Yes. STT.ai offers a REST API with Python and Node.js SDKs. Free tier includes 100 minutes/month.

Can I edit the transcript after?

Yes. STT.ai includes a built-in transcript editor where you can correct errors, rename speakers, and adjust timestamps.

How do I share my transcript?

Every transcript gets a unique shareable link. Export to DOCX or PDF for email. Pro plans offer password-protected and permanent links.