Transcribe YouTube and Instagram
Hand us a URL — we fetch the audio, transcribe it, and structure the result. No intermediate files to manage.
Supported sources
- YouTube videos — youtube.com/watch?v=…, youtu.be/…, m.youtube.com/watch?v=…
- YouTube Shorts — youtube.com/shorts/…
- Instagram Reels and feed posts — instagram.com/reel/…, instagram.com/p/… (public only)
- Direct media URLs — any publicly reachable MP3, MP4, M4A, WAV, or similar
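The API detects the source type on its own, so no client-side check is required. If you want to reject obviously unsupported links before spending a request, a rough pre-check along these lines may help. It is only a sketch: apart from "youtube", the labels it returns are made up for illustration and are not values the API is known to return, and the extension list covers only the formats named above, so "unknown" does not mean the API would refuse the URL.

```python
from urllib.parse import urlparse, parse_qs

# Host and extension lists mirror the patterns in "Supported sources".
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}
INSTAGRAM_HOSTS = {"instagram.com", "www.instagram.com"}
MEDIA_EXTENSIONS = (".mp3", ".mp4", ".m4a", ".wav")


def classify_source(url: str) -> str:
    """Return a rough, client-side label for a URL (labels are hypothetical)."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    path = parts.path

    if host == "youtu.be" and len(path) > 1:
        return "youtube"
    if host in YOUTUBE_HOSTS:
        if path == "/watch" and "v" in parse_qs(parts.query):
            return "youtube"
        if path.startswith("/shorts/") and len(path) > len("/shorts/"):
            return "youtube_shorts"
    if host in INSTAGRAM_HOSTS and path.startswith(("/reel/", "/p/")):
        return "instagram"
    if path.lower().endswith(MEDIA_EXTENSIONS):
        return "direct_media"
    return "unknown"
```

A pre-check like this cannot tell whether a video is public or reachable; only the API's own fetch can.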
Basic request
POST a URL to /v1/transcriptions. The API responds with a 202 and a transcription object in the queued state. The source type is detected automatically from the URL.
curl -X POST https://api.quillhub.ai/v1/transcriptions \
  -H "Authorization: Bearer $QAI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "structure": true
  }'

{
  "id": "trs_8f2c91a0b3e4",
  "status": "queued",
  "source": {
    "type": "youtube",
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
  },
  "created_at": "2026-04-23T10:14:02Z"
}
Processing stages
Polling for completion
Fetch the transcription by id. While it's running, status is processing and progress is a float between 0 and 1. If you'd rather not poll, set webhook_url on the create request.
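In a script, the polling described here might look like the sketch below. Several details are assumptions rather than documented behavior: the 5-second interval and one-hour timeout are arbitrary, and because this page names only the queued, processing, and failed states, the loop treats any status other than queued or processing as terminal.

```python
import json
import os
import time
import urllib.request

API_BASE = "https://api.quillhub.ai/v1"


def fetch_transcription(transcription_id: str) -> dict:
    """GET a transcription by id, using the key from the QAI_KEY variable."""
    request = urllib.request.Request(
        f"{API_BASE}/transcriptions/{transcription_id}",
        headers={"Authorization": f"Bearer {os.environ['QAI_KEY']}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_transcription(
    transcription_id: str,
    fetch=fetch_transcription,
    interval: float = 5.0,    # assumed polling interval, not a documented value
    timeout: float = 3600.0,  # assumed upper bound on how long to wait
    sleep=time.sleep,
) -> dict:
    """Poll until the status is neither 'queued' nor 'processing'."""
    deadline = time.monotonic() + timeout
    while True:
        transcription = fetch(transcription_id)
        if transcription["status"] not in ("queued", "processing"):
            return transcription
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{transcription_id} did not finish in {timeout}s")
        sleep(interval)
```

The fetch and sleep parameters exist so the loop can be exercised without network access. The equivalent raw request and a sample in-progress response follow.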
curl https://api.quillhub.ai/v1/transcriptions/trs_8f2c91a0b3e4 \
  -H "Authorization: Bearer $QAI_KEY"

{
  "id": "trs_8f2c91a0b3e4",
  "status": "processing",
  "progress": 0.42,
  "source": { "type": "youtube", "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ" },
  "duration_seconds": 1843,
  "created_at": "2026-04-23T10:14:02Z"
}
Configuring the request
| Field | Type | Default | Description |
|---|---|---|---|
| language | string | auto | ISO-639-1 code (en, ru, es…). Omit to auto-detect. Forcing a language helps with short clips and noisy audio. |
| speaker_recognition | boolean | false | Label who said what. Great for podcasts and interviews. Adds ~10–15% to processing time. |
| structure | boolean | true | Adds result.structured with title, summary, TOC, paragraphs, highlights, and terms. |
| webhook_url | string | — | HTTPS endpoint to POST the finished transcription to. See the Webhooks guide for signing. |
| metadata | object | — | Up to 16 key/value pairs echoed back on the transcription. Useful for correlating jobs with your own records. |
Handling failures
Some sources can't be fetched. When that happens, the transcription transitions to status: "failed" with a machine-readable error.code you can branch on.
{
  "id": "trs_8f2c91a0b3e4",
  "status": "failed",
  "error": {
    "code": "source_unavailable",
    "message": "The video is private, deleted, or geoblocked in our region."
  }
}
Instagram specifics
Only public Reels and feed videos are supported. Stories, Highlights, and anything behind a login or a private account will fail with source_unavailable — we don't proxy authenticated sessions.
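Instagram failures arrive through the same failed status and error.code as any other source, so one handler can cover both. The sketch below shows one way to branch on the code. The exception class and the triage helper are hypothetical names, and source_unavailable is the only error code this page documents, so every other code falls through to a generic branch.

```python
class TranscriptionFailed(Exception):
    """Raised when a transcription object reports status 'failed'."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def raise_if_failed(transcription: dict) -> dict:
    """Return the transcription unchanged unless its status is 'failed'."""
    if transcription.get("status") != "failed":
        return transcription
    error = transcription.get("error") or {}
    raise TranscriptionFailed(error.get("code", "unknown"), error.get("message", ""))


def triage(transcription: dict) -> str:
    """Map a transcription to a short action string (illustrative only)."""
    try:
        raise_if_failed(transcription)
    except TranscriptionFailed as exc:
        if exc.code == "source_unavailable":
            # Private, deleted, geoblocked, or login-gated: retrying will not help.
            return "skip: source cannot be fetched"
        return f"unhandled failure: {exc.code}"
    return "not failed"
```

Branching on error.code rather than on the message text keeps the handler stable if the wording of messages changes.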