Retry logic
Retry on 5xx responses and network timeouts, never on 4xx (429 is the one exception and is handled separately). Use exponential backoff with jitter, a hard cap on attempts, and an idempotency key so retries don't create duplicate jobs.
When to retry
The rule of thumb: retry only when the failure is plausibly transient and the server hasn't told you the request itself is wrong. 4xx responses are deterministic — the same payload will fail the same way, so retrying just wastes quota.
| Scenario | Retry? | Notes |
|---|---|---|
| 2xx | No | Success; no retry needed. |
| 4xx (except 429) | No | Deterministic client error. Surface error.code to the user; retrying will fail identically. |
| 429 | Yes, delayed | Reserved for future rate-limiting. Respect Retry-After if present, otherwise wait at least 1s before a single retry. |
| 5xx | Yes | Transient server error. Retry with exponential backoff + jitter, up to 5 retries. |
| network/timeout | Yes | Connection reset, DNS failure, or client timeout. Retry, but guard POSTs with an idempotency key (see below). |
| body parse error | No | Malformed JSON from the server is almost always a bug or a proxy injecting HTML. Do not loop; log and fail the operation. |
| transcription.failed | No | Async failure on the job itself. Needs user intervention (fix input, top up balance). Retrying the request will hit the same failure. |
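The table can be condensed into a single decision function. The sketch below is one possible encoding, not part of the API: the Outcome and Decision types and the decide name are hypothetical, chosen only for illustration.

```typescript
// Hypothetical shapes for illustration; none of these names come from the API.
type Outcome =
  | { kind: "response"; status: number } // an HTTP response was received
  | { kind: "network" }                  // connection reset, DNS failure, client timeout
  | { kind: "parse" };                   // response body was not valid JSON

type Decision = "none" | "backoff" | "delayed-once";

function decide(o: Outcome): Decision {
  if (o.kind === "network") return "backoff";
  if (o.kind === "parse") return "none";
  if (o.status === 429) return "delayed-once"; // single retry after a pause
  if (o.status >= 500) return "backoff";       // transient server error
  return "none";                               // 2xx success and every other 4xx
}
```

Keeping the decision separate from the retry loop makes each row of the table testable without issuing real requests.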
Backoff strategy
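Use exponential backoff with full random jitter: the delay ceiling doubles after each failed attempt (1s, 2s, 4s, 8s, 16s) and never exceeds 30 seconds, and each actual wait is drawn uniformly between 0 and the current ceiling. Cap at 5 retries and a total deadline (we use 2 minutes below) so a broken upstream never stalls your worker forever.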
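type RetryOpts = { retries?: number; baseMs?: number; capMs?: number; totalMs?: number };

// Calls fn until it succeeds, hits a non-retryable error, runs out of retries,
// or the total deadline aborts the shared signal.
export async function retry<T>(
  fn: (signal: AbortSignal) => Promise<T>,
  opts: RetryOpts = {},
): Promise<T> {
  const { retries = 5, baseMs = 1000, capMs = 30_000, totalMs = 120_000 } = opts;
  const ctrl = new AbortController();
  // Total deadline: aborting the signal cancels the in-flight attempt.
  const deadline = setTimeout(() => ctrl.abort(), totalMs);
  try {
    let lastErr: unknown;
    for (let attempt = 0; attempt <= retries; attempt++) {
      try {
        return await fn(ctrl.signal);
      } catch (err) {
        lastErr = err;
        // Also stop once the deadline has fired, whatever the error looks like.
        if (attempt === retries || ctrl.signal.aborted || !isRetryable(err)) throw err;
        // Full jitter: wait a random time in [0, min(capMs, baseMs * 2^attempt)).
        const exp = Math.min(capMs, baseMs * 2 ** attempt);
        const jitter = Math.random() * exp;
        await new Promise((r) => setTimeout(r, jitter));
      }
    }
    throw lastErr;
  } finally {
    clearTimeout(deadline);
  }
}

// fn must throw for non-2xx responses, with the HTTP status on err.status;
// fetch itself only rejects on network failure or abort.
function isRetryable(err: any): boolean {
  if (err?.name === "AbortError") return false;
  if (typeof err?.status === "number") return err.status >= 500;
  // Node socket error codes, plus the TypeError that fetch throws on network failure.
  return err?.code === "ETIMEDOUT" || err?.code === "ECONNRESET" || err?.name === "TypeError";
}
4xx errors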
Don't retry 4xx responses (other than 429): the same request will fail the same way. Instead, surface error.code and error.message to the user (for interactive flows) or to your logs (for background jobs), and stop. Branch on code, not message, since messages are human-readable and may change.
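As a rough illustration of branching on code rather than message, the sketch below maps an error body to user-facing text. The envelope shape { error: { code, message } } is an assumption based on the field names used here, and the specific codes (invalid_url, file_too_large) are hypothetical placeholders; substitute the codes the API actually returns.

```typescript
// Assumed error envelope; the exact JSON shape is not confirmed by this page.
type ApiErrorBody = { error: { code: string; message: string } };

function userMessageFor(body: ApiErrorBody): string {
  switch (body.error.code) {
    case "invalid_url": // hypothetical code
      return "That link doesn't look valid. Check it and try again.";
    case "file_too_large": // hypothetical code
      return "That file is too large to transcribe.";
    default:
      // Unknown code: fall back to the server's message rather than guessing.
      return body.error.message;
  }
}
```

Matching on the stable code keeps this logic working if the wording of a message changes, while the default branch still shows something useful for codes the client does not know yet.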
429 responses
429 is reserved for future rate-limiting and is not yet enforced, but well-behaved clients should already handle it. If the response includes a Retry-After header (in seconds), wait that long before retrying once. If not, wait at least 1 second. Don't stack 429 retries into your normal exponential loop — treat them as a separate, deliberate pause.
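// Sends the request; on a 429, pauses once and retries a single time.
// req.body must be replayable (a string or buffer, not a stream), since it may be sent twice.
async function sendWith429(req: RequestInit): Promise<Response> {
  const url = "https://api.quillhub.ai/v1/transcriptions";
  const res = await fetch(url, req);
  if (res.status !== 429) return res;
  // Retry-After is in seconds; fall back to 1s if it is missing or unparseable.
  const header = res.headers.get("Retry-After");
  const waitSec = header ? Math.max(1, parseInt(header, 10) || 1) : 1;
  await new Promise((r) => setTimeout(r, waitSec * 1000));
  return fetch(url, req);
}
Duplicate jobs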
POST /v1/transcriptions is not idempotent yet. If a request times out mid-flight, the server may have accepted the job — retrying blindly creates a second one and double-charges points. Server-side dedupe via an Idempotency-Key header is on our roadmap; until then, dedupe on the client.
// One key per logical job; reuse the same key on every retry of that job.
const idempotencyKey = crypto.randomUUID();

// `api` stands for your own HTTP client wrapper; it is not defined in this snippet.
async function createOnce(url: string, key: string) {
  // Before retrying a timed-out POST, check if the job already landed.
  // Only the 20 jobs returned by this query are checked.
  const existing = await api.get("/v1/transcriptions?limit=20").then((r) =>
    r.data.find((t: any) => t.metadata?.idempotency_key === key)
  );
  if (existing) return existing;
  return api.post("/v1/transcriptions", {
    url,
    metadata: { idempotency_key: key },
  });
}
Webhook delivery
Transcription failures
A job that lands in status: "failed" is not a transient failure — it's a deterministic outcome like an unsupported source, a corrupt file, or insufficient balance. Don't retry these programmatically; the same input will produce the same error. Present error.code to the user so they can fix the cause and resubmit.
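One way to keep a failed job out of any retry loop is to convert it into a dedicated error type as soon as it is read. The sketch below assumes a job object carrying status and an optional error with code and message; the Job type, JobFailedError, and unwrapJob are hypothetical names, not part of the API.

```typescript
// Assumed job shape for illustration; only status "failed" is taken from this page.
type Job = { id: string; status: string; error?: { code: string; message: string } };

class JobFailedError extends Error {
  readonly code: string;
  constructor(job: Job) {
    super(job.error?.message ?? `Transcription ${job.id} failed`);
    this.name = "JobFailedError";
    this.code = job.error?.code ?? "unknown";
  }
}

// Returns the job unchanged unless it has failed, in which case it throws.
function unwrapJob(job: Job): Job {
  if (job.status === "failed") throw new JobFailedError(job);
  return job;
}
```

Because the thrown error has no HTTP status, its code is not a socket error code, and its name is neither AbortError nor TypeError, a check like isRetryable above would treat it as permanent, so the failure reaches the user instead of being resubmitted.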