
Retry logic

Retry on 5xx and network timeouts — never on 4xx. Use exponential backoff with jitter, a hard cap on attempts, and an idempotency key so retries don't create duplicate jobs.

When to retry

The rule of thumb: retry only when the failure is plausibly transient and the server hasn't told you the request itself is wrong. 4xx responses are deterministic — the same payload will fail the same way, so retrying just wastes quota.

| Scenario | Retry? | Notes |
| --- | --- | --- |
| 2xx | No | Success — no retry needed. |
| 4xx | No | Deterministic client error. Surface error.code to the user; retrying will fail identically. |
| 429 | Yes, delayed | Reserved for future rate-limiting. Respect Retry-After if present, otherwise wait at least 1s before a single retry. |
| 5xx | Yes | Transient server error. Retry with exponential backoff + jitter, up to 5 attempts. |
| Network/timeout | Yes | Connection reset, DNS failure, or client timeout. Retry, but guard POSTs with an idempotency key (see below). |
| Body parse error | No | Malformed JSON from the server is almost always a bug or a proxy injecting HTML. Do not loop; log and fail the operation. |
| transcription.failed | No | Async failure on the job itself. Needs user intervention (fix input, top up balance). Retrying the request will hit the same failure. |
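The table above can be collapsed into a single decision function. A minimal sketch — the Outcome and Decision types here are illustrative classifications for your own client code, not API types:

```typescript
// Classify the result of a finished request attempt.
type Outcome =
  | { kind: "http"; status: number }
  | { kind: "network" }       // timeout, connection reset, DNS failure
  | { kind: "parse_error" };  // malformed response body

type Decision = "done" | "fail" | "retry" | "retry_after_pause";

function decide(o: Outcome): Decision {
  if (o.kind === "network") return "retry";         // transient; guard POSTs with an idempotency key
  if (o.kind === "parse_error") return "fail";      // log and stop; do not loop
  if (o.status === 429) return "retry_after_pause"; // honor Retry-After, then retry once
  if (o.status >= 500) return "retry";              // transient server error
  if (o.status >= 400) return "fail";               // deterministic client error
  return "done";                                    // 2xx
}
```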

Backoff strategy

Use exponential backoff capped at 30 seconds: 1s, 2s, 4s, 8s, 16s, 30s, with full random jitter on each delay. Cap at 5 retries and enforce a total deadline (2 minutes in the code below) so a broken upstream never stalls your worker forever.

Don't retry without jitter. If every client backs off on the exact same schedule, a brief outage turns into a thundering herd when they all retry in lockstep. Full jitter — delay = random(0, exp) — spreads the load and, for independent clients, generally outperforms fixed or partial jitter.
retry.ts

```typescript
type RetryOpts = { retries?: number; baseMs?: number; capMs?: number; totalMs?: number };

export async function retry<T>(
  fn: (signal: AbortSignal) => Promise<T>,
  opts: RetryOpts = {},
): Promise<T> {
  const { retries = 5, baseMs = 1000, capMs = 30_000, totalMs = 120_000 } = opts;
  const ctrl = new AbortController();
  // Hard deadline: abort the in-flight attempt once the total budget is spent.
  const deadline = setTimeout(() => ctrl.abort(), totalMs);
  try {
    let lastErr: unknown;
    for (let attempt = 0; attempt <= retries; attempt++) {
      try {
        return await fn(ctrl.signal);
      } catch (err) {
        lastErr = err;
        if (attempt === retries || !isRetryable(err)) throw err;
        // Full jitter: sleep a random amount in [0, min(cap, base * 2^attempt)).
        const exp = Math.min(capMs, baseMs * 2 ** attempt);
        const jitter = Math.random() * exp;
        await new Promise((r) => setTimeout(r, jitter));
      }
    }
    throw lastErr;
  } finally {
    clearTimeout(deadline);
  }
}

function isRetryable(err: any) {
  if (err?.name === "AbortError") return false; // total deadline hit — stop
  if (err?.status && err.status >= 500) return true; // transient 5xx
  // Network-level failures: Node error codes, or fetch's TypeError.
  return err?.code === "ETIMEDOUT" || err?.code === "ECONNRESET" || err?.name === "TypeError";
}
```

4xx errors

Never auto-retry a 4xx. By definition the server is telling you the request is wrong. A retry loop will burn quota, produce noisy logs, and delay the actual fix.

Instead, surface error.code and error.message to the user (for interactive flows) or to your logs (for background jobs), and stop. Branch on code — not message — since messages are human-readable and may change.
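For example, a sketch of branching on error.code — the code values shown here are illustrative placeholders, not the API's actual error catalog:

```typescript
// Map a 4xx error body to a user-facing action. Never retry.
type ApiError = { code: string; message: string };

function describe4xx(err: ApiError): string {
  switch (err.code) {
    // Illustrative codes — check the API reference for the real list.
    case "invalid_url":
      return "The source URL could not be fetched. Check it and resubmit.";
    case "insufficient_balance":
      return "Not enough points. Top up and resubmit.";
    default:
      // Unknown code: fall back to the server's message, but still don't retry.
      return err.message;
  }
}
```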

429 responses

429 is reserved for future rate-limiting and is not yet enforced, but well-behaved clients should already handle it. If the response includes a Retry-After header (in seconds), wait that long before retrying once. If not, wait at least 1 second. Don't stack 429 retries into your normal exponential loop — treat them as a separate, deliberate pause.

handle-429.js

```javascript
async function sendWith429(req) {
  // req must be re-sendable: a plain init object, not a consumed body stream.
  const res = await fetch("https://api.quillhub.ai/v1/transcriptions", req);
  if (res.status !== 429) return res;
  // Honor Retry-After (seconds) if present; otherwise pause at least 1s.
  const header = res.headers.get("Retry-After");
  const waitSec = header ? Math.max(1, parseInt(header, 10) || 1) : 1;
  await new Promise((r) => setTimeout(r, waitSec * 1000));
  return fetch("https://api.quillhub.ai/v1/transcriptions", req);
}
```

Duplicate jobs

POST /v1/transcriptions is not idempotent yet. If a request times out mid-flight, the server may have accepted the job — retrying blindly creates a second one and double-charges points. Server-side dedupe via an Idempotency-Key header is on our roadmap; until then, dedupe on the client.

Attach an idempotency key to every create. Generate a UUID per logical job and send it as metadata.idempotency_key on the POST. Before retrying a timed-out request, GET /v1/transcriptions and skip the retry if you see a row with the same key.
idempotent-create.ts

```typescript
const idempotencyKey = crypto.randomUUID();

// `api` is your HTTP client wrapper around the QuillAI REST endpoints.
async function createOnce(url: string) {
  // Before retrying a timed-out POST, check if the job already landed.
  const existing = await api.get("/v1/transcriptions?limit=20").then((r) =>
    r.data.find((t: any) => t.metadata?.idempotency_key === idempotencyKey)
  );
  if (existing) return existing;

  return api.post("/v1/transcriptions", {
    url,
    metadata: { idempotency_key: idempotencyKey },
  });
}
```

Webhook delivery

We retry webhook deliveries for you. If your endpoint fails or times out, we retry on a fixed schedule — 1 minute, 5 minutes, 30 minutes, 2 hours, 12 hours — for a total of 5 retries. You don't need to implement any retry logic on the receiving side; just return 2xx once you've persisted the event. Full details at /docs/webhooks.
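On the receiving side, that boils down to: persist, then ack. A minimal sketch, assuming each delivered event carries an id you can dedupe on (the event shape here is an assumption, not the documented payload — see /docs/webhooks):

```typescript
// Persist the event before acking; because our retry schedule may
// redeliver, dedupe on the event id so redelivery is a no-op.
type WebhookEvent = { id: string; type: string };

const seen = new Set<string>(); // stand-in for your durable store

function handleWebhook(event: WebhookEvent): number {
  seen.add(event.id); // persist; Set semantics make duplicates harmless
  return 200;         // any 2xx stops the delivery retry schedule
}
```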

Transcription failures

A job that lands in status: "failed" is not a transient failure — it's a deterministic outcome like an unsupported source, a corrupt file, or insufficient balance. Don't retry these programmatically; the same input will produce the same error. Present error.code to the user so they can fix the cause and resubmit.
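In polling code, that means treating "failed" as terminal. A sketch, assuming a job shape like { status, error } — the exact fields are illustrative, so check the API reference:

```typescript
// Stop polling on terminal statuses; never resubmit a failed job automatically.
type Job = {
  status: "queued" | "processing" | "completed" | "failed";
  error?: { code: string; message: string };
};

function isTerminal(job: Job): boolean {
  return job.status === "completed" || job.status === "failed";
}

function onFailed(job: Job): string {
  // Surface the machine-readable code; the user fixes the cause and resubmits.
  return job.error ? `Job failed: ${job.error.code}` : "Job failed";
}
```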