Retries and Timeouts

LLM APIs fail in ways normal APIs rarely do: requests hang for 60+ seconds, providers return intermittent 5xx errors under load, and a "successful" response can still be unparseable. Hand-rolling retry loops around every call is the boilerplate llm-exe is built to remove — timeouts, retries, and backoff are configuration, not code.

Step 1 - Configure the LLM

All three settings are generic options, so they work the same for every provider:

ts
import {
  useLlm,
  createLlmExecutor,
  createChatPrompt,
  createParser,
} from "llm-exe";

export function createSummarizer() {
  const llm = useLlm("openai.gpt-4o-mini", {
    timeout: 15000, // fail any single API call after 15 seconds
    numOfAttempts: 3, // make up to 3 attempts before throwing
    maxDelay: 5000, // cap the backoff wait between attempts at 5 seconds
  });

  return createLlmExecutor({
    name: "summarize",
    llm,
    prompt: createChatPrompt<{ text: string }>(
      "Summarize the following in one sentence: {{text}}"
    ),
    parser: createParser("string"),
  });
}
  • timeout: maximum time for a single API call, in milliseconds
  • numOfAttempts: total number of attempts, including the first, before the executor throws
  • maxDelay: cap on the backoff wait between attempts, in milliseconds
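
To make the interaction between numOfAttempts and maxDelay concrete, here is a minimal sketch of a capped exponential backoff schedule. It is an illustration only: the helper backoffWaits, the startingDelayMs parameter, and the doubling rule are assumptions made for this example, not llm-exe's documented internals.

```ts
// Hypothetical helper, not part of llm-exe.
// Assumes the wait doubles after each failed attempt and is capped at maxDelayMs.
function backoffWaits(
  numOfAttempts: number,
  startingDelayMs: number,
  maxDelayMs: number
): number[] {
  const waits: number[] = [];
  // There is one wait between each pair of consecutive attempts,
  // so N attempts produce at most N - 1 waits.
  for (let retry = 0; retry < numOfAttempts - 1; retry++) {
    waits.push(Math.min(startingDelayMs * 2 ** retry, maxDelayMs));
  }
  return waits;
}

// 5 attempts, starting at 1 second, capped at 5 seconds.
const waits = backoffWaits(5, 1000, 5000);
console.log(waits); // [1000, 2000, 4000, 5000]
```

Under these assumptions the cap only matters once the doubled delay would exceed it, so with a low numOfAttempts a maxDelay of 5000 may never be reached.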

Step 2 - Observe Failures and Handle the Final Error

Retries handle the transient failures silently. For the failures that survive all attempts, attach an onError hook for telemetry, and catch the typed error at the call site to decide the fallback:

ts
// Assumes isLlmExeError is exported from the package root.
import { isLlmExeError } from "llm-exe";

export async function summarize(text: string): Promise<string | null> {
  const summarizer = createSummarizer();

  summarizer.on("onError", (exec, meta) => {
    console.error(`${meta.name} failed after retries:`, exec.errorMessage);
  });

  try {
    return await summarizer.execute({ text });
  } catch (error) {
    if (isLlmExeError(error)) {
      // error.code identifies what failed (timeout, provider error,
      // parse failure) so you can decide the right fallback per case.
      return null;
    }
    throw error;
  }
}

The hook observes; the try/catch decides. isLlmExeError narrows the error so you can read error.code and error.category to distinguish a timeout from a parse failure from a provider error — see Error Handling for the full list of codes.
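
One way to structure the "decide" half is to map each kind of failure to an explicit fallback. The sketch below is self-contained and deliberately does not use real llm-exe values: the FailureKind strings, the Fallback type, and chooseFallback are all placeholders invented for illustration, and the real codes and categories are the ones listed in Error Handling.

```ts
// Placeholder failure kinds for illustration only.
// These are NOT llm-exe's actual error.code or error.category values.
type FailureKind = "timeout" | "provider" | "parse";

// Hypothetical fallback actions an application might take.
type Fallback =
  | { action: "show-message"; message: string }
  | { action: "retry-later" }
  | { action: "report-bug" };

function chooseFallback(kind: FailureKind): Fallback {
  switch (kind) {
    case "timeout":
      // The user is waiting: fail fast with a visible message.
      return { action: "show-message", message: "Summary unavailable, try again." };
    case "provider":
      // Provider-side errors are often transient: queue the work for later.
      return { action: "retry-later" };
    case "parse":
      // Consistent parse failures point at the prompt or parser, not the network.
      return { action: "report-bug" };
  }
}

console.log(chooseFallback("provider").action); // "retry-later"
```

In a real catch block, the value passed to a function like this would be derived from error.code or error.category after isLlmExeError has narrowed the error.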

What to set the values to

There is no universal right answer, but the trade-offs are consistent:

  • User-facing requests: short timeout (10–15s), 2 attempts. A user won't wait 90 seconds; fail fast and show a fallback.
  • Background jobs: longer timeout (30–60s), 3+ attempts. Nobody is waiting, so trade latency for reliability.
  • Parse failures are rarely transient. Retrying the identical prompt after a parse failure sometimes helps because sampling varies, but if a prompt fails to parse consistently, fix the prompt or the parser instead of raising numOfAttempts to paper over it.
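
A quick way to sanity-check a configuration against these trade-offs is to compute its worst-case latency. The sketch below assumes every attempt runs until its timeout and every wait between attempts hits maxDelay, so it gives an upper bound rather than a typical figure. The RetryConfig interface, the worstCaseMs helper, and the two presets are hypothetical; the maxDelay of 5000 is carried over from Step 1.

```ts
// Hypothetical shape mirroring the three options from Step 1.
interface RetryConfig {
  timeout: number; // per-attempt timeout, in milliseconds
  numOfAttempts: number; // total attempts, including the first
  maxDelay: number; // cap on the wait between attempts, in milliseconds
}

// Upper bound: every attempt times out and every wait hits the cap.
function worstCaseMs(config: RetryConfig): number {
  const attemptTime = config.numOfAttempts * config.timeout;
  const waitTime = (config.numOfAttempts - 1) * config.maxDelay;
  return attemptTime + waitTime;
}

// Example presets following the guidance above (values are illustrative).
const userFacing: RetryConfig = { timeout: 15000, numOfAttempts: 2, maxDelay: 5000 };
const backgroundJob: RetryConfig = { timeout: 60000, numOfAttempts: 3, maxDelay: 5000 };

console.log(worstCaseMs(userFacing)); // 35000
console.log(worstCaseMs(backgroundJob)); // 190000
```

Under these assumptions, even the "fail fast" preset can keep a user waiting up to 35 seconds, which is a reason to keep both the timeout and the attempt count low for interactive requests.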