
Latency budgets that keep you honest

If you don’t budget for latency, latency will budget for you. This post shares the lightweight framework we use to keep requests fast as features grow.

Why budgets at all?

Teams rarely intend to ship slow software; it happens by a thousand paper cuts. A latency budget is a time allowance per request that we split across hops—client, edge, API, DB, third-party.

Budgets turn vibes into numbers. Numbers unlock trade-offs you can discuss—and test.

Our default numbers

For interactive views, we target p95 ≤ 400ms end-to-end on broadband, and we test p95 ≤ 700ms on 3G throttling. Here’s a typical split:

Weekly performance review. Keep graphs boring; users like fast more than flashy.

A sample split

  • Client render & hydration: 80ms
  • Edge + routing: 30ms
  • API compute: 120ms
  • DB (read): 120ms
  • Third-party: 50ms (amortized)

We don’t worship these numbers; they’re a working theory. The key is to defend the total.
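To make "defend the total" something a script can check, here is a minimal sketch. The hop keys and the helper names `allocated` and `checkBudget` are illustrative assumptions, and the measured timings in the usage line are made-up inputs, not real data.

```javascript
// Hypothetical budget table mirroring the sample split above (values in ms).
const BUDGET_MS = {
  total: 400,
  hops: { client: 80, edge: 30, api: 120, db: 120, thirdParty: 50 },
};

// Sum the per-hop allowances so the split can be checked against the total.
function allocated(budget) {
  return Object.values(budget.hops).reduce((sum, ms) => sum + ms, 0);
}

// Compare measured per-hop timings with the budget.
// Returns the measured total, whether it fits, and the hops that overspent.
function checkBudget(budget, measured) {
  const overspent = Object.entries(measured)
    .filter(([hop, ms]) => ms > (budget.hops[hop] ?? 0))
    .map(([hop, ms]) => ({ hop, ms, allowed: budget.hops[hop] ?? 0 }));
  const totalMs = Object.values(measured).reduce((sum, ms) => sum + ms, 0);
  return { totalMs, withinTotal: totalMs <= budget.total, overspent };
}

// Usage with made-up timings: the DB hop overspends, but a fast API hop
// keeps the request inside the 400 ms total.
const report = checkBudget(BUDGET_MS, {
  client: 70, edge: 25, api: 90, db: 150, thirdParty: 40,
});
console.log(allocated(BUDGET_MS), report);
```

Reporting per-hop overspend separately from the total reflects the idea that one hop may borrow from another as long as the sum holds.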

Minimal instrumentation

Measure where time goes. Below is a tiny fetch wrapper that reads the Server-Timing response header and, in dev builds, logs each request's round-trip time alongside it.


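// app/lib/request.js
// Thin fetch wrapper: times the request and surfaces the Server-Timing header in dev builds.
export async function timedFetch(url, opts = {}) {
  const t0 = performance.now();
  const res = await fetch(url, opts);
  // fetch resolves once response headers arrive, so this excludes body download time.
  const t1 = performance.now();
  // Server-side phase timings, if the backend sends them (empty string otherwise).
  const serverTiming = res.headers.get("server-timing") || "";
  // import.meta.env.DEV is a Vite-style dev flag; the optional chaining tolerates an undefined env.
  if (import.meta?.env?.DEV) {
    console.debug("[timedFetch]", { url, ms: Math.round(t1 - t0), serverTiming });
  }
  return res;
}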
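
The wrapper logs one request at a time. To see a p95 locally, the individual durations have to be collected first. The following is a rough sketch of one way to do that, using the nearest-rank percentile method; `percentile`, `createLatencyRecorder`, and the `/api/dashboard` key are hypothetical names introduced for illustration.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it. Returns NaN for no samples.
function percentile(samples, p) {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical in-memory recorder: keeps the last `limit` durations per key
// so a long dev session does not grow without bound.
function createLatencyRecorder(limit = 200) {
  const byKey = new Map();
  return {
    record(key, ms) {
      const list = byKey.get(key) ?? [];
      list.push(ms);
      if (list.length > limit) list.shift();
      byKey.set(key, list);
    },
    p95(key) {
      return percentile(byKey.get(key) ?? [], 95);
    },
  };
}

// Usage with synthetic durations of 1..100 ms.
const recorder = createLatencyRecorder();
for (let ms = 1; ms <= 100; ms++) recorder.record("/api/dashboard", ms);
console.log(recorder.p95("/api/dashboard"));
```

A wrapper like `timedFetch` could call `recorder.record(url, t1 - t0)` inside its dev-only branch. With only a handful of samples the p95 is simply the slowest request, so the number becomes meaningful only after a few dozen calls.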
            

Guard rails in tests

We add a simple budget assertion to E2E flows. Keep it generous to avoid flakiness, but strict enough to catch regressions.


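// tests/latency.spec.ts
import { test, expect } from "@playwright/test";

test("dashboard stays under budget", async ({ page }) => {
  const start = Date.now();
  // The relative URL assumes baseURL is set in the Playwright config.
  await page.goto("/dashboard");
  // Stop the clock once the main heading is visible.
  await page.getByRole("heading", { name: "Overview" }).waitFor();
  const elapsed = Date.now() - start;
  // 700 ms is the throttled-3G p95 target, used here as a generous single-run ceiling.
  expect(elapsed).toBeLessThan(700);
});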
            

Playbook: when over budget

  1. Confirm the regression: throttle, warm caches, retry.
  2. Localize the slow hop: server-timing + spans.
  3. Decide a trade-off: precompute, cache, or cut scope.
  4. Write down the decision for the next you.
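
For step 2, the Server-Timing header already carries per-hop durations if the backend emits them. Below is a simplified sketch of pulling out the slowest hop. It assumes quoted descriptions contain no commas or semicolons, which a full parser would have to handle. The function names and the sample header value are hypothetical.

```javascript
// Parse a Server-Timing header value into [{ name, dur, desc }].
// Simplification (assumption): quoted descriptions contain no "," or ";".
function parseServerTiming(header) {
  return header
    .split(",")
    .map((entry) => entry.trim())
    .filter(Boolean)
    .map((entry) => {
      const [name, ...params] = entry.split(";").map((part) => part.trim());
      const metric = { name, dur: 0, desc: "" };
      for (const param of params) {
        const eq = param.indexOf("=");
        if (eq === -1) continue;
        const key = param.slice(0, eq).trim().toLowerCase();
        const value = param.slice(eq + 1).trim().replace(/^"|"$/g, "");
        if (key === "dur") metric.dur = Number(value) || 0;
        if (key === "desc") metric.desc = value;
      }
      return metric;
    });
}

// Pick the metric with the largest duration: the first hop to investigate.
// Returns null when the header is empty.
function slowestHop(header) {
  return parseServerTiming(header).reduce(
    (worst, m) => (worst === null || m.dur > worst.dur ? m : worst),
    null
  );
}

// Hypothetical header value; the metric names are illustrative.
const header = 'edge;dur=28, api;dur=110;desc="API compute", db;dur=190';
console.log(slowestHop(header));
```

This only covers what the server chooses to report; client render time and network transit still need spans or browser tooling.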

Footnotes

1 Budgets aren’t a license to ignore accessibility or reliability; slowness is a tax, not a feature.

Omar S.
Engineer & systems tinkerer. Likes boring software that’s fast.
