Lambda Cold Starts: Math, Mitigation, When Not to Care
Cold starts are a percentile problem, not a feature problem. Here's the math, the levers, and when to stop optimising.
Cold starts get treated as a Lambda bug. They aren't — they're a property of the execution model. The question is not "how do I make them zero" (you can't), it's "how much do they cost me in user-visible latency, and is that more than I want to pay?"
1. What actually happens during a cold start
A fresh sandbox goes through three stages, in order. Only the first is the cold start itself:
- Init phase: the Firecracker microVM boots, your runtime starts, dependencies are loaded, top-level code runs. Anywhere from 100ms (Go binary) to several seconds (heavy Node/Python with many imports).
- Invoke phase: your handler executes. Same as warm.
- Idle then reuse: the sandbox stays warm for an undocumented, non-guaranteed period, typically several minutes. The next invocation routed to it skips the init phase.
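The split between init and invoke is easy to see in a handler. A minimal sketch in Python, assuming a hypothetical handler named `handler` and a module-level flag `_is_cold` (both illustrative, not part of any library): code at module scope runs once per sandbox during init, and code inside the handler runs on every invocation.

```python
import time

# Module scope: runs once per sandbox, during the init phase.
_INIT_STARTED = time.monotonic()
_is_cold = True  # flips to False after the first invocation in this sandbox


def handler(event, context):
    """Hypothetical handler that reports whether this invocation was cold."""
    global _is_cold
    was_cold = _is_cold
    _is_cold = False
    return {
        "cold_start": was_cold,
        # Seconds since this sandbox ran its module-level code.
        "sandbox_age_s": round(time.monotonic() - _INIT_STARTED, 3),
    }
```

Logging the `cold_start` field is a cheap way to count cold invocations per function before deciding whether any of this matters.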
2. The math you actually care about
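Cold starts hit the tail, not the mean. A sandbox only goes cold when the gap between requests outlasts the idle window, or when concurrency climbs and Lambda has to start additional sandboxes. At 100 RPS with a 5-minute idle window, virtually every request is warm; cold starts come from scale-out and deployments and are typically a fraction of a percent of invocations. Even at a steady 1 RPS the sandbox rarely idles out. The picture flips when requests arrive minutes apart: at one request every ten minutes, most requests are cold.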
The right framing: cold starts contribute to p99 latency for low-traffic functions. For high-traffic functions they barely register. Check your invocation rate before optimising.
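A back-of-envelope model makes the traffic dependence concrete. This is a sketch under loud assumptions: a single sandbox, Poisson arrivals, and a fixed 300-second idle window (Lambda does not document or guarantee one). It ignores scale-out and deployments, so treat it as a lower bound for busy functions. A request is cold when the gap since the previous one exceeds the idle window, which for Poisson arrivals has probability exp(-rate × idle).

```python
import math


def cold_fraction(requests_per_second: float, idle_timeout_s: float) -> float:
    """P(gap between consecutive requests > idle timeout), Poisson arrivals."""
    return math.exp(-requests_per_second * idle_timeout_s)


IDLE_S = 300  # assumed idle window in seconds; not a documented guarantee

for label, rps in [
    ("1 req / 10 min", 1 / 600),
    ("1 req / 5 min", 1 / 300),
    ("1 req / min", 1 / 60),
]:
    print(f"{label:>15}: {cold_fraction(rps, IDLE_S):.1%} cold")
```

Under these assumptions the cold share is roughly 61% at one request every ten minutes, about 37% at one every five minutes, and under 1% at one per minute. Once the cold share drops below 1%, cold starts move out of p99 and into p99.9.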
3. Runtime is the biggest lever
Typical cold-start times (rough, varies by region and weather):
- Go binary, no deps: ~100–200ms.
- Node.js, ~10 deps: ~300–600ms.
- Python, ~10 deps + pandas: ~800ms–1.5s.
- Java (SnapStart off): 2–5s. SnapStart on: ~300–500ms.
- .NET: 1–3s. Native AOT brings it under 500ms.
If you cannot tolerate the cold-start tail and your function is in Python with a hundred dependencies, the right fix is not a clever config — it's a slimmer dependency tree or a different runtime.
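To see where a given function actually falls in those ranges, the REPORT line that Lambda writes to CloudWatch Logs after each invocation carries an `Init Duration` field on cold starts only. A minimal sketch that pulls it out of a list of log lines; the sample lines and the helper name `init_durations_ms` are made up for illustration, and fetching the logs is left out.

```python
import re
from statistics import median

# "Init Duration" appears on REPORT lines only when the invocation was cold.
INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")


def init_durations_ms(log_lines):
    """Return the init duration, in ms, of every cold-start REPORT line."""
    durations = []
    for line in log_lines:
        if "REPORT RequestId:" not in line:
            continue
        match = INIT_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
    return durations


# Made-up sample lines in the REPORT format, for illustration only.
SAMPLE = [
    "REPORT RequestId: a1\tDuration: 12.10 ms\tBilled Duration: 13 ms\t"
    "Memory Size: 512 MB\tMax Memory Used: 80 MB\tInit Duration: 412.55 ms",
    "REPORT RequestId: a2\tDuration: 9.80 ms\tBilled Duration: 10 ms\t"
    "Memory Size: 512 MB\tMax Memory Used: 80 MB",
    "REPORT RequestId: a3\tDuration: 11.40 ms\tBilled Duration: 12 ms\t"
    "Memory Size: 512 MB\tMax Memory Used: 81 MB\tInit Duration: 388.20 ms",
]

inits = init_durations_ms(SAMPLE)
reports = [line for line in SAMPLE if "REPORT RequestId:" in line]
print(f"cold share: {len(inits) / len(reports):.0%}, median init: {median(inits):.1f} ms")
```

Run over a day of real logs, the same two numbers (cold share and median init) say more about your function than any table of typical values.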
4. Levers in order of effort
4.1 Trim dependencies
On heavy Node functions, a large share of cold-start time is often require() resolving and loading files from node_modules. Don't import what you don't use. Tree-shake. Bundle with esbuild. Removing one heavy import often beats every other optimisation.
4.2 Right-size memory
Memory buys CPU on Lambda: CPU is allocated in proportion to configured memory. A 128 MB function gets a fraction of a vCPU; a 1769 MB function gets a full vCPU. Cold starts scale with CPU. aws-lambda-power-tuning finds the cost-optimal point in about 15 minutes. Sometimes the smaller memory size is also the more expensive one in dollars, because the function runs more than twice as long and you are billed for memory multiplied by duration.
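The billing arithmetic behind that trade-off fits in a few lines. A sketch, assuming a per-GB-second price of $0.0000166667 (an assumed x86 on-demand figure; check current pricing for your region and architecture) and hypothetical measured durations. The per-request charge is left out because it does not depend on memory.

```python
PRICE_PER_GB_S = 0.0000166667  # assumed on-demand price; verify before relying on it


def cost_per_million(memory_mb: int, billed_ms: float,
                     price: float = PRICE_PER_GB_S) -> float:
    """Duration cost in dollars for one million invocations."""
    gb_seconds = (memory_mb / 1024) * (billed_ms / 1000)
    return gb_seconds * price * 1_000_000


# Hypothetical measurements of the same function at three memory sizes.
for memory_mb, billed_ms in [(128, 800), (256, 350), (512, 200)]:
    print(f"{memory_mb:>4} MB @ {billed_ms} ms: "
          f"${cost_per_million(memory_mb, billed_ms):.2f} per million")
```

With these made-up durations, 256 MB is cheaper than 128 MB because doubling the memory cut the duration by more than half, while 512 MB at 200 ms costs the same as 128 MB at 800 ms and returns four times sooner.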
4.3 Provisioned concurrency
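Pre-warmed sandboxes. No cold starts for requests served by the provisioned slots; traffic above the provisioned level spills over to on-demand sandboxes and can still cold start. Costs money even when idle, so only worth it on functions where the tail matters and traffic is predictable enough to plan capacity.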
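Whether the idle charge is worth it comes down to utilisation. A sketch of the break-even, with all three per-GB-second prices as assumptions to be checked against current pricing: provisioned slots bill a flat rate whether used or not, plus a reduced duration rate while executing.

```python
# Assumed per-GB-second prices; verify against current pricing for your region.
ON_DEMAND = 0.0000166667    # on-demand duration
PC_IDLE = 0.0000041667      # provisioned concurrency, billed whether used or not
PC_DURATION = 0.0000097222  # duration while running on a provisioned slot


def break_even_utilisation(on_demand: float = ON_DEMAND,
                           pc_idle: float = PC_IDLE,
                           pc_duration: float = PC_DURATION) -> float:
    """Fraction of time a slot must be busy before it beats on-demand."""
    # Solve pc_idle + u * pc_duration == u * on_demand for u.
    return pc_idle / (on_demand - pc_duration)


def monthly_idle_cost(memory_mb: int, slots: int, pc_idle: float = PC_IDLE) -> float:
    """Flat charge for keeping `slots` sandboxes provisioned for 30 days."""
    seconds = 30 * 24 * 3600
    return slots * (memory_mb / 1024) * seconds * pc_idle


print(f"break-even utilisation: {break_even_utilisation():.0%}")
print(f"10 slots at 1024 MB: ${monthly_idle_cost(1024, 10):.2f} per month before any traffic")
```

Under the assumed prices the break-even sits at about 60% utilisation. Below that, provisioned concurrency is a latency purchase, not a cost saving.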
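4.4 SnapStart (Java, Python, .NET on supported runtimes)
Lambda snapshots the post-init memory state when a version is published and restores from it on cold start, which skips most of the init phase. There is no extra charge on Java; the Python and .NET runtimes bill for snapshot caching and restore. The gotchas: network connections, and anything that must be unique per sandbox (random seeds, temporary credentials), have to be re-established after restore rather than carried over in the snapshot.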
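The connection gotcha is usually handled with runtime hooks. Java exposes them through CRaC; the sketch below uses the Python equivalent, the `snapshot_restore_py` hooks available in the Python managed runtimes that support SnapStart. `FakeConnection` is a hypothetical stand-in for a real client, and the `ImportError` fallback exists only so the sketch runs outside Lambda.

```python
import uuid

try:
    # Runtime hooks provided in the Lambda Python managed runtimes.
    from snapshot_restore_py import register_after_restore, register_before_snapshot
except ImportError:
    # Local fallback so the sketch runs outside Lambda: hooks become no-ops.
    def register_before_snapshot(func):
        return func

    def register_after_restore(func):
        return func


class FakeConnection:
    """Hypothetical stand-in for a database or HTTP client connection."""

    def __init__(self):
        self.id = uuid.uuid4().hex
        self.open = True

    def close(self):
        self.open = False


_conn = FakeConnection()  # created during init, so it would land in the snapshot


@register_before_snapshot
def close_before_snapshot():
    # Do not carry a live socket into the snapshot.
    _conn.close()


@register_after_restore
def reconnect_after_restore():
    # Every restored sandbox gets its own fresh connection.
    global _conn
    _conn = FakeConnection()


def handler(event, context):
    return {"connection_id": _conn.id, "open": _conn.open}
```

The same reasoning applies to anything else that must differ between sandboxes: create it in the after-restore hook, not at module scope.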
5. When NOT to worry
- Async / event-driven functions (SQS, Kinesis, S3 events). Nobody is waiting for the response.
- Functions with steady traffic above ~10 RPS — cold starts are statistical noise.
- Internal admin functions. The cold start is one human waiting two seconds.
The list of where cold starts genuinely matter is shorter than people think: synchronous, low-traffic, user-facing endpoints. That is when you reach for provisioned concurrency.
6. The escape valve
If you cannot get cold starts under your latency budget after every other lever, the workload is probably not a fit for Lambda. ECS Fargate or a small EC2 fleet gives you predictable warm latency at slightly higher operational cost. The savings of "serverless" stop being savings when you are paying for provisioned concurrency at 100% of peak.
The honest summary
Cold starts on Lambda are real and small, mostly. Treat them as a tail-latency line item, measure their actual impact on your p99, and only intervene where the dollars or the user experience justify it. The instinct to "eliminate cold starts" is usually a sign you have not checked the numbers.