Every Lambda invocation goes through two phases. The cold start covers everything before your code runs: downloading the deployment package, provisioning the execution environment, and initializing the runtime. The warm start is everything after β€” just your handler code executing. Cold starts are the visible latency spikes that appear periodically in production Lambda metrics.

The 15-minute reuse window: AWS freezes the execution context after a function returns and thaws it for the next invocation. The reuse window is roughly 15 minutes of inactivity, though AWS does not document or guarantee its length. Any object initialized outside the handler function persists across warm invocations within that window. This is the core optimization lever.

Practical optimizations:

  1. Move heavy initialization outside the handler β€” SDK client instantiation, database connections, and expensive computed values belong at module scope, not inside the handler. They run once per cold start, not once per request.

  2. Use keep-alive HTTP connections — The AWS SDK for JavaScript v2 historically opened a fresh TLS connection for each request. Create an https.Agent({ keepAlive: true }) and pass it as httpOptions.agent when instantiating DynamoDB/S3/etc. clients, or set AWS_NODEJS_CONNECTION_REUSE_ENABLED=1 (SDK v3 enables keep-alive by default). This eliminates TCP and TLS handshake overhead on warm invocations.

  3. Cache SDK clients at module scope β€” Same principle: instantiate new AWS.DynamoDB(...) outside the handler so it’s reused across invocations.

  4. Use /tmp for cross-invocation caching — Lambda provides 512 MB of ephemeral storage at /tmp by default (configurable up to 10 GB) that persists for the lifetime of the execution environment. Store computed artifacts, downloaded config files, or ML model weights here to avoid re-fetching on every warm start.

  5. Mind background processes β€” If you fire async work inside the handler, ensure it completes before the handler returns. Frozen context can carry incomplete background work into the next invocation, causing subtle bugs.

The key insight: cold starts are unavoidable on the first invocation, whenever concurrency scales out to a new execution environment, and after ~15 minutes of inactivity. Optimizing warm starts is where most of the latency wins live, because they represent the majority of invocations in any steady-traffic system.

Connections

  • seekable-oci β€” Related pattern: lazy-loading container images to reduce cold start for ECS/Fargate/EKS
  • tail-latency β€” Cold starts manifest as tail latency spikes in Lambda-based systems

Sources