AWS Lambda

Overview

AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers. Create workload-aware cluster scaling logic, maintain event integrations, and manage runtimes with ease. With Lambda, you can run code for virtually any type of application or backend service, all with zero administration, and only pay for what you use. You are charged based on the number of requests for your functions and the duration it takes for your code to execute.

Lambda counts a request each time it starts executing in response to an event notification trigger, such as from Amazon Simple Notification Service (SNS) or Amazon EventBridge, or an invoke call, such as from Amazon API Gateway, or via the AWS SDK, including test invokes from the AWS Console.

Duration is calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest 1 ms*. The price depends on the amount of memory you allocate to your function. In the AWS Lambda resource model, you choose the amount of memory you want for your function, and are allocated proportional CPU power and other resources. An increase in memory size triggers an equivalent increase in CPU available to your function. To learn more, see the Function Configuration documentation.

You can run your Lambda functions on processors built on either x86 or Arm architectures. AWS Lambda functions running on Graviton2, using an Arm-based processor architecture designed by AWS, deliver up to 34% better price performance compared to functions running on x86 processors. This applies to a variety of serverless workloads, such as web and mobile backends, data, and media processing.

* Duration charges apply to code that runs in the handler of a function as well as initialization code that is declared outside of the handler. For Lambda functions with AWS Lambda Extensions, duration also includes the time it takes for code in the last running extension to finish executing during shutdown phase. For Lambda functions configured with SnapStart, duration also includes the time it takes for the runtime to load, any code that runs in a runtime hook, and the initialization code executed during creation of copies of snapshots created for resilience. For more details, see the Lambda Programming Model documentation.

The AWS Lambda free tier includes one million free requests per month and 400,000 GB-seconds of compute time per month, usable in aggregate across functions powered by x86 and Graviton2 processors. Additionally, the free tier includes 100 GiB of HTTP response streaming per month, beyond the first 6 MB per request, which is free. Lambda also offers tiered pricing options for on-demand duration above certain monthly usage thresholds. AWS Lambda participates in Compute Savings Plans, a flexible pricing model that offers low prices on Amazon Elastic Compute Cloud (Amazon EC2), AWS Fargate, and Lambda usage, in exchange for a commitment to a consistent amount of usage (measured in $/hour) for a one- or three-year term. With Compute Savings Plans, you can save up to 17 percent on AWS Lambda. Savings apply to duration and Provisioned Concurrency.
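As a rough illustration of how request charges, duration charges, and the free tier combine, here is a minimal Python sketch. The function name and structure are hypothetical. The rates are the US East figures quoted in the worked examples on this page and should be treated as assumptions for any other region. The sketch ignores tiered pricing, Savings Plans, and per-invocation rounding to 1 ms (it uses an average duration), and it handles one architecture at a time even though the free tier applies in aggregate.

```python
# Rates quoted in the worked examples on this page (US East); assumptions elsewhere.
REQUEST_PRICE_PER_MILLION = 0.20
PRICE_PER_GB_SECOND = {"x86": 0.0000166667, "arm": 0.0000133334}

# Monthly free tier described above.
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000


def monthly_on_demand_cost(requests, avg_duration_ms, memory_mb, arch="x86"):
    """Estimate monthly request + duration charges after the free tier."""
    # GB-seconds = invocations x seconds per invocation x memory in GB.
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)

    billable_requests = max(requests - FREE_REQUESTS, 0)
    billable_gb_seconds = max(gb_seconds - FREE_GB_SECONDS, 0)

    request_cost = billable_requests / 1_000_000 * REQUEST_PRICE_PER_MILLION
    duration_cost = billable_gb_seconds * PRICE_PER_GB_SECOND[arch]
    return request_cost + duration_cost


if __name__ == "__main__":
    # Hypothetical workload: 3M requests, 120 ms average, 1536 MB, x86.
    print(round(monthly_on_demand_cost(3_000_000, 120, 1536), 2))
```

A workload that stays under both free-tier limits yields a cost of zero in this sketch.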




AWS Lambda Pricing

Asynchronous events (including events from S3, SNS, EventBridge, Step Functions, and CloudWatch Logs): You are charged 1 request for the first 256 KB of each asynchronous event. For events larger than 256 KB, you are charged 1 additional request for each 64 KB chunk, up to 1 MB.
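The chunking rule above can be sketched as a small Python function. This is an illustration only: the function name is hypothetical, and it assumes that a partial 64 KB chunk is billed as a whole chunk, that 1 MB means 1,024 KB, and that sizes above 1 MB are capped at 1 MB. None of these details are confirmed by the text above.

```python
import math

BASE_KB = 256      # first 256 KB is covered by a single request
CHUNK_KB = 64      # each further 64 KB chunk adds one request
MAX_KB = 1024      # assumed cap: 1 MB = 1,024 KB


def async_event_requests(event_size_kb):
    """Return the number of billed requests for one asynchronous event."""
    if event_size_kb <= BASE_KB:
        return 1
    # Assumption: sizes are capped at 1 MB and partial chunks round up.
    extra_kb = min(event_size_kb, MAX_KB) - BASE_KB
    return 1 + math.ceil(extra_kb / CHUNK_KB)


if __name__ == "__main__":
    for size in (100, 256, 257, 320, 321, 1024):
        print(size, "KB ->", async_event_requests(size), "request(s)")
```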

Duration cost depends on the amount of memory you allocate to your function. You can allocate any amount of memory to your function between 128 MB and 10,240 MB, in 1 MB increments. The table below contains a few examples of the price per 1 ms associated with different memory sizes, for usage falling within the first pricing tier (for example, up to 6 billion GB-seconds per month in US East (Ohio)).

  • x86 Price
  • Arm Price
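To make the relationship between memory size and per-millisecond price concrete, the following Python sketch derives a price per 1 ms from a per-GB-second rate. It assumes the price scales linearly with configured memory, which is how the worked examples on this page compute GB-seconds. The function name is hypothetical, and the rate must be supplied by the caller, since it varies by region, architecture, and pricing tier.

```python
def price_per_ms(memory_mb, price_per_gb_second):
    """Price of 1 ms of execution at the given memory size."""
    if not 128 <= memory_mb <= 10_240:
        raise ValueError("memory must be between 128 MB and 10,240 MB")
    # Convert MB to GB, then convert the per-second rate to a per-ms rate.
    return (memory_mb / 1024) * price_per_gb_second / 1000


if __name__ == "__main__":
    # Assumed rate: the x86 per-GB-second figure used in an example on this page.
    rate = 0.0000166667
    for mb in (128, 512, 1024, 10_240):
        print(f"{mb} MB: ${price_per_ms(mb, rate):.10f} per 1 ms")
```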

Lambda on-demand duration pricing tiers are applied to aggregate monthly duration of your functions running on the same architecture (x86 or Arm, respectively), in the same region, within the account. If you’re using consolidated billing in AWS Organizations, pricing tiers are applied to the aggregate monthly duration of your functions running on the same architecture, in the same region, across the accounts in the organization.

Lambda Managed Instances

Lambda Managed Instances enables you to run Lambda functions on fully-managed EC2 instances in your VPC, combining Lambda's serverless developer experience with the cost efficiency and hardware flexibility of EC2. This feature is ideal for steady-state, high-volume workloads where you want to optimize costs while maintaining Lambda's operational simplicity.

With Lambda Managed Instances, you can select from a wide variety of current-generation EC2 instance types to match your workload requirements, benefit from EC2 pricing options including EC2 Instance Savings Plans, Compute Savings Plans, and Reserved Instances, and process multiple requests concurrently within the same execution environment to maximize resource utilization. Lambda automatically manages instance provisioning, scaling, patching, and lifecycle management, while you retain the familiar Lambda programming model and seamless integration with event sources like SQS, Kinesis, and Kafka.

Pricing:
Lambda Managed Instances pricing has three components:

1. Request charges: $0.20 per million requests
2. Compute management fee: 15% premium on the EC2 on-demand instance price for the instances provisioned and managed by Lambda (Premium for each instance type provided below)
3. EC2 instance charges: Standard EC2 instance pricing applies for the instances provisioned in your capacity provider. You can reduce costs by using Compute Savings Plans, Reserved Instances, or other EC2 pricing options

Note that, unlike functions using the default Lambda compute type, functions running on Lambda Managed Instances are not charged separately for the execution duration of each request.

Event Source Mappings: For workloads using provisioned Event Poller Units (EPUs) with event sources like Kafka or SQS, standard EPU pricing applies.

Management Fees
  • Suppose you're running a high-traffic API service that processes 100 million requests per month with an average duration of 200ms per request. You configure your Lambda Managed Instance capacity provider to use m7g.xlarge instances (4 vCPU, 16 GB memory, Graviton3) and use a 3-year Compute Savings Plan for maximum cost savings.

    Monthly charges

    Request charges
    Monthly requests: 100M requests
    Request price: $0.20 per million requests
    Monthly request charges: 100M / 1M × $0.20 = $20

    Compute charges
    Instance type: m7g.xlarge
    EC2 on-demand price: $0.1632 per hour (US East N. Virginia)
    With 3-year Compute Savings Plan discount (72%): $0.0457 per hour
    Estimated instance hours needed: ~2,000 hours/month (based on workload pattern and multi-concurrency)
    Monthly EC2 instance charges: 2,000 × $0.0457 = $91.40

    Management fee charges
    Management fee: 15% of EC2 on-demand price
    Management fee per hour: $0.1632 × 0.15 = $0.02448 per hour
    Monthly management fee: 2,000 × $0.02448 = $48.96

    Total monthly charges
    Total charges = Request charges + EC2 instance charges + Management fee charges
    Total charges = $20 + $91.40 + $48.96 = $160.36
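The three pricing components in the example above can be reproduced with a short Python sketch. The function and parameter names are hypothetical. The sketch takes the discounted hourly rate as an input (the example rounds it to $0.0457 before multiplying) and, as in the example, applies the 15% management fee to the on-demand price, not the discounted one.

```python
def managed_instances_monthly_cost(
    requests,
    instance_hours,
    on_demand_hourly,
    effective_hourly,
    request_price_per_million=0.20,
    management_fee_rate=0.15,
):
    """Break down monthly Lambda Managed Instances charges."""
    request_charges = requests / 1_000_000 * request_price_per_million
    # Instance charges use the rate actually paid (e.g. after a Savings Plan).
    instance_charges = instance_hours * effective_hourly
    # The management fee is a percentage of the on-demand price.
    management_fee = instance_hours * on_demand_hourly * management_fee_rate
    total = request_charges + instance_charges + management_fee
    return {
        "requests": request_charges,
        "instances": instance_charges,
        "management_fee": management_fee,
        "total": total,
    }


if __name__ == "__main__":
    # Inputs taken from the m7g.xlarge example above.
    costs = managed_instances_monthly_cost(
        requests=100_000_000,
        instance_hours=2_000,
        on_demand_hourly=0.1632,
        effective_hourly=0.0457,
    )
    for name, value in costs.items():
        print(f"{name}: ${value:.2f}")
```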

Lambda Durable Functions Pricing

Lambda durable functions simplify how you build reliable multi-step applications and AI workflows directly within Lambda’s existing programming model, enabling resilient and cost-effective long-running workloads. In durable functions, you use durable operations like “steps” and “waits”, which are checkpoints with optional data stored for extended periods, allowing your function to resume execution after interruptions. When functions resume, the system performs replay, automatically re-executing the event handler from the beginning while skipping completed checkpoints and continuing from the point of interruption. The lifecycle may include multiple sub-invocations (Lambda function invocations that occur when resuming after wait operations, retries, or infrastructure failures) to complete the execution. 

Existing Lambda compute charges apply, including for sub-invocations from replays. When using wait operations, the function suspends execution and, for on-demand functions, does not incur duration charges until execution resumes. In addition, you are charged for durable operations (such as starting executions, completing steps, and creating waits). You also pay for the amount of data written by these operations (in GB) and for data retention during and after execution (in GB-month, prorated). The retention period after completion is configurable from 1 to 90 days (default 14 days). 

For a full list and detailed description of durable operations, see the Lambda Developer Guide.

  • An insurance claim processing system uses Lambda durable functions to analyze claims for fraud detection, coordinate human review for high-value claims, and process approved payments. The process begins with a document analysis step that takes 30 seconds to perform LLM-based fraud detection and risk assessment. The execution then uses a wait to suspend execution for a human review (typically a 7-day wait), during which an adjuster reviews claims exceeding automatic approval thresholds. Finally, a payment step takes 2 seconds to process the approval decision and initiate the payment. The system processes 1,000,000 insurance claims per month. Each execution uses an 8 KB invocation payload and 32 KB payloads for claim analysis (step 1), approval decisions (wait), and final payment processing (step 2). The function is configured with 1 GB of memory on an Arm-based processor. Completed claim records are retained for 14 days for audit and compliance.

    Note: Examples are based on price in US East (N. Virginia). All executions start at the beginning of the month and all steps succeed on first attempt without retries to simplify the calculations.

    Monthly Compute Charges

    Total compute (seconds): 1,000,000 × 32s = 32,000,000 seconds
    Total compute (GB-s): 32,000,000 × 1GB = 32,000,000 GB-s
    Billable compute: 32,000,000 - 400,000 free tier = 31,600,000 GB-s
    Compute cost: 31,600,000 × $0.0000133334 = $421.34

    Monthly Request Charges

    Total requests: 2 invocations (initial + after wait) × 1,000,000 = 2,000,000 requests
    Billable requests: 2,000,000 - 1M free tier = 1,000,000
    Request cost: 1M × $0.20/M = $0.20


    Monthly Durable Functions Charges

    Operations: 1M × (1 start execution + 2 steps + 1 wait) = 4M
    Operations cost: 4M × $8.00/M = $32.00
    Data written: 1M × (8KB invoke + 3 × 32KB steps/waits) = 104GB
    Data written cost: 104GB × $0.25/GB = $26.00
    Storage (running, incl. 7-day wait): 104GB × (7/30) = 24.27 GB-month
    Storage (retained 14 days): 104GB × (14/30) = 48.53 GB-month
    Data retained cost: (24.27 + 48.53) GB-month × $0.15/GB-month = $10.92

     

    Total Monthly Charges

    Total charges: $421.34 + $0.20 + $32.00 + $26.00 + $10.92 = $490.46
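The arithmetic in the example above can be checked with a short Python sketch. All names are hypothetical, and the prices are the US East (N. Virginia) figures that appear in the example, not an authoritative rate card. Like the example, the sketch assumes decimal units (1 GB = 1,000,000 KB), a 30-day month for prorating storage, and no retries.

```python
# Prices as they appear in the example above (US East, N. Virginia).
ARM_PRICE_PER_GB_SECOND = 0.0000133334
REQUEST_PRICE_PER_MILLION = 0.20
OPERATION_PRICE_PER_MILLION = 8.00
DATA_WRITTEN_PRICE_PER_GB = 0.25
RETENTION_PRICE_PER_GB_MONTH = 0.15
FREE_GB_SECONDS = 400_000
FREE_REQUESTS = 1_000_000


def durable_example_monthly_cost(
    executions=1_000_000,
    compute_seconds=32,          # 30 s analysis step + 2 s payment step
    memory_gb=1.0,
    invocations=2,               # initial invocation + resume after the wait
    operations=4,                # 1 start + 2 steps + 1 wait
    kb_written=8 + 3 * 32,       # invoke payload + three 32 KB payloads
    wait_days=7,
    retention_days=14,
):
    """Reproduce the monthly cost breakdown of the insurance claim example."""
    gb_seconds = executions * compute_seconds * memory_gb
    compute = max(gb_seconds - FREE_GB_SECONDS, 0) * ARM_PRICE_PER_GB_SECOND

    total_requests = executions * invocations
    requests = max(total_requests - FREE_REQUESTS, 0) / 1e6 * REQUEST_PRICE_PER_MILLION

    ops = executions * operations / 1e6 * OPERATION_PRICE_PER_MILLION

    data_gb = executions * kb_written / 1e6  # decimal KB -> GB
    written = data_gb * DATA_WRITTEN_PRICE_PER_GB

    # Storage is prorated over an assumed 30-day month.
    gb_months = data_gb * (wait_days + retention_days) / 30
    retention = gb_months * RETENTION_PRICE_PER_GB_MONTH

    total = compute + requests + ops + written + retention
    return {"compute": compute, "requests": requests, "operations": ops,
            "data_written": written, "retention": retention, "total": total}


if __name__ == "__main__":
    for name, value in durable_example_monthly_cost().items():
        print(f"{name}: ${value:.2f}")
```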

Tenant Isolation Pricing

Enable tenant isolation mode to isolate request processing for individual end-users or tenants invoking your Lambda function. The underlying execution environments for a tenant-isolated Lambda function are always associated with a particular tenant and are never used to execute requests from other tenants invoking the same function. This capability simplifies developing and maintaining multi-tenant applications that process tenant-specific code or data with strict isolation requirements across tenants. You are charged when Lambda creates a new tenant-isolated execution environment to serve a request, depending on the amount of memory you allocate to your function and the CPU architecture you use. To learn more about Lambda's tenant isolation capability, read the documentation.

  • Multi-tenant SaaS application

    Let’s assume you are building an automation platform that executes user-provided code in response to events. For example, an IT team may want to execute an automated workflow when a new employee joins their organization or transfers across departments. As another example, a DevOps team may want to trigger a CI/CD workflow when a developer commits code changes to their source code repository. Your automation platform is multi-tenant, meaning that it serves multiple end-users. Because you expect high variation in demand, by time of day and for each end-user or tenant, you build your platform using serverless services including AWS Lambda.

    Your automation platform supports the ability to run user-supplied code in response to events. Since you do not control the code provided by users, you enable tenant isolation mode to ensure that Lambda function invocations for each end-user are processed in separate execution environments that are isolated from one another.

    Assume that you have configured your Lambda function with 1024 MB of memory and x86 CPU architecture. During a typical month, your function processes 10M invokes with an average duration of 2 seconds per invoke. Your SaaS platform is used by 1K end-users or tenants. For simplicity, let’s assume that on average each tenant generates 10K invokes per month and Lambda creates 200 execution environments per tenant (i.e. a cold-start rate of 2% per tenant).

    Your charges would be calculated as follows:

    Request charges
    Per month, your function executes 10M times.

    Monthly request charges: 10M * $0.2/M = $2.

    Compute charges
    Per month, your function executes 10M times with an average duration of 2s. Your function's configured memory is 1024 MB.

    Monthly compute duration (seconds): 10M * 2s = 20M seconds
    Monthly compute (GB-s): 20M seconds * 1024 MB / 1024 MB = 20M GB-s
    Monthly compute charges: 20M * $0.0000166667 = $333.34

    Tenant isolation charges
    Per month, on average, your function serves 1K unique tenants. Each tenant invokes the function 10K times with an average of 200 execution environments created per tenant (i.e. average cold-start rate of 2% for each tenant).

    Monthly execution environments created for 1K tenants: 200 * 1K = 200K
    Monthly tenant isolation charges: 200K * $0.000167 * 1024 MB / 1024MB = $33.4

    Total monthly charges
    Total charges = Request charges + Compute charges + Tenant isolation charges
    Total charges = $2 + $333.34 + $33.4 = $368.74
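The tenant isolation component of the example above can be sketched in Python as follows. The function name is hypothetical. The default rate of $0.000167 per execution environment per GB of configured memory is read off the example's formula and should be treated as an assumption for other regions or architectures.

```python
def tenant_isolation_charge(
    tenants,
    environments_per_tenant,
    memory_mb,
    price_per_environment_gb=0.000167,  # rate taken from the example above
):
    """Monthly charge for creating tenant-isolated execution environments."""
    environments = tenants * environments_per_tenant
    # The charge scales with configured memory, expressed in GB.
    return environments * price_per_environment_gb * (memory_mb / 1024)


if __name__ == "__main__":
    # 1K tenants, 200 environments each, 1024 MB of memory.
    print(f"${tenant_isolation_charge(1_000, 200, 1024):.2f}")

    # The 2% cold-start rate in the example is environments / invokes per tenant.
    print(f"cold-start rate: {200 / 10_000:.0%}")
```

Because the charge depends on how many environments Lambda creates, not on how many requests are served, lowering the cold-start rate per tenant reduces this component directly.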

Lambda Ephemeral Storage Pricing

Ephemeral storage cost depends on the amount of ephemeral storage you allocate to your function, and function execution duration, measured in milliseconds. You can allocate any additional amount of storage to your function between 512 MB and 10,240 MB, in 1 MB increments. You can configure ephemeral storage for functions running on both x86 and Arm architectures. 512 MB of ephemeral storage is available to each Lambda function at no additional cost. You only pay for the additional ephemeral storage you configure.
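As an illustration of the rule above, the following Python sketch computes the billable ephemeral storage in GB-seconds, counting only storage configured beyond the 512 MB included at no additional cost. The function names are hypothetical, and the per-GB-second price is not stated in this section, so it is left as a parameter the caller must supply.

```python
INCLUDED_MB = 512    # available to every function at no additional cost
MAX_MB = 10_240


def billable_ephemeral_gb_seconds(configured_mb, duration_ms, invocations=1):
    """GB-seconds of ephemeral storage billed beyond the included 512 MB."""
    if not INCLUDED_MB <= configured_mb <= MAX_MB:
        raise ValueError("ephemeral storage must be between 512 MB and 10,240 MB")
    extra_gb = (configured_mb - INCLUDED_MB) / 1024
    return extra_gb * (duration_ms / 1000) * invocations


def ephemeral_storage_cost(configured_mb, duration_ms, invocations, price_per_gb_second):
    """Cost of the additional ephemeral storage at a caller-supplied rate."""
    gb_seconds = billable_ephemeral_gb_seconds(configured_mb, duration_ms, invocations)
    return gb_seconds * price_per_gb_second


if __name__ == "__main__":
    # Maximum storage, one invocation lasting one second.
    print(billable_ephemeral_gb_seconds(10_240, 1_000))
```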

All examples below are based on price in US East (N. Virginia).