AWS Lambda pricing
Overview
AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers. Create workload-aware cluster scaling logic, maintain event integrations, and manage runtimes with ease. With Lambda, you can run code for virtually any type of application or backend service, all with zero administration, and only pay for what you use. You are charged based on the number of requests for your functions and the duration it takes for your code to execute.
Lambda counts a request each time it starts executing in response to an event notification trigger, such as from Amazon Simple Notification Service (SNS) or Amazon EventBridge, or an invoke call, such as from Amazon API Gateway, or via the AWS SDK, including test invokes from the AWS Console.
Duration is calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest 1 ms*. The price depends on the amount of memory you allocate to your function. In the AWS Lambda resource model, you choose the amount of memory you want for your function, and are allocated proportional CPU power and other resources. An increase in memory size triggers an equivalent increase in CPU available to your function. To learn more, see the Function Configuration documentation.
You can run your Lambda functions on processors built on either x86 or Arm architectures. AWS Lambda functions running on Graviton2, using an Arm-based processor architecture designed by AWS, deliver up to 34% better price performance compared to functions running on x86 processors. This applies to a variety of serverless workloads, such as web and mobile backends, data, and media processing.
* Duration charges apply to code that runs in the handler of a function as well as initialization code that is declared outside of the handler. For Lambda functions with AWS Lambda Extensions, duration also includes the time it takes for code in the last running extension to finish executing during shutdown phase. For Lambda functions configured with SnapStart, duration also includes the time it takes for the runtime to load, any code that runs in a runtime hook, and the initialization code executed during creation of copies of snapshots created for resilience. For more details, see the Lambda Programming Model documentation.
The AWS Lambda free tier includes one million free requests per month and 400,000 GB-seconds of compute time per month, usable for functions powered by both x86 and Graviton2 processors, in aggregate. Additionally, the free tier includes 100 GiB of HTTP response streaming per month, beyond the first 6 MB per request, which are free. Lambda also offers tiered pricing options for on-demand duration above certain monthly usage thresholds. AWS Lambda participates in Compute Savings Plans, a flexible pricing model that offers low prices on Amazon Elastic Compute Cloud (Amazon EC2), AWS Fargate, and Lambda usage, in exchange for a commitment to a consistent amount of usage (measured in $/hour) for a one- or three-year term. With Compute Savings Plans, you can save up to 17 percent on AWS Lambda. Savings apply to duration and Provisioned Concurrency.
AWS Pricing Calculator
Calculate your AWS Lambda and architecture cost in a single estimate.
AWS Lambda Pricing
Asynchronous Event (including events from S3, SNS, EventBridge, Step Functions, CloudWatch Logs): You are charged 1 request for each asynchronous event, covering the first 256 KB. Individual event size beyond 256 KB is charged 1 additional request for each 64 KB chunk, up to 1 MB.
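The chunking rule above can be sketched as a small function. This is a minimal illustration, not an official billing formula: it assumes that a partial 64 KB chunk is billed as a full chunk and that 1 MB means 1,024 KB. The function and constant names are hypothetical.

```python
import math

FIRST_CHUNK_KB = 256   # size covered by the first request
EXTRA_CHUNK_KB = 64    # each additional chunk adds one request
MAX_EVENT_KB = 1024    # assumption: 1 MB = 1,024 KB

def billed_async_requests(event_size_kb: float) -> int:
    """Estimate the number of requests billed for one asynchronous event."""
    if event_size_kb < 0 or event_size_kb > MAX_EVENT_KB:
        raise ValueError("event size must be between 0 and 1,024 KB")
    if event_size_kb <= FIRST_CHUNK_KB:
        return 1
    # Assumption: a partially filled chunk counts as a whole chunk.
    extra_chunks = math.ceil((event_size_kb - FIRST_CHUNK_KB) / EXTRA_CHUNK_KB)
    return 1 + extra_chunks
```

Under these assumptions, a 300 KB event would count as 2 requests and a full 1 MB event as 13.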
Duration cost depends on the amount of memory you allocate to your function. You can allocate any amount of memory to your function between 128 MB and 10,240 MB, in 1 MB increments. The table below contains a few examples of the price per 1 ms associated with different memory sizes, for usage falling within the first pricing tier (for example, up to 6 billion GB-seconds per month in US East (Ohio)).
(Table: price per 1 ms by memory size, with columns for x86 Price and Arm Price.)
Lambda on-demand duration pricing tiers are applied to aggregate monthly duration of your functions running on the same architecture (x86 or Arm, respectively), in the same region, within the account. If you’re using consolidated billing in AWS Organizations, pricing tiers are applied to the aggregate monthly duration of your functions running on the same architecture, in the same region, across the accounts in the organization.
Lambda Managed Instances
Lambda Managed Instances enables you to run Lambda functions on fully-managed EC2 instances in your VPC, combining Lambda's serverless developer experience with the cost efficiency and hardware flexibility of EC2. This feature is ideal for steady-state, high-volume workloads where you want to optimize costs while maintaining Lambda's operational simplicity.
With Lambda Managed Instances, you can select from a wide variety of current-generation EC2 instance types to match your workload requirements, benefit from EC2 pricing options including EC2 Instance Savings Plans, Compute Savings Plans, and Reserved Instances, and process multiple requests concurrently within the same execution environment to maximize resource utilization. Lambda automatically manages instance provisioning, scaling, patching, and lifecycle management, while you retain the familiar Lambda programming model and seamless integration with event sources like SQS, Kinesis, and Kafka.
Pricing:
Lambda Managed Instances pricing has three components:
1. Request charges: $0.20 per million requests
2. Compute management fee: 15% premium on the EC2 on-demand instance price for the instances provisioned and managed by Lambda (Premium for each instance type provided below)
3. EC2 instance charges: Standard EC2 instance pricing applies for the instances provisioned in your capacity provider. You can reduce costs by using Compute Savings Plans, Reserved Instances, or other EC2 pricing options
Note that, unlike functions on the Lambda (default) compute type, Lambda Managed Instances functions are not charged separately for the execution duration of each request.
Event Source Mappings: For workloads using provisioned Event Poller Units (EPUs) with event sources like Kafka or SQS, standard EPU pricing applies.
Pricing Example: High-throughput API service
Suppose you're running a high-traffic API service that processes 100 million requests per month with an average duration of 200ms per request. You configure your Lambda Managed Instance capacity provider to use m7g.xlarge instances (4 vCPU, 16 GB memory, Graviton3) and use a 3-year Compute Savings Plan for maximum cost savings.
Monthly charges
Request charges
Monthly requests: 100M requests
Request price: $0.20 per million requests
Monthly request charges: 100M / 1M × $0.20 = $20
Compute charges
Instance type: m7g.xlarge
EC2 on-demand price: $0.1632 per hour (US East N. Virginia)
With 3-year Compute Savings Plan discount (72%): $0.0457 per hour
Estimated instance hours needed: ~2,000 hours/month (based on workload pattern and multi-concurrency)
Monthly EC2 instance charges: 2,000 × $0.0457 = $91.40
Management fee charges
Management fee: 15% of EC2 on-demand price
Management fee per hour: $0.1632 × 0.15 = $0.02448 per hour
Monthly management fee: 2,000 × $0.02448 = $48.96
Total monthly charges
Total charges = Request charges + EC2 instance charges + Management fee charges
Total charges = $20 + $91.40 + $48.96 = $160.36
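The three pricing components in this example can be reproduced with a short calculation. This is a sketch under the example's own figures: the function name and parameters are hypothetical, the discounted hourly rate is rounded to four decimals to match the $0.0457 used in the example, and the management fee is applied to the undiscounted on-demand price as the example does.

```python
def managed_instances_monthly_cost(
    requests: int,
    instance_hours: float,
    on_demand_hourly: float,
    savings_discount: float = 0.0,           # e.g. 0.72 for the 72% discount in the example
    request_price_per_million: float = 0.20,
    management_fee_rate: float = 0.15,
) -> dict:
    """Estimate monthly charges for a Lambda Managed Instances workload."""
    request_charges = requests / 1_000_000 * request_price_per_million
    # Assumption: round the discounted rate to 4 decimals, as the example does.
    discounted_hourly = round(on_demand_hourly * (1 - savings_discount), 4)
    ec2_charges = instance_hours * discounted_hourly
    # The management fee is based on the on-demand price, not the discounted one.
    management_fee = instance_hours * on_demand_hourly * management_fee_rate
    total = request_charges + ec2_charges + management_fee
    return {
        "requests": round(request_charges, 2),
        "ec2": round(ec2_charges, 2),
        "management_fee": round(management_fee, 2),
        "total": round(total, 2),
    }

example = managed_instances_monthly_cost(
    requests=100_000_000,
    instance_hours=2_000,
    on_demand_hourly=0.1632,
    savings_discount=0.72,
)
print(example)
```

Changing `instance_hours` or `savings_discount` shows how sensitive the total is to instance utilization and to the chosen EC2 pricing option.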
Lambda Durable Functions Pricing
Lambda durable functions simplify how you build reliable multi-step applications and AI workflows directly within Lambda’s existing programming model, enabling resilient and cost-effective long-running workloads. In durable functions, you use durable operations like “steps” and “waits”, which are checkpoints with optional data stored for extended periods, allowing your function to resume execution after interruptions. When functions resume, the system performs replay, automatically re-executing the event handler from the beginning while skipping completed checkpoints and continuing from the point of interruption. The lifecycle may include multiple sub-invocations (Lambda function invocations that occur when resuming after wait operations, retries, or infrastructure failures) to complete the execution.
Existing Lambda compute charges apply, including for sub-invocations from replays. When using wait operations, the function suspends execution and, for on-demand functions, does not incur duration charges until execution resumes. In addition, you are charged for durable operations (such as starting executions, completing steps, and creating waits). You also pay for the amount of data written by these operations (in GB) and for data retention during and after execution (in GB-month, prorated). The retention period after completion is configurable from 1 to 90 days (default 14 days).
For a full list and detailed description of durable operations, see the Lambda Developer Guide.
Pricing Example:
An insurance claim processing system uses Lambda durable functions to analyze claims for fraud detection, coordinate human review for high-value claims, and process approved payments. The process begins with a document analysis step that takes 30 seconds to perform LLM-based fraud detection and risk assessment. The execution then uses a wait to suspend execution for human review (typically a 7-day wait), during which an adjuster reviews claims exceeding automatic approval thresholds. Finally, a payment step takes 2 seconds to process the approval decision and initiate the payment. The system processes 1,000,000 insurance claims per month. Each execution uses an 8 KB invocation payload and 32 KB payloads for claim analysis (step 1), approval decisions (wait), and final payment processing (step 2). The function is configured with 1 GB of memory on an Arm-based processor. Completed claim records are retained for 14 days for audit and compliance.
Note: Examples are based on price in US East (N. Virginia). All executions start at the beginning of the month and all steps succeed on first attempt without retries to simplify the calculations.
Monthly Compute Charges
Total compute (seconds): 1,000,000 × 32s = 32,000,000 seconds
Total compute (GB-s): 32,000,000 × 1GB = 32,000,000 GB-s
Billable compute: 32,000,000 - 400,000 free tier = 31,600,000 GB-s
Compute cost: 31,600,000 × $0.0000133334 = $421.34
Monthly Request Charges
Total requests: 2 invocations (initial + after wait) × 1,000,000 = 2,000,000 requests
Billable requests: 2,000,000 - 1M free tier = 1,000,000
Request cost: 1M × $0.20/M = $0.20
Monthly Durable Functions Charges
Operations: 1M × (1 start execution + 2 steps + 1 wait) = 4M
Operations cost: 4M × $8.00/M = $32.00
Data written: 1M × (8KB invoke + 3 × 32KB steps/waits) = 104GB
Data written cost: 104GB × $0.25/GB = $26.00
Storage (running incl. 7 day wait): 104GB × (7/30) = 24.27 GB-month
Storage (retained 14 days): 104GB × (14/30) = 48.53 GB-month
Data retained cost: (24.27 + 48.53) GB-month × $0.15/GB-month = $10.92
Total Monthly Charges
Total charges: $421.34 + $0.20 + $32.00 + $26.00 + $10.92 = $490.46
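The durable functions portion of this example (operations, data written, and data retained) can be checked with a short script. This is a sketch using only the unit prices quoted in the example. It assumes decimal units (1 GB = 1,000,000 KB) and a 30-day month, as the example's arithmetic implies, and all names are hypothetical.

```python
EXECUTIONS = 1_000_000

# Unit prices as quoted in the example (US East, N. Virginia).
PRICE_PER_MILLION_OPERATIONS = 8.00
PRICE_PER_GB_WRITTEN = 0.25
PRICE_PER_GB_MONTH = 0.15

def durable_charges(executions: int, wait_days: int = 7, retention_days: int = 14) -> dict:
    """Estimate durable operation, data written, and retention charges."""
    # 1 start execution + 2 steps + 1 wait per execution.
    operations = executions * (1 + 2 + 1)
    operations_cost = operations / 1_000_000 * PRICE_PER_MILLION_OPERATIONS

    # 8 KB invoke payload + 3 payloads of 32 KB; assumption: 1 GB = 1,000,000 KB.
    data_written_gb = executions * (8 + 3 * 32) / 1_000_000
    data_written_cost = data_written_gb * PRICE_PER_GB_WRITTEN

    # Storage is prorated over an assumed 30-day month.
    gb_months = data_written_gb * (wait_days + retention_days) / 30
    retained_cost = gb_months * PRICE_PER_GB_MONTH

    return {
        "operations": round(operations_cost, 2),
        "data_written": round(data_written_cost, 2),
        "data_retained": round(retained_cost, 2),
    }

durable = durable_charges(EXECUTIONS)
print(durable)
```

Shortening `retention_days` is the simplest lever here, since retention after completion accounts for two thirds of the stored GB-months in this scenario.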
Tenant Isolation Pricing
Enable tenant isolation mode to isolate request processing for individual end-users or tenants invoking your Lambda function. The underlying execution environments for a tenant-isolated Lambda function are always associated with a particular tenant and are never used to execute requests from other tenants invoking the same function. This capability simplifies developing and maintaining multi-tenant applications that process tenant-specific code or data with strict isolation requirements across tenants. You are charged when Lambda creates a new tenant-isolated execution environment to serve a request, depending on the amount of memory you allocate to your function and the CPU architecture you use. To learn more about Lambda's tenant isolation capability, read the documentation.
Pricing Example:
Multi-tenant SaaS application
Let’s assume you are building an automation platform that executes user-provided code in response to events. For example, an IT team may want to execute an automated workflow when a new employee joins their organization or transfers across departments. As another example, a DevOps team may want to trigger a CI/CD workflow when a developer commits code changes to their source code repository. Your automation platform is multi-tenant, meaning that it serves multiple end-users. Because you expect high variation in demand, by time of day and for each end-user or tenant, you build your platform using serverless services including AWS Lambda.
Your automation platform supports the ability to run user-supplied code in response to events. Since you do not control the code provided by users, you enable tenant isolation mode to ensure that Lambda function invocations for each end-user are processed in separate execution environments that are isolated from one another.
Assume that you have configured your Lambda function with 1024 MB of memory and x86 CPU architecture. During a typical month, your function processes 10M invokes with an average duration of 2 seconds per invoke. Your SaaS platform is used by 1K end-users or tenants. For simplicity, let’s assume that on average each tenant generates 10K invokes per month and Lambda creates 200 execution environments per tenant (i.e. a cold-start rate of 2% per tenant).
Your charges would be calculated as follows:
Request charges
Per month, your function executes 10M times.
Monthly request charges: 10M * $0.2/M = $2
Compute charges
Per month, your function executes 10M times with an average duration of 2s. Your function's configured memory is 1024 MB.
Monthly compute duration (seconds): 10M * 2s = 20M seconds
Monthly compute (GB-s): 20M seconds * 1024 MB / 1024 MB = 20M GB-s
Monthly compute charges: 20M * $0.0000166667 = $333.34
Tenant isolation charges
Per month, on average, your function serves 1K unique tenants. Each tenant invokes the function 10K times with an average of 200 execution environments created per tenant (i.e. average cold-start rate of 2% for each tenant).
Monthly execution environments created for 1K tenants: 200 * 1K = 200K
Monthly tenant isolation charges: 200K * $0.000167 * 1024 MB / 1024 MB = $33.40
Total monthly charges
Total charges = Request charges + Compute charges + Tenant isolation charges
Total charges = $2 + $333.34 + $33.40 = $368.74
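The tenant isolation line item depends on how many execution environments Lambda creates, which the example derives from a cold-start rate. The sketch below isolates that step. It uses the $0.000167 per-environment price from the example, scaled by memory as the example does; the function name and parameters are hypothetical.

```python
def tenant_isolation_monthly_charge(
    tenants: int,
    invokes_per_tenant: int,
    cold_start_rate: float,                 # fraction of invokes that create a new environment
    memory_mb: int,
    price_per_environment_gb: float = 0.000167,
) -> tuple:
    """Return (environments created, estimated monthly tenant isolation charge)."""
    environments_per_tenant = round(invokes_per_tenant * cold_start_rate)
    total_environments = tenants * environments_per_tenant
    # The per-environment price scales with configured memory in GB.
    charge = total_environments * price_per_environment_gb * memory_mb / 1024
    return total_environments, round(charge, 2)

environments, isolation_charge = tenant_isolation_monthly_charge(
    tenants=1_000,
    invokes_per_tenant=10_000,
    cold_start_rate=0.02,
    memory_mb=1024,
)
print(environments, isolation_charge)
```

Because the charge is per environment rather than per invoke, a lower cold-start rate per tenant reduces this line item even when request volume stays the same.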
Lambda Ephemeral Storage Pricing
Ephemeral storage cost depends on the amount of ephemeral storage you allocate to your function, and function execution duration, measured in milliseconds. You can allocate any additional amount of storage to your function between 512 MB and 10,240 MB, in 1 MB increments. You can configure ephemeral storage for functions running on both x86 and Arm architectures. 512 MB of ephemeral storage is available to each Lambda function at no additional cost. You only pay for the additional ephemeral storage you configure.
All examples below are based on price in US East (N. Virginia).
Example 1: Mobile application backend
Let’s assume you are a mobile app developer building a food ordering app. Customers can use the app to order food from a specific restaurant location, receive order status updates, and pick up the food when the order is ready. Because you expect high variation in demand, both by time of day and restaurant location, you build your mobile backend using serverless services, including AWS Lambda.
For simplicity, let’s assume your application processes three million requests per month. The average function execution duration is 120 ms. You have configured your function with 1536 MB of memory, on an x86 based processor. Your charges would be calculated as follows:
Monthly compute charges
The monthly compute price is $0.0000166667 per GB-s and the free tier provides 400,000 GB-s.
Total compute (seconds) = 3 million * 120ms = 360,000 seconds
Total compute (GB-s) = 360,000 * 1536MB/1024 MB = 540,000 GB-s
Total compute – Free tier compute = monthly billable compute GB-s
540,000 GB-s – 400,000 free tier GB-s = 140,000 GB-s
Monthly compute charges = 140,000 * $0.0000166667 = $2.33
Monthly request charges
The monthly request price is $0.20 per one million requests and the free tier provides 1 million requests per month.
Total requests – Free tier requests = monthly billable requests
3 million requests – 1 million free tier requests = 2 million monthly billable requests
Monthly request charges = 2M * $0.2/M = $0.40
Total monthly charges
Total charges = Compute charges + Request charges = $2.33 + $0.40 = $2.73 per month
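The steps in this example generalize to a small estimator for on-demand functions. This is a sketch that assumes the x86 first-tier price and free tier allowances quoted in the example. It does not model the duration pricing tiers, and the function and parameter names are hypothetical.

```python
def on_demand_monthly_cost(
    requests: int,
    avg_duration_ms: float,
    memory_mb: int,
    price_per_gb_s: float = 0.0000166667,       # x86 price used in the example
    price_per_million_requests: float = 0.20,
    free_gb_s: float = 400_000,
    free_requests: int = 1_000_000,
) -> dict:
    """Estimate monthly compute and request charges after the free tier."""
    total_seconds = requests * avg_duration_ms / 1000
    total_gb_s = total_seconds * memory_mb / 1024
    # Usage below the free tier allowance is not billed.
    billable_gb_s = max(total_gb_s - free_gb_s, 0)
    compute = round(billable_gb_s * price_per_gb_s, 2)

    billable_requests = max(requests - free_requests, 0)
    request = round(billable_requests / 1_000_000 * price_per_million_requests, 2)

    return {"compute": compute, "requests": request, "total": round(compute + request, 2)}

backend = on_demand_monthly_cost(requests=3_000_000, avg_duration_ms=120, memory_mb=1536)
print(backend)
```

Passing a different `price_per_gb_s` lets the same estimator be used for Arm-based functions, provided the correct regional price is supplied.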
Example 2: Enriching streaming telemetry with additional metadata
Let’s say you are a logistics company with a fleet of vehicles in the field, each of which is equipped with sensors and 4G/5G connectivity to emit telemetry data into an Amazon Kinesis Data Stream. You want to use machine learning (ML) models you’ve developed to infer the health of the vehicle and predict when maintenance for particular components might be required.
Suppose you have 10,000 vehicles in the field, each of which is emitting telemetry once an hour in a staggered fashion with sufficient jitter. You intend to perform this inference on each payload to ensure vehicles are scheduled promptly for maintenance and ensure optimal health of your vehicle fleet.
Assume the ML model is packaged along with the function and is 512 MB in size. For inference, you’ve configured your function with 1 GB of memory, and function execution takes two seconds to complete on average on an x86 based processor.
Monthly request charges:
Per month, the vehicles will emit 10,000 * 24 * 31 = 7,440,000 messages which will be processed by the Lambda function.
Monthly request charges → 7.44M * $0.20/million = $1.488 ~= $1.49
Monthly compute charges:
Per month, the functions will be executed once per message for two seconds.
Monthly compute duration (seconds) → 7.44 million * 2 seconds = 14.88 million seconds
Monthly compute (GB-s) → 14.88M seconds * 1024 MB/1024 MB = 14.88M GB-s
Monthly compute charges → 14.88M GB-s * $0.0000166667 = $248.00
Total monthly charges:
Monthly total charges = Request charges + Compute charges = $1.49 + $248.00 = $249.49
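In this example the request volume is derived from the fleet size rather than given directly, so it can help to see that step in code. The sketch below follows the example's figures and, like the example, does not deduct the free tier. The function name and parameters are hypothetical.

```python
def fleet_telemetry_monthly_cost(
    vehicles: int,
    messages_per_vehicle_per_day: int,
    days_in_month: int,
    duration_s: float,
    memory_mb: int,
    price_per_gb_s: float = 0.0000166667,       # x86 price used in the example
    price_per_million_requests: float = 0.20,
) -> dict:
    """Estimate monthly charges when each telemetry message triggers one invoke."""
    messages = vehicles * messages_per_vehicle_per_day * days_in_month
    request = round(messages / 1_000_000 * price_per_million_requests, 2)
    gb_seconds = messages * duration_s * memory_mb / 1024
    compute = round(gb_seconds * price_per_gb_s, 2)
    return {
        "messages": messages,
        "requests": request,
        "compute": compute,
        "total": round(request + compute, 2),
    }

fleet = fleet_telemetry_monthly_cost(
    vehicles=10_000,
    messages_per_vehicle_per_day=24,
    days_in_month=31,
    duration_s=2,
    memory_mb=1024,
)
print(fleet)
```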
Example 3: Performing ML on customer support tickets and interactions to improve customer experience
Let’s assume you are a financial services company looking to better understand your top customer service issues. Your goal is to improve the customer experience and reduce customer churn. Your customers can chat live with your customer support staff via the mobile app