With AWS Glue, you pay an hourly rate, billed by the second, for crawlers (discovering data) and ETL jobs (processing and loading data). For the AWS Glue Data Catalog, you pay a simple monthly fee for storing and accessing the metadata. The first million objects stored are free, and the first million accesses are free. If you provision a development endpoint to interactively develop your ETL code, you pay an hourly rate, billed per second.
Note: Pricing can vary by region.
-
ETL jobs and development endpoints
-
Data Catalog storage and requests
-
Crawlers
-
ETL jobs and development endpoints
-
-
Data Catalog storage and requests
-
-
Crawlers
-
Pricing examples
ETL job example: Consider an AWS Glue job of type Apache Spark that runs for 10 minutes and consumes 6 DPUs. The price of 1 DPU-Hour is $0.44. Since your job ran for 1/6th of an hour and consumed 6 DPUs, you will be billed 6 DPUs * 1/6 hour at $0.44 per DPU-Hour or $0.44.
Development endpoint example: Now let’s consider that you provision a development endpoint to connect your notebook to interactively develop your ETL code. A development endpoint is provisioned with 5 DPUs. If you keep the development endpoint running for 24 minutes or 2/5th of an hour, you will be billed for 5 DPUs * 2/5 hour at $0.44 per DPU-Hour or $0.88.
AWS Glue Data Catalog free tier example: Let’s consider that you store a million tables in your AWS Glue Data Catalog in a given month and make a million requests to access these tables. You pay $0 because your usage will be covered under the AWS Glue Data Catalog free tier. You can store the first million objects and make a million requests per month for free.
AWS Glue Data Catalog example: Now consider your storage usage remains the same at one million tables per month, but your requests double to two million requests per month. Let’s say you also use crawlers to find new tables and they run for 30 minutes and consume 2 DPUs.
Your storage cost is still $0, as the storage for your first million tables is free. Your first million requests are also free. You will be billed for one million requests above the free tier, which is $1. Crawlers are billed at $0.44 per DPU-Hour, so you will pay for 2 DPUs * 1/2 hour at $0.44 per DPU-Hour or $0.44. This is a total monthly bill of $1.44.
ML Transforms example: Similar to AWS Glue jobs runs, the cost of running ML Transforms, including FindMatches on your data will vary based on the size of your data, the content of your data, and the number and types of nodes that you use. In the following example, we used FindMatches to integrate points of interest information from multiple data sources. With a data set size of ~11,000,000 rows (1.6GB), a size of Label data (examples of true matches or true no-matches) of ~8,000 rows (641kb), running on 16 instances of type G.2x, then you would have a labelset generation runtime of 34 minutes at a cost of $8.23, a metrics estimation runtime of 11 minutes at a cost of $2.66, and a FindingMatches job execution runtime of 32 minutes at a cost of $7.75.
Note: Pricing can vary by region.
View the Global Regions table to learn more about AWS Glue availability
Additional pricing resources
Calculate your total cost of ownership (TCO)
Easily calculate your monthly costs with AWS
Additional resources for switching to AWS
Get started building with AWS Glue in the AWS Management Console.