S3 FAQs

General S3 FAQs

Amazon S3 is object storage built to store and retrieve any amount of data from anywhere. S3 is a simple storage service that offers industry leading durability, availability, performance, security, and virtually unlimited scalability at very low costs.

Amazon S3 provides a simple web service interface that you can use to store and retrieve any amount of data, at any time, from anywhere. Using this service, you can easily build applications that make use of cloud native storage. Since Amazon S3 is highly scalable and you only pay for what you use, you can start small and grow your application as you wish, with no compromise on performance or reliability. Amazon S3 is also designed to be highly flexible. Store any type and amount of data that you want, read the same piece of data a million times or only for emergency disaster recovery, build a simple FTP application or a sophisticated web application such as the Amazon.com retail web site. Amazon S3 frees you to focus on innovation instead of spending time figuring out how to store your data.

To sign up for Amazon S3, visit the S3 console. You must have an Amazon Web Services account to access this service. If you do not already have an account, you will be prompted to create one when you begin the Amazon S3 sign-up process. After signing up, refer to the Amazon S3 documentation, view the S3 getting started materials, and see the additional resources in the resource center to begin using Amazon S3.

Amazon S3 lets you benefit from Amazon’s own massive scale with no up-front investment or performance compromises. Amazon S3 makes it inexpensive and simple to ensure your data is quickly accessible, always available, and secure.

You can store virtually any kind of data in any format. Refer to the Amazon Web Services Licensing Agreement for details.

The total volume of data and number of objects you can store in Amazon S3 are unlimited. Individual Amazon S3 objects can range in size from a minimum of 0 bytes to a maximum of 50 TB. The largest object that can be uploaded in a single PUT is 5 GB. For objects larger than 100 MB, customers should consider using the multipart upload capability.
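As a rough illustration of the size limits above, the following sketch picks an upload approach from an object's size. The function name and the returned labels are hypothetical, and the use of binary units (1 GB = 2**30 bytes) is an assumption that follows the convention in the billing example later on this page.

```python
# Size thresholds taken from the text above. Binary units are an assumption.
MB = 2**20
GB = 2**30
TB = 2**40

MAX_OBJECT_SIZE = 50 * TB          # largest object size stated in the text
MAX_SINGLE_PUT = 5 * GB            # largest object for a single PUT
MULTIPART_RECOMMENDED = 100 * MB   # above this, multipart upload is suggested


def choose_upload_method(size_bytes: int) -> str:
    """Return a hypothetical label for the upload approach that fits the size."""
    if size_bytes < 0 or size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("object size outside the supported range")
    if size_bytes > MAX_SINGLE_PUT:
        return "multipart-required"
    if size_bytes > MULTIPART_RECOMMENDED:
        return "multipart-recommended"
    return "single-put"


print(choose_upload_method(200 * MB))  # multipart-recommended
```

The labels only mirror the guidance in the text; an actual uploader would still need to call the S3 API or an SDK.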

A general purpose bucket is a container for objects stored in Amazon S3, and you can store any number of objects in a bucket. General purpose buckets are the original S3 bucket type, and a single general purpose bucket can contain objects stored across all storage classes except S3 Express One Zone. They are recommended for most use cases and access patterns.

A directory bucket is a container for objects stored in Amazon S3, and you can store any number of objects in a bucket. S3 directory buckets only allow objects stored in the S3 Express One Zone storage class, which provides faster data processing within a single Availability Zone. They are recommended for low-latency use cases. Each S3 directory bucket can support up to 2 million transactions per second (TPS), independent of the number of directories within the bucket.

A table bucket is purpose-built for storing tables using the Apache Iceberg format. Use Amazon S3 Tables to create table buckets and set up table-level permissions in just a few steps. S3 table buckets are specifically optimized for analytics and machine learning workloads. With built-in support for Apache Iceberg, you can query tabular data in S3 with popular query engines including Amazon Athena, Amazon Redshift, and Apache Spark. Use S3 table buckets to store tabular data such as daily purchase transactions, streaming sensor data, or ad impressions as an Iceberg table in Amazon S3, and then interact with that data using analytics capabilities.

A vector bucket is purpose-built for storing and querying vectors. Within a vector bucket, you do not use the S3 object APIs, but rather dedicated vector APIs to write vector data and query it based on semantic meaning and similarity. You can control access to your vector data with the existing access control mechanisms in Amazon S3, including bucket and IAM policies. All writes to a vector bucket are strongly consistent, which means that you can immediately access the most recently added vectors. As you write, update, and delete vectors over time, S3 vector buckets automatically optimize the vector data stored in them to achieve the optimal price-performance, even as the data sets scale and evolve.

A bucket is a container for objects and tables stored in Amazon S3, and you can store any number of objects in a bucket. General purpose buckets are the original S3 bucket type, and a single general purpose bucket can contain objects stored across all storage classes except S3 Express One Zone. They are recommended for most use cases and access patterns. S3 directory buckets only allow objects stored in the S3 Express One Zone storage class, which provides faster data processing within a single Availability Zone. They are recommended for low-latency use cases. Each S3 directory bucket can support up to 2 million transactions per second (TPS), independent of the number of directories within the bucket. S3 table buckets are purpose-built for storing tabular data in S3 such as daily purchase transactions, streaming sensor data, or ad impressions. When using a table bucket, your data is stored as an Iceberg table in S3, and then you can interact with that data using analytics capabilities such as row-level transactions, queryable table snapshots, and more, all managed by S3. Additionally, table buckets perform continual table maintenance to automatically optimize query efficiency over time, even as the data lake scales and evolves. S3 vector buckets are purpose-built for storing and querying vectors. Within a vector bucket, you use dedicated vector APIs to write vector data and query it based on semantic meaning and similarity. You can control access to your vector data using the existing access control mechanisms in Amazon S3, including bucket and IAM policies. As you write, update, and delete vectors over time, S3 vector buckets automatically optimize the vector data stored in them to achieve the optimal price-performance, even as the data sets scale and evolve.

Amazon stores your data and tracks its associated usage for billing purposes. Amazon will not otherwise access your data for any purpose outside of the Amazon S3 offering, except when required to do so by law. Refer to the Amazon Web Services Licensing Agreement for details.

Yes. Organizations across Amazon use Amazon S3 for a wide variety of projects. Many of these projects use Amazon S3 as their authoritative data store and rely on it for business-critical operations.

Amazon S3 is a simple key-based object store. When you store data, you assign a unique object key that can later be used to retrieve the data. Keys can be any string, and they can be constructed to mimic hierarchical attributes. Alternatively, you can use S3 Object Tagging to organize your data across all of your S3 buckets and/or prefixes.
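To make the idea of hierarchy-like keys concrete, here is a minimal in-memory sketch. It does not call S3; the dictionary stands in for a bucket, and the key names and the `list_keys` helper are made up for illustration.

```python
# In-memory stand-in for a bucket: a flat mapping from object key to data.
# There are no real folders; the "/" characters are simply part of each key.
bucket = {
    "logs/2024/01/app.log": b"jan",
    "logs/2024/02/app.log": b"feb",
    "images/cat.png": b"png-bytes",
}


def list_keys(store: dict, prefix: str) -> list:
    """Return the keys that start with the given prefix, in sorted order."""
    return sorted(key for key in store if key.startswith(prefix))


print(list_keys(bucket, "logs/2024/"))
```

Filtering by prefix is what makes a flat key space behave like a directory tree from the caller's point of view.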

Amazon S3 provides a simple, standards-based REST web services interface that is designed to work with any internet-development toolkit. The operations are intentionally made simple to make it easy to add new distribution protocols and functional layers.

Amazon S3 gives you access to the same highly scalable, highly available, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. The S3 Standard storage class is designed for 99.99% availability. The S3 Standard-IA, S3 Intelligent-Tiering, and S3 Glacier Instant Retrieval storage classes are designed for 99.9% availability. The S3 One Zone-IA storage class is designed for 99.5% availability. The S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive storage classes are designed for 99.99% availability, with an SLA of 99.9%. All of these storage classes are backed by the Amazon S3 Service Level Agreement.
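To give a feel for what these percentages mean, the sketch below converts an availability figure into the amount of unavailability it would allow over a period. The 30-day period and the function name are assumptions for illustration; this is not how SLA credits are calculated.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Minutes of unavailability that fit within the given availability figure."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


for pct in (99.99, 99.9, 99.5):
    print(f"{pct}% -> {allowed_downtime_minutes(pct):.2f} minutes per 30 days")
```

Over 30 days this works out to roughly 4.32 minutes at 99.99%, 43.2 minutes at 99.9%, and 216 minutes at 99.5%.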

Amazon S3 is designed from the ground up to handle traffic for any internet application. Pay-as-you-go pricing and unlimited capacity ensure that your incremental costs don’t change and that your service is not interrupted. Amazon S3’s massive scale lets you spread the load evenly, so that no individual application is affected by traffic spikes.

Yes. The Amazon S3 SLA provides for a service credit if a customer's monthly uptime percentage is below our service commitment in any billing cycle.

Amazon S3 delivers strong read-after-write consistency automatically, without changes to performance or availability, without sacrificing regional isolation for applications, and at no additional cost. After a successful write of a new object or an overwrite of an existing object, any subsequent read request immediately receives the latest version of the object. S3 also provides strong consistency for list operations, so after a write, you can immediately perform a listing of the objects in a bucket with any changes reflected.

Strong read-after-write consistency helps when you need to immediately read an object after a write; for example, when you often read and list immediately after writing objects. High-performance computing workloads also benefit in that when an object is overwritten and then read many times simultaneously, strong read-after-write consistency provides assurance that the latest write is read across all reads. These applications automatically and immediately benefit from strong read-after-write consistency. The strong consistency of S3 also reduces costs by removing the need for extra infrastructure to provide strong consistency.

AWS Regions

You specify an AWS Region when you create your Amazon S3 general purpose bucket. For S3 Standard, S3 Standard-IA, S3 Intelligent-Tiering, S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive storage classes, your objects are automatically stored across multiple devices spanning a minimum of three Availability Zones (AZs). AZs are physically separated by a meaningful distance, many kilometers, from any other AZ, although all are within 100 km (60 miles) of each other. Objects stored in the S3 One Zone-IA storage class are stored redundantly within a single Availability Zone in the AWS Region you select. You specify a single Availability Zone or AWS Local Zone when you create your directory bucket. Objects in directory buckets are stored redundantly within a single Availability Zone or single Local Zone. When using Local Zones, your objects stay in the Local Zone unless you transfer them to an AWS Region. For S3 on Outposts, your data is stored in your Outpost on-premises environment, unless you manually choose to transfer it to an AWS Region. Refer to AWS regional services list for details of Amazon S3 service availability by AWS Region.

You should use S3 storage classes for AWS Dedicated Local Zones if you have sensitive data and applications that need to run on physically separate infrastructure that is dedicated to your exclusive use and placed within a specified regulatory jurisdiction to address security and compliance requirements. For example, some regulations require that data be stored in a particular country or state, for regulatory, contractual, or information security reasons common in public sector, healthcare, oil and gas, and other highly regulated industries. AWS works with you to configure your own private zones with the enhanced security and governance capabilities needed to help you meet your regulatory requirements.

You should use S3 in AWS Local Zones if you have data and applications that need to run in a specific geographic location to address data residency and compliance requirements. For example, some regulations require that data be stored in a particular country, for regulatory, contractual, or information security reasons common in public sector, healthcare, oil and gas, and other highly regulated industries.

An AWS Region is a physical location around the world where AWS clusters data centers. Each group of logical data centers within a Region is known as an Availability Zone (AZ). Each AWS Region consists of a minimum of three isolated and physically separate AZs within a geographic area. Unlike other cloud providers, who often define a Region as a single data center, the multiple AZ design of every AWS Region offers advantages for customers. Each AZ has independent power, cooling, and physical security and is connected via redundant, ultra-low-latency networks.

An Availability Zone (AZ) is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. AZs give customers the ability to operate production applications and databases that are more highly available, fault tolerant, and scalable than would be possible from a single data center. All AZs in an AWS Region are interconnected with high-bandwidth, low-latency networking over fully redundant, dedicated metro fiber. Amazon S3 Standard, S3 Standard-Infrequent Access, S3 Intelligent-Tiering, S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive storage classes replicate data across a minimum of three AZs to protect against the loss of one entire AZ. This remains true in Regions where fewer than three AZs are publicly available. Objects stored in these storage classes are available for access from all of the AZs in an AWS Region.
The Amazon S3 One Zone-IA storage class replicates data within a single AZ. The data stored in S3 One Zone-IA is not resilient to the physical loss of an Availability Zone resulting from disasters, such as earthquakes, fires, and floods.

There are several factors to consider based on your specific application. For instance, you may want to store your data in a Region that is near your customers, your data centers, or other AWS resources to reduce data access latencies. You may also want to store your data in a Region that is remote from your other operations for geographic redundancy and disaster recovery purposes. You should also consider Regions that let you address specific legal and regulatory requirements and/or reduce your storage costs—you can choose a lower priced Region to save money. For S3 pricing information, visit the Amazon S3 pricing page.

Amazon S3 is available in AWS Regions worldwide, and you can use Amazon S3 regardless of your location. You just have to decide in which AWS Region(s) you want to store your Amazon S3 data. See the AWS regional services list for a list of AWS Regions in which S3 is available today.

Billing

With Amazon S3, you pay only for what you use. There is no minimum charge. You can estimate your monthly bill using the AWS Pricing Calculator. AWS charges less where our costs are less. Some prices vary across Amazon S3 Regions. Billing prices are based on the location of your S3 bucket. There is no Data Transfer charge for data transferred within an Amazon S3 Region via a COPY request. Data transferred via a COPY request between AWS Regions is charged at rates specified on the Amazon S3 pricing page. There is no Data Transfer charge for data transferred between Amazon EC2 (or any AWS service) and Amazon S3 within the same Region, for example, data transferred within the US East (Northern Virginia) Region. However, data transferred between Amazon EC2 (or any AWS service) and Amazon S3 across all other Regions is charged at rates specified on the Amazon S3 pricing page, for example, data transferred between Amazon EC2 US East (Northern Virginia) and Amazon S3 US West (Northern California). Data transfer costs are billed to the source bucket owner. For S3 on Outposts pricing, visit the Outposts pricing page.

There are no setup charges or commitments to begin using Amazon S3. At the end of the month, you will automatically be charged for that month’s usage. You can view your charges for the current billing period at any time by logging into your Amazon Web Services account and selecting the 'Billing Dashboard' associated with your console profile. With the AWS Free Usage Tier, you can get started with Amazon S3 for free in all Regions except the AWS GovCloud Regions. Upon sign-up, new AWS customers receive 5 GB of Amazon S3 Standard storage, 20,000 Get Requests, 2,000 Put Requests, and 100 GB of data transfer out (to internet, other AWS Regions, or Amazon CloudFront) each month for one year. Unused monthly usage will not roll over to the next month. Amazon S3 charges you for the following types of usage. Note that the calculations below assume there is no AWS Free Tier in place.

Starting July 15, 2025, new AWS customers will receive up to $200 in AWS Free Tier credits, which can be applied towards eligible AWS services, including Amazon S3. At account sign-up, you can choose between a free plan and a paid plan. The free plan will be available for 6 months after account creation. If you upgrade to a paid plan, any remaining Free Tier credit balance will automatically apply to your AWS bills. All Free Tier credits must be used within 12 months of your account creation date. To learn more about the AWS Free Tier program, refer to AWS Free Tier website and AWS Free Tier documentation.

AWS charges less where our costs are less. For example, our costs are lower in the US East (Northern Virginia) Region than in the US West (Northern California) Region.

Normal Amazon S3 rates apply for every version of an object stored or requested. For example, consider the following scenario to illustrate storage costs when using Versioning (assume the current month is 31 days long):

1) Day 1 of the month: You perform a PUT of 4 GB (4,294,967,296 bytes) on your bucket.
2) Day 16 of the month: You perform a PUT of 5 GB (5,368,709,120 bytes) within the same bucket using the same key as the original PUT on Day 1.

When analyzing the storage costs of the above operations, note that the 4 GB object from Day 1 is not deleted from the bucket when the 5 GB object is written on Day 16. Instead, the 4 GB object is preserved as an older version and the 5 GB object becomes the most recently written version of the object within your bucket. At the end of the month:

Total Byte-Hour usage:
[4,294,967,296 bytes x 31 days x (24 hours / day)] + [5,368,709,120 bytes x 16 days x (24 hours / day)] = 5,257,039,970,304 Byte-Hours

Conversion to Total GB-Months:
5,257,039,970,304 Byte-Hours x (1 GB / 1,073,741,824 bytes) x (1 month / 744 hours) = 6.581 GB-Months

The cost is calculated based on the current rates for your Region on the Amazon S3 pricing page.
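The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch of the calculation shown above; the variable names are illustrative, and the result is a storage quantity, not a price.

```python
BYTES_PER_GB = 1_073_741_824   # 2**30, as used in the conversion above
HOURS_PER_MONTH = 31 * 24      # 744 hours in a 31-day month

# (size in bytes, number of days the version is stored during the month)
versions = [
    (4_294_967_296, 31),  # 4 GB version, stored from Day 1
    (5_368_709_120, 16),  # 5 GB version, stored from Day 16
]

byte_hours = sum(size * days * 24 for size, days in versions)
gb_months = byte_hours / BYTES_PER_GB / HOURS_PER_MONTH

print(byte_hours)           # 5257039970304
print(round(gb_months, 3))  # 6.581
```

Multiplying the GB-Month figure by the storage rate for your Region gives the storage charge for the month.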

Normal Amazon S3 pricing applies when accessing the service through the AWS Management Console. To provide an optimized experience, the AWS Management Console may proactively execute requests. Also, some interactive operations result in more than one request to the service.

Normal Amazon S3 pricing applies when your storage is accessed by another AWS Account. Alternatively, you may choose to configure your bucket as a Requester Pays bucket, in which case the requester will pay the cost of requests and downloads of your Amazon S3 data. You can find more information on Requester Pays bucket configurations in the Amazon S3 documentation.

Except as otherwise noted, our prices are exclusive of applicable taxes and duties, including VAT and applicable sales tax. For customers with a Japanese billing address, use of AWS services is subject to Japanese Consumption Tax. Learn more about taxes on AWS services.

AWS offers eligible customers free data transfer out to the internet when they move all of their data off of AWS, in accordance with the process below.

Complete the following steps:

1) If you have a dedicated AWS account team, contact them first and inform them of your plans. In some cases, if you have a negotiated commitment with AWS, you'll want to discuss your options with your AWS account team.
2) Review the criteria and process described on this page.
3) Contact AWS Customer Support and indicate that your request is for “free data transfer to move off AWS.” AWS Customer Support will ask that you provide information, so they can review your moving plans, evaluate whether you qualify for free data transfer out, and calculate a proper credit amount.
4) If AWS Customer Support approves your move, you will receive a temporary credit for the cost of data transfer out based on the volume of all data you have stored across AWS services at the time of AWS’ calculation. AWS Customer Support will notify you if you are approved, and you will then have 60 days to complete your move off of AWS. The credit will count against data transfer out usage only, and it will not be applied to other service usage. After your move away from AWS services, within the 60-day period, you must delete all remaining data and workloads from your AWS account, or you can close your AWS account.

Free data transfers for moving IT providers are also subject to the following criteria:

a) Only customers with an active AWS account in good standing are eligible for free data transfer out.
b) If you have less than 100 GB of data stored in your AWS account you may move this data off of AWS for free under AWS’s existing 100 GB monthly free tier for data transfer out. Customers with less than 100 GB of data stored in their AWS account are not eligible for additional credits.
c) AWS will provide you with free data transfer out to the internet when you move all of your data off of AWS. If you only want to move your total usage of a single service, but not everything, contact AWS Customer Support.
d) If your plans change, or you cannot complete your move off of AWS within 60 days, you must notify AWS Customer Support.
e) Standard services charges for use of AWS services are not included. Only data transfer out charges in support of your move off of AWS are eligible for credits. However, data transfer out from specialized data transfer services, such as Amazon CloudFront, AWS Direct Connect, AWS Snowball, and AWS Global Accelerator, are not included.
f) AWS may review your service usage to verify compliance with these requirements. If we determine your use of data transfer out was for a purpose other than moving off of AWS, we may charge you for the data transfer out that had been credited.
g) AWS may make changes with respect to free data transfers out to the internet at any time.

AWS customers make hundreds of millions of data transfers each day, and we generally don’t know the reason for any given data transfer. For example, customers may be transferring data to an end user of their application, to a visitor of their website, or to another cloud or on-premises environment for backup purposes. Accordingly, the only way we know that your data transfer is to support your move off of AWS is if you tell us beforehand.

S3 Tables

Amazon S3 Tables deliver S3 storage that is specifically optimized for analytics workloads, improving query performance while also reducing costs. You can access advanced Iceberg analytics capabilities and query data using familiar AWS services like Amazon Athena, Redshift, and EMR through the S3 Tables integration with Amazon SageMaker lakehouse architecture. Additionally, you can use Iceberg REST compatible third-party applications like Apache Spark, Apache Flink, Trino, DuckDB, and PyIceberg, to read and write data into S3 Tables. You can use table buckets to store tabular data such as daily purchase transactions, streaming sensor data, or ad impressions as an Iceberg table in Amazon S3, and then interact with that data using analytics capabilities such as row-level transactions, queryable table snapshots, and more, all managed by Amazon S3. Table buckets perform continual table maintenance to automatically optimize query efficiency over time, even as the data lake scales and evolves.

You should use S3 Tables for a simple, performant, and cost-effective way to store tabular data in Amazon S3. S3 Tables give you the ability to organize your structured data into tables, and then to query that data using standard SQL statements, with virtually no setup. Additionally, S3 Tables deliver the same durability, availability, scalability, and performance characteristics as S3 itself, and automatically optimize your storage to maximize query performance and to minimize cost. With the Intelligent-Tiering storage class, S3 Tables automatically optimizes costs based on access patterns, without performance impact or operational overhead.

S3 Tables provide purpose-built S3 storage for storing structured data in the Apache Iceberg format. Within a table bucket, you can create tables as first-class resources directly in S3. These tables can be secured with table-level permissions defined in either identity- or resource-based policies and are accessible by applications or tooling that supports the Apache Iceberg standard. When you create a table in your table bucket, S3 maintains the metadata necessary to make that data queryable by your applications. Table buckets include an Iceberg REST Catalog endpoint that any Iceberg-compatible query engine can use to discover, access, and update Iceberg metadata for tables in your table bucket. This allows multiple clients to safely read and write data to your tables. Over time, S3 automatically optimizes the underlying data by rewriting, or "compacting," your objects. Compaction optimizes your data on S3 to improve query performance. Additionally, snapshot expiration and unreferenced file removal optimize storage cost as the data in your tables ages. Read the user guide to learn more.

You can get started with S3 Tables in just a few simple steps without having to stand up any infrastructure outside of S3. First, create a table bucket in the S3 console. As part of creating your first table bucket through the console, the integration with AWS Analytics services happens automatically, which enables S3 to automatically populate all table buckets and tables in your account and Region in the AWS Glue Data Catalog. After this, S3 Tables is now accessible to AWS query engines such as Amazon Athena, EMR, and Redshift. Next, you can click to create a table using Amazon Athena from the S3 console. Once in Athena, you can quickly start populating new tables and querying them.

Alternatively, you can access S3 Tables using the Iceberg REST Catalog endpoint through the AWS Glue Data Catalog, which enables you to discover your entire data estate, including all table resources. You can also connect directly to an individual table bucket endpoint to discover all S3 Tables resources within that bucket. This enables you to use S3 Tables with any application or query engine that supports the Apache Iceberg REST Catalog specification.

You can create a table in your table bucket using the S3 console or using the CreateTable API or CLI operation. To create a table using the console, navigate to your table bucket in the S3 console and choose "Create table with Athena." You'll be prompted to either create a new namespace or select an existing one. After selecting a namespace, you'll be taken to the Athena query editor, where you can use the pre-populated sample SQL statement to create a new table. Once you run the query, your table is created and appears in both Athena and the S3 console. To delete a table, you can use the DeleteTable API or CLI operation. Alternatively, you can use your query engine to delete a table. When you do this, your table will no longer be accessible by your query engine.

S3 Tables support the Apache Iceberg standard, and query engines that can query Iceberg tables, such as Amazon Athena, Amazon Redshift, and Apache Spark, can be used to query the tables in your table buckets using standard SQL.

S3 Tables deliver up to 10x higher transactions per second (TPS) compared to storing Iceberg tables in general purpose Amazon S3 buckets. S3 Tables automatically perform compaction on the underlying data to continually optimize your tables for optimal query performance. Depending on your workload and query patterns, you can also choose from advanced compaction strategies such as sort and z-order compaction to further optimize your tables. Sort compaction organizes data based on specified columns to improve query performance for filtered operations, while z-order compaction optimizes data organization across multiple dimensions, making it ideal when you need to query data across multiple columns simultaneously.
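To illustrate the general idea behind z-ordering, the sketch below interleaves the bits of two column values into a single sort key (a Morton code). This shows the textbook technique only; it is not a description of how S3 Tables implements z-order compaction, and the function name and sample rows are made up.

```python
def z_order_key(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one sortable integer."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # bits of x land on even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # bits of y land on odd positions
    return key


rows = [(0, 3), (1, 0), (1, 1), (2, 0)]

# An ordinary sort lets the first column dominate the ordering.
single_column_sort = sorted(rows)

# A z-order sort keeps rows close together when both columns are close.
z_order_sort = sorted(rows, key=lambda r: z_order_key(*r))

print(single_column_sort)  # [(0, 3), (1, 0), (1, 1), (2, 0)]
print(z_order_sort)        # [(1, 0), (1, 1), (2, 0), (0, 3)]
```

Because neither column dominates the key, filters on either column can skip more data than they could with a sort on a single column.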

No. To prevent accidentally compromising the integrity of your tables or breaking downstream applications, table buckets do not allow manual object overwrites or deletes. Table buckets only support the subset of S3 APIs necessary to access and update Iceberg tables. Instead, you can configure unreferenced file removal and snapshot expiration on your tables to delete data.

Table buckets give you the ability to apply resource policies to the entire bucket, or to individual tables. Table bucket policies can be applied using the PutTablePolicy and PutTableBucketPolicy APIs. Table-level policies allow you to manage permissions to tables in your table buckets based on the logical table that they are associated with, without having to understand the physical location of individual data files. S3 Tables support tags for attribute-based access control (ABAC), enabling you to scale access permissions and grant access to tables based on their tags in your IAM policies, AWS Organizations policies, and S3 Tables resource policies. This simplifies access management by reducing the need for frequent policy updates as your data lake grows. Additionally, S3 Block Public Access is always applied to your table buckets.

Yes. Table buckets rely on Iceberg’s snapshot functionality to keep your tables consistent when there are multiple concurrent writers.

Table buckets support the Apache Iceberg table format with Parquet, Avro, or ORC data.

Table buckets offer three maintenance operations: compaction, snapshot management, and unreferenced file removal. Compaction periodically combines smaller objects into fewer, larger objects to improve query performance.

You can also choose from advanced compaction strategies such as sort and z-order to further optimize performance based on your query patterns.

Yes, S3 Tables support AWS CloudTrail. You can set up CloudTrail data and management events for your table buckets, similar to how you would with a general purpose S3 bucket. CloudTrail logs for your table buckets include information about both table and data-level requests, as well as automatic maintenance operations performed by S3 Tables on your tables.

Yes, data in table buckets are encrypted by default using server-side encryption, ensuring baseline protection for your data at rest. For enhanced security, you have the option to encrypt your data in S3 Tables using your own encryption keys. These keys are created and managed within your AWS account via AWS Key Management Service (AWS KMS). With KMS, there are separate permissions for the use of the KMS key, adding an extra layer of control and protection against unauthorized access to your tables stored in table buckets. Additionally, KMS generates a detailed audit trail, allowing you to track who accessed which table and when, using your key. KMS also offers additional security controls to support your efforts in complying with industry requirements such as PCI-DSS, HIPAA/HITECH, and FedRAMP. This comprehensive approach to encryption and key management delivers the security and flexibility needed to protect your sensitive data effectively.

With S3 Tables, you pay for storage, requests, and an object monitoring fee per object stored in table buckets. There are also additional fees for table maintenance and replication. To see pricing details, read the S3 pricing page.

Compaction combines smaller objects into fewer, larger objects to improve Iceberg query performance. By default, S3 Tables uses the binpack compaction strategy. You can also use advanced compaction strategies such as sort and z-order to further optimize performance. The compacted files are written as the most recent snapshot of your table. Amazon S3 compacts tables based on a target file size optimal for your data access pattern, or a value you specify, with a default target file size of 512 MB. You can set the target file size to any value from 64 MB to 512 MB using the PutTableMaintenanceConfiguration API.
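As a rough illustration of the target file size setting, the sketch below builds a maintenance configuration payload and rejects values outside the 64 MB to 512 MB range described above. The helper name is hypothetical, and the field names (`icebergCompaction`, `targetFileSizeMB`, `status`) are assumptions based on the S3 Tables API; check them against the current PutTableMaintenanceConfiguration reference before use.

```python
MIN_TARGET_MB = 64   # lower bound stated in the answer above
MAX_TARGET_MB = 512  # upper bound and default stated in the answer above


def compaction_config(target_file_size_mb: int = MAX_TARGET_MB) -> dict:
    """Build a compaction maintenance payload (hypothetical helper)."""
    if not MIN_TARGET_MB <= target_file_size_mb <= MAX_TARGET_MB:
        raise ValueError(
            f"target file size must be between {MIN_TARGET_MB} and {MAX_TARGET_MB} MB"
        )
    # Field names are assumed; verify against the API reference.
    return {
        "type": "icebergCompaction",
        "value": {
            "status": "enabled",
            "settings": {
                "icebergCompaction": {"targetFileSizeMB": target_file_size_mb}
            },
        },
    }
```

The resulting dictionary could then be passed, together with the table bucket ARN, namespace, and table name, to whichever SDK call you use for PutTableMaintenanceConfiguration.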

Snapshot management expires and removes table snapshots according to the snapshot retention configuration. Snapshot management determines the number of active snapshots for your tables based on the MinimumSnapshots (1 by default) and MaximumSnapshotAge (120 hours by default) settings. When a snapshot expires, Amazon S3 creates delete markers for the data and metadata files uniquely referenced by that snapshot and marks these files as noncurrent. These noncurrent files are deleted after the number of days specified by the NoncurrentDays property in your unreferenced file removal policy. You can change the default values for snapshot management using the PutTableMaintenanceConfiguration API. Snapshot management does not support retention values you configure on the Iceberg metadata.json file, including branch or tag-based retention. Snapshot management for S3 Tables is disabled when you configure a branch or tag-based retention policy or configure a retention policy on the metadata.json file that is longer than the values configured through the PutTableMaintenanceConfiguration API.
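To see how the two settings interact, here is a simplified model, not the S3 Tables implementation. It assumes a snapshot is expired only when it is older than MaximumSnapshotAge and at least MinimumSnapshots newer snapshots remain. The function name and arguments are hypothetical.

```python
from datetime import datetime, timedelta


def expired_snapshots(snapshot_times, now, minimum_snapshots=1,
                      maximum_snapshot_age_hours=120):
    """Return the snapshot timestamps a simplified retention rule would expire."""
    newest_first = sorted(snapshot_times, reverse=True)
    cutoff = now - timedelta(hours=maximum_snapshot_age_hours)
    expired = []
    for position, created in enumerate(newest_first):
        # Always keep the newest `minimum_snapshots`, regardless of age.
        if position < minimum_snapshots:
            continue
        # Beyond the minimum, expire anything older than the age limit.
        if created < cutoff:
            expired.append(created)
    return expired
```

With the defaults, a table whose only snapshots are all older than 120 hours still keeps its newest one.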

Unreferenced file removal identifies and deletes all objects that are not referenced by any table snapshots. As part of your unreferenced file removal policy, you can configure two properties: ExpireDays (3 days by default) and NoncurrentDays (10 days by default). For any object not referenced by your table and older than the ExpireDays property, S3 permanently deletes the object after the number of days specified by the NoncurrentDays property. You can configure unreferenced file removal at a table bucket level. You can change the default values for unreferenced file removal using the PutTableBucketMaintenanceConfiguration API.
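The timing can be worked through with simple date arithmetic. This sketch assumes the two periods run back to back (the object first ages past ExpireDays, then waits NoncurrentDays before permanent deletion) and that age is counted from the object's creation date. Both are interpretations of the description above, not confirmed behavior.

```python
from datetime import date, timedelta


def removal_timeline(created_on, expire_days=3, noncurrent_days=10):
    """Return (eligible_on, deleted_on) for an unreferenced object.

    Assumes ExpireDays and NoncurrentDays are applied one after the other.
    """
    eligible_on = created_on + timedelta(days=expire_days)
    deleted_on = eligible_on + timedelta(days=noncurrent_days)
    return eligible_on, deleted_on
```

Under these assumptions and the default values, an unreferenced object created on January 1 becomes eligible on January 4 and is permanently deleted on January 14.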

S3 Tables replication enables automatic, asynchronous replication of Apache Iceberg tables across AWS Regions and accounts. S3 Tables replication creates and maintains read-only replicas of your tables, including all table data, metadata, and snapshot history, helping you reduce query latency for geographically distributed teams. When you enable replication, S3 Tables automatically creates read-only replica tables in your destination table buckets, backfills them with the latest state of the source table, and continuously monitors for new updates to keep replicas in sync. The replication process maintains the order of snapshots and preserves parent-child relationships in the snapshot history. You can configure replication at the table bucket level to replicate all tables, or at the individual table level for selective replication.

S3 Vectors

You can get started with S3 Vectors in four simple steps, without having to set up any infrastructure outside of Amazon S3. First, create a vector bucket in a specific AWS Region through the CreateVectorBucket API or in the S3 Console. Second, to organize your vector data in a vector bucket, you create a vector index with the CreateIndex API or in the S3 Console. When you create a vector index, you specify the distance metric (Cosine or Euclidean) and the number of dimensions a vector should have (up to 4092). For the most accurate results, select the distance metric recommended by your embedding model. Third, add vector data to a vector index with the PutVectors API. You can optionally attach metadata as key value pairs to each vector to filter queries. Fourth, perform a similarity query using the QueryVectors API, specifying the vector to search for and the number of the most similar results to return.
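The four steps can be sketched as a single function. This is a minimal sketch that assumes `client` behaves like the boto3 `s3vectors` client. The operation names come from the text above, but the keyword argument names (`vectorBucketName`, `indexName`, `dataType`, `distanceMetric`, `topK`, and so on) are assumptions to verify against the SDK reference. The function name and argument layout are hypothetical.

```python
def index_and_query(client, bucket, index, vectors, query, top_k=3):
    """Create a bucket and index, insert vectors, then run a similarity query.

    `vectors` is a list of (key, values, metadata) tuples; `query` is a list
    of floats with the same number of dimensions as the stored vectors.
    """
    # Step 1: create the vector bucket.
    client.create_vector_bucket(vectorBucketName=bucket)
    # Step 2: create the index with a distance metric and dimension count.
    client.create_index(
        vectorBucketName=bucket,
        indexName=index,
        dataType="float32",
        dimension=len(query),
        distanceMetric="cosine",
    )
    # Step 3: add vectors, each with optional metadata for filtering.
    client.put_vectors(
        vectorBucketName=bucket,
        indexName=index,
        vectors=[
            {"key": key, "data": {"float32": values}, "metadata": metadata}
            for key, values, metadata in vectors
        ],
    )
    # Step 4: ask for the top_k most similar vectors.
    return client.query_vectors(
        vectorBucketName=bucket,
        indexName=index,
        queryVector={"float32": query},
        topK=top_k,
        returnDistance=True,
        returnMetadata=True,
    )
```

In practice you would pass something like `boto3.client("s3vectors")` as `client`. Taking the client as a parameter keeps the sketch testable without AWS credentials.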

You can create a vector index using the S3 Console or the CreateIndex API. During index creation, you specify the vector bucket, index, distance metric, dimensions, and optionally a list of metadata fields that you want to exclude from filtering during similarity queries. For example, if you want to store data associated with vectors purely for reference, you can specify these as non-filterable metadata fields. Upon creation, each index is assigned a unique Amazon Resource Name (ARN). Subsequently when you make a write or query request, you direct it to a vector index within a vector bucket.

You can add vectors to a vector index using the PutVectors API. Each vector has a key that uniquely identifies it within a vector index (for example, a programmatically generated UUID). To maximize write throughput, insert vectors in large batches, up to the maximum request size. Additionally, you can attach metadata (for example, year, author, genre, and location) as key value pairs to each vector. When you include metadata, by default all fields can be used as filters in a similarity query unless specified as non-filterable metadata at the time of vector index creation. To generate new vector embeddings of your unstructured data, you can use Amazon Bedrock’s InvokeModel API, specifying the model ID of the embedding model you want to use.
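One way to prepare batched writes is to group embeddings into fixed-size chunks and give each vector a generated UUID key. In this sketch the record layout (`key`, `data`, `metadata`) is an assumption about the PutVectors request shape. The `batch_size` is left to the caller because the maximum request size is not stated here.

```python
import uuid
from itertools import islice


def batched_vectors(embeddings, batch_size):
    """Yield lists of vector records, each with a freshly generated UUID key.

    `embeddings` is an iterable of (values, metadata) pairs.
    """
    iterator = iter(embeddings)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield [
            {
                "key": str(uuid.uuid4()),
                "data": {"float32": values},
                "metadata": metadata,
            }
            for values, metadata in chunk
        ]
```

Each yielded list could then be sent as the vectors of one PutVectors request.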

You can use the GetVectors API to look up and return vectors and associated metadata by the vector key.

You can run a similarity query with the QueryVectors API, specifying the query vector, the number of relevant results to return (the top k nearest neighbors), and the index ARN. When generating the query vector, you should use the same embedding model that was used to generate the initial vectors stored in the vector index. For example, if you use Amazon Titan Text Embeddings v2 in Amazon Bedrock to generate embeddings of your documents, it is recommended that you use the same model to convert a question to a vector. Additionally, you can use metadata filters in a query, to search vectors that match the filter. When you run the similarity query, by default the vector keys are returned. You can optionally include the distance and metadata in the response.
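To make the query semantics concrete, the following sketch runs an exact, brute-force top-k search with a metadata filter over an in-memory dictionary. It illustrates the concept only: it does not describe how S3 Vectors executes queries, and all names are hypothetical. It assumes non-zero vectors and simple equality filters.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def query_top_k(index, query, top_k, metadata_filter=None):
    """Return [(key, distance)] for the top_k nearest vectors, nearest first.

    `index` maps key -> (vector, metadata). `metadata_filter` maps field
    names to required values; vectors that do not match are skipped.
    """
    candidates = []
    for key, (vector, metadata) in index.items():
        if metadata_filter and any(
            metadata.get(field) != value
            for field, value in metadata_filter.items()
        ):
            continue
        candidates.append((key, cosine_distance(query, vector)))
    candidates.sort(key=lambda item: item[1])
    return candidates[:top_k]
```

As in the answer above, the query vector must come from the same embedding model as the stored vectors for the distances to be meaningful.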

S3 Vectors offers highly durable and available vector storage. Data written to S3 Vectors is stored on S3, which is designed for 11 9s of data durability. S3 Vectors is designed to deliver 99.99% availability with an availability SLA of 99.9%.

S3 Vectors delivers sub-second query latency. It uses the elastic throughput of Amazon S3 to handle searches across millions of vectors and is ideal for infrequent query workloads.

When you run similarity queries over your vector embeddings, several factors can affect average recall, including the embedding model, the size of the vector dataset (number of vectors and dimensions), and the distribution of queries. S3 Vectors delivers over 90% average recall for most datasets. Average recall measures the quality of query results: 90% means that the response contains 90% of the vectors stored in the index that are truly closest to the query vector (the ground truth). However, because actual performance may vary depending on your specific use case, we recommend conducting your own tests with representative data and queries to validate that S3 vector indexes meet your recall requirements.
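If you run your own recall tests, the measurement itself is straightforward. This sketch assumes you already have, for each test query, the keys returned by the index and the ground truth keys from an exact search over the same data. The function names are hypothetical.

```python
def recall(returned_keys, ground_truth_keys):
    """Fraction of the ground truth nearest neighbors present in the response."""
    truth = set(ground_truth_keys)
    if not truth:
        raise ValueError("ground truth must not be empty")
    return len(truth & set(returned_keys)) / len(truth)


def average_recall(results):
    """Mean recall over an iterable of (returned_keys, ground_truth_keys) pairs."""
    scores = [recall(returned, truth) for returned, truth in results]
    if not scores:
        raise ValueError("no query results to average")
    return sum(scores) / len(scores)
```

For example, a response containing 9 of the 10 true nearest neighbors has a recall of 0.9 for that query.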

You can see a list of vectors in a vector index with the ListVectors API, which returns up to 1,000 vectors at a time with an indicator if the response is truncated. The response includes the last modified date, vector key, vector data, and metadata. You can also use the ListVectors API to easily export vector data from a specified vector index. The ListVectors operation is strongly consistent. So, after a write you can immediately list vectors with any changes reflected.
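Exporting a whole index means following the truncation indicator until no pages remain. This sketch assumes the indicator is a `nextToken` field and that `client.list_vectors` accepts the keyword arguments shown. Both are assumptions about the SDK to verify against its reference, and the function name is hypothetical.

```python
def iter_all_vectors(client, bucket, index):
    """Yield every vector in an index, one page at a time."""
    token = None
    while True:
        kwargs = {
            "vectorBucketName": bucket,
            "indexName": index,
            "returnData": True,
            "returnMetadata": True,
        }
        if token:
            kwargs["nextToken"] = token
        page = client.list_vectors(**kwargs)
        yield from page.get("vectors", [])
        # A missing token is taken to mean the listing is complete.
        token = page.get("nextToken")
        if not token:
            return
```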

With S3 Vectors, you pay for storage and any applicable write and read requests (e.g., inserting vectors and performing query operations on vectors in a vector index). To see pricing details, see the S3 pricing page.

Yes. While creating a Bedrock Knowledge Base through the Bedrock Console or API, you can configure an existing S3 vector index as your vector store to save on vector storage costs for RAG use cases. If you prefer to let Bedrock create and manage the vector index for you, use the Quick Create workflow in the Bedrock console. Additionally, you can configure a new S3 vector index as your vector store for RAG workflows in Amazon SageMaker Unified Studio.

Yes. There are two ways you can use S3 Vectors with Amazon OpenSearch Service. First, S3 customers can export all vectors from an S3 vector index to OpenSearch Serverless as a new serverless collection using either the S3 or OpenSearch console. If you build natively on S3 Vectors, you will benefit from being able to use OpenSearch Serverless selectively for workloads with real-time query needs. Second, if you are a managed OpenSearch customer, you can now choose S3 Vectors as your engine for vector data that can be queried with sub-second latency. OpenSearch will then automatically use S3 Vectors as the underlying engine for vectors and you can update and search your vector data using the OpenSearch APIs. You gain the cost benefits of S3 Vectors, with no changes to your applications.

S3 Files

S3 Files is a shared file system that connects any AWS compute directly with your data in Amazon S3. It provides fast, direct access to all of your S3 data as files with full file system semantics and low-latency performance, without your data ever leaving S3. That means file-based applications, agents, and teams can access and work with S3 data as a file system using the tools they already depend on. You no longer need to duplicate your data or cycle it between object storage and file system storage. File-based tools and applications across your organization can work with your S3 data directly from any compute instance, container, or function.

With S3 Files, Amazon S3 is the first and only cloud object store that provides fully-featured, high-performance file system access to your data. S3 Files gives you the performance and simplicity of a file system with the scalability, durability, and cost-effectiveness of S3. There are no data silos, no synchronization complexities, and no tradeoffs. File and object storage, together in one place without compromise.

You should use S3 Files when your file-based applications, AI agents, and tools need to work directly with data stored in S3. S3 Files eliminates the need to copy data between storage systems or build custom integrations. Your existing Python libraries, ML frameworks, CLI utilities, and shell scripts work with your S3 data using standard file operations, with no code changes required. You can mount your S3 bucket across multiple compute resources (EC2, EKS, ECS, Lambda, Fargate, and Batch) for distributed applications where teams, agents, and workloads collaborate on shared datasets in real time.

S3 Files works like a traditional high-performance file system that can be accessed by any Linux-based compute resource, but its view of files and folders reflects what's in your S3 bucket. S3 Files is built using Amazon EFS, which intelligently loads your active working set onto high-performance storage. This delivers low latencies for frequently accessed data while keeping costs proportional to what you're actively using. When you read files, S3 Files lazily loads portions of file metadata and contents onto high-performance storage. Data that doesn't meet your configured file size threshold is read directly from S3, with no file system storage involved. When you write data, your writes are sent to the highly durable high-performance storage and then synced back to S3 to keep your bucket consistent. Data that hasn't been accessed within a configurable window (1 to 365 days, defaulting to 30) automatically expires from this storage, so you pay only for what you're actively using while your authoritative data always remains in S3.

S3 Files intelligently loads your active working set onto high-performance storage, delivering sub-millisecond latencies for frequently accessed data. Each file system delivers up to 5 GiB/s of write throughput, up to 250K read IOPS, and 50K write IOPS. The maximum per-client read throughput is 3 GiB/s, with multiple terabytes per second of aggregate read throughput.

When accessing files that aren't cached in your file system, the file system needs to first retrieve data from your bucket, which has latencies in the tens of milliseconds. Data stored in the file system is read with as low as sub-millisecond latencies. Writes are staged on the file system with single-digit millisecond latencies.

You can create an S3 file system from the S3 console, AWS CLI, or S3 API, and then mount it to your EC2 instances using standard mount commands. If you’re using a Linux distribution other than Amazon Linux, install the amazon-efs-utils on your instance before mounting your bucket.

For EKS clusters, install the Container Storage Interface (CSI) driver add-on efs-csi-driver and then use the Kubernetes CLI (kubectl) to mount the bucket. For ECS instances, add your S3 file system using Task definitions in the ECS Console or ECS API, and then attach the task definition to your cluster. For Lambda functions, select your S3 file system in the console or add it as a file system to your function configuration APIs. See documentation for a step-by-step guide.

S3 Files supports all storage classes in S3 general purpose buckets. All objects are visible as files within the file system, regardless of their underlying S3 storage class.

If you attempt to open a file that’s backed by an asynchronous storage class (S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive), you will receive an I/O error. To access the file, you must first restore it with an S3 API (see documentation).

S3 Files does not support data in S3 Tables, S3 Vectors, or directory buckets.

S3 Files supports POSIX permissions on files and directories. When accessing files, the file system checks the UID and GID of your client against a file’s permissions. These permissions are stored as object metadata in your S3 bucket using the same format as other AWS file services.

Data is always encrypted in transit and at rest. Data is encrypted in transit between your compute resources and your file system using TLS 1.3. By default, S3 encrypts any data residing in your file system with S3-managed keys (SSE-S3), or optionally, when creating your file system, you can specify a KMS key to use (SSE-KMS).

File systems and S3 handle update operations differently. File systems support operations like partial file updates and append operations, while S3 is designed for full object updates. File systems also support directory renames, whereas renaming a prefix on S3 requires renaming each object that contains the prefix.

File updates made to data in your S3 file system are temporarily staged in the file system before being synchronized to S3. By doing so, the file system intelligently aggregates updates (like multiple updates to a single file done in short succession) to reduce S3 API calls.

S3 file systems are automatically synchronized with your S3 bucket to optimize cost and performance. Initially, your file system starts empty, importing object metadata on-demand as you access files and directories. For example, when you list a directory in your file system, metadata for the directory is imported from the S3 bucket. After a directory is accessed for the first time and its associated metadata is imported, the directory will continue to stay synchronized with S3 as you make updates in the S3 bucket. To optimize performance, the file system automatically pre-loads data for files less than 128KiB when a directory is first accessed. Data for larger files is read directly from S3.

You can also configure fine-grained rules for how data is imported from the S3 bucket to the file system using the PutSynchronizationConfiguration API or the S3 Console. For example, you can configure a rule to automatically store data in the file system after a file is read from a specific S3 prefix.

By default, the file system automatically exports all new files and file updates as a new object or a new version of an existing object within minutes. If you delete a file, the corresponding object in your bucket will have a delete marker as the current version of the object.

S3 file systems provide NFS close-to-open consistency, meaning that when a client closes a file, future opens of that file by any client will see the latest version of the file content. All compute resources reading from and writing to the same mounted file system see consistent data in real time. This is the standard consistency model that most file-based applications expect.

S3 Files implements a consistency model where the S3 bucket serves as the authoritative source of truth. S3 API operations maintain strong read-after-write consistency, file system operations provide close-to-open consistency, and the synchronization between these two systems is eventually consistent.

S3 Files is designed to provide 99.999999999% durability (11 nines) for your authoritative dataset in your S3 bucket. Additionally, data in your file system is stored redundantly across a minimum of 3 Availability Zones (AZ) by default, providing built-in resilience against widespread disaster including the permanent loss of an entire data center.

You can monitor S3 Files using Amazon CloudWatch metrics. CloudWatch provides file system metrics such as file system storage usage and number of client connections. CloudWatch also provides metrics for monitoring updates between your file system and bucket, including the number of files pending export. Additionally, you can use AWS CloudTrail to log management events in your file system. For example, events such as creating a file system or creating a mount target generate entries in CloudTrail. See our documentation for more details.

Use S3 Files when you already have data in S3 and you need file-based applications, AI agents, or teams to work with it directly using standard file operations. S3 Files is the right choice when your primary goal is to bring file access to your existing S3 data without duplicating it or managing a separate storage system. Your data stays in S3 as the authoritative source, and you get file system access on top of it.

Use Amazon EFS when you need a fully managed, cloud-native shared file system as your primary storage for workloads like shared home directories, content management, application configuration, and development environments where your data originates in and is primarily accessed through the file system.

Use Amazon FSx if you have existing file-based applications that run on network-attached file storage. FSx is purpose-built to support those workloads by offering fully managed deployments of popular file systems (NetApp ONTAP, Windows File Server, OpenZFS, or Lustre) with the same features, capabilities, and performance you're used to.

Amazon S3 and IPv6

Every server and device connected to the internet must have a unique address. Internet Protocol Version 4 (IPv4) was the original 32-bit addressing scheme. However, the continued growth of the internet means that all available IPv4 addresses will be utilized over time. Internet Protocol Version 6 (IPv6) is an addressing mechanism designed to overcome the global address limitation on IPv4.

Using IPv6 support for Amazon S3, applications can connect to Amazon S3 without the need for any IPv6 to IPv4 translation software or systems. You can meet compliance requirements, more easily integrate with existing IPv6-based on-premises applications, and remove the need for expensive networking equipment to handle the address translation. You can also now utilize the existing source address filtering features in IAM policies and bucket policies with IPv6 addresses, expanding your options to secure applications interacting with Amazon S3.

You can get started by pointing your application to Amazon S3’s “dual-stack” endpoint, which supports access over both IPv4 and IPv6. In most cases, no further configuration is required for access over IPv6, because most network clients prefer IPv6 addresses by default. Applications that are impacted by using IPv6 can switch back to the standard IPv4-only endpoints at any time. IPv6 with Amazon S3 is supported in all commercial AWS Regions, including AWS GovCloud (US) Regions, the Amazon Web Services China (Beijing) Region, operated by Sinnet, and the Amazon Web Services China (Ningxia) Region, operated by NWCD.
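As a small illustration, the dual-stack endpoint differs from the standard one only in its hostname. This sketch builds regional hostnames for the standard `aws` partition. The helper name is hypothetical, and the China Regions use a different domain suffix that this sketch does not handle.

```python
def s3_endpoint(region, dual_stack=True):
    """Return the regional S3 endpoint hostname for the standard partition."""
    prefix = "s3.dualstack" if dual_stack else "s3"
    return f"{prefix}.{region}.amazonaws.com"
```

Most SDKs also expose a configuration switch for dual-stack endpoints, so you rarely need to build the hostname by hand.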

No, you will see the same performance when using either IPv4 or IPv6 with Amazon S3.

S3 Event Notifications

You can use the Amazon S3 Event Notifications feature to receive notifications when certain events happen in your S3 bucket, such as PUT, POST, COPY, and DELETE events. You can publish notifications to Amazon EventBridge, Amazon SNS, Amazon SQS, or directly to AWS Lambda.

Amazon S3 Event Notifications let you run workflows, send alerts, or perform other actions in response to changes in your objects stored in S3. You can use S3 Event Notifications to set up triggers to perform actions including transcoding media files when they are uploaded, processing data files when they become available, and synchronizing S3 objects with other data stores. You can also set up event notifications based on object name prefixes and suffixes. For example, you can choose to receive notifications on object names that start with “images/."
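A prefix or suffix filter is expressed as part of the bucket's notification configuration. The sketch below builds such a configuration for a Lambda destination, in the shape accepted by the PutBucketNotificationConfiguration API. The helper name and the ARN in the usage note are hypothetical.

```python
def lambda_notification(function_arn, prefix=None, suffix=None,
                        events=("s3:ObjectCreated:*",)):
    """Build a notification configuration that invokes a Lambda function."""
    rules = []
    if prefix:
        rules.append({"Name": "prefix", "Value": prefix})
    if suffix:
        rules.append({"Name": "suffix", "Value": suffix})
    config = {"LambdaFunctionArn": function_arn, "Events": list(events)}
    if rules:
        # Only object keys matching every rule trigger the notification.
        config["Filter"] = {"Key": {"FilterRules": rules}}
    return {"LambdaFunctionConfigurations": [config]}
```

For example, `lambda_notification("arn:aws:lambda:us-east-1:111122223333:function:resize", prefix="images/")` describes a trigger for new objects whose names start with "images/".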

For a detailed description of the information included in Amazon S3 Event Notification messages, refer to the configuring Amazon S3 Event Notifications documentation.

For a detailed description of how to configure event notifications, refer to the configuring Amazon S3 Event Notifications documentation. You can learn more about AWS messaging services in the Amazon SNS documentation and the Amazon SQS documentation.

There are no additional charges for using Amazon S3 for event notifications. You pay only for use of Amazon SNS or Amazon SQS to deliver event notifications, or for the cost of running an AWS Lambda function. Visit the Amazon SNS, Amazon SQS, or AWS Lambda pricing pages to view the pricing details for these services.

Amazon S3 Transfer Acceleration

Amazon S3 Transfer Acceleration creates fast, easy, and secure transfers of files over long distances between your client and your Amazon S3 bucket. S3 Transfer Acceleration leverages Amazon CloudFront’s globally distributed AWS Edge locations. As data arrives at an AWS Edge Location, data is routed to your Amazon S3 bucket over an optimized network path.

To get started with S3 Transfer Acceleration, enable it on an S3 bucket using the Amazon S3 console, the Amazon S3 API, or the AWS CLI. After S3 Transfer Acceleration is enabled, you can point your Amazon S3 PUT and GET requests to the s3-accelerate endpoint domain name. Your data transfer application must use one of the following two types of endpoints to access the bucket for faster data transfer: .s3-accelerate.amazonaws.com or .s3-accelerate.dualstack.amazonaws.com for the “dual-stack” endpoint. If you want to use standard data transfer, you can continue to use the regular endpoints. There are certain restrictions on which buckets will support S3 Transfer Acceleration. For details, refer to the Amazon S3 documentation.
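The accelerate endpoints are bucket-specific hostnames. This sketch builds one from a bucket name, assuming the bucket name is prepended to the endpoint domain. It also rejects names containing periods, which is one of the documented restrictions as far as we are aware; confirm the full list in the Amazon S3 documentation. The helper name is hypothetical.

```python
def accelerate_endpoint(bucket, dual_stack=False):
    """Return the Transfer Acceleration hostname for a bucket."""
    if "." in bucket:
        raise ValueError(
            "bucket names used with Transfer Acceleration cannot contain periods"
        )
    domain = (
        "s3-accelerate.dualstack.amazonaws.com"
        if dual_stack
        else "s3-accelerate.amazonaws.com"
    )
    return f"{bucket}.{domain}"
```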

S3 Transfer Acceleration helps you fully use your bandwidth, minimize the effect of distance on throughput, and is designed to ensure consistently fast data transfer to Amazon S3 regardless of your client’s location. The amount of acceleration primarily depends on your available bandwidth, the distance between the source and destination, and packet loss rates on the network path. Generally, you will see more acceleration when the source is farther from the destination, when there is more available bandwidth, and/or when the object size is bigger. One customer measured a 50% reduction in their average time to ingest 300 MB files from a global user base spread across the US, Europe, and parts of Asia to a bucket in the Asia Pacific (Sydney) Region. Another customer observed cases where performance improved in excess of 500% for users in South East Asia and Australia uploading 250 MB files (in parts of 50 MB) to an S3 bucket in the US East (N. Virginia) Region. Access the S3 Transfer Acceleration speed comparison tool to get a preview of the performance benefit from your location.

S3 Transfer Acceleration is designed to optimize transfer speeds from across the world into S3 buckets. If you are uploading to a centralized bucket from geographically dispersed locations or if you regularly transfer GBs or TBs of data across continents, you may save hours or days of data transfer time with S3 Transfer Acceleration.

S3 Transfer Acceleration provides the same security as regular transfers to Amazon S3. All Amazon S3 security features, such as access restriction based on a client’s IP address, are supported as well. S3 Transfer Acceleration communicates with clients over standard TCP and does not require firewall changes. No data is ever saved at AWS Edge locations.

Each time you use S3 Transfer Acceleration to upload an object, we will check whether S3 Transfer Acceleration is likely to be faster than a regular Amazon S3 transfer. If we determine that S3 Transfer Acceleration is not likely to be faster than a regular Amazon S3 transfer of the same object to the same destination AWS Region, we will not charge for the use of S3 Transfer Acceleration for that transfer, and we may bypass the S3 Transfer Acceleration system for that upload.

Yes, S3 Transfer Acceleration supports all bucket level features including multipart uploads.

S3 Transfer Acceleration optimizes the TCP protocol and adds additional intelligence between the client and the S3 bucket, making S3 Transfer Acceleration a better choice if a higher throughput is desired. If you have objects that are smaller than 1 GB or if the data set is less than 1 GB in size, you should consider using Amazon CloudFront's PUT/POST commands for optimal performance.

AWS Direct Connect is a good choice for customers who have a private networking requirement or who have access to AWS Direct Connect exchanges. S3 Transfer Acceleration is best for submitting data from distributed client locations over the public internet, or where variable network conditions make throughput poor. Some AWS Direct Connect customers use S3 Transfer Acceleration to help with remote office transfers where they may suffer from poor internet performance.

You can benefit from configuring the bucket destination in your third-party gateway to use an S3 Transfer Acceleration endpoint domain. Visit the File section of the Storage Gateway FAQ to learn more about the AWS implementation.

Yes. Software packages that connect directly into Amazon S3 can take advantage of S3 Transfer Acceleration when they send their jobs to Amazon S3. Learn more about Storage Partner Solutions.

Yes, AWS has expanded its HIPAA compliance program to include S3 Transfer Acceleration as a HIPAA eligible service. If you have an executed Business Associate Agreement (BAA) with AWS, you can use S3 Transfer Acceleration to make fast, easy, and secure transfers of files, including protected health information (PHI) over long distances between your client and your Amazon S3 bucket.

Security

Amazon S3 is secure by default. Upon creation, only you have access to Amazon S3 buckets that you create, and you have complete control over who has access to your data. Amazon S3 supports user authentication to control access to data. You can use access control mechanisms, such as bucket policies, to selectively grant permissions to users and groups of users. The Amazon S3 console highlights your publicly accessible buckets, indicates the source of public accessibility, and also warns you if changes to your bucket policies or bucket ACLs would make your bucket publicly accessible. You should enable Amazon S3 Block Public Access for all accounts and buckets that you do not want publicly accessible. All new buckets have Block Public Access turned on by default. You can securely upload/download your data to Amazon S3 via SSL endpoints using the HTTPS protocol. Amazon S3 automatically encrypts all object uploads to your bucket (as of January 5, 2023). Alternatively, you can use your own encryption libraries to encrypt data before storing it in Amazon S3.
For more information on security in AWS, refer to the AWS security page, and for S3 security information, visit the S3 security page and the S3 security best practices guide.

Customers can use a number of mechanisms for controlling access to Amazon S3 resources, including AWS Identity and Access Management (IAM) policies, bucket policies, access point policies, access control lists (ACLs), Query String Authentication, Amazon Virtual Private Cloud (Amazon VPC) endpoint policies, service control policies (SCPs) in AWS Organizations, and Amazon S3 Block Public Access.

Yes, customers can optionally configure an Amazon S3 bucket to create access log records for all requests made against it. Alternatively, customers who need to capture IAM/user identity information in their logs can configure AWS CloudTrail Data Events. These access log records can be used for audit purposes and contain details about the request, such as the request type, the resources specified in the request, and the time and date the request was processed.

Amazon S3 encrypts all new data uploads to any bucket. Amazon S3 applies S3-managed server-side encryption (SSE-S3) as the base level of encryption to all object uploads (as of January 5, 2023). SSE-S3 provides a fully-managed solution where Amazon handles key management and key protection using multiple layers of security. You should continue to use SSE-S3 if you prefer to have Amazon manage your keys. Additionally, you can choose to encrypt data using SSE-C, SSE-KMS, DSSE-KMS, or a client library such as the Amazon S3 Encryption Client. Each option allows you to store sensitive data encrypted at rest in Amazon S3.
SSE-C allows Amazon S3 to perform encryption and decryption of objects, while you retain control of the encryption keys. With SSE-C, you don’t need to implement or use a client-side library to perform the encryption and decryption of objects you store in Amazon S3, but you do need to manage the keys that you send to Amazon S3 to encrypt and decrypt objects. Use SSE-C if you want to maintain your own encryption keys, but don’t want to implement or leverage a client-side encryption library.
SSE-KMS lets AWS Key Management Service (AWS KMS) manage your encryption keys. Using AWS KMS to manage your keys provides several additional benefits. With AWS KMS, there are separate permissions for the use of the KMS key, providing an additional layer of control and protection against unauthorized access to your objects stored in Amazon S3. AWS KMS provides an audit trail so you can see who used your key to access which object and when, as well as view failed attempts to access data from users without permission to decrypt the data. Also, AWS KMS provides additional security controls to support customer efforts to comply with PCI-DSS, HIPAA/HITECH, and FedRAMP industry requirements.
DSSE-KMS simplifies the process of applying two layers of encryption to your data, without having to invest in infrastructure required for client-side encryption. Each layer of encryption uses a different implementation of the 256-bit Advanced Encryption Standard with Galois Counter Mode (AES-GCM) algorithm and is vetted and accepted for use on top-secret workloads. DSSE-KMS uses AWS KMS to generate data keys, and lets AWS KMS manage your encryption keys. With AWS KMS, there are separate permissions for the use of the KMS key, providing an additional layer of control and protection against unauthorized access to your objects stored in Amazon S3. AWS KMS provides an audit trail so you can see who used your key to access which object and when, as well as view failed attempts to access data from users without permission to decrypt the data. Also, AWS KMS provides additional security controls to support customer efforts to comply with PCI-DSS, HIPAA/HITECH, and FedRAMP industry requirements.
Using an encryption client library, you retain control of the keys and complete the encryption and decryption of objects client-side using an encryption library of your choice. Some customers prefer full end-to-end control of the encryption and decryption of objects; that way, only encrypted objects are transmitted over the internet to Amazon S3. Use a client-side library if you want to maintain control of your encryption keys, are able to implement or use a client-side encryption library, and need to have your objects encrypted before they are sent to Amazon S3 for storage.
For more information on using Amazon S3 SSE-S3, SSE-C, or SSE-KMS, refer to protecting data using encryption documentation.
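In SDK terms, the server-side options differ mainly in the parameters you send with an upload. The sketch below maps each option to the extra arguments a PutObject request would carry. The parameter names follow the boto3 `put_object` call as we understand it; the helper itself is hypothetical, and client-side encryption is left out because it happens before the request is made.

```python
def encryption_params(option, kms_key_id=None, customer_key=None):
    """Return the extra PutObject arguments for a server-side encryption option."""
    if option == "SSE-S3":
        return {"ServerSideEncryption": "AES256"}
    if option in ("SSE-KMS", "DSSE-KMS"):
        algorithm = "aws:kms" if option == "SSE-KMS" else "aws:kms:dsse"
        params = {"ServerSideEncryption": algorithm}
        if kms_key_id:
            # Without a key ID, S3 is expected to fall back to the AWS managed key.
            params["SSEKMSKeyId"] = kms_key_id
        return params
    if option == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C requires a 32-byte (256-bit) key that you manage")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption option: {option}")
```

With SSE-C, the same key must be supplied again on every read of the object, which is why key management stays with you.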

Customers can choose to store all data in Europe by using the Europe (Frankfurt), Europe (Ireland), Europe (Paris), Europe (Stockholm), Europe (Milan), Europe (Spain), Europe (London), or Europe (Zurich) Region. You can also use Amazon S3 on Outposts to keep all of your data on premises on the AWS Outpost, and you may choose to transfer data between AWS Outposts or to an AWS Region. It is your responsibility to ensure that you comply with European privacy laws. View the AWS General Data Protection Regulation (GDPR) Center and AWS Data Privacy Center for more information. If you have more specific location requirements or other data privacy regulations that require you to keep data in a location where there is not an AWS Region, you can use S3 storage classes for AWS Dedicated Local Zones or S3 on Outposts.

By default, your object data and object metadata stay within the single Local Zone (including Dedicated Local Zones) where you put the object. Bucket management and telemetry data, including bucket names, capacity metrics, CloudTrail logs, CloudWatch metrics, customer managed keys from AWS Key Management Service (KMS), and Identity and Access Management (IAM) policies, are stored back in the parent AWS Region. Optionally, other bucket management features, like S3 Batch Operations, store management metadata with bucket name and object name in the parent AWS Region.

An Amazon VPC Endpoint for Amazon S3 is a logical entity within a VPC that allows connectivity to S3 over the AWS global network. There are two types of VPC endpoints for S3: gateway VPC endpoints and interface VPC endpoints. Gateway endpoints are a gateway that you specify in your route table to access S3 from your VPC over the AWS network. Interface endpoints extend the functionality of gateway endpoints by using private IPs to route requests to S3 from within your VPC, on-premises, or from a different AWS Region. For more information, visit the AWS PrivateLink for Amazon S3 documentation.

You can limit access to your bucket from a specific Amazon VPC Endpoint or a set of endpoints using Amazon S3 bucket policies. S3 bucket policies now support a condition, aws:sourceVpce, that you can use to restrict access. For more details and example policies, read the gateway endpoints for S3 documentation.
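
A minimal sketch of such a policy, built as a Python dictionary and printed as JSON. The bucket name and endpoint ID are placeholders, and the function name `vpce_only_policy` is hypothetical. Note that a broad deny like this one also blocks requests that do not come through the listed endpoints, including console access, so it should be tested carefully before use.

```python
import json
from typing import List


def vpce_only_policy(bucket: str, vpce_ids: List[str]) -> dict:
    """Build a bucket policy that denies requests not arriving via the listed VPC endpoints."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyUnlessFromAllowedVpce",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            # Cover both the bucket itself and the objects inside it.
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            # Deny when the request's source endpoint is not in the allowed list.
            "Condition": {"StringNotEquals": {"aws:sourceVpce": vpce_ids}},
        }],
    }


print(json.dumps(vpce_only_policy("amzn-s3-demo-bucket", ["vpce-1a2b3c4d"]), indent=2))
```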

AWS PrivateLink for S3 provides private connectivity between Amazon S3 and on-premises environments. You can provision interface VPC endpoints for S3 in your VPC to connect your on-premises applications directly to S3 over AWS Direct Connect or AWS VPN. You no longer need to use public IPs, change firewall rules, or configure an internet gateway to access S3 from on-premises. To learn more, visit the AWS PrivateLink for S3 documentation.

You can create an interface VPC endpoint using the AWS VPC Management Console, AWS Command Line Interface (AWS CLI), AWS SDK, or API. To learn more, visit the documentation.

AWS recommends that you use interface VPC endpoints to access S3 from on-premises or from a VPC in another AWS Region. For resources that access S3 from a VPC in the same AWS Region as S3, we recommend using gateway VPC endpoints, as they are not billed. To learn more, visit the documentation.

Yes. If you have an existing gateway VPC endpoint, create an interface VPC endpoint in your VPC and update your client applications with the endpoint-specific DNS names of the interface endpoint. For example, if the ID of your interface endpoint is vpce-0fe5b17a0707d6abc-29p5708s in the us-east-1 Region, then your endpoint-specific DNS name is vpce-0fe5b17a0707d6abc-29p5708s.s3.us-east-1.vpce.amazonaws.com. In this case, only the requests to the endpoint-specific names route through the interface VPC endpoint to S3, while all other requests continue to route through the gateway VPC endpoint. To learn more, visit the documentation.
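
The naming pattern in the example above can be sketched as a small helper. The function name is hypothetical, and the pattern is taken only from the example in this answer, so it may not hold for every partition or Region.

```python
def endpoint_specific_dns_name(vpce_id: str, region: str) -> str:
    """Compose the endpoint-specific DNS name following the pattern in the example above."""
    return f"{vpce_id}.s3.{region}.vpce.amazonaws.com"


print(endpoint_specific_dns_name("vpce-0fe5b17a0707d6abc-29p5708s", "us-east-1"))
```

Clients are typically pointed at this name through their endpoint URL setting. The PrivateLink documentation shows a `bucket.` prefix on the name for bucket operations, so check the exact form there before relying on it.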

Amazon Macie is an AI-powered security service that helps you prevent data loss by automatically discovering, classifying, and protecting sensitive data stored in Amazon S3. Amazon Macie uses machine learning to recognize sensitive data such as personally identifiable information (PII) or intellectual property, assigns a business value, and provides visibility into where this data is stored and how it is being used in your organization. Amazon Macie continuously monitors data access activity for anomalies, and delivers alerts when it detects risk of unauthorized access or inadvertent data leaks. You can use Amazon Macie to protect against security threats by continuously monitoring your data and account credentials. Amazon Macie gives you an automated and low-touch way to discover and classify your business data. It provides controls via templated Lambda functions to revoke access or trigger password reset policies upon the discovery of suspicious behavior, unauthorized data access to entities, or third-party applications. When alerts are generated, you can use Amazon Macie for incident response, using Amazon CloudWatch Events to swiftly take action to protect your data. For more information, visit the Amazon Macie documentation.

Access Analyzer for S3 is a feature that helps you simplify permissions management as you set, verify, and refine policies for your S3 buckets and access points. Access Analyzer for S3 monitors your existing access policies to verify that they provide only the required access to your S3 resources. Access Analyzer for S3 evaluates your bucket access policies and helps you discover and swiftly make changes to buckets that do not require access. Access Analyzer for S3 alerts you when you have a bucket that is configured to allow access to anyone on the internet or that is shared with other AWS accounts. You receive findings about the source and level of public or shared access. For example, Access Analyzer for S3 will proactively inform you if unrequired read or write access was provided through an access control list or bucket policy. With these findings, you can immediately set or restore the required access policy. When reviewing results that show potentially shared access to a bucket, you can Block Public Access to the bucket with a single click in the S3 console. You also can drill down into bucket-level permissions settings to configure granular levels of access. For auditing purposes, you can download Access Analyzer for S3 findings as a CSV report. Additionally, the S3 console reports security warnings, errors, and suggestions from IAM Access Analyzer as you author your S3 policies. The console automatically runs more than 100 policy checks to validate your policies. These checks save you time, guide you to resolve errors, and help you apply security best practices.
For more information, visit the IAM Access Analyzer documentation.

S3 Access Grants

Amazon S3 Access Grants map identities in directories such as Active Directory, or AWS Identity and Access Management (IAM) principals, to datasets in S3. This helps you manage data permissions at scale by automatically granting S3 access to end-users based on their corporate identity. Additionally, S3 Access Grants log end-user identity and the application used to access S3 data in AWS CloudTrail. This helps to provide a detailed audit history down to the end-user identity for all access to the data in your S3 buckets.

You should use S3 Access Grants if your S3 data is shared and accessed by many users and applications, where some of their identities are in your corporate directory such as Okta or Entra ID, and you need a simple, auditable way to grant access to these S3 datasets at scale.

You can get started with S3 Access Grants in four steps. First, configure an S3 Access Grants instance. In this step, if you want to use S3 Access Grants with users and groups in your corporate directory, enable AWS IAM Identity Center and connect S3 Access Grants to your Identity Center instance. Second, register a location with S3 Access Grants. During this process, you give S3 Access Grants an IAM role that is used to create temporary S3 credentials that users and applications can use to access S3. Third, define permission grants that specify who can access what. Finally, at the time of access, have your application request temporary credentials from S3 Access Grants and use the credentials vended by S3 Access Grants to access S3.
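
The four steps can be sketched against an injected S3 Control client, for example one created with `boto3.client("s3control")`. The method and parameter names below follow the boto3 S3 Control API as I understand it and should be verified against the current reference. The function names are hypothetical, and the `"*"` sub-prefix, which is meant to cover everything under the registered location, is an assumption.

```python
def configure_grants(s3control, account_id, location_scope, role_arn, grantee_arn):
    """Steps 1-3: typically run once by an administrator."""
    # Step 1: create the S3 Access Grants instance. To use directory identities,
    # an IdentityCenterArn argument would also be passed here.
    s3control.create_access_grants_instance(AccountId=account_id)

    # Step 2: register a location and the IAM role used to vend temporary credentials.
    location = s3control.create_access_grants_location(
        AccountId=account_id, LocationScope=location_scope, IAMRoleArn=role_arn
    )

    # Step 3: define a grant, here READ for one IAM principal under the location.
    return s3control.create_access_grant(
        AccountId=account_id,
        AccessGrantsLocationId=location["AccessGrantsLocationId"],
        AccessGrantsLocationConfiguration={"S3SubPrefix": "*"},
        Grantee={"GranteeType": "IAM", "GranteeIdentifier": grantee_arn},
        Permission="READ",
    )


def request_credentials(s3control, account_id, target, permission="READ"):
    """Step 4: run by the application at access time."""
    response = s3control.get_data_access(
        AccountId=account_id, Target=target, Permission=permission
    )
    return response["Credentials"]
```

Passing the client in as an argument keeps the sketch testable with a stub and makes it clear that steps 1 to 3 and step 4 are usually run under different identities.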

S3 Access Grants supports two kinds of identities: enterprise user or group identities from AWS IAM Identity Center, and AWS IAM principals, including IAM users and roles. When you use S3 Access Grants with AWS IAM Identity Center, you can define data permissions on the basis of directory group memberships. AWS IAM Identity Center is an AWS service that connects to commonly used identity providers, including Entra ID, Okta, Ping, and others. In addition to supporting directory identities via AWS IAM Identity Center, S3 Access Grants also supports permission rules for AWS IAM principals, including IAM users and roles. This is for use cases where you either manage a custom identity federation not through AWS IAM Identity Center but via IAM and SAML assertion (example implementation), or manage application identities based on IAM principals, and still would like to use S3 Access Grants due to its scalability and auditability.

S3 Access Grants offers three access levels: READ, WRITE, and READWRITE. READ allows you to view and retrieve objects from S3. WRITE allows you to write to and delete from S3. READWRITE allows you to do both READ and WRITE.
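
One way to picture how the three levels relate is as a pair of flags, where READWRITE is simply both flags set. This is a toy model of the relationship described above, not how S3 Access Grants evaluates requests; the `Access` class and `permits` function are hypothetical names.

```python
from enum import Flag


class Access(Flag):
    READ = 1        # view and retrieve objects
    WRITE = 2       # write and delete objects
    READWRITE = 3   # READ and WRITE combined


def permits(granted: Access, requested: Access) -> bool:
    """Return True when every flag in the request is covered by the grant."""
    return (granted & requested) == requested


print(permits(Access.READWRITE, Access.READ))
print(permits(Access.READ, Access.WRITE))
```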

No. You can only use the three pre-defined access levels (READ/WRITE/READWRITE) that S3 Access Grants offers.

Yes. You can create up to 100,000 grants per S3 Access Grants instance, and up to 1,000 locations per S3 Access Grants instance.

No. The latency for obtaining temporary credentials from S3 Access Grants is similar to obtaining temporary credentials from AWS STS today. Once you have obtained the credentials from S3 Access Grants, you can reuse unexpired credentials for subsequent requests. For these subsequent requests, there is no additional latency for requests authenticated via S3 Access Grants credentials compared to other methods.
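
Reusing unexpired credentials can be sketched as a small cache keyed by target and permission. The `fetch` callable is a stand-in for whatever requests credentials from S3 Access Grants, and the sketch assumes it returns a dictionary whose `"Expiration"` value is a timezone-aware datetime. The class name and the 60-second safety margin are illustrative choices.

```python
from datetime import datetime, timedelta, timezone


class CredentialCache:
    """Reuse vended credentials until shortly before they expire."""

    def __init__(self, fetch, skew=timedelta(seconds=60)):
        self._fetch = fetch    # callable(target, permission) -> credentials dict
        self._skew = skew      # refresh this long before the real expiry
        self._cache = {}

    def get(self, target, permission, now=None):
        now = now or datetime.now(timezone.utc)
        key = (target, permission)
        creds = self._cache.get(key)
        # Fetch on first use, or when the cached credentials are about to expire.
        if creds is None or creds["Expiration"] - self._skew <= now:
            creds = self._fetch(target, permission)
            self._cache[key] = creds
        return creds
```

Only the first request for a given target and permission pays the credential-vending latency. Later requests reuse the cached entry until it nears expiry.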

If you intend to use S3 Access Grants for directory identities, you will need to set up AWS IAM Identity Center first. AWS IAM Identity Center helps you create or connect your workforce identities, whether the identities are created and stored in Identity Center, or in an external third-party Identity Provider. Refer to the Identity Center documentation for the setup process. Once you have set up the Identity Center instance, you can connect the instance to S3 Access Grants. Thereafter, S3 Access Grants relies on Identity Center to retrieve user attributes such as group membership to evaluate requests and make authorization decisions.

Yes. Today, you initialize your S3 client with IAM credentials associated with your application (for example, IAM role credentials for EC2 or IAM Roles Anywhere, or long-term IAM user credentials). With S3 Access Grants, your application instead needs to obtain S3 Access Grants credentials before initializing the S3 client. These S3 Access Grants credentials are specific to the authenticated user in your application. Once the S3 client is initialized with these S3 Access Grants credentials, it can make requests for S3 data as usual.
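
A minimal sketch of that change: a helper that turns the credentials returned by S3 Access Grants into the keyword arguments an S3 client accepts. The response shape (a `Credentials` dictionary with `AccessKeyId`, `SecretAccessKey`, and `SessionToken`) is assumed from the boto3 `get_data_access` response and should be checked against the current SDK reference. The helper name is hypothetical.

```python
def s3_client_kwargs(data_access_response: dict) -> dict:
    """Map S3 Access Grants credentials to boto3 client keyword arguments."""
    creds = data_access_response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

With boto3 installed, the client would then be created as `boto3.client("s3", **s3_client_kwargs(response))` and used for S3 requests in the usual way.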

S3 Access Grants integrates with EMR and open-source Spark via the S3A connector. In addition, S3 Access Grants integrates with third-party software, including Immuta and Informatica, so that you can centralize permission management. Finally, S3 Access Grants supports Terraform and CloudFormation so that you can provision S3 Access Grants programmatically.

No. S3 Access Grants does not replace IAM and in fact works well with your existing IAM-based data protection strategies (encryption, network, data-perimeter rules). S3 Access Grants is built on IAM primitives and enables you to express finer-grained S3 permissions at scale.

Yes. To utilize S3 Access Grants for objects encrypted with KMS, bucket owners include the necessary KMS permissions in the IAM role that they grant to S3 Access Grants as part of the location registration. S3 Access Grants can then subsequently utilize that IAM role to access the KMS-encrypted objects in the buckets.

You can view and manage your S3 Access Grants permissions using either the S3 Access Grants console experience in the AWS Management Console or the SDK and CLI APIs.

No, you cannot grant public access to data with S3 Access Grants.

The request by the application to initiate a data access session with S3 Access Grants will be recorded in CloudTrail. CloudTrail will distinguish the identity of the user making the request and the application identity accessing the data on the user’s behalf. This helps you audit which end user accessed which data and when.

S3 Access Grants is charged based on the number of requests to S3 Access Grants. See the pricing page for details.

AWS Lake Formation is for use cases where you need to manage access to tabular data (for example, Glue tables) and might want to enforce row- and column-level access. S3 Access Grants is for managing direct S3 permissions, for example on unstructured data such as videos, images, and logs.

Amazon S3 FAQs

General S3 FAQs

What is Amazon S3?

What can I do with Amazon S3?

How can I get started using Amazon S3?

What can I do with Amazon S3 that I cannot do with an on-premises solution?

What kind of data can I store in Amazon S3?

How much data can I store in Amazon S3?

What is an S3 general purpose bucket?

What is an S3 directory bucket?

What is an S3 table bucket?

What is an S3 vector bucket?

What is the difference between a general purpose bucket, a directory bucket, a table bucket, and a vector bucket?

What does Amazon do with my data in Amazon S3?

Does Amazon store its own data in Amazon S3?

How is Amazon S3 data organized?

How do I interface with Amazon S3?

How reliable is Amazon S3?

How will Amazon S3 perform if traffic from my application suddenly spikes?

Does Amazon S3 offer a Service Level Agreement (SLA)?

What is the consistency model for Amazon S3?

Why does strong read-after-write consistency help me?

AWS Regions

Where is my data stored?

Why should I use Amazon S3 storage classes for AWS Dedicated Local Zones?

Why should I use Amazon S3 in AWS Local Zones?

What is an AWS Region?

What is an AWS Availability Zone (AZ)?

How do I decide which AWS Region to store my data in?

In which parts of the world is Amazon S3 available?

Billing

How much does Amazon S3 cost?

How will I be charged and billed for my use of Amazon S3?

Why do prices vary depending on which Amazon S3 Region I choose?

How am I charged for using Versioning?

How am I charged for accessing Amazon S3 through the AWS Management Console?

How am I charged if my Amazon S3 buckets are accessed from another AWS account?

Do your prices include taxes?

Will I incur any data transfer out to the internet charges when I move my data out of AWS?

I want to move my data out of AWS. How do I request free data transfer out to the internet?

Why do I have to request AWS’ pre-approval for free data transfer out to the internet before moving my data out of AWS?

S3 Tables

What are Amazon S3 Tables?

Why should I use S3 Tables?

How do table buckets work?

How do I get started with S3 Tables?

How do I create and delete tables in my table bucket?

How do I query my tables?

What performance can I expect from S3 Tables?

Can I manually overwrite or delete an object in my table bucket?

How do table bucket permissions work?

Do table buckets support concurrent writes to a single table?

What table and data formats do table buckets support?

What table maintenance operations are offered by table buckets?

Can I track and audit changes made to my tables?

Do table buckets support encryption at rest for my table data?

How much does it cost to use S3 Tables?

How does compaction work for S3 Tables?

How does snapshot management work for S3 Tables?

How does unreferenced file removal work for S3 Tables?

How does S3 Tables replication work?

S3 Vectors

How do I get started with S3 Vectors?

How do I create a vector index in a vector bucket?

How do I add vector data to my vector index?

How do I retrieve vectors and their associated metadata?

How do I query my vector data?

What are the durability and availability characteristics of S3 Vectors?

What query performance can I expect with S3 Vectors?

What recall can I expect when querying S3 Vectors?

How can I see a list of vectors in a vector index?

How much does it cost to use S3 Vectors?

Can I use S3 Vectors as my vector store in Amazon Bedrock Knowledge Bases?

Can I use S3 Vectors with Amazon OpenSearch Service?

S3 Files

What is Amazon S3 Files?

Why should I use S3 Files?

How does S3 Files work?

What performance can I expect from S3 Files?

How do I get started using S3 Files with my compute resources?