AWS Blog

New – High-Resolution Custom Metrics and Alarms for Amazon CloudWatch

Amazon CloudWatch has been an important part of AWS since early 2009! Launched as part of a three-pack that also included Auto Scaling and Elastic Load Balancing, CloudWatch has evolved into a very powerful monitoring service for AWS resources and the applications that you run on the AWS Cloud. CloudWatch custom metrics (launched way back in 2011) allow you to store business and application metrics in CloudWatch, view them in graphs, and initiate actions based on CloudWatch Alarms. Needless to say, we have made many enhancements to CloudWatch over the years! Some of the most recent include Extended Metrics Retention (and a User Interface Update), Dashboards, API/CloudFormation Support for Dashboards, and Alarms on Dashboards.

Originally, metrics were stored at five minute intervals; this was reduced to one minute (also known as Detailed Monitoring) in response to customer requests way back in 2010. This was a welcome change, but now it is time to do better. Our customers are streaming video, running flash sales, deploying code tens or hundreds of times per day, and running applications that scale in and out very quickly as conditions change. In all of these situations, a minute is simply too coarse an interval. Important, transient spikes can be missed, disparate (yet related) events are difficult to correlate across time, and the MTTR (mean time to repair) when something breaks is too high.

New High-Resolution Metrics
Today we are adding support for high-resolution custom metrics, with plans to add support for AWS services over time. Your applications can now publish metrics to CloudWatch with 1-second resolution. You can watch the metrics scroll across your screen seconds after they are published and you can set up high-resolution CloudWatch Alarms that evaluate as frequently as every 10 seconds.

Imagine alarming when available memory gets low. This is often a transient condition that can be hard to catch with infrequent samples. With high-resolution metrics, you can see, detect (via an alarm), and act on it within seconds:

In this case the alarm on the right would not fire, and you would not know about the issue.

Publishing High-Resolution Metrics
You can publish high-resolution metrics in two different ways:

API – The PutMetricData function now accepts an optional StorageResolution parameter. Set this parameter to 1 to publish high-resolution metrics; omit it (or set it to 60) to publish at standard 1-minute resolution.
collectd plugin – The CloudWatch plugin for collectd has been updated to support collection and publication of high-resolution metrics. You will need to set the enable_high_definition_metrics parameter in the config file for the plugin.
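
As a minimal sketch of the API option, the following Python builds the arguments for a PutMetricData call with StorageResolution set. The helper names, the MyApp namespace, and the MemoryUsedPercent metric are hypothetical and chosen only for illustration; the commented-out call assumes that boto3 is installed and that credentials are configured.

```python
from datetime import datetime, timezone


def build_metric_datum(name, value, unit="Percent", high_resolution=True, timestamp=None):
    """Build one MetricData entry for a PutMetricData request."""
    return {
        "MetricName": name,
        "Value": float(value),
        "Unit": unit,
        "Timestamp": timestamp or datetime.now(timezone.utc),
        # 1 = high resolution (1-second), 60 = standard (1-minute) resolution
        "StorageResolution": 1 if high_resolution else 60,
    }


def build_put_metric_data_request(namespace, data):
    """Wrap metric data in the top-level PutMetricData arguments."""
    return {"Namespace": namespace, "MetricData": list(data)}


# "MyApp" and "MemoryUsedPercent" are hypothetical names.
request = build_put_metric_data_request(
    "MyApp", [build_metric_datum("MemoryUsedPercent", 87.5)]
)

# With boto3 available, the request could be sent like this:
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(**request)
```

Keeping the request construction separate from the network call makes it easy to check the payload before anything is published.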

CloudWatch metrics are rolled up over time; resolution effectively decreases as the metrics age. Here’s the schedule:

1 second metrics are available for 3 hours.
60 second metrics are available for 15 days.
5 minute metrics are available for 63 days.
1 hour metrics are available for 455 days (15 months).
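
The schedule above can be expressed as a small lookup. This is only a sketch: the function name is hypothetical, the boundaries are treated as inclusive (an assumption), and the 1-second tier applies only to metrics that were published at high resolution.

```python
from datetime import timedelta

# (period in seconds, how long data points at that period are retained)
RETENTION_SCHEDULE = [
    (1, timedelta(hours=3)),
    (60, timedelta(days=15)),
    (300, timedelta(days=63)),
    (3600, timedelta(days=455)),
]


def finest_available_period(age):
    """Return the smallest retained period (in seconds) for data of the given age.

    Returns None when the data is older than the longest retention window.
    """
    for period, retention in RETENTION_SCHEDULE:
        if age <= retention:
            return period
    return None


print(finest_available_period(timedelta(hours=1)))
print(finest_available_period(timedelta(days=30)))
```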

When you call GetMetricStatistics you can specify a period of 1, 5, 10, 30 or any multiple of 60 seconds for high-resolution metrics. You can specify any multiple of 60 seconds for standard metrics.
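
The period rule is easy to check before making a request. Here is a hedged sketch; is_valid_period is a hypothetical helper, not part of any AWS SDK.

```python
HIGH_RESOLUTION_PERIODS = (1, 5, 10, 30)


def is_valid_period(period_seconds, high_resolution):
    """Check a GetMetricStatistics period against the rule described above."""
    if period_seconds <= 0:
        return False
    if period_seconds % 60 == 0:
        # Any multiple of 60 seconds works for both kinds of metrics.
        return True
    # Sub-minute periods are only meaningful for high-resolution metrics.
    return high_resolution and period_seconds in HIGH_RESOLUTION_PERIODS


print(is_valid_period(10, high_resolution=True))
print(is_valid_period(10, high_resolution=False))
```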

A Quick Demo
I grabbed my nearest EC2 instance, installed the latest version of collectd and the Python plugin:

$ sudo yum install collectd collectd-python

Then I downloaded the setup script for the plugin, made it executable, and ran it:

$ wget https://raw.githubusercontent.com/awslabs/collectd-cloudwatch/master/src/setup.py
$ chmod a+x setup.py
$ sudo ./setup.py

I had already created a suitable IAM Role and added it to my instance; it was automatically detected during setup. I was asked to enable the high resolution metrics:

collectd started running and publishing metrics within seconds. I opened up the CloudWatch Console to take a look:

Then I zoomed in to see the metrics in detail:

I also created an alarm that will check the memory.percent.used metric at 10 second intervals. This will make it easier for me to detect situations where a lot of memory is being used for a short period of time:
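
If you prefer to create a similar alarm programmatically, the arguments might look like the sketch below. The alarm name, the collectd namespace, the threshold, and the number of evaluation periods are all assumptions (the values I used in the console are not listed here), and any dimensions that the plugin attaches to the metric are omitted for brevity.

```python
def build_high_resolution_alarm(alarm_name, namespace, metric_name, threshold,
                                period=10, evaluation_periods=3):
    """Build keyword arguments for a CloudWatch PutMetricAlarm call."""
    # High-resolution alarms evaluate as frequently as every 10 seconds.
    if period < 10:
        raise ValueError("alarm period must be at least 10 seconds")
    return {
        "AlarmName": alarm_name,
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Average",
        "Period": period,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
    }


alarm = build_high_resolution_alarm(
    alarm_name="memory-used-high",      # hypothetical alarm name
    namespace="collectd",               # assumed namespace for the plugin's metrics
    metric_name="memory.percent.used",  # the metric from the demo
    threshold=90,                       # assumed threshold, in percent
)

# With boto3 available, the alarm could be created like this:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```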

Now Available
High-resolution custom metrics and alarms are available now in all Public AWS Regions, with support for AWS GovCloud (US) coming soon.

As was already the case, you can store 10 metrics at no charge every month; see the CloudWatch Pricing page for more information. Pricing for high-resolution metrics is identical to that for standard resolution metrics, with volume tiers that allow you to realize savings (on a per-metric basis) when you use more metrics. High-resolution alarms are priced at $0.30 per alarm per month.

Now Available: Three New AWS Specialty Training Courses

AWS Training allows you to learn from the experts so you can advance your knowledge with practical skills and get more out of the AWS Cloud. Today I am happy to announce that three of our most popular training bootcamps (a staple at AWS re:Invent and AWS Global Summits) are becoming part of our permanent instructor-led training portfolio:

Building a Serverless Data Lake – Teaches you how to design, build, and operate a serverless data lake solution with AWS services.
Secrets to Successful Cloud Transformations – Teaches you how to select the right strategy, people, migration plan, and financial management methodology needed when moving your workloads to the cloud. Does not require advanced technical expertise.
Running Container-Enabled Microservices on AWS – Teaches you how to manage and scale container-enabled applications by using Amazon EC2 Container Service (ECS).

These one-day courses are intended for individuals who would like to dive deeper into a specialized topic with an expert trainer.

You can explore our complete course catalog, and you can search for a public class near you within the AWS Training and Certification Portal. You can also request a private onsite training session for your team by contacting us.

— Jeff;

New – GPU-Powered Streaming Instances for Amazon AppStream 2.0

We launched Amazon AppStream 2.0 at re:Invent 2016. This application streaming service allows you to deliver Windows applications to a desktop browser.

AppStream 2.0 is fully managed and provides consistent, scalable performance by running applications on general purpose, compute optimized, and memory optimized streaming instances, with delivery via NICE DCV – a secure, high-fidelity streaming protocol. Our enterprise and public sector customers have started using AppStream 2.0 in place of legacy application streaming environments that are installed on-premises. They use AppStream 2.0 to deliver both commercial and line of business applications to a desktop browser. Our ISV customers are using AppStream 2.0 to move their applications to the cloud as-is, with no changes to their code. These customers focus on demos, workshops, and commercial SaaS subscriptions.

We are getting great feedback on AppStream 2.0 and have been adding new features very quickly (even by AWS standards). So far this year we have added an image builder, federated access via SAML 2.0, CloudWatch monitoring, Fleet Auto Scaling, Simple Network Setup, persistent storage for user files (backed by Amazon S3), support for VPC security groups, and built-in user management including web portals for users.

New GPU-Powered Streaming Instances
Many of our customers have told us that they want to use AppStream 2.0 to deliver specialized design, engineering, HPC, and media applications to their users. These applications are generally graphically intensive and are designed to run on expensive, high-end PCs in conjunction with a GPU (Graphics Processing Unit). Due to the hardware requirements of these applications, cost considerations have traditionally kept them out of situations where part-time or occasional access would otherwise make sense. Recently, another requirement has come to the forefront. These applications almost always need shared, read-write access to large amounts of sensitive data that is best stored, processed, and secured in the cloud. In order to meet the needs of these users and applications, we are launching two new types of streaming instances today:

Graphics Desktop – Based on the G2 instance type, Graphics Desktop instances are designed for desktop applications that use CUDA, DirectX, or OpenGL for rendering. These instances are equipped with 15 GiB of memory and 8 vCPUs. You can select this instance family when you build an AppStream image or configure an AppStream fleet:

Graphics Pro – Based on the brand-new G3 instance type, Graphics Pro instances are designed for high-end, high-performance applications that can use the NVIDIA APIs and/or need access to large amounts of memory. These instances are available in three sizes, with 122 to 488 GiB of memory and 16 to 64 vCPUs. Again, you can select this instance family when you configure an AppStream fleet:

To learn more about how to launch, run, and scale a streaming application environment, read Scaling Your Desktop Application Streams with Amazon AppStream 2.0.

As I noted earlier, you can use either of these two instance types to build an AppStream image. This will allow you to test and fine tune your applications and to see the instances in action.

Streaming Instances in Action
We’ve been working with several customers during a private beta program for the new instance types. Here are a few stories (and some cool screen shots) to show you some of the applications that they are streaming via AppStream 2.0:

AVEVA is a world leading provider of engineering design and information management software solutions for the marine, power, plant, offshore and oil & gas industries. As part of their work on massive capital projects, their customers need to bring many groups of specialist engineers together to collaborate on the creation of digital assets. In order to support this requirement, AVEVA is building SaaS solutions that combine the streamed delivery of engineering applications with access to a scalable project data environment that is shared between engineers across the globe. The new instances will allow AVEVA to deliver their engineering design software in SaaS form while maximizing quality and performance. Here’s a screen shot of their Everything 3D app being streamed from AppStream:

Nissan, a Japanese multinational automobile manufacturer, trains its automotive specialists using 3D simulation software running on expensive graphics workstations. The training software, developed by The DiSti Corporation, allows its specialists to simulate maintenance processes by interacting with realistic 3D models of the vehicles they work on. AppStream 2.0’s new graphics capability now allows Nissan to deliver these training tools in real time, with up to date content, to a desktop browser running on low-cost commodity PCs. Their specialists can now interact with highly realistic renderings of a vehicle that allows them to train for and plan maintenance operations with higher efficiency.

Cornell University is an American private Ivy League and land-grant doctoral university located in Ithaca, New York. They deliver advanced 3D tools such as Autodesk AutoCAD and Inventor to students and faculty to support their course work, teaching, and research. Until now, these tools could only be used on GPU-powered workstations in a lab or classroom. AppStream 2.0 allows them to deliver the applications to a web browser running on any desktop, where they run as if they were on a local workstation. Their users are no longer limited by available workstations in labs and classrooms, and can bring their own devices and have access to their course software. This increased flexibility also means that faculty members no longer need to take lab availability into account when they build course schedules. Here’s a copy of Autodesk Inventor Professional running on AppStream at Cornell:

Now Available
Both of the graphics streaming instance families are available in the US East (Northern Virginia), US West (Oregon), EU (Ireland), and Asia Pacific (Tokyo) Regions and you can start streaming from them today. Your applications must run in a Windows Server 2012 R2 environment, and can make use of DirectX, OpenGL, CUDA, OpenCL, and Vulkan.

With prices in the US East (Northern Virginia) Region starting at $0.50 per hour for Graphics Desktop instances and $2.05 per hour for Graphics Pro instances, you can now run your simulation, visualization, and HPC workloads in the AWS Cloud on an economical, pay-by-the-hour basis. You can also take advantage of fast, low-latency access to Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), AWS Lambda, Amazon Redshift, and other AWS services to build processing workflows that handle pre- and post-processing of your data.

— Jeff;

Use CloudFormation StackSets to Provision Resources Across Multiple AWS Accounts and Regions

AWS CloudFormation helps AWS customers implement an Infrastructure as Code model. Instead of setting up their environments and applications by hand, they build a template and use it to create all of the necessary resources, collectively known as a CloudFormation stack. This model removes opportunities for manual error, increases efficiency, and ensures consistent configurations over time.

Today I would like to tell you about a new feature that makes CloudFormation even more useful. This feature is designed to help you to address the challenges that you face when you use Infrastructure as Code in situations that include multiple AWS accounts and/or AWS Regions. As a quick review:

Accounts – As I have told you in the past, many organizations use a multitude of AWS accounts, often using AWS Organizations to arrange the accounts into a hierarchy and to group them into Organizational Units, or OUs (read AWS Organizations – Policy-Based Management for Multiple AWS Accounts to learn more). Our customers use multiple accounts for business units, applications, and developers. They often create separate accounts for development, testing, staging, and production on a per-application basis.

Regions – Customers also make great use of the large (and ever-growing) set of AWS Regions. They build global applications that span two or more regions, implement sophisticated multi-region disaster recovery models, replicate S3, Aurora, PostgreSQL, and MySQL data in real time, and choose locations for storage and processing of sensitive data in accord with national and regional regulations.

This expansion into multiple accounts and regions comes with some new challenges with respect to governance and consistency. Our customers tell us that they want to make sure that each new account is set up in accord with their internal standards. Among other things, they want to set up IAM users and roles, VPCs and VPC subnets, security groups, Config Rules, logging, and AWS Lambda functions in a consistent and reliable way.

Introducing StackSet
In order to address these important customer needs, we are launching CloudFormation StackSet today. You can now define an AWS resource configuration in a CloudFormation template and then roll it out across multiple AWS accounts and/or Regions with a couple of clicks. You can use this to set up a baseline level of AWS functionality that addresses the cross-account and cross-region scenarios that I listed above. Once you have set this up, you can easily expand coverage to additional accounts and regions.

This feature always works on a cross-account basis. The administrator account owns one or more StackSets and controls deployment to one or more target accounts. The administrator account must include an assumable IAM role and the target accounts must delegate trust to the administrator account. To learn how to do this, read Prerequisites in the StackSet Documentation.

Each StackSet references a CloudFormation template and contains lists of accounts and regions. All operations apply to the cross-product of the accounts and regions in the StackSet. If the StackSet references three accounts (A1, A2, and A3) and four regions (R1, R2, R3, and R4), there are twelve targets:

Region R1: Accounts A1, A2, and A3.
Region R2: Accounts A1, A2, and A3.
Region R3: Accounts A1, A2, and A3.
Region R4: Accounts A1, A2, and A3.
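
The cross-product is simple to enumerate. Here is a small sketch that uses the placeholder account and region names from the example above; stack_targets is a hypothetical helper.

```python
from itertools import product


def stack_targets(accounts, regions):
    """Return every (region, account) pair covered by a StackSet."""
    return [(region, account) for region, account in product(regions, accounts)]


targets = stack_targets(["A1", "A2", "A3"], ["R1", "R2", "R3", "R4"])
print(len(targets))  # 12
for region, account in targets:
    print(region, account)
```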

Deploying a template initiates creation of a CloudFormation stack in an account/region pair. Templates are deployed to regions sequentially (you control the order) and, within each region, to multiple accounts (you control the amount of parallelism). You can also set an error threshold that will terminate deployments if stack creation fails.
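
To make the ordering concrete, here is a simplified model of that behavior: regions in sequence, accounts in batches, and an error threshold. It is a sketch under stated assumptions, not the service's actual algorithm. The function names and the create_stack callback are hypothetical, accounts within a batch are processed one after another instead of in parallel, and failures are counted across the whole deployment.

```python
def plan_batches(region_order, accounts, max_concurrent):
    """Yield (region, batch_of_accounts) pairs in deployment order."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    for region in region_order:
        for start in range(0, len(accounts), max_concurrent):
            yield region, accounts[start:start + max_concurrent]


def deploy(region_order, accounts, max_concurrent, failure_tolerance, create_stack):
    """Call create_stack(region, account) batch by batch.

    Stops after the first batch that pushes the failure count above
    failure_tolerance and returns the (region, account, ok) results so far.
    """
    failures = 0
    results = []
    for region, batch in plan_batches(region_order, accounts, max_concurrent):
        for account in batch:  # a real deployment would run a batch in parallel
            ok = create_stack(region, account)
            results.append((region, account, ok))
            if not ok:
                failures += 1
        if failures > failure_tolerance:
            break
    return results


# Pretend that stack creation always fails in account A2.
results = deploy(["R1", "R2"], ["A1", "A2", "A3"], 2, 0,
                 lambda region, account: account != "A2")
print(results)
```

With a tolerance of zero, the run stops after the first batch in R1, so A3 and all of R2 are never attempted.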

You can use your existing CloudFormation templates (taking care to make sure that they are ready to work across accounts and regions), create new ones, or use one of our sample templates. We are launching with support for the AWS partition (all public regions except those in China), and expect to expand it to the others before too long.

Using StackSets
You can create and deploy StackSets from the CloudFormation Console, via the CloudFormation APIs, or from the command line.

Using the Console, I start by clicking on Create StackSet. I can use my own template or one of the samples. I’ll use the last sample (Add config rule encrypted volumes):

I click on View template to learn more about the template and the rule:

I give my StackSet a name. The template that I selected accepts an optional parameter, and I can enter it at this time:

Next, I choose the accounts and regions. I can enter account numbers directly, reference an AWS organizational unit, or upload a list of account numbers:

I can set up the regions and control the deployment order:

I can also set the deployment options. Once I am done I click on Next to proceed:

I can add tags to my StackSet. They will be applied to the AWS resources created during the deployment:

The deployment begins, and I can track the status from the Console:

I can open up the Stacks section to see each stack. Initially, the status of each stack is OUTDATED, indicating that the template has yet to be deployed to the stack; this will change to CURRENT after a successful deployment. If a stack cannot be deleted, the status will change to INOPERABLE.

After my initial deployment, I can click on Manage StackSet to add additional accounts, regions, or both, to create additional stacks:

Now Available
This new feature is available now and you can start using it today at no extra charge (you pay only for the AWS resources created on your behalf).

— Jeff;

PS – If you create some useful templates and would like to share them with other AWS users, please send a pull request to our AWS Labs GitHub repo.

New – S3 Sync capability for EC2 Systems Manager: Query & Visualize Instance Software Inventory

It is now essential, with the fast-paced lives we all seem to lead, to find tools that make it easier to manage our time, our home, and our work. With the pace of technology, it is just as important for technologists to find tools that make it easy to manage their systems. With the introduction of the Amazon EC2 Systems Manager service during re:Invent 2016, we hoped to provide assistance with the management of your systems and software.

If you are not yet familiar with Amazon EC2 Systems Manager, let me introduce this capability to you. EC2 Systems Manager is a management service that helps you create system images, collect software inventory, configure both Windows and Linux operating systems, and apply operating system patches. This collection of capabilities allows remote and secure administration for managed EC2 instances or hybrid environments with on-premises machines configured for Systems Manager. With this EC2 service capability, you can additionally record and regulate the software configuration of these instances using AWS Config.

Recently we added another feature to the inventory capability of EC2 Systems Manager to help you capture metadata about your application deployments, OS, and system configurations: Resource Data Sync, also known as S3 Sync. S3 Sync for EC2 Systems Manager allows you to aggregate captured inventory data automatically from instances in different regions and multiple accounts and store this information in Amazon S3. With the data in S3, you can run queries against the instance inventory using Amazon Athena, and if you choose, use Amazon QuickSight to visualize the software inventory of your instances.

Let’s look at how we can use the Resource Data Sync (S3 Sync) feature with Amazon Athena and Amazon QuickSight to query and visualize the software inventory of instances. First things first, I will make sure that I have completed the Amazon EC2 Systems Manager prerequisites: configuration of the roles and permissions in AWS Identity and Access Management (IAM), and installation of the SSM Agent on my managed instances. I’ll quickly launch a new EC2 instance for this Systems Manager example.

Now that my instance has launched, I will need to install the SSM Agent onto my aws-blog-demo-instance. One thing I should mention is that it is essential that your IAM user account has administrator access in the VPC in which your instance was launched. You can create a separate IAM user account for instances with EC2 Systems Manager, by following the instructions noted here: http://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-configuring-access-policies.html#sysman-access-user. Since I am using an account with administrative access, I won’t need to create an IAM user to continue installing the SSM Agent on my instance.

To install the SSM Agent, I will SSH into my instance, create a temporary directory, and pull down and install the necessary SSM Agent software for my Amazon Linux EC2 instance. An EC2 instance based upon a Windows AMI already includes the SSM Agent so I would not need to install the agent for Windows instances.

To complete the aforementioned tasks, I will issue the following commands:

$ mkdir /tmp/ssm
$ cd /tmp/ssm
$ sudo yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm

You can find the instructions to install the SSM Agent based upon the type of operating system of your EC2 instance in the Installing SSM Agent section of the EC2 Systems Manager user guide.

Now that I have the Systems Manager agent running on my instance, I’ll need an S3 bucket to capture the inventory data. I’ll create an S3 bucket, aws-blog-tew-posts-ec2, to capture the inventory data from my instance. I will also need to add a bucket policy to ensure that EC2 Systems Manager has permissions to write to my bucket. Adding the bucket policy is simple: I select the Permissions tab in the S3 Console and then click the Bucket Policy button. Then I specify a bucket policy that gives Systems Manager the ability to check bucket permissions and add objects to the bucket. With the policy in place, my S3 bucket is now ready to receive the instance inventory data.
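
A bucket policy of this kind might look like the sketch below, which builds the JSON document in Python. The two statements correspond to the two abilities described above (checking bucket permissions and adding objects). The ssm.amazonaws.com service principal, the statement IDs, and the bucket-owner-full-control condition are assumptions based on the usual shape of such policies, so compare the result with the Systems Manager documentation before using it.

```python
import json


def build_inventory_bucket_policy(bucket_name, prefix=""):
    """Build a bucket policy that lets Systems Manager write inventory data."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    key_prefix = prefix.strip("/")
    objects_arn = f"{bucket_arn}/{key_prefix}/*" if key_prefix else f"{bucket_arn}/*"
    service = {"Service": "ssm.amazonaws.com"}  # assumed service principal
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # lets the service check the bucket's permissions
                "Sid": "SSMBucketPermissionsCheck",
                "Effect": "Allow",
                "Principal": service,
                "Action": "s3:GetBucketAcl",
                "Resource": bucket_arn,
            },
            {   # lets the service add inventory objects to the bucket
                "Sid": "SSMBucketDelivery",
                "Effect": "Allow",
                "Principal": service,
                "Action": "s3:PutObject",
                "Resource": objects_arn,
                "Condition": {
                    "StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
                },
            },
        ],
    }


policy = build_inventory_bucket_policy("aws-blog-tew-posts-ec2")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the Bucket Policy editor in the S3 Console.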

To configure the inventory collection using this bucket, I will head back over to the EC2 console and select Managed Resources under the Systems Manager Shared Resources section, then click the Setup Inventory button.

In the Targets section, I’ll manually select the EC2 instance I created earlier from which I want to capture the inventory data. You should note that you can select multiple instances for which to capture inventory data if desired.

Scrolling down to the Schedule section, I will choose 30 minutes as the interval at which inventory metadata is gathered from my instance. Since I’m keeping the default Enabled value for all of the options in the Parameters section, and I am not going to write the association logs to S3 at this time, I only need to click the Setup Inventory button. When the confirmation dialog comes up noting that the Inventory has been set up successfully, I will click the Close button to go back to the main EC2 console.

Back in the EC2 console, I will set up my Resource Data Sync using my aws-blog-tew-posts-ec2 S3 bucket for my Managed Instance by selecting the Resource Data Syncs button.

To set up my Resource Data Sync, I will enter the Sync Name, Bucket Name, Bucket Prefix, and the Bucket Region in which my bucket is located. You should also be aware that the Resource Data Sync and the target S3 bucket can be located in different regions. Another thing to note is that the CLI command for completing this step is displayed, in case I opt to use the AWS CLI to create the Resource Data Sync. I click the Create button and my Resource Data Sync setup is complete.
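
The same setup can be scripted. The sketch below builds arguments for the Systems Manager CreateResourceDataSync call. The sync name and the us-east-1 region are assumptions made for illustration, and the commented-out call assumes that boto3 is installed and that credentials are configured.

```python
def build_resource_data_sync(sync_name, bucket_name, bucket_region, prefix=None):
    """Build keyword arguments for an SSM CreateResourceDataSync call."""
    destination = {
        "BucketName": bucket_name,
        "SyncFormat": "JsonSerDe",
        "Region": bucket_region,
    }
    if prefix:
        destination["Prefix"] = prefix
    return {"SyncName": sync_name, "S3Destination": destination}


sync_request = build_resource_data_sync(
    sync_name="aws-blog-demo-sync",        # hypothetical sync name
    bucket_name="aws-blog-tew-posts-ec2",  # the bucket created earlier
    bucket_region="us-east-1",             # assumed bucket region
)

# With boto3 available, the sync could be created like this:
#   import boto3
#   boto3.client("ssm").create_resource_data_sync(**sync_request)
```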

After a few minutes, I can go to my S3 bucket and see that my instance inventory data is syncing to my S3 bucket successfully.

With this data syncing directly into S3, I can take advantage of the querying capabilities of the Amazon Athena service to view and query my instance inventory data. I create a folder, athenaresults, within my aws-blog-tew-posts-ec2 S3 bucket, and now off to the Athena console I go!

In the Athena console, I will change the Settings option to point to my athenaresults folder in my bucket by entering: s3://aws-blog-tew-posts-ec2/athenaresults. Now I can create a database named tewec2ssminventorydata for capturing and querying the data sent from SSM to my bucket, by entering in a CREATE DATABASE SQL statement in the Athena editor and clicking the Run Query button.

With my database created, I’ll switch to my tewec2ssminventorydata database and create a table to grab the inventory application data from the S3 bucket synced from the Systems Manager Resource Data Sync.

As the query success message notes, I’ll run the MSCK REPAIR TABLE tew_awsapplication command to load the partitions for the newly created table. Now I can run queries against the inventory data being synced from the EC2 Systems Manager to my Amazon S3 buckets. You can learn more about querying data with Amazon Athena on the product page and you can review my blog post on querying and encrypting data with Amazon Athena.
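
For reference, the three Athena statements used in these steps might look like the following sketch, assembled here as Python strings. The database, table, and bucket names come from the walkthrough. The column list, the partition columns, and the AWS:Application/ location are assumptions, so adjust them to match the layout of the data in your own bucket.

```python
DATABASE = "tewec2ssminventorydata"
TABLE = "tew_awsapplication"
BUCKET = "aws-blog-tew-posts-ec2"


def inventory_statements(database, table, bucket):
    """Return the CREATE DATABASE, CREATE TABLE, and MSCK REPAIR statements."""
    create_database = f"CREATE DATABASE IF NOT EXISTS {database}"
    # Columns, partitions, and location below are illustrative assumptions.
    create_table = f"""
CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (
  name string,
  applicationtype string,
  publisher string,
  version string
)
PARTITIONED BY (accountid string, region string, resourcetype string)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://{bucket}/AWS:Application/'
""".strip()
    repair_table = f"MSCK REPAIR TABLE {table}"
    return [create_database, create_table, repair_table]


statements = inventory_statements(DATABASE, TABLE, BUCKET)
for statement in statements:
    print(statement)
    print()
```

Each statement can be pasted into the Athena query editor and run in order.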

Now that I have query capability of this data it also means I can use Amazon QuickSight to visualize my data.

If you haven’t created an Amazon QuickSight account, you can quickly follow the getting started instructions to set up your QuickSight account. Since I already have a QuickSight account, I’ll go to the QuickSight dashboard and select the Manage Data button. On my Your Data Sets screen, I’ll select the New data set button.

Now I can create a dataset from my Athena table holding the Systems Manager Inventory Data by selecting Athena as my data source.

This takes me through a series of steps to create my data source from the Athena tewec2ssminventorydata database and the tew_awsapplication table.

After choosing Visualize to create my data set and analyze the data in the Athena table, I am now taken to the QuickSight dashboard where I can build graphs and visualizations for my EC2 Systems Manager inventory data.

Adding the applicationtype field to my graph allows me to build a visualization using this data.

Summary

With the new Amazon EC2 Systems Manager Resource Data Sync capability to send inventory data to Amazon S3 buckets, you can now create robust data queries using Amazon Athena and build visualizations of this data with Amazon QuickSight. You no longer have to create custom scripts to aggregate your instance inventory data to an Amazon S3 bucket. This data can now be automatically synced and stored in Amazon S3, allowing you to keep your data even after your instance has been terminated. This new EC2 Systems Manager capability also allows you to send inventory data to S3 from multiple accounts and different regions.

To learn more about Amazon EC2 Systems Manager and EC2 Systems Manager Inventory, take a look at the product pages for the service. You can also build your own query and visualization solution for the EC2 instance inventory data captured in S3 by checking out the EC2 Systems Manager user guide on Using Resource Data Sync to Aggregate Inventory Data.

In the words of my favorite Vulcan, “Live long, query and visualize and prosper” with EC2 Systems Manager.

– Tara

Hightail — Empowering Creative Collaboration in the Cloud

Hightail – formerly YouSendIt – streamlines how creative work is reviewed, improved, and approved by helping more than 50 million professionals around the world get great content in front of their audiences faster. Since its debut in 2004 as a file sharing company, Hightail shifted its strategic direction to focus on delivering value-added creative collaboration services and boasts a strong lineup of name-brand customers.

In today’s guest post, Hightail’s SVP of Technology Shiva Paranandi tells the company’s migration story, moving petabytes of data from on-premises to the cloud. He highlights their cloud vendor evaluation process and reasons for going all-in on AWS.

Hightail started as a way to help people easily share and store large files, but has since evolved into a creative collaboration tool. We became a place where users could not only control and share their digital assets, but also assemble their creative teams, connect with clients, develop creative workflows, and manage projects from start to finish. We now power collaboration services for major brands such as Lionsgate and Jimmy Kimmel Live!. With a growing list of domestic and international clients, we required more internal focus on product development and serving the users. We found that running our own data centers consumed more time, money, and manpower than we were willing to devote.

We needed an approach that would help us iterate more rapidly to meet customer needs and dramatically improve our time to market. We wanted to reduce data center costs and have the flexibility to scale up quickly in any given region around the globe. Setting up a data center in a new location took so long that it was limiting the pace of growth that we could achieve. In addition, we were tired of buying ahead of our needs, which meant we had storage capacity that we did not even use. We required a storage solution that was both tiered and highly scalable to reduce costs by allowing us to keep infrequently used data in inactive storage while also allowing us to resurface it quickly at the customer’s request. Our main drivers were agility and innovation, and the cloud enables these in a significant way. Given that, we decided to adopt a cloud-first policy that would enable us to spend time and money on initiatives that differentiate our business, instead of putting resources into managing our storage and computing infrastructure.

Comparing AWS Against Cloud Competitors

To kick off the migration, we did our due diligence by evaluating a variety of cloud vendors, including AWS, Google, IBM, and Microsoft. AWS stuck out as the clear winner for us. At one point, we considered combining services from multiple cloud providers to meet our needs, but decided the best route was to use AWS exclusively. When we factored in training, synchronization, support, and system availability along with other migration and management elements, it was just not practical to take a multi-cloud approach. With the best cost savings and an unmatched ecosystem of partner solutions, we did not need anyone else and chose to go all-in on AWS.

By migrating to AWS, we were able to secure the lowest cost-per-gigabyte pricing, gain access to a rich ecosystem, quickly develop in-house talent, and maintain SOC 2 compliance. The ecosystem was particularly important to us and set AWS apart from its competitors with its expansive list of partners. In fact, all the vendors we depend on for services such as previewing images, encoding videos, and serving up presentations were already a part of the network, so we were easily able to leverage our existing investments and expertise. If we had gone with a different provider, it would have meant moving away from a platform that was already working so well for us, which was not the desired outcome. Also, the amount of talent we were able to build up in house on AWS technologies was astounding. Training our internal team to work with AWS was a simple process using available resources such as AWS conferences, training materials, and support.

Migrating Petabytes of Data

Going with AWS made things easier. In many instances, it gave us better functionality than what we were using in house. We moved multiple petabytes of data from on-premises storage to AWS with ease. AWS gave us great speeds with Direct Connect, so we were able to push all the data in a little more than three months with no user impact. We employed AWS Key Management Service to keep our data secure, which eased our minds through the move. We performed extensive QA testing before flipping users over to ensure low customer impact, using methods such as checksums between our data center and the data that got pushed to AWS.
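The checksum comparison mentioned above can be sketched in a few lines of Node.js. This is only an illustration and not Hightail's actual tooling: the function names and the choice of SHA-256 are assumptions, and the buffers stand in for the on-premises copy of a file and the copy read back after the transfer.

```javascript
'use strict';
const crypto = require('crypto');

/* Compute a hex-encoded SHA-256 digest of a buffer (the hash choice is an assumption). */
function sha256Hex(data) {
    return crypto.createHash('sha256').update(data).digest('hex');
}

/* Hypothetical helper: true when the source and migrated copies have identical digests. */
function copiesMatch(sourceData, migratedData) {
    return sha256Hex(sourceData) === sha256Hex(migratedData);
}

/* Stand-in data; a real check would stream the file contents from both locations. */
const original = Buffer.from('example file contents');
const migrated = Buffer.from('example file contents');
const corrupted = Buffer.from('example file c0ntents');

console.log(copiesMatch(original, migrated));   // identical bytes
console.log(copiesMatch(original, corrupted));  // one byte differs
```

For very large objects, a streaming hash would likely be a better fit than loading whole files into memory.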

Our new platform on AWS has greatly improved our user experience. We have seen huge improvement in reliability, performance, and uptime—all critical in our line of business. We are now able to achieve upload and download speeds up to 17 times faster than our previous data centers, and uptime has increased by orders of magnitude. Also, the time it takes us to deploy services to a new region has been cut by more than 90%. It used to take us at least six months to get a new region online, and now we can get a region up and running in less than three weeks. On AWS, we can even replicate data at the bucket level across regions for disaster recovery purposes.

To cut costs, we divided our storage infrastructure into frequently and infrequently accessed data. Tiered storage in Amazon S3 has been a huge advantage, allowing us to optimize our storage costs so we have more to invest in product development. We can now move data from inactive to active tiers instantly to meet customer needs, and we have eliminated the need to overprovision our storage infrastructure. It is refreshing to see services automatically scale up or down during peak load times, and to know that we are only paying for what we need.

Overall, we achieved our key strategic goal of focusing more on development and less on infrastructure. Our migration felt seamless, and the progress we were able to share is a true testament to how easy it has been for us to run our workloads on AWS. We attribute part of our successful migration to the dedicated support provided by the AWS team. They were pretty awesome. We had a couple of their technicians available 24/7 via chat, which proved to be essential during this large-scale migration.

-Shiva Paranandi, SVP of Technology at Hightail

Learning More

Learn more about cost-effective tiered data storage with Amazon S3, or dive deeper into our AWS Partner Ecosystem to see which solutions could best serve the needs of your company.

New: Server-Side Encryption for Amazon Kinesis Streams

In this age of smart homes, big data, IoT devices, mobile phones, social networks, chatbots, and game consoles, streaming data scenarios are everywhere. Amazon Kinesis Streams enables you to build custom applications that can capture, process, analyze, and store terabytes of data per hour from thousands of streaming data sources. Since Amazon Kinesis Streams allows applications to process data concurrently from the same Kinesis stream, you can build parallel processing systems. For example, you can emit processed data to Amazon S3, perform complex analytics with Amazon Redshift, and even build robust, serverless streaming solutions using AWS Lambda.

Kinesis Streams enables several streaming use cases for consumers, and now we are making the service more effective for securing your data in motion by adding server-side encryption (SSE) support for Kinesis Streams. With this new Kinesis Streams feature, you can now enhance the security of your data and/or meet any regulatory and compliance requirements for any of your organization’s data streaming needs.
In fact, Kinesis Streams is now one of the AWS Services in Scope for the Payment Card Industry Data Security Standard (PCI DSS) compliance program. PCI DSS is a proprietary information security standard administered by the PCI Security Standards Council, which was founded by key financial institutions. PCI DSS compliance applies to all entities that store, process, or transmit cardholder data and/or sensitive authentication data, including service providers. You can request the PCI DSS Attestation of Compliance and Responsibility Summary using AWS Artifact. But the good news about compliance with Kinesis Streams doesn’t stop there. Kinesis Streams is now also FedRAMP compliant in AWS GovCloud. FedRAMP stands for Federal Risk and Authorization Management Program and is a U.S. government-wide program that delivers a standard approach to the security assessment, authorization, and continuous monitoring for cloud products and services. You can learn more about FedRAMP compliance with AWS Services here.

Now are you ready to get into the keys? Get it, instead of "get into the weeds"? Okay, a little corny, but it was the best I could do. Coming back to SSE for Kinesis Streams, let me explain the flow of server-side encryption with Kinesis. Each data record and partition key put into a Kinesis stream using the PutRecord or PutRecords API is encrypted using an AWS Key Management Service (KMS) master key. With that master key, Kinesis Streams uses the 256-bit Advanced Encryption Standard in Galois/Counter Mode (AES-256-GCM) to encrypt the incoming data.
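To make the encryption step a bit more concrete, here is a minimal local sketch of AES-256-GCM using the crypto module built into Node.js. It is not how Kinesis Streams is implemented internally. The randomly generated key is a stand-in for key material that would come from your KMS master key, and the function names are hypothetical.

```javascript
'use strict';
const crypto = require('crypto');

/* Encrypt a record with AES-256-GCM. The 32-byte key is a local stand-in (assumption). */
function encryptRecord(key, plaintext) {
    const iv = crypto.randomBytes(12);   // 96-bit nonce, the usual size for GCM
    const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
    const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
    return { iv, ciphertext, tag: cipher.getAuthTag() };
}

/* Decrypt and authenticate; throws if the key is wrong or the data was tampered with. */
function decryptRecord(key, record) {
    const decipher = crypto.createDecipheriv('aes-256-gcm', key, record.iv);
    decipher.setAuthTag(record.tag);
    return Buffer.concat([decipher.update(record.ciphertext), decipher.final()]).toString('utf8');
}

const dataKey = crypto.randomBytes(32);   // 256-bit key
const sealed = encryptRecord(dataKey, '{"sensor":"t1","value":21.5}');
console.log(decryptRecord(dataKey, sealed));
```

GCM is an authenticated mode, so decryption verifies integrity as well as recovering the plaintext.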

To enable server-side encryption with Kinesis Streams for new or existing streams, you can use the Kinesis management console or one of the available AWS SDKs. Additionally, you can audit the history of your stream encryption, validate the encryption status of a certain stream in the Kinesis Streams console, or use the AWS CloudTrail service to check that the PutRecord or GetRecords transactions are encrypted.
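If you prefer the SDK route, the request could be assembled along these lines. This is a sketch under stated assumptions: it targets the StartStreamEncryption API, the helper names are hypothetical, and the kinesis argument is assumed to be a Kinesis client from the AWS SDK for JavaScript (v2), which is not loaded here so that the snippet runs on its own.

```javascript
'use strict';

/* Build the request parameters for the Kinesis StartStreamEncryption API. */
function buildStartEncryptionParams(streamName, keyId) {
    return {
        StreamName: streamName,
        EncryptionType: 'KMS',
        KeyId: keyId || 'alias/aws/kinesis'   // falls back to the default Kinesis master key
    };
}

/* Hypothetical wrapper; `kinesis` is assumed to be an AWS SDK for JavaScript (v2) client. */
function enableStreamEncryption(kinesis, streamName, keyId) {
    return kinesis.startStreamEncryption(buildStartEncryptionParams(streamName, keyId)).promise();
}

/* With the SDK installed, usage might look like this:
 *   const AWS = require('aws-sdk');
 *   enableStreamEncryption(new AWS.Kinesis(), 'KinesisSSE-stream');
 */
console.log(buildStartEncryptionParams('KinesisSSE-stream'));
```

Keeping the parameter builder separate from the call makes it easy to inspect the request without touching a real stream.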

Walkthrough: Kinesis Streams Server-Side Encryption

Let’s do a quick walkthrough of server-side encryption with Kinesis Streams. First, I’ll go to the Amazon Kinesis console and select the Streams console option.

Once in the Kinesis Streams console, I can add server-side encryption to one of my existing Kinesis streams or opt to create a new Kinesis stream. For this walkthrough, I’ll opt to quickly create a new Kinesis stream, so I’ll select the Create Kinesis stream button.

I’ll name my stream KinesisSSE-stream and allocate one shard for it. Remember that the data capacity of your stream is calculated based upon the number of shards specified for the stream. You can use the Estimate the number of shards you’ll need dropdown within the console, or read more about the calculations used to estimate the number of shards in a stream here. To complete the creation of my stream, I now click the Create Kinesis stream button.
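As a rough illustration of the shard arithmetic, the sketch below estimates a shard count from an average record size, a record rate, and a number of consumers. The per-shard limits (1 MB per second or 1,000 records per second for writes, 2 MB per second for reads) are the published figures as I understand them, so treat them as assumptions and check the current limits. The function name and sample workloads are made up.

```javascript
'use strict';

/* Per-shard limits (assumptions; verify against the current Kinesis limits). */
const WRITE_MB_PER_SEC = 1;
const WRITE_RECORDS_PER_SEC = 1000;
const READ_MB_PER_SEC = 2;

/* Estimate shards from average record size (KB), records per second, and consumer count. */
function estimateShards(avgRecordKB, recordsPerSec, consumerCount) {
    const writeMB = (avgRecordKB * recordsPerSec) / 1024;   // incoming bandwidth
    const readMB = writeMB * consumerCount;                 // every consumer reads everything
    return Math.max(
        1,
        Math.ceil(writeMB / WRITE_MB_PER_SEC),
        Math.ceil(recordsPerSec / WRITE_RECORDS_PER_SEC),
        Math.ceil(readMB / READ_MB_PER_SEC)
    );
}

console.log(estimateShards(2, 400, 1));    // small workload: 1 shard
console.log(estimateShards(4, 1500, 3));   // read-heavy workload: 9 shards
```

In the second case the three consumers dominate: about 17.6 MB per second of reads needs nine shards, even though writes alone would fit in six.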

With my KinesisSSE-stream created, I will select it in the dashboard and choose the Actions dropdown and select the Details option.

On the Details page of the KinesisSSE-stream, there is now a Server-side encryption section. In this section, I will select the Edit button.

Now I can enable server-side encryption for my stream with an AWS KMS master key, by selecting the Enabled radio button. Once selected I can choose which AWS KMS master key to use for the encryption of data in KinesisSSE-stream. I can either select the KMS master key generated by the Kinesis service, (Default) aws/kinesis, or select one of my own KMS master keys that I have previously generated. I’ll select the default master key and all that is left is for me to click the Save button.

That’s it! As you can see from my screenshots below, after only about 20 seconds, server-side encryption was added to my Kinesis stream and now any incoming data into my stream will be encrypted. One thing to note is server-side encryption only encrypts incoming data after encryption has been enabled. Preexisting data that is in a Kinesis stream prior to server-side encryption being enabled will remain unencrypted.
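You can also confirm the result programmatically. The sketch below interprets the shape of a DescribeStream response, where EncryptionType is KMS when server-side encryption is on and NONE otherwise. The sample response is hand-written for illustration rather than captured from a real account, and the helper name is hypothetical.

```javascript
'use strict';

/* Interpret a DescribeStream response: EncryptionType is 'KMS' when SSE is enabled. */
function summarizeEncryption(describeStreamResponse) {
    const desc = describeStreamResponse.StreamDescription;
    const enabled = desc.EncryptionType === 'KMS';
    return { stream: desc.StreamName, enabled, keyId: enabled ? desc.KeyId : null };
}

/* Hand-written sample response (an assumption, not real output). */
const sampleResponse = {
    StreamDescription: {
        StreamName: 'KinesisSSE-stream',
        StreamStatus: 'ACTIVE',
        EncryptionType: 'KMS',
        KeyId: 'alias/aws/kinesis'
    }
};

console.log(summarizeEncryption(sampleResponse));
```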

Summary

Kinesis Streams with server-side encryption using AWS KMS keys makes it easy for you to automatically encrypt the streaming data coming into your stream. You can start, stop, or update server-side encryption for any Kinesis stream using the AWS Management Console or the AWS SDK. To learn more about Kinesis server-side encryption, AWS Key Management Service, or Kinesis Streams, review the Amazon Kinesis getting started guide, the AWS Key Management Service developer guide, or the Amazon Kinesis product page.

Enjoy streaming.

– Tara

AWS HIPAA Eligibility Update (July 2017) – Eight Additional Services

It is time for an update on our ongoing effort to make AWS a great host for healthcare and life sciences applications. As you can see from our Health Customer Stories page, Philips, VergeHealth, and Cambia (to choose a few) trust AWS with Protected Health Information (PHI) and Personally Identifiable Information (PII) as part of their efforts to comply with HIPAA and HITECH.

In May we announced that we added Amazon API Gateway, AWS Direct Connect, AWS Database Migration Service, and Amazon Simple Queue Service (SQS) to our list of HIPAA eligible services and discussed how our customers and partners are putting them to use.

Eight More Eligible Services
Today I am happy to share the news that we are adding another eight services to the list:

Amazon CloudFront can now be utilized to enhance the delivery and transfer of Protected Health Information data to applications on the Internet. By providing a completely secure and encryptable pathway, CloudFront can now be used as a part of applications that need to cache PHI. This includes applications for viewing lab results or imaging data, and those that transfer PHI from Healthcare Information Exchanges (HIEs).

AWS WAF can now be used to protect applications running on AWS which operate on PHI such as patient care portals, patient scheduling systems, and HIEs. Requests and responses containing encrypted PHI and PII can now pass through AWS WAF.

AWS Shield can now be used to protect web applications such as patient care portals and scheduling systems that operate on encrypted PHI from DDoS attacks.

Amazon S3 Transfer Acceleration can now be used to accelerate the bulk transfer of large amounts of research, genetics, informatics, insurance, or payer/payment data containing PHI/PII. Transfers can take place between a pair of AWS Regions or between an on-premises system and an AWS Region.

Amazon WorkSpaces can now be used by researchers, informaticists, hospital administrators and other users to analyze, visualize or process PHI/PII data using on-demand Windows virtual desktops.

AWS Directory Service can now be used to connect the authentication and authorization systems of organizations that use or process PHI/PII to their resources in the AWS Cloud. For example, healthcare providers operating hybrid cloud environments can now use AWS Directory Service to allow their users to easily transition between cloud and on-premises resources.

Amazon Simple Notification Service (SNS) can now be used to send notifications containing encrypted PHI/PII as part of patient care, payment processing, and mobile applications.

Amazon Cognito can now be used to authenticate users into mobile patient portal and payment processing applications that use PHI/PII identifiers for accounts.

Additional HIPAA Resources
Here are some additional resources that will help you to build applications that comply with HIPAA and HITECH:

HIPAA Eligible Services Reference – The full list of HIPAA eligible AWS services.
HIPAA Compliance – Details our work around HIPAA and HITECH.
Health Customer Stories – A long list of videos and case studies from our healthcare and life sciences customers.
Healthcare Compliance in the Cloud – A big-picture view of compliance, including HIPAA and FedRAMP, as it relates to healthcare.
Healthcare Partner Solutions – Services and products from members of the AWS Partner Network.
Architecting for HIPAA in the Cloud – Architectural strategies and resources.
AWS HIPAA Compliance Whitepaper – A comprehensive guide to architecting for HIPAA.

Keep in Touch
In order to make use of any AWS service in any manner that involves PHI, you must first enter into an AWS Business Associate Addendum (BAA). You can contact us to start the process.

— Jeff;

Lambda@Edge – Intelligent Processing of HTTP Requests at the Edge

Late last year I announced a preview of Lambda@Edge and talked about how you could use it to intelligently process HTTP requests at locations that are close (latency-wise) to your customers. Developers who applied and gained access to the preview have been making good use of it, and have provided us with plenty of very helpful feedback. During the preview we added the ability to generate HTTP responses and support for CloudWatch Logs, and also updated our roadmap based on the feedback.

Now Generally Available
Today I am happy to announce that Lambda@Edge is now generally available! You can use it to:

Inspect cookies and rewrite URLs to perform A/B testing.
Send specific objects to your users based on the User-Agent header.
Implement access control by looking for specific headers before passing requests to the origin.
Add, drop, or modify headers to direct users to different cached objects.
Generate new HTTP responses.
Cleanly support legacy URLs.
Modify or condense headers or URLs to improve cache utilization.
Make HTTP requests to other Internet resources and use the results to customize responses.
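To give a feel for the first item in the list, here is a minimal sketch of a Viewer Request function that inspects a cookie and rewrites the URI for A/B testing. The cookie name (experiment), the /a and /b paths, and the function name are all hypothetical. The event shape follows the request structure used by the sample function later in this post.

```javascript
'use strict';

/* Viewer Request sketch: send users to the /a or /b variant based on a cookie. */
function abTestHandler(event, context, callback) {
    const request = event.Records[0].cf.request;
    const cookieHeaders = request.headers.cookie || [];

    /* 'experiment' is a hypothetical cookie name chosen for this example. */
    const inGroupB = cookieHeaders.some((h) => /(^|;\s*)experiment=b(;|$)/.test(h.value));

    /* Only the landing page is rewritten; all other URIs pass through untouched. */
    if (request.uri === '/index.html') {
        request.uri = inGroupB ? '/b/index.html' : '/a/index.html';
    }
    callback(null, request);
}

exports.handler = abTestHandler;
```

Returning the (possibly modified) request through the callback lets CloudFront continue processing it as usual.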

Lambda@Edge allows you to create web-based user experiences that are rich and personal. As is rapidly becoming the norm in today’s world, you don’t need to provision or manage any servers. You simply upload your code (Lambda functions written in Node.js) and pick one of the CloudFront behaviors that you have created for the distribution, along with the desired CloudFront event:

In this case, my function (the imaginatively named EdgeFunc1) would run in response to origin requests for image/* within the indicated distribution. As you can see, you can run code in response to four different CloudFront events:

Viewer Request – This event is triggered when a request arrives from a viewer (an HTTP client, generally a web browser or a mobile app), and has access to the incoming HTTP request. As you know, each CloudFront edge location maintains a large cache of objects so that it can efficiently respond to repeated requests. This particular event is triggered regardless of whether the requested object is already cached.

Origin Request – This event is triggered when the edge location is about to make a request back to the origin, due to the fact that the requested object is not cached at the edge location. It has access to the request that will be made to the origin (often an S3 bucket or code running on an EC2 instance).

Origin Response – This event is triggered after the origin returns a response to a request. It has access to the response from the origin.

Viewer Response – This event is triggered before the edge location returns a response to the viewer. It has access to the response.
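As an illustration of the response-side events, the following sketch handles a Viewer Response event and adds a header before the response goes back to the viewer. It assumes the response is available at event.Records[0].cf.response with headers in the same keyed-array layout as request headers. The header value and function name are examples only.

```javascript
'use strict';

/* Viewer Response sketch: add a security header before the response leaves the edge. */
function addSecurityHeaders(event, context, callback) {
    const response = event.Records[0].cf.response;

    /* Headers are keyed by lowercase name; each entry is an array of {key, value} pairs. */
    response.headers['strict-transport-security'] = [
        { key: 'Strict-Transport-Security', value: 'max-age=63072000' }
    ];
    callback(null, response);
}

exports.handler = addSecurityHeaders;
```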

Functions are globally replicated and requests are automatically routed to the optimal location for execution. You can write your code once and, with no overt action on your part, have it available at low latency to users all over the world.

Your code has full access to requests and responses, including headers, cookies, the HTTP method (GET, HEAD, and so forth), and the URI. Subject to a few restrictions, it can modify existing headers and insert new ones.
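For example, a simple access-control check could look for a specific header and either pass the request along or generate a response at the edge. This is a sketch: the header name, the expected value, and the function name are hypothetical, and a hard-coded value is used only to keep the example short.

```javascript
'use strict';

/* Hypothetical header name and value, used purely for illustration. */
const REQUIRED_HEADER = 'x-api-key';
const EXPECTED_VALUE = 'example-key';

/* Pass the request through when the header matches; otherwise answer with a 403. */
function requireHeader(event, context, callback) {
    const request = event.Records[0].cf.request;
    const supplied = request.headers[REQUIRED_HEADER];

    if (supplied && supplied[0].value === EXPECTED_VALUE) {
        callback(null, request);   // continue on to the cache or the origin
        return;
    }
    callback(null, {
        status: '403',
        statusDescription: 'Forbidden',
        body: 'Access denied\n'
    });
}

exports.handler = requireHeader;
```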

Lambda@Edge in Action
Let’s create a simple function that runs in response to the Viewer Request event. I open up the Lambda Console and create a new function. I choose the Node.js 6.10 runtime and search for cloudfront blueprints:

I choose cloudfront-response-generation and configure a trigger to invoke the function:

The Lambda Console provides me with some information about the operating environment for my function:

I enter a name and a description for my function, as usual:

The blueprint includes a fully operational function. It generates a “200” HTTP response and a very simple body:

I used this as the starting point for my own code, which pulls some interesting values from the request and displays them in a table:

'use strict';
exports.handler = (event, context, callback) => {

    /* Set table row style */
    const rs = '"border-bottom:1px solid black;vertical-align:top;"';
    /* Get request */
    const request = event.Records[0].cf.request;
   
    /* Get values from request */ 
    const httpVersion = request.httpVersion;
    const clientIp    = request.clientIp;
    const method      = request.method;
    const uri         = request.uri;
    const headers     = request.headers;
    const host        = headers['host'][0].value;
    const agent       = headers['user-agent'][0].value;
    
    var sreq = JSON.stringify(event.Records[0].cf.request, null, '&nbsp;');
    sreq = sreq.replace(/\n/g, '<br/>');

    /* Generate body for response */
    const body = 
     '<html>\n'
     + '<head><title>Hello From Lambda@Edge</title></head>\n'
     + '<body>\n'
     + '<table style="border:1px solid black;background-color:#e0e0e0;border-collapse:collapse;" cellpadding=4 cellspacing=4>\n'
     + '<tr style=' + rs + '><td>Host</td><td>'        + host     + '</td></tr>\n'
     + '<tr style=' + rs + '><td>Agent</td><td>'       + agent    + '</td></tr>\n'
     + '<tr style=' + rs + '><td>Client IP</td><td>'   + clientIp + '</td></tr>\n'
     + '<tr style=' + rs + '><td>Method</td><td>'      + method   + '</td></tr>\n'
     + '<tr style=' + rs + '><td>URI</td><td>'         + uri      + '</td></tr>\n'
     + '<tr style=' + rs + '><td>Raw Request</td><td>' + sreq     + '</td></tr>\n'
     + '</table>\n'
     + '</body>\n'
     + '</html>';

    /* Generate HTTP response */
    const response = {
        status: '200',
        statusDescription: 'HTTP OK',
        httpVersion: httpVersion,
        body: body,
        headers: {
            'vary':          [{key: 'Vary',          value: '*'}],
            'last-modified': [{key: 'Last-Modified', value:'2017-01-13'}]
        },
    };

    callback(null, response);
};
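One thing to keep in mind with a function like this is that values such as the User-Agent header come straight from the client and are placed into the HTML as-is. If you adapt the code, you may want to escape those values first. Here is a minimal sketch. The helper names are mine and are not part of the blueprint.

```javascript
'use strict';

/* Characters that need escaping when untrusted text is placed inside HTML. */
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

/* Replace each special character with its HTML entity. */
function escapeHtml(value) {
    return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

/* Hypothetical helper: build one table row with both cells escaped. */
function tableRow(label, value) {
    return '<tr><td>' + escapeHtml(label) + '</td><td>' + escapeHtml(value) + '</td></tr>\n';
}

console.log(tableRow('Agent', 'curl/7.0 <script>alert(1)</script>'));
```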

I configure my handler, and request the creation of a new IAM Role with Basic Edge Lambda permissions:

On the next page I confirm my settings (as I would do for a regular Lambda function), and click on Create function:

This creates the function, attaches the trigger to the distribution, and also initiates global replication of the function. The status of my distribution changes to In Progress for the duration of the replication (typically 5 to 8 minutes):

The status changes back to Deployed as soon as the replication completes:

Then I access the root of my distribution (https://dogy9dy9kvj6w.cloudfront.net/), the function runs, and this is what I see:

Feel free to click on the image (it is linked to the root of my distribution) to run my code!

As usual, this is a very simple example and I am sure that you can do a lot better. Here are a few ideas to get you started:

Site Management – You can take an entire dynamic website offline and replace critical pages with Lambda@Edge functions for maintenance or during a disaster recovery operation.

High Volume Content – You can create scoreboards, weather reports, or public safety pages and make them available at the edge, both quickly and cost-effectively.

Create something cool and share it in the comments or in a blog post, and I’ll take a look.

Things to Know
Here are a couple of things to keep in mind as you start to think about how to put Lambda@Edge to use in your application:

Timeouts – Functions that handle Origin Request and Origin Response events must complete within 3 seconds. Functions that handle Viewer Request and Viewer Response events must complete within 1 second.

Versioning – After you update your code in the Lambda Console, you must publish a new version and set up a fresh set of triggers for it, and then wait for the replication to complete. You must always refer to your code using a version number; $LATEST and aliases do not apply.

Headers – As you can see from my code, the HTTP request headers are accessible as an object keyed by lowercase header name, with each entry holding an array of values. The headers fall into four categories:

Accessible – Can be read, written, deleted, or modified.
Restricted – Must be passed on to the origin.
Read-only – Can be read, but not modified in any way.
Blacklisted – Not seen by code, and cannot be added.

Runtime Environment – The runtime environment provides each function with 128 MB of memory, but no built-in libraries or access to /tmp.

Web Service Access – Functions that handle Origin Request and Origin Response events can access the AWS APIs and fetch content via HTTP. These requests are always made synchronously with respect to the original request or response.

Function Replication – As I mentioned earlier, your functions will be globally replicated. The replicas are visible in the “other” regions from the Lambda Console:

CloudFront – Everything that you already know about CloudFront and CloudFront behaviors is relevant to Lambda@Edge. You can use multiple behaviors (each with up to four Lambda@Edge functions) in each distribution, customize header & cookie forwarding, and so forth. You can also make the association between events and functions (via ARNs that include function versions) while you are editing a behavior.
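Because triggers must reference a numbered version, a small check like the one below can catch an unqualified ARN, $LATEST, or an alias before you try to associate a function with a behavior. It is only a sketch: the function name is made up, and the ARNs use a placeholder account ID.

```javascript
'use strict';

/* True when a Lambda function ARN ends in a numeric version (not $LATEST or an alias). */
function isVersionedFunctionArn(arn) {
    const parts = arn.split(':');
    /* arn:aws:lambda:<region>:<account>:function:<name>:<version> has eight parts. */
    return parts.length === 8 && parts[5] === 'function' && /^[0-9]+$/.test(parts[7]);
}

/* Placeholder ARNs for illustration only. */
const base = 'arn:aws:lambda:us-east-1:123456789012:function:EdgeFunc1';
console.log(isVersionedFunctionArn(base + ':3'));         // numbered version
console.log(isVersionedFunctionArn(base + ':$LATEST'));   // not allowed for triggers
console.log(isVersionedFunctionArn(base));                // unqualified ARN
```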

Available Now
Lambda@Edge is available now and you can start using it today. Pricing is based on the number of times that your functions are invoked and the amount of time that they run (see the Lambda@Edge Pricing page for more info).

— Jeff;

New – Next-Generation GPU-Powered EC2 Instances (G3)

I first wrote about the benefits of GPU-powered computing in 2013 when we launched the G2 instance type. Since that launch, AWS customers have used the G2 instances to deliver high performance graphics to mobile devices, TV sets, and desktops.

Today we are taking a step forward and launching the G3 instance type. Powered by NVIDIA Tesla M60 GPUs, these instances are available in three sizes (all VPC-only and EBS-only):

Model	GPUs	GPU Memory	vCPUs	Main Memory	EBS Bandwidth
g3.4xlarge	1	8 GiB	16	122 GiB	3.5 Gbps
g3.8xlarge	2	16 GiB	32	244 GiB	7 Gbps
g3.16xlarge	4	32 GiB	64	488 GiB	14 Gbps

Each GPU supports 8 GiB of GPU memory, 2048 parallel processing cores, and a hardware encoder capable of supporting up to 10 H.265 (HEVC) 1080p30 streams and up to 18 H.264 1080p30 streams, making them a great fit for 3D rendering & visualization, virtual reality, video encoding, remote graphics workstation (NVIDIA GRID), and other server-side graphics workloads that need a massive amount of parallel processing power. The GPUs support OpenGL 4.5, DirectX 12.0, CUDA 8.0, and OpenCL 1.2. When you launch a G3 instance you have access to an NVIDIA GRID Virtual Workstation License and can make use of the NVIDIA GRID driver without purchasing a license on your own.
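Putting the per-GPU encoder figures together with the table above gives a rough upper bound on simultaneous 1080p30 streams for each instance size. The sketch below does that arithmetic. It assumes that encoder capacity scales linearly with the number of GPUs, which is my simplification rather than a published guarantee, and the function name is hypothetical.

```javascript
'use strict';

/* Per-GPU encoder limits and GPU counts taken from the figures in this post. */
const HEVC_STREAMS_PER_GPU = 10;
const H264_STREAMS_PER_GPU = 18;
const G3_GPUS = { 'g3.4xlarge': 1, 'g3.8xlarge': 2, 'g3.16xlarge': 4 };

/* Upper bound on 1080p30 encode streams, assuming linear scaling across GPUs. */
function maxEncodeStreams(instanceType) {
    const gpus = G3_GPUS[instanceType];
    if (gpus === undefined) {
        throw new Error('Unknown G3 size: ' + instanceType);
    }
    return { hevc: gpus * HEVC_STREAMS_PER_GPU, h264: gpus * H264_STREAMS_PER_GPU };
}

Object.keys(G3_GPUS).forEach((size) => console.log(size, maxEncodeStreams(size)));
```

Under that assumption, a g3.16xlarge tops out at 40 H.265 streams or 72 H.264 streams.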

The instances use Intel Xeon E5-2686 v4 (Broadwell) processors running at 2.7 GHz. On the networking side, Enhanced Networking (via the Elastic Network Adapter) provides up to 20 Gbps of aggregate network bandwidth within a Placement Group, along with up to 14 Gbps of EBS bandwidth.

Our customers have told us that they are looking forward to visualizing large 3D seismic models, configuring cars in 3D, and providing students with the ability to run high-end 2D and 3D applications. For example, Calgary Scientific can take applications that are powered by the Unreal Engine and make them accessible on mobile devices and from within web pages, with collaborative viewing support. Visit their Demo Gallery to see PureWeb Reality in action:

You can launch these instances today in the US East (Ohio), US East (Northern Virginia), US West (Oregon), US West (Northern California), AWS GovCloud (US), and EU (Ireland) Regions as On-Demand, Reserved Instances, Spot Instances, and Dedicated Hosts, with more Regions coming soon.

— Jeff;