AWS cloud services guide: The right tools for the job

Here are the most common uses for the cloud and which Amazon Web Services components you need for them

Cloud services are moving from the initial “we’re doing it because everyone else is” state to a more cautious, planned migration, one where IT departments have done a careful assessment of their needs and determined what to move to the cloud and what will stay on-premises.

Getting there takes some hard lessons. A study by IDG Research found that as many as 40 percent of cloud workloads were moved back to an on-premises setting. That’s because companies mistakenly assumed they could move to the cloud and keep operating as they did in their on-premises environment, which is not the case.

One defining element of the cloud is its elasticity. Its value is in burst capacity: customers get a rapid increase in computing power or service access on short notice, then turn it off and stop paying for it when they are done. But which cloud services let you take advantage of that?

To make that process easier, I have compiled a list of the most common uses for the cloud and which Amazon Web Services components you need for them. AWS is of course the most popular cloud platform today.

By default, all these services assume you will use EC2, S3, and Amazon Data Transfer. They form the basic core that everyone uses and then builds on.

AWS services for devops, software development, and testing

Developers find it easy to spin up an instance; do their development, compiles, and testing online; and shut down the VM when done. The challenge is keeping track of usage and shutting it off when not in use.

You start with the basics: AWS Developer Tools. This set of four services lets you build and deliver app updates regularly. The four services are:

  • AWS CodeCommit, to store code in a private Git repository
  • AWS CodePipeline, for continuous integration and delivery
  • AWS CodeBuild, to build and test code
  • AWS CodeDeploy, to automate code deployments

Apps built with Developer Tools can run on AWS or on-premises.

For specialized cases, there are several other code-related services:

  • Amazon Elastic Container Service (ECS) is a highly scalable, high-performance container management service that runs Docker containers on EC2 instances.
  • If you are experimenting with serverless computing, AWS Lambda lets you run code without having to provision or manage servers. It supports virtually any type of application or backend service with zero administration.
  • Two services support infrastructure-as-code (IaC) provisioning. IaC is a process where systems like virtual machines are automatically built, managed, and provisioned through code, rather than through less-flexible scripting or a manual process. That’s why infrastructure as code is sometimes referred to as programmable infrastructure.
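To make the serverless item above concrete, here is a minimal sketch of a Python function in the shape Lambda expects: a handler that receives an event and a context object. The `name` field and the greeting are assumptions made up for illustration, and the response layout is just one common convention, not something the services above require.

```python
import json


def lambda_handler(event, context):
    """Entry point Lambda would call with the triggering event and a runtime context."""
    # "name" is a hypothetical event field chosen for this example.
    name = event.get("name", "world")
    # A response shape often used when the function sits behind an HTTP front end.
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local smoke test; no AWS account is needed to run this file.
    print(lambda_handler({"name": "AWS"}, None))
```

Because the handler is an ordinary function, it can be exercised locally like this before it is ever uploaded.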

The two infrastructure-as-code services handle provisioning and management:

  • AWS CloudFormation gives developers and systems administrators templates to create and manage a collection of related AWS resources, provisioning and updating them as needed.
  • AWS OpsWorks is a configuration management service that uses Chef, an automation platform that treats server configurations as code. Apps are deployed on an EC2 instance or your on-premises environment.
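As a rough illustration of what "infrastructure as code" looks like with CloudFormation, the sketch below builds a very small template as a Python dictionary and renders it as JSON. The logical ID `ExampleBucket` and the description are hypothetical; a real stack would declare many more resources and properties.

```python
import json

# A deliberately tiny template: one S3 bucket with default settings.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal example stack: one S3 bucket",
    "Resources": {
        # "ExampleBucket" is a hypothetical logical ID chosen for this sketch.
        "ExampleBucket": {"Type": "AWS::S3::Bucket"},
    },
}


def render_template(tpl):
    """Serialize the template dictionary to the JSON text a stack is created from."""
    return json.dumps(tpl, indent=2)


if __name__ == "__main__":
    print(render_template(template))
```

The point is that the bucket is described in a file that can be versioned and reviewed like any other code, instead of being created by hand in a console.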

All of this can be built on top of AWS networking services to replicate your own infrastructure. Amazon Virtual Private Cloud (VPC) allows you to launch AWS resources in a virtual network that you’ve defined, so if you are building on-premises apps, it can resemble your own infrastructure.

Finally, to avoid billing balloons, there’s Amazon CloudWatch, which offers metrics on data transfer, disk usage, and CPU utilization. You can set up alerts when certain thresholds are exceeded to keep the monthly bill under control.
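A threshold alert of the kind described here could be sketched as follows. The function only builds the settings for a CPU alarm; it does not talk to AWS. The instance ID, alarm name, and the 80 percent threshold are assumptions for illustration. With the boto3 library installed and credentials configured, a dictionary like this could be passed to the CloudWatch client's `put_metric_alarm` call.

```python
def cpu_alarm_params(instance_id, threshold_percent=80.0):
    """Build settings for an alarm on one EC2 instance's average CPU utilization."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,            # evaluate the metric in five-minute windows
        "EvaluationPeriods": 2,   # require two consecutive breaching windows
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }


if __name__ == "__main__":
    # "i-0123456789abcdef0" is a placeholder instance ID.
    params = cpu_alarm_params("i-0123456789abcdef0")
    print(params)
    # With boto3 and credentials available, one might then run:
    #   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Flipping the comparison to a low threshold is one way to spot idle development instances that are still running up the bill.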

AWS services for long-term archival and disaster recovery

Most companies do their backups and long-term archives on premises, but what happens if there is a fire in the datacenter where you keep your backups? Offsite backup is the safest measure, and you can certainly use AWS for long-term backup and disaster recovery.

The main backup service for AWS is Glacier, for which Amazon promises 99.999999999 percent durability as well as full regulatory compliance. You can also run analytics on the data. You can store data for as little as $0.004 per gigabyte per month.

However, Glacier has its caveats, as one customer found out the hard way: Glacier is meant to hold data, not to serve frequent retrievals. As that customer learned, you can retrieve only 5 percent of your total storage for free. After that it costs you, and the charges can add up quickly. So, if you expect to retrieve data regularly or even semiregularly, a backup service like Backblaze, Dropbox, or OneDrive might be a better fit.

Also, just as glaciers move at a ridiculously slow pace, don’t expect to get at your data immediately. A request can take up to a day to complete. In 2016, Amazon added new options to expedite retrieval. Users with 100TB or more of stored data can expedite retrieval for $0.03 per gigabyte and $0.01 per request, which is a premium over the standard cost of $0.01 per gigabyte and $0.050 per 1,000 requests.
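The figures quoted above lend themselves to a quick back-of-the-envelope estimate. The sketch below is a simplified model, not Amazon's billing formula: it assumes the 5 percent free allowance is applied once to the amount retrieved, applies it to expedited retrievals as well, and ignores per-request fees. The function names are made up for this example.

```python
# Prices as quoted in this article (US dollars).
STORAGE_PER_GB_MONTH = 0.004
STANDARD_RETRIEVAL_PER_GB = 0.01
EXPEDITED_RETRIEVAL_PER_GB = 0.03
FREE_RETRIEVAL_FRACTION = 0.05  # share of stored data retrievable at no charge


def monthly_storage_cost(stored_gb):
    """Cost of keeping stored_gb gigabytes in the archive for one month."""
    return stored_gb * STORAGE_PER_GB_MONTH


def retrieval_cost(stored_gb, retrieved_gb, expedited=False):
    """Per-gigabyte retrieval charge after subtracting the free allowance."""
    free_gb = stored_gb * FREE_RETRIEVAL_FRACTION
    billable_gb = max(0.0, retrieved_gb - free_gb)
    rate = EXPEDITED_RETRIEVAL_PER_GB if expedited else STANDARD_RETRIEVAL_PER_GB
    return billable_gb * rate


if __name__ == "__main__":
    stored = 10_000  # gigabytes in the archive
    print(round(monthly_storage_cost(stored), 2))
    print(round(retrieval_cost(stored, 2_000), 2))
    print(round(retrieval_cost(stored, 2_000, expedited=True), 2))
```

Under these assumptions, pulling back a fifth of a 10,000GB archive in a hurry costs more than a full month of storing it, which is the trap the customer above fell into.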

For disaster recovery, you use a variety of AWS services.

First, you have to pick the datacenter region where your data will be stored. The choice matters for several reasons, such as complying with data storage requirements and managing latency.

From there you use Amazon Simple Storage Service (S3), Glacier, Amazon Elastic Block Store (EBS) for creating snapshots of data volumes, AWS Import/Export (Snowball), and AWS Storage Gateway to connect your on-premises software appliance with cloud-based storage. All are online services except Snowball, which is a decidedly offline solution.

Transferring many terabytes of data to Amazon’s datacenter can get prohibitively expensive, so Amazon sends a ruggedized Snowball appliance to your facility. You connect it to your network and copy massive amounts of data onto it. You then send it back to Amazon, and the data is added to your storage account. And if you have a lot of data, there’s the Snowmobile, an 18-wheeler with a 45-foot container that can haul away up to 100 petabytes of data.
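Cost aside, a rough calculation shows why shipping an appliance can beat the wire on time as well. The sketch below estimates how long an upload would take over a given link; the link speed and the 80 percent utilization figure are assumptions for illustration, and real transfers vary with protocol overhead and competing traffic.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_terabytes, link_megabits_per_sec, utilization=0.8):
    """Estimate days needed to upload data over a network link.

    Uses decimal units (1 TB = 10**12 bytes) and assumes the link sustains
    the given fraction of its nominal speed for the whole transfer.
    """
    if link_megabits_per_sec <= 0 or not 0 < utilization <= 1:
        raise ValueError("link speed must be positive and utilization in (0, 1]")
    total_bits = data_terabytes * 1e12 * 8
    effective_bits_per_sec = link_megabits_per_sec * 1e6 * utilization
    return total_bits / effective_bits_per_sec / SECONDS_PER_DAY


if __name__ == "__main__":
    # 100 TB over a 1 Gbps line that is 80 percent available to the upload.
    print(round(transfer_days(100, 1000), 1))
```

With those example numbers the upload runs for more than a week and a half, tying up the connection the whole time.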

AWS services for analytics

AWS is a great resource for a particular type of analytics. If you are looking to gather insight from cloud resources (that is, data gathered from the cloud and the internet in general), it works well. But if you want to send the contents of your data warehouse to the cloud for processing and send the results back, AWS is not recommended because the costs would explode. However, Amazon will support such processing if you want it.

Amazon has a comprehensive set of analytics tools, starting with Athena for analysis of data stored in S3 buckets, EMR for Hadoop, QuickSight for business analytics, Redshift for a petabyte-scale data warehouse, Glue to perform ETL tasks on data stores, and Data Pipeline to securely move data. Most recently, it added AWS IoT Analytics to the menu for, what else, analytics of IoT sensors and other data gathered by IoT devices.

Athena is the main analytics service, running on data residing on S3. It supports a variety of data formats, including ORC, JSON, CSV, and Parquet. Data can and should be converted to a columnar format such as Apache Parquet. Athena uses Presto, an open source query engine, as its SQL query engine.
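The advice to use a columnar format is easier to see with a toy example. The sketch below is plain Python, not Parquet or Athena: it regroups a few made-up order records by column, so that a query on one field touches only that field's values instead of every whole record. It assumes every record has the same keys.

```python
# Hypothetical row-oriented records, as they might appear in a CSV or JSON file.
rows = [
    {"order_id": 1, "region": "us-east", "amount": 120.0},
    {"order_id": 2, "region": "eu-west", "amount": 75.5},
    {"order_id": 3, "region": "us-east", "amount": 310.25},
]


def to_columns(records):
    """Regroup row-oriented records into one list of values per column."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


if __name__ == "__main__":
    columns = to_columns(rows)
    # Summing amounts reads a single column; order_id and region are never touched.
    print(sum(columns["amount"]))
```

A columnar file format applies the same idea on disk, which is why a query engine reading it can skip the columns a query never mentions.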

For business intelligence, the focus at AWS is Redshift. It is a column-oriented database deployed as a cluster for massively parallel processing. (A single-node cluster is possible, but Redshift is built to run across multiple nodes.) It’s designed for OLAP-style workloads such as data warehouses, analytics, and ETL.

Redshift is for more advanced data warehouse users. For visualization of customer data, there is QuickSight, a BI service. It uses data stored in any Hadoop repository, an Amazon Redshift data warehouse, or some third-party sources such as Salesforce.com and Oracle.

AWS services for “pop-up” apps and websites

A short-term marketing program for something like a movie premiere or product launch means a dedicated site that will see a lot of activity for a short time and then die off. Rather than setting up your own site for such short-lived needs, or using a service like GoDaddy or 1and1, you can use one of the AWS web hosting services that scale.

The Simple Website Hosting service is a single web server with a content management system (CMS) such as WordPress, an e-commerce application such as Magento, or a development stack like LAMP. This option is best for low- to medium-traffic sites with frequent content changes, because it will not scale beyond five servers.

Then there’s Lightsail, which is designed for sites that won’t require server-side scripting like PHP or ASP.Net but need to scale for occasional bursts of high traffic. It offers storage and data management along with a static IP and DNS registration.

Finally, there is Enterprise Web Hosting, which provides multiple servers across at least two datacenters, for lots of capacity with load balancing, high CPU utilization, and lots of applications. You will use a variety of services, most notably CloudFront, a content delivery network, as well as AWS e-commerce apps for handling online sales.

AWS services for app migration

This is one of the main selling points for AWS. Rather than run apps on your local systems, you can migrate them to AWS and run them in the cloud, eliminating the need to maintain a large datacenter.

But not every application lends itself to cloud migration:

  • Apps that need to run at full utilization 24/7, for example, are not suited for the cloud, where everything is metered and you pay for every CPU clock cycle.
  • Apps where lots of data must move back and forth between the cloud and your network are not cloud-friendly, because bandwidth usage is also metered.
  • There can be compliance issues. Although Amazon and its competitors have made great strides in HIPAA compliance for medical records, for example, some things must remain within the confines of a firm’s datacenter.
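The three warning signs above can be turned into a simple screening checklist for an application inventory. The sketch below is only an illustration: the `AppProfile` fields and the example application are hypothetical, and a real assessment would weigh many more factors than three yes/no flags.

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    """Hypothetical description of one application under review."""
    name: str
    runs_at_full_utilization_24x7: bool
    heavy_data_transfer_to_on_prem: bool
    must_stay_on_premises_for_compliance: bool


def migration_concerns(app):
    """Return the reasons, if any, that an app may be a poor fit for the cloud."""
    concerns = []
    if app.runs_at_full_utilization_24x7:
        concerns.append("runs flat out around the clock, so metered compute adds up")
    if app.heavy_data_transfer_to_on_prem:
        concerns.append("moves lots of data to and from the local network, which is metered")
    if app.must_stay_on_premises_for_compliance:
        concerns.append("compliance rules require it to stay in the firm's datacenter")
    return concerns


def is_cloud_candidate(app):
    """An app with no flagged concerns passes this first screen."""
    return not migration_concerns(app)


if __name__ == "__main__":
    # "nightly-report" is a made-up application used only for illustration.
    app = AppProfile("nightly-report", False, True, False)
    print(is_cloud_candidate(app), migration_concerns(app))
```

Passing the screen does not mean an application should move; it only means none of these three objections applies.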

Where it does make sense, migration to the cloud requires very careful planning and assessment of which apps can and should be moved to the cloud. Once you have cleared that sizable hurdle, you use a recently added AWS service called Server Migration Service to help with the migration. It comes with a connector to analyze your virtualized server environment and collect information about the instances you are using.

From there, the VM is reproduced as an Amazon Machine Image (AMI) stored in the Amazon Elastic Block Store (EBS) service. The AMI can then be moved and run on an EC2 instance. At that point, you have a replica of your on-premises virtual instance on AWS.

You can also migrate from MySQL to Amazon RDS just as you would to another database on your network. For more analytical workloads, such as OLAP, tools from the database vendors will help you migrate from on-premises to an EC2 database AMI, just as they would help when migrating data from one machine to another.

Copyright © 2018 IDG Communications, Inc.