Review: AWS Lambda redefines 'on demand'

Amazon’s simple, scalable compute service runs your functions whenever needed, but is limited to Java, Python, and Node.js


What if you could simply define a function to live in the cloud and handle the designated workloads, and not have to worry about provisioning servers, allocating RAM, scaling the number of instances, or configuring load balancing? How great would that be? In fact it is pretty great, as you can discover by using AWS Lambda.

AWS Lambda is a compute service. You can set up Lambda functions to respond to events, such as changes to data, asynchronously. You can call them directly or through an API gateway, synchronously. And you can use them to respond to HTTP(S) requests, synchronously.

AWS Lambda currently supports three language environments for handler functions: Node.js (JavaScript); Java code packaged as JAR or ZIP files; and Python. Python support was added recently, unveiled at AWS re:Invent 2015. In the same presentation, Amazon CTO Werner Vogels mentioned that Lambda is now the fastest-growing service at AWS.

Amazon touts Lambda’s ability to extend other AWS services with custom logic; its ability to help you build custom back-end services; its completely automated administration; its built-in fault tolerance; its automatic scaling; its integrated security model; its invitation to bring your own code; its pay-per-use business model; and its flexible resource model.

To help you grok Lambda, I’ve curated some questions and answers about the service.

AWS Lambda Q&A

Where do your Lambda functions execute? You don’t know or need to care. The AWS infrastructure provides the execution environment automagically, and it scales as needed to handle the event traffic.

How much memory do your Lambda functions need? You specify a RAM limit, which affects the cost of running them, since you are charged by the gigabyte-second based on the allocation, not by the actual RAM usage. You can tune your memory limit based on your AWS Lambda logs, which tell you how much memory and time each call actually took.

How much CPU time do your Lambda functions take? As much as they need, up to a timeout that you set, which may be as high as five minutes per function call. You are charged for the gigabyte-seconds actually used, with the duration of each call rounded up to the nearest 100ms. Allowing higher timeouts doesn’t affect your cost unless function executions “run away” or “go into a loop” and need to be stopped. You can tune your time limit based on your AWS Lambda logs, which tell you how much memory and time each call actually took.
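The billing rule described in the two answers above can be sketched as a small calculation. This is a rough illustration rather than AWS’s billing logic: the function name is my own, the price per gigabyte-second is a placeholder assumed for the example, and any per-request charges are left out.

```python
import math

# Hypothetical price per gigabyte-second, used only for illustration.
ASSUMED_PRICE_PER_GB_SECOND = 0.00001667


def estimate_compute_charge(memory_mb, duration_ms, invocations=1,
                            price_per_gb_second=ASSUMED_PRICE_PER_GB_SECOND):
    """Estimate the compute charge for a batch of identical invocations.

    Duration is rounded up to the next 100 ms, and the charge is based on
    the allocated memory, not on the memory the function actually used.
    """
    billed_ms = math.ceil(duration_ms / 100) * 100
    gb_seconds = (memory_mb / 1024) * (billed_ms / 1000) * invocations
    return billed_ms, gb_seconds, gb_seconds * price_per_gb_second


# A 512MB function that runs for 230 ms, called a million times.
billed_ms, gb_seconds, charge = estimate_compute_charge(
    512, 230, invocations=1_000_000)
print(billed_ms, gb_seconds, round(charge, 2))
```

Note that a call lasting 230 ms is billed as 300 ms, so trimming a function from 230 ms to just under 200 ms would cut its compute charge by a third.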

Why would you use AWS Lambda for your event-handling and server functions? When you don’t want to fuss with the infrastructure and scaling behind your code.

When would you be unable to use AWS Lambda? When you can’t write your function in Node.js, Java, or Python; when you need a customized environment; or when your code needs to retain state.

When would it be better to use a different paradigm for hosting your functionality? When you have relatively predictable base loads so that you can use inexpensive reserved VM instances with autoscaling rules to handle peak loads. If loads are predictable, it is often less expensive to use reserved instances than to rely on Lambda. It doesn’t take much experience in running your application to determine which kind of deployment is optimal, but it’s hard to guess before you have that experience.

Using AWS Lambda

The first step in defining a Lambda function, as shown in Figure 1, is usually selecting a blueprint. Currently there are 27 Lambda blueprints in the catalog, mostly in Node.js, although a few are in Python. There are no Java blueprints in the catalog.


Figure 1: The first step in defining a Lambda function is usually selecting a blueprint, unless you’re rolling your own.

As shown in Figure 2, no matter what function you’re running in Lambda, you’ll have to name it, declare a runtime, define a function handler and an IAM role with the proper permissions, and configure endpoints. None of that is hard, but you have to understand the invocation models.


Figure 2: No matter what function you’re running in Lambda, you’ll have to name it, declare a runtime, define a function handler and an IAM role with the proper permissions, and configure endpoints.
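For the Python runtime, the function handler named in that configuration might look like the minimal sketch below. The names are my own placeholders (a handler setting of lambda_function.lambda_handler would point at this function in a file called lambda_function.py), and the shape of the event is an assumption made for the example.

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda calls with the event payload and a context object."""
    # 'name' is a hypothetical field; real events depend on the event source.
    name = event.get("name", "world")
    # The return value only reaches the caller for synchronous invocations.
    return {"message": "Hello, {}!".format(name)}


if __name__ == "__main__":
    # Local smoke test; the context object is not used here, so None will do.
    print(json.dumps(lambda_handler({"name": "Lambda"}, None)))
```

Keeping the handler a plain function of its inputs makes it easy to exercise locally before uploading it.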

S3 bucket event sources are among the dozen or so types of events supported by Lambda event-handling functions. In Figure 3, I have defined the source as object creation events in a specific S3 bucket that happens to be the user storage location for a mobile client. That probably wasn’t a very good choice of sample, since the Mobile Hub protects the per-user folders in that bucket using Cognito credentials.


Figure 3: S3 bucket event sources are among the dozen or so types of events supported by Lambda event-handling functions.
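A handler for object creation events like the one configured above could be sketched as follows. The function name, the bucket, and the key are invented for illustration, and the sample event is trimmed down to the fields the code reads; a real S3 notification carries considerably more. The code is written for Python 3 (in Python 2.7, unquote_plus lives in the urllib module instead).

```python
from urllib.parse import unquote_plus


def handle_object_created(event, context):
    """Return (bucket, key) pairs for the objects named in an S3 event."""
    created = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded, for example with spaces as '+'.
        key = unquote_plus(record["s3"]["object"]["key"])
        created.append((bucket, key))
        print("New object: s3://{}/{}".format(bucket, key))
    return created


# Trimmed-down sample event with made-up bucket and key names.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-userfiles"},
                "object": {"key": "uploads/my+photo.jpg"}}}
    ]
}

if __name__ == "__main__":
    handle_object_created(sample_event, None)
```

Reading the object itself would still require an IAM role with permission on the bucket, which is exactly where my sample ran into trouble.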

CloudWatch metrics for Lambda functions are automatically displayed in a cloud dashboard, as shown in Figure 4. No setup was required. Note the invocation error, which was caused by the permissions restriction on the S3 bucket imposed by the Mobile Hub.


Figure 4: CloudWatch metrics for Lambda functions are automatically displayed in a cloud dashboard.

How Lambda works

At an oversimplified level, the way AWS Lambda works is as shown in Figure 5. You write some code, set it to trigger, and the code runs when triggered. Then you pay for the CPU and memory used.


Figure 5: How AWS Lambda works.

At the implementation level there are a few additional details to consider. To begin with, when you’re setting up a Lambda function, you need to set the RAM allocation and the time limit for each invocation. You would think that these are independent, but they’re not. AWS Lambda allocates CPU power proportional to the memory by using the same ratio as a general-purpose Amazon EC2 instance type, such as an M3 type. Network bandwidth and disk I/O also scale with CPU power and RAM.

Table 1 was extracted from the Amazon EC2 instance creation dialogs:

Lambda function memory allocations range from 128MB to 1,536MB. You would expect a function allocated 1.5GB to get less than half a vCPU, and a function allocated 128MB to get about 1/25th of a vCPU. I can’t imagine that it actually works like that. I would expect AWS to load as many containers of the proper size as will fit in a VM instance and allow the function in each container to share CPU with the other running functions.

But I speculate. My point is that functions with more RAM will get more CPU as well, so there may be cases where performance requirements force you to allocate more RAM to get the CPU resources that allow the function to run in less than the maximum latency you need. Otherwise, you’ll usually want to be a good citizen and keep your RAM allocation fairly tight. Also, there’s no guarantee that a given Lambda function will be CPU-bound if it calls other services that use storage.
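To make the proportionality concrete, here is a back-of-the-envelope sketch. It assumes a ratio of 3.75GB of RAM per vCPU, which is what an m3.medium offers; that ratio and the function name are my assumptions for the example, and the real scheduler may well behave differently, as speculated above.

```python
# Assumed ratio: an m3.medium offers 1 vCPU and 3.75GB of RAM.
ASSUMED_MB_PER_VCPU = 3.75 * 1024


def estimated_vcpu_share(memory_mb, mb_per_vcpu=ASSUMED_MB_PER_VCPU):
    """Estimate the fraction of a vCPU implied by a memory allocation."""
    return memory_mb / mb_per_vcpu


for memory_mb in (128, 512, 1024, 1536):
    share = estimated_vcpu_share(memory_mb)
    print("{:>5} MB -> about {:.3f} vCPU".format(memory_mb, share))
```

Under that assumption, the largest allocation works out to less than half a vCPU and the smallest to a few hundredths of one, in the same neighborhood as the estimates above.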

Invoking Lambda

Figure 3, above, shows how a Lambda function can be triggered by a single fully specified event source (a put action on a specific S3 bucket). There is a reasonably wide range of event sources, as shown in Figure 6 below.


Figure 6: Lambda functions may be called by the AWS mobile SDK, from scheduled events, or from events from nine AWS services.

Lambda functions may be called by the AWS mobile SDK, by scheduled events, or by events from nine AWS services. Notice that while DynamoDB and Kinesis can generate events to be handled by Lambda functions, triggers from RDS databases such as MySQL cannot currently be handled by Lambda functions. That’s understandable, as relational databases run stored procedures internally to handle triggers, but I’d suggest that the ability to call a Lambda function from a relational database trigger might turn out to be useful in the future.

AWS Lambda imposes a default safety throttle of 100 concurrent function executions per account. This is primarily to protect you from runaway or recursive functions during initial development and testing. You can request a higher service limit if needed, and AWS itself may automatically raise the throttle limits to enable your function to match the incoming event rate, as in the case of triggering the function from an Amazon S3 bucket.
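A quick way to judge whether a workload is likely to bump into that throttle is to multiply the event rate by the average time each invocation takes. The sketch below applies that rule of thumb (it is Little’s law, not a formula taken from this review); the function names and the sample numbers are hypothetical.

```python
import math

# Per-account safety throttle cited in the text.
DEFAULT_CONCURRENCY_LIMIT = 100


def estimated_concurrency(events_per_second, avg_duration_seconds):
    """Rough steady-state concurrency: arrival rate times time per invocation."""
    return math.ceil(events_per_second * avg_duration_seconds)


def fits_under_limit(events_per_second, avg_duration_seconds,
                     limit=DEFAULT_CONCURRENCY_LIMIT):
    """Report whether the estimated concurrency stays within the throttle."""
    return estimated_concurrency(events_per_second, avg_duration_seconds) <= limit


# 200 events per second at a quarter of a second each: about 50 at once.
print(estimated_concurrency(200, 0.25), fits_under_limit(200, 0.25))
# 50 events per second at 3 seconds each: about 150 at once.
print(estimated_concurrency(50, 3), fits_under_limit(50, 3))
```

The second case is the kind of workload for which you would want to request a higher limit before going live.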

There are a few more points you should understand about Lambda, which I’ll touch on briefly. First, there are two kinds of Lambda function invocations: Event and RequestResponse. Event invocations are made automatically and asynchronously by other AWS services, and your function doesn’t need to return a response. RequestResponse calls are synchronous, and they return a response. They are invoked via the AWS console or over HTTPS using the Amazon API Gateway.
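The two invocation types also show up when you call a function from your own code. The sketch below assumes a client object that behaves like the Lambda client in the boto3 SDK for Python; the helper name and its parameters are mine, and the client is passed in so the helper can be tried with a stand-in.

```python
import json


def invoke_function(client, function_name, payload, wait_for_result=True):
    """Invoke a Lambda function synchronously or asynchronously.

    `client` is assumed to behave like boto3.client("lambda").
    """
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse" if wait_for_result else "Event",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    if not wait_for_result:
        # Event invocations are queued; there is no function result to read.
        return None
    # For RequestResponse calls, the result arrives as a readable stream.
    return json.loads(response["Payload"].read())
```

With boto3 installed and credentials configured, something like invoke_function(boto3.client("lambda"), "my-function", {"name": "Lambda"}) would make a synchronous call, where my-function is a placeholder for a function you have deployed.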

Lambda supports both pull and push event models. That’s really a technical detail that depends on the service generating the event, not anything for which you have to write code.

Monitoring Lambda

CloudWatch logs are automatically generated by Lambda calls. In addition, AWS Lambda is integrated with AWS CloudTrail, a service you can enable that captures API calls made by or on behalf of AWS Lambda in your AWS account and delivers the log files to an Amazon S3 bucket.
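Those logs are what you would mine to tune the memory and time limits. As a sketch, the snippet below pulls the duration and memory figures out of a per-invocation summary line. The sample line, including its request ID and numbers, is made up, and the field layout is assumed from that sample, so adjust the pattern to match what your own logs contain.

```python
import re

# Pattern for the summary line; the field layout is assumed from the sample below.
REPORT_PATTERN = re.compile(
    r"Duration: (?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed_ms>\d+) ms\s+"
    r"Memory Size: (?P<memory_size_mb>\d+) MB\s+"
    r"Max Memory Used: (?P<max_memory_used_mb>\d+) MB"
)


def parse_report_line(line):
    """Return the numeric fields of a summary line, or None if it doesn't match."""
    match = REPORT_PATTERN.search(line)
    if match is None:
        return None
    return {name: float(value) for name, value in match.groupdict().items()}


# Made-up sample line for illustration.
sample = ("REPORT RequestId: 00000000-0000-0000-0000-000000000000\t"
          "Duration: 212.48 ms\tBilled Duration: 300 ms \t"
          "Memory Size: 512 MB\tMax Memory Used: 37 MB")

print(parse_report_line(sample))
```

A function that never uses more than a small fraction of its allocation, as in this sample, is a candidate for a tighter memory limit, provided the reduced CPU share that comes with it is acceptable.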

While I did my screenshots from the Lambda console, there is an AWS CLI that has substantially the same functionality, except it’s more like coding and less like a Where’s Waldo puzzle. The CLI is covered in some of the Lambda walkthroughs. It’s a Python program, which I installed using pip:

       Martins-iMac:~ mheller$ sudo pip install awscli

You can view its help file with aws help.

Overall, I like the AWS Lambda service, and I can see why it has become popular for implementing back ends, processing real-time streams, adding scalable event handling to websites, and performing ETL operations. The major reason not to use Lambda for those types of operations would be the restricted list of supported languages. Although there’s a long list of features that are already well-implemented in Node.js, Java, and Python, it’s by no means complete.

For example, if your goal is to perform Fast Fourier Transform (FFT) analysis on a real-time data stream to calculate and display frequency spectra, you might want to use the standard FFTW routine, which is written in C, along with a graphing package in, say, the R language. Or you might want to use a GPU-accelerated FFT, running in a GPU instance. Neither option currently lends itself to AWS Lambda.

Still, many common operations do lend themselves to running in AWS Lambda, and Lambda is an option to keep in mind as an alternative to provisioning VMs or containers.

InfoWorld Scorecard: AWS Lambda
  • Ease of development (20%): 8
  • Language support (20%): 7
  • Monitoring (20%): 9
  • Scalability (20%): 10
  • Performance (10%): 9
  • Value (10%): 9
  • Overall Score (100%): 8.6
At a Glance
  • AWS Lambda runs function code in Node.js, Java, and Python without provisioning or managing servers, with essentially unbounded scalability. You are only charged for the gigabyte-seconds you use.

    Pros

    • Runs code without provisioning or managing servers
    • Supports functions written in Node.js, Java 8, and Python 2.7
    • Charged for only the gigabyte-seconds you use
    • Essentially unbounded scalability

    Cons

    • No control over the server environment
    • No retention of state between invocations
    • Supports functions written in only Node.js, Java 8, and Python 2.7
    • No discounts for volume

Copyright © 2015 IDG Communications, Inc.