All posts by Bryan Liston

ServerlessConf and More!

Post Syndicated from Bryan Liston original

ServerlessConf Austin

ServerlessConf Austin is just around the corner! April 26-28th come join us in Austin at the Zach Topfer Theater. Our very own Tim Wagner, Chris Munns and Randall Hunt will be giving some great talks.

Serverlessconf is a community led conference focused on sharing experiences building applications using serverless architectures. Serverless architectures enable developers to express their creativity and focus on user needs instead of spending time managing infrastructure and servers.

Tim Wagner, GM Serverless Applications, will be giving a keynote on Friday the 28th, do not miss this!!!
Chris Munns, Sr. Developer Advocate, will be giving an excellent talk on CI/CD for Serverless Applications.

Check out the full agenda here!

AWS Serverless Updates and More!

Incase you’ve missed out lately on some of our new content such as our new YouTube series “Coding with Sam”, or our new Serverless Special AWS Podcast Series, check them out!

Meet SAM!

We’ve recently come out with a new branding for AWS SAM (Serverless Application Model), so please join me in welcoming SAM the Squirrel!

The goal of AWS SAM is to define a standard application model for serverless applications.

Once again, don’t hesitate to reach out if you have questions, comments, or general feedback.


How to remove boilerplate validation logic in your REST APIs with Amazon API Gateway request validation

Post Syndicated from Bryan Liston original

Ryan Green, Software Development Engineer

Does your API suffer from code bloat or wasted developer time due to implementation of simple input validation rules? One of the necessary but least exciting aspects of building a robust REST API involves implementing basic validation of input data to your API. In addition to increasing the size of the code base, validation logic may require taking on extra dependencies and requires diligence in ensuring the API implementation doesn’t get out of sync with API request/response models and SDKs.

Amazon API Gateway recently announced the release of request validators, a simple but powerful new feature that should help to liberate API developers from the undifferentiated effort of implementing basic request validation in their API backends.

This feature leverages API Gateway models to enable the validation of request payloads against the specified schema, including validation rules as defined in the JSON-Schema Validation specification. Request validators also support basic validation of required HTTP request parameters in the URI, query string, and headers.

When a validation failure occurs, API Gateway fails the API request with an HTTP 400 error, skips the request to the backend integration, and publishes detailed error results in Amazon CloudWatch Logs.

In this post, I show two examples using request validators, validating the request body and the request parameters.

Example: Validating the request body

For this example, you build a simple API for a simulated stock trading system. This API has a resource, "/orders", that represents stock purchase orders. An HTTP POST to this resource allows the client to initiate one or more orders.

A sample request might look like this:

POST /orders

    "account-id": "abcdef123456",
    "type": "STOCK",
    "symbol": "AMZN",
    "shares": 100,
    "details": {
      "limit": 1000
    "account-id": "zyxwvut987654",
    "type": "STOCK",
    "symbol": "BA",
    "shares": 250,
    "details": {
      "limit": 200

The JSON-Schema for this request body might look something like this:

  "$schema": "",
  "title": "Create Orders Schema",
  "type": "array",
  "minItems": 1,
  "items": {
    "type": "object",
    "required": [
    "properties": {
      "account_id": {
        "type": "string",
        "pattern": "[A-Za-z]{6}[0-9]{6}"
      "type": {
        "type": "string",
        "enum": [
      "symbol": {
        "type": "string",
        "minLength": 1,
        "maxLength": 4
      "shares": {
        "type": "number",
        "minimum": 1,
        "maximum": 1000
      "details": {
        "type": "object",
        "required": [
        "properties": {
          "limit": {
            "type": "number"

This schema defines the "shape" of the request model but also defines several constraints on the various properties. Here are the validation rules for this schema:

  • The root array must have at least 1 item
  • All properties are required
  • Account ID must match the regular expression format "[A-Za-z]{6}[0-9]{6}"
  • Type must be one of STOCK, BOND, or CASH
  • Symbol must be a string between 1 and 4 characters
  • Shares must be a number between 1 and 1000

I’m sure you can imagine how this would look in your validation library of choice, or at worst, in a hand-coded implementation.

Now, try this out with API Gateway request validators. The Swagger definition below defines the REST API, models, and request validators. Its two operations define simple mock integrations to simulate behavior of the stock trading API.

Note the request validator definitions under the "x-amazon-apigateway-request-validators" extension, and the references to these validators defined on the operation and on the API.

  "swagger": "2.0",
  "info": {
    "title": "API Gateway - Request Validation Demo - [email protected]"
  "schemes": [
  "produces": [
  "x-amazon-apigateway-request-validators" : {
    "full" : {
      "validateRequestBody" : true,
      "validateRequestParameters" : true
    "body-only" : {
      "validateRequestBody" : true,
      "validateRequestParameters" : false
  "x-amazon-apigateway-request-validator" : "full",
  "paths": {
    "/orders": {
      "post": {
        "x-amazon-apigateway-request-validator": "body-only",
        "parameters": [
            "in": "body",
            "name": "CreateOrders",
            "required": true,
            "schema": {
              "$ref": "#/definitions/CreateOrders"
        "responses": {
          "200": {
            "schema": {
              "$ref": "#/definitions/Message"
          "400" : {
            "schema": {
              "$ref": "#/definitions/Message"
        "x-amazon-apigateway-integration": {
          "responses": {
            "default": {
              "statusCode": "200",
              "responseTemplates": {
                "application/json": "{\"message\" : \"Orders successfully created\"}"
          "requestTemplates": {
            "application/json": "{\"statusCode\": 200}"
          "passthroughBehavior": "never",
          "type": "mock"
      "get": {
        "parameters": [
            "in": "header",
            "name": "Account-Id",
            "required": true
            "in": "query",
            "name": "type",
            "required": false
        "responses": {
          "200" : {
            "schema": {
              "$ref": "#/definitions/Orders"
          "400" : {
            "schema": {
              "$ref": "#/definitions/Message"
        "x-amazon-apigateway-integration": {
          "responses": {
            "default": {
              "statusCode": "200",
              "responseTemplates": {
                "application/json": "[{\"order-id\" : \"qrx987\",\n   \"type\" : \"STOCK\",\n   \"symbol\" : \"AMZN\",\n   \"shares\" : 100,\n   \"time\" : \"1488217405\",\n   \"state\" : \"COMPLETED\"\n},\n{\n   \"order-id\" : \"foo123\",\n   \"type\" : \"STOCK\",\n   \"symbol\" : \"BA\",\n   \"shares\" : 100,\n   \"time\" : \"1488213043\",\n   \"state\" : \"COMPLETED\"\n}\n]"
          "requestTemplates": {
            "application/json": "{\"statusCode\": 200}"
          "passthroughBehavior": "never",
          "type": "mock"
  "definitions": {
    "CreateOrders": {
      "$schema": "",
      "title": "Create Orders Schema",
      "type": "array",
      "minItems" : 1,
      "items": {
        "type": "object",
        "$ref" : "#/definitions/Order"
    "Orders" : {
      "type": "array",
      "$schema": "",
      "title": "Get Orders Schema",
      "items": {
        "type": "object",
        "properties": {
          "order_id": { "type": "string" },
          "time" : { "type": "string" },
          "state" : {
            "type": "string",
            "enum": [
          "order" : {
            "$ref" : "#/definitions/Order"
    "Order" : {
      "type": "object",
      "$schema": "",
      "title": "Schema for a single Order",
      "required": [
      "properties" : {
        "account-id": {
          "type": "string",
          "pattern": "[A-Za-z]{6}[0-9]{6}"
        "type": {
          "type" : "string",
          "enum" : [
        "symbol" : {
          "type": "string",
          "minLength": 1,
          "maxLength": 4
        "shares": {
          "type": "number",
          "minimum": 1,
          "maximum": 1000
        "details": {
          "type": "object",
          "required": [
          "properties": {
            "limit": {
              "type": "number"
    "Message": {
      "type": "object",
      "properties": {
        "message" : {
          "type" : "string"

To create the demo API, run the following commands (requires the AWS CLI):

git clone
cd apigateway-validation-demo
aws apigateway import-rest-api --body "file://validation-swagger.json" --region us-east-1
export API_ID=[API ID from last step]
aws apigateway create-deployment --rest-api-id $API_ID --stage-name test --region us-east-1

Make some requests to this API. Here’s the happy path with valid request body:

curl -v -H "Content-Type: application/json" -X POST -d ' [  
]' https://$


HTTP/1.1 200 OK

{"message" : "Orders successfully created"}

Put the request validator to the test. Notice the errors in the payload:

curl -v -H "Content-Type: application/json" -X POST -d '[
    "account-id": "abcdef123456",
    "type": "foobar",
    "symbol": "thisstringistoolong",
    "shares": 999999,
    "details": {
       "limit": 1000
]' https://$


HTTP/1.1 400 Bad Request

{"message": "Invalid request body"}

When you inspect the CloudWatch Logs entries for this API, you see the detailed error messages for this payload. Run the following command:

pip install apilogs

apilogs get --api-id $API_ID --stage test --watch --region us-east-1`

The CloudWatch Logs entry for this request reveals the specific validation errors:

"Request body does not match model schema for content type application/json: [numeric instance is greater than the required maximum (maximum: 1000, found: 999999), string "thisstringistoolong" is too long (length: 19, maximum allowed: 4), instance value ("foobar") not found in enum (possible values: ["STOCK","BOND","CASH"])]"

Note on Content-Type: 

Request body validation is performed according to the configured request Model which is selected by the value of the request ‘Content-Type’ header. In order to enforce validation and restrict requests to explicitly-defined content types, it’s a good idea to use strict request passthrough behavior (‘"passthroughBehavior": "never"’), so that unsupported content types fail with 415 "Unsupported Media Type" response.

Example: Validating the request parameters

For the next example, add a GET method to the /orders resource that returns the list of purchase orders. This method has an optional query string parameter (type) and a required header parameter (Account-Id).

The request validator configured for the GET method is set to validate incoming request parameters. This performs basic validation on the required parameters, ensuring that the request parameters are present and non-blank.

Here are some example requests.

Happy path:

curl -v -H "Account-Id: abcdef123456" "https://$"


HTTP/1.1 200 OK

[{"order-id" : "qrx987",
   "type" : "STOCK",
   "symbol" : "AMZN",
   "shares" : 100,
   "time" : "1488217405",
   "state" : "COMPLETED"
   "order-id" : "foo123",
   "type" : "STOCK",
   "symbol" : "BA",
   "shares" : 100,
   "time" : "1488213043",
   "state" : "COMPLETED"

Omitting optional type parameter:

curl -v -H "Account-Id: abcdef123456" "https://$"


HTTP/1.1 200 OK

[{"order-id" : "qrx987",
   "type" : "STOCK",
   "symbol" : "AMZN",
   "shares" : 100,
   "time" : "1488217405",
   "state" : "COMPLETED"
   "order-id" : "foo123",
   "type" : "STOCK",
   "symbol" : "BA",
   "shares" : 100,
   "time" : "1488213043",
   "state" : "COMPLETED"

Omitting required Account-Id parameter:

curl -v "https://$"


HTTP/1.1 400 Bad Request

{"message": "Missing required request parameters: [Account-Id]"}


Request validators should help API developers to build better APIs by allowing them to remove boilerplate validation logic from backend implementations and focus on actual business logic and deep validation. This should further reduce the size of the API codebase and also help to ensure that API models and validation logic are kept in sync. 

Please forward any questions or feedback to the API Gateway team through AWS Support or on the AWS Forums.

Scaling Your Desktop Application Streams with Amazon AppStream 2.0

Post Syndicated from Bryan Liston original

Want to stream desktop applications to a web browser, without rewriting them? Amazon AppStream 2.0 is a fully managed, secure, application streaming service. An easy way to learn what the service does is to try out the end-user experience, at no cost.

In this post, I describe how you can scale your AppStream 2.0 environment, and achieve some cost optimizations. I also add some setup and monitoring tips.

AppStream 2.0 workflow

You import your applications into AppStream 2.0 using an image builder. The image builder allows you to connect to a desktop experience from within the AWS Management Console, and then install and test your apps. Then, create an image that is a snapshot of the image builder.

After you have an image containing your applications, select an instance type and launch a fleet of streaming instances. Each instance in the fleet is used by only one user, and you match the instance type used in the fleet to match the needed application performance. Finally, attach the fleet to a stack to set up user access. The following diagram shows the role of each resource in the workflow.

Figure 1: Describing an AppStream 2.0 workflow


Setting up AppStream 2.0

To get started, set up an example AppStream 2.0 stack or use the Quick Links on the console. For this example, I named my stack ds-sample, selected a sample image, and chose the stream.standard.medium instance type. You can explore the resources that you set up in the AWS console, or use the describe-stacks and describe-fleets commands as follows:

Figure 2: Describing an AppStream 2.0 stack


Figure 3: Describing an AppStream 2.0 fleet


To set up user access to your streaming environment, you can use your existing SAML 2.0 compliant directory. Your users can then use their existing credentials to log in. Alternatively, to quickly test a streaming connection, or to start a streaming session from your own website, you can create a streaming URL. In the console, choose Stacks, Actions, Create URL, or call create-streaming-url as follows:

Figure 4: Creating a streaming URL


You can paste the streaming URL into a browser, and open any of the displayed applications.


Now that you have a sample environment set up, here are a few tips on scaling.

Scaling and cost optimization for AppStream 2.0

To provide an instant-on streaming connection, the instances in an AppStream 2.0 fleet are always running. You are charged for running instances, and each running instance can serve exactly one user at any time. To optimize your costs, match the number of running instances to the number of users who want to stream apps concurrently. This section walks through three options for doing this:

  • Fleet Auto Scaling
  • Fixed fleets based on a schedule
  • Fleet Auto Scaling with schedules

Fleet Auto Scaling

To dynamically update the number of running instances, you can use Fleet Auto Scaling. This feature allows you to scale the size of the fleet automatically between a minimum and maximum value based on demand. This is useful if you have user demand that changes constantly, and you want to scale your fleet automatically to match this demand. For examples about setting up and managing scaling policies, see Fleet Auto Scaling.

You can trigger changes to the fleet through the available Amazon CloudWatch metrics:

  • CapacityUtilization – the percentage of running instances already used.
  • AvailableCapacity – the number of instances that are unused and can receive connections from users.
  • InsufficientCapacityError – an error that is triggered when there is no available running instance to match a user’s request.

You can create and attach scaling policies using the AWS SDK or AWS Management Console. I find it convenient to set up the policies using the console. Use the following steps:

  1. In the AWS Management Console, open AppStream 2.0.
  2. Choose Fleets, select a fleet, and choose Scaling Policies.
  3. For Minimum capacity and Maximum capacity, enter values for the fleet.

Figure 5: Fleets tab for setting scaling policies


  1. Create scale out and scale in policies by choosing Add Policy in each section.

Figure 6: Adding a scale out policy


Figure 7: Adding a scale in policy


After you create the policies, they are displayed as part of your fleet details.


The scaling policies are triggered by CloudWatch alarms. These alarms are automatically created on your behalf when you create the scaling policies using the console. You can view and modify the alarms via the CloudWatch console.

Figure 8: CloudWatch alarms for triggering fleet scaling


Fixed fleets based on a schedule

An alternative option to optimize costs and respond to predictable demand is to fix the number of running instances based on the time of day or day of the week. This is useful if you have a fixed number of users signing in at different times of the day― scenarios such as a training classes, call center shifts, or school computer labs. You can easily set the number of instances that are running using the AppStream 2.0 update-fleet command. Update the Desired value for the compute capacity of your fleet. The number of Running instances changes to match the Desired value that you set, as follows:

Figure 9: Updating desired capacity for your fleet


Set up a Lambda function to update your fleet size automatically. Follow the example below to set up your own functions. If you haven’t used Lambda before, see Step 2: Create a HelloWorld Lambda Function and Explore the Console.

To create a function to change the fleet size

  1. In the Lambda console, choose Create a Lambda function.
  2. Choose the Blank Function blueprint. This gives you an empty blueprint to which you can add your code.
  3. Skip the trigger section for now. Later on, you can add a trigger based on time, or any other input.
  4. In the Configure function section:
    1. Provide a name and description.
    2. For Runtime, choose Node.js 4.3.
    3. Under Lambda function handler and role, choose Create a custom role.
    4. In the IAM wizard, enter a role name, for example Lambda-AppStream-Admin. Leave the defaults as is.
    5. After the IAM role is created, attach an AppStream 2.0 managed policy “AmazonAppStreamFullAccess” to the role. For more information, see Working with Managed Policies. This allows Lambda to call the AppStream 2.0 API on your behalf. You can edit and attach your own IAM policy, to limit access to only actions you would like to permit. To learn more, see Controlling Access to Amazon AppStream 2.0.
    6. Leave the default values for the rest of the fields, and choose Next, Create function.
  5. To change the AppStream 2.0 fleet size, choose Code and add some sample code, as follows:
    'use strict';
    This AppStream2 Update-Fleet blueprint sets up a schedule for a streaming fleet
    const AWS = require('aws-sdk');
    const appstream = new AWS.AppStream();
    const fleetParams = {
      Name: 'ds-sample-fleet', /* required */
      ComputeCapacity: {
        DesiredInstances: 1 /* required */
    exports.handler = (event, context, callback) => {
        console.log('Received event:', JSON.stringify(event, null, 2));
        var resource = event.resources[0];
        var increase = resource.includes('weekday-9am-increase-capacity')
        try {
            if (increase) {
                fleetParams.ComputeCapacity.DesiredInstances = 3
            } else {
                fleetParams.ComputeCapacity.DesiredInstances = 1
            appstream.updateFleet(fleetParams, (error, data) => {
                if (error) {
                    console.log(error, error.stack);
                    return callback(error);
                return callback(null, data);
        } catch (error) {
            console.log('Caught Error: ', error);

  6. Test the code. Choose Test and use the “Hello World” test template. The first time you do this, choose Save and Test. Create a test input like the following to trigger the scaling update.


  7. You see output text showing the result of the update-fleet call. You can also use the CLI to check the effect of executing the Lambda function.

Next, to set up a time-based schedule, set a trigger for invoking the Lambda function.

To set a trigger for the Lambda function

  1. Choose Triggers, Add trigger.
  2. Choose CloudWatch Events – Schedule.
  3. Enter a rule name, such as “weekday-9am-increase-capacity”, and a description. For Schedule expression, choose cron. You can edit the value for the cron later.
  4. After the trigger is created, open the event weekday-9am-increase-capacity.
  5. In the CloudWatch console, edit the event details. To scale out the fleet at 9 am on a weekday, you can adjust the time to be: 00 17 ? * MON-FRI *. (If you’re not in Seattle (Pacific Time Zone), change this to another specific time zone).
  6. You can also add another event that triggers at the end of a weekday.


This setup now triggers scale-out and scale-in automatically, based on the time schedule that you set.

Fleet Auto Scaling with schedules

You can choose to combine both the fleet scaling and time-based schedule approaches to manage more complex scenarios. This is useful to manage the number of running instances based on business and non-business hours, and still respond to changes in demand. You could programmatically change the minimum and maximum sizes for your fleet based on time of day or day of week, and apply the default scale-out or scale-in policies. This allows you to respond to predictable minimum demand based on a schedule.

For example, at the start of a work day, you might expect a certain number of users to request streaming connections at one time. You wouldn’t want to wait for the fleet to scale out and meet this requirement. However, during the course of the day, you might expect the demand to scale in or out, and would want to match the fleet size to this demand.

To achieve this, set up the scaling polices via the console, and create a Lambda function to trigger changes to the minimum, maximum, and desired capacity for your fleet based on a schedule. Replace the code for the Lambda function that you created earlier with the following code:

'use strict';

This AppStream2 Update-Fleet function sets up a schedule for a streaming fleet

const AWS = require('aws-sdk');
const appstream = new AWS.AppStream();
const applicationAutoScaling = new AWS.ApplicationAutoScaling();

const fleetParams = {
  Name: 'ds-sample-fleet', /* required */
  ComputeCapacity: {
    DesiredInstances: 1 /* required */

var scalingParams = {
  ResourceId: 'fleet/ds-sample-fleet', /* required - fleet name*/
  ScalableDimension: 'appstream:fleet:DesiredCapacity', /* required */
  ServiceNamespace: 'appstream', /* required */
  MaxCapacity: 1,
  MinCapacity: 6,
  RoleARN: 'arn:aws:iam::659382443255:role/service-role/ApplicationAutoScalingForAmazonAppStreamAccess'

exports.handler = (event, context, callback) => {
    console.log('Received this event now:', JSON.stringify(event, null, 2));
    var resource = event.resources[0];
    var increase = resource.includes('weekday-9am-increase-capacity')

    try {
        if (increase) {
            //usage during business hours - start at capacity of 10 and scale
            //if required. This implies at least 10 users can connect instantly. 
            //More users can connect as the scaling policy triggers addition of
            //more instances. Maximum cap is 20 instances - fleet will not scale
            //beyond 20. This is the cap for number of users.
            fleetParams.ComputeCapacity.DesiredInstances = 10
            scalingParams.MinCapacity = 10
            scalingParams.MaxCapacity = 20
        } else {
            //usage during non-business hours - start at capacity of 1 and scale
            //if required. This implies only 1 user can connect instantly. 
            //More users can connect as the scaling policy triggers addition of
            //more instances. 
            fleetParams.ComputeCapacity.DesiredInstances = 1
            scalingParams.MinCapacity = 1
            scalingParams.MaxCapacity = 10
        //Update minimum and maximum capacity used by the scaling policies
        applicationAutoScaling.registerScalableTarget(scalingParams, (error, data) => {
             if (error) console.log(error, error.stack); 
             else console.log(data);                     
        //Update the desired capacity for the fleet. This sets 
        //the number of running instances to desired number of instances
        appstream.updateFleet(fleetParams, (error, data) => {
            if (error) {
                console.log(error, error.stack);
                return callback(error);

            return callback(null, data);
    } catch (error) {
        console.log('Caught Error: ', error);

Note: To successfully execute this code, you need to add IAM policies to the role used by the Lambda function. The policies allow Lambda to call the Application Auto Scaling service on your behalf.

Figure 10: Inline policies for using Application Auto Scaling with Lambda

"Version": "2012-10-17",
"Statement": [
      "Effect": "Allow", 
         "Action": [
         "Resource": "*"
"Version": "2012-10-17",
"Statement": [
      "Effect": "Allow", 
         "Action": [
         "Resource": "*"

Monitoring usage

After you have set up scaling for your fleet, you can use CloudWatch metrics with AppStream 2.0, and create a dashboard for monitoring. This helps optimize your scaling policies over time based on the amount of usage that you see.

For example, if you were very conservative with your initial set up and over-provisioned resources, you might see long periods of low fleet utilization. On the other hand, if you set the fleet size too low, you would see high utilization or errors from insufficient capacity, which would block users’ connections. You can view CloudWatch metrics for up to 15 months, and drive adjustments to your fleet scaling policy.

Figure 11: Dashboard with custom Amazon CloudWatch metrics



These are just a few ideas for scaling AppStream 2.0 and optimizing your costs. Let us know if these are useful, and if you would like to see similar posts. If you have comments about the service, please post your feedback on the AWS forum for AppStream 2.0.

A Serverless Authentication System by Jumia

Post Syndicated from Bryan Liston original

Jumia Group is an ecosystem of nine different companies operating in 22 different countries in Africa. Jumia employs 3000 people and serves 15 million users/month.

Want to secure and centralize millions of user accounts across Africa? Shut down your servers! Jumia Group unified and centralized customer authentication on nine digital services platforms, operating in 22 (and counting) countries in Africa, totaling over 120 customer and merchant facing applications. All were unified into a custom Jumia Central Authentication System (JCAS), built in a timely fashion and designed using a serverless architecture.

In this post, we give you our solution overview. For the full technical implementation, see the Jumia Central Authentication System post on the Jumia Tech blog.

The challenge

A group-level initiative was started to centralize authentication for all Jumia users in all countries for all companies. But it was impossible to unify the operational databases for the different companies. Each company had its own user database with its own technological implementation. Each company alone had yet to unify the logins for their own countries. The effects of deduplicating all user accounts were yet to be determined but were considered to be large. Finally, there was no team responsible for manage this new project, given that a new dedicated infrastructure would be needed.

With these factors in mind, we decided to design this project as a serverless architecture to eliminate the need for infrastructure and a dedicated team. AWS was immediately considered as the best option for its level of
service, intelligent pricing model, and excellent serverless services.

The goal was simple. For all user accounts on all Jumia Group websites:

  • Merge duplicates that might exist in different companies and countries
  • Create a unique login (username and password)
  • Enrich the user profile in this centralized system
  • Share the profile with all Jumia companies


We had the following initial requirements while designing this solution on the AWS platform:

  • Secure by design
  • Highly available via multimaster replication
  • Single login for all platforms/countries
  • Minimal performance impact
  • No admin overhead

Chosen technologies

We chose the following AWS services and technologies to implement our solution.

Amazon API Gateway

Amazon API Gateway is a fully managed service, making it really simple to set up an API. It integrates directly with AWS Lambda, which was chosen as our endpoint. It can be easily replicated to other regions, using Swagger import/export.

AWS Lambda

AWS Lambda is the base of serverless computing, allowing you to run code without worrying about infrastructure. All our code runs on Lambda functions using Python; some functions are
called from the API Gateway, others are scheduled like cron jobs.

Amazon DynamoDB

Amazon DynamoDB
is a highly scalable, NoSQL database with a good API and a clean pricing model. It has great scalability as well as high availability, and fits the serverless model we aimed for with JCAS.


AWS KMS was chosen as a key manager to perform envelope encryption. It’s a simple and secure key manager with multiple encryption functionalities.

Envelope encryption

Oversimplifying envelope encryption is when you encrypt a key rather than the data itself. That encrypted key, which was used to encrypt the data, may now be stored with the data itself on your persistence layer since it doesn’t decrypt the data if compromised. For more information, see How Envelope Encryption Works with Supported AWS Services.

Envelope encryption was chosen given that master keys have a 4 KB limit for data to be encrypted or decrypted.

Amazon SQS

Amazon SQS is an inexpensive queuing service with dynamic scaling, 14-day retention availability, and easy management. It’s the perfect choice for our needs, as we use queuing systems only as a
fallback when saving data to remote regions fails. All the features needed for those fallback cases were covered.


We also use JSON web tokens for encoding and signing communications between JCAS and company servers. It’s another layer of security for data in transit.

Data structure

Our database design was pretty straightforward. DynamoDB records can be accessed by the primary key and allow access using secondary indexes as well. We created a UUID for each user on JCAS, which is used as a primary key
on DynamoDB.

Most data must be encrypted so we use a single field for that data. On the other hand, there’s data that needs to be stored in separate fields as they need to be accessed from the code without decryption for lookups or basic checks. This indexed data was also stored, as a hash or plain, outside the main ciphered blob store.

We used a field for each searchable data piece:

  • Id (primary key)
  • main phone (indexed)
  • main email (indexed)
  • account status
  • document timestamp

To hold the encrypted data we use a data dictionary with two main dictionaries, info with the user’s encrypted data and keys with the key for each AWS KMS region to decrypt the info blob.
Passwords are stored in two dictionaries, old_hashes contains the legacy hashes from the origin systems and secret holds the user’s JCAS password.

Here’s an example:


Security is like an onion, it needs to be layered. That’s what we did when designing this solution. Our design makes all of our data unreadable at each layer of this solution, while easing our compliance needs.

A field called data stores all personal information from customers. It’s encrypted using AES256-CBC with a key generated and managed by AWS KMS. A new data key is used for each transaction. For communication between companies and the API, we use API Keys, TLS and JWT, in the body to ensure that the post is signed and verified.

Data flow

Our second requirement on JCAS was system availability, so we designed some data pipelines. These pipelines allow multi-region replication, which evades collision using idempotent operations on all endpoints. The only technology added to the stack was Amazon SQS. On SQS queues, we place all the items we aren’t able to replicate at the time of the client’s request.

JCAS may have inconsistencies between regions, caused by network or region availability; by using SQS, we have workers that synchronize all regions as soon as possible.

Example with two regions

We have three Lambda functions and two SQS queues, where:

1) A trigger Lambda function is called for the DynamoDB stream. Upon changes to the user’s table, it tries to write directly to another DynamoDB table in the second region, falling back to writing to two SQS queues.


2) A scheduled Lambda function (cron-style) checks a SQS queue for items and tries writing them to the DynamoDB table that potentially failed.


3) A cron-style Lambda function checks the SQS queue, calling KMS for any items, and fixes the issue.


The following diagram shows the full infrastrucure (for clarity, this diagram leaves out KMS recovery).

CAS Full Diagram


Upon going live, we noticed a minor impact in our response times. Note the brown legend in the images below.


This was a cold start, as the infrastructure started to get hot, response time started to converge. On 27 April, we were almost at the initial 500ms.

It kept steady on values before JCAS went live (≈500ms).

As of the writing of this post, our response time kept improving (dev changed the method name and we changed the subdomain name).

Customer login used to take ≈500ms and it still takes ≈500ms with JCAS. Those times have improved as other components changed inside our code.

Lessons learned

  • Instead of the standard cross-region replication, creating your own DynamoDB cross-region replication might be a better fit for you, if your data and applications allow it.
  • Take some time to tweak the Lambda runtime memory. Remember that it’s billed per 100ms, so it saves you money if you have it run near a round number.
  • KMS takes away the problem of key management with great security by design. It really simplifies your life.
  • Always check the timestamp before operating on data. If it’s invalid, save the money by skipping further KMS, DynamoDB, and Lambda calls. You’ll love your systems even more.
  • Embrace the serverless paradigm, even if it looks complicated at first. It will save your life further down the road when your traffic bursts or you want to find an engineer who knows your whole system.

Next steps

We are going to leverage everything we’ve done so far to implement SSO in Jumia. For a future project, we are already testing OpenID connect with DynamoDB as a backend.


We did a mindset revolution in many ways.
Not only we went completely serverless as we started storing critical info on the cloud.
On top of this we also architectured a system where all user data is decoupled between local systems and our central auth silo.
Managing all of these critical systems became far more predictable and less cumbersome than we thought possible.
For us this is the proof that good and simple designs are the best features to look out for when sketching new systems.

If we were to do this again, we would do it in exactly the same way.

Daniel Loureiro (SecOps) & Tiago Caxias (DBA) – Jumia Group

SAML for Your Serverless JavaScript Application: Part II

Post Syndicated from Bryan Liston original

Contributors: Richard Threlkeld, Gene Ting, Stefano Buliani

The full code for both scenarios—including SAM templates—can be found at the samljs-serverless-sample GitHub repository. We highly recommend you use the SAM templates in the GitHub repository to create the resources, opitonally you can manually create them.

This is the second part of a two part series for using SAML providers in your application and receiving short-term credentials to access AWS Services. These credentials can be limited with IAM roles so the users of the applications can perform actions like fetching data from databases or uploading files based on their level of authorization. For example, you may want to build a JavaScript application that allows a user to authenticate against Active Directory Federation Services (ADFS). The user can be granted scoped AWS credentials to invoke an API to display information in the application or write to an Amazon DynamoDB table.

Part I of this series walked through a client-side flow of retrieving SAML claims and passing them to Amazon Cognito to retrieve credentials. This blog post will take you through a more advanced scenario where logic can be moved to the backend for a more comprehensive and flexible solution.


As in Part I of this series, you need ADFS running in your environment. The following configurations are used for reference:

  1. ADFS federated with the AWS console. For a walkthrough with an AWS CloudFormation template, see Enabling Federation to AWS Using Windows Active Directory, ADFS, and SAML 2.0.
  2. Verify that you can authenticate with user example\bob for both the ADFS-Dev and ADFS-Production groups via the sign-in page.
  3. Create an Amazon Cognito identity pool.

Scenario Overview

The scenario in the last blog post may be sufficient for many organizations but, due to size restrictions, some browsers may drop part or all of a query string when sending a large number of claims in the SAMLResponse. Additionally, for auditing and logging reasons, you may wish to relay SAML assertions via POST only and perform parsing in the backend before sending credentials to the client. This scenario allows you to perform custom business logic and validation as well as putting tracking controls in place.

In this post, we want to show you how these requirements can be achieved in a Serverless application. We also show how different challenges (like XML parsing and JWT exchange) can be done in a Serverless application design. Feel free to mix and match, or swap pieces around to suit your needs.

This scenario uses the following services and features:

  • Cognito for unique ID generation and default role mapping
  • S3 for static website hosting
  • API Gateway for receiving the SAMLResponse POST from ADFS
  • Lambda for processing the SAML assertion using a native XML parser
  • DynamoDB conditional writes for session tracking exceptions
  • STS for credentials via Lambda
  • KMS for signing JWT tokens
  • API Gateway custom authorizers for controlling per-session access to credentials, using JWT tokens that were signed with KMS keys
  • JavaScript-generated SDK from API Gateway using a service proxy to DynamoDB
  • RelayState in the SAMLRequest to ADFS to transmit the CognitoID and a short code from the client to your AWS backend

At a high level, this solution is similar to that of Scenario 1; however, most of the work is done in the infrastructure rather than on the client.

  • ADFS still uses a POST binding to redirect the SAMLResponse to API Gateway; however, the Lambda function does not immediately redirect.
  • The Lambda function decodes and uses an XML parser to read the properties of the SAML assertion.
  • If the user’s assertion shows that they belong to a certain group matching a specified string (“Prod” in the sample), then you assign a role that they can assume (“ADFS-Production”).
  • Lambda then gets the credentials on behalf of the user and stores them in DynamoDB as well as logging an audit record in a separate table.
  • Lambda then returns a short-lived, signed JSON Web Token (JWT) to the JavaScript application.
  • The application uses the JWT to get their stored credentials from DynamoDB through an API Gateway custom authorizer.

The architecture you build in this tutorial is outlined in the following diagram.


First, a user visits your static website hosted on S3. They generate an ephemeral random code that is transmitted during redirection to ADFS, where they are prompted for their Active Directory credentials.

Upon successful authentication, the ADFS server redirects the SAMLResponse assertion, along with the code (as the RelayState) via POST to API Gateway.

The Lambda function parses the SAMLResponse. If the user is part of the appropriate Active Directory group (AWS-Production in this tutorial), it retrieves credentials from STS on behalf of the user.

The credentials are stored in a DynamoDB table called SAMLSessions, along with the short code. The user login is stored in a tracking table called SAMLUsers.

The Lambda function generates a JWT token, with a 30-second expiration time signed with KMS, then redirects the client back to the static website along with this token.

The client then makes a call to an API Gateway resource acting as a DynamoDB service proxy that retrieves the credentials via a DeleteItem call. To make this call, the client passes the JWT in the authorization header.

A custom authorizer runs to validate the token using the KMS key again as well as the original random code.

Now that the client has credentials, it can use these to access AWS resources.

Tutorial: Backend processing and audit tracking

Before you walk through this tutorial you will need the source code from the samljs-serverless-sample Github Repository. You should use the SAM template provided in order to streamline the process but we’ll outline how you you would manually create resources too. There is a readme in the repository with instructions for using the SAM template. Either way you will still perform the manual steps of KMS key configuration, ADFS enablement of RelayState, and Amazon Cognito Identity Pool creation. The template will automate the process in creating the S3 website, Lambda functions, API Gateway resources and DynamoDB tables.

We walk through the details of all the steps and configuration below for illustrative purposes in this tutorial calling out the sections that can be omitted if you used the SAM template.

KMS key configuration

To sign JWT tokens, you need an encrypted plaintext key, to be stored in KMS. You will need to complete this step even if you use the SAM template.

  1. In the IAM console, choose Encryption Keys, Create Key.
  2. For Alias, type sessionMaster.
  3. For Advanced Options, choose KMS, Next Step.
  4. For Key Administrative Permissions, select your administrative role or user account.
  5. For Key Usage Permissions, you can leave this blank as the IAM Role (next section) will have individual key actions configured. This allows you to perform administrative actions on the set of keys while the Lambda functions have rights to just create data keys for encryption/decryption and use them to sign JWTs.
  6. Take note of the Key ID, which is needed for the Lambda functions.

IAM role configuration

You will need an IAM role for executing your Lambda functions. If you are using the SAM template this can be skipped. The sample code in the GitHub repository under Scenario2 creates separate roles for each function, with limited permissions on individual resources when you use the SAM template. We recommend separate roles scoped to individual resources for production deployments. Your Lambda functions need the following permissions:

    "Version": "2012-10-17",
    "Statement": [
            "Sid": "Stmt1432927122000",
            "Effect": "Allow",
            "Action": [
            "Resource": [

Lambda function configuration

If you are not using the SAM template, create the following three Lambda functions from the GitHub repository in /Scenario2/lambda using the following names and environment variables. The Lambda functions are written in Node.js.

  • GenerateKey_awslabs_samldemo
  • ProcessSAML_awslabs_samldemo
  • SAMLCustomAuth_awslabs_samldemo

The functions above are built, packaged, and uploaded to Lambda. For two of the functions, this can be done from your workstation (the sample commands for each function assume OSX or Linux). The third will need to be built on an AWS EC2 instance running the current Lambda AMI.


This function is only used one time to create keys in KMS for signing JWT tokens. The function calls GenerateDataKey and stores the encrypted CipherText blob as Base64 in DynamoDB. This is used by the other two functions for getting the PlainTextKey for signing with a Decrypt operation.

This function only requires a single file. It has the following environment variables:

  • KMS_KEY_ID: Unique identifier from KMS for your sessionMaster Key
  • ENC_CONTEXT: ADFS (or something unique to your organization)

Navigate into /Scenario2/lambda/GenerateKey and run the following commands:

zip –r .

aws lambda create-function --function-name GenerateKey_awslabs_samldemo --runtime nodejs4.3 --role LAMBDA_ROLE_ARN --handler index.handler --timeout 10 --memory-size 512 --zip-file fileb:// --environment Variables={SESSION_DDB_TABLE=SAMLSessions,ENC_CONTEXT=ADFS,RAND_HASH=us-east-1:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX,KMS_KEY_ID=<kms key="KEY" id="ID">}


This is an API Gateway custom authorizer called after the client has been redirected to the website as part of the login workflow. This function calls a GET against the service proxy to DynamoDB, retrieving credentials. The function uses the KMS key signing validation of the JWT created in the ProcessSAML_awslabs_samldemo function and also validates the random code that was generated at the beginning of the login workflow.

You must install the dependencies before zipping this function up. It has the following environment variables:

  • ENC_CONTEXT: ADFS (or whatever was used in GenerateKey_awslabs_samldemo)

Navigate into /Scenario2/lambda/CustomAuth and run:

npm install

zip –r .

aws lambda create-function --function-name SAMLCustomAuth_awslabs_samldemo --runtime nodejs4.3 --role LAMBDA_ROLE_ARN --handler CustomAuth.handler --timeout 10 --memory-size 512 --zip-file fileb:// --environment Variables={SESSION_DDB_TABLE=SAMLSessions,ENC_CONTEXT=ADFS,ID_HASH= us-east-1:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX }


This function is called when ADFS sends the SAMLResponse to API Gateway. The function parses the SAML assertion to select a role (based on a simple string search) and extract user information. It then uses this data to get short-term credentials from STS via AssumeRoleWithSAML and stores this information in a SAMLSessions table and tracks the user login via a SAMLUsers table. Both of these are DynamoDB tables but you could also store the user information in another AWS database type, as this is for auditing purposes. Finally, this function creates a JWT (signed with the KMS key) which is only valid for 30 seconds and is returned to the client as part of a 302 redirect from API Gateway.

This function needs to be built on an EC2 server running Amazon Linux. This function leverages two main external libraries:

  • nJwt: Used for secure JWT creation for individual client sessions to get access to their records
  • libxmljs: Used for XML XPath queries of the decoded SAMLResponse from AD FS

Libxmljs uses native build tools and you should run this on EC2 running the same AMI as Lambda and with Node.js v4.3.2; otherwise, you might see errors. For more information about current Lambda AMI information, see Lambda Execution Environment and Available Libraries.

After you have the correct AMI launched in EC2 and have SSH open to that host, install Node.js. Ensure that the Node.js version on EC2 is 4.3.2, to match Lambda. If your version is off, you can roll back with NVM.

After you have set up Node.js, run the following command:

yum install -y make gcc*

Now, create a /saml folder on your EC2 server and copy up ProcessSAML.js and package.json from /Scenario2/lambda/ProcessSAML to the EC2 server. Here is a sample SCP command:

cd ProcessSAML/


package.json    ProcessSAML.js

scp -i ~/path/yourpemfile.pem ./* [email protected]:/home/ec2-user/saml/

Then you can SSH to your server, cd into the /saml directory, and run:

npm install

A successful build should look similar to the following:


Finally, zip up the package and create the function using the following AWS CLI command and these environment variables. Configure the CLI with your credentials as needed.

  • ENC_CONTEXT: ADFS (or whatever was used in GenerateKeyawslabssamldemo)
  • PRINCIPAL_ARN: Full ARN of the AD FS IdP created in the IAM console
  • REDIRECT_URL: Endpoint URL of your static S3 website (or CloudFront distribution domain name if you did that optional step)
zip –r .

aws lambda create-function --function-name ProcessSAML_awslabs_samldemo --runtime nodejs4.3 --role LAMBDA_ROLE_ARN --handler ProcessSAML.handler --timeout 10 --memory-size 512 --zip-file fileb:// –environment Variables={USER_DDB_TABLE=SAMLUsers,SESSION_DDB_TABLE= SAMLSessions,REDIRECT_URL=<your S3 bucket and test page path>,ID_HASH=us-east-1:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX,ENC_CONTEXT=ADFS,PRINCIPAL_ARN=<your ADFS IdP ARN>}

If you built the first two functions on your workstation and created the ProcessSAML_awslabs_samldemo function separately in the Lambda console before building on EC2, you can update the code after building on with the following command:

aws lambda update-function-code --function-name ProcessSAML_awslabs_samldemo --zip-file fileb://

Role trust policy configuration

This scenario uses STS directly to assume a role. You will need to complete this step even if you use the SAM template. Modify the trust policy, as you did before when Amazon Cognito was assuming the role. In the GitHub repository sample code, ProcessSAML.js is preconfigured to filter and select a role with “Prod” in the name via the selectedRole variable.

This is an example of business logic you can alter in your organization later, such as a callout to an external mapping database for other rules matching. In this tutorial, it corresponds to the ADFS-Production role that was created.

  1. In the IAM console, choose Roles and open the ADFS-Production Role.
  2. Edit the Trust Permissions field and replace the content with the following:

      "Version": "2012-10-17",
      "Statement": [
          "Effect": "Allow",
          "Principal": {
            "Federated": [
          "Action": "sts:AssumeRoleWithSAML"

If you end up using another role (or add more complex filtering/selection logic), ensure that those roles have similar trust policy configurations. Also note that the sample policy above purposely uses an array for the federated provider matching the IdP ARN that you added. If your environment has multiple SAML providers, you could list them here and modify the code in ProcessSAML.js to process requests from different IdPs and grant or revoke credentials accordingly.

DynamoDB table creation

If you are not using the SAM template, create two DynamoDB tables:

  • SAMLSessions: Temporarily stores credentials from STS. Credentials are removed by an API Gateway Service Proxy to the DynamoDB DeleteItem call that simultaneously returns the credentials to the client.
  • SAMLUsers: This table is for tracking user information and the last time they authenticated in the system via ADFS.

The following AWS CLI commands creates the tables (indexed only with a primary key hash, called identityHash and CognitoID respectively):

aws dynamodb create-table \
    --table-name SAMLSessions \
    --attribute-definitions \
        AttributeName=group,AttributeType=S \
    --key-schema AttributeName=identityhash,KeyType=HASH \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
aws dynamodb create-table \
    --table-name SAMLUsers \
    --attribute-definitions \
        AttributeName=CognitoID,AttributeType=S \
    --key-schema AttributeName=CognitoID,KeyType=HASH \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

After the tables are created, you should be able to run the GenerateKey_awslabs_samldemo Lambda function and see a CipherText key stored in SAMLSessions. This is only for convenience of this post, to demonstrate that you should persist CipherText keys in a data store and never persist plaintext keys that have been decrypted. You should also never log plaintext keys in your code.

API Gateway configuration

If you are not using the SAM template, you will need to create API Gateway resources. If you have created resources for Scenario 1 in Part I, then the naming of these resources may be similar. If that is the case, then simply create an API with a different name (SAMLAuth2 or similar) and follow these steps accordingly.

  1. In the API Gateway console for your API, choose Authorizers, Custom Authorizer.
  2. Select your region and enter SAMLCustomAuth_awslabs_samldemo for the Lambda function. Choose a friendly name like JWTParser and ensure that Identity token source is method.request.header.Authorization. This tells the custom authorizer to look for the JWT in the Authorization header of the HTTP request, which is specified in the JavaScript code on your S3 webpage. Save the changes.


Now it’s time to wire up the Lambda functions to API Gateway.

  1. In the API Gateway console, choose Resources, select your API, and then create a Child Resource called SAML. This includes a POST and a GET method. The POST method uses the ProcessSAML_awslabs_samldemo Lambda function and a 302 redirect, while the GET method uses the JWTParser custom authorizer with a service proxy to DynamoDB to retrieve credentials upon successful authorization.
  2. lambdasamltwo_4.png

  3. Create a POST method. For Integration Type, choose Lambda and add the ProcessSAML_awslabs_samldemo Lambda function. For Method Request, add headers called RelayState and SAMLResponse.


  4. Delete the Method Response code for 200 and add a 302. Create a response header called Location. In the Response Models section, for Content-Type, choose application/json and for Models, choose Empty.


  5. Delete the Integration Response section for 200 and add one for 302 that has a Method response status of 302. Edit the response header for Location to add a Mapping value of integration.response.body.location.


  6. Finally, in order for Lambda to capture the SAMLResponse and RelayState values, choose Integration Request.

  7. In the Body Mapping Template section, for Content-Type, enter application/x-www-form-urlencoded and add the following template:

    "SAMLResponse" :"$input.params('SAMLResponse')",
    "RelayState" :"$input.params('RelayState')",
    "formparams" : $input.json('$')

  8. Create a GET method with an Integration Type of Service Proxy. Select the region and DynamoDB as the AWS Service. Use POST for the HTTP method and DeleteItem for the Action. This is important as you leverage a DynamoDB feature to return the current records when you perform deletion. This simultaneously allows credentials in this system to not be stored long term and also allows clients to retrieve them. For Execution role, use the Lambda role from earlier or a new role that only has IAM scoped permissions for DeleteItem on the SAMLSessions table.


  9. Save this and open Method Request.

  10. For Authorization, select your custom authorizer JWTParser. Add in a header called COGNITO_ID and save the changes.


  11. In the Integration Request, add in a header name of Content-Type and a value for Mapped of ‘application/x-amzn-json-1.0‘ (you need the single quotes surrounding the entry).

  12. Next, in the Body Mapping Template section, for Content-Type, enter application/json and add the following template:

        "TableName": "SAMLSessions",
        "Key": {
            "identityhash": {
                "S": "$input.params('COGNITO_ID')"
        "ReturnValues": "ALL_OLD"

Inspect this closely for a moment. When your client passes the JWT in an Authorization Header to this GET method, the JWTParser Custom Authorizer grants/denies executing a DeleteItem call on the SAMLSessions table.


If it is granted, then there needs to be an item to delete the reference as a primary key to the table. The client JavaScript (seen in a moment) passes its CognitoID through as a header called COGNITO_ID that is mapped above. DeleteItem executes to remove the credentials that were placed there via a call to STS by the ProcessSAML_awslabs_samldemo Lambda function. Because the above action specifies ALL_OLD under the ReturnValues mapping, DynamoDB returns these credentials at the same time.


  1. Save the changes and open your /saml resource root.
  2. Choose Actions, Enable CORS.
  3. In the Access-Control-Allow-Headers section, add COGNITO_ID into the end (inside the quotes and separated from other headers by a comma), then choose Enable CORS and replace existing CORS headers.
  4. When completed, choose Actions, Deploy API. Use the Prod stage or another stage.
  5. In the Stage Editor, choose SDK Generation. For Platform, choose JavaScript and then choose Generate SDK. Save the folder someplace close. Take note of the Invoke URL value at the top, as you need this for ADFS configuration later.

Website configuration

If you are not using the SAM template, create an S3 bucket and configure it as a static website in the same way that you did for Part I.

If you are using the SAM template this will automatically be created for you however the steps below will still need to be completed:

In the source code repository, edit /Scenario2/website/configs.js.

  1. Ensure that the identityPool value matches your Amazon Cognito Pool ID and the region is correct.
  2. Leave adfsUrl the same if you’re testing on your lab server; otherwise, update with the AD FS DNS entries as appropriate.
  3. Update the relayingPartyId value as well if you used something different from the prerequisite blog post.

Next, download the minified version of the AWS SDK for JavaScript in the Browser (aws-sdk.min.js) and place it along with the other files in /Scenario2/website into the S3 bucket.

Copy the files from the API Gateway Generated SDK in the last section to this bucket so that the apigClient.js is in the root directory and lib folder is as well. The imports for these scripts (which do things like sign API requests and configure headers for the JWT in the Authorization header) are already included in the index.html file. Consult the latest API Gateway documentation if the SDK generation process updates in the future

ADFS configuration

Now that the AWS setup is complete, modify your ADFS setup to capture RelayState information about the client and to send the POST response to API Gateway for processing. You will need to complete this step even if you use the SAM template.

If you’re using Windows Server 2008 with ADFS 2.0, ensure that Update Rollup 2 is installed before enabling RelayState. Please see official Microsoft documentation for specific download information.

  1. After Update Rollup 2 is installed, modify %systemroot%\inetpub\adfs\ls\web.config. If you’re on a newer version of Windows Server running AD FS 3.0, modify %systemroot%\ADFS\Microsoft.IdentityServer.Servicehost.exe.config.
  2. Find the section in the XML marked <Microsoft.identityServer.web> and add an entry for <useRelayStateForIdpInitiatedSignOn enabled="true">. If you have the proper ADFS rollup or version installed, this should allow the RelayState parameter to be accepted by the service provider.
  3. In the ADFS console, open Relaying Party Trusts for Amazon Web Services and choose Endpoints.
  4. For Binding, choose POST and for Invoke URL,enter the URL to your API Gateway from the stage that you noted earlier.

At this point, you are ready to test out your webpage. Navigate to the S3 static website Endpoint URL and it should redirect you to the ADFS login screen. If the user login has been recent enough to have a valid SAML cookie, then you should see the login pass-through; otherwise, a login prompt appears. After the authentication has taken place, you should quickly end up back at your original webpage. Using the browser debugging tools, you see “Successful DDB call” followed by the results of a call to STS that were stored in DynamoDB.


As in Scenario 1, the sample code under /scenario2/website/index.html has a button that allows you to “ping” an endpoint to test if the federated credentials are working. If you have used the SAM template this should already be working and you can test it out (it will fail at first – keep reading to find out how to set the IAM permissions!). If not go to API Gateway and create a new Resource called /users at the same level of /saml in your API with a GET method.


For Integration type, choose Mock.


In the Method Request, for Authorization, choose AWS_IAM. In the Integration Response, in the Body Mapping Template section, for Content-Type, choose application/json and add the following JSON:

    "status": "Success",
    "agent": "${context.identity.userAgent}"


Before using this new Mock API as a test, configure CORS and re-generate the JavaScript SDK so that the browser knows about the new methods.

  1. On the /saml resource root and choose Actions, Enable CORS.
  2. In the Access-Control-Allow-Headers section, add COGNITO_ID into the endpoint and then choose Enable CORS and replace existing CORS headers.
  3. Choose Actions, Deploy API. Use the stage that you configured earlier.
  4. In the Stage Editor, choose SDK Generation and select JavaScript as your platform. Choose Generate SDK.
  5. Upload the new apigClient.js and lib directory to the S3 bucket of your static website.

One last thing must be completed before testing (You will need to complete this step even if you use the SAM template) if the credentials can invoke this mock endpoint with AWS_IAM credentials. The ADFS-Production Role needs execute-api:Invoke permissions for this API Gateway resource.

  1. In the IAM console, choose Roles, and open the ADFS-Production Role.

  2. For testing, you can attach the AmazonAPIGatewayInvokeFullAccess policy; however, for production, you should scope this down to the resource as documented in Control Access to API Gateway with IAM Permissions.

  3. After you have attached a policy with invocation rights and authenticated with AD FS to finish the redirect process, choose PING.

If everything has been set up successfully you should see an alert with information about the user agent.

Final Thoughts

We hope these scenarios and sample code help you to not only begin to build comprehensive enterprise applications on AWS but also to enhance your understanding of different AuthN and AuthZ mechanisms. Consider some ways that you might be able to evolve this solution to meet the needs of your own customers and innovate in this space. For example:

  • Completing the CloudFront configuration and leveraging SSL termination for site identification. See if this can be incorporated into the Lambda processing pipeline.
  • Attaching a scope-down IAM policy if the business rules are matched. For example, the default role could be more permissive for a group but if the user is a contractor (username with –C appended) they get extra restrictions applied when assumeRoleWithSaml is called in the ProcessSAML_awslabs_samldemo Lambda function.
  • Changing the time duration before credentials expire on a per-role basis. Perhaps if the SAMLResponse parsing determines the user is an Administrator, they get a longer duration.
  • Passing through additional user claims in SAMLResponse for further logical decisions or auditing by adding more claim rules in the ADFS console. This could also be a mechanism to synchronize some Active Directory schema attributes with AWS services.
  • Granting different sets of credentials if a user has accounts with multiple SAML providers. While this tutorial was made with ADFS, you could also leverage it with other solutions such as Shibboleth and modify the ProcessSAML_awslabs_samldemo Lambda function to be aware of the different IdP ARN values. Perhaps your solution grants different IAM roles for the same user depending on if they initiated a login from Shibboleth rather than ADFS?

The Lambda functions can be altered to take advantage of these options which you can read more about here. For more information about ADFS claim rule language manipulation, see The Role of the Claim Rule Language on Microsoft TechNet.

We would love to hear feedback from our customers on these designs and see different secure application designs that you’re implementing on the AWS platform.

Creating a Simple “Fetch & Run” AWS Batch Job

Post Syndicated from Bryan Liston original

Dougal Ballantyne
Dougal Ballantyne, Principal Product Manager – AWS Batch

Docker enables you to create highly customized images that are used to execute your jobs. These images allow you to easily share complex applications between teams and even organizations. However, sometimes you might just need to run a script!

This post details the steps to create and run a simple “fetch & run” job in AWS Batch. AWS Batch executes jobs as Docker containers using Amazon ECS. You build a simple Docker image containing a helper application that can download your script or even a zip file from Amazon S3. AWS Batch then launches an instance of your container image to retrieve your script and run your job.

AWS Batch overview

AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources (e.g., CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted.

With AWS Batch, there is no need to install and manage batch computing software or server clusters that you use to run your jobs, allowing you to focus on analyzing results and solving problems. AWS Batch plans, schedules, and executes your batch computing workloads across the full range of AWS compute services and features, such as Amazon EC2 Spot Instances.

“Fetch & run” walkthrough

The following steps get everything working:

  • Build a Docker image with the fetch & run script
  • Create an Amazon ECR repository for the image
  • Push the built image to ECR
  • Create a simple job script and upload it to S3
  • Create an IAM role to be used by jobs to access S3
  • Create a job definition that uses the built image
  • Submit and run a job that execute the job script from S3


Before you get started, there a few things to prepare. If this is the first time you have used AWS Batch, you should follow the Getting Started Guide and ensure you have a valid job queue and compute environment.

After you are up and running with AWS Batch, the next thing is to have an environment to build and register the Docker image to be used. For this post, register this image in an ECR repository. This is a private repository by default and can easily be used by AWS Batch jobs

You also need a working Docker environment to complete the walkthrough. For the examples, I used Docker for Mac. Alternatively, you could easily launch an EC2 instance running Amazon Linux and install Docker.

You need the AWS CLI installed. For more information, see Installing the AWS Command Line Interface.

Building the fetch & run Docker image

The fetch & run Docker image is based on Amazon Linux. It includes a simple script that reads some environment variables and then uses the AWS CLI to download the job script (or zip file) to be executed.

To get started, download the source code from the aws-batch-helpers GitHub repository. The following link pulls the latest version: Unzip the downloaded file and navigate to the “fetch-and-run” folder. Inside this folder are two files:

  • Dockerfile

Dockerfile is used by Docker to build an image. Look at the contents; you should see something like the following:

FROM amazonlinux:latest

RUN yum -y install unzip aws-cli
ADD /usr/local/bin/
USER nobody

ENTRYPOINT ["/usr/local/bin/"]
  • The FROM line instructs Docker to pull the base image from the amazonlinux repository, using the latest tag.
  • The RUN line executes a shell command as part of the image build process.
  • The ADD line, copies the script into the /usr/local/bin directory inside the image.
  • The WORKDIR line, sets the default directory to /tmp when the image is used to start a container.
  • The USER line sets the default user that the container executes as.
  • Finally, the ENTRYPOINT line instructs Docker to call the /usr/local/bin/ script when it starts the container. When running as an AWS Batch job, it is passed the contents of the command parameter.

Now, build the Docker image! Assuming that the docker command is in your PATH and you don’t need sudo to access it, you can build the image with the following command (note the dot at the end of the command):

docker build -t awsbatch/fetch_and_run .   

This command should produce an output similar to the following:

Sending build context to Docker daemon 373.8 kB

Step 1/6 : FROM amazonlinux:latest
latest: Pulling from library/amazonlinux
c9141092a50d: Pull complete
Digest: sha256:2010c88ac1e7c118d61793eec71dcfe0e276d72b38dd86bd3e49da1f8c48bf54
Status: Downloaded newer image for amazonlinux:latest
 ---> 8ae6f52035b5
Step 2/6 : RUN yum -y install unzip aws-cli
 ---> Running in e49cba995ea6
Loaded plugins: ovl, priorities
Resolving Dependencies
--> Running transaction check
---> Package aws-cli.noarch 0:1.11.29-1.45.amzn1 will be installed

  << removed for brevity >>

 ---> b30dfc9b1b0e
Removing intermediate container e49cba995ea6
Step 3/6 : ADD /usr/local/bin/
 ---> 256343139922
Removing intermediate container 326092094ede
Step 4/6 : WORKDIR /tmp
 ---> 5a8660e40d85
Removing intermediate container b48a7b9c7b74
Step 5/6 : USER nobody
 ---> Running in 72c2be3af547
 ---> fb17633a64fe
Removing intermediate container 72c2be3af547
Step 6/6 : ENTRYPOINT /usr/local/bin/
 ---> Running in aa454b301d37
 ---> fe753d94c372

Removing intermediate container aa454b301d37
Successfully built 9aa226c28efc

In addition, you should see a new local repository called fetchandrun, when you run the following command:

docker images
REPOSITORY               TAG              IMAGE ID            CREATED             SIZE
awsbatch/fetch_and_run   latest           9aa226c28efc        19 seconds ago      374 MB
amazonlinux              latest           8ae6f52035b5        5 weeks ago         292 MB

To add more packages to the image, you could update the RUN line or add a second one, right after it.

Creating an ECR repository

The next step is to create an ECR repository to store the Docker image, so that it can be retrieved by AWS Batch when running jobs.

  1. In the ECR console, choose Get Started or Create repository.
  2. Enter a name for the repository, for example: awsbatch/fetchandrun.
  3. Choose Next step and follow the instructions.


You can keep the console open, as the tips can be helpful.

Push the built image to ECR

Now that you have a Docker image and an ECR repository, it is time to push the image to the repository. Use the following AWS CLI commands, if you have used the previous example names. Replace the AWS account number in red with your own account.

aws ecr get-login --region us-east-1

docker tag awsbatch/fetch_and_run:latest

docker push

Create a simple job script and upload to S3

Next, create and upload a simple job script that is executed using the fetchandrun image that you just built and registered in ECR. Start by creating a file called with the example content below:


echo "Args: [email protected]"
echo "This is my simple test job!."
echo "jobId: $AWS_BATCH_JOB_ID"
echo "jobQueue: $AWS_BATCH_JQ_NAME"
echo "computeEnvironment: $AWS_BATCH_CE_NAME"
sleep $1
echo "bye bye!!"

Upload the script to an S3 bucket.

aws s3 cp s3://<bucket>/

Create an IAM role

When the fetchandrun image runs as an AWS Batch job, it fetches the job script from Amazon S3. You need an IAM role that the AWS Batch job can use to access S3.

  1. In the IAM console, choose Roles, Create New Role.
  2. Enter a name for your new role, for example: batchJobRole, and choose Next Step.
  3. For Role Type, under AWS Service Roles, choose Select next to “Amazon EC2 Container Service Task Role” and then choose Next Step.


  4. On the Attach Policy page, type “AmazonS3ReadOnlyAccess” into the Filter field and then select the check box for that policy.


  5. Choose Next Step, Create Role. You see the details of the new role.


Create a job definition

Now that you’ve have created all the resources needed, pull everything together and build a job definition that you can use to run one or many AWS Batch jobs.

  1. In the AWS Batch console, choose Job Definitions, Create.
  2. For the Job Definition, enter a name, for example, fetchandrun.
  3. For IAM Role, choose the role that you created earlier, batchJobRole.
  4. For ECR Repository URI, enter the URI where the fetchandrun image was pushed, for example:
  5. Leave the Command field blank.
  6. For vCPUs, enter 1. For Memory, enter 500.


  7. For User, enter “nobody”.

  8. Choose Create job definition.

Submit and run a job

Now, submit and run a job that uses the fetchandrun image to download the job script and execute it.

  1. In the AWS Batch console, choose Jobs, Submit Job.
  2. Enter a name for the job, for example: script_test.
  3. Choose the latest fetchandrun job definition.
  4. For Job Queue, choose a queue, for example: first-run-job-queue.
  5. For Command, enter,60.
  6. Choose Validate Command.


  7. Enter the following environment variables and then choose Submit job.

    • Key=BATCHFILETYPE, Value=script
    • Key=BATCHFILES3_URL, Value=s3:/// Don’t forget to use the correct URL for your file.


  8. After the job is completed, check the final status in the console.


  9. In the job details page, you can also choose View logs for this job in CloudWatch console to see your job log.


How the fetch and run image works

The fetchandrun image works as a combination of the Docker ENTRYPOINT and COMMAND feature, and a shell script that reads environment variables set as part of the AWS Batch job. When building the Docker image, it starts with a base image from Amazon Linux and installs a few packages from the yum repository. This becomes the execution environment for the job.

If the script you planned to run needed more packages, you would add them using the RUN parameter in the Dockerfile. You could even change it to a different base image such as Ubuntu, by updating the FROM parameter.

Next, the script is added to the image and set as the container ENTRYPOINT. The script simply reads some environment variables and then downloads and runs the script/zip file from S3. It is looking for the following environment variables BATCHFILETYPE and BATCHFILES3URL. If you run, with no environment variables, you get the following usage message:

  • BATCHFILETYPE not set, unable to determine type (zip/script) of


export BATCH_FILE_TYPE="script"

export BATCH_FILE_S3_URL="s3://my-bucket/my-script" script-from-s3 [ <script arguments> ]

– or –

export BATCH_FILE_TYPE="zip"

export BATCH_FILE_S3_URL="s3://my-bucket/my-zip" script-from-zip [ <script arguments> ]

This shows that it supports two values for BATCHFILETYPE, either “script” or “zip”. When you set “script”, it causes to download a single file and then execute it, in addition to passing in any further arguments to the script. If you set it to “zip”, this causes to download a zip file, then unpack it and execute the script name passed and any further arguments. You can use the “zip” option to pass more complex jobs with all the applications dependencies in one file.

Finally, the ENTRYPOINT parameter tells Docker to execute the /usr/local/bin/ script when creating a container. In addition, it passes the contents of the COMMAND parameter as arguments to the script. This is what enables you to pass the script and arguments to be executed by the fetchandrun image with the Command field in the SubmitJob API action call.


In this post, I detailed the steps to create and run a simple “fetch & run” job in AWS Batch. You can now easily use the same job definition to run as many jobs as you need by uploading a job script to Amazon S3 and calling SubmitJob with the appropriate environment variables.

If you have questions or suggestions, please comment below.

Automating the Creation of Consistent Amazon EBS Snapshots with Amazon EC2 Systems Manager (Part 2)

Post Syndicated from Bryan Liston original

Nicolas Malaval, AWS Professional Consultant

In my previous blog post, I discussed the challenge of creating Amazon EBS snapshots when you cannot turn off the instance during backup because this might exclude any data that has been cached by any applications or the operating system. I showed how you can use EC2 Systems Manager to run a script remotely on EC2 instances to prepare the applications and the operating system for backup and to automate the creating of snapshots on a daily basis. I gave a practical example of creating consistent Amazon EBS snapshots of Amazon Linux running a MySQL database.

In this post, I walk you through another practical example to create consistent snapshots of a Windows Server instance with Microsoft VSS (Volume Shadow Copy Service).

Understanding the example

VSS (Volume Shadow Copy Service) is a Windows built-in service that coordinates backup of VSS-compatible applications (SQL Server, Exchange Server, etc.) to flush and freeze their I/O operations.

The VSS service initiates and oversees the creation of shadow copies. A shadow copy is a point-in-time and consistent snapshot of a logical volume. For example, C: is a logical volume, which is different than an EBS snapshot. Multiple components are involved in the shadow copy creation:

  • The VSS requester requests the creation of shadow copies.
  • The VSS provider creates and maintains the shadow copies.
  • The VSS writers guarantee that you have a consistent data set to back up. They flush and freeze I/O operations, before the VSS provider creates the shadow copies, and release I/O operations, after the VSS provider has completed this action. There is usually one VSS writer for each VSS-compatible application.

I use Run Command to execute a PowerShell script on the Windows instance:

$EbsSnapshotPsFileName = "C:/tmp/ebsSnapshot.ps1"

$EbsSnapshotPs = New-Item -Type File $EbsSnapshotPsFileName -Force

Add-Content $EbsSnapshotPs '$InstanceID = Invoke-RestMethod -Uri'
Add-Content $EbsSnapshotPs '$AZ = Invoke-RestMethod -Uri'
Add-Content $EbsSnapshotPs '$Region = $AZ.Substring(0, $AZ.Length-1)'
Add-Content $EbsSnapshotPs '$Volumes = ((Get-EC2InstanceAttribute -Region $Region -Instance "$InstanceId" -Attribute blockDeviceMapping).BlockDeviceMappings.Ebs |? {$_.Status -eq "attached"}).VolumeId'
Add-Content $EbsSnapshotPs '$Volumes | New-EC2Snapshot -Region $Region -Description " Consistent snapshot of a Windows instance with VSS" -Force'
Add-Content $EbsSnapshotPs 'Exit $LastExitCode'

First, the script writes in a local file named ebsSnapshot.ps1 a PowerShell script that creates a snapshot of every EBS volume attached to the instance.

$EbsSnapshotCmdFileName = "C:/tmp/ebsSnapshot.cmd"
$EbsSnapshotCmd = New-Item -Type File $EbsSnapshotCmdFileName -Force

Add-Content $EbsSnapshotCmd 'powershell.exe -ExecutionPolicy Bypass -file $EbsSnapshotPsFileName'
Add-Content $EbsSnapshotCmd 'exit $?'

It writes in a second file named ebsSnapshot.cmd a shell script that executes the PowerShell script created earlier.

$VssScriptFileName = "C:/tmp/scriptVss.txt"
$VssScript = New-Item -Type File $VssScriptFileName -Force

Add-Content $VssScript 'reset'
Add-Content $VssScript 'set context persistent'
Add-Content $VssScript 'set option differential'
Add-Content $VssScript 'begin backup'

$Drives = Get-WmiObject -Class Win32_LogicalDisk |? {$_.VolumeName -notmatch "Temporary" -and $_.DriveType -eq "3"} | Select-Object DeviceID

$Drives | ForEach-Object { Add-Content $VssScript $('add volume ' + $_.DeviceID + ' alias Volume' + $_.DeviceID.Substring(0, 1)) }

Add-Content $VssScript 'create'
Add-Content $VssScript "exec $EbsSnapshotCmdFileName"
Add-Content $VssScript 'end backup'

$Drives | ForEach-Object { Add-Content $VssScript $('delete shadows id %Volume' + $_.DeviceID.Substring(0, 1) + '%') }

Add-Content $VssScript 'exit'

It creates a third file named scriptVss.txt containing DiskShadow commands. DiskShadow is a tool included in Windows Server 2008 and above, that exposes the functionality offered by the VSS service. The script creates a shadow copy of each logical volume stored on EBS, runs the shell script ebsSnapshot.cmd to create a snapshot of underlying EBS volumes, and then deletes the shadow copies to free disk space.

diskshadow.exe /s $VssScriptFileName
Exit $LastExitCode

Finally, it runs DiskShadow in script mode.

This PowerShell script is contained in a new SSM document and the maintenance window executes a command from this document every day at midnight on every Windows instance that has a tag “ConsistentSnapshot” equal to “WindowsVSS”.

Implementing and testing the example

First, use AWS CloudFormation to provision some of the required resources in your AWS account.

  1. Open Create a Stack to create a CloudFormation stack from the template.
  2. Choose Next.
  3. Enter the ID of the latest AWS Windows Server 2016 Base AMI available in the current region (see Finding a Windows AMI) in pWindowsAmiId.
  4. Follow the on-screen instructions.

CloudFormation creates the following resources:

  • A VPC with an Internet gateway attached.
  • A subnet on this VPC with a new route table, to enable access to the Internet and therefore to the AWS APIs.
  • An IAM role to grant an EC2 instance the required permissions.
  • A security group that allows RDP access from the Internet, as you need to log on to the instance later on.
  • A Windows instance in the subnet with the IAM role and the security group attached.
  • A SSM document containing the script described in the section above to create consistent EBS snapshots.
  • Another SSM document containing a script to restore logical volumes to a consistent state, as explained in the next section.
  • An IAM role to grant the maintenance window the required permissions.

After the stack creation completes, choose Outputs in the CloudFormation console and note the values returned:

  • IAM role for the maintenance window
  • Names of the two SSM documents

Then, manually create a maintenance window, if you have not already created it. For detailed instructions, see the “Example” section in the previous blog post.

After you create a maintenance window, assign a target where the task will run:

  1. In the Maintenance Window list, choose the maintenance window that you just created.
  2. For Actions, choose Register targets.
  3. For Owner information, enter WindowsVSS.
  4. Under the Select targets by section, choose Specifying tags. For Tag Name, choose ConsistentSnapshot. For Tag Value, choose WindowsVSS.
  5. Choose Register targets.

Finally, assign a task to perform during the window:

  1. In the Maintenance Window list, choose the maintenance window that you just created.
  2. For Actions, choose Register tasks.
  3. For Document, select the name of the SSM document that was returned by CloudFormation, with which to create snapshots.
  4. Under the Target by section, choose the target that you just created.
  5. Under the Role section, select the IAM role that was returned by CloudFormation.
  6. Under Execute on, for Targets, enter 1. For Stop after, enter 1 errors.
  7. Choose Register task.

You can view the history either in the History tab of the Maintenance Windows navigation pane of the Amazon EC2 console, as illustrated on the following figure, or in the Run Command navigation pane, with more details about each command executed.

Restoring logical volumes to a consistent state

DiskShadow―the VSS requester in this case―uses the Windows built-in VSS provider. To create a shadow copy, this built-in provider does not make a complete copy of the data. Instead, it keeps a copy of a block data before a change overwrites it, in a dedicated storage area. The logical volume can be restored to its initial consistent state, by combining the actual volume data with the initial data of the changed blocks.

The DiskShadow command create instructs the VSS service to proceed with the creation of shadow copies, including the release of I/O operations by the VSS writers after the shadow copies are created. Therefore, the EBS snapshots created by the next command exec may not be fully consistent.

Note: A workaround could be to build your own VSS provider in charge of creating EBS snapshots. Doing so would enable the EBS snapshots to be created before I/O operations are released. We will not develop this solution in this blog post.

Therefore, you need to “undo” any I/O operations that may have happened between the moment when the shadow copy was created and the moment when the EBS snapshots were created.

A solution consists of creating an EBS volume from the snapshot, attaching it to an intermediate Windows instance and to “revert” the VSS shadow copy to restore the EBS volume to a consistent state. For sake of simplicity, use the Windows instance that was backed up as the intermediate instance.

To manually restore an EBS snapshot to a consistent state:

  1. In the Amazon EC2 console, choose Instances.
  2. In the search box, enter Consistent EBS Snapshots – Windows with VSS. The search results should display a single instance. Note the Availability Zone for this instance.
  3. Choose Snapshots.
  4. Select the latest snapshot with the description “Consistent snapshot of Windows with VSS” and choose Actions, Create Volume.
  5. Select the same Availability Zone as the instance and choose Create, Volumes.
  6. Select the volume that was just created and choose Actions, Attach Volume.
  7. For Instance, choose Consistent EBS Snapshots – Windows with VSS and choose Attach.
  8. Choose Run Command, Run a command.
  9. In Command document, select the name of a SSM document to restore snapshots returned by CloudFormation. For Target instances, select the Windows and choose Run.

Run Command executes the following PowerShell script on the Windows instance. It retrieves the list of offline disks—which corresponds in this case to the EBS volume that you just attached—and for each offline disk, takes it online, revert existing shadow copies and takes it offline again.

$OfflineDisks = (Get-Disk |? {$_.OperationalStatus -eq "Offline"})

foreach ($OfflineDisk in $OfflineDisks) {
  Set-Disk -Number $OfflineDisk.Number -IsOffline $False
  Set-Disk -Number $OfflineDisk.Number -IsReadonly $False
  Write-Host "Disk " $OfflineDisk.Signature " is now online"

$ShadowCopyIds = (Get-CimInstance Win32_ShadowCopy).Id
Write-Host "Number of shadow copies found: " $ShadowCopyIds.Count

foreach ($ShadowCopyId in $ShadowCopyIds) {
  "revert " + $ShadowCopyId | diskshadow

foreach ($OfflineDisk in $OfflineDisks) {
  $CurrentSignature = (Get-Disk -Number $OfflineDisk.Number).Signature
  if ($OfflineDisk.Signature -eq $CurrentSignature) {
    Set-Disk -Number $OfflineDisk.Number -IsReadonly $True
    Set-Disk -Number $OfflineDisk.Number -IsOffline $True
    Write-Host "Disk " $OfflineDisk.Number " is now offline"
  else {
    Set-Disk -Number $OfflineDisk.Number -Signature $OfflineDisk.Signature
    Write-Host "Reverting to the initial disk signature: " $OfflineDisk.Signature

The EBS volume is now in a consistent state and can be detached from the intermediate instance.


In this series of blog posts, I showed how you can use Amazon EC2 Systems Manager to create consistent EBS snapshots on a daily basis, with two practical examples for Linux and Windows. You can adapt this solution to your own requirements. For example, you may develop scripts for your own applications.

If you have questions or suggestions, please comment below.

Implementing Serverless Manual Approval Steps in AWS Step Functions and Amazon API Gateway

Post Syndicated from Bryan Liston original

Ali Baghani, Software Development Engineer

A common use case for AWS Step Functions is a task that requires human intervention (for example, an approval process). Step Functions makes it easy to coordinate the components of distributed applications as a series of steps in a visual workflow called a state machine. You can quickly build and run state machines to execute the steps of your application in a reliable and scalable fashion.

In this post, I describe a serverless design pattern for implementing manual approval steps. You can use a Step Functions activity task to generate a unique token that can be returned later indicating either approval or rejection by the person making the decision.

Key steps to implementation

When the execution of a Step Functions state machine reaches an activity task state, Step Functions schedules the activity and waits for an activity worker. An activity worker is an application that polls for activity tasks by calling GetActivityTask. When the worker successfully calls the API action, the activity is vended to that worker as a JSON blob that includes a token for callback.

At this point, the activity task state and the branch of the execution that contains the state is paused. Unless a timeout is specified in the state machine definition, which can be up to one year, the activity task state waits until the activity worker calls either SendTaskSuccess or SendTaskFailure using the vended token. This pause is the first key to implementing a manual approval step.

The second key is the ability in a serverless environment to separate the code that fetches the work and acquires the token from the code that responds with the completion status and sends the token back, as long as the token can be shared, i.e., the activity worker in this example is a serverless application supervised by a single activity task state.

In this walkthrough, you use a short-lived AWS Lambda function invoked on a schedule to implement the activity worker, which acquires the token associated with the approval step, and prepares and sends an email to the approver using Amazon SES.

It is very convenient if the application that returns the token can directly call the SendTaskSuccess and SendTaskFailure API actions on Step Functions. This can be achieved more easily by exposing these two actions through Amazon API Gateway so that an email client or web browser can return the token to Step Functions. By combining a Lambda function that acquires the token with the application that returns the token through API Gateway, you can implement a serverless manual approval step, as shown below.

In this pattern, when the execution reaches a state that requires manual approval, the Lambda function prepares and sends an email to the user with two embedded hyperlinks for approval and rejection.

If the authorized user clicks on the approval hyperlink, the state succeeds. If the authorized user clicks on the rejection link, the state fails. You can also choose to set a timeout for approval and, upon timeout, take action, such as resending the email request using retry/catch conditions in the activity task state.

Employee promotion process

As an example pattern use case, you can design a simple employee promotion process which involves a single task: getting a manager’s approval through email. When an employee is nominated for promotion, a new execution starts. The name of the employee and the email address of the employee’s manager are provided to the execution.

You’ll use the design pattern to implement the manual approval step, and SES to send the email to the manager. After acquiring the task token, the Lambda function generates and sends an email to the manager with embedded hyperlinks to URIs hosted by API Gateway.

In this example, I have administrative access to my account, so that I can create IAM roles. Moreover, I have already registered my email address with SES, so that I can send emails with the address as the sender/recipient. For detailed instructions, see Send an Email with Amazon SES.

Here is a list of what you do:

  1. Create an activity
  2. Create a state machine
  3. Create and deploy an API
  4. Create an activity worker Lambda function
  5. Test that the process works

Create an activity

In the Step Functions console, choose Tasks and create an activity called ManualStep.


Remember to keep the ARN of this activity at hand.


Create a state machine

Next, create the state machine that models the promotion process on the Step Functions console. Use StatesExecutionRole-us-east-1, the default role created by the console. Name the state machine PromotionApproval, and use the following code. Remember to replace the value for Resource with your activity ARN.

  "Comment": "Employee promotion process!",
  "StartAt": "ManualApproval",
  "States": {
    "ManualApproval": {
      "Type": "Task",
      "Resource": "arn:aws:states:us-east-1:ACCOUNT_ID:activity:ManualStep",
      "TimeoutSeconds": 3600,
      "End": true

Create and deploy an API

Next, create and deploy public URIs for calling the SendTaskSuccess or SendTaskFailure API action using API Gateway.

First, navigate to the IAM console and create the role that API Gateway can use to call Step Functions. Name the role APIGatewayToStepFunctions, choose Amazon API Gateway as the role type, and create the role.

After the role has been created, attach the managed policy AWSStepFunctionsFullAccess to it.


In the API Gateway console, create a new API called StepFunctionsAPI. Create two new resources under the root (/) called succeed and fail, and for each resource, create a GET method.


You now need to configure each method. Start by the /fail GET method and configure it with the following values:

  • For Integration type, choose AWS Service.
  • For AWS Service, choose Step Functions.
  • For HTTP method, choose POST.
  • For Region, choose your region of interest instead of us-east-1. (For a list of regions where Step Functions is available, see AWS Region Table.)
  • For Action Type, enter SendTaskFailure.
  • For Execution, enter the APIGatewayToStepFunctions role ARN.


To be able to pass the taskToken through the URI, navigate to the Method Request section, and add a URL Query String parameter called taskToken.


Then, navigate to the Integration Request section and add a Body Mapping Template of type application/json to inject the query string parameter into the body of the request. Accept the change suggested by the security warning. This sets the body pass-through behavior to When there are no templates defined (Recommended). The following code does the mapping:

   "cause": "Reject link was clicked.",
   "error": "Rejected",
   "taskToken": "$input.params('taskToken')"

When you are finished, choose Save.

Next, configure the /succeed GET method. The configuration is very similar to the /fail GET method. The only difference is for Action: choose SendTaskSuccess, and set the mapping as follows:

   "output": "\"Approve link was clicked.\"",
   "taskToken": "$input.params('taskToken')"

The last step on the API Gateway console after configuring your API actions is to deploy them to a new stage called respond. You can test our API by choosing the Invoke URL links under either of the GET methods. Because no token is provided in the URI, a ValidationException message should be displayed.


Create an activity worker Lambda function

In the Lambda console, create a Lambda function with a CloudWatch Events Schedule trigger using a blank function blueprint for the Node.js 4.3 runtime. The rate entered for Schedule expression is the poll rate for the activity. This should be above the rate at which the activities are scheduled by a safety margin.

The safety margin accounts for the possibility of lost tokens, retried activities, and polls that happen while no activities are scheduled. For example, if you expect 3 promotions to happen, in a certain week, you can schedule the Lambda function to run 4 times a day during that week. Alternatively, a single Lambda function can poll for multiple activities, either in parallel or in series. For this example, use a rate of one time per minute but do not enable the trigger yet.


Next, create the Lambda function ManualStepActivityWorker using the following Node.js 4.3 code. The function receives the taskToken, employee name, and manager’s email from StepFunctions. It embeds the information into an email, and sends out the email to the manager.

'use strict';
console.log('Loading function');
const aws = require('aws-sdk');
const stepfunctions = new aws.StepFunctions();
const ses = new aws.SES();
exports.handler = (event, context, callback) => {
    var taskParams = {
        activityArn: 'arn:aws:states:us-east-1:ACCOUNT_ID:activity:ManualStep'
    stepfunctions.getActivityTask(taskParams, function(err, data) {
        if (err) {
            console.log(err, err.stack);
  'An error occured while calling getActivityTask.');
        } else {
            if (data === null) {
                // No activities scheduled
                context.succeed('No activities received after 60 seconds.');
            } else {
                var input = JSON.parse(data.input);
                var emailParams = {
                    Destination: {
                        ToAddresses: [
                    Message: {
                        Subject: {
                            Data: 'Your Approval Needed for Promotion!',
                            Charset: 'UTF-8'
                        Body: {
                            Html: {
                                Data: 'Hi!<br />' +
                                    input.employeeName + ' has been nominated for promotion!<br />' +
                                    'Can you please approve:<br />' +
                                    '' + encodeURIComponent(data.taskToken) + '<br />' +
                                    'Or reject:<br />' +
                                    '' + encodeURIComponent(data.taskToken),
                                Charset: 'UTF-8'
                    Source: input.managerEmailAddress,
                    ReplyToAddresses: [
                ses.sendEmail(emailParams, function (err, data) {
                    if (err) {
                        console.log(err, err.stack);
              'Internal Error: The email could not be sent.');
                    } else {
                        context.succeed('The email was successfully sent.');

In the Lambda function handler and role section, for Role, choose Create a new role, LambdaManualStepActivityWorkerRole.


Add two policies to the role: one to allow the Lambda function to call the GetActivityTask API action by calling Step Functions, and one to send an email by calling SES. The result should look as follows:

  "Version": "2012-10-17",
  "Statement": [
      "Effect": "Allow",
      "Action": [
      "Resource": "arn:aws:logs:*:*:*"
      "Effect": "Allow",
      "Action": "states:GetActivityTask",
      "Resource": "arn:aws:states:*:*:activity:ManualStep"
      "Effect": "Allow",
      "Action": "ses:SendEmail",
      "Resource": "*"

In addition, as the GetActivityTask API action performs long-polling with a timeout of 60 seconds, increase the timeout of the Lambda function to 1 minute 15 seconds. This allows the function to wait for an activity to become available, and gives it extra time to call SES to send the email. For all other settings, use the Lambda console defaults.


After this, you can create your activity worker Lambda function.

Test the process

You are now ready to test the employee promotion process.

In the Lambda console, enable the ManualStepPollSchedule trigger on the ManualStepActivityWorker Lambda function.

In the Step Functions console, start a new execution of the state machine with the following input:

{ "managerEmailAddress": "[email protected]", "employeeName" : "Jim" } 

Within a minute, you should receive an email with links to approve or reject Jim’s promotion. Choosing one of those links should succeed or fail the execution.



In this post, you created a state machine containing an activity task with Step Functions, an API with API Gateway, and a Lambda function to dispatch the approval/failure process. Your Step Functions activity task generated a unique token that was returned later indicating either approval or rejection by the person making the decision. Your Lambda function acquired the task token by polling the activity task, and then generated and sent an email to the manager for approval or rejection with embedded hyperlinks to URIs hosted by API Gateway.

If you have questions or suggestions, please comment below.

Resize Images on the Fly with Amazon S3, AWS Lambda, and Amazon API Gateway

Post Syndicated from Bryan Liston original

John Pignata, Solutions Architect

With the explosion of device types used to access the Internet with different capabilities, screen sizes, and resolutions, developers must often provide images in an array of sizes to ensure a great user experience. This can become complex to manage and drive up costs.

Images stored using Amazon S3 are often processed into multiple sizes to fit within the design constraints of a website or mobile application. It’s a common approach to use S3 event notifications and AWS Lambda for eager processing of images when a new object is created in a bucket.

In this post, I explore a different approach and outline a method of lazily generating images, in which a resized asset is only created if a user requests that specific size.

Resizing on the fly

Instead of processing and resizing images into all necessary sizes upon upload, the approach of processing images on the fly has several upsides:

  • Increased agility
  • Reduced storage costs
  • Resilience to failure

Increased agility

When you redesign your website or application, you can add new dimensions on the fly, rather than working to reprocess the entire archive of images that you have stored.

Running a batch process to resize all original images into new, resized dimensions can be time-consuming, costly, and error-prone. With the on-the-fly approach, a developer can instead specify a new set of dimensions and lazily generate new assets as customers use the new website or application.

Reduced storage costs

With eager image processing, the resized images must be stored indefinitely as the operation only happens one time. The approach of resizing on-demand means that developers do not need to store images that are not accessed by users.

As a user request initiates resizing, this also unlocks options for optimizing storage costs for resized image assets, such as S3 lifecycle rules to expire older images that can be tuned to an application’s specific access patterns. If a user attempts to access a resized image that has been removed by a lifecycle rule, the API resizes it on demand to fulfill the request.

Resilience to failure

A key best practice outlined in the Architecting for the Cloud: Best Practices whitepaper is

“Design for failure and nothing will fail.” When building distributed services, developers should be pessimistic and assume that failures will occur.

If image processing is designed to occur only one time upon object creation, an intermittent failure in that process―or any data loss to the processed images―could cause continual failures to future users. When resizing images on-demand, each request initiates processing if a resized image is not found, meaning that future requests could recover from a previous failure automatically.

Architecture overview


Here’s the process:

  1. A user requests a resized asset from an S3 bucket through its static website hosting endpoint. The bucket has a routing rule configured to redirect to the resize API any request for an object that cannot be found.
  2. Because the resized asset does not exist in the bucket, the request is temporarily redirected to the resize API method.
  3. The user’s browser follows the redirect and requests the resize operation via API Gateway.
  4. The API Gateway method is configured to trigger a Lambda function to serve the request.
  5. The Lambda function downloads the original image from the S3 bucket, resizes it, and uploads the resized image back into the bucket as the originally requested key.
  6. When the Lambda function completes, API Gateway permanently redirects the user to the file stored in S3.
  7. The user’s browser requests the now-available resized image from the S3 bucket. Subsequent requests from this and other users will be served directly from S3 and bypass the resize operation. If the resized image is deleted in the future, the above process repeats and the resized image is re-created and replaced into the S3 bucket.

Set up resources

A working example with code is open source and available in the serverless-image-resizing GitHub repo. You can create the required resources by following the README directions, which use an AWS Serverless Application Model (AWS SAM) template, or manually following the directions below.

To create and configure the S3 bucket

  1. In the S3 console, create a new S3 bucket.
  2. Choose Permissions, Add Bucket Policy. Add a bucket policy to allow anonymous access.
  3. Choose Static Website Hosting, Enable website hosting and, for Index Document, enter index.html.
  4. Choose Save.
  5. Note the name of the bucket that you’ve created and the hostname in the Endpoint field.

To create the Lambda function

  1. In the Lambda console, choose Create a Lambda function, Blank Function.
  2. To select an integration, choose the dotted square and choose API Gateway.
  3. To allow all users to invoke the API method, for Security, choose Open and then Next.
  4. For Name, enter resize. For Code entry type, choose Upload a .ZIP file.
  5. Choose Function package and upload the .ZIP file of the contents of the Lambda function.
  6. To configure your function, for Environment variables, add two variables:
    • For Key, enter BUCKET; for Value,enter the bucket name that you created above.
    • For Key, enter URL; for Value, enter the endpoint field that you noted above, prefixed with http://.
  7. To define the execution role permissions for the function, for Role, choose Create a custom role. Choose View Policy Document, Edit, Ok.
  8. Replace YOUR_BUCKET_NAME_HERE with the name of the bucket that you’ve created and copy the following code into the policy document. Note that any leading spaces in your policy may cause a validation error.
  "Version": "2012-10-17",
  "Statement": [
      "Effect": "Allow",
      "Action": [
      "Resource": "arn:aws:logs:*:*:*"
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::__YOUR_BUCKET_NAME_HERE__/*"    
  1. For Memory, choose 1536. For Timeout, enter 10 sec. Choose Next, Create function.
  2. Choose Triggers, and note the hostname in the URL of your function.


To set up the S3 redirection rule

  1. In the S3 console, open the bucket that you created above.
  2. Expand Static Website Hosting, Edit Redirection Rules.
  3. Replace YOUR_API_HOSTNAME_HERE with the hostname that you noted above and copy the following into the redirection rules configuration:

Test image resizing

Upload a test image into your bucket to for testing. The blue marble is a great sample image for testing because it is large and square. Once uploaded, try to retrieve resized versions of the image using your bucket’s static website hosting endpoint:




You should see a smaller version of the test photo. If not, choose Monitoring in your Lambda function and check CloudWatch Logs for troubleshooting. You can also refer to the serverless-image-resizing GitHub repo for a working example that you can deploy to your account.


The solution I’ve outlined is a simplified example of how to implement this functionality. For example, in a real-world implementation, there would likely be a list of permitted sizes to prevent a requestor from filling your bucket with randomly sized images. Further cost optimizations could be employed, such as using S3 lifecycle rules on a bucket dedicated to resized images to expire resized images after a given amount of time.

This approach allows you to lazily generate resized images while taking advantage of serverless architecture. This means you have no operating systems to manage, secure, or patch; no servers to right-size, monitor, or scale; no risk of over-spending by over-provisioning; and no risk of delivering a poor user experience due to poor performance by under-provisioning.

If you have any questions, feedback, or suggestions about this approach, please let us know in the comments!

Seamlessly Scale Predictions with AWS Lambda and MXNet

Post Syndicated from Bryan Liston original

Sunil Mallya, Solutions Architect

Building AI solutions at scale can be challenging, in this blog we’ll look at how to leverage AWS Lambda and MXNet to build a scalable prediction pipeline.

Companies that leverage machine and deep learning invest in much more than just training models. They have sophisticated pipelines that include the following stages:

  • Data storage
  • Pre-processing
  • Feature extraction
  • Model generation
  • Model Analysis
  • Feature engineering
  • Evaluation and feedback


Each stage of the pipeline requires:

  • Elasticity to adapt to changing workload demands
  • Scalability to adapt well to the overall size of the workload
  • Cost effectiveness to optimize total cost of ownership (TCO)

Amazon S3 meets all of the requirements for data and model storage. But the unpredictability of user demand and location can make scaling up for batch predictions (or the results of the model analysis) challenging, and can affect the overall user experience. In this post, we show how to use MXNet and AWS Lambda to deploy models at scale for predictions.

What is MXNet?

MXNet is a full-featured, flexibly programmable, and highly scalable deep learning framework that supports state-of-the-art deep models, including convolutional neural networks (CNNs) and long short-term memory networks (LSTMs). It is the result of collaboration between researchers at several top universities, including the founding institutions of the University of Washington and Carnegie Mellon University.

As discussed in Werner Vogel’s MXNet – Deep Learning Framework of Choice at AWS post, not only is MXNet scalable for multi-instance training, it also scales down to a variety of devices and small memory footprints, even when serving predictions on very large models. MXNet is available through open source under the Apache Version 2 license.

Challenges with the prediction pipeline

As previously mentioned, ML model training and validation is just a small part of the story. After the model is built, the real work begins. To service millions of customers seamlessly, every application must scale. However scaling the prediction portion of the pipeline can be challenging and expensive, especially when end users are geographically dispersed.

Delivering model updates, deploying globally, and maintaining high availability can be difficult. Lambda is a very good deployment option for the prediction pipeline, an excellent solution for serverless web applications, real-time batch processing, and map reduce tasks.

How can you leverage Lambda?

“Code, test, deploy, and let the service do the heavy lifting” is the Lambda customer motto. Regardless of your traffic patterns—high concurrency or bursty workloads—Lambda scales to service your needs.

We compiled and built the MXNet libraries to demonstrate how Lambda scales the prediction pipeline to provide this ease and flexibility for machine learning or deep learning model prediction. We built a sample application that predicts image labels using an 18-layer deep residual network. The model architecture is based on the winning model in the ImageNet competition called ResidualNet. The application produces state-of-the-art results for problems like image classification.

Putting it all together

Lambda has a deployment package limit of 50 MB. This limit means that you might not always be able to package your models along with the code. To accommodate this limitation, you can use S3 to store the model, and download the model when you service the request.

For optimal performance, you need to download the model outside of the lambda_handler function so that the downloaded file persists across< requests in memory. For subsequent Lambda invocations, MXNet uses the downloaded model that’s already in memory. For more information about this optimization, see AWS Lambda: How It Works.

The following reference Lambda function for prediction is quite simple. It loads the model, downloads image from the specified URL, transforms the image into an NDArray, and uses the model to make a prediction that outputs labels, with associated confidence percentages. The implementation code is provided below.

import os
import boto3
import json
import tempfile
import urllib2 
import mxnet as mx
import numpy as np
import cv2
from collections import namedtuple

Batch = namedtuple('Batch', ['data'])
f_params = 'resnet-18-0000.params'
f_symbol = 'resnet-18-symbol.json'

bucket = 'my-model-bucket'
s3 = boto3.resource('s3')
s3_client = boto3.client('s3')


f_params_file = tempfile.NamedTemporaryFile()
s3_client.download_file(bucket, f_params,


f_symbol_file = tempfile.NamedTemporaryFile()
s3_client.download_file(bucket, f_symbol,

def load_model(s_fname, p_fname):
    symbol = mx.symbol.load(s_fname)
    save_dict = mx.nd.load(p_fname)
    arg_params = {}
    aux_params = {}
    for k, v in save_dict.items():
        tp, name = k.split(':', 1)
        if tp == 'arg':
            arg_params[name] = v
        if tp == 'aux':
            aux_params[name] = v
    return symbol, arg_params, aux_params

def predict(url, mod, synsets):
    req = urllib2.urlopen(url)
    arr = np.asarray(bytearray(, dtype=np.uint8)
    cv2_img = cv2.imdecode(arr, -1)
    img = cv2.cvtColor(cv2_img, cv2.COLOR_BGR2RGB)
    if img is None:
        return None
    img = cv2.resize(img, (224, 224))
    img = np.swapaxes(img, 0, 2)
    img = np.swapaxes(img, 1, 2) 
    img = img[np.newaxis, :] 

    prob = mod.get_outputs()[0].asnumpy()
    prob = np.squeeze(prob)
    a = np.argsort(prob)[::-1]
    out = '' 

    for i in a[0:5]:
        out += 'probability=%f, class=%s' %(prob[i], synsets[i])
    out += "\n"

    return out

with open('synset.txt', 'r') as f:
    synsets = [l.rstrip() for l in f]

def lambda_handler(event, context):
    url = ''
    if event['httpMethod'] == 'GET':
        url = event['queryStringParameters']['url']
    elif event['httpMethod'] == 'POST':
        data = json.loads(event['body'])
        url = data['url']

    print "image url: ", url
    sym, arg_params, aux_params = load_model(,
    mod = mx.mod.Module(symbol=sym)
    mod.bind(for_training=False, data_shapes=[('data', (1,3,224,224))])
    mod.set_params(arg_params, aux_params)
    labels = predict(url, mod, synsets)
    out = {
            "headers": {
                "content-type": "application/json",
                "Access-Control-Allow-Origin": "*"
            "body": '{"labels": "%s"}' % labels,  
            "statusCode": 200
    return out

What can you expect in production?

Because the code must download the model from S3, download an image from the web, and run the image through the model, we benchmarked it to evaluate how it scales. We ran a local benchmark on a laptop in San Francisco using wrk to the endpoint deployed in US West (Oregon) Region (us-west-2). We observed an average latency of 1.18 seconds at a sustained rate of 75 requests per second, as shown in the following output.


To benchmark global prediction latencies, we used Goad, a Lambda-based distributed load tester. We observed average latencies ranging from 1.2 seconds to 1.5 seconds. The following figure shows the observed latencies from various regions with the endpoint hosted in the US West (Oregon) Region (us-west-2).



For a state-of-the-art image label prediction model, the numbers are impressive. The average global latency of 1.5 seconds makes it worth exploring integrating Lambda into your machine learning or deep learning pipeline for batch predictions. The libraries and code samples for deployment are available in the mxnet-lambda GitHub repo.

If you have questions or suggestions, please comment below.

Authorizing Access Through a Proxy Resource to Amazon API Gateway and AWS Lambda Using Amazon Cognito User Pools

Post Syndicated from Bryan Liston original

Ed Lima, Solutions Architect

Want to create your own user directory that can scale to hundreds of millions of users? Amazon Cognito user pools are fully managed so that you don’t have to worry about the heavy lifting associated with building, securing, and scaling authentication to your apps.

The AWS Mobile blog post Integrating Amazon Cognito User Pools with API Gateway back in May explained how to integrate user pools with Amazon API Gateway using an AWS Lambda custom authorizer. Since then, we’ve released a new feature where you can directly configure a Cognito user pool authorizer to authenticate your API calls; more recently, we released a new proxy resource feature. In this post, I show how to use these new great features together to secure access to an API backed by a Lambda proxy resource.


In this post, I assume that you have some basic knowledge about the services involved. If not, feel free to review our documentation and tutorials on:

Start by creating a user pool called “myApiUsers”, and enable verifications with optional MFA access for extra security:


Be mindful that if you are using a similar solution for production workloads you will need to request a SMS spending threshold limit increase from Amazon SNS in order to send SMS messages to users for phone number verification or for MFA. For the purposes of this article, since we are only testing our API authentication with a single user the default limit will suffice.

Now, create an app in your user pool, making sure to clear Generate client secret:


Using the client ID of your newly created app, add a user, “jdoe”, with the AWS CLI. The user needs a valid email address and phone number to receive MFA codes:

aws cognito-idp sign-up \
--client-id 12ioh8c17q3stmndpXXXXXXXX \
--username jdoe \
--password [email protected] \
--region us-east-1 \
--user-attributes '[{"Name":"given_name","Value":"John"},{"Name":"family_name","Value":"Doe"},{"Name":"email","Value":"[email protected]"},{"Name":"gender","Value":"Male"},{"Name":"phone_number","Value":"+61XXXXXXXXXX"}]'  

In the Cognito User Pools console, under Users, select the new user and choose Confirm User and Enable MFA:


Your Cognito user is now ready and available to connect.

Next, create a Node.js Lambda function called LambdaForSimpleProxy with a basic execution role. Here’s the code:

'use strict';
console.log('Loading CUP2APIGW2Lambda Function');

exports.handler = function(event, context) {
    var responseCode = 200;
    console.log("request: " + JSON.stringify(event));
    var responseBody = {
        message: "Hello, " + + " " + +"!" + " You are authenticated to your API using Cognito user pools!",
        method: "This is an authorized "+ event.httpMethod + " to Lambda from your API using a proxy resource.",
        body: event.body

    //Response including CORS required header
    var response = {
        statusCode: responseCode,
        headers: {
            "Access-Control-Allow-Origin" : "*"
        body: JSON.stringify(responseBody)

    console.log("response: " + JSON.stringify(response))

For the last piece of the back-end puzzle, create a new API called CUP2Lambda from the Amazon API Gateway console. Under Authorizers, choose Create, Cognito User Pool Authorizer with the following settings:


Create an ANY method under the root of the API as follows:


After that, choose Save, OK to give API Gateway permissions to invoke the Lambda function. It’s time to configure the authorization settings for your ANY method. Under Method Request, enter the Cognito user pool as the authorization for your API:


Finally, choose Actions, Enable CORS. This creates an OPTIONS method in your API:


Now it’s time to deploy the API to a stage (such as prod) and generate a JavaScript SDK from the SDK Generation tab. You can use other methods to connect to your API however in this article I’ll show how to use the API Gateway SDK. Since we are using an ANY method the SDK does not have calls for specific methods other than the OPTIONS method created by Enable CORS, you have to add a couple of extra functions to the apigClient.js file so that your SDK can perform GET and POST operations to your API:

    apigClient.rootGet = function (params, body, additionalParams) {
        if(additionalParams === undefined) { additionalParams = {}; }
        apiGateway.core.utils.assertParametersDefined(params, [], ['body']);       

        var rootGetRequest = {
            verb: 'get'.toUpperCase(),
            path: pathComponent + uritemplate('/').expand(apiGateway.core.utils.parseParametersToObject(params, [])),
            headers: apiGateway.core.utils.parseParametersToObject(params, []),
            queryParams: apiGateway.core.utils.parseParametersToObject(params, []),
            body: body

        return apiGatewayClient.makeRequest(rootGetRequest, authType, additionalParams, config.apiKey);

    apigClient.rootPost = function (params, body, additionalParams) {
        if(additionalParams === undefined) { additionalParams = {}; }
        apiGateway.core.utils.assertParametersDefined(params, ['body'], ['body']);
        var rootPostRequest = {
            verb: 'post'.toUpperCase(),
            path: pathComponent + uritemplate('/').expand(apiGateway.core.utils.parseParametersToObject(params, [])),
            headers: apiGateway.core.utils.parseParametersToObject(params, []),
            queryParams: apiGateway.core.utils.parseParametersToObject(params, []),
            body: body
        return apiGatewayClient.makeRequest(rootPostRequest, authType, additionalParams, config.apiKey);


You can now use a little front end web page to authenticate users and test authorized calls to your API. In order for it to work, you need to add some external libraries and dependencies including the API Gateway SDK you just generated. You can find more details in our Cognito as well as API Gateway SDK documentation guides.

With the dependencies in place, you can use the following JavaScript code to authenticate your Cognito user pool user and connect to your API in order to perform authorized calls (replace your own user pool Id and client ID details accordingly):

<script type="text/javascript">
 //Configure the AWS client with the Cognito role and a blank identity pool to get initial credentials

    region: 'us-east-1',
    credentials: new AWS.CognitoIdentityCredentials({
      IdentityPoolId: ''

  AWSCognito.config.region = 'us-east-1';
  AWSCognito.config.update({accessKeyId: 'null', secretAccessKey: 'null'});
  var token = "";
  //Authenticate user with MFA

  document.getElementById("buttonAuth").addEventListener("click", function(){  
    var authenticationData = {
      Username : document.getElementById('username').value,
      Password : document.getElementById('password').value,

    var showGetPut = document.getElementById('afterLogin');
    var hideLogin = document.getElementById('login');

    var authenticationDetails = new AWSCognito.CognitoIdentityServiceProvider.AuthenticationDetails(authenticationData);

   // Replace with your user pool details

    var poolData = { 
        UserPoolId : 'us-east-1_XXXXXXXXX', 
        ClientId : '12ioh8c17q3stmndpXXXXXXXX', 
        Paranoia : 7

    var userPool = new AWSCognito.CognitoIdentityServiceProvider.CognitoUserPool(poolData);

    var userData = {
        Username : document.getElementById('user').value,
        Pool : userPool

    var cognitoUser = new AWSCognito.CognitoIdentityServiceProvider.CognitoUser(userData);
    cognitoUser.authenticateUser(authenticationDetails, {
      onSuccess: function (result) {
        token = result.getIdToken().getJwtToken(); // CUP Authorizer = ID Token
        console.log('ID Token: ' + result.getIdToken().getJwtToken()); // Show ID Token in the console
        var cognitoGetUser = userPool.getCurrentUser();
        if (cognitoGetUser != null) {
          cognitoGetUser.getSession(function(err, result) {
            if (result) {
              console.log ("User Successfuly Authenticated!");  

        //Hide Login form after successful authentication = 'block'; = 'none';
    onFailure: function(err) {
    mfaRequired: function(codeDeliveryDetails) {
            var verificationCode = prompt('Please input a verification code.' ,'');
            cognitoUser.sendMFACode(verificationCode, this);

//Send a GET request to the API

document.getElementById("buttonGet").addEventListener("click", function(){
  var apigClient = apigClientFactory.newClient();
  var additionalParams = {
      headers: {
        Authorization: token

      .then(function(response) {
        document.getElementById("output").innerHTML = ('<pre align="left"><code>Response: '+JSON.stringify(, null, 2)+'</code></pre>');
      }).catch(function (response) {
        document.getElementById('output').innerHTML = ('<pre align="left"><code>Error: '+JSON.stringify(response, null, 2)+'</code></pre>');

//Send a POST request to the API

document.getElementById("buttonPost").addEventListener("click", function(){
  var apigClient = apigClientFactory.newClient();
  var additionalParams = {
      headers: {
        Authorization: token
 var body = {
        "message": "Sample POST payload"

      .then(function(response) {
        document.getElementById("output").innerHTML = ('<pre align="left"><code>Response: '+JSON.stringify(, null, 2)+'</code></pre>');
      }).catch(function (response) {
        document.getElementById('output').innerHTML = ('<pre align="left"><code>Error: '+JSON.stringify(response, null, 2)+'</code></pre>');

As far as the front end is concerned you can use some simple HTML code to test, such as the following snippet:

<div id="container" class="container">
    <img src="">
    <h1>Cognito User Pools and API Gateway</h1>
    <form name="myform">
          <li class="fields">
            <div id="login">
            <label>User Name: </label>
            <input id="username" size="60" class="req" type="text"/>
            <label>Password: </label>
            <input id="password" size="60" class="req" type="password"/>
            <button class="btn" type="button" id='buttonAuth' title="Log in with your username and password">Log In</button>
            <br />
            <div id="afterLogin" style="display:none;"> 
            <br />
            <button class="btn" type="button" id='buttonPost'>POST</button>
            <button class="btn" type="button" id='buttonGet' >GET</button>
            <br />
    <div id="output"></div>

After adding some extra CSS styling of your choice (for example adding "list-style: none" to remove list bullet points), the front end is ready. You can test it by using a local web server in your computer or a static website on Amazon S3.

Enter the user name and password details for John Doe and choose Log In:


A MFA code is then sent to the user and can be validated accordingly:


After authentication, you can see the ID token generated by Cognito for further access testing:


If you go back to the API Gateway console and test your Cognito user pool authorizer with the same token, you get the authenticated user claims accordingly:


In your front end, you can now perform authenticated GET calls to your API by choosing GET.


Or you can perform authenticated POST calls to your API by choosing POST.


The calls reach your Lambda proxy and return a valid response accordingly. You can also test from the command line using cURL, by sending the user pool ID token that you retrieved from the developer console earlier, in the “Authorization” header:


It’s possible to improve this solution by integrating an Amazon DynamoDB table, for instance. You could detect the method request on event.httpMethod in the Lambda function and issue a GetItem call to a table for a GET request or a PutItem call to a table for a POST request. There are lots of possibilities for this kind of proxy resource integration.


The Cognito user pools integration with API Gateway provides a new way to secure your API workloads, and the new proxy resource for Lambda allows you to perform any business logic or transformations to your API calls from Lambda itself instead of using body mapping templates. These new features provide very powerful options to secure and handle your API logic.

I hope this post helps with your API workloads. If you have questions or suggestions, please comment below.

Managing Your AWS Resources Through a Serverless Policy Engine

Post Syndicated from Bryan Liston original

Stephen Liedeg
Stephen Liedeg, Solutions Architect

Customers are using AWS Lambda in new and interesting ways every day, from data processing of Amazon S3 objects, Amazon DynamoDB streams, and Amazon Kinesis triggers, to providing back-end processing logic for Amazon API Gateway.

In this post, I explore ways in which you can use Lambda as a policy engine to manage your AWS infrastructure. Lambda’s ability to react to platform events makes it an ideal solution for handling changes to your AWS resource state and enforcing organizational policy.

With support for a growing number of triggers, Lambda provides a lightweight, customizable, and cost effective solution to do things like:

  • Shut down idle resources or schedule regular shutdowns during nights, weekends, and public holidays
  • Clean up snapshots older than 6 months
  • Execute regular patching/server maintenance by automating execution of Amazon EC2 Run Command scripts
  • React to changes in your environment by evaluating AWS Config events
  • Perform a custom action if resources are created in regions that you do not wish to run workloads

I have created a sample application that demonstrates how to create a Lambda function to verify whether instances launched into a VPC conform to organizational tagging policies.

Tagging policy solution

Tagging policies are important because they help customers manage and control their AWS resources. Many customers use tags to identify the lifespan of a resource, their security, or operational context, or to assist with billing and cost tracking by assigning cost center codes to resources and later using them to generate billing reports. For these reasons, it is not uncommon for customers to take a “hard-line” approach and simply terminate or isolate compute resources that haven’t been tagged appropriately, in order to drive cost efficiencies and maintain integrity in their environments.

The tagging policy example in this post takes a middle-ground approach, in that it applies some decision-making logic based on a collection of policy rules, and then notifies system administrators of the actions taken on an EC2 instance.

A high-level view of the solution looks like this:


  1. The tagging policy function uses an Amazon CloudWatch scheduled event, which allows you to schedule the execution of your Lambda functions using cron or rate expressions, thereby enabling policy control checks at regular intervals on new and existing EC2 resources.
  2. Tag policies are pulled from DynamoDB, which provides a fast and extensible solution for storing policy definitions that can be modified independently of the function execution.
  3. The function looks for EC2 instances within a specified VPC and verifies that the tags associated with each instance conform to the policy rules.
  4. If required, missing information, such as user name of the IAM user who launched the instance, is retrieved from AWS CloudTrail.
  5. A summary notification of actions undertaken is pushed to an Amazon SNS topic to notify administrators of the policy violations and actions performed.

Note that, while I have chosen to demonstrate the CloudWatch scheduled event trigger to invoke the Lambda function, there are a number of other ways in which you could trigger a tagging policy function. Using AWS CloudTrail or AWS Config, for example, youI could filter events of type ‘RunInstances’ or create a custom config rule, to determine whether newly-created EC2 resources match your tagging policies.

Define the policies

This walkthrough uses DynamoDB to store the policies for each of the tags. DynamoDB provides a scalable, single-digit millisecond latency data store, supporting both document and key-value data models that allows me to extend and evolve my policy model easily over time. Given the nature and size of the data, DynamoDB is also a cost-effective option over a relational database solution. The table you create for this example is straightforward, using a single HASH key to identify the rule.



Use the following AWS CLI command to create the table:

aws dynamodb create-table --table-name acme_cloud_policy_tagging_def --key-schema AttributeName=RuleId,KeyType=HASH --attribute-definitions AttributeName=RuleId,AttributeType=N  --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

The sample policy items have been extended with additional attributes:

  • TagKey
  • Action
  • Required
  • Default

These attributes will help build a list a list of policy definitions for each tag and the corresponding behavior that your function should implement should the tags be missing or have no value assigned to them.

The following items have been added to the tagging policy table:

RuleId (N) TagKey (S) Action (S) Required (S) Default (S)
1 ProjectCode Update Y Proj007
2 CreatedBy UserLookup Y
3 Expires Function N today()

In this example, the default behavior for instances launched into the VPC with no tags is to terminate them immediately. This action may not be appropriate for all scenarios, and could be enhanced by stopping the instance (rather than terminating it) and notifying the resource owners that further action is required.

The Update action either creates a tag key and sets the default value if they have been marked as required, or sets the default value if the tag key is present, but has no value.

The UserLookup action in this case searches CloudTrail logs for the IAM user that launched the EC2 instance, and sets the value if it is missing.

Now that the policies have been defined, take a closer look at the actual Lambda function implementation.

Set up the trigger

The first thing you need to do before you create the Lambda function to execute the tagging policy is to create a trigger that runs the function automatically after a specified fixed rate expression, such as run "rate(1 hour)”, or via a cron expression.


After it’s configured, the resulting event looks something like this:

  "account": "123456789012",
  "region": "ap-southeast-2",
  "detail": {},
  "detail-type": "Scheduled Event",
  "source": "",
  "time": "1970-01-01T00:00:00Z",
  "id": "cdc73f9d-aea9-11e3-9d5a-835b769c0d9c",
  "resources": [

Create the Lambda execution policy

The next thing you need to do is define the IAM role under which this Lambda function executes. In addition to the CloudWatchLogs permissions to enable logging on the function, you need to call ec2:DescribeInstances on your EC2 resources to find tag information for the instances in your environment. You also require permissions to read policy definitions from a specified DynamoDB table and to then be allowed to publish the policy reports via Amazon SNS. Working on the basis of least-privilege, the IAM role policy looks something like the following:

    "Version": "2012-10-17",
    "Statement": [
            "Sid": "StmtReadOnlyDynamoDB",
            "Action": [
            "Effect": "Allow",
            "Resource": "arn:aws:dynamodb:ap-southeast-2:123456789012:table/acme_cloud_policy_tagging_definitions"
            "Sid": "StmtLookupCloudTrailEvents",
            "Action": [
            "Effect": "Allow",
            "Resource": "*"
            "Sid": "StmtLambdaCloudWatchLogs",
            "Action": [
            "Effect": "Allow",
            "Resource": "arn:aws:logs:*:*:*"
            "Sid": "StmtPublishSnsNotifications",
            "Action": [
            "Effect": "Allow",
            "Resource": "arn:aws:sns:ap-southeast-2:123456789012:acme_cloud_policy_notifications"
            "Sid": "StmtDescribeEC2",
            "Action": [
            "Effect": "Allow",
            "Resource": "*"

Create the Lambda function

For this example, you create a Python function. The function itself is broken into a number of subroutines, each performing a specific function in the policy execution.

AWS Lambda function handler

def lambda_handler(event, context):
    print('Beginning Policy check.')
    policies = get_policy_definitions()
    for instance in find_instances('vpc-abc123c1'):
        validate_instance_tags(instance, policies)
    if len(report_items) > 0:
    print('Policy check complete.')
    return 'OK'

The Lambda function orchestrates the policy logic in the following way:

  1. Load the policy rules from the DynamoDB table:
def get_policy_definitions():
    dynamodb = boto3.resource('dynamodb')
    policy_table = dynamodb.Table('acme_cloud_policy_tagging_def')
    response = policy_table.scan()
    policies = response['Items']
    return policies  
  1. Find the tags for all EC2 instances within a specified VPC. Note that this rule processing revalidates every instance; this is to ensure that no changes have been made to instance tagging after the last policy execution. For simplicity, the VPC ID has been hard-coded into the function. In a production scenario, you would look this value up:
def find_instances(vpc_id):
    ec2 = boto3.resource('ec2')
    vpc = ec2.Vpc('%s' % vpc_id)
    return list(vpc.instances.all())
  1. After you have all the instances in the VPC, apply the policies:
def validate_instance_tags(instance, policies):
    print(u'Validating tags for instance: {0:s} '.format(
    tags = instance.tags
    if tags is None:
        report_items.append(u'{0:s} has been terminated. Reason: No tags found.'.format(

    for p in policies:
        policy_key = p['TagKey']
        policy_action = p['Action']
        if 'Default' in p:
            policy_default_value = p['Default']
            policy_default_value = ''
        if not policy_key_exists(tags, policy_key):
            print(u'Instance {0:s} is missing tag {1:s}. Applying policy.'.format(, policy_key))

            if policy_action == 'Update':
                instance.create_tags(Tags=[{'Key': policy_key, 'Value': policy_default_value}])
                report_items.append(u'Instance {0:s} missing tag {1:s}. New tag created.'.format(, policy_key))

            elif policy_action == 'UserLookup':
                    user_id = find_who_launched_instance(
                    report_items.append(u'Instance {0:s} missing tag {1:s}. User name set.'.format(, e.message))
                except StandardError as e:
                    user_id = "Undefined"
                    report_items.append(u'Instance {0:s} missing tag {1:s}. User name set to Undefined.'.format(, e.message))

                instance.create_tags(Tags=[{'Key': policy_key, 'Value': user_id}])

            elif policy_action == 'Function':
                if policy_default_value == 'today()':
                    instance.create_tags(Tags=[{'Key': policy_key, 'Value': str(}])
                    report_items.append(u'Instance {0:s} missing tag {1:s}. New tag created.'.format(, policy_key))
  1. The CreatedBy tag rule is defined as Lookup, meaning if the tag is missing or empty, you search the CloudTrail logs to determine the IAM user that launched a specified instance. If the IAM user ID is found, the tag value is set to the instance:
def find_who_launched_instance(instance_id):
    response = cloudtrail.lookup_events(
                'AttributeKey': 'EventName',
                'AttributeValue': 'RunInstances'
        StartTime=datetime(2016, 6, 4),,

    events_list = response['Events']
    for event in events_list:
        resources = event['Resources']
        for resource in resources:
            if (resource['ResourceType'] == 'AWS::EC2::Instance') and (resource['ResourceName'] == instance_id):
                return event['Username']
                raise Exception("Unable to determine IAM user that launched instance.")
  1. Finally, after all the policy rules have been applied to the instances in your VPC, send an Amazon SNS notification, to which your system administrators have been subscribed, to inform them of any policy violations and the actions taken by the Lambda function:
def send_notification():
    print("Sending notification.")
    topic_arn = 'arn:aws:sns:ap-southeast-2:12345678910:acme_cloud_policy_notifications'

    message = 'These following tagging policy violation occurred:\\n'

    for ri in report_items:
        message += '-- {0:s} \n'.format(ri)
        sns = boto3.client('sns')
        sns.publish(TopicArn=('%s' % topic_arn),
                    Subject='ACME Cloud Tagging Policy Report',
    except ClientError as ex:
        raise Exception(ex.message)  

The emailed report generated by the policy engine generates the following output. The format of the notification is, of course, customisable and can contain as much or as little information as needed. These notifications can also act as a trigger themselves, allowing you to link policies.



As I have demonstrated, using Lambda as a policy engine to manage your AWS resources and to maintain operational integrity of your environment is an extremely lightweight, powerful, and customisable solution.

Policies can be composed in a number of ways, and integrating them with various triggers provides an ideal mechanism for creating a secure, automated, proactive, event-driven infrastructure across all your regions. And given that the first 1 million requests per month are free, you’d be able to manage a significant portion of your infrastructure for little or no cost.

Furthermore, the concepts presented in this post aren’t specific to managing your infrastructure; they can quite easily also be applied to a security context. Monitoring changes in your security groups or network ACLs through services like AWS Config allow you to proactively take action on unauthorised changes in your environment. (This may be the subject of another post.)

If you have questions or suggestions, please comment below.

Continuous Deployment for Serverless Applications

Post Syndicated from Bryan Liston original

With a continuous deployment infrastructure, developers can quickly and safely release new features and bug fixes for their applications without manually triggering any deployment scripts. Amazon Web Services offers a number of products that make the creation of deployment pipelines easier:

A typical serverless application consists of one or more functions triggered by events such as object uploads to Amazon S3, Amazon SNS notifications, or API actions. Those functions can stand alone or leverage other resources such as Amazon DynamoDB tables or S3 buckets. The most basic serverless application is simply a function.

This post shows you how to leverage AWS services to create a continuous deployment pipeline for your serverless applications. You use the Serverless Application Model (SAM) to define the application and its resources, CodeCommit as your source repository, CodeBuild to package your source code and SAM templates, AWS CloudFormation to deploy your application, and CodePipeline to bring it all together and orchestrate your application deployment.

Creating a pipeline

Pipelines pick up source code changes from a repository, build and package the application, and then push the new update through a series of stages, running integration tests to ensure that all features are intact and backward-compatible on each stage.

Each stage uses its own resources; for example, if you have a "dev" stage that points to a "dev" function, they are completely separate from the "prod" stage that points to a "prod" function. If your application uses other AWS services, such as S3 or DynamoDB, you should also have different resources for each stage. You can use environment variables in your AWS Lambda function to parameterize the resource names in the Lambda code.

To make this easier for you, we have created a CloudFormation template that deploys the required resources. If your application conforms to the same specifications as our sample, this pipeline will work for you:

  • The source repository contains an application SAM file and a test SAM file.
  • The SAM file called app-sam.yaml defines all of the resources and functions used by the application. In the sample, this is a single function that uses the Express framework and the aws-serverless-express library.
  • The application SAM template exports the API endpoint generated in a CloudFormation output variable called ApiUrl.
  • The SAM file called test-sam.yaml defines a single function in charge of running the integration tests on each stage of the deployment.
  • The test SAM file exports the name of the Lambda function that it creates to a CloudFormation output variable called TestFunction.

You can find the link to start the pipeline deployment at the end of this section. The template asks for a name for the service being deployed (the sample is called TimeService) and creates a CodeCommit repository to hold the application’s source code, a CodeBuild project to package the SAM templates and prepare them for deployment, an S3 bucket to store build artifacts along the way, and a multi-stage CodePipeline pipeline for deployments.

The pipeline picks up your code when it’s committed to the source repository, runs the build process, and then proceeds to start the deployment to each stage. Before moving on to the next stage, the pipeline also executes integration tests: if the tests fail, the pipeline stops.

This pipeline consists of six stages:

  1. Source – the source step picks up new commits from the CodeCommit repository. CodePipeline also supports S3 and GitHub as sources for this step.
  2. Build – Using CodeBuild, you pull down your application’s dependencies and use the AWS CLI to package your app and test SAM templates for deployment. The buildspec.yml file in the root of the sample application defines the commands that CodeBuild executes at each step.
  3. DeployTests – In the first step, you deploy the updated integration tests using the test-sam.yaml file from your application. You deploy the updated tests first so that they are ready to run on all the following stages of the pipeline.
  4. Beta – This is the first step for your app’s deployment. Using the SAM template packaged in the Build step, you deploy the Lambda function and API Gateway endpoint to the beta stage. At the end of the deployment, this stage run your test function against the beta API.
  5. Gamma – Push the updated function and API to the gamma stage, then run the integration tests again.
  6. Prod – Rinse, repeat. Before proceeding with the prod deployment, your sample pipeline has a manual approval step.

Running the template

  1. Choose Launch Stack below to create the pipeline in your AWS account. This button takes you to the Create stack page of the CloudFormation console with the S3 link to the pre-populated template.
  2. Choose Next and customize your StackName and ServiceName.
  3. Skip the Options screen, choose Next, acknowledge the fact that the template can create IAM roles in your account, and choose Create.

Running integration tests

Integration tests decide whether your pipeline can move on and deploy the app code to the next stage. To keep the pipeline completely serverless, we decided to use a Lambda function to run the integration tests.

To run the test function, the pipeline template also includes a Lambda function called <YourServiceName>_start_tests. The start_tests function reads the output of the test deployment CloudFormation stack as well as the current stage’s stack, extracts the output values from the stacks (the API endpoint and the test function name), and triggers an asynchronous execution of the test function. The test function is then in charge of updating the CodePipeline job status with the outcome of the tests. The test function in the sample application generates a random success or failure output.

In the future, for more complex integration tests, you could use AWS Step Functions to execute multiple tests at the same time.

The sample application

The sample application is a very simple API; it exposes time and time/{timeZone} endpoints that return the current time. The code for the application is written in JavaScript and uses the moment-timezone library to generate and format the timestamps. Download the source code for the sample application.

The source code includes the application itself under the app folder, and the integration tests for the application under the test folder. In the root directory for the sample, you will find two SAM templates, one for the application and one for the test function. The buildspec.yml file contains the instructions for the CodeBuild container. At the moment, the buildspecs use npm to download the app’s dependencies and then the CloudFormation package command of the AWS CLI to prepare the SAM deployment package. For a sophisticated application, you would run your unit tests in the build step.

After you have downloaded the sample code, you can push it to the CodeCommit repository created by the pipeline template. The app-sam.yaml and test-sam.yaml files should be in the root of the repository. Using the CodePipeline console, you can follow the progress of the application deployment. The first time the source code is imported, the deployment can take a few minutes to start. Keep in mind that for the purpose of this demo, the integration tests function generates random failures.

After the application is deployed to a stage, you can find the API endpoint URL in the CloudFormation console by selecting the correct stack in the list and opening the Outputs tab in the bottom frame.


Continuous deployment and integration are a must for modern application development. It allows teams to iterate on their app at a faster clip and deliver new features and fixes in customers’ hands quickly. With this pipeline template, you can bring this automation to your serverless applications without writing any additional code or managing any infrastructure.

You can re-use the same pipeline template for multiple services. The only requirement is that they conform to the same structure as the sample app with the app-sam.yaml and test-sam.yaml in the same repository.

Scripting Languages for AWS Lambda: Running PHP, Ruby, and Go

Post Syndicated from Bryan Liston original

Dimitrij Zub
Dimitrij Zub, Solutions Architect
Raphael Sack
Raphael Sack, Technical Trainer

In our daily work with partners and customers, we see a lot of different amazing skills, expertise and experience in many fields, and programming languages. From languages that have been around for a while to languages on the cutting edge, many teams have developed a deep understanding of concepts of each language; they want to apply these languages with and within the innovations coming from AWS, such as AWS Lambda.

Lambda provides native support for a wide array of languages, such as Scala, Java, Node.js, Python, and C/C++ Linux native applications. In this post, we outline how you can use Lambda with different scripting languages.

For each language, you perform the following tasks:

  • Prepare: Launch an instance from an AMI and log in via SSH
  • Compile and package the language for Lambda
  • Install: Create the Lambda package and test the code

The preparation and installation steps are similar between languages, but we provide step-by-step guides and examples for compiling and packaging PHP, Go, and Ruby.

Common steps to prepare

You can use the capabilities of Lambda to run arbitrary executables to prepare the binaries to be executed within the Lambda environment.

The following steps are only an overview on how to get PHP, Go, or Ruby up and running on Lambda; however, using this approach, you can add more specific libraries, extend the compilation scope, and leverage JSON to interconnect your Lambda function to Amazon API Gateway and other services.

After your binaries have been compiled and your basic folder structure is set up, you won’t need to redo those steps for new projects or variations of your code. Simply write the code to accept inputs from STDIN and return to STDOUT and the written Node.js wrapper takes care of bridging the runtimes for you.

For the sake of simplicity, we demonstrate the preparation steps for PHP only, but these steps are also applicable for the other environments described later.

In the Amazon EC2 console, choose Launch instance. When you choose an AMI, use one of the AMIs in the Lambda Execution Environment and Available Libraries list, for the same region in which you will run the PHP code and launch an EC2 instance to have a compiler. For more information, see Step 1: Launch an Instance.

Pick t2.large as the EC2 instance type to have two cores and 8 GB of memory for faster PHP compilation times.


Choose Review and Launch to use the defaults for storage and add the instance to a default, SSH only, security group generated by the wizard.

Choose Launch to continue; in the launch dialog, you can select an existing key-pair value for your login or create a new one. In this case, create a new key pair called “php” and download it.


After downloading the keys, navigate to the download folder and run the following command:

chmod 400 php.pem

This is required because of SSH security standards. You can now connect to the instance using the EC2 public DNS. Get the value by selecting the instance in the console and looking it up under Public DNS in the lower right part of the screen.

ssh -i php.pem [email protected][PUBLIC DNS]

You’re done! With this instance up and running, you have the right AMI in the right region to be able to continue with all the other steps.

Getting ready for PHP

After you have logged in to your running AMI, you can start compiling and packaging your environment for Lambda. With PHP, you compile the PHP 7 environment from the source and make it ready to be packaged for the Lambda environment.

Setting up PHP on the instance

The next step is to prepare the instance to compile PHP 7, configure the PHP 7 compiler to output in a defined directory, and finally compile PHP 7 to the Lambda AMI.

Update the package manager by running the following command:

sudo yum update –y

Install the minimum necessary libraries to be able to compile PHP 7:

sudo yum install gcc gcc-c++ libxml2-devel -y 

With the dependencies installed, you need to download the PHP 7 sources available from

PHP Downloads.

For this post, we were running the EC2 instance in Ireland, so we selected as our mirror. Run the following command to download the sources to the instance and choose your own mirror for the appropriate region.

cd ~

wget .

Extract the files using the following command:

tar -jxvf php-7.0.7.tar.bz2

This creates the php-7.0.7 folder in your home directory. Next, create a dedicated folder for the php-7 binaries by running the following commands.

mkdir /home/ec2-user/php-7-bin

./configure --prefix=/home/ec2-user/php-7-bin/

This makes sure the PHP compilation is nicely packaged into the php binaries folder you created in your home directory. Keep in mind, that you only compile the baseline PHP here to reduce the amount of dependencies required for your Lambda function.

You can add more dependencies and more compiler options to your PHP binaries using the options available in ./configure. Run ./configure –h for more information about what can be packaged into your PHP distribution to be used with Lambda, but also keep in mind that this will increase the overall binaries package.

Finally, run the following command to start the compilation:

make install


After the compilation is complete, you can quickly confirm that PHP is functional by running the following command:

cd ~/php-7-bin/bin/

./php –v

PHP 7.0.7 (cli) (built: Jun 16 2016 09:14:04) ( NTS )

Copyright (c) 1997-2016 The PHP Group

Zend Engine v3.0.0, Copyright (c) 1998-2016 Zend Technologies

Time to code

Using your favorite editor, you can create an entry point PHP file, which in this case reads input from a Linux pipe and provide its output to stdout. Take a simple JSON document and count the amounts of top-level attributes for this matter. Name the file HelloLambda.php.


$data = stream_get_contents(STDIN);

$json = json_decode($data, true);

$result = json_encode(array('result' => count($json)));

echo $result."n";


Creating the Lambda package

With PHP compiled and ready to go, all you need to do now is to create your Lambda package with the Node.js wrapper as an entry point.

First, tar the php-7-bin folder where the binaries reside using the following command:

cd ~

tar -zcvf php-7-bin.tar.gz php-7-bin/

Download it to your local project folder where you can continue development, by logging out and running the following command from your local machine (Linux or OSX), or using tools like WinSCP on Windows:

scp -i php.pem [email protected][EC2_HOST]:~/php-7-bin.tar.gz .

With the package download, you create your Lambda project in a new folder, which you can call php-lambda for this specific example. Unpack all files into this folder, which should result in the following structure:


+-- php-7-bin

The next step is to create a Node.js wrapper file. The file takes the inputs of the Lambda invocations, invoke the PHP binary with helloLambda.php as a parameter, and provide the inputs via Linux pipe to PHP for processing. Call the file php.js and copy the following content:

process.env['PATH'] = process.env['PATH'] + ':' + process.env['LAMBDA_TASK_ROOT'];

const spawn = require('child_process').spawn;

exports.handler = function(event, context) {

    //var php = spawn('php',['helloLambda.php']); //local debug only
    var php = spawn('php-7-bin/bin/php',['helloLambda.php']);
    var output = "";

    //send the input event json as string via STDIN to php process

    //close the php stream to unblock php process

    //dynamically collect php output
    php.stdout.on('data', function(data) {

    //react to potential errors
    php.stderr.on('data', function(data) {
            console.log("STDERR: "+data);

    //finalize when php process is done.
    php.on('close', function(code) {

//local debug only

With all the files finalized, the folder structure should look like the following:


+– php-7-bin

— helloLambda.php

— php.js

The final step before the deployment is to zip the package into an archive which can be uploaded to Lambda. Call the package Feel free to remove unnecessary files, such as phpdebug, from the php-7-bin/bin folder to reduce the size of the archive.

Go, Lambda, go!

The following steps are an overview of how to compile and execute Go applications on Lambda. As with the PHP section, you are free to enhance and build upon the Lambda function with other AWS services and your application infrastructure. Though this example allows you to use your own Linux machine with a fitting distribution to work locally, it might still be useful to understand the Lambda AMIs for test and automation.

To further enhance your environment, you may want to create an automatic compilation pipeline and even deployment of the Go application to Lambda. Consider using versioning and aliases, as they help in managing new versions and dev/test/production code.

Setting up Go on the instance

The next step is to set up the Go binaries on the instance, so that you can compile the upcoming application.

First, make sure your packages are up to date (always):

sudo yum update -y

Next, visit the official Go site, check for the latest version, and download it to EC2 or to your local machine if using Linux:

cd ~

wget .

Extract the files using the following command:

tar -xvf go1.6.2.linux-amd64.tar.

This creates a folder named “go” in your home directory.

Time to code

For this example, you create a very simple application that counts the amount of objects in the provided JSON element. Using your favorite editor, create a file named “HelloLambda.go” with the following code directly on the machine to which you have downloaded the Go package, which may be the EC2 instance you started in the beginning or your local environment, in which case you are not stuck with vi.

package main

import (

func main() {
    var dat map[string]interface{}
    fmt.Printf( "Welcome to Lambda Go, now Go Go Go!n" )
    if len( os.Args ) < 2 {
        fmt.Println( "Missing args" )

    err := json.Unmarshal([]byte(os.Args[1]), &dat)

    if err == nil {
        fmt.Println( len( dat ) )
    } else {

Before compiling, configure an environment variable to tell the Go compiler where all the files are located:

export GOROOT=~/go/

You are now set to compile a nifty new application!

~/go/bin/go build ./HelloLambda.go

Start your application for the very first time:

./HelloLambda '{ "we" : "love", "using" : "Lambda" }'

You should see output similar to:

Welcome to Lambda Go, now Go Go Go!


Creating the Lambda package

You have already set up your machine to compile Go applications, written the code, and compiled it successfully; all that is left is to package it up and deploy it to Lambda.

If you used an EC2 instance, copy the binary from the compilation instance and prepare it for packaging. To copy out the binary, use the following command from your local machine (Linux or OSX), or using tools such as WinSCP on Windows.

scp -i GoLambdaGo.pem [email protected]:~/goLambdaGo .

With the binary ready, create the Lambda project in a new folder, which you can call go-lambda.

The next step is to create a Node.js wrapper file to invoke the Go application; call it go.js. The file takes the inputs of the Lambda invocations and invokes the Go binary.

Here’s the content for another example of a Node.js wrapper:

const exec = require('child_process').exec;
exports.handler = function(event, context) {
    const child = exec('./goLambdaGo ' + ''' + JSON.stringify(event) + ''', (error) => {
        // Resolve with result of process
        context.done(error, 'Process complete!');

    // Log process stdout and stderr
    child.stdout.on('data', console.log);
    child.stderr.on('data', console.error);


With all the files finalized and ready, your folder structure should look like the following:


— go.js

— goLambdaGo

The final step before deployment is to zip the package into an archive that can be uploaded to Lambda; call the package

On a Linux or OSX machine, run the following command:

zip -r ./goLambdaGo ./go.js

A gem in Lambda

For convenience, you can use the same previously used instance, but this time to compile Ruby for use with Lambda. You can also create a new instance using the same instructions.

Setting up Ruby on the instance

The next step is to set up the Ruby binaries and dependencies on the EC2 instance or local Linux environment, so that you can package the upcoming application.

First, make sure your packages are up to date (always):

sudo yum update -y

For this post, you use Traveling Ruby, a project that helps in creating “portable”, self-contained Ruby packages. You can download the latest version from Traveling Ruby linux-x86:

cd ~

wget .

Extract the files to a new folder using the following command:

mkdir LambdaRuby

tar -xvf traveling-ruby-20150715-2.2.2-linux-x86_64.tar.gz -C LambdaRuby

This creates the “LambdaRuby” folder in your home directory.

Time to code

For this demonstration, you create a very simple application that counts the amount of objects in a provided JSON element. Using your favorite editor, create a file named “lambdaRuby.rb” with the following code:


require 'json'

# You can use this to check your Ruby version from within puts(RUBY_VERSION)

if ARGV.length > 0
    puts JSON.parse( ARGV[0] ).length
    puts "0"

Now, start your application for the very first time, using the following command:

./lambdaRuby.rb '{ "we" : "love", "using" : "Lambda" }'

You should see the amount of fields in the JSON as output (2).

Creating the Lambda package

You have downloaded the Ruby gem, written the code, and tested it successfully… all that is left is to package it up and deploy it to Lambda. Because Ruby is an interpreter-based language, you create a Node.js wrapper and package it with the Ruby script and all the Ruby files.

The next step is to create a Node.js wrapper file to invoke your Ruby application; call it ruby.js. The file takes the inputs of the Lambda invocations and invoke your Ruby application. Here’s the content for a sample Node.js wrapper:

const exec = require('child_process').exec;

exports.handler = function(event, context) {
    const child = exec('./lambdaRuby.rb ' + ''' + JSON.stringify(event) + ''', (result) => {
        // Resolve with result of process

    // Log process stdout and stderr
    child.stdout.on('data', console.log);
    child.stderr.on('data', console.error);

With all the files finalized and ready, your folder structure should look like this:


+– bin

+– bin.real

+– info

— lambdaRuby.rb

+– lib

— ruby.js

The final step before the deployment is to zip the package into an archive to be uploaded to Lambda. Call the package

On a Linux or OSX machine, run the following command:

zip -r ./

Copy your zip file from the instance so you can upload it. To copy out the archive, use the following command from your local machine (Linux or OSX), or using tools such as WinSCP on Windows.

scp -i RubyLambda.pem [email protected]:~/LambdaRuby/ .

Common steps to install

With the package done, you are ready to deploy the PHP, Go, or Ruby runtime into Lambda.

Log in to the AWS Management Console and navigate to Lambda; make sure that the region matches the one which you selected the AMI for in the preparation step.

For simplicity, I’ve used PHP as an example for the deployment; however, the steps below are the same for Go and Ruby.

Creating the Lambda function

Choose Create a Lambda function, Skip. Select the following fields and upload your previously created archive.


The most important areas are:

  • Name: The name to give your Lambda function
  • Runtime: Node.js
  • Lambda function code: Select the zip file created in the PHP, Go, or Ruby section, such as,, or
  • Handler: php.handler (as in the code, the entry function is called handler and the file is php.js. If you have used the file names from the Go and Ruby sections use the following format: [js file name without .js].handler, i.e., go.handler)
  • Role: Choose Basic Role if you have not yet created one, and create a role for your Lambda function execution

Choose Next, Create function to continue to testing.

Testing the Lambda function

To test the Lambda function, choose Test in the upper right corner, which displays a sample event with three top-level attributes.


Feel free to add more, or simply choose Save and test to see that your function has executed properly.



In this post, we outlined three different ways to create scripting language runtimes for Lambda, from compiling against the Lambda runtime for PHP and being able to run scripts, compiling the actuals as in Go, or using packaged binaries as much as possible with Ruby. We hope you enjoyed the ideas, found the hidden gems, and are now ready to go to create some pretty hefty projects in your favorite language, enjoying serverless, Amazon Kinesis, and API Gateway along the way.

If you have questions or suggestions, please comment below.

Serverless at re:Invent 2016 – Wrap-up

Post Syndicated from Bryan Liston original

The re:Invent 2016 conference was an exciting week to be working on serverless at AWS. We announced new features like support for C# and dead letter queues, and launched new application constructs with Lambda such as [email protected], AWS Greengrass, Amazon Lex, and AWS Step Functions. In addition we also added support for surfacing services built using API Gateway in the AWS marketplace, expanded the capabilities for custom authorizers, and launched a reference developer portal for managing APIs. Catch up on all the great re:Invent launches here.

In addition to the serverless mini-con with deep dive talks and best practices, we also had deep customer talks by folks from Thomson Reuters, Vevo, Expedia, and FINRA. If you weren’t able to attend the mini-con or missed a specific session, here is a quick link to the entire Serverless Mini Conference Playlist. Other interesting sessions from other tracks are listed below.

Individual Sessions from the Mini Conference

Other Interesting Sessions

If there are other sessions or talks you think I should capture in this list, let me know!

Robust Serverless Application Design with AWS Lambda Dead Letter Queues

Post Syndicated from Bryan Liston original

Gene Ting
Gene Ting, Solutions Architect

AWS Lambda is a serverless, event-driven compute service that allows developers to bring their functions to the cloud easily. A key challenge that Lambda developers often face is to create solutions that handle exceptions and failures gracefully. Some examples include:

  • Notifying operations support when a function fails with context
  • Sending jobs that have timed out to a handler that can either notify operations of a critical failure or rebalance jobs

Now, with the release of Lambda Dead Letter Queues, Lambda functions can be configured to notify when the function fails, with context on what the failure was.

In this post, we show how you can configure your function to deliver notification to an Amazon SQS queue or Amazon SNS topic, and how you can create a process to automatically generate more meaningful notifications when your Lambda functions fail.

Introducing Lambda Dead Letter Queues

Dead-letter queues are a powerful concept, which help software developers find software issue patterns in their asynchronous processing components. The way it works is simple—when your messaging component receives a message and detects a fatal or unhandled error while processing the message, it sends information about the message that failed to another location, such as another queue or another notification system. SQS provides dead letter queues today, sending messages that couldn’t be handled to a different queue for further investigation.

AWS Lambda Dead Letter Queues builds upon the concept by enabling Lambda functions to be configured with an SQS queue or SNS topic as a destination to which the Lambda service can send information about an asynchronous request when processing fails. The Lambda service sends information about the failed request when the request will no longer be retried. Supported invocations include:

  • An event type invocation from a custom application
  • Any AWS event source that’s not a DynamoDB table, Amazon Kinesis stream, or API Gateway resource request integration

Take the typical beginner use case for learning about serverless applications on AWS: creating thumbnails from images dropped onto an S3 bucket. The transcoding Lambda function can be configured to send any transcoding failures to an SNS topic, which triggers a Lambda function for further investigation.

Now, you can set up a dead letter queue for an existing Lambda function and test out the feature.

Configuring a DLQ target for a Lambda function

First, make sure that the execution role for the Lambda function is allowed to publish to the SNS topic. For this demo, use the sns-lambda-test topic. An example is provided below:


If an SQS queue is the intended target, you need a comparable policy that allows the appropriate SendMessage action to the queue.

Next, choose an existing Lambda function against which to configure a dead-letter queue. For this example, choose a predeployed function, such as CreateThumbnail.


Select the function, choose Configuration, expand the **Advanced settings **section in the middle of the page, and scroll to the DLQ Resource form. Choose SNS and for SNS Topic name, enter sns-lambda-test.
That’s it—the function is now configured and ready for testing.

Processing failure notifications

One easy way to test the handler for your dead letter queue is to submit an event that is known to fail for the Lambda function. In this example, you can simply drop a text file pretending to be an image to the S3 bucket, to be recognized by the image thumbnail creator as a non-image file, and have the handler exit with an error message.

When Lambda sends an error notification to an SNS topic, three additional message attributes are attached to the notification in the MessageAttributes object:

  • RequestID – The request ID.
  • ErrorCode – The HTTP response code that would have been given if the handler was synchronously called.
  • ErrorMessage – The error message given back by the Lambda runtime. In the example above, it is the error message from the handler.

In addition to these attributes, the body of the event is held in the Message attribute of the Sns object. If you use an SQS queue instead, the additional attributes are in the MessageAttributes object and the event body is held in the Body attribute of the message.

Handling timeouts

One of the most common failures to occur in Lambda functions is a timeout. In this scenario, the Lambda function executes until it’s been forcefully terminated by the Lambda runtime, which sends an error message indicating that the function has timed out, as in the following example error message:

"ErrorMessage": {

"Type": "String", 

"Value": "2016-11-29T04:27:36.789Z b4797725-b5eb-11e6-acb2-17876a085622 Task timed out after 300.00 seconds" 


An error handler can simply parse for the string Task timed out after in the Value attribute, and act accordingly, such as breaking the request into multiple Lambda invocations, or sending to a different queue that spins up EC2 instances in an Auto Scaling group for handling larger jobs.

Handling critical failures

Another scenario that you may need to handle is when critical failures occur. Some examples of a critical failure are:

  • A misconfiguration of the Lambda handler
  • A system crash, such as an out-of-memory error

In either case, there’s very little that can be handled gracefully in application logic. These kinds of errors can be forwarded to operations support for root cause analysis or break glass fixes.

In the case of a system crash, your dead letter queue receives an error message similar to the following:

"RequestID": { "Type": "String", "Value": "6502cad0-b641-11e6-bd4e-279609143c53" }, 

"ErrorCode": { "Type": "String", "Value": "200" },

"ErrorMessage": { "Type": "String", "Value": "Process exited before completing request" }

For this example, the Lambda handler was forced to crash with an out-of-memory error, which can be found by searching in the Lambda handler’s log stream by the given RequestID.

In the case of a misconfiguration, your dead letter queue receives an error message along the following lines:

"ErrorMessage": { "Type": "String", "Value": "Cannot find module '/var/task/index'" }

In this example, the Lambda handler was misconfigured to load a non-existent index.js module.

Monitoring Lambda functions configured with dead letter queues

Lambda functions with a configured dead letter queue also come with their own CloudWatch metric called “DeadLetterErrors”. The metric is incremented whenever the dead letter message payload can’t be sent to the dead letter queue at any time.


With the launch of Dead Letter Queues, Lambda function developers can now create much simpler functions by focusing only on the business logic, and leverage the AWS Lambda infrastructure to delegate error handling elsewhere in a more graceful manner.

For more information, read about Dead Letter Queues in the AWS Lambda Developer Guide. Happy coding everyone, and have fun creating awesome serverless applications!