Tag Archives: api gateway

Announcing the Winners of the AWS Chatbot Challenge – Conversational, Intelligent Chatbots using Amazon Lex and AWS Lambda

Post Syndicated from Tara Walker original https://aws.amazon.com/blogs/aws/announcing-the-winners-of-the-aws-chatbot-challenge-conversational-intelligent-chatbots-using-amazon-lex-and-aws-lambda/

A couple of months ago on the blog, I announced the AWS Chatbot Challenge in conjunction with Slack. The AWS Chatbot Challenge was an opportunity to build a unique chatbot that helped to solve a problem or that would add value for its prospective users. The mission was to build a conversational, natural language chatbot using Amazon Lex and leverage Lex’s integration with AWS Lambda to execute logic or data processing on the backend.

I know that you all have been anxiously waiting to hear announcements of who were the winners of the AWS Chatbot Challenge as much as I was. Well wait no longer, the winners of the AWS Chatbot Challenge have been decided.

May I have the Envelope Please? (The Trumpets sound)

The winners of the AWS Chatbot Challenge are:

  • First Place: BuildFax Counts by Joe Emison
  • Second Place: Hubsy by Andrew Riess, Andrew Puch, and John Wetzel
  • Third Place: PFMBot by Benny Leong and his team from MoneyLion.
  • Large Organization Winner: ADP Payroll Innovation Bot by Eric Liu, Jiaxing Yan, and Fan Yang

 

Diving into the Winning Chatbot Projects

Let’s take a walkthrough of the details for each of the winning projects to get a view of what made these chatbots distinctive, as well as, learn more about the technologies used to implement the chatbot solution.

 

BuildFax Counts by Joe Emison

The BuildFax Counts bot was created as a real solution for the BuildFax company to decrease the amount the time that sales and marketing teams can get answers on permits or properties with permits meet certain criteria.

BuildFax, a company co-founded by bot developer Joe Emison, has the only national database of building permits, which updates data from approximately half of the United States on a monthly basis. In order to accommodate the many requests that come in from the sales and marketing team regarding permit information, BuildFax has a technical sales support team that fulfills these requests sent to a ticketing system by manually writing SQL queries that run across the shards of the BuildFax databases. Since there are a large number of requests received by the internal sales support team and due to the manual nature of setting up the queries, it may take several days for getting the sales and marketing teams to receive an answer.

The BuildFax Counts chatbot solves this problem by taking the permit inquiry that would normally be sent into a ticket from the sales and marketing team, as input from Slack to the chatbot. Once the inquiry is submitted into Slack, a query executes and the inquiry results are returned immediately.

Joe built this solution by first creating a nightly export of the data in their BuildFax MySQL RDS database to CSV files that are stored in Amazon S3. From the exported CSV files, an Amazon Athena table was created in order to run quick and efficient queries on the data. He then used Amazon Lex to create a bot to handle the common questions and criteria that may be asked by the sales and marketing teams when seeking data from the BuildFax database by modeling the language used from the BuildFax ticketing system. He added several different sample utterances and slot types; both custom and Lex provided, in order to correctly parse every question and criteria combination that could be received from an inquiry.  Using Lambda, Joe created a Javascript Lambda function that receives information from the Lex intent and used it to build a SQL statement that runs against the aforementioned Athena database using the AWS SDK for JavaScript in Node.js library to return inquiry count result and SQL statement used.

The BuildFax Counts bot is used today for the BuildFax sales and marketing team to get back data on inquiries immediately that previously took up to a week to receive results.

Not only is BuildFax Counts bot our 1st place winner and wonderful solution, but its creator, Joe Emison, is a great guy.  Joe has opted to donate his prize; the $5,000 cash, the $2,500 in AWS Credits, and one re:Invent ticket to the Black Girls Code organization. I must say, you rock Joe for helping these kids get access and exposure to technology.

 

Hubsy by Andrew Riess, Andrew Puch, and John Wetzel

Hubsy bot was created to redefine and personalize the way users traditionally manage their HubSpot account. HubSpot is a SaaS system providing marketing, sales, and CRM software. Hubsy allows users of HubSpot to create engagements and log engagements with customers, provide sales teams with deals status, and retrieves client contact information quickly. Hubsy uses Amazon Lex’s conversational interface to execute commands from the HubSpot API so that users can gain insights, store and retrieve data, and manage tasks directly from Facebook, Slack, or Alexa.

In order to implement the Hubsy chatbot, Andrew and the team members used AWS Lambda to create a Lambda function with Node.js to parse the users request and call the HubSpot API, which will fulfill the initial request or return back to the user asking for more information. Terraform was used to automatically setup and update Lambda, CloudWatch logs, as well as, IAM profiles. Amazon Lex was used to build the conversational piece of the bot, which creates the utterances that a person on a sales team would likely say when seeking information from HubSpot. To integrate with Alexa, the Amazon Alexa skill builder was used to create an Alexa skill which was tested on an Echo Dot. Cloudwatch Logs are used to log the Lambda function information to CloudWatch in order to debug different parts of the Lex intents. In order to validate the code before the Terraform deployment, ESLint was additionally used to ensure the code was linted and proper development standards were followed.

 

PFMBot by Benny Leong and his team from MoneyLion

PFMBot, Personal Finance Management Bot,  is a bot to be used with the MoneyLion finance group which offers customers online financial products; loans, credit monitoring, and free credit score service to improve the financial health of their customers. Once a user signs up an account on the MoneyLion app or website, the user has the option to link their bank accounts with the MoneyLion APIs. Once the bank account is linked to the APIs, the user will be able to login to their MoneyLion account and start having a conversation with the PFMBot based on their bank account information.

The PFMBot UI has a web interface built with using Javascript integration. The chatbot was created using Amazon Lex to build utterances based on the possible inquiries about the user’s MoneyLion bank account. PFMBot uses the Lex built-in AMAZON slots and parsed and converted the values from the built-in slots to pass to AWS Lambda. The AWS Lambda functions interacting with Amazon Lex are Java-based Lambda functions which call the MoneyLion Java-based internal APIs running on Spring Boot. These APIs obtain account data and related bank account information from the MoneyLion MySQL Database.

 

ADP Payroll Innovation Bot by Eric Liu, Jiaxing Yan, and Fan Yang

ADP PI (Payroll Innovation) bot is designed to help employees of ADP customers easily review their own payroll details and compare different payroll data by just asking the bot for results. The ADP PI Bot additionally offers issue reporting functionality for employees to report payroll issues and aids HR managers in quickly receiving and organizing any reported payroll issues.

The ADP Payroll Innovation bot is an ecosystem for the ADP payroll consisting of two chatbots, which includes ADP PI Bot for external clients (employees and HR managers), and ADP PI DevOps Bot for internal ADP DevOps team.


The architecture for the ADP PI DevOps bot is different architecture from the ADP PI bot shown above as it is deployed internally to ADP. The ADP PI DevOps bot allows input from both Slack and Alexa. When input comes into Slack, Slack sends the request to Lex for it to process the utterance. Lex then calls the Lambda backend, which obtains ADP data sitting in the ADP VPC running within an Amazon VPC. When input comes in from Alexa, a Lambda function is called that also obtains data from the ADP VPC running on AWS.

The architecture for the ADP PI bot consists of users entering in requests and/or entering issues via Slack. When requests/issues are entered via Slack, the Slack APIs communicate via Amazon API Gateway to AWS Lambda. The Lambda function either writes data into one of the Amazon DynamoDB databases for recording issues and/or sending issues or it sends the request to Lex. When sending issues, DynamoDB integrates with Trello to keep HR Managers abreast of the escalated issues. Once the request data is sent from Lambda to Lex, Lex processes the utterance and calls another Lambda function that integrates with the ADP API and it calls ADP data from within the ADP VPC, which runs on Amazon Virtual Private Cloud (VPC).

Python and Node.js were the chosen languages for the development of the bots.

The ADP PI bot ecosystem has the following functional groupings:

Employee Functionality

  • Summarize Payrolls
  • Compare Payrolls
  • Escalate Issues
  • Evolve PI Bot

HR Manager Functionality

  • Bot Management
  • Audit and Feedback

DevOps Functionality

  • Reduce call volume in service centers (ADP PI Bot).
  • Track issues and generate reports (ADP PI Bot).
  • Monitor jobs for various environment (ADP PI DevOps Bot)
  • View job dashboards (ADP PI DevOps Bot)
  • Query job details (ADP PI DevOps Bot)

 

Summary

Let’s all wish all the winners of the AWS Chatbot Challenge hearty congratulations on their excellent projects.

You can review more details on the winning projects, as well as, all of the submissions to the AWS Chatbot Challenge at: https://awschatbot2017.devpost.com/submissions. If you are curious on the details of Chatbot challenge contest including resources, rules, prizes, and judges, you can review the original challenge website here:  https://awschatbot2017.devpost.com/.

Hopefully, you are just as inspired as I am to build your own chatbot using Lex and Lambda. For more information, take a look at the Amazon Lex developer guide or the AWS AI blog on Building Better Bots Using Amazon Lex (Part 1)

Chat with you soon!

Tara

New – AWS SAM Local (Beta) – Build and Test Serverless Applications Locally

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/new-aws-sam-local-beta-build-and-test-serverless-applications-locally/

Today we’re releasing a beta of a new tool, SAM Local, that makes it easy to build and test your serverless applications locally. In this post we’ll use SAM local to build, debug, and deploy a quick application that allows us to vote on tabs or spaces by curling an endpoint. AWS introduced Serverless Application Model (SAM) last year to make it easier for developers to deploy serverless applications. If you’re not already familiar with SAM my colleague Orr wrote a great post on how to use SAM that you can read in about 5 minutes. At it’s core, SAM is a powerful open source specification built on AWS CloudFormation that makes it easy to keep your serverless infrastructure as code – and they have the cutest mascot.

SAM Local takes all the good parts of SAM and brings them to your local machine.

There are a couple of ways to install SAM Local but the easiest is through NPM. A quick npm install -g aws-sam-local should get us going but if you want the latest version you can always install straight from the source: go get github.com/awslabs/aws-sam-local (this will create a binary named aws-sam-local, not sam).

I like to vote on things so let’s write a quick SAM application to vote on Spaces versus Tabs. We’ll use a very simple, but powerful, architecture of API Gateway fronting a Lambda function and we’ll store our results in DynamoDB. In the end a user should be able to curl our API curl https://SOMEURL/ -d '{"vote": "spaces"}' and get back the number of votes.

Let’s start by writing a simple SAM template.yaml:

AWSTemplateFormatVersion : '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  VotesTable:
    Type: "AWS::Serverless::SimpleTable"
  VoteSpacesTabs:
    Type: "AWS::Serverless::Function"
    Properties:
      Runtime: python3.6
      Handler: lambda_function.lambda_handler
      Policies: AmazonDynamoDBFullAccess
      Environment:
        Variables:
          TABLE_NAME: !Ref VotesTable
      Events:
        Vote:
          Type: Api
          Properties:
            Path: /
            Method: post

So we create a [dynamo_i] table that we expose to our Lambda function through an environment variable called TABLE_NAME.

To test that this template is valid I’ll go ahead and call sam validate to make sure I haven’t fat-fingered anything. It returns Valid! so let’s go ahead and get to work on our Lambda function.

import os
import os
import json
import boto3
votes_table = boto3.resource('dynamodb').Table(os.getenv('TABLE_NAME'))

def lambda_handler(event, context):
    print(event)
    if event['httpMethod'] == 'GET':
        resp = votes_table.scan()
        return {'body': json.dumps({item['id']: int(item['votes']) for item in resp['Items']})}
    elif event['httpMethod'] == 'POST':
        try:
            body = json.loads(event['body'])
        except:
            return {'statusCode': 400, 'body': 'malformed json input'}
        if 'vote' not in body:
            return {'statusCode': 400, 'body': 'missing vote in request body'}
        if body['vote'] not in ['spaces', 'tabs']:
            return {'statusCode': 400, 'body': 'vote value must be "spaces" or "tabs"'}

        resp = votes_table.update_item(
            Key={'id': body['vote']},
            UpdateExpression='ADD votes :incr',
            ExpressionAttributeValues={':incr': 1},
            ReturnValues='ALL_NEW'
        )
        return {'body': "{} now has {} votes".format(body['vote'], resp['Attributes']['votes'])}

So let’s test this locally. I’ll need to create a real DynamoDB database to talk to and I’ll need to provide the name of that database through the enviornment variable TABLE_NAME. I could do that with an env.json file or I can just pass it on the command line. First, I can call:
$ echo '{"httpMethod": "POST", "body": "{\"vote\": \"spaces\"}"}' |\
TABLE_NAME="vote-spaces-tabs" sam local invoke "VoteSpacesTabs"

to test the Lambda – it returns the number of votes for spaces so theoritically everything is working. Typing all of that out is a pain so I could generate a sample event with sam local generate-event api and pass that in to the local invocation. Far easier than all of that is just running our API locally. Let’s do that: sam local start-api. Now I can curl my local endpoints to test everything out.
I’ll run the command: $ curl -d '{"vote": "tabs"}' http://127.0.0.1:3000/ and it returns: “tabs now has 12 votes”. Now, of course I did not write this function perfectly on my first try. I edited and saved several times. One of the benefits of hot-reloading is that as I change the function I don’t have to do any additional work to test the new function. This makes iterative development vastly easier.

Let’s say we don’t want to deal with accessing a real DynamoDB database over the network though. What are our options? Well we can download DynamoDB Local and launch it with java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb. Then we can have our Lambda function use the AWS_SAM_LOCAL environment variable to make some decisions about how to behave. Let’s modify our function a bit:

import os
import json
import boto3
if os.getenv("AWS_SAM_LOCAL"):
    votes_table = boto3.resource(
        'dynamodb',
        endpoint_url="http://docker.for.mac.localhost:8000/"
    ).Table("spaces-tabs-votes")
else:
    votes_table = boto3.resource('dynamodb').Table(os.getenv('TABLE_NAME'))

Now we’re using a local endpoint to connect to our local database which makes working without wifi a little easier.

SAM local even supports interactive debugging! In Java and Node.js I can just pass the -d flag and a port to immediately enable the debugger. For Python I could use a library like import epdb; epdb.serve() and connect that way. Then we can call sam local invoke -d 8080 "VoteSpacesTabs" and our function will pause execution waiting for you to step through with the debugger.

Alright, I think we’ve got everything working so let’s deploy this!

First I’ll call the sam package command which is just an alias for aws cloudformation package and then I’ll use the result of that command to sam deploy.

$ sam package --template-file template.yaml --s3-bucket MYAWESOMEBUCKET --output-template-file package.yaml
Uploading to 144e47a4a08f8338faae894afe7563c3  90570 / 90570.0  (100.00%)
Successfully packaged artifacts and wrote output template to file package.yaml.
Execute the following command to deploy the packaged template
aws cloudformation deploy --template-file package.yaml --stack-name 
$ sam deploy --template-file package.yaml --stack-name VoteForSpaces --capabilities CAPABILITY_IAM
Waiting for changeset to be created..
Waiting for stack create/update to complete
Successfully created/updated stack - VoteForSpaces

Which brings us to our API:
.

I’m going to hop over into the production stage and add some rate limiting in case you guys start voting a lot – but otherwise we’ve taken our local work and deployed it to the cloud without much effort at all. I always enjoy it when things work on the first deploy!

You can vote now and watch the results live! http://spaces-or-tabs.s3-website-us-east-1.amazonaws.com/

We hope that SAM Local makes it easier for you to test, debug, and deploy your serverless apps. We have a CONTRIBUTING.md guide and we welcome pull requests. Please tweet at us to let us know what cool things you build. You can see our What’s New post here and the documentation is live here.

Randall

timeShift(GrafanaBuzz, 1w) Issue 5

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2017/07/21/timeshiftgrafanabuzz-1w-issue-5/

We cover a lot of ground in this week’s timeShift. From diving into building your own plugin, finding the right dashboard, configuration options in the alerting feature, to monitoring your local weather, there’s something for everyone. Are you writing an article about Grafana, or have you come across an article you found interesting? Please get in touch, we’ll add it to our roundup.


From the Blogosphere

  • Going open-source in monitoring, part III: 10 most useful Grafana dashboards to monitor Kubernetes and services: We have hundreds of pre-made dashboards ready for you to install into your on-prem or hosted Grafana, but not every one will fit your specific monitoring needs. In part three of the series, Sergey discusses is experiences with finding useful dashboards and shows off ten of the best dashboards you can install for monitoring Kubernetes clusters and the services deployed on them.

  • Using AWS Lambda and API gateway for server-less Grafana adapters: Sometimes you’ll want to visualize metrics from a data source that may not yet be supported in Grafana natively. With the plugin functionality introduced in Grafana 3.0, anyone can create their own data sources. Using the SimpleJson data source, Jonas describes how he used AWS Lambda and AWS API gateway to write data source adapters for Grafana.

  • How to Use Grafana to Monitor JMeter Non-GUI Results – Part 2: A few issues ago we listed an article for using Grafana to monitor JMeter Non-GUI results, which required a number of non-trivial steps to complete. This article shows of an easier way to accomplish this that doesn’t require any additional configuration of InfluxDB.

  • Programming your Personal Weather Chart: It’s always great to see Grafana used outside of the typical dev-ops usecase. This article runs you through the steps to create your own weather chart and show off your local weather stats in Grafana. BONUS: Rob shows off a magic mirror he created, which can display this data.

  • vSphere Performance data – Part 6 – The Dashboard(s): This 6-part series goes into a ton of detail and walks you through the various methods of retrieving vSphere performance data, storing the data in a TSDB, and creating dashboards for the metrics. Part 6 deals specifically with Grafana, but I highly recommend reading all of the articles, as it chronicles the journey of metrics exploration, storage, and visualization from someone who had no prior experience with time series data.

  • Alerting in Grafana: Alerting in Grafana is a fairly new feature and one that we’re continuing to iterate on. We’re soon adding additional data source support, new notification channels, clustering, silencing rules, and more. This article steps you through all the configuration options to get you to your first alert.


Plugins and Dashboards

It can seem like work slows during July and August, but we’re still seeing a lot of activity in the community. This week we have a new graph panel to show off that gives you some unique looking dashboards, and an update to the Zabbix data source, which adds some really great features. You can install both of the plugins now on your on-prem Grafana via our cli, or with one-click on GrafanaCloud.

NEW PLUGIN

Bubble Chart Panel This super-cool looking panel groups your tag values into clusters of circles. The size of the circle represents the aggregated value of the time series data. There are also multiple color schemes to make those bubbles POP (pun intended)! Currently it works against OpenTSDB and Bosun, so give it a try!

Install Now

UPDATED PLUGIN

Zabbix Alex has been hard at work, making improvements on the Zabbix App for Grafana. This update adds annotations, template variables, alerting and more. Thanks Alex! If you’d like to try out the app, head over to http://play.grafana-zabbix.org/dashboard/db/zabbix-db-mysql?orgId=2

Install 3.5.1 Now


This week’s MVC (Most Valuable Contributor)

Open source software can’t thrive without the contributions from the community. Each week we’ll recognize a Grafana contributor and thank them for all of their PRs, bug reports and feedback.

mk-dhia (Dhia)
Thank you so much for your improvements to the Elasticsearch data source!


Tweet of the Week

We scour Twitter each week to find an interesting/beautiful dashboard and show it off! #monitoringLove

This week’s tweet comes from @geek_dave

Great looking dashboard Dave! And thank you for adding new features and keeping it updated. It’s creators like you who make the dashboard repository so awesome!


Upcoming Events

We love when people talk about Grafana at meetups and conferences.

Monday, July 24, 2017 – 7:30pm | Google Campus Warsaw


Ząbkowska 27/31, Warsaw, Poland

Iot & HOME AUTOMATION #3 openHAB, InfluxDB, Grafana:
If you are interested in topics of the internet of things and home automation, this might be a good occasion to meet people similar to you. If you are into it, we will also show you how we can all work together on our common projects.

RSVP


Tell us how we’re Doing.

We’d love your feedback on what kind of content you like, length, format, etc – so please keep the comments coming! You can submit a comment on this article below, or post something at our community forum. Help us make this better.

Follow us on Twitter, like us on Facebook, and join the Grafana Labs community.

AWS HIPAA Eligibility Update (July 2017) – Eight Additional Services

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-hipaa-eligibility-update-july-2017-eight-additional-services/

It is time for an update on our on-going effort to make AWS a great host for healthcare and life sciences applications. As you can see from our Health Customer Stories page, Philips, VergeHealth, and Cambia (to choose a few) trust AWS with Protected Health Information (PHI) and Personally Identifying Information (PII) as part of their efforts to comply with HIPAA and HITECH.

In May we announced that we added Amazon API Gateway, AWS Direct Connect, AWS Database Migration Service, and Amazon Simple Queue Service (SQS) to our list of HIPAA eligible services and discussed our how customers and partners are putting them to use.

Eight More Eligible Services
Today I am happy to share the news that we are adding another eight services to the list:

Amazon CloudFront can now be utilized to enhance the delivery and transfer of Protected Health Information data to applications on the Internet. By providing a completely secure and encryptable pathway, CloudFront can now be used as a part of applications that need to cache PHI. This includes applications for viewing lab results or imaging data, and those that transfer PHI from Healthcare Information Exchanges (HIEs).

AWS WAF can now be used to protect applications running on AWS which operate on PHI such as patient care portals, patient scheduling systems, and HIEs. Requests and responses containing encrypted PHI and PII can now pass through AWS WAF.

AWS Shield can now be used to protect web applications such as patient care portals and scheduling systems that operate on encrypted PHI from DDoS attacks.

Amazon S3 Transfer Acceleration can now be used to accelerate the bulk transfer of large amounts of research, genetics, informatics, insurance, or payer/payment data containing PHI/PII information. Transfers can take place between a pair of AWS Regions or from an on-premises system and an AWS Region.

Amazon WorkSpaces can now be used by researchers, informaticists, hospital administrators and other users to analyze, visualize or process PHI/PII data using on-demand Windows virtual desktops.

AWS Directory Service can now be used to connect the authentication and authorization systems of organizations that use or process PHI/PII to their resources in the AWS Cloud. For example, healthcare providers operating hybrid cloud environments can now use AWS Directory Services to allow their users to easily transition between cloud and on-premises resources.

Amazon Simple Notification Service (SNS) can now be used to send notifications containing encrypted PHI/PII as part of patient care, payment processing, and mobile applications.

Amazon Cognito can now be used to authenticate users into mobile patient portal and payment processing applications that use PHI/PII identifiers for accounts.

Additional HIPAA Resources
Here are some additional resources that will help you to build applications that comply with HIPAA and HITECH:

Keep in Touch
In order to make use of any AWS service in any manner that involves PHI, you must first enter into an AWS Business Associate Addendum (BAA). You can contact us to start the process.

Jeff;

AWS Adds 12 More Services to Its PCI DSS Compliance Program

Post Syndicated from Sara Duffer original https://aws.amazon.com/blogs/security/aws-adds-12-more-services-to-its-pci-dss-compliance-program/

Twelve more AWS services have obtained Payment Card Industry Data Security Standard (PCI DSS) compliance, giving you more options, flexibility, and functionality to process and store sensitive payment card data in the AWS Cloud. The services were audited by Coalfire to ensure that they meet strict PCI DSS standards.

The newly compliant AWS services are:

AWS now offers 42 services that meet PCI DSS standards, putting administrators in better control of their frameworks and making workloads more efficient and cost effective.

For more information about the AWS PCI DSS compliance program, see Compliance Resources, AWS Services in Scope by Compliance Program, and PCI DSS Compliance.

– Sara

Take the Journey: Build Your First Serverless Web Application

Post Syndicated from Tara Walker original https://aws.amazon.com/blogs/aws/build-your-first-serverless-application/

I realized at a young age that I really liked writing those special statements that would control the computer and make it work in the manner in which I desired. This technique of controlling the computer and building things on the machine, I learned from my teachers was called writing code, and it fascinated me. Even now, what seems like centuries later, I still get the thrill of writing code, building cool solutions, and tackling all the associated challenges of this craft. It is no wonder then, that I am a huge fan of serverless computing and serverless architectures.

Serverless Computing allows me to do what I enjoy, which is write code, without having to provision and/or configure servers. Using the AWS Serverless Platform means that all the heavy lifting of server management is handled by AWS, allowing you to focus on building your application.

If you enjoy coding like I do and have yet to dive into building serverless applications, boy do I have some sensational news for you. You can build your own serverless web application with our new Serverless Web Application Guide, which provides step-by-step instructions for you to create and deploy your serverless web application on AWS.

 

The Serverless Web Application Guide is a hands-on tutorial that will assist you in building a fully scalable, serverless web application using the following AWS Services:

  • AWS Lambda: a managed service for serverless compute that allows you to run code without provisioning or managing servers
  • Amazon S3: a managed service that provides simple, durable, scalable object storage
  • Amazon Cognito: a managed service that allows you to add user sign-up, and data synchronization to your application
  • Amazon API Gateway: a managed service which you can create, publish, and maintain secure APIs
  • Amazon DynamoDB: a fast and flexible NoSQL managed cloud database with support for various document and key-value storage models

The application you will build is a simple web application designed for a fictional transportation service. The application will enable users to register and login into the website to request rides from a very unique transportation fleet. You will accomplish this by using the aforementioned AWS services with the serverless application architecture shown in the diagram below.

 
The guide breaks up the each step to build your serverless web application into five separate modules.

 

  1. Static Web Hosting: Amazon S3 hosts static web resources including HTML, CSS, JavaScript, and image files that are loaded in the user’s browser.
  2. User Management: Amazon Cognito provides user management and authentication functions to secure the backend API.
  3. Serverless Backend: Amazon DynamoDB provides a persistence layer where data can be stored by the API’s Lambda function.
  4. RESTful APIs: JavaScript executed in the browser sends and receives data from a public backend API built using AWS Lambda and API Gateway.
  5. Resource Cleanup: All the resources created throughout the tutorial will be terminated.

To be successful in building the application, you must remember to complete each module in sequential order, as the modules are dependent on resources created in the previous one. Some of the guide’s modules provide CloudFormation templates to aid you in generating the necessary resources to build the application if you do not wish to create them manually.

 

Summary

Now that you know all about this fantastic new guide for building a serverless web application, you are ready to journey into the world of AWS serverless computing and have some fun writing the code to build the application. The guide is great for beginners and yet still has cool features that even seasoned serverless computing developers will enjoy building. And to top it off, you don’t have to worry about the cost. Each service used is eligible for the AWS Free Tier and is only estimated to cost less than $0.25 if you are outside of Free Tier usage limits.

Take the plunge today and dive into building serverless applications on the AWS serverless platform with this new and exciting Serverless Web Application Guide.

 

Tara

Secure API Access with Amazon Cognito Federated Identities, Amazon Cognito User Pools, and Amazon API Gateway

Post Syndicated from Ed Lima original https://aws.amazon.com/blogs/compute/secure-api-access-with-amazon-cognito-federated-identities-amazon-cognito-user-pools-and-amazon-api-gateway/

Ed Lima, Solutions Architect

 

Our identities are what define us as human beings. Philosophical discussions aside, it also applies to our day-to-day lives. For instance, I need my work badge to get access to my office building or my passport to travel overseas. My identity in this case is attached to my work badge or passport. As part of the system that checks my access, these documents or objects help define whether I have access to get into the office building or travel internationally.

This exact same concept can also be applied to cloud applications and APIs. To provide secure access to your application users, you define who can access the application resources and what kind of access can be granted. Access is based on identity controls that can confirm authentication (AuthN) and authorization (AuthZ), which are different concepts. According to Wikipedia:

 

The process of authorization is distinct from that of authentication. Whereas authentication is the process of verifying that “you are who you say you are,” authorization is the process of verifying that “you are permitted to do what you are trying to do.” This does not mean authorization presupposes authentication; an anonymous agent could be authorized to a limited action set.

Amazon Cognito allows building, securing, and scaling a solution to handle user management and authentication, and to sync across platforms and devices. In this post, I discuss the different ways that you can use Amazon Cognito to authenticate API calls to Amazon API Gateway and secure access to your own API resources.

 

Amazon Cognito Concepts

 

It’s important to understand that Amazon Cognito provides three different services:

Today, I discuss the use of the first two. One service doesn’t need the other to work; however, they can be configured to work together.
 

Amazon Cognito Federated Identities

 
To use Amazon Cognito Federated Identities in your application, create an identity pool. An identity pool is a store of user data specific to your account. It can be configured to require an identity provider (IdP) for user authentication, after you enter details such as app IDs or keys related to that specific provider.

After the user is validated, the provider sends an identity token to Amazon Cognito Federated Identities. In turn, Amazon Cognito Federated Identities contacts the AWS Security Token Service (AWS STS) to retrieve temporary AWS credentials based on a configured, authenticated IAM role linked to the identity pool. The role has appropriate IAM policies attached to it and uses these policies to provide access to other AWS services.

Amazon Cognito Federated Identities currently supports the IdPs listed in the following graphic.

 



Continue reading Secure API Access with Amazon Cognito Federated Identities, Amazon Cognito User Pools, and Amazon API Gateway

Event: AWS Serverless Roadshow – Hands-on Workshops

Post Syndicated from Tara Walker original https://aws.amazon.com/blogs/aws/event-aws-serverless-roadshow-hands-on-workshops/

Surely, some of you have contemplated how you would survive the possible Zombie apocalypse or how you would build your exciting new startup to disrupt the transportation industry when Unicorn haven is uncovered. Well, there is no need to worry; I know just the thing to get you prepared to handle both of those scenarios: the AWS Serverless Computing Workshop Roadshow.

With the roadshow’s serverless workshops, you can get hands-on experience building serverless applications and microservices so you can rebuild what remains of our great civilization after a widespread viral infection causes human corpses to reanimate around the world in the AWS Zombie Microservices Workshop. In addition, you can give your startup a jump on the competition with the Wild Rydes workshop in order to revolutionize the transportation industry; just in time for a pilot’s crash landing leading the way to the discovery of abundant Unicorn pastures found on the outskirts of the female Amazonian warrior inhabited island of Themyscira also known as Paradise Island.

These free, guided hands-on workshops will introduce the basics of building serverless applications and microservices for common and uncommon scenarios using services like AWS Lambda, Amazon API Gateway, Amazon DynamoDB, Amazon S3, Amazon Kinesis, AWS Step Functions, and more. Let me share some advice before you decide to tackle Zombies and mount Unicorns – don’t forget to bring your laptop to the workshop and make sure you have an AWS account established and available for use for the event.

Check out the schedule below and get prepared today by registering for an upcoming workshop in a city near you. Remember these are workshops are completely free, so participation is on a first come, first served basis. So register and get there early, we need Zombie hunters and Unicorn riders across the globe.  Learn more about AWS Serverless Computing Workshops here and register for your city using links below.

Event Location Date
Wild Rydes New York Thursday, June 8
Wild Rydes Austin Thursday, June 22
Wild Rydes Santa Monica Thursday, July 20
Zombie Apocalypse Chicago Thursday, July 20
Wild Rydes Atlanta Tuesday, September 12
Zombie Apocalypse Dallas Tuesday, September 19

 

I look forward to fighting zombies and riding unicorns with you all.

Tara

Building High-Throughput Genomics Batch Workflows on AWS: Workflow Layer (Part 4 of 4)

Post Syndicated from Andy Katz original https://aws.amazon.com/blogs/compute/building-high-throughput-genomics-batch-workflows-on-aws-workflow-layer-part-4-of-4/

Aaron Friedman is a Healthcare and Life Sciences Partner Solutions Architect at AWS

Angel Pizarro is a Scientific Computing Technical Business Development Manager at AWS

This post is the fourth in a series on how to build a genomics workflow on AWS. In Part 1, we introduced a general architecture, shown below, and highlighted the three common layers in a batch workflow:

  • Job
  • Batch
  • Workflow

In Part 2, you built a Docker container for each job that needed to run as part of your workflow, and stored them in Amazon ECR.

In Part 3, you tackled the batch layer and built a scalable, elastic, and easily maintainable batch engine using AWS Batch. This solution took care of dynamically scaling your compute resources in response to the number of runnable jobs in your job queue length as well as managed job placement.

In part 4, you build out the workflow layer of your solution using AWS Step Functions and AWS Lambda. You then run an end-to-end genomic analysis―specifically known as exome secondary analysis―for many times at a cost of less than $1 per exome.

Step Functions makes it easy to coordinate the components of your applications using visual workflows. Building applications from individual components that each perform a single function lets you scale and change your workflow quickly. You can use the graphical console to arrange and visualize the components of your application as a series of steps, which simplify building and running multi-step applications. You can change and add steps without writing code, so you can easily evolve your application and innovate faster.

An added benefit of using Step Functions to define your workflows is that the state machines you create are immutable. While you can delete a state machine, you cannot alter it after it is created. For regulated workloads where auditing is important, you can be assured that state machines you used in production cannot be altered.

In this blog post, you will create a Lambda state machine to orchestrate your batch workflow. For more information on how to create a basic state machine, please see this Step Functions tutorial.

All code related to this blog series can be found in the associated GitHub repository here.

Build a state machine building block

To skip the following steps, we have provided an AWS CloudFormation template that can deploy your Step Functions state machine. You can use this in combination with the setup you did in part 3 to quickly set up the environment in which to run your analysis.

The state machine is composed of smaller state machines that submit a job to AWS Batch, and then poll and check its execution.

The steps in this building block state machine are as follows:

  1. A job is submitted.
    Each analytical module/job has its own Lambda function for submission and calls the batchSubmitJob Lambda function that you built in the previous blog post. You will build these specialized Lambda functions in the following section.
  2. The state machine queries the AWS Batch API for the job status.
    This is also a Lambda function.
  3. The job status is checked to see if the job has completed.
    If the job status equals SUCCESS, proceed to log the final job status. If the job status equals FAILED, end the execution of the state machine. In all other cases, wait 30 seconds and go back to Step 2.

Here is the JSON representing this state machine.

{
  "Comment": "A simple example that submits a Job to AWS Batch",
  "StartAt": "SubmitJob",
  "States": {
    "SubmitJob": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>::function:batchSubmitJob",
      "Next": "GetJobStatus"
    },
    "GetJobStatus": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>:function:batchGetJobStatus",
      "Next": "CheckJobStatus",
      "InputPath": "$",
      "ResultPath": "$.status"
    },
    "CheckJobStatus": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.status",
          "StringEquals": "FAILED",
          "End": true
        },
        {
          "Variable": "$.status",
          "StringEquals": "SUCCEEDED",
          "Next": "GetFinalJobStatus"
        }
      ],
      "Default": "Wait30Seconds"
    },
    "Wait30Seconds": {
      "Type": "Wait",
      "Seconds": 30,
      "Next": "GetJobStatus"
    },
    "GetFinalJobStatus": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>:function:batchGetJobStatus",
      "End": true
    }
  }
}

Building the Lambda functions for the state machine

You need two basic Lambda functions for this state machine. The first one submits a job to AWS Batch and the second checks the status of the AWS Batch job that was submitted.

In AWS Step Functions, you specify an input as JSON that is read into your state machine. Each state receives the aggregate of the steps immediately preceding it, and you can specify which components a state passes on to its children. Because you are using Lambda functions to execute tasks, one of the easiest routes to take is to modify the input JSON, represented as a Python dictionary, within the Lambda function and return the entire dictionary back for the next state to consume.

Building the batchSubmitIsaacJob Lambda function

For Step 1 above, you need a Lambda function for each of the steps in your analysis workflow. As you created a generic Lambda function in the previous post to submit a batch job (batchSubmitJob), you can use that function as the basis for the specialized functions you’ll include in this state machine. Here is such a Lambda function for the Isaac aligner.

from __future__ import print_function

import boto3
import json
import traceback

lambda_client = boto3.client('lambda')



def lambda_handler(event, context):
    try:
        # Generate output put
        bam_s3_path = '/'.join([event['resultsS3Path'], event['sampleId'], 'bam/'])

        depends_on = event['dependsOn'] if 'dependsOn' in event else []

        # Generate run command
        command = [
            '--bam_s3_folder_path', bam_s3_path,
            '--fastq1_s3_path', event['fastq1S3Path'],
            '--fastq2_s3_path', event['fastq2S3Path'],
            '--reference_s3_path', event['isaac']['referenceS3Path'],
            '--working_dir', event['workingDir']
        ]

        if 'cmdArgs' in event['isaac']:
            command.extend(['--cmd_args', event['isaac']['cmdArgs']])
        if 'memory' in event['isaac']:
            command.extend(['--memory', event['isaac']['memory']])

        # Submit Payload
        response = lambda_client.invoke(
            FunctionName='batchSubmitJob',
            InvocationType='RequestResponse',
            LogType='Tail',
            Payload=json.dumps(dict(
                dependsOn=depends_on,
                containerOverrides={
                    'command': command,
                },
                jobDefinition=event['isaac']['jobDefinition'],
                jobName='-'.join(['isaac', event['sampleId']]),
                jobQueue=event['isaac']['jobQueue']
            )))

        response_payload = response['Payload'].read()

        # Update event
        event['bamS3Path'] = bam_s3_path
        event['jobId'] = json.loads(response_payload)['jobId']
        
        return event
    except Exception as e:
        traceback.print_exc()
        raise e

In the Lambda console, create a Python 2.7 Lambda function named batchSubmitIsaacJob and paste in the above code. Use the LambdaBatchExecutionRole that you created in the previous post. For more information, see Step 2.1: Create a Hello World Lambda Function.

This Lambda function reads in the inputs passed to the state machine it is part of, formats the data for the batchSubmitJob Lambda function, invokes that Lambda function, and then modifies the event dictionary to pass onto the subsequent states. You can repeat these for each of the other tools, which can be found in the tools//lambda/lambda_function.py script in the GitHub repo.

Building the batchGetJobStatus Lambda function

For Step 2 above, the process queries the AWS Batch DescribeJobs API action with jobId to identify the state that the job is in. You can put this into a Lambda function to integrate it with Step Functions.

In the Lambda console, create a new Python 2.7 function with the LambdaBatchExecutionRole IAM role. Name your function batchGetJobStatus and paste in the following code. This is similar to the batch-get-job-python27 Lambda blueprint.

from __future__ import print_function

import boto3
import json

print('Loading function')

batch_client = boto3.client('batch')

def lambda_handler(event, context):
    # Log the received event
    print("Received event: " + json.dumps(event, indent=2))
    # Get jobId from the event
    job_id = event['jobId']

    try:
        response = batch_client.describe_jobs(
            jobs=[job_id]
        )
        job_status = response['jobs'][0]['status']
        return job_status
    except Exception as e:
        print(e)
        message = 'Error getting Batch Job status'
        print(message)
        raise Exception(message)

Structuring state machine input

You have structured the state machine input so that general file references are included at the top-level of the JSON object, and any job-specific items are contained within a nested JSON object. At a high level, this is what the input structure looks like:

{
        "general_field_1": "value1",
        "general_field_2": "value2",
        "general_field_3": "value3",
        "job1": {},
        "job2": {},
        "job3": {}
}

Building the full state machine

By chaining these state machine components together, you can quickly build flexible workflows that can process genomes in multiple ways. The development of the larger state machine that defines the entire workflow uses four of the above building blocks. You use the Lambda functions that you built in the previous section. Rename each building block submission to match the tool name.

We have provided a CloudFormation template to deploy your state machine and the associated IAM roles. In the CloudFormation console, select Create Stack, choose your template (deploy_state_machine.yaml), and enter in the ARNs for the Lambda functions you created.

Continue through the rest of the steps and deploy your stack. Be sure to check the box next to "I acknowledge that AWS CloudFormation might create IAM resources."

Once the CloudFormation stack is finished deploying, you should see the following image of your state machine.

In short, you first submit a job for Isaac, which is the aligner you are using for the analysis. Next, you use parallel state to split your output from "GetFinalIsaacJobStatus" and send it to both your variant calling step, Strelka, and your QC step, Samtools Stats. These then are run in parallel and you annotate the results from your Strelka step with snpEff.

Putting it all together

Now that you have built all of the components for a genomics secondary analysis workflow, test the entire process.

We have provided sequences from an Illumina sequencer that cover a region of the genome known as the exome. Most of the positions in the genome that we have currently associated with disease or human traits reside in this region, which is 1–2% of the entire genome. The workflow that you have built works for both analyzing an exome, as well as an entire genome.

Additionally, we have provided prebuilt reference genomes for Isaac, located at:

s3://aws-batch-genomics-resources/reference/

If you are interested, we have provided a script that sets up all of that data. To execute that script, run the following command on a large EC2 instance:

make reference REGISTRY=<your-ecr-registry>

Indexing and preparing this reference takes many hours on a large-memory EC2 instance. Be careful about the costs involved and note that the data is available through the prebuilt reference genomes.

Starting the execution

In a previous section, you established a provenance for the JSON that is fed into your state machine. For ease, we have auto-populated the input JSON for you to the state machine. You can also find this in the GitHub repo under workflow/test.input.json:

{
  "fastq1S3Path": "s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz",
  "fastq2S3Path": "s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz",
  "referenceS3Path": "s3://aws-batch-genomics-resources/reference/hg38.fa",
  "resultsS3Path": "s3://<bucket>/genomic-workflow/results",
  "sampleId": "NA12878_states_1",
  "workingDir": "/scratch",
  "isaac": {
    "jobDefinition": "isaac-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/highPriority-myenv",
    "referenceS3Path": "s3://aws-batch-genomics-resources/reference/isaac/"
  },
  "samtoolsStats": {
    "jobDefinition": "samtools_stats-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/lowPriority-myenv"
  },
  "strelka": {
    "jobDefinition": "strelka-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/highPriority-myenv",
    "cmdArgs": " --exome "
  },
  "snpEff": {
    "jobDefinition": "snpeff-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/lowPriority-myenv",
    "cmdArgs": " -t hg38 "
  }
}

You are now at the stage to run your full genomic analysis. Copy the above to a new text file, change paths and ARNs to the ones that you created previously, and save your JSON input as input.states.json.

In the CLI, execute the following command. You need the ARN of the state machine that you created in the previous post:

aws stepfunctions start-execution --state-machine-arn <your-state-machine-arn> --input file://input.states.json

Your analysis has now started. By using Spot Instances with AWS Batch, you can quickly scale out your workflows while concurrently optimizing for cost. While this is not guaranteed, most executions of the workflows presented here should cost under $1 for a full analysis.

Monitoring the execution

The output from the above CLI command gives you the ARN that describes the specific execution. Copy that and navigate to the Step Functions console. Select the state machine that you created previously and paste the ARN into the search bar.

The screen shows information about your specific execution. On the left, you see where your execution currently is in the workflow.

In the following screenshot, you can see that your workflow has successfully completed the alignment job and moved onto the subsequent steps, which are variant calling and generating quality information about your sample.

You can also navigate to the AWS Batch console and see that progress of all of your jobs reflected there as well.

Finally, after your workflow has completed successfully, check out the S3 path to which you wrote all of your files. If you run a ls –recursive command on the S3 results path, specified in the input to your state machine execution, you should see something similar to the following:

2017-05-02 13:46:32 6475144340 genomic-workflow/results/NA12878_run1/bam/sorted.bam
2017-05-02 13:46:34    7552576 genomic-workflow/results/NA12878_run1/bam/sorted.bam.bai
2017-05-02 13:46:32         45 genomic-workflow/results/NA12878_run1/bam/sorted.bam.md5
2017-05-02 13:53:20      68769 genomic-workflow/results/NA12878_run1/stats/bam_stats.dat
2017-05-02 14:05:12        100 genomic-workflow/results/NA12878_run1/vcf/stats/runStats.tsv
2017-05-02 14:05:12        359 genomic-workflow/results/NA12878_run1/vcf/stats/runStats.xml
2017-05-02 14:05:12  507577928 genomic-workflow/results/NA12878_run1/vcf/variants/genome.S1.vcf.gz
2017-05-02 14:05:12     723144 genomic-workflow/results/NA12878_run1/vcf/variants/genome.S1.vcf.gz.tbi
2017-05-02 14:05:12  507577928 genomic-workflow/results/NA12878_run1/vcf/variants/genome.vcf.gz
2017-05-02 14:05:12     723144 genomic-workflow/results/NA12878_run1/vcf/variants/genome.vcf.gz.tbi
2017-05-02 14:05:12   30783484 genomic-workflow/results/NA12878_run1/vcf/variants/variants.vcf.gz
2017-05-02 14:05:12    1566596 genomic-workflow/results/NA12878_run1/vcf/variants/variants.vcf.gz.tbi

Modifications to the workflow

You have now built and run your genomics workflow. While diving deep into modifications to this architecture are beyond the scope of these posts, we wanted to leave you with several suggestions of how you might modify this workflow to satisfy additional business requirements.

  • Job tracking with Amazon DynamoDB
    In many cases, such as if you are offering Genomics-as-a-Service, you might want to track the state of your jobs with DynamoDB to get fine-grained records of how your jobs are running. This way, you can easily identify the cost of individual jobs and workflows that you run.
  • Resuming from failure
    Both AWS Batch and Step Functions natively support job retries and can cover many of the standard cases where a job might be interrupted. There may be cases, however, where your workflow might fail in a way that is unpredictable. In this case, you can use custom error handling with AWS Step Functions to build out a workflow that is even more resilient. Also, you can build in fail states into your state machine to fail at any point, such as if a batch job fails after a certain number of retries.
  • Invoking Step Functions from Amazon API Gateway
    You can use API Gateway to build an API that acts as a "front door" to Step Functions. You can create a POST method that contains the input JSON to feed into the state machine you built. For more information, see the Implementing Serverless Manual Approval Steps in AWS Step Functions and Amazon API Gateway blog post.

Conclusion

While the approach we have demonstrated in this series has been focused on genomics, it is important to note that this can be generalized to nearly any high-throughput batch workload. We hope that you have found the information useful and that it can serve as a jump-start to building your own batch workloads on AWS with native AWS services.

For more information about how AWS can enable your genomics workloads, be sure to check out the AWS Genomics page.

Other posts in this four-part series:

Please leave any questions and comments below.

Four HIPAA Eligible Services Recently Added to the AWS Business Associate Agreement

Post Syndicated from Chad Woolf original https://aws.amazon.com/blogs/security/four-hipaa-eligible-services-recently-added-to-the-aws-business-associate-agreement/

HIPAA logo

We are pleased to announce that the following four AWS services have been added in recent weeks to the AWS Business Associate Agreement (BAA):

As with all HIPAA Eligible Services covered under the BAA, Protected Health Information (PHI) must be encrypted while at rest or in transit. See Architecting for HIPAA Security and Compliance on Amazon Web Services, which explains how you can configure each AWS HIPAA Eligible Service to store, process, and transmit PHI.

For more details, see the full AWS Blog post.

– Chad

Roundup of AWS HIPAA Eligible Service Announcements

Post Syndicated from Ana Visneski original https://aws.amazon.com/blogs/aws/roundup-of-aws-hipaa-eligible-service-announcements/

At AWS we have had a number of HIPAA eligible service announcements. Patrick Combes, the Healthcare and Life Sciences Global Technical Leader at AWS, and Aaron Friedman, a Healthcare and Life Sciences Partner Solutions Architect at AWS, have written this post to tell you all about it.

-Ana


We are pleased to announce that the following AWS services have been added to the BAA in recent weeks: Amazon API Gateway, AWS Direct Connect, AWS Database Migration Service, and Amazon SQS. All four of these services facilitate moving data into and through AWS, and we are excited to see how customers will be using these services to advance their solutions in healthcare. While we know the use cases for each of these services are vast, we wanted to highlight some ways that customers might use these services with Protected Health Information (PHI).

As with all HIPAA-eligible services covered under the AWS Business Associate Addendum (BAA), PHI must be encrypted while at-rest or in-transit. We encourage you to reference our HIPAA whitepaper, which details how you might configure each of AWS’ HIPAA-eligible services to store, process, and transmit PHI. And of course, for any portion of your application that does not touch PHI, you can use any of our 90+ services to deliver the best possible experience to your users. You can find some ideas on architecting for HIPAA on our website.

Amazon API Gateway
Amazon API Gateway is a web service that makes it easy for developers to create, publish, monitor, and secure APIs at any scale. With PHI now able to securely transit API Gateway, applications such as patient/provider directories, patient dashboards, medical device reports/telemetry, HL7 message processing and more can securely accept and deliver information to any number and type of applications running within AWS or client presentation layers.

One particular area we are excited to see how our customers leverage Amazon API Gateway is with the exchange of healthcare information. The Fast Healthcare Interoperability Resources (FHIR) specification will likely become the next-generation standard for how health information is shared between entities. With strong support for RESTful architectures, FHIR can be easily codified within an API on Amazon API Gateway. For more information on FHIR, our AWS Healthcare Competency partner, Datica, has an excellent primer.

AWS Direct Connect
Some of our healthcare and life sciences customers, such as Johnson & Johnson, leverage hybrid architectures and need to connect their on-premises infrastructure to the AWS Cloud. Using AWS Direct Connect, you can establish private connectivity between AWS and your datacenter, office, or colocation environment, which in many cases can reduce your network costs, increase bandwidth throughput, and provide a more consistent network experience than Internet-based connections.

In addition to a hybrid-architecture strategy, AWS Direct Connect can assist with the secure migration of data to AWS, which is the first step to using the wide array of our HIPAA-eligible services to store and process PHI, such as Amazon S3 and Amazon EMR. Additionally, you can connect to third-party/externally-hosted applications or partner-provided solutions as well as securely and reliably connect end users to those same healthcare applications, such as a cloud-based Electronic Medical Record system.

AWS Database Migration Service (DMS)
To date, customers have migrated over 20,000 databases to AWS through the AWS Database Migration Service. Customers often use DMS as part of their cloud migration strategy, and now it can be used to securely and easily migrate your core databases containing PHI to the AWS Cloud. As your source database remains fully operational during the migration with DMS, you minimize downtime for these business-critical applications as you migrate your databases to AWS. This service can now be utilized to securely transfer such items as patient directories, payment/transaction record databases, revenue management databases and more into AWS.

Amazon Simple Queue Service (SQS)
Amazon Simple Queue Service (SQS) is a message queueing service for reliably communicating among distributed software components and microservices at any scale. One way that we envision customers using SQS with PHI is to buffer requests between application components that pass HL7 or FHIR messages to other parts of their application. You can leverage features like SQS FIFO to ensure your messages containing PHI are passed in the order they are received and delivered in the order they are received, and available until a consumer processes and deletes it. This is important for applications with patient record updates or processing payment information in a hospital.

Let’s get building!
We are beyond excited to see how our customers will use our newly HIPAA-eligible services as part of their healthcare applications. What are you most excited for? Leave a comment below.

How to remove boilerplate validation logic in your REST APIs with Amazon API Gateway request validation

Post Syndicated from Bryan Liston original https://aws.amazon.com/blogs/compute/how-to-remove-boilerplate-validation-logic-in-your-rest-apis-with-amazon-api-gateway-request-validation/


Ryan Green, Software Development Engineer

Does your API suffer from code bloat or wasted developer time due to implementation of simple input validation rules? One of the necessary but least exciting aspects of building a robust REST API involves implementing basic validation of input data to your API. In addition to increasing the size of the code base, validation logic may require taking on extra dependencies and requires diligence in ensuring the API implementation doesn’t get out of sync with API request/response models and SDKs.

Amazon API Gateway recently announced the release of request validators, a simple but powerful new feature that should help to liberate API developers from the undifferentiated effort of implementing basic request validation in their API backends.

This feature leverages API Gateway models to enable the validation of request payloads against the specified schema, including validation rules as defined in the JSON-Schema Validation specification. Request validators also support basic validation of required HTTP request parameters in the URI, query string, and headers.

When a validation failure occurs, API Gateway fails the API request with an HTTP 400 error, skips the request to the backend integration, and publishes detailed error results in Amazon CloudWatch Logs.

In this post, I show two examples using request validators, validating the request body and the request parameters.

Example: Validating the request body

For this example, you build a simple API for a simulated stock trading system. This API has a resource, "/orders", that represents stock purchase orders. An HTTP POST to this resource allows the client to initiate one or more orders.

A sample request might look like this:

POST /orders

[
  {
    "account-id": "abcdef123456",
    "type": "STOCK",
    "symbol": "AMZN",
    "shares": 100,
    "details": {
      "limit": 1000
    }
  },
  {
    "account-id": "zyxwvut987654",
    "type": "STOCK",
    "symbol": "BA",
    "shares": 250,
    "details": {
      "limit": 200
    }
  }
]

The JSON-Schema for this request body might look something like this:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Create Orders Schema",
  "type": "array",
  "minItems": 1,
  "items": {
    "type": "object",
    "required": [
      "account-id",
      "type",
      "symbol",
      "shares",
      "details"
    ],
    "properties": {
      "account_id": {
        "type": "string",
        "pattern": "[A-Za-z]{6}[0-9]{6}"
      },
      "type": {
        "type": "string",
        "enum": [
          "STOCK",
          "BOND",
          "CASH"
        ]
      },
      "symbol": {
        "type": "string",
        "minLength": 1,
        "maxLength": 4
      },
      "shares": {
        "type": "number",
        "minimum": 1,
        "maximum": 1000
      },
      "details": {
        "type": "object",
        "required": [
          "limit"
        ],
        "properties": {
          "limit": {
            "type": "number"
          }
        }
      }
    }
  }
}

This schema defines the "shape" of the request model but also defines several constraints on the various properties. Here are the validation rules for this schema:

  • The root array must have at least 1 item
  • All properties are required
  • Account ID must match the regular expression format "[A-Za-z]{6}[0-9]{6}"
  • Type must be one of STOCK, BOND, or CASH
  • Symbol must be a string between 1 and 4 characters
  • Shares must be a number between 1 and 1000

I’m sure you can imagine how this would look in your validation library of choice, or at worst, in a hand-coded implementation.

Now, try this out with API Gateway request validators. The Swagger definition below defines the REST API, models, and request validators. Its two operations define simple mock integrations to simulate behavior of the stock trading API.

Note the request validator definitions under the "x-amazon-apigateway-request-validators" extension, and the references to these validators defined on the operation and on the API.

{
  "swagger": "2.0",
  "info": {
    "title": "API Gateway - Request Validation Demo - [email protected]"
  },
  "schemes": [
    "https"
  ],
  "produces": [
    "application/json"
  ],
  "x-amazon-apigateway-request-validators" : {
    "full" : {
      "validateRequestBody" : true,
      "validateRequestParameters" : true
    },
    "body-only" : {
      "validateRequestBody" : true,
      "validateRequestParameters" : false
    }
  },
  "x-amazon-apigateway-request-validator" : "full",
  "paths": {
    "/orders": {
      "post": {
        "x-amazon-apigateway-request-validator": "body-only",
        "parameters": [
          {
            "in": "body",
            "name": "CreateOrders",
            "required": true,
            "schema": {
              "$ref": "#/definitions/CreateOrders"
            }
          }
        ],
        "responses": {
          "200": {
            "schema": {
              "$ref": "#/definitions/Message"
            }
          },
          "400" : {
            "schema": {
              "$ref": "#/definitions/Message"
            }
          }
        },
        "x-amazon-apigateway-integration": {
          "responses": {
            "default": {
              "statusCode": "200",
              "responseTemplates": {
                "application/json": "{\"message\" : \"Orders successfully created\"}"
              }
            }
          },
          "requestTemplates": {
            "application/json": "{\"statusCode\": 200}"
          },
          "passthroughBehavior": "never",
          "type": "mock"
        }
      },
      "get": {
        "parameters": [
          {
            "in": "header",
            "name": "Account-Id",
            "required": true
          },
          {
            "in": "query",
            "name": "type",
            "required": false
          }
        ],
        "responses": {
          "200" : {
            "schema": {
              "$ref": "#/definitions/Orders"
            }
          },
          "400" : {
            "schema": {
              "$ref": "#/definitions/Message"
            }
          }
        },
        "x-amazon-apigateway-integration": {
          "responses": {
            "default": {
              "statusCode": "200",
              "responseTemplates": {
                "application/json": "[{\"order-id\" : \"qrx987\",\n   \"type\" : \"STOCK\",\n   \"symbol\" : \"AMZN\",\n   \"shares\" : 100,\n   \"time\" : \"1488217405\",\n   \"state\" : \"COMPLETED\"\n},\n{\n   \"order-id\" : \"foo123\",\n   \"type\" : \"STOCK\",\n   \"symbol\" : \"BA\",\n   \"shares\" : 100,\n   \"time\" : \"1488213043\",\n   \"state\" : \"COMPLETED\"\n}\n]"
              }
            }
          },
          "requestTemplates": {
            "application/json": "{\"statusCode\": 200}"
          },
          "passthroughBehavior": "never",
          "type": "mock"
        }
      }
    }
  },
  "definitions": {
    "CreateOrders": {
      "$schema": "http://json-schema.org/draft-04/schema#",
      "title": "Create Orders Schema",
      "type": "array",
      "minItems" : 1,
      "items": {
        "type": "object",
        "$ref" : "#/definitions/Order"
      }
    },
    "Orders" : {
      "type": "array",
      "$schema": "http://json-schema.org/draft-04/schema#",
      "title": "Get Orders Schema",
      "items": {
        "type": "object",
        "properties": {
          "order_id": { "type": "string" },
          "time" : { "type": "string" },
          "state" : {
            "type": "string",
            "enum": [
              "PENDING",
              "COMPLETED"
            ]
          },
          "order" : {
            "$ref" : "#/definitions/Order"
          }
        }
      }
    },
    "Order" : {
      "type": "object",
      "$schema": "http://json-schema.org/draft-04/schema#",
      "title": "Schema for a single Order",
      "required": [
        "account-id",
        "type",
        "symbol",
        "shares",
        "details"
      ],
      "properties" : {
        "account-id": {
          "type": "string",
          "pattern": "[A-Za-z]{6}[0-9]{6}"
        },
        "type": {
          "type" : "string",
          "enum" : [
            "STOCK",
            "BOND",
            "CASH"]
        },
        "symbol" : {
          "type": "string",
          "minLength": 1,
          "maxLength": 4
        },
        "shares": {
          "type": "number",
          "minimum": 1,
          "maximum": 1000
        },
        "details": {
          "type": "object",
          "required": [
            "limit"
          ],
          "properties": {
            "limit": {
              "type": "number"
            }
          }
        }
      }
    },
    "Message": {
      "type": "object",
      "properties": {
        "message" : {
          "type" : "string"
        }
      }
    }
  }
}

To create the demo API, run the following commands (requires the AWS CLI):

git clone https://github.com/rpgreen/apigateway-validation-demo.git
cd apigateway-validation-demo
aws apigateway import-rest-api --body "file://validation-swagger.json" --region us-east-1
export API_ID=[API ID from last step]
aws apigateway create-deployment --rest-api-id $API_ID --stage-name test --region us-east-1

Make some requests to this API. Here’s the happy path with valid request body:

curl -v -H "Content-Type: application/json" -X POST -d ' [  
   { 
      "account-id":"abcdef123456",
      "type":"STOCK",
      "symbol":"AMZN",
      "shares":100,
      "details":{  
         "limit":1000
      }
   }
]' https://$API_ID.execute-api.us-east-1.amazonaws.com/test/orders

Response:

HTTP/1.1 200 OK

{"message" : "Orders successfully created"}

Put the request validator to the test. Notice the errors in the payload:

curl -v -H "Content-Type: application/json" -X POST -d '[
  {
    "account-id": "abcdef123456",
    "type": "foobar",
    "symbol": "thisstringistoolong",
    "shares": 999999,
    "details": {
       "limit": 1000
    }
  }
]' https://$API_ID.execute-api.us-east-1.amazonaws.com/test/orders

Response:

HTTP/1.1 400 Bad Request

{"message": "Invalid request body"}

When you inspect the CloudWatch Logs entries for this API, you see the detailed error messages for this payload. Run the following command:

pip install apilogs

apilogs get --api-id $API_ID --stage test --watch --region us-east-1`

The CloudWatch Logs entry for this request reveals the specific validation errors:

"Request body does not match model schema for content type application/json: [numeric instance is greater than the required maximum (maximum: 1000, found: 999999), string "thisstringistoolong" is too long (length: 19, maximum allowed: 4), instance value ("foobar") not found in enum (possible values: ["STOCK","BOND","CASH"])]"

Note on Content-Type: 

Request body validation is performed according to the configured request Model which is selected by the value of the request ‘Content-Type’ header. In order to enforce validation and restrict requests to explicitly-defined content types, it’s a good idea to use strict request passthrough behavior (‘"passthroughBehavior": "never"’), so that unsupported content types fail with 415 "Unsupported Media Type" response.

Example: Validating the request parameters

For the next example, add a GET method to the /orders resource that returns the list of purchase orders. This method has an optional query string parameter (type) and a required header parameter (Account-Id).

The request validator configured for the GET method is set to validate incoming request parameters. This performs basic validation on the required parameters, ensuring that the request parameters are present and non-blank.

Here are some example requests.

Happy path:

curl -v -H "Account-Id: abcdef123456" "https://$API_ID.execute-api.us-east-1.amazonaws.com/test/orders?type=STOCK"

Response:

HTTP/1.1 200 OK

[{"order-id" : "qrx987",
   "type" : "STOCK",
   "symbol" : "AMZN",
   "shares" : 100,
   "time" : "1488217405",
   "state" : "COMPLETED"
},
{
   "order-id" : "foo123",
   "type" : "STOCK",
   "symbol" : "BA",
   "shares" : 100,
   "time" : "1488213043",
   "state" : "COMPLETED"
}]

Omitting optional type parameter:

curl -v -H "Account-Id: abcdef123456" "https://$API_ID.execute-api.us-east-1.amazonaws.com/test/orders"

Response:

HTTP/1.1 200 OK

[{"order-id" : "qrx987",
   "type" : "STOCK",
   "symbol" : "AMZN",
   "shares" : 100,
   "time" : "1488217405",
   "state" : "COMPLETED"
},
{
   "order-id" : "foo123",
   "type" : "STOCK",
   "symbol" : "BA",
   "shares" : 100,
   "time" : "1488213043",
   "state" : "COMPLETED"
}]

Omitting required Account-Id parameter:

curl -v "https://$API_ID.execute-api.us-east-1.amazonaws.com/test/orders?type=STOCK"

Response:

HTTP/1.1 400 Bad Request

{"message": "Missing required request parameters: [Account-Id]"}

Conclusion

Request validators should help API developers to build better APIs by allowing them to remove boilerplate validation logic from backend implementations and focus on actual business logic and deep validation. This should further reduce the size of the API codebase and also help to ensure that API models and validation logic are kept in sync. 

Please forward any questions or feedback to the API Gateway team through AWS Support or on the AWS Forums.

Welcome to the Newest AWS Community Heroes (Spring 2017)

Post Syndicated from Ana Visneski original https://aws.amazon.com/blogs/aws/welcome-to-the-newest-aws-community-heroes-spring-2017/

We would like to extend a very warm welcome to the newest AWS Community Heroes:

AWS Community Heroes share their knowledge and demonstrate their enthusiasm for AWS in a plethora of ways. They go above and beyond to share AWS insights via social media, blog posts, open source projects, and through in-person events, user groups, and workshops.


Mark Nunnikhoven
Mark Nunnikhoven explores the impact of technology on individuals, organizations, and communities through the lens of privacy and security. Asking the question, “How can we better protect our information?” Mark studies the world of cybercrime to better understand the risks and threats to our digital world.

As the Vice President of Cloud Research at Trend Micro, a long time Amazon Web Services Advanced Technology Partner and provider of security tools for the AWS Cloud, Mark uses that knowledge to help organizations around the world modernize their security practices by taking advantage of the power of the AWS Cloud.

With a strong focus on automation, he helps bridge the gap between DevOps and traditional security through his writing, speaking, teaching, and by engaging with the AWS community.

 

SangUk Park
SangUk Park is a Chief Solutions Architect at Megazone, which became Korea’s first AWS Partner in 2012 and is the only AWS Premier Consulting Partner to provide AWS support in Korean.

He served as a System Architect for KT’s public cloud and VDI design, and led the system operation of YDOnline and Nexon Japan, one of the leading online gaming companies. Certified both as an AWS Solutions Architect – Professional and AWS DevOps Engineer – Professional, SangUk has authored AWS books, including DevOps and AWS Cloud Design Patterns, and translated four books related to the AWS Cloud.

He’s been making efforts to revitalize the local AWS Korea User Group community as co-leader by presenting at AWS Korea User Group meetings and AWS Summits, and helping to establish small group gatherings such as the AWSKRUG System Engineers in Gangnam. Also, he has done many hands-on labs and has been running a booth as a leader of the user groups at AWS events to cultivate developers and system engineers.

SangUk maintains a close relationship with the Japanese AWS User Group (JAWS UG), using his excellent Japanese communication skills and experiences in Japan. He makes every effort to participate in events held between Japanese and Korean user groups as a facilitator and translator, and will promote cross-regional communications beyond APAC going forward.

 

James Hall
James Hall has been working in the digital sector for over a decade. He is the author of the popular jsPDF library, and is a founder/Director of Parallax, a digital agency in the UK. He’s worked as a software developer on a wide variety of projects, from LED Billboards, car unlocking apps, to large web applications and tools.

Parallax built an online recording studio for David Guetta and UEFA using Serverless technology shortly after API Gateway was released. Since then they have consulted on various serverless projects and technologies. They run the AWS Meetup in Leeds, and help companies around the world build their businesses online. James has contributed to and promotes the Serverless Framework which allows you to elegantly build web applications on top of Lambda and related services.

 

Drew Firment
Drew Firment works with business leaders and technology teams from organizations that seek to accelerate cloud adoption. He has over twenty years of experience leading large-scale technology programs, enterprise platforms, and cultural transformations in a fast-paced agile environment.

After migrating Capital One’s early adopters of AWS into production, his focus shifted toward accelerating a scaleable and sustainable transition to cloud computing. Drew pioneered the intersection of strategy, governance, engineering, agile, and education to drive an enterprise-wide talent transformation. He founded Capital One’s cloud engineering college, and implemented an innovative outcome-based curriculum oriented towards learning communities. Several thousand employees have enrolled in his cloud-fluency program, enabling well over 1,000 AWS certifications since its inception.

Drew has earned all three of the AWS associate-level certifications, enjoys developing custom Amazon Alexa skills using AWS Lambda, and believes serverless is the future of cloud computing. He also serves as an advisory partner to A Cloud Guru and is editor-in-chief of the their community-sourced publication.

Welcome
Please join me in welcoming to our newest AWS Community Heroes!

-Ana

A Serverless Authentication System by Jumia

Post Syndicated from Bryan Liston original https://aws.amazon.com/blogs/compute/a-serverless-authentication-system-by-jumia/

Jumia Group is an ecosystem of nine different companies operating in 22 different countries in Africa. Jumia employs 3000 people and serves 15 million users/month.

Want to secure and centralize millions of user accounts across Africa? Shut down your servers! Jumia Group unified and centralized customer authentication on nine digital services platforms, operating in 22 (and counting) countries in Africa, totaling over 120 customer and merchant facing applications. All were unified into a custom Jumia Central Authentication System (JCAS), built in a timely fashion and designed using a serverless architecture.

In this post, we give you our solution overview. For the full technical implementation, see the Jumia Central Authentication System post on the Jumia Tech blog.

The challenge

A group-level initiative was started to centralize authentication for all Jumia users in all countries for all companies. But it was impossible to unify the operational databases for the different companies. Each company had its own user database with its own technological implementation. Each company alone had yet to unify the logins for their own countries. The effects of deduplicating all user accounts were yet to be determined but were considered to be large. Finally, there was no team responsible for manage this new project, given that a new dedicated infrastructure would be needed.

With these factors in mind, we decided to design this project as a serverless architecture to eliminate the need for infrastructure and a dedicated team. AWS was immediately considered as the best option for its level of
service, intelligent pricing model, and excellent serverless services.

The goal was simple. For all user accounts on all Jumia Group websites:

  • Merge duplicates that might exist in different companies and countries
  • Create a unique login (username and password)
  • Enrich the user profile in this centralized system
  • Share the profile with all Jumia companies

Requirements

We had the following initial requirements while designing this solution on the AWS platform:

  • Secure by design
  • Highly available via multimaster replication
  • Single login for all platforms/countries
  • Minimal performance impact
  • No admin overhead

Chosen technologies

We chose the following AWS services and technologies to implement our solution.

Amazon API Gateway

Amazon API Gateway is a fully managed service, making it really simple to set up an API. It integrates directly with AWS Lambda, which was chosen as our endpoint. It can be easily replicated to other regions, using Swagger import/export.

AWS Lambda

AWS Lambda is the base of serverless computing, allowing you to run code without worrying about infrastructure. All our code runs on Lambda functions using Python; some functions are
called from the API Gateway, others are scheduled like cron jobs.

Amazon DynamoDB

Amazon DynamoDB
is a highly scalable, NoSQL database with a good API and a clean pricing model. It has great scalability as well as high availability, and fits the serverless model we aimed for with JCAS.

AWS KMS

AWS KMS was chosen as a key manager to perform envelope encryption. It’s a simple and secure key manager with multiple encryption functionalities.

Envelope encryption

Oversimplifying envelope encryption is when you encrypt a key rather than the data itself. That encrypted key, which was used to encrypt the data, may now be stored with the data itself on your persistence layer since it doesn’t decrypt the data if compromised. For more information, see How Envelope Encryption Works with Supported AWS Services.

Envelope encryption was chosen given that master keys have a 4 KB limit for data to be encrypted or decrypted.

Amazon SQS

Amazon SQS is an inexpensive queuing service with dynamic scaling, 14-day retention availability, and easy management. It’s the perfect choice for our needs, as we use queuing systems only as a
fallback when saving data to remote regions fails. All the features needed for those fallback cases were covered.

JWT

We also use JSON web tokens for encoding and signing communications between JCAS and company servers. It’s another layer of security for data in transit.

Data structure

Our database design was pretty straightforward. DynamoDB records can be accessed by the primary key and allow access using secondary indexes as well. We created a UUID for each user on JCAS, which is used as a primary key
on DynamoDB.

Most data must be encrypted so we use a single field for that data. On the other hand, there’s data that needs to be stored in separate fields as they need to be accessed from the code without decryption for lookups or basic checks. This indexed data was also stored, as a hash or plain, outside the main ciphered blob store.

We used a field for each searchable data piece:

  • Id (primary key)
  • main phone (indexed)
  • main email (indexed)
  • account status
  • document timestamp

To hold the encrypted data we use a data dictionary with two main dictionaries, info with the user’s encrypted data and keys with the key for each AWS KMS region to decrypt the info blob.
Passwords are stored in two dictionaries, old_hashes contains the legacy hashes from the origin systems and secret holds the user’s JCAS password.

Here’s an example:
data_at_rest

Security

Security is like an onion, it needs to be layered. That’s what we did when designing this solution. Our design makes all of our data unreadable at each layer of this solution, while easing our compliance needs.

A field called data stores all personal information from customers. It’s encrypted using AES256-CBC with a key generated and managed by AWS KMS. A new data key is used for each transaction. For communication between companies and the API, we use API Keys, TLS and JWT, in the body to ensure that the post is signed and verified.

Data flow

Our second requirement on JCAS was system availability, so we designed some data pipelines. These pipelines allow multi-region replication, which evades collision using idempotent operations on all endpoints. The only technology added to the stack was Amazon SQS. On SQS queues, we place all the items we aren’t able to replicate at the time of the client’s request.

JCAS may have inconsistencies between regions, caused by network or region availability; by using SQS, we have workers that synchronize all regions as soon as possible.

Example with two regions

We have three Lambda functions and two SQS queues, where:

1) A trigger Lambda function is called for the DynamoDB stream. Upon changes to the user’s table, it tries to write directly to another DynamoDB table in the second region, falling back to writing to two SQS queues.

fig1

2) A scheduled Lambda function (cron-style) checks a SQS queue for items and tries writing them to the DynamoDB table that potentially failed.

fig2

3) A cron-style Lambda function checks the SQS queue, calling KMS for any items, and fixes the issue.

fig3

The following diagram shows the full infrastrucure (for clarity, this diagram leaves out KMS recovery).

CAS Full Diagram

Results

Upon going live, we noticed a minor impact in our response times. Note the brown legend in the images below.

lat_1

This was a cold start, as the infrastructure started to get hot, response time started to converge. On 27 April, we were almost at the initial 500ms.
lat_2

It kept steady on values before JCAS went live (≈500ms).
lat_3

As of the writing of this post, our response time kept improving (dev changed the method name and we changed the subdomain name).
lat_4

Customer login used to take ≈500ms and it still takes ≈500ms with JCAS. Those times have improved as other components changed inside our code.

Lessons learned

  • Instead of the standard cross-region replication, creating your own DynamoDB cross-region replication might be a better fit for you, if your data and applications allow it.
  • Take some time to tweak the Lambda runtime memory. Remember that it’s billed per 100ms, so it saves you money if you have it run near a round number.
  • KMS takes away the problem of key management with great security by design. It really simplifies your life.
  • Always check the timestamp before operating on data. If it’s invalid, save the money by skipping further KMS, DynamoDB, and Lambda calls. You’ll love your systems even more.
  • Embrace the serverless paradigm, even if it looks complicated at first. It will save your life further down the road when your traffic bursts or you want to find an engineer who knows your whole system.

Next steps

We are going to leverage everything we’ve done so far to implement SSO in Jumia. For a future project, we are already testing OpenID connect with DynamoDB as a backend.

Conclusion

We did a mindset revolution in many ways.
Not only we went completely serverless as we started storing critical info on the cloud.
On top of this we also architectured a system where all user data is decoupled between local systems and our central auth silo.
Managing all of these critical systems became far more predictable and less cumbersome than we thought possible.
For us this is the proof that good and simple designs are the best features to look out for when sketching new systems.

If we were to do this again, we would do it in exactly the same way.

Daniel Loureiro (SecOps) & Tiago Caxias (DBA) – Jumia Group

Is it on AWS? Domain Identification Using AWS Lambda

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/is-it-on-aws-domain-identification-using-aws-lambda/

In the guest post below, my colleague Tim Bray explains how he built IsItOnAWS.com . Powered by the list of AWS IP address ranges and using a pair of AWS Lambda functions that Tim wrote, the site aims to tell you if your favorite website is running on AWS.

Jeff;


Is it on AWS?
I did some recreational programming over Christmas and ended up with a little Lambda function that amused me and maybe it’ll amuse you too. It tells you whether or not a given domain name (or IP address) (even IPv6!) is in the published list of AWS IP address ranges. You can try it out over at IsItOnAWS.com. Part of the construction involves one Lambda function creating another.

That list of of ranges, given as IPv4 and IPv6 CIDRs wrapped in JSON, is here; the how-to documentation is here and there’s a Jeff Barr blog. Here are a few lines of the “IP-Ranges” JSON:

{
  "syncToken": "1486776130",
  "createDate": "2017-02-11-01-22-10",
  "prefixes": [
    {
      "ip_prefix": "13.32.0.0/15",
      "region": "GLOBAL",
      "service": "AMAZON"
    },
    ...
  "ipv6_prefixes": [
    {
      "ipv6_prefix": "2400:6500:0:7000::/56",
      "region": "ap-southeast-1",
      "service": "AMAZON"
    },

As soon as I saw it, I thought “I wonder if IsItOnAWS.com is available?” It was, and so I had to build this thing. I wanted it to be:

  1. Serverless (because that’s what the cool kids are doing),
  2. simple (because it’s a simple problem, look up a number in a range of numbers), and
  3. fast. Because well of course.

Database or Not?
The construction seemed pretty obvious: Simplify the IP-Ranges into a table, then look up addresses in it. So, where to put the table? I thought about Amazon DynamoDB, but it’s not obvious how best to search on what in effect is a numeric range. I thought about SQL databases, where it is obvious, but note #2 above. I thought about Redis or some such, but then you have to provision instances, see #1 above. I actually ended up stuck for a few days scratching my head over this one.

Then a question occurred to me: How big is that list of ranges? It turns out to have less than a thousand entries. So who needs a database anyhow? Let’s just sort that JSON into an array and binary-search it. OK then, where does the array go? Amazon S3 would be easy, but hey, look at #3 above; S3’s fast, but why would I want it in the loop for every request? So I decided to just generate a little file containing the ranges as an array literal, and include it right into the IsItOnAWS Lambda function. Which meant I’d have to rebuild and upload the function every time the IP addresses change.

It turns out that if you care about those addresses, you can subscribe to an Amazon Simple Notification Service (SNS) topic that will notify you whenever it changes (in my recent experience, once or twice a week). And you can hook your subscription up to a Lambda function. With that, I felt I’d found all the pieces anyone could need. There are two Lambda functions: the first, newranges.js, gets the change notifications, generates the JavaScript form of the IP-Ranges data, and uploads a second Lambda function, isitonaws.js, which includes that JavaScript. Vigilant readers will have deduced this is all with the Node runtime.

The new-ranges function, your typical async/waterfall thing, is a little more complex than I’d expected going in.

Postmodern IP Addresses
Its first task is to fetch the IP-Ranges, a straightforward HTTP GET. Then you take that JSON and smooth it out to make it more searchable. Unsurprisingly, there are both IPv4 and IPv6 ranges, and to make things easy I wanted to mash ’em all together into a single array that I could search with simple string or numeric matching. And since IPv6 addresses are way too big for JavaScript numbers to hold, they needed to be strings.

It turns out the way the IPv4 space embeds into IPv6’s ("::ffff:0:0/96") is a little surprising. I’d always assumed it’d be like the BMP mapping into the low bits of Unicode. I idly wonder why it’s this way, but not enough to research it.

The code for crushing all those CIDRs together into a nice searchable array ended up being kind of brutish, but it gets the job done.

Building Lambda in Lambda
Next, we need to construct the lambda that’s going to actually handle the IsItOnAWS request. This has to be a Zipfile, and NPM has tools to make those. Then it was a matter of jamming the zipped bytes into S3 and uploading them to make the new Lambda function.

The sharp-eyed will note that once I’d created the zip, I could have just uploaded it to Lambda directly. I used the S3 interim step because I wanted to to be able to download the generated “ranges” data structure and actually look at it; at some point I may purify the flow.

The actual IsItOnAWS runtime is laughably simple, aside from a bit of work around hitting DNS to look up addresses for names, then mashing them into the same format we used for the ranges array. I didn’t do any HTML templating, just read it out of a file in the zip and replaced an invisible <div> with the results if there were any. Except for, I got to code up a binary search method, which only happens once a decade or so but makes me happy.

Putting the Pieces Together
Once I had all this code working, I wanted to connect it to the world, which meant using Amazon API Gateway. I’ve found this complex in the past, but this time around I plowed through Create an API with Lambda Proxy Integration through a Proxy Resource, and found it reasonably linear and surprise-free.

However, it’s mostly focused on constructing APIs (i.e. JSON in/out) as opposed to human experiences. It doesn’t actually say how to send HTML for a human to consume in a browser, but it’s not hard to figure out. Here’s how (from Node):

context.succeed({
  "statusCode": 200,
  "headers": { "Content-type": "text/html" },
  "body": "<html>Your HTML Here</html>"
});

Once I had everything hooked up to API Gateway, the last step was pointing isitonaws.com at it. And that’s why I wrote this code in December-January, but am blogging at you now. Back then, Amazon Certificate Manager (ACM) certs couldn’t be used with API Gateway, and in 2017, life is just too short to go through the old-school ceremony for getting a cert approved and hooked up. ACM makes the cert process a real no-brainer. What with ACM and Let’s Encrypt loose in the wild, there’s really no excuse any more for having a non-HTTPS site. Both are excellent, but if you’re using AWS services like API Gateway and CloudFront like I am here, ACM is a smoother fit. Also it auto-renews, which you have to like.

So as of now, hooking up a domain name via HTTPS and CloudFront to your API Gateway API is dead easy; see Use Custom Domain Name as API Gateway API Host Name. Worked for me, first time, but something to watch out for (in March 2017, anyhow): When you get to the last step of connecting your ACM cert to your API, you get a little spinner that wiggles at you for several minutes while it hooks things up; this is apparently normal. Fortunately I got distracted and didn’t give up and refresh or cancel or anything, which might have screwed things up.

By the way, as a side-effect of using API Gateway, this is all running through CloudFront. So what with that, and not having a database, you’d expect it to be fast. And yep, it sure is, from here in Vancouver anyhow. Fast enough to not bother measuring.

I also subscribed my email to the “IP-Ranges changed” SNS topic, so every now and then I get an email telling me it’s changed, and I smile because I know that my Lambda wrote a new Lambda, all automatic, hands-off, clean, and fast.

Tim Bray, Senior Principal Engineer

 

SAML for Your Serverless JavaScript Application: Part II

Post Syndicated from Bryan Liston original https://aws.amazon.com/blogs/compute/saml-for-your-serverless-javascript-application-part-ii/

Contributors: Richard Threlkeld, Gene Ting, Stefano Buliani

The full code for both scenarios—including SAM templates—can be found at the samljs-serverless-sample GitHub repository. We highly recommend you use the SAM templates in the GitHub repository to create the resources, opitonally you can manually create them.


This is the second part of a two part series for using SAML providers in your application and receiving short-term credentials to access AWS Services. These credentials can be limited with IAM roles so the users of the applications can perform actions like fetching data from databases or uploading files based on their level of authorization. For example, you may want to build a JavaScript application that allows a user to authenticate against Active Directory Federation Services (ADFS). The user can be granted scoped AWS credentials to invoke an API to display information in the application or write to an Amazon DynamoDB table.

Part I of this series walked through a client-side flow of retrieving SAML claims and passing them to Amazon Cognito to retrieve credentials. This blog post will take you through a more advanced scenario where logic can be moved to the backend for a more comprehensive and flexible solution.

Prerequisites

As in Part I of this series, you need ADFS running in your environment. The following configurations are used for reference:

  1. ADFS federated with the AWS console. For a walkthrough with an AWS CloudFormation template, see Enabling Federation to AWS Using Windows Active Directory, ADFS, and SAML 2.0.
  2. Verify that you can authenticate with user example\bob for both the ADFS-Dev and ADFS-Production groups via the sign-in page.
  3. Create an Amazon Cognito identity pool.

Scenario Overview

The scenario in the last blog post may be sufficient for many organizations but, due to size restrictions, some browsers may drop part or all of a query string when sending a large number of claims in the SAMLResponse. Additionally, for auditing and logging reasons, you may wish to relay SAML assertions via POST only and perform parsing in the backend before sending credentials to the client. This scenario allows you to perform custom business logic and validation as well as putting tracking controls in place.

In this post, we want to show you how these requirements can be achieved in a Serverless application. We also show how different challenges (like XML parsing and JWT exchange) can be done in a Serverless application design. Feel free to mix and match, or swap pieces around to suit your needs.

This scenario uses the following services and features:

  • Cognito for unique ID generation and default role mapping
  • S3 for static website hosting
  • API Gateway for receiving the SAMLResponse POST from ADFS
  • Lambda for processing the SAML assertion using a native XML parser
  • DynamoDB conditional writes for session tracking exceptions
  • STS for credentials via Lambda
  • KMS for signing JWT tokens
  • API Gateway custom authorizers for controlling per-session access to credentials, using JWT tokens that were signed with KMS keys
  • JavaScript-generated SDK from API Gateway using a service proxy to DynamoDB
  • RelayState in the SAMLRequest to ADFS to transmit the CognitoID and a short code from the client to your AWS backend

At a high level, this solution is similar to that of Scenario 1; however, most of the work is done in the infrastructure rather than on the client.

  • ADFS still uses a POST binding to redirect the SAMLResponse to API Gateway; however, the Lambda function does not immediately redirect.
  • The Lambda function decodes and uses an XML parser to read the properties of the SAML assertion.
  • If the user’s assertion shows that they belong to a certain group matching a specified string (“Prod” in the sample), then you assign a role that they can assume (“ADFS-Production”).
  • Lambda then gets the credentials on behalf of the user and stores them in DynamoDB as well as logging an audit record in a separate table.
  • Lambda then returns a short-lived, signed JSON Web Token (JWT) to the JavaScript application.
  • The application uses the JWT to get their stored credentials from DynamoDB through an API Gateway custom authorizer.

The architecture you build in this tutorial is outlined in the following diagram.

lambdasamltwo_1.png

First, a user visits your static website hosted on S3. They generate an ephemeral random code that is transmitted during redirection to ADFS, where they are prompted for their Active Directory credentials.

Upon successful authentication, the ADFS server redirects the SAMLResponse assertion, along with the code (as the RelayState) via POST to API Gateway.

The Lambda function parses the SAMLResponse. If the user is part of the appropriate Active Directory group (AWS-Production in this tutorial), it retrieves credentials from STS on behalf of the user.

The credentials are stored in a DynamoDB table called SAMLSessions, along with the short code. The user login is stored in a tracking table called SAMLUsers.

The Lambda function generates a JWT token, with a 30-second expiration time signed with KMS, then redirects the client back to the static website along with this token.

The client then makes a call to an API Gateway resource acting as a DynamoDB service proxy that retrieves the credentials via a DeleteItem call. To make this call, the client passes the JWT in the authorization header.

A custom authorizer runs to validate the token using the KMS key again as well as the original random code.

Now that the client has credentials, it can use these to access AWS resources.

Tutorial: Backend processing and audit tracking

Before you walk through this tutorial you will need the source code from the samljs-serverless-sample Github Repository. You should use the SAM template provided in order to streamline the process but we’ll outline how you you would manually create resources too. There is a readme in the repository with instructions for using the SAM template. Either way you will still perform the manual steps of KMS key configuration, ADFS enablement of RelayState, and Amazon Cognito Identity Pool creation. The template will automate the process in creating the S3 website, Lambda functions, API Gateway resources and DynamoDB tables.

We walk through the details of all the steps and configuration below for illustrative purposes in this tutorial calling out the sections that can be omitted if you used the SAM template.

KMS key configuration

To sign JWT tokens, you need an encrypted plaintext key, to be stored in KMS. You will need to complete this step even if you use the SAM template.

  1. In the IAM console, choose Encryption Keys, Create Key.
  2. For Alias, type sessionMaster.
  3. For Advanced Options, choose KMS, Next Step.
  4. For Key Administrative Permissions, select your administrative role or user account.
  5. For Key Usage Permissions, you can leave this blank as the IAM Role (next section) will have individual key actions configured. This allows you to perform administrative actions on the set of keys while the Lambda functions have rights to just create data keys for encryption/decryption and use them to sign JWTs.
  6. Take note of the Key ID, which is needed for the Lambda functions.

IAM role configuration

You will need an IAM role for executing your Lambda functions. If you are using the SAM template this can be skipped. The sample code in the GitHub repository under Scenario2 creates separate roles for each function, with limited permissions on individual resources when you use the SAM template. We recommend separate roles scoped to individual resources for production deployments. Your Lambda functions need the following permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1432927122000",
            "Effect": "Allow",
            "Action": [
                "dynamodb:PutItem",
                “dynamodb:GetItem”,
                “dynamodb:DeleteItem”,
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "kms:GenerateDataKey*",
                “kms:Decrypt”
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}

Lambda function configuration

If you are not using the SAM template, create the following three Lambda functions from the GitHub repository in /Scenario2/lambda using the following names and environment variables. The Lambda functions are written in Node.js.

  • GenerateKey_awslabs_samldemo
  • ProcessSAML_awslabs_samldemo
  • SAMLCustomAuth_awslabs_samldemo

The functions above are built, packaged, and uploaded to Lambda. For two of the functions, this can be done from your workstation (the sample commands for each function assume OSX or Linux). The third will need to be built on an AWS EC2 instance running the current Lambda AMI.

GenerateKey_awslabs_samldemo

This function is only used one time to create keys in KMS for signing JWT tokens. The function calls GenerateDataKey and stores the encrypted CipherText blob as Base64 in DynamoDB. This is used by the other two functions for getting the PlainTextKey for signing with a Decrypt operation.

This function only requires a single file. It has the following environment variables:

  • KMS_KEY_ID: Unique identifier from KMS for your sessionMaster Key
  • SESSION_DDB_TABLE: SAMLSessions
  • ENC_CONTEXT: ADFS (or something unique to your organization)
  • RAND_HASH: us-east-1:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX

Navigate into /Scenario2/lambda/GenerateKey and run the following commands:

zip –r generateKey.zip .

aws lambda create-function --function-name GenerateKey_awslabs_samldemo --runtime nodejs4.3 --role LAMBDA_ROLE_ARN --handler index.handler --timeout 10 --memory-size 512 --zip-file fileb://generateKey.zip --environment Variables={SESSION_DDB_TABLE=SAMLSessions,ENC_CONTEXT=ADFS,RAND_HASH=us-east-1:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX,KMS_KEY_ID=<kms key="KEY" id="ID">}

SAMLCustomAuth_awslabs_samldemo

This is an API Gateway custom authorizer called after the client has been redirected to the website as part of the login workflow. This function calls a GET against the service proxy to DynamoDB, retrieving credentials. The function uses the KMS key signing validation of the JWT created in the ProcessSAML_awslabs_samldemo function and also validates the random code that was generated at the beginning of the login workflow.

You must install the dependencies before zipping this function up. It has the following environment variables:

  • SESSION_DDB_TABLE: SAMLSessions
  • ENC_CONTEXT: ADFS (or whatever was used in GenerateKey_awslabs_samldemo)
  • ID_HASH: us-east-1:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX

Navigate into /Scenario2/lambda/CustomAuth and run:

npm install

zip –r custom_auth.zip .

aws lambda create-function --function-name SAMLCustomAuth_awslabs_samldemo --runtime nodejs4.3 --role LAMBDA_ROLE_ARN --handler CustomAuth.handler --timeout 10 --memory-size 512 --zip-file fileb://custom_auth.zip --environment Variables={SESSION_DDB_TABLE=SAMLSessions,ENC_CONTEXT=ADFS,ID_HASH= us-east-1:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX }

ProcessSAML_awslabs_samldemo

This function is called when ADFS sends the SAMLResponse to API Gateway. The function parses the SAML assertion to select a role (based on a simple string search) and extract user information. It then uses this data to get short-term credentials from STS via AssumeRoleWithSAML and stores this information in a SAMLSessions table and tracks the user login via a SAMLUsers table. Both of these are DynamoDB tables but you could also store the user information in another AWS database type, as this is for auditing purposes. Finally, this function creates a JWT (signed with the KMS key) which is only valid for 30 seconds and is returned to the client as part of a 302 redirect from API Gateway.

This function needs to be built on an EC2 server running Amazon Linux. This function leverages two main external libraries:

  • nJwt: Used for secure JWT creation for individual client sessions to get access to their records
  • libxmljs: Used for XML XPath queries of the decoded SAMLResponse from AD FS

Libxmljs uses native build tools and you should run this on EC2 running the same AMI as Lambda and with Node.js v4.3.2; otherwise, you might see errors. For more information about current Lambda AMI information, see Lambda Execution Environment and Available Libraries.

After you have the correct AMI launched in EC2 and have SSH open to that host, install Node.js. Ensure that the Node.js version on EC2 is 4.3.2, to match Lambda. If your version is off, you can roll back with NVM.

After you have set up Node.js, run the following command:

yum install -y make gcc*

Now, create a /saml folder on your EC2 server and copy up ProcessSAML.js and package.json from /Scenario2/lambda/ProcessSAML to the EC2 server. Here is a sample SCP command:

cd ProcessSAML/

ls

package.json    ProcessSAML.js

scp -i ~/path/yourpemfile.pem ./* [email protected]:/home/ec2-user/saml/

Then you can SSH to your server, cd into the /saml directory, and run:

npm install

A successful build should look similar to the following:

lambdasamltwo_2.png

Finally, zip up the package and create the function using the following AWS CLI command and these environment variables. Configure the CLI with your credentials as needed.

  • SESSION_DDB_TABLE: SAMLSessions
  • ENC_CONTEXT: ADFS (or whatever was used in GenerateKeyawslabssamldemo)
  • PRINCIPAL_ARN: Full ARN of the AD FS IdP created in the IAM console
  • USER_DDB_TABLE: SAMLUsers
  • REDIRECT_URL: Endpoint URL of your static S3 website (or CloudFront distribution domain name if you did that optional step)
  • ID_HASH: us-east-1:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
zip –r saml.zip .

aws lambda create-function --function-name ProcessSAML_awslabs_samldemo --runtime nodejs4.3 --role LAMBDA_ROLE_ARN --handler ProcessSAML.handler --timeout 10 --memory-size 512 --zip-file fileb://saml.zip –environment Variables={USER_DDB_TABLE=SAMLUsers,SESSION_DDB_TABLE= SAMLSessions,REDIRECT_URL=<your S3 bucket and test page path>,ID_HASH=us-east-1:XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX,ENC_CONTEXT=ADFS,PRINCIPAL_ARN=<your ADFS IdP ARN>}

If you built the first two functions on your workstation and created the ProcessSAML_awslabs_samldemo function separately in the Lambda console before building on EC2, you can update the code after building on with the following command:

aws lambda update-function-code --function-name ProcessSAML_awslabs_samldemo --zip-file fileb://saml.zip

Role trust policy configuration

This scenario uses STS directly to assume a role. You will need to complete this step even if you use the SAM template. Modify the trust policy, as you did before when Amazon Cognito was assuming the role. In the GitHub repository sample code, ProcessSAML.js is preconfigured to filter and select a role with “Prod” in the name via the selectedRole variable.

This is an example of business logic you can alter in your organization later, such as a callout to an external mapping database for other rules matching. In this tutorial, it corresponds to the ADFS-Production role that was created.

  1. In the IAM console, choose Roles and open the ADFS-Production Role.
  2. Edit the Trust Permissions field and replace the content with the following:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Federated": [
              "arn:aws:iam::ACCOUNTNUMBER:saml-provider/ADFS"
    ]
          },
          "Action": "sts:AssumeRoleWithSAML"
        }
      ]
    }

If you end up using another role (or add more complex filtering/selection logic), ensure that those roles have similar trust policy configurations. Also note that the sample policy above purposely uses an array for the federated provider matching the IdP ARN that you added. If your environment has multiple SAML providers, you could list them here and modify the code in ProcessSAML.js to process requests from different IdPs and grant or revoke credentials accordingly.

DynamoDB table creation

If you are not using the SAM template, create two DynamoDB tables:

  • SAMLSessions: Temporarily stores credentials from STS. Credentials are removed by an API Gateway Service Proxy to the DynamoDB DeleteItem call that simultaneously returns the credentials to the client.
  • SAMLUsers: This table is for tracking user information and the last time they authenticated in the system via ADFS.

The following AWS CLI commands creates the tables (indexed only with a primary key hash, called identityHash and CognitoID respectively):

aws dynamodb create-table \
    --table-name SAMLSessions \
    --attribute-definitions \
        AttributeName=group,AttributeType=S \
    --key-schema AttributeName=identityhash,KeyType=HASH \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
aws dynamodb create-table \
    --table-name SAMLUsers \
    --attribute-definitions \
        AttributeName=CognitoID,AttributeType=S \
    --key-schema AttributeName=CognitoID,KeyType=HASH \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

After the tables are created, you should be able to run the GenerateKey_awslabs_samldemo Lambda function and see a CipherText key stored in SAMLSessions. This is only for convenience of this post, to demonstrate that you should persist CipherText keys in a data store and never persist plaintext keys that have been decrypted. You should also never log plaintext keys in your code.

API Gateway configuration

If you are not using the SAM template, you will need to create API Gateway resources. If you have created resources for Scenario 1 in Part I, then the naming of these resources may be similar. If that is the case, then simply create an API with a different name (SAMLAuth2 or similar) and follow these steps accordingly.

  1. In the API Gateway console for your API, choose Authorizers, Custom Authorizer.
  2. Select your region and enter SAMLCustomAuth_awslabs_samldemo for the Lambda function. Choose a friendly name like JWTParser and ensure that Identity token source is method.request.header.Authorization. This tells the custom authorizer to look for the JWT in the Authorization header of the HTTP request, which is specified in the JavaScript code on your S3 webpage. Save the changes.

    lambdasamltwo_3.png

Now it’s time to wire up the Lambda functions to API Gateway.

  1. In the API Gateway console, choose Resources, select your API, and then create a Child Resource called SAML. This includes a POST and a GET method. The POST method uses the ProcessSAML_awslabs_samldemo Lambda function and a 302 redirect, while the GET method uses the JWTParser custom authorizer with a service proxy to DynamoDB to retrieve credentials upon successful authorization.
  2. lambdasamltwo_4.png

  3. Create a POST method. For Integration Type, choose Lambda and add the ProcessSAML_awslabs_samldemo Lambda function. For Method Request, add headers called RelayState and SAMLResponse.

    lambdasamltwo_5.png

  4. Delete the Method Response code for 200 and add a 302. Create a response header called Location. In the Response Models section, for Content-Type, choose application/json and for Models, choose Empty.

    lambdasamltwo_6.png

  5. Delete the Integration Response section for 200 and add one for 302 that has a Method response status of 302. Edit the response header for Location to add a Mapping value of integration.response.body.location.

    lambdasamltwo_7.png

  6. Finally, in order for Lambda to capture the SAMLResponse and RelayState values, choose Integration Request.

  7. In the Body Mapping Template section, for Content-Type, enter application/x-www-form-urlencoded and add the following template:

    {
    "SAMLResponse" :"$input.params('SAMLResponse')",
    "RelayState" :"$input.params('RelayState')",
    "formparams" : $input.json('$')
    }

  8. Create a GET method with an Integration Type of Service Proxy. Select the region and DynamoDB as the AWS Service. Use POST for the HTTP method and DeleteItem for the Action. This is important as you leverage a DynamoDB feature to return the current records when you perform deletion. This simultaneously allows credentials in this system to not be stored long term and also allows clients to retrieve them. For Execution role, use the Lambda role from earlier or a new role that only has IAM scoped permissions for DeleteItem on the SAMLSessions table.

    lambdasamltwo_8.png

  9. Save this and open Method Request.

  10. For Authorization, select your custom authorizer JWTParser. Add in a header called COGNITO_ID and save the changes.

    lambdasamltwo_9.png

  11. In the Integration Request, add in a header name of Content-Type and a value for Mapped of ‘application/x-amzn-json-1.0‘ (you need the single quotes surrounding the entry).

  12. Next, in the Body Mapping Template section, for Content-Type, enter application/json and add the following template:

    {
        "TableName": "SAMLSessions",
        "Key": {
            "identityhash": {
                "S": "$input.params('COGNITO_ID')"
            }
        },
        "ReturnValues": "ALL_OLD"
    }

Inspect this closely for a moment. When your client passes the JWT in an Authorization Header to this GET method, the JWTParser Custom Authorizer grants/denies executing a DeleteItem call on the SAMLSessions table.

ADF

If it is granted, then there needs to be an item to delete the reference as a primary key to the table. The client JavaScript (seen in a moment) passes its CognitoID through as a header called COGNITO_ID that is mapped above. DeleteItem executes to remove the credentials that were placed there via a call to STS by the ProcessSAML_awslabs_samldemo Lambda function. Because the above action specifies ALL_OLD under the ReturnValues mapping, DynamoDB returns these credentials at the same time.

lambdasamltwo_10.png

  1. Save the changes and open your /saml resource root.
  2. Choose Actions, Enable CORS.
  3. In the Access-Control-Allow-Headers section, add COGNITO_ID into the end (inside the quotes and separated from other headers by a comma), then choose Enable CORS and replace existing CORS headers.
  4. When completed, choose Actions, Deploy API. Use the Prod stage or another stage.
  5. In the Stage Editor, choose SDK Generation. For Platform, choose JavaScript and then choose Generate SDK. Save the folder someplace close. Take note of the Invoke URL value at the top, as you need this for ADFS configuration later.

Website configuration

If you are not using the SAM template, create an S3 bucket and configure it as a static website in the same way that you did for Part I.

If you are using the SAM template this will automatically be created for you however the steps below will still need to be completed:

In the source code repository, edit /Scenario2/website/configs.js.

  1. Ensure that the identityPool value matches your Amazon Cognito Pool ID and the region is correct.
  2. Leave adfsUrl the same if you’re testing on your lab server; otherwise, update with the AD FS DNS entries as appropriate.
  3. Update the relayingPartyId value as well if you used something different from the prerequisite blog post.

Next, download the minified version of the AWS SDK for JavaScript in the Browser (aws-sdk.min.js) and place it along with the other files in /Scenario2/website into the S3 bucket.

Copy the files from the API Gateway Generated SDK in the last section to this bucket so that the apigClient.js is in the root directory and lib folder is as well. The imports for these scripts (which do things like sign API requests and configure headers for the JWT in the Authorization header) are already included in the index.html file. Consult the latest API Gateway documentation if the SDK generation process updates in the future

ADFS configuration

Now that the AWS setup is complete, modify your ADFS setup to capture RelayState information about the client and to send the POST response to API Gateway for processing. You will need to complete this step even if you use the SAM template.

If you’re using Windows Server 2008 with ADFS 2.0, ensure that Update Rollup 2 is installed before enabling RelayState. Please see official Microsoft documentation for specific download information.

  1. After Update Rollup 2 is installed, modify %systemroot%\inetpub\adfs\ls\web.config. If you’re on a newer version of Windows Server running AD FS 3.0, modify %systemroot%\ADFS\Microsoft.IdentityServer.Servicehost.exe.config.
  2. Find the section in the XML marked <Microsoft.identityServer.web> and add an entry for <useRelayStateForIdpInitiatedSignOn enabled="true">. If you have the proper ADFS rollup or version installed, this should allow the RelayState parameter to be accepted by the service provider.
  3. In the ADFS console, open Relaying Party Trusts for Amazon Web Services and choose Endpoints.
  4. For Binding, choose POST and for Invoke URL,enter the URL to your API Gateway from the stage that you noted earlier.

At this point, you are ready to test out your webpage. Navigate to the S3 static website Endpoint URL and it should redirect you to the ADFS login screen. If the user login has been recent enough to have a valid SAML cookie, then you should see the login pass-through; otherwise, a login prompt appears. After the authentication has taken place, you should quickly end up back at your original webpage. Using the browser debugging tools, you see “Successful DDB call” followed by the results of a call to STS that were stored in DynamoDB.

lambdasamltwo_11.png

As in Scenario 1, the sample code under /scenario2/website/index.html has a button that allows you to “ping” an endpoint to test if the federated credentials are working. If you have used the SAM template this should already be working and you can test it out (it will fail at first – keep reading to find out how to set the IAM permissions!). If not go to API Gateway and create a new Resource called /users at the same level of /saml in your API with a GET method.

lambdasamltwo_12.png

For Integration type, choose Mock.

lambdasamltwo_13.png

In the Method Request, for Authorization, choose AWS_IAM. In the Integration Response, in the Body Mapping Template section, for Content-Type, choose application/json and add the following JSON:

{
    "status": "Success",
    "agent": "${context.identity.userAgent}"
}

lambdasamltwo_14.png

Before using this new Mock API as a test, configure CORS and re-generate the JavaScript SDK so that the browser knows about the new methods.

  1. On the /saml resource root and choose Actions, Enable CORS.
  2. In the Access-Control-Allow-Headers section, add COGNITO_ID into the endpoint and then choose Enable CORS and replace existing CORS headers.
  3. Choose Actions, Deploy API. Use the stage that you configured earlier.
  4. In the Stage Editor, choose SDK Generation and select JavaScript as your platform. Choose Generate SDK.
  5. Upload the new apigClient.js and lib directory to the S3 bucket of your static website.

One last thing must be completed before testing (You will need to complete this step even if you use the SAM template) if the credentials can invoke this mock endpoint with AWS_IAM credentials. The ADFS-Production Role needs execute-api:Invoke permissions for this API Gateway resource.

  1. In the IAM console, choose Roles, and open the ADFS-Production Role.

  2. For testing, you can attach the AmazonAPIGatewayInvokeFullAccess policy; however, for production, you should scope this down to the resource as documented in Control Access to API Gateway with IAM Permissions.

  3. After you have attached a policy with invocation rights and authenticated with AD FS to finish the redirect process, choose PING.

If everything has been set up successfully you should see an alert with information about the user agent.

Final Thoughts

We hope these scenarios and sample code help you to not only begin to build comprehensive enterprise applications on AWS but also to enhance your understanding of different AuthN and AuthZ mechanisms. Consider some ways that you might be able to evolve this solution to meet the needs of your own customers and innovate in this space. For example:

  • Completing the CloudFront configuration and leveraging SSL termination for site identification. See if this can be incorporated into the Lambda processing pipeline.
  • Attaching a scope-down IAM policy if the business rules are matched. For example, the default role could be more permissive for a group but if the user is a contractor (username with –C appended) they get extra restrictions applied when assumeRoleWithSaml is called in the ProcessSAML_awslabs_samldemo Lambda function.
  • Changing the time duration before credentials expire on a per-role basis. Perhaps if the SAMLResponse parsing determines the user is an Administrator, they get a longer duration.
  • Passing through additional user claims in SAMLResponse for further logical decisions or auditing by adding more claim rules in the ADFS console. This could also be a mechanism to synchronize some Active Directory schema attributes with AWS services.
  • Granting different sets of credentials if a user has accounts with multiple SAML providers. While this tutorial was made with ADFS, you could also leverage it with other solutions such as Shibboleth and modify the ProcessSAML_awslabs_samldemo Lambda function to be aware of the different IdP ARN values. Perhaps your solution grants different IAM roles for the same user depending on if they initiated a login from Shibboleth rather than ADFS?

The Lambda functions can be altered to take advantage of these options which you can read more about here. For more information about ADFS claim rule language manipulation, see The Role of the Claim Rule Language on Microsoft TechNet.

We would love to hear feedback from our customers on these designs and see different secure application designs that you’re implementing on the AWS platform.