Tag Archives: AWS Step Functions

Fine-grained Continuous Delivery With CodePipeline and AWS Step Functions

Post Syndicated from Richard H Boyd original https://aws.amazon.com/blogs/devops/new-fine-grained-continuous-delivery-with-codepipeline-and-aws-stepfunctions/

Automating your software release process is an important step in adopting DevOps best practices. AWS CodePipeline is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. CodePipeline was modeled after the way that the retail website Amazon.com automated software releases, and many early decisions for CodePipeline were based on the lessons learned from operating a web application at that scale.

However, while most cross-cutting best practices apply to most releases, there are also business-specific requirements driven by domain or regulatory needs. CodePipeline attempts to strike a balance between enforcing best practices out of the box and offering enough flexibility to cover as many use cases as possible.

To support use cases that require fine-grained customization, today we are launching a new AWS CodePipeline action type for starting an AWS Step Functions state machine execution. Previously, accomplishing such a workflow required you to build custom integrations that marshaled data between CodePipeline and Step Functions. Now you can start either a Standard or Express Step Functions state machine during the execution of a pipeline.

With this integration, you can do the following:

  • Conditionally run an Amazon SageMaker hyper-parameter tuning job

  • Write and read values from Amazon DynamoDB, as an atomic transaction, to use in later stages of the pipeline

  • Run an Amazon Elastic Container Service (Amazon ECS) task until some arbitrary condition is satisfied, such as performing integration or load testing

Example Application Overview

In the following use case, you’re working on a machine learning application. This application contains both a machine learning model that your research team maintains and an inference engine that your engineering team maintains. When a new version of either the model or the engine is released, you want to release it as quickly as possible if the latency is reduced and the accuracy improves. If the latency becomes too high, you want the engineering team to review the results and decide on the approval status. If the accuracy drops below some threshold, you want the research team to review the results and decide on the approval status.

This example assumes that a pipeline already exists in CodePipeline, is configured to use a CodeCommit repository as its source, and runs an AWS CodeBuild project in its build stage.

The following diagram illustrates the components built in this post and how they connect to existing infrastructure.

Architecture Diagram for CodePipeline Step Functions integration

First, create a Lambda function that uses Amazon Simple Email Service (Amazon SES) to email either the research or engineering team with the results and the opportunity for them to review it. See the following code:

import json
import os
import boto3
import base64

def lambda_handler(event, context):
    # Render a review email containing PASS/FAIL callback links
    email_contents = """
    <html>
    <body>
    <p><a href="{url_base}/{token}/success">PASS</a></p>
    <p><a href="{url_base}/{token}/fail">FAIL</a></p>
    </body>
    </html>
"""
    # Base URL of the API Gateway REST API that relays the decision to Step Functions
    callback_base = os.environ['URL']
    # Base64-encode the task token so it can be passed safely as a URL path parameter
    token = base64.b64encode(bytes(event["token"], "utf-8")).decode("utf-8")

    formatted_email = email_contents.format(url_base=callback_base, token=token)
    ses_client = boto3.client('ses')
    ses_client.send_email(
        Source='[email protected]',
        Destination={
            'ToAddresses': [event["team_alias"]]
        },
        Message={
            'Subject': {
                'Data': 'PLEASE REVIEW',
                'Charset': 'UTF-8'
            },
            'Body': {
                'Text': {
                    'Data': formatted_email,
                    'Charset': 'UTF-8'
                },
                'Html': {
                    'Data': formatted_email,
                    'Charset': 'UTF-8'
                }
            }
        },
        ReplyToAddresses=[
            '[email protected]',
        ]
    )
    return {}

To set up the Step Functions state machine that orchestrates the approval, use AWS CloudFormation with the following template. The Lambda function you just created is stored as app.py in the email_sender/ directory. See the following code:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  NotifierFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: email_sender/
      Handler: app.lambda_handler
      Runtime: python3.7
      Timeout: 30
      Environment:
        Variables:
          URL: !Sub "https://${TaskTokenApi}.execute-api.${AWS::Region}.amazonaws.com/Prod"
      Policies:
      - Statement:
        - Sid: SendEmail
          Effect: Allow
          Action:
          - ses:SendEmail
          Resource: '*'

  MyStepFunctionsStateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      RoleArn: !GetAtt SFnRole.Arn
      DefinitionString: !Sub |
        {
          "Comment": "A Hello World example of the Amazon States Language using Pass states",
          "StartAt": "ChoiceState",
          "States": {
            "ChoiceState": {
              "Type": "Choice",
              "Choices": [
                {
                  "Variable": "$.accuracypct",
                  "NumericLessThan": 96,
                  "Next": "ResearchApproval"
                },
                {
                  "Variable": "$.latencyMs",
                  "NumericGreaterThan": 80,
                  "Next": "EngineeringApproval"
                }
              ],
              "Default": "SuccessState"
            },
            "EngineeringApproval": {
                 "Type":"Task",
                 "Resource":"arn:aws:states:::lambda:invoke.waitForTaskToken",
                 "Parameters":{  
                    "FunctionName":"${NotifierFunction.Arn}",
                    "Payload":{
                      "latency.$":"$.latencyMs",
                      "team_alias":"[email protected]",
                      "token.$":"$$.Task.Token"
                    }
                 },
                 "Catch": [ {
                    "ErrorEquals": ["HandledError"],
                    "Next": "FailState"
                 } ],
              "Next": "SuccessState"
            },
            "ResearchApproval": {
                 "Type":"Task",
                 "Resource":"arn:aws:states:::lambda:invoke.waitForTaskToken",
                 "Parameters":{  
                    "FunctionName":"${NotifierFunction.Arn}",
                    "Payload":{  
                       "accuracy.$":"$.accuracypct",
                       "team_alias":"[email protected]",
                       "token.$":"$$.Task.Token"
                    }
                 },
                 "Catch": [ {
                    "ErrorEquals": ["HandledError"],
                    "Next": "FailState"
                 } ],
              "Next": "SuccessState"
            },
            "FailState": {
              "Type": "Fail",
              "Cause": "Invalid response.",
              "Error": "Failed Approval"
            },
            "SuccessState": {
              "Type": "Succeed"
            }
          }
        }

  TaskTokenApi:
    Type: AWS::ApiGateway::RestApi
    Properties: 
      Description: REST API that relays manual approval decisions to Step Functions
      Name: TokenHandler
  SuccessResource:
    Type: AWS::ApiGateway::Resource
    Properties:
      ParentId: !Ref TokenResource
      PathPart: "success"
      RestApiId: !Ref TaskTokenApi
  FailResource:
    Type: AWS::ApiGateway::Resource
    Properties:
      ParentId: !Ref TokenResource
      PathPart: "fail"
      RestApiId: !Ref TaskTokenApi
  TokenResource:
    Type: AWS::ApiGateway::Resource
    Properties:
      ParentId: !GetAtt TaskTokenApi.RootResourceId
      PathPart: "{token}"
      RestApiId: !Ref TaskTokenApi
  SuccessMethod:
    Type: AWS::ApiGateway::Method
    Properties:
      HttpMethod: GET
      ResourceId: !Ref SuccessResource
      RestApiId: !Ref TaskTokenApi
      AuthorizationType: NONE
      MethodResponses:
        - ResponseParameters:
            method.response.header.Access-Control-Allow-Origin: true
          StatusCode: 200
      Integration:
        IntegrationHttpMethod: POST
        Type: AWS
        Credentials: !GetAtt APIGWRole.Arn
        Uri: !Sub "arn:aws:apigateway:${AWS::Region}:states:action/SendTaskSuccess"
        IntegrationResponses:
          - StatusCode: 200
            ResponseTemplates:
              application/json: |
                {}
          - StatusCode: 400
            ResponseTemplates:
              application/json: |
                {"uhoh": "Spaghetti O's"}
        RequestTemplates:
          application/json: |
              #set($token=$input.params('token'))
              {
                "taskToken": "$util.base64Decode($token)",
                "output": "{}"
              }
        PassthroughBehavior: NEVER
      OperationName: "TokenResponseSuccess"
  FailMethod:
    Type: AWS::ApiGateway::Method
    Properties:
      HttpMethod: GET
      ResourceId: !Ref FailResource
      RestApiId: !Ref TaskTokenApi
      AuthorizationType: NONE
      MethodResponses:
        - ResponseParameters:
            method.response.header.Access-Control-Allow-Origin: true
          StatusCode: 200
      Integration:
        IntegrationHttpMethod: POST
        Type: AWS
        Credentials: !GetAtt APIGWRole.Arn
        Uri: !Sub "arn:aws:apigateway:${AWS::Region}:states:action/SendTaskFailure"
        IntegrationResponses:
          - StatusCode: 200
            ResponseTemplates:
              application/json: |
                {}
          - StatusCode: 400
            ResponseTemplates:
              application/json: |
                {"uhoh": "Spaghetti O's"}
        RequestTemplates:
          application/json: |
              #set($token=$input.params('token'))
              {
                 "cause": "Failed Manual Approval",
                 "error": "HandledError",
                 "output": "{}",
                 "taskToken": "$util.base64Decode($token)"
              }
        PassthroughBehavior: NEVER
      OperationName: "TokenResponseFail"

  APIDeployment:
    Type: AWS::ApiGateway::Deployment
    DependsOn:
      - FailMethod
      - SuccessMethod
    Properties:
      Description: "Prod Stage"
      RestApiId:
        Ref: TaskTokenApi
      StageName: Prod

  APIGWRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              Service:
                - "apigateway.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Path: "/"
      Policies:
        - PolicyName: root
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action: 
                 - 'states:SendTaskSuccess'
                 - 'states:SendTaskFailure'
                Resource: '*'
  SFnRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              Service:
                - "states.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Path: "/"
      Policies:
        - PolicyName: root
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action: 
                 - 'lambda:InvokeFunction'
                Resource: !GetAtt NotifierFunction.Arn


After you create the CloudFormation stack, you have a state machine, an Amazon API Gateway REST API, a Lambda function, and the roles each resource needs.

Your pipeline invokes the state machine with the load test results, which contain the accuracy and latency statistics. The state machine decides which team, if either, to notify of the results. If the results are positive, it returns a success status without notifying either team. If a team needs to be notified, Step Functions asynchronously invokes the Lambda function and passes in the relevant metric and the team’s email address. The Lambda function renders an email with links to the pass/fail responses so the team can choose the Pass or Fail link to respond to the review. The REST API captures the response and sends it to Step Functions to continue the state machine execution.
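
If you want to exercise the approval workflow before wiring it into the pipeline, you can start an execution of the state machine directly with sample load-test results. The following is a minimal sketch using boto3; the state machine ARN is a placeholder that you would replace with the value from your CloudFormation stack.

import json
import boto3

sfn = boto3.client('stepfunctions')

# Placeholder ARN - replace with the ARN created by your CloudFormation stack
STATE_MACHINE_ARN = 'arn:aws:states:us-west-2:123456789012:stateMachine:MyStepFunctionsStateMachine'

# Sample load-test results; latencyMs above 80 routes to EngineeringApproval
response = sfn.start_execution(
    stateMachineArn=STATE_MACHINE_ARN,
    input=json.dumps({"accuracypct": 100, "latencyMs": 225})
)
print(response['executionArn'])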

The following diagram illustrates the visual workflow of the approval process within the Step Functions state machine.

StepFunctions StateMachine for approving code changes


After you create your state machine, Lambda function, and REST API, return to the CodePipeline console and add the Step Functions integration to your existing release pipeline. Complete the following steps:

  1. On the CodePipeline console, choose Pipelines.
  2. Choose your release pipeline.
    CodePipeline before adding Step Functions integration
  3. Choose Edit.
    CodePipeline Edit View
  4. Under the Edit:Build section, choose Add stage.
  5. Name your stage Release-Approval.
  6. Choose Save.
    You return to the edit view and can see the new stage at the end of your pipeline.
    CodePipeline Edit View with new stage
  7. In the Edit:Release-Approval section, choose Add action group.
  8. Add the Step Functions state machine invocation action to the action group. Use the following settings:
    1. For Action name, enter CheckForRequiredApprovals.
    2. For Action provider, choose AWS Step Functions.
    3. For Region, choose the Region where your state machine is located (this post uses US West (Oregon)).
    4. For Input artifacts, enter BuildOutput (the name you gave the output artifacts in the build stage).
    5. For State machine ARN, choose the state machine you just created.
    6. For Input type, choose File path. (This parameter tells CodePipeline to take the contents of a file and use it as the input for the state machine execution.)
    7. For Input, enter results.json (where you store the results of your load test in the build stage of the pipeline).
    8. For Variable namespace, enter StepFunctions. (This parameter tells CodePipeline to store the state machine ARN and execution ARN for this event in a variable namespace named StepFunctions.)
    9. For Output artifacts, enter ApprovalArtifacts. (This parameter tells CodePipeline to store the results of this execution in an artifact called ApprovalArtifacts.)
    Edit Action Configuration
  9. Choose Done.
    You return to the edit view of the pipeline.
    CodePipeline Edit Configuration
  10. Choose Save.
  11. Choose Release change.

When the pipeline execution reaches the approval stage, it invokes the Step Functions state machine with the results emitted from your build stage. This post hard-codes the load-test results to force an engineering approval by increasing the latency (latencyMs) above the threshold defined in the CloudFormation template (80ms). See the following code:

{
  "accuracypct": 100,
  "latencyMs": 225
}

When the state machine checks the latency and sees that it’s above 80 milliseconds, it invokes the Lambda function with the engineering email address. The engineering team receives a review request email similar to the following screenshot.

review email

If you choose PASS, you send a request to the API Gateway REST API with the Step Functions task token for the current execution, and the API passes the token to Step Functions with the SendTaskSuccess API call. When you return to your pipeline, you can see that the approval was processed and your change is ready for production.
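
Behind the scenes, the GET integration you defined on the /{token}/success and /{token}/fail resources decodes the token and calls the SendTaskSuccess or SendTaskFailure API. If you ever need to respond to a pending approval outside of the email links (for example, while debugging), a sketch of the equivalent boto3 calls is shown below; the task token value is whatever the waiting task emitted.

import boto3

sfn = boto3.client('stepfunctions')

def approve(task_token):
    # Equivalent of the GET /{token}/success integration
    sfn.send_task_success(taskToken=task_token, output='{}')

def reject(task_token):
    # Equivalent of the GET /{token}/fail integration
    sfn.send_task_failure(taskToken=task_token, error='HandledError', cause='Failed Manual Approval')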

Approved code change with stepfunction integration

Cleaning Up

When the engineering and research teams devise a solution that no longer mixes performance information from both teams into a single application, you can remove this integration by deleting the CloudFormation stack that you created and deleting the new CodePipeline stage that you added.

Conclusion

For more information about CodePipeline Actions and the Step Functions integration, see Working with Actions in CodePipeline.

Simplifying application orchestration with AWS Step Functions and AWS SAM

Post Syndicated from Rob Sutter original https://aws.amazon.com/blogs/compute/simplifying-application-orchestration-with-aws-step-functions-and-aws-sam/

Modern software applications consist of multiple components distributed across many services. AWS Step Functions lets you define serverless workflows to orchestrate these services so you can build and update your apps quickly. Step Functions manages its own state and retries when there are errors, enabling you to focus on your business logic. Now, with support for Step Functions in the AWS Serverless Application Model (AWS SAM), you can easily create, deploy, and maintain your serverless applications.

The most recent AWS SAM update introduces the AWS::Serverless::StateMachine component that simplifies the definition of workflows in your application. Because the StateMachine is an AWS SAM component, you can apply AWS SAM policy templates to scope the permissions of your workflows. AWS SAM also provides configuration options for invoking your workflows based on events or a schedule that you specify.

Defining a simple state machine

The simplest way to begin orchestrating your applications with Step Functions and AWS SAM is to install the latest version of the AWS SAM CLI.

Creating a state machine with AWS SAM CLI

To create a state machine with the AWS SAM CLI, perform the following steps:

  1. From a command line prompt, enter sam init
  2. Choose AWS Quick Start Templates
  3. Select nodejs12.x as the runtime
  4. Provide a project name
  5. Choose the Hello World Example quick start application template

Screen capture showing the first execution of sam init selecting the Hello World Example quick start application template

The AWS SAM CLI downloads the quick start application template and creates a new directory with sample code. Change into the sam-app directory and replace the contents of template.yaml with the following code:


# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  SimpleStateMachine:
    Type: AWS::Serverless::StateMachine
    Properties:
      Definition:
        StartAt: Single State
        States:
          Single State:
            Type: Pass
            End: true
      Policies:
        - CloudWatchPutMetricPolicy: {}

This is a simple yet complete template that defines a Step Functions Standard Workflow with a single Pass state. The Transform: AWS::Serverless-2016-10-31 line indicates that this is an AWS SAM template and not a basic AWS CloudFormation template. This enables the AWS::Serverless components and policy templates such as CloudWatchPutMetricPolicy on the last line, which allows you to publish metrics to Amazon CloudWatch.

Deploying a state machine with AWS SAM CLI

To deploy your state machine with the AWS SAM CLI:

  1. Save your template.yaml file
  2. Delete any function code in the directory, such as hello-world
  3. Enter sam deploy --guided into the terminal and follow the prompts
  4. Enter simple-state-machine as the stack name
  5. Select the defaults for the remaining prompts

Screen capture showing the first execution of sam deploy --guided

For additional information on visualizing, executing, and monitoring your workflow, see the tutorial Create a Step Functions State Machine Using AWS SAM.
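
You can also exercise the deployed workflow programmatically. The following is a minimal sketch that assumes you look up the generated state machine ARN (for example, from the CloudFormation stack outputs or the Step Functions console); because the only state is a Pass state, the execution output simply echoes the input.

import json
import time
import boto3

sfn = boto3.client('stepfunctions')

# Placeholder - replace with the ARN of the SimpleStateMachine created by your stack
state_machine_arn = 'arn:aws:states:us-east-1:123456789012:stateMachine:SimpleStateMachine-abc123'

execution = sfn.start_execution(
    stateMachineArn=state_machine_arn,
    input=json.dumps({"hello": "world"})
)

# Standard Workflows run asynchronously, so poll briefly for the result
while True:
    result = sfn.describe_execution(executionArn=execution['executionArn'])
    if result['status'] != 'RUNNING':
        break
    time.sleep(1)

print(result['status'], result.get('output'))  # the Pass state echoes the input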

Refining your workflow

The StateMachine component not only simplifies creation of your workflows, but also provides powerful control over how your workflow executes. You can compose complex workflows from all available Amazon States Language (ASL) states. Definition substitutions let you reference other resources in your template from within the workflow definition. Finally, you can manage access permissions using AWS Identity and Access Management (IAM) policies and roles.

Service integrations

Step Functions service integrations allow you to call other AWS services directly from Task states. The following example shows you how to use a service integration to store information about a workflow execution directly in an Amazon DynamoDB table. Replace the Resources section of your template.yaml file with the following code:


Resources:
  SAMTable:
    Type: AWS::Serverless::SimpleTable

  SimpleStateMachine:
    Type: AWS::Serverless::StateMachine
    Properties:
      Definition:
        StartAt: FirstState
        States:
          FirstState:
            Type: Pass
            Next: Write to DynamoDB
          Write to DynamoDB:
            Type: Task
            Resource: arn:aws:states:::dynamodb:putItem
            Parameters:
              TableName: !Ref SAMTable
              Item:
                id:
                  S.$: $$.Execution.Id
            ResultPath: $.DynamoDB
            End: true
      Policies:
        - DynamoDBWritePolicy: 
            TableName: !Ref SAMTable

The AWS::Serverless::SimpleTable is an AWS SAM component that creates a DynamoDB table with on-demand capacity and reasonable defaults. To learn more, see the SimpleTable component documentation.

The Write to DynamoDB state is a Task with a service integration to the DynamoDB PutItem API call. The above code stores a single item with a field id containing the execution ID, taken from the context object of the current workflow execution.

Notice that DynamoDBWritePolicy replaces the CloudWatchPutMetricPolicy policy from the previous workflow. This is another AWS SAM policy template that provides write access only to a named DynamoDB table.
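
After deploying and running this workflow, you can confirm that the service integration wrote the item by reading it back with the execution ID as the key. This is a quick sketch; the table name and execution ARN below are placeholders you would take from your stack outputs and a recent run.

import boto3

dynamodb = boto3.client('dynamodb')

# Placeholders - use your deployed table name and a real execution ARN
table_name = 'simple-state-machine-SAMTable-ABC123'
execution_arn = 'arn:aws:states:us-east-1:123456789012:execution:SimpleStateMachine:example'

# The workflow stored the execution ID in the string attribute "id"
item = dynamodb.get_item(
    TableName=table_name,
    Key={'id': {'S': execution_arn}}
)
print(item.get('Item'))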

Definition substitutions

AWS SAM supports definition substitutions when defining a StateMachine resource. Definition substitutions work like template string substitution. First, you specify a Definition or DefinitionUri property of the StateMachine that contains variables specified in ${dollar_sign_brace} notation. Then you provide values for those variables as a map via the DefinitionSubstitutions property.

The AWS SAM CLI provides a quick start template that demonstrates definition substitutions. To create a workflow using this template, perform the following steps:

  1. From a command line prompt in an empty directory, enter sam init
  2. Choose AWS Quick Start Templates
  3. Select your preferred runtime
  4. Provide a project name
  5. Choose the Step Functions Sample App (Stock Trader) quick start application template

Screen capture showing the execution of sam init selecting the Step Functions Sample App (Stock Trader) quick start application template

Change into the newly created directory and open the template.yaml file with your preferred text editor. Note that the state machine definition is now provided as a path to a file via the DefinitionUri property, rather than inline as in your previous template. The DefinitionSubstitutions property is a map of key-value pairs. These pairs should match variables in the statemachine/stockTrader.asl.json file referenced under DefinitionUri.


      DefinitionUri: statemachine/stockTrader.asl.json
      DefinitionSubstitutions:
        StockCheckerFunctionArn: !GetAtt StockCheckerFunction.Arn
        StockSellerFunctionArn: !GetAtt StockSellerFunction.Arn
        StockBuyerFunctionArn: !GetAtt StockBuyerFunction.Arn
        DDBPutItem: !Sub arn:${AWS::Partition}:states:::dynamodb:putItem
        DDBTable: !Ref TransactionTable

Open the statemachine/stockTrader.asl.json file and look for the first state, Check Stock Value. The Resource property for this state is not a Lambda function ARN, but a replacement expression, “${StockCheckerFunctionArn}”. You see from DefinitionSubstitutions that this maps to the ARN of the StockCheckerFunction resource, an AWS::Serverless::Function also defined in template.yaml. AWS SAM CLI transforms these components into a complete, standard CloudFormation template at deploy time.

Separating the state machine definition into its own file allows you to benefit from integration with the AWS Toolkit for Visual Studio Code. With your state machine in a separate file, you can make changes and visualize your workflow within the IDE while still referencing it from your AWS SAM template.

Screen capture of a rendering of the AWS Step Functions workflow from the Step Functions Sample App (Stock Trader) quick start application template

Managing permissions and access

AWS SAM support allows you to apply policy templates to your state machines. AWS SAM policy templates provide pre-defined IAM policies for common scenarios. These templates appropriately limit the scope of permissions for your state machine while simultaneously simplifying your AWS SAM templates. You can also apply AWS managed policies to your state machines.

If AWS SAM policy templates and AWS managed policies do not suit your needs, you can also create inline policies or attach an IAM role. This allows you to tailor the permissions of your state machine to your exact use case.

Additional configuration

AWS SAM provides additional simplification for configuring event sources and logging.

Event sources

Event sources determine what events can start execution of your workflow. These sources can include HTTP requests to Amazon API Gateway REST APIs and Amazon EventBridge rules. For example, the following Events block creates an API Gateway REST API. Whenever that API receives an HTTP POST request to the path /request, it starts an execution of the state machine:


      Events:
        HttpRequest:
          Type: Api
          Properties:
            Method: POST
            Path: /request
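
Once deployed, sending an HTTP POST to the /request path of the generated API starts a new execution. The following is a minimal sketch using Python’s standard library; the endpoint URL is a placeholder that you would take from the stack outputs or the API Gateway console.

import json
import urllib.request

# Placeholder endpoint - replace with your deployed API's invoke URL
url = 'https://abc123.execute-api.us-east-1.amazonaws.com/Prod/request'

req = urllib.request.Request(
    url,
    data=json.dumps({"example": "payload"}).encode('utf-8'),
    headers={'Content-Type': 'application/json'},
    method='POST'
)
with urllib.request.urlopen(req) as resp:
    # The integration responds with details of the execution it started
    print(resp.read().decode('utf-8'))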

Event sources can also start executions of your workflow on a schedule that you specify. The quick start template you created above provides the following example. When this event source is enabled, the workflow executes once every hour:


      Events:
        HourlyTradingSchedule:
          Type: Schedule 
          Properties:
            Enabled: False
            Schedule: "rate(1 hour)"

Architecture diagram for the Step Functions Sample App (Stock Trader) quick start application template

To learn more about schedules as event sources, see the AWS SAM documentation on GitHub.

Logging

Both Standard Workflows and Express Workflows support logging execution history to CloudWatch Logs. To enable logging for your workflow, you must define an AWS::Logs::LogGroup and add a Logging property to your StateMachine definition. You also must attach an IAM policy or role that provides sufficient permissions to create and publish logs. The following code shows how to add logging to an existing workflow:


Resources:
  SAMLogs:
    Type: AWS::Logs::LogGroup

  SimpleStateMachine:
    Type: AWS::Serverless::StateMachine
    Properties:
      Definition: {…}
      Logging:
        Destinations:
          - CloudWatchLogsLogGroup: 
              LogGroupArn: !GetAtt SAMLogs.Arn
        IncludeExecutionData: true
        Level: ALL
      Policies:
        - CloudWatchLogsFullAccess
      Type: EXPRESS


Conclusion

Step Functions workflows simplify orchestration of distributed services and accelerate application development. AWS SAM support for Step Functions compounds those benefits by helping you build, deploy, and monitor your workflows more quickly and more precisely. In this post, you learned how to use AWS SAM to define simple workflows and more complex workflows with service integrations. You also learned how to manage security permissions, event sources, and logging for your Step Functions workflows.

To learn more about building with Step Functions, see the AWS Step Functions playlist on the AWS Serverless YouTube channel. To learn more about orchestrating modern, event-driven applications with Step Functions, see the App 2025 playlist.

Now go build!

Best practices for organizing larger serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/best-practices-for-organizing-larger-serverless-applications/

Well-designed serverless applications are decoupled, stateless, and use minimal code. As projects grow, a goal for development managers is to maintain the simplicity of design and low-code implementation. This blog post provides recommendations for designing and managing code repositories in larger serverless projects, and best practices for deploying releases of production systems.

Organizing your code repositories

Many serverless applications begin as monolithic applications. This can occur either because a simple application has grown more complex over time, or because developers are following existing development practices. A monolithic application is represented by a single AWS Lambda function performing multiple tasks, and a mono-repo is a single repository containing the entire application logic.

Monoliths work well for the simplest serverless applications that perform single-purpose functions. These are small applications such as cron jobs, data processing tasks, and some asynchronous processes. As those applications evolve into workflows or develop new features, it becomes important to refactor the code into smaller services.

Using frameworks such as the AWS Serverless Application Model (SAM) or the Serverless Framework can make it easier to group common pieces of functionality into smaller services. Each of these can have a separate code repository. For SAM, the template.yaml file contains all the resources and function definitions needed for an application. Consequently, breaking an application into microservices with separate templates is a simple way to split repos and resource groups.

Separate templates for microservices

At the smallest scale, it’s also possible to create one repository per function. If these functions are independent and do not share other AWS resources, this may be appropriate. Helper functions and simple event processing code are examples of candidates for this kind of repo structure.

In most cases, it makes sense to create repos around groups of functions and resources that define a microservice. In an ecommerce example, “Payment processing” is a microservice with multiple smaller related functions that share common resources.

As with any software, the repo design depends upon the use-case and structure of development teams. One large repo makes it harder for developer teams to work on different features, and test and deploy. Having too many repos can create duplicate code, and difficulty in sharing resources across repos. Finding the balance for your project is an important step in designing your application architecture.

Using AWS services instead of code libraries

AWS services are important building blocks for your serverless applications. These can frequently provide greater scale, performance, and reliability than bundled code packages with similar functionality.

For example, many web applications that are migrated to Lambda use web frameworks like Flask (for Python) or Express (for Node.js). Both packages support routing and separate user contexts that are well suited if the application is running on a web server. Using these packages in Lambda functions results in architectures like this:

Web servers in Lambda functions

In this case, Amazon API Gateway proxies all requests to the Lambda function to handle routing. As the application develops more routes, the Lambda function grows in size and deployments of new versions replace the entire function. It becomes harder for multiple developers to work on the same project in this context.

This approach is generally unnecessary, and it’s often better to take advantage of the native routing functionality available in API Gateway. In many cases, the web framework is not needed in the Lambda function at all, and only increases the size of the deployment package. API Gateway is also capable of validating parameters, reducing the need for checking parameters with custom code. It can also provide protection against unauthorized access, and a range of other features more suited to be handled at the service level. When using API Gateway this way, the new architecture looks like this:

Using API Gateway for routing

Additionally, the Lambda functions consist of less code and fewer package dependencies. This makes testing easier and reduces the need to maintain code library versions. Different developers in a team can work on separate routing functions independently, and it becomes simpler to reuse code in future projects. You can configure routes in API Gateway in the application’s SAM template:

Resources:
  GetProducts:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: getProducts/
      Handler: app.handler
      Runtime: nodejs12.x
      Events:
        GetProductsAPI:
          Type: Api 
          Properties:
            Path: /getProducts
            Method: get

Similarly, you should usually avoid performing workflow orchestrations within Lambda functions. These are sections of code that call out to other services and functions, and perform subsequent actions based on successful execution or failure.

Lambda functions with embedded workflow orchestrations

These workflows quickly become fragile and difficult to modify for new requirements. They can cause idling in the Lambda function, meaning that the function is waiting for return values from external sources, increasing the cost of execution.

Often, a better approach is to use AWS Step Functions, which can represent complex workflows as JSON definitions in the application’s SAM template. This service reduces the amount of custom code required, and enables long-lived workflows that minimize idling in Lambda functions. It also manages in-flight executions as workflows are upgraded. The example above, rearchitected with a Step Functions workflow, looks like this:

Using Step Functions for orchestration

Using multiple AWS accounts for development teams

There are many ways to deploy serverless applications to production. As applications grow and become more important to your business, development managers generally want to improve the robustness of the deployment process. You have a number of options within AWS for managing the development and deployment of serverless applications.

First, it is highly recommended to use more than one AWS account. Using AWS Organizations, you can centrally manage the billing, compliance, and security of these accounts. You can attach policies to groups of accounts to avoid custom scripts and manual processes. One simple approach is to provide each developer with an AWS account, and then use separate accounts for a beta deployment stage and production:

Multiple AWS accounts in a deployment pipeline

The developer accounts can contain copies of production resources and provide the developer with admin-level permissions to these resources. Each developer has their own set of limits for the account, so their usage does not impact your production environment. Individual developers can deploy CloudFormation stacks and SAM templates into these accounts with minimal risk to production assets.

This approach allows developers to test Lambda functions locally on their development machines against live cloud resources in their individual accounts. It can help create a robust unit testing process, and developers can then push code to a repository like AWS CodeCommit when ready.

By integrating with AWS Secrets Manager, you can store different sets of secrets in each environment and eliminate any need for credentials stored in code. As code is promoted from developer account through to the beta and production accounts, the correct set of credentials is automatically used. You do not need to share environment-level credentials with individual developers.
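
For example, rather than embedding a database password in code or in environment-specific config files, each account can hold its own secret under the same name, and the function resolves it at runtime. The following is a minimal sketch, assuming a hypothetical secret named prod/app/db-credentials exists in the account the code runs in:

import json
import boto3

secrets = boto3.client('secretsmanager')

def get_db_credentials():
    # The same secret name exists in the developer, beta, and production accounts,
    # so the code is identical across environments - only the stored value differs.
    response = secrets.get_secret_value(SecretId='prod/app/db-credentials')  # hypothetical secret name
    return json.loads(response['SecretString'])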

It’s also possible to implement a CI/CD process to start build pipelines when code is deployed. To deploy a sample application using a multi-account deployment flow, follow this serverless CI/CD tutorial.

Managing feature releases in serverless applications

As you implement CI/CD pipelines for your production serverless applications, it is best practice to favor safe deployments over entire application upgrades. Unlike traditional software deployments, serverless applications are a combination of custom code in Lambda functions and AWS service configurations.

A feature release may consist of a version change in a Lambda function. It may have a different endpoint in API Gateway, or use a new resource such as a DynamoDB table. Access to the deployed feature may be controlled via user configuration and feature toggles, depending upon the application. AWS SAM has AWS CodeDeploy built-in, which allows you to configure canary deployments in the YAML configuration:

Resources:
 GetProducts:
   Type: AWS::Serverless::Function
   Properties:
     CodeUri: getProducts/
     Handler: app.handler
     Runtime: nodejs12.x

     AutoPublishAlias: live

     DeploymentPreference:
       Type: Canary10Percent10Minutes 
       Alarms:
         # A list of alarms that you want to monitor
         - !Ref AliasErrorMetricGreaterThanZeroAlarm
         - !Ref LatestVersionErrorMetricGreaterThanZeroAlarm
       Hooks:
         # Validation Lambda functions run before/after traffic shifting
         PreTraffic: !Ref PreTrafficLambdaFunction
         PostTraffic: !Ref PostTrafficLambdaFunction

CodeDeploy automatically creates aliases pointing to the old and new versions of a function. The canary deployment enables you to gradually shift traffic from the old to the new alias as you become confident that the new version is working as expected, or to roll back the update if needed. You can also set PreTraffic and PostTraffic hooks to invoke Lambda functions before and after traffic shifting.
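
A validation hook is a regular Lambda function that runs its checks and then reports the outcome back to CodeDeploy so the deployment can continue or roll back. The following is a minimal sketch of a PreTraffic hook, assuming the placeholder check is replaced with your own validation logic:

import boto3

codedeploy = boto3.client('codedeploy')

def handler(event, context):
    # CodeDeploy passes these identifiers so the hook can report its result
    deployment_id = event['DeploymentId']
    hook_execution_id = event['LifecycleEventHookExecutionId']

    # Run your own validation against the new version here (hypothetical check)
    validation_passed = True

    codedeploy.put_lifecycle_event_hook_execution_status(
        deploymentId=deployment_id,
        lifecycleEventHookExecutionId=hook_execution_id,
        status='Succeeded' if validation_passed else 'Failed'
    )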

Conclusion

As any software application grows in size, it’s important for development managers to organize code repositories and manage releases. There are established patterns in serverless to help manage larger applications. Generally, it’s best to avoid monolithic functions and mono-repos, and you should scope repositories to either the microservice or function level.

Well-designed serverless applications use custom code in Lambda functions to connect with managed services. It’s important to identify libraries and packages that can be replaced with services to minimize the deployment size and simplify the code base. This is especially true in applications that have been migrated from server-based environments.

Using AWS Organizations, you can manage groups of accounts and give your developers their own AWS accounts for development. This enables engineers to clone production assets and test against the AWS Cloud when writing and debugging code. You can use a CI/CD pipeline to push code through a beta environment to production, while safeguarding secrets using Secrets Manager. You can also use CodeDeploy to manage canary deployments easily.

To learn more about deploying Lambda functions with SAM and CodeDeploy, follow the steps in this tutorial.

New – Building a Continuous Integration Workflow with Step Functions and AWS CodeBuild

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-building-a-continuous-integration-workflow-with-step-functions-and-aws-codebuild/

Automating your software build is an important step to adopt DevOps best practices. To help you with that, we built AWS CodeBuild, a fully managed continuous integration service that compiles source code, runs tests, and produces packages that are ready for deployment.

However, there are many possible customizations in our customers’ build processes, and we have seen developers spend time creating their own custom workflows to coordinate the different activities required by their software builds. For example, you may want to run some tests only in certain cases, or skip static analysis of your code when you need to deploy a quick fix. Depending on the results of your unit tests, you may want to take different actions, or be notified via SNS.

To simplify that, we are launching today a new AWS Step Functions service integration with CodeBuild. Now, during the execution of a state machine, you can start or stop a build, get build report summaries, and delete past build execution records.

In this way, you can define your own workflow-driven build process, and trigger it manually or automatically.

With this integration, you can use the full capabilities of Step Functions to automate your software builds. For example, you can use a Parallel state to create parallel builds for independent components of the build. Starting from a list of all the branches in your code repository, you can use a Map state to run a set of steps (automating build, unit tests, and integration tests) for each branch. You can also use other Step Functions service integrations in the same workflow. For instance, you can send a message to an SQS queue to track your activities, or start a containerized application you just built using Amazon ECS and AWS Fargate.
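
As a sketch of the branch-based idea, the snippet below lists repository branches and passes them as the Map state’s input. The repository name and state machine ARN are placeholders, and the workflow is assumed to iterate over a $.branches array.

import json
import boto3

codecommit = boto3.client('codecommit')
sfn = boto3.client('stepfunctions')

# Placeholders - your repository name and the ARN of a workflow whose Map state iterates over $.branches
REPO_NAME = 'binary-converter'
STATE_MACHINE_ARN = 'arn:aws:states:us-east-1:123456789012:stateMachine:BranchBuildWorkflow'

# First page of branches only, for brevity
branches = codecommit.list_branches(repositoryName=REPO_NAME)['branches']

# Each branch becomes one iteration of the Map state in the workflow
sfn.start_execution(
    stateMachineArn=STATE_MACHINE_ARN,
    input=json.dumps({"branches": branches})
)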

Using Step Functions for a Workflow-Driven Build Process
I am working on a Java web application. To be sure that it works as I add new features, I wrote a few tests using JUnit Jupiter. I want those tests to run just after the build process, but not always, because tests can slow down quick iterations. When I run tests, I want to store and view the test reports using CodeBuild. At the end, I want to be notified through an SNS topic about whether the tests ran and whether they were successful.

I created a repository in CodeCommit and I included two buildspec files for CodeBuild:

  • buildspec.yml is the default; it uses Apache Maven to run the build and the tests, and then stores the test results as reports.
version: 0.2
phases:
  build:
    commands:
      - mvn package
artifacts:
  files:
    - target/binary-converter-1.0-SNAPSHOT.jar
reports:
  SurefireReports:
    files:
      - '**/*'
    base-directory: 'target/surefire-reports'
  • buildspec-notests.yml runs only the build; no tests are executed.
version: 0.2
phases:
  build:
    commands:
      - mvn package -DskipTests
artifacts:
  files:
    - target/binary-converter-1.0-SNAPSHOT.jar

To set up the CodeBuild project and the Step Functions state machine to automate the build, I am using AWS CloudFormation with the following template:

AWSTemplateFormatVersion: 2010-09-09
Description: AWS Step Functions sample project for getting notified on AWS CodeBuild test report results
Resources:
  CodeBuildStateMachine:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      RoleArn: !GetAtt [ CodeBuildExecutionRole, Arn ]
      DefinitionString:
        !Sub
          - |-
            {
              "Comment": "An example of using CodeBuild to run (or not run) tests, get test results and send a notification.",
              "StartAt": "Run Tests?",
              "States": {
                "Run Tests?": {
                  "Type": "Choice",
                  "Choices": [
                    {
                      "Variable": "$.tests",
                      "BooleanEquals": false,
                      "Next": "Trigger CodeBuild Build Without Tests"
                    }
                  ],
                  "Default": "Trigger CodeBuild Build With Tests"
                },
                "Trigger CodeBuild Build With Tests": {
                  "Type": "Task",
                  "Resource": "arn:${AWS::Partition}:states:::codebuild:startBuild.sync",
                  "Parameters": {
                    "ProjectName": "${projectName}"
                  },
                  "Next": "Get Test Results"
                },
                "Trigger CodeBuild Build Without Tests": {
                  "Type": "Task",
                  "Resource": "arn:${AWS::Partition}:states:::codebuild:startBuild.sync",
                  "Parameters": {
                    "ProjectName": "${projectName}",
                    "BuildspecOverride": "buildspec-notests.yml"
                  },
                  "Next": "Notify No Tests"
                },
                "Get Test Results": {
                  "Type": "Task",
                  "Resource": "arn:${AWS::Partition}:states:::codebuild:batchGetReports",
                  "Parameters": {
                    "ReportArns.$": "$.Build.ReportArns"
                  },
                  "Next": "All Tests Passed?"
                },
                "All Tests Passed?": {
                  "Type": "Choice",
                  "Choices": [
                    {
                      "Variable": "$.Reports[0].Status",
                      "StringEquals": "SUCCEEDED",
                      "Next": "Notify Success"
                    }
                  ],
                  "Default": "Notify Failure"
                },
                "Notify Success": {
                  "Type": "Task",
                  "Resource": "arn:${AWS::Partition}:states:::sns:publish",
                  "Parameters": {
                    "Message": "CodeBuild build tests succeeded",
                    "TopicArn": "${snsTopicArn}"
                  },
                  "End": true
                },
                "Notify Failure": {
                  "Type": "Task",
                  "Resource": "arn:${AWS::Partition}:states:::sns:publish",
                  "Parameters": {
                    "Message": "CodeBuild build tests failed",
                    "TopicArn": "${snsTopicArn}"
                  },
                  "End": true
                },
                "Notify No Tests": {
                  "Type": "Task",
                  "Resource": "arn:${AWS::Partition}:states:::sns:publish",
                  "Parameters": {
                    "Message": "CodeBuild build without tests",
                    "TopicArn": "${snsTopicArn}"
                  },
                  "End": true
                }
              }
            }
          - {snsTopicArn: !Ref SNSTopic, projectName: !Ref CodeBuildProject}
  SNSTopic:
    Type: AWS::SNS::Topic
  CodeBuildProject:
    Type: AWS::CodeBuild::Project
    Properties:
      ServiceRole: !Ref CodeBuildServiceRole
      Artifacts:
        Type: NO_ARTIFACTS
      Environment:
        Type: LINUX_CONTAINER
        ComputeType: BUILD_GENERAL1_SMALL
        Image: aws/codebuild/standard:2.0
      Source:
        Type: CODECOMMIT
        Location: https://git-codecommit.us-east-1.amazonaws.com/v1/repos/binary-converter
  CodeBuildExecutionRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action: "sts:AssumeRole"
            Principal:
              Service: states.amazonaws.com
      Path: "/"
      Policies:
        - PolicyName: CodeBuildExecutionRolePolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - "sns:Publish"
                Resource:
                  - !Ref SNSTopic
              - Effect: Allow
                Action:
                  - "codebuild:StartBuild"
                  - "codebuild:StopBuild"
                  - "codebuild:BatchGetBuilds"
                  - "codebuild:BatchGetReports"
                Resource: "*"
              - Effect: Allow
                Action:
                  - "events:PutTargets"
                  - "events:PutRule"
                  - "events:DescribeRule"
                Resource:
                  - !Sub "arn:${AWS::Partition}:events:${AWS::Region}:${AWS::AccountId}:rule/StepFunctionsGetEventForCodeBuildStartBuildRule"
  CodeBuildServiceRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action: "sts:AssumeRole"
            Principal:
              Service: codebuild.amazonaws.com
      Path: /
      Policies:
        - PolicyName: CodeBuildServiceRolePolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                - "logs:CreateLogGroup"
                - "logs:CreateLogStream"
                - "logs:PutLogEvents"
                - "codebuild:CreateReportGroup"
                - "codebuild:CreateReport"
                - "codebuild:UpdateReport"
                - "codebuild:BatchPutTestCases"
                - "codecommit:GitPull"
                Resource: "*"
Outputs:
  StateMachineArn:
    Value: !Ref CodeBuildStateMachine
  ExecutionInput:
    Description: Sample input to StartExecution.
    Value:
      >
        {}

When the CloudFormation stack has been created, there are two CodeBuild tasks in the state machine definition:

  • The first CodeBuild task uses a synchronous integration (startBuild.sync) to automatically wait for the build to terminate before progressing to the next step:
"Trigger CodeBuild Build With Tests": {
  "Type": "Task",
  "Resource": "arn:aws:states:::codebuild:startBuild.sync",
  "Parameters": {
    "ProjectName": "CodeBuildProject-HaVamwTeX8kM"
  },
  "Next": "Get Test Results"
}
  • The second CodeBuild task uses the BuildspecOverride parameter to override the default buildspec file used by the build with the one not running tests:
"Trigger CodeBuild Build Without Tests": {
  "Type": "Task",
  "Resource": "arn:aws:states:::codebuild:startBuild.sync",
  "Parameters": {
    "ProjectName": "CodeBuildProject-HaVamwTeX8kM",
    "BuildspecOverride": "buildspec-notests.yml"
  },
  "Next": "Notify No Tests"
},

The first step is a Choice that looks at the input of the state machine execution to decide whether to run tests. For example, to run tests I can provide this input:

{
  "tests": true
}

This is the visual workflow of the execution running tests; all tests passed.

I change the value of "tests" to false, and start a new execution that follows a different branch.

This time the buildspec does not execute tests, and I get a notification that no tests were run.

When starting this workflow automatically after an activity on GitHub or CodeCommit, I could look into the last commit message for specific patterns, and customize the build process accordingly. For example, I could skip tests if the [skip tests] string is part of the commit message. Similarly, in a production environment I could skip code static analysis, to have faster integration for urgent changes, if the [skip static analysis] string is included in the commit message.
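
As a sketch of that idea, a small Lambda function triggered by the repository event could read the latest commit message and choose the input for the state machine execution. The repository name, state machine ARN, and event structure below are simplified placeholders.

import json
import boto3

codecommit = boto3.client('codecommit')
sfn = boto3.client('stepfunctions')

# Placeholders - replace with your repository name and state machine ARN
REPO_NAME = 'binary-converter'
STATE_MACHINE_ARN = 'arn:aws:states:us-east-1:123456789012:stateMachine:CodeBuildStateMachine'

def handler(event, context):
    # Commit ID taken from the CodeCommit trigger event (structure simplified here)
    commit_id = event['Records'][0]['codecommit']['references'][0]['commit']
    message = codecommit.get_commit(
        repositoryName=REPO_NAME, commitId=commit_id
    )['commit']['message']

    # Skip tests when the commit message asks for it
    run_tests = '[skip tests]' not in message
    sfn.start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        input=json.dumps({"tests": run_tests})
    )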

Extending the Workflow for Containerized Applications
A great way to distribute applications to different environments is to package them as Docker images. In this way, I can also add a step to my build workflow and start the containerized application in an Amazon ECS task (running on AWS Fargate) for the Quality Assurance (QA) team.

First, I create an image repository in ECR and add permissions to the service role used by the CodeBuild project to upload to ECR, as described here.

Then, in the code repository, I follow this example to add:

  • A Dockerfile to prepare the Docker container with the software build, and start the application.
  • A buildspec-docker.yml file with the commands to create and upload the Docker image.

The final workflow is automating all these steps:

  1. Building the software from the source code.
  2. Creating the Docker image.
  3. Uploading the Docker image to ECR.
  4. Starting the QA environment on ECS and Fargate.
  5. Sending an SNS notification that the QA environment is ready.

The workflow and its steps can easily be customized based on your requirements. For example, with a few changes, you can adapt the buildspec file to push the image to Docker Hub.

Available Now
The CodeBuild service integration is available in all commercial and GovCloud regions where Step Functions and CodeBuild services are offered. For regional availability, please see the AWS Region Table. For more information, please look at the documentation.

As AWS Serverless Hero Gojko Adzic pointed out on the AWS DevOps Blog, CodeBuild can also be used to execute administrative tasks. The integration with Step Functions opens a whole set of new possibilities.

Let me know what you are going to use this new service integration for!

Danilo

Creating a scalable serverless import process for Amazon DynamoDB

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/creating-a-scalable-serverless-import-process-for-amazon-dynamodb/

Amazon DynamoDB is a web-scale NoSQL database designed to provide low latency access to data. It’s well suited to many serverless applications as a primary data store, and fits into many common enterprise architectures. In this post, I show how you can import large amounts of data to DynamoDB using a serverless approach. This uses Amazon S3 as a staging area and AWS Lambda for the custom business logic.

This pattern is useful as a general import mechanism into DynamoDB because it separates the challenge of scaling from the data transformation logic. The incoming data is stored in S3 objects, formatted as JSON, CSV, or any custom format your applications produce. The process works whether you import only a few large files or many small files. It takes advantage of parallelization to import data quickly into a DynamoDB table.

Using S3-to-Lambda to import at scale to DynamoDB.

This is useful for applications where upstream services produce transaction information, and can be effective for handling data generated by spiky workloads. Alternatively, it’s also a simple way to migrate from another data source to DynamoDB, especially for large datasets.

In this blog post, I show two different import applications. The first is a direct import into the DynamoDB table. The second explores a more advanced method for smoothing out volume in the import process. The code uses the AWS Serverless Application Model (SAM), enabling you to deploy the application in your own AWS Account. This walkthrough creates resources covered in the AWS Free Tier but you may incur cost for large data imports.

To set up both example applications, visit the GitHub repo and follow the instructions in the README.md file.

Directly importing data from S3 to DynamoDB

The first example application loads data directly from S3 to DynamoDB via a Lambda function. This uses the following architecture:

Architecture for the first example application.

  1. A downstream process creates source import data in JSON format and writes to an S3 bucket.
  2. When the objects are saved, S3 invokes the main Lambda function.
  3. The function reads the S3 object and converts the JSON into the correct format for the DynamoDB table. It uploads this data in batches to the table.

The repo’s SAM template creates a DynamoDB table with a partition key, configured to use on-demand capacity. This mode enables the DynamoDB service to scale appropriately to match the number of writes required by the import process. This means you do not need to manage DynamoDB table capacity, as you would in the standard provisioned mode.

  DDBtable:
    Type: AWS::DynamoDB::Table
    Properties:
      AttributeDefinitions:
      - AttributeName: ID
        AttributeType: S
      KeySchema:
      - AttributeName: ID
        KeyType: HASH
      BillingMode: PAY_PER_REQUEST

The template defines the Lambda function to import the data:

  ImportFunction:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: importFunction/
      Handler: app.handler
      Runtime: nodejs12.x
      MemorySize: 512
      Environment:
        Variables:
          DDBtable: !Ref DDBtable      
      Policies:
        - DynamoDBCrudPolicy:
            TableName: !Ref DDBtable        
        - S3ReadPolicy:
            BucketName: !Ref InputBucketName
      Events:
        FileUpload:
          Type: S3
          Properties:
            Bucket: !Ref InputS3Bucket
            Events: s3:ObjectCreated:*
            Filter: 
              S3Key:
                Rules:
                  - Name: suffix
                    Value: '.json'            

This uses SAM policy templates to provide write access to the DynamoDB table and read access to the S3 bucket. It also defines the event that causes the function invocation from S3, filtering only for new objects with a .json suffix.
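The importFunction code is in the GitHub repo. As a minimal sketch of the approach, assuming each S3 object contains a JSON array of items that already match the table's schema, a Node.js handler might look like the following (the structure is illustrative, not the repo's exact code):

// Minimal sketch - the repo's importFunction handles errors and retries more fully.
const AWS = require('aws-sdk')
const s3 = new AWS.S3()
const documentClient = new AWS.DynamoDB.DocumentClient()

exports.handler = async (event) => {
  // S3 may deliver more than one record per invocation
  for (const record of event.Records) {
    const Bucket = record.s3.bucket.name
    const Key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '))

    // Read and parse the JSON object (assumed to be an array of items)
    const object = await s3.getObject({ Bucket, Key }).promise()
    const items = JSON.parse(object.Body.toString())

    // BatchWriteItem accepts a maximum of 25 items per request
    for (let i = 0; i < items.length; i += 25) {
      const batch = items.slice(i, i + 25).map((Item) => ({ PutRequest: { Item } }))
      await documentClient.batchWrite({
        RequestItems: { [process.env.DDBtable]: batch }
      }).promise()
      // Production code should retry any UnprocessedItems returned in the response
    }
  }
}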

Testing the application

  1. Deploy the first application by following the README.md in the GitHub repo, and note the application’s S3 bucket name and DynamoDB table name.
  2. Change into the dataGenerator directory:
    cd ./dataGenerator
  3. Create sample data for testing. The following command creates 10 files of 100 records each:
    node ./app.js 100 10
  4. Upload the sample data into your application’s S3 bucket, replacing your-bucket below with your bucket name:
    aws s3 cp ./data/ s3://your-bucket --recursive
    Your console output shows the following, confirming that the sample data is uploaded to S3.
    Generating and uploading sample data for testing.
  5. After a few seconds, enter this command to show the number of items in the application’s DynamoDB table. Replace your-table with your deployed table name:
    aws dynamodb scan --table-name your-table --select "COUNT"
    Your console output shows that 1,000 items are now stored in the DynamoDB table and the files are successfully imported.
    Checking the number of items stored in the DynamoDB table.

With on-demand provisioning, the per-table limit of 40,000 write request units still applies. For high volumes or sudden, spiky workloads, DynamoDB may throttle the import when using this approach. Any throttling events appear in the Metrics tab of the table in the DynamoDB service console. Throttling is intended to protect your infrastructure, but there are times when you want to process these high volumes. The second application in the repo shows how to address this.

Handling extreme loads and variability in the import process

In this next example, the goal is to smooth out the traffic, so that the load process into DynamoDB is much more consistent. The key service used to achieve this is Amazon SQS, which holds all the items until a loader process stores the data in DynamoDB. The architecture looks like this:

Architecture for the second example application.

  1. A downstream process creates source import data in JSON format and writes to an S3 bucket.
  2. When the objects are saved, S3 invokes a Lambda function that transforms the input and adds these as messages in an Amazon SQS queue.
  3. The Lambda service polls the SQS queue and invokes a function to process the message batches.
  4. The function converts the JSON messages into the correct format for the DynamoDB table. It uploads this data in batches to the table.

Testing the application

In this test, you generate a much larger amount of data using a greater number of S3 objects. The instructions below create 100,000 sample records, so running this code may incur cost on your AWS bill.

  1. Deploy the second application by following the README.md in the GitHub repo, and note the application’s S3 bucket name and DynamoDB table name.
  2. Change into the dataGenerator directory:
    cd ./dataGenerator
  3. Create sample data for testing. The following command creates 100 files of 1,000 records each:
    node ./app.js 1000 100
  4. Upload the sample data into your application’s S3 bucket, replacing your-bucket below with your deployed bucket name:
    aws s3 cp ./data/ s3://your-bucket --recursive
    This process takes around 10 minutes to complete with the default configuration in the repo.
  5. From the DynamoDB console, select the application’s table and then choose the Metrics tab. Select the Write capacity graph to zoom into the chart:
    Using CloudWatch to view WCUs consumed in the uploading process.

The default configuration deliberately slows down the load process to illustrate how it works. Using this approach, the load into the database is much more consistent, consuming between 125 and 150 write capacity units (WCUs) per minute. This design makes it possible to vary how quickly you load data into the DynamoDB table, depending upon the needs of your use case.

How this works

In this second application, there are multiple points where the application uses a configuration setting to throttle the flow of data to the next step.

Throttling points in the architecture.

  1. AddToQueue function: this loads data from the source S3 object into SQS, adding 25 records per message. Depending on the size of your source records, you may add more records into a single SQS message, which has a size limit of 256 KB. You can also compress the message with gzip to fit more records.
  2. Function concurrency: the SAM template sets the Loader function’s concurrency to 1, using the ReservedConcurrentExecutions attribute. In effect, this stops Lambda from scaling this function, which means it keeps fetching the next batch from SQS as soon as processing finishes. The concurrency is a multiplier – as this value is increased, the loading into the DynamoDB table increases proportionately, if there are messages available in SQS. Select a value greater than 1 to use parallelization in the load process.
  3. Loader function: this consumes messages from the SQS queue. The BatchSize configured in the SAM template is set to four messages per invocation. Since each message contains 25 records, this represents 100 records per invocation when the queue has enough messages. You can set a BatchSize value from 1 to 10, so you could increase this from the application’s default. A minimal sketch of this loader pattern follows this list.
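As a minimal sketch of this loader pattern (not the repo's exact code; the SQS message body format and the environment variable name are assumptions carried over from the first example), an SQS-triggered handler might look like:

// Minimal sketch - writes each SQS message's batch of records to DynamoDB.
const AWS = require('aws-sdk')
const documentClient = new AWS.DynamoDB.DocumentClient()

exports.handler = async (event) => {
  // With BatchSize set to 4, event.Records contains up to four SQS messages
  for (const message of event.Records) {
    // Assumes each message body is a JSON array of up to 25 items
    const items = JSON.parse(message.body)

    await documentClient.batchWrite({
      RequestItems: {
        [process.env.DDBtable]: items.map((Item) => ({ PutRequest: { Item } }))
      }
    }).promise()
    // Production code should retry any UnprocessedItems returned in the response
  }
}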

When you combine these settings, you can dramatically increase the throughput for loading data into DynamoDB. Increasing the load also increases the WCUs consumed, which increases cost, so your use case informs the optimal balance between speed and cost. It’s simple to adjust these settings to meet your requirements.
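For example, with the defaults described above, a single concurrent invocation processing four messages of 25 records moves 100 records per invocation. Raising ReservedConcurrentExecutions to 10 and BatchSize to 10 could, in principle, move up to 10 × 10 × 25 = 2,500 records per round of polling, provided the queue holds enough messages and the table’s write limits are not exceeded.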

Additionally, each of the services used has its own service limits. For high production loads, it’s important to understand the quotas set, and whether these are soft or hard limits. If your application requires higher throughput, you can request raising soft limits via an AWS Support Center ticket.

Conclusion

DynamoDB does not offer a native import process and existing solutions may not meet your needs for unplanned, large-scale imports. The AWS Database Migration Service is not serverless, and the AWS Data Pipeline is schedule-based rather than event-based. This solution is designed to provide a fully serverless alternative that responds to incoming data to S3 on-demand.

In this post, I show how you can create a simple import process directly to the DynamoDB table, triggered by objects put into an S3 bucket. This provides a near-real time import process. I also show a more advanced approach to smooth out traffic for high-volume or spiky workloads. This helps create a resilient and consistent data import for DynamoDB.

To learn more, watch this video to see how to deploy and test the DynamoDB importer application.

Building an automated knowledge repo with Amazon EventBridge and Zendesk

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/building-an-automated-knowledge-repo-with-amazon-eventbridge-and-zendesk/

Zendesk Guide is a smart knowledge base that helps customers harness the power of institutional knowledge. It enables users to build a customizable help center and customer portal.

This post shows how to implement a bidirectional event orchestration pattern between AWS services and an Amazon EventBridge third-party integration partner. This example uses support ticket events to build a customer self-service knowledge repository. It uses the EventBridge partner integration with Zendesk to accelerate the growth of a customer help center.

The examples in this post are part of a serverless application called FreshTracks. This is built in Vue.js and demonstrates SaaS integrations with Amazon EventBridge. To test this example, ask a question on the Fresh Tracks application.

The backend components for this EventBridge integration with Zendesk have been extracted into a separate example application in this GitHub repo.

How the application works

Routing Zendesk events with Amazon EventBridge.

  1. A user searches the knowledge repository via a widget embedded in the web application.
  2. If there is no answer, the user submits the question via the web widget.
  3. Zendesk receives the question as a support ticket.
  4. Zendesk emits events when the support ticket is resolved.
  5. These events are streamed into a custom SaaS event bus in EventBridge.
  6. Event rules match events and send them downstream to an AWS Step Functions Express Workflow.
  7. The Express Workflow orchestrates Lambda functions to retrieve additional information about the event with the Zendesk API.
  8. A Lambda function uses the Zendesk API to publish a new help article from the support ticket data.
  9. The new article is searchable on the website widget for other users to read.

Before deploying this application, you must generate an API key from within Zendesk.

Creating the Zendesk API resource

The application uses the Zendesk API to act on your Zendesk account from AWS. Follow these steps to generate a Zendesk API token. This is used by the application to authenticate Zendesk API calls.

To generate an API token

  1. Log in to the Zendesk dashboard.
  2. Click the Admin icon in the sidebar, then select Channels > API.
  3. Click the Settings tab, and make sure that Token Access is enabled.
  4. Click the + button to the right of Active API Tokens.

    Creating a Zendesk API token.

  5. Copy the token, and store it securely. Once you close this window, the full token is not displayed again.
  6. Click Save to return to the API page, which shows a truncated version of the token.

    Zendesk API token.

Configuring Zendesk with Amazon EventBridge

Step 1. Configuring your Zendesk event source.

  1. Go to your Zendesk Admin Center and select Admin Center > Integrations.
  2. Choose Connect in the Events Connector for Amazon EventBridge integration to open the page to configure your Zendesk event source.

    Zendesk integrations

  3. Enter your AWS account ID in the Amazon Web Services account ID field, and select the Region to receive events.
  4. Choose Save.

    Zendesk Amazon EventBridge configuration.

Step 2. Associate the Zendesk event source with a new event bus: 

  1. Log into the AWS Management Console and navigate to services > Amazon EventBridge > Partner event sources
    New event source

  2. Select the radio button next to the new event source and choose Associate with event bus.
    Associating event source with event bus.

  3. Choose Associate.

Deploying the backend application

After associating the Event source with a new partner event bus, you can deploy backend services to receive events.

To set up the example application, visit the GitHub repo and follow the instructions in the README.md file.

When deploying the application stack, make sure to provide the custom event bus name, and Zendesk API credentials with --parameter-overrides.

sam deploy --parameter-overrides ZendeskEventBusName=aws.partner/zendesk.com/123456789/default ZenDeskDomain=myZendeskDomain ZenDeskPassword=myAPIToken ZenDeskUsername=myZendeskAgentUsername

You can find the name of the new Zendesk custom event bus in the custom event bus section of the EventBridge console.

Routing events with rules

When a support ticket is updated in Zendesk, a number of individual events are streamed to EventBridge. These include an event for each of the following:

  • Agent Assignment Changed
  • Comment Created
  • Status Changed
  • Brand Changed
  • Subject Changed

An EventBridge rule is used to filter these events. The AWS Serverless Application Model (SAM) template defines the rule with the `AWS::Events::Rule` resource type. This routes the event downstream to an AWS Step Functions Express Workflow. The EventPattern is shown below:

  ZendeskNewWebQueryClosed: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "New Web Query"
      EventBusName: 
         Ref: ZendeskEventBusName
      EventPattern: 
        account:
        - !Sub '${AWS::AccountId}'
        detail-type: 
        - "Support Ticket: Comment Created"
        detail:
          ticket_event:
            ticket:
              status: 
              - solved
              tags:
              - web_widget
              tags: 
              - guide
      Targets: 
        - RoleArn: !GetAtt [ MyStatesExecutionRole, Arn ]
          Arn: !Ref FreshTracksZenDeskQueryMachine
          Id: NewQuery

The tickets must have two specific tags (web_widget and guide) for this pattern to match. These are defined as separate fields to create an AND matching rule, instead of being declared within the same array field, which would create an OR rule. A new comment on a support ticket triggers the event.
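For illustration, a simplified event that satisfies this pattern might look like the following. Only the fields the rule evaluates are shown, and the values are hypothetical:

{
  "account": "123456789012",
  "detail-type": "Support Ticket: Comment Created",
  "detail": {
    "ticket_event": {
      "ticket": {
        "status": "solved",
        "tags": ["web_widget", "guide"]
      }
    }
  }
}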

The Step Functions Express Workflow

The application routes events to a Step Functions Express Workflow that is defined in the application’s SAM template:

FreshTracksZenDeskQueryMachine:
    Type: "AWS::StepFunctions::StateMachine"
    Properties:
      StateMachineType: EXPRESS
      DefinitionString: !Sub |
               {
                    "Comment": "Create a new article from a zendeskTicket",
                    "StartAt": "GetFullZendeskTicket",
                    "States": {
                      "GetFullZendeskTicket": {
                      "Comment": "Get Full Ticket Details",
                      "Type": "Task",
                      "ResultPath": "$.FullTicket",
                      "Resource": "${GetFullZendeskTicket.Arn}",
                      "Next": "GetFullZendeskUser"
                      },
                      "GetFullZendeskUser": {
                      "Comment": "Get Full User Details",
                      "Type": "Task",
                      "ResultPath": "$.FullUser",
                      "Resource": "${GetFullZendeskUser.Arn}",
                      "Next": "PublishArticle"
                      },
                      "PublishArticle": {
                      "Comment": "Publish as an article",
                      "Type": "Task",
                      "Resource": "${CreateZendeskArticle.Arn}",
                      "End": true
                      }
                    }
                }
      RoleArn: !GetAtt [ MyStatesExecutionRole, Arn ]

This application is suited for a Step Functions Express Workflow because it orchestrates short-duration, high-volume, event-based workloads. Each workflow task is idempotent and stateless. The Express Workflow carries the workload’s state by passing the output of one task to the input of the next. The Amazon States Language ResultPath definition controls where each task’s output is appended to the workflow’s state before it is passed to the next task.
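For example, if the state entering GetFullZendeskTicket is { "ticketId": 123 } (a hypothetical input), a ResultPath of "$.FullTicket" appends that task’s output under the FullTicket key instead of replacing the state, so the next task receives something like the following (the fields inside FullTicket are illustrative):

{
  "ticketId": 123,
  "FullTicket": {
    "subject": "Example question",
    "status": "solved"
  }
}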

 

AWS Step Functions Express Workflow

Lambda functions

Each task in this Express Workflow invokes a Lambda function defined within the example application’s SAM template. The Lambda functions use the Node.js Axios package to make requests to Zendesk’s API. The Zendesk API credentials are stored in the Lambda function’s environment variables and accessed via process.env.

The first two Lambda functions in the workflow make a GET request to Zendesk. This retrieves additional data about the support ticket, the author, and the agent’s response.

The final Lambda function makes a POST request to the Zendesk API. This creates and publishes a new article using this data. The permission_group and section defined in this function must be set to your Zendesk account’s default permission group ID and FAQ section ID.
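As a minimal sketch of one of these GET tasks, the code might look like the following. The endpoint follows Zendesk’s public Tickets API and token-based basic authentication, but the environment variable names and the input shape are assumptions rather than the example application’s exact code:

// Minimal sketch of a workflow task that retrieves a ticket from the Zendesk API.
const axios = require('axios')

const baseUrl = `https://${process.env.ZenDeskDomain}.zendesk.com/api/v2`
const auth = {
  username: `${process.env.ZenDeskUsername}/token`,  // Zendesk token auth uses "email/token"
  password: process.env.ZenDeskPassword              // the API token generated earlier
}

exports.handler = async (event) => {
  // Assumes the ticket ID is available in the event passed through the workflow
  const ticketId = event.detail.ticket_event.ticket.id
  const response = await axios.get(`${baseUrl}/tickets/${ticketId}.json`, { auth })
  return response.data.ticket
}

The final task posts to Zendesk’s Help Center articles endpoint in the same way, using axios.post with the article payload.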

AWS Lambda function code

Integrating with your front-end application

Follow the instructions in the Fresh Tracks repo on GitHub to deploy the front-end application. This application includes Zendesk’s web widget script in the index.html page. The widget has been customized using Zendesk’s JavaScript API. This is implemented in the navigation component to insert custom forms into the widget and prefill the email address field for authenticated users. The backend application starts receiving Zendesk-emitted events immediately.

The video below demonstrates the implementation from end to end.

Conclusion

This post explains how to set up EventBridge’s third-party integration with Zendesk to capture events. The example backend application demonstrates how to filter these events, and send downstream to a Step Functions Express Workflow. The Express Workflow orchestrates a series of stateless Lambda functions to gather additional data about the event. Zendesk’s API is then used to publish a new help guide article from this data.

This pattern provides a framework for bidirectional event orchestration between AWS services, custom web applications, and third-party integration partners. It can be replicated and applied to any number of third-party integration partners.

This is implemented with minimal code to provide near real-time streaming of events and without adding latency to your application.

The possibilities are vast. I am excited to see how builders use this bidirectional serverless pattern to add even more value to their third party services.

Start here to learn about other SaaS integrations with Amazon EventBridge.

Automating scalable business workflows using minimal code

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/automating-scalable-business-workflows-using-minimal-code/

Organizations frequently have complex workflows embedded in their processes. When a customer places an order, it triggers a workflow. Or when an employee requests vacation time, this starts another set of processes. Managing these at scale can be challenging in traditional applications, which must often manage thousands of separate tasks.

In this blog post, I show how to use a serverless application to build and manage enterprise workflows at scale. This minimal-code solution is highly scalable and flexible, and can be modified easily to meet your needs. This application uses Amazon S3, AWS Lambda, and AWS Step Functions:

Using S3-to-Lambda to trigger Step Functions workflows

AWS Step Functions allows you to represent workflows as a JSON state machine. This service can help remove custom code and convoluted logic from distributed systems, and make it easier to maintain and modify. S3 is a highly scalable service that stores trillions of objects, and Lambda runs custom code in response to events. By combining these services, it’s simple to build resilient workflows with high throughput, triggered by putting objects in S3 buckets.

There are many business use-cases for this approach. For example, you could automatically pay invoices from approved vendors under a threshold amount by reading the invoices stored in S3 using Amazon Textract. Or your application could automatically book consultations for patients emailing their completed authorization forms. Almost any action that is triggered by a document or form is a potential candidate for an automated workflow solution.

To set up the example application, visit the GitHub repo and follow the instructions in the README.md file. The code uses the AWS Serverless Application Model (SAM), enabling you to deploy the application in your own AWS account. This walkthrough creates resources covered in the AWS Free Tier but you may incur cost if you test with large amounts of data.

How the application works

The starting point for this serverless solution is S3. When new objects are stored, this triggers a Lambda function that starts an execution in the Step Functions workflow. Lambda scales to keep pace as more objects are written to the S3 bucket, and Step Functions creates a separate execution for each S3 object. It also manages the state of all the distinct workflows.

Simple Step Functions workflow.

  1. A downstream process stores data in the S3 bucket.
  2. This invokes the Start Execution Lambda function. The function creates a new execution in Step Functions using the S3 object as event data.
  3. The workflow invokes the Decider function. This uses Amazon Rekognition to detect the contents of objects stored in S3.
  4. This function uses environment variables to determine the matching attributes. If the S3 object matches the criteria, it triggers the Match function. Otherwise, the No Match function is invoked.

The application’s SAM template configures the Step Functions state machine as JSON. It also defines an IAM role allowing Step Functions to invoke the Lambda functions. The initial function invoked by S3 is defined to accept the state machine ARN as an environment variable. The template also defines the permissions needed and the S3 trigger:

  StartExecutionFunction:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: StartExecutionFunction/
      Handler: app.handler
      Runtime: nodejs12.x
      MemorySize: 128
      Environment:
        Variables:
          stateMachineArn: !Ref 'MatcherStateMachine'
      Policies:
        - S3CrudPolicy:
            BucketName: !Ref InputBucketName
        - Statement:
          - Effect: Allow
            Resource: !Ref 'MatcherStateMachine'
            Action:
              - states:*
      Events:
        FileUpload:
          Type: S3
          Properties:
            Bucket: !Ref InputBucket
            Events: s3:ObjectCreated:*

This uses SAM policy templates to provide access to the S3 bucket, and an inline policy that allows the function to start executions of the state machine. It also defines the event that invokes the function when new objects are created in the S3 bucket.
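The Start Execution function itself needs very little code. As a minimal sketch (not necessarily the repo's exact implementation), it passes the S3 event through as the execution input:

// Minimal sketch - starts a state machine execution using the S3 event as input.
const AWS = require('aws-sdk')
const stepFunctions = new AWS.StepFunctions()

exports.handler = async (event) => {
  // The state machine ARN is provided through the environment variable in the SAM template
  return stepFunctions.startExecution({
    stateMachineArn: process.env.stateMachineArn,
    input: JSON.stringify(event)
  }).promise()
}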

The Decider function is the first step of the Step Functions workflow. It uses Amazon Rekognition to detect labels and words from the images provided. The SAM template passes the required labels and words to the function, together with an optional confidence score:

  DeciderFunction:
    Type: AWS::Serverless::Function 
    Properties:
      CodeUri: deciderFunction/
      Handler: app.handler
      Environment:
        Variables:
          requiredWords: "NEW YORK"
          requiredLabels: "Driving License,Person"
          minConfidence: 70

If the requiredLabels environment variable is present, the function’s code calls Amazon Rekognition’s detectLabel method. It then calls the detectText method if the requiredWords environment variable is used:

// The standard Lambda handler
exports.handler = async (event) => {
  return await processDocument(event)
}

// Detect words/labels on document or image
const processDocument = async (event) => {

  // If using a required labels test
  if (process.env.requiredLabels) {
    // If no match, return immediately
    if (!await checkRequiredLabels(event)) return 'NoMatch'
  }  

  // If using a required words test
  if (process.env.requiredWords) {
    // If no match, return immediately
    if (!await checkRequiredWords(event)) return 'NoMatch'
  }

  return 'Match'
}
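The checkRequiredLabels and checkRequiredWords helpers live in the example application's repo. As a rough sketch of what a label check with Amazon Rekognition could look like (the event shape and the matching logic are assumptions, not the repo's exact code):

// Rough sketch of a label check with Amazon Rekognition.
const AWS = require('aws-sdk')
const rekognition = new AWS.Rekognition()

const checkRequiredLabels = async (event) => {
  // Assumes the workflow state still contains the original S3 event record
  const record = event.Records ? event.Records[0] : event

  const result = await rekognition.detectLabels({
    Image: {
      S3Object: {
        Bucket: record.s3.bucket.name,
        Name: record.s3.object.key
      }
    },
    MinConfidence: Number(process.env.minConfidence)
  }).promise()

  // Every label listed in requiredLabels must appear in the detected labels
  const detected = result.Labels.map((label) => label.Name)
  return process.env.requiredLabels.split(',').every((required) => detected.includes(required))
}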

The Decider function returns “Match” or “NoMatch” to the Step Functions workflow. This invokes downstream functions depending on the result. The Match and No Match functions are stubs where you can build the intended functionality in the workflow. This Step Functions workflow is designed generically so you can extend the functionality easily.

Testing the application

Deploy the first application by following the README.md in the GitHub repo, and note the application’s S3 bucket name. There are three test cases:

  • Create a workflow for a matched subject in an image. From photos uploaded to S3, identify which images contain one or more subjects, and invoke the Match path of the workflow.
  • Create a workflow for invoices from a specific vendor. From multiple uploaded invoices, match those from a specific vendor and trigger the Match path of the workflow.
  • Create a workflow for driver’s licenses issued by a single state. From a collection of driver’s licenses, trigger the Match path of the workflow for only a single state.

1. Create a workflow for matched subject in an image

In this example, the application identifies cats in images uploaded to the S3 bucket. The default configuration in the SAM template in the GitHub repo contains the environment variables set for this example:

Environment variables in SAM template.

First, I upload over 20 images of various animals to the S3 bucket:

Uploading files to the S3 bucket.

After navigating to the Step Functions console and selecting the application’s state machine, I see 24 separate executions, one per image:

Step Functions execution detail.

I select one of these executions, for cat3.jpg. This has followed the MatchFound execution path of the workflow:

MatchFound execution path.

2. Create a workflow for invoices for a specified vendor.

For this example, the application looks for a customer account number and vendor name in invoices uploaded to the S3 bucket. The Decider function uses environment variables to determine the matching keywords. These can be updated by either deploying the SAM template or editing the Lambda function directly.

I modify the SAM template to match the vendor name and account number as follows:

SAM template with vendor information.

Next I upload several different invoices from the local machine to the S3 bucket:

Uploading different files to the S3 bucket.

In the Step Functions console, I select the execution for utility-bill.png. This execution matches the criteria and follows the MatchFound path in the workflow.

MatchFound path in visual workflow.

3. Matching a driver’s license by state

In this example, the application routes based upon the state where a driver’s license is issued. For this test, I use a range of sample images of licenses from DMVs in multiple states.

I modify the SAM template so the Decider function uses both label and word detection. I set “Driving License” and “Person” as required labels. This ensures that Amazon Rekognition identifies that a person is in the photo in addition to the document type.

Environment variables in the SAM template.

Next, I upload the driver’s license images to the S3 bucket:

Uploading files to the S3 bucket.

In the Step Functions console, I open the execution for the driver-license-ny.png file, and it has followed the MatchFound path in the workflow:

Execution path for driver's license test.

When I select the execution for the Texas driver’s license, this did not match and has followed the NoMatchFound execution path:

Execution path for NoMatchFound.

Extending the functionality

By triggering Step Functions workflows from S3 PutObject events, this application is highly scalable. As more objects are stored in the S3 bucket, it creates as many executions as needed in the state machine. The custom code only handles the specific logic requirements for a single object and the Lambda service scales up to meet demand.

In these examples, the application uses Amazon Rekognition to analyze specific document types or image contents. You could extend this logic to include value ranges, multiple alternative workflow paths, or include steps to enable human intervention.

Using Step Functions also makes it easy to modify workflows as requirements change. Any incomplete workflows continue on the existing version of the state machine used when they started. As a result, you can add steps without impacting existing code, making it faster to adapt applications to users’ needs.

Conclusion

You can use Step Functions to model many common business workflows with JSON. Combining this powerful workflow management service with the scalability of S3 and Lambda, you can quickly build nuanced solutions that operate at scale.

In this post, I show how you can deploy a simple Step Functions workflow where executions are created by objects stored in an S3 bucket. Using minimal code, it can perform complex workflow routing tasks based on document types and contents. This provides a highly flexible and scalable way to manage common organizational workflow needs.

ICYMI: Serverless Q1 2020

Post Syndicated from Moheeb Zara original https://aws.amazon.com/blogs/compute/icymi-serverless-q1-2020/

Welcome to the ninth edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

A calendar of January, February, and March.

In case you missed our last ICYMI, check out what happened last quarter here.

Launches/New products

In 2018, we launched the AWS Well-Architected Tool. This allows you to review workloads in a structured way based on the AWS Well-Architected Framework. Until now, we’ve provided workload-specific advice using the concept of a “lens.”

As of February, this tool now lets you apply those lenses to provide greater visibility in specific technology domains to assess risks and find areas for improvement. Serverless is the first available lens.

You can apply a lens when defining a workload in the Well-Architected Tool console.

A screenshot of applying a lens.

HTTP APIs beta was announced at AWS re:Invent 2019. Now HTTP APIs is generally available (GA) with more features to help developers build APIs better, faster, and at lower cost. HTTP APIs for Amazon API Gateway is built from the ground up based on lessons learned from building REST and WebSocket APIs, and looking closely at customer feedback.

For the majority of use cases, HTTP APIs offers up to a 60% reduction in latency.

HTTP APIs cost at least 71% less than API Gateway REST APIs.

A bar chart showing the cost comparison between HTTP APIs and API Gateway.

HTTP APIs also offers a more intuitive experience and powerful features, like easily configuring cross-origin resource sharing (CORS), JWT authorizers, auto-deploying stages, and simplified route integrations.

AWS Lambda

You can now view and monitor the number of concurrent executions of your AWS Lambda functions by version and alias. Previously, the ConcurrentExecutions metric measured and emitted the sum of concurrent executions for all functions in the account. It included even those that had a reserved concurrency limit specified.

Now, the ConcurrentExecutions metric is emitted for all functions, versions, and aliases. This can be used to see which functions consume your concurrency limits and to estimate peak traffic based on consumption averages. Fine-grained visibility in these areas can help you plan the appropriate configuration for Provisioned Concurrency.

A Lambda function written in Ruby 2.7.

AWS Lambda now supports Ruby 2.7. Developers can take advantage of new features in this latest release of Ruby, like pattern matching, argument forwarding and numbered arguments. Lambda functions written in Ruby 2.7 run on Amazon Linux 2.

Updated AWS Mock .NET Lambda Test Tool

.NET Core 3.1 is now a supported runtime in AWS Lambda. You can deploy to Lambda by setting the runtime parameter value to dotnetcore3.1. Updates have also been released for the AWS Toolkit for Visual Studio and the .NET Core Global Tool Amazon.Lambda.Tools. These make it easier to build and deploy your .NET Core 3.1 Lambda functions.

With .NET Core 3.1, you can take advantage of all the new features it brings to Lambda, including C# 8.0 and F# 4.7 support, .NET Standard 2.1 support, a new JSON serializer, and a ReadyToRun feature for ahead-of-time compilation. The AWS Mock .NET Lambda Test Tool has also been updated to support .NET Core 3.1 with new features to help debug and improve your workloads.

Cost Savings

Last year we announced Savings Plans for AWS Compute Services. This is a flexible discount model provided in exchange for a commitment of compute usage over a period of one or three years. AWS Lambda now participates in Compute Savings Plans, allowing customers to save money. Visit the AWS Cost Explorer to get started.

Amazon API Gateway

With the HTTP APIs launched in GA, customers can build APIs for services behind private ALBs, private NLBs, and IP-based services registered in AWS Cloud Map such as ECS tasks. To make it easier for customers to work between API Gateway REST APIs and HTTP APIs, customers can now use the same custom domain across both REST APIs and HTTP APIs. In addition, this release also enables customers to perform granular throttling for routes, improved usability when using Lambda as a backend, and better error logging.

AWS Step Functions

AWS Step Functions VS Code plugin.

We launched the AWS Toolkit for Visual Studio Code back in 2019 and last month we added toolkit support for AWS Step Functions. This enables you to define, visualize, and create workflows without leaving VS Code. As you craft your state machine, it is continuously rendered with helpful tools for debugging. The toolkit also allows you to update state machines in the AWS Cloud with ease.

To further help with debugging, we’ve added AWS Step Functions support for CloudWatch Logs. For standard workflows, you can select different levels of logging and can exclude logging of a workflow’s payload. This makes it easier to monitor event-driven serverless workflows and create metrics and alerts.

AWS Amplify

AWS Amplify is a framework for building modern applications, with a toolchain for easily adding services like authentication, storage, APIs, hosting, and more, all via command line interface.

Customers can now use the Amplify CLI to take advantage of AWS Amplify console features like continuous deployment, instant cache invalidation, custom redirects, and simple configuration of custom domains. This means you can do end-to-end development and deployment of a web application entirely from the command line.

Amazon DynamoDB

You can now easily increase the availability of your existing Amazon DynamoDB tables into additional AWS Regions without table rebuilds by updating to the latest version of global tables. You can benefit from improved replicated write efficiencies without any additional cost.

On-demand capacity mode is now available in the Asia Pacific (Osaka-Local) Region. This is a flexible capacity mode for DynamoDB that can serve thousands of requests per second without requiring capacity planning. DynamoDB on-demand offers simple pay-per-request pricing for read and write requests so that you only pay for what you use, making it easy to balance cost and performance.

AWS Serverless Application Repository

The AWS Serverless Application Repository (SAR) is a service for packaging and sharing serverless application templates using the AWS Serverless Application Model (SAM). Applications can be customized with parameters and deployed with ease. Previously, applications could only be shared publicly or with specific AWS account IDs. Now, SAR has added sharing for AWS Organizations. These new granular permissions can be added to existing SAR applications. Learn how to take advantage of this feature today to help improve your organization’s productivity.

Amazon Cognito

Amazon Cognito, a service for managing identity providers and users, now supports CloudWatch Usage Metrics. This allows you to monitor events in near-real time, such as sign-in and sign-out. These can be turned into metrics or CloudWatch alarms at no additional cost.

Cognito User Pools now supports logging for all API calls with AWS CloudTrail. The enhanced CloudTrail logging improves governance, compliance, and operational and risk auditing capabilities. Additionally, Cognito User Pools now enables customers to configure case sensitivity settings for user aliases, including native user name, email alias, and preferred user name alias.

Serverless posts

Our team is always working to build and write content to help our customers better understand all our serverless offerings. Here is a list of the latest posts published to the AWS Compute Blog this quarter.

January

February

March

Tech Talks and events

We hold AWS Online Tech Talks covering serverless topics throughout the year. You can find these in the serverless section of the AWS Online Tech Talks page. We also delivered talks at conferences and events around the globe, regularly join in on podcasts, and record short videos you can find to learn in quick byte-sized chunks.

Here are the highlights from Q1.

January

February

March

Live streams

Rob Sutter, a Senior Developer Advocate on AWS Serverless, has started hosting Serverless Office Hours every Tuesday at 14:00 ET on Twitch. He’ll be imparting his wisdom on Step Functions, Lambda, Golang, and taking questions on all things serverless.

Check out some past sessions:

Happy Little APIs Season 2 is airing every other Tuesday on the AWS Twitch Channel. Check out the first episode where Eric Johnson and Ran Ribenzaft, Serverless Hero and CTO of Epsagon, talk about private integrations with HTTP API.

Eric Johnson is also streaming “Sessions with SAM” every Thursday at 10AM PST. Each week Eric shows how to use SAM to solve different problems with serverless and how to leverage SAM templates to build out powerful serverless applications. Catch up on the last few episodes on our Twitch channel.

Relax with a cup of your favorite morning beverage every Friday at 12PM EST with a Serverless Coffee Break with James Beswick. These are chats about all things serverless with special guests. You can catch these live on Twitter or on your own time with these recordings.

AWS Serverless Heroes

This year, we’ve added some new faces to the list of AWS Serverless Heroes. The AWS Hero program is a selection of worldwide experts that have been recognized for their positive impact within the community. They share helpful knowledge and organize events and user groups. They’re also contributors to numerous open-source projects in and around serverless technologies.

Still looking for more?

The Serverless landing page has even more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and Getting Started tutorials.

AWS Step Functions support in Visual Studio Code

Post Syndicated from Rob Sutter original https://aws.amazon.com/blogs/compute/aws-step-functions-support-in-visual-studio-code/

The AWS Toolkit for Visual Studio Code has been installed over 115,000 times since launching in July 2019. We are excited to announce toolkit support for AWS Step Functions, enabling you to define, visualize, and create your Step Functions workflows without leaving VS Code.

Version 1.8 of the toolkit provides two new commands in the Command Palette to help you define and visualize your workflows. The toolkit also provides code snippets for seven different Amazon States Language (ASL) state types and additional service integrations to speed up workflow development. Automatic linting detects errors in your state machine as you type, and provides tooltips to help you correct the errors. Finally, the toolkit allows you to create or update Step Functions workflows in your AWS account without leaving VS Code.

Defining a new state machine

To define a new Step Functions state machine, first open the VS Code Command Palette by choosing Command Palette from the View menu. Enter Step Functions to filter the available options and choose AWS: Create a new Step Functions state machine.

Screen capture of the Command Palette in Visual Studio Code with the text ">AWS Step Functions" entered

Creating a new Step Functions state machine in VS Code

A dialog box appears with several options to help you get started quickly. Select Hello world to create a basic example using a series of Pass states.

A screen capture of the Visual Studio Code Command Palette "Select a starter template" dialog with "Hello world" selected

Selecting the “Hello world” starter template

VS Code creates a new Amazon States Language file containing a workflow with examples of the Pass, Choice, Fail, Wait, and Parallel states.

A screen capture of a Visual Studio Code window with a "Hello World" example state machine

The “Hello World” example state machine

Pass states allow you to define your workflow before building the implementation of your logic with Task states. This lets you work with business process owners to ensure you have the workflow right before you start writing code. For more information on the other state types, see State Types in the ASL documentation.
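For reference, a Pass state is the simplest ASL state: it passes its input to its output, optionally injecting fixed data, without performing any work. A minimal standalone example (not the exact contents of the generated template) looks like this:

{
  "StartAt": "Hello World",
  "States": {
    "Hello World": {
      "Type": "Pass",
      "Result": "Hello from a Pass state",
      "End": true
    }
  }
}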

Save your new workflow by choosing Save from the File menu. VS Code automatically applies the .asl.json extension.

Visualizing state machines

In addition to helping define workflows, the toolkit also enables you to visualize your workflows without leaving VS Code.

To visualize your new workflow, open the Command Palette and enter Preview state machine to filter the available options. Choose AWS: Preview state machine graph.

A screen capture of the Visual Studio Code Command Palette with the text ">Preview state machine" entered and the option "AWS: Preview state machine graph" highlighted

Previewing the state machine graph in VS Code

The toolkit renders a visualization of your workflow in a new tab to the right of your workflow definition. The visualization updates automatically as the workflow definition changes.

A screen capture of a Visual Studio Code window with two side-by-side tabs, one with a state machine definition and one with a preview graph for the same state machine

A state machine preview graph

Modifying your state machine definition

The toolkit provides code snippets for 12 different ASL states and service integrations. To insert a code snippet, place your cursor within the States object in your workflow and press Ctrl+Space to show the list of available states.

A screen capture of a Visual Studio Code window with a code snippet insertion dialog showing twelve Amazon States Language states

Code snippets are available for twelve ASL states

In this example, insert a newline after the definition of the Pass state, press Ctrl+Space, and choose Map State to insert a code snippet with the required structure for an ASL Map State.
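For orientation, a minimal ASL Map state has the following general shape; the field values here are illustrative rather than the exact snippet the toolkit inserts:

"MapState": {
  "Type": "Map",
  "ItemsPath": "$.items",
  "MaxConcurrency": 2,
  "Iterator": {
    "StartAt": "ProcessItem",
    "States": {
      "ProcessItem": {
        "Type": "Pass",
        "End": true
      }
    }
  },
  "End": true
}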

Debugging state machines

The toolkit also includes features to help you debug your Step Functions state machines. Visualization is one feature, as it allows the builder and the product owner to confirm that they have a shared understanding of the relevant process.

Automatic linting is another feature that helps you debug your workflows. For example, when you insert the Map state into your workflow, a number of errors are detected, underlined in red in the editor window, and highlighted in red in the Minimap. The visualization tab also displays an error to inform you that the workflow definition has errors.

A screen capture of a Visual Studio Code window with a tooltip dialog indicating an "Unreachable state" error

A tooltip indicating an “Unreachable state” error

Hovering over an error opens a tooltip with information about the error. In this case, the toolkit is informing you that MapState is unreachable. Correct this error by changing the value of Next in the Pass state above from Hello World Example to MapState. The red underline automatically disappears, indicating the error has been resolved.

To finish reconciling the errors in your workflow, cut all of the following states from Hello World Example? through Hello World and paste into MapState, replacing the existing values of MapState.Iterator.States. The workflow preview updates automatically, indicating that the errors have been resolved. The MapState is indicated by the three dashed lines surrounding most of the workflow.

A Visual Studio Code window displaying two tabs, an updated state machine definition and the automatically-updated preview of the same state machine

Automatically updating the state machine preview after changes

Creating and updating state machines in your AWS account

The toolkit enables you to publish your state machine directly to your AWS account without leaving VS Code. Before publishing a state machine to your account, ensure that you establish credentials for your AWS account for the toolkit.

Creating a state machine in your AWS account

To publish a new state machine to your AWS account, bring up the VS Code Command Palette as before. Enter Publish to filter the available options and choose AWS: Publish state machine to Step Functions.

Screen capture of the Visual Studio Command Palette with the command "AWS: Publish state machine to Step Functions" highlighted

Publishing a state machine to AWS Step Functions

Choose Quick Create from the dialog box to create a new state machine in your AWS account.

Screen Capture from a Visual Studio Code flow to publish a state machine to AWS Step Functions with "Quick Create" highlighted

Publishing a state machine to AWS Step Functions

Select an existing execution role for your state machine to assume. This role must already exist in your AWS account.

For more information on creating execution roles for state machines, please visit Creating IAM Roles for AWS Step Functions.

Screen capture from Visual Studio Code showing a selection execution role dialog with "HelloWorld_IAM_Role" selected

Selecting an IAM execution role for a state machine

Provide a name for the new state machine in your AWS account, for example, Hello-World. The name must be from one to 80 characters, and can use alphanumeric characters, dashes, or underscores.

Screen capture from a Visual Studio Code flow entering "Hello-World" as a state machine name

Naming your state machine

Press the Enter or Return key to confirm the name of your state machine. The Output console opens, and the toolkit displays the result of creating your state machine. The toolkit provides the full Amazon Resource Name (ARN) of your new state machine on completion.

Screen capture from Visual Studio Code showing the successful creation of a new state machine in the Output window

Output of creating a new state machine

You can check creation for yourself by visiting the Step Functions page in the AWS Management Console. Choose the newly-created state machine and the Definition tab. The console displays the definition of your state machine along with a preview graph.

Screen capture of the AWS Management Console showing the newly-created state machine

Viewing the new state machine in the AWS Management Console

Updating a state machine in your AWS account

It is common to change workflow definitions as you refine your application. To update your state machine in your AWS account, choose Quick Update instead of Quick Create. Select your existing workflow.

A screen capture of a Visual Studio Code dialog box with a single state machine displayed and highlighted

Selecting an existing state machine to update

The toolkit displays “Successfully updated state machine” and the ARN of your state machine in the Output window on completion.

Summary

In this post, you learn how to use the AWS Toolkit for VS Code to create and update Step Functions state machines in your local development environment. You discover how sample templates, code snippets, and automatic linting can accelerate your development workflows. Finally, you see how to create and update Step Functions workflows in your AWS account without leaving VS Code.

Install the latest release of the toolkit and start building your workflows in VS Code today.

 

Testing and creating CI/CD pipelines for AWS Step Functions

Post Syndicated from Matt Noyce original https://aws.amazon.com/blogs/devops/testing-and-creating-ci-cd-pipelines-for-aws-step-functions-using-aws-codepipeline-and-aws-codebuild/

AWS Step Functions allows users to easily create workflows that are highly available, serverless, and intuitive. Step Functions natively integrates with a variety of AWS services including, but not limited to, AWS Lambda, AWS Batch, AWS Fargate, and Amazon SageMaker. It offers the ability to natively add error handling, retry logic, and complex branching, all through an easy-to-use JSON-based language known as the Amazon States Language.

AWS CodePipeline is a fully managed continuous delivery service that provides easy and highly configurable methods for automating release pipelines. CodePipeline gives you the ability to build, test, and deploy your most critical applications and infrastructure in a reliable and repeatable manner.

AWS CodeCommit is a fully managed and secure source control repository service. It eliminates the need to support and scale infrastructure to support highly available and critical code repository systems.

This blog post demonstrates how to create a CI/CD pipeline to comprehensively test an AWS Step Function state machine from start to finish using CodeCommit, AWS CodeBuild, CodePipeline, and Python.

CI/CD pipeline steps

The pipeline contains the following steps, as shown in the following diagram.

CI/CD pipeline steps

  1. Pull the source code from source control.
  2. Lint any configuration files.
  3. Run unit tests against the AWS Lambda functions in codebase.
  4. Deploy the test pipeline.
  5. Run end-to-end tests against the test pipeline.
  6. Clean up test state machine and test infrastructure.
  7. Send approval to approvers.
  8. Deploy to Production.

Prerequisites

In order to get started building this CI/CD pipeline there are a few prerequisites that must be met:

  1. Create or use an existing AWS account (instructions on creating an account can be found here).
  2. Define or use the example AWS Step Functions states language definition (found below).
  3. Write the appropriate unit tests for your Lambda functions.
  4. Determine the end-to-end tests to run against the AWS Step Functions state machine.

The CodePipeline project

The following screenshot depicts what the CodePipeline project looks like, including the set of stages run in order to securely, reliably, and confidently deploy the AWS Step Function state machine to Production.

CodePipeline project

Creating a CodeCommit repository

To begin, navigate to the AWS console to create a new CodeCommit repository for your state machine.

CodeCommit repository

In this example, the repository is named CalculationStateMachine, as it contains the contents of the state machine definition, Python tests, and CodeBuild configurations.

CodeCommit structure

Breakdown of repository structure

In the CodeCommit repository above we have the following folder structure:

  1. config – this is where all of the Buildspec files will live for our AWS CodeBuild jobs.
  2. lambdas – this is where we will store all of our AWS Lambda functions.
  3. tests – this is the top-level folder for unit and end-to-end tests. It contains two sub-folders (unit and e2e).
  4. cloudformation – this is where we will add any extra CloudFormation templates.

Defining the state machine

Inside of the CodeCommit repository, create a State Machine Definition file called sm_def.json that defines the state machine in Amazon States Language.

This example creates a state machine that invokes a collection of Lambda functions to perform calculations on the given input values. Take note that it also performs a check against a specific value and, through the use of a Choice state, either continues the workflow or exits it.

sm_def.json file:

{
  "Comment": "CalulationStateMachine",
  "StartAt": "CleanInput",
  "States": {
    "CleanInput": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "CleanInput",
        "Payload": {
          "input.$": "$"
        }
      },
      "Next": "Multiply"
    },
    "Multiply": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "Multiply",
        "Payload": {
          "input.$": "$.Payload"
        }
      },
      "Next": "Choice"
    },
    "Choice": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.Payload.result",
          "NumericGreaterThanEquals": 20,
          "Next": "Subtract"
        }
      ],
      "Default": "Notify"
    },
    "Subtract": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "Subtract",
        "Payload": {
          "input.$": "$.Payload"
        }
      },
      "Next": "Add"
    },
    "Notify": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "Parameters": {
        "TopicArn": "arn:aws:sns:us-east-1:657860672583:CalculateNotify",
        "Message.$": "$$",
        "Subject": "Failed Test"
      },
      "End": true
    },
    "Add": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "Add",
        "Payload": {
          "input.$": "$.Payload"
        }
      },
      "Next": "Divide"
    },
    "Divide": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "Divide",
        "Payload": {
          "input.$": "$.Payload"
        }
      },
      "End": true
    }
  }
}

This will yield the following AWS Step Function state machine after the pipeline completes:

State machine

CodeBuild Spec files

The CI/CD pipeline uses a collection of CodeBuild BuildSpec files chained together through CodePipeline. The following sections demonstrate what these BuildSpec files look like and how they can be used to chain together and build a full CI/CD pipeline.

AWS States Language linter

In order to determine whether or not the State Machine Definition is valid, include a stage in your CodePipeline configuration to evaluate it. Through the use of a Ruby Gem called statelint, you can verify the validity of your state machine definition as follows:

lint_buildspec.yaml file:

version: 0.2
env:
  git-credential-helper: yes
phases:
  install:
    runtime-versions:
      ruby: 2.6
    commands:
      - yum -y install rubygems
      - gem install statelint

  build:
    commands:
      - statelint sm_def.json

If your configuration is valid, you do not see any output messages. If the configuration is invalid, you receive a message telling you that the definition is invalid and the pipeline terminates.

Lambda unit testing

In order to test your Lambda function code, you need to evaluate whether or not it passes a set of tests. You can test each individual Lambda function deployed and used inside of the state machine. You can feed various inputs into your Lambda functions and assert that the output is what you expect it to be. In this case, you use Python pytest to kick off tests and validate results.

unit_test_buildspec.yaml file:

version: 0.2
env:
  git-credential-helper: yes
phases:
  install:
    runtime-versions:
      python: 3.8
    commands:
      - pip3 install -r tests/requirements.txt

  build:
    commands:
      - pytest -s -vvv tests/unit/ --junitxml=reports/unit.xml

reports:
  StateMachineUnitTestReports:
    files:
      - "**/*"
    base-directory: "reports"

Take note that the CodeCommit repository includes a directory called tests/unit, which contains a collection of unit tests that are run against your Lambda function code. Another very important part of this BuildSpec file is the reports section, which generates reports and metrics about the results, trends, and overall success of your tests.

CodeBuild test reports

After running the unit tests, you are able to see reports about the results of the run. Take note of the reports section of the BuildSpec file, along with the --junitxml=reports/unit.xml flag passed to the pytest command. These generate a set of reports that can be visualized in CodeBuild.

Navigate to the specific CodeBuild project you want to examine and click on the specific execution of interest. There is a tab called Reports, as seen in the following screenshot:

Test reports

Select the specific report of interest to see a breakdown of the tests that have run, as shown in the following screenshot:

Test visualization

With Report Groups, you can also view an aggregated list of tests that have run over time. This report includes various features such as the number of average test cases that have run, average duration, and the overall pass rate, as shown in the following screenshot:

Report groups

The AWS CloudFormation template step

The following BuildSpec file is used to generate an AWS CloudFormation template that injects the State Machine Definition into AWS CloudFormation.

template_sm_buildspec.yaml file:

version: 0.2
env:
  git-credential-helper: yes
phases:
  install:
    runtime-versions:
      python: 3.8

  build:
    commands:
      - python template_statemachine_cf.py

The following Python script templates the AWS CloudFormation needed to deploy the state machine definition, given the sm_def.json file in your repository:

template_statemachine_cf.py file:

import sys
import json

def read_sm_def (
    sm_def_file: str
) -> str:
    """
    Reads the state machine definition from a file and returns it as a JSON string.

    Parameters:
        sm_def_file (str) = the name of the state machine definition file.

    Returns:
        sm_def (str) = the state machine definition as a JSON string.
    """

    try:
        with open(f"{sm_def_file}", "r") as f:
            return f.read()
    except IOError as e:
        print("Path does not exist!")
        print(e)
        sys.exit(1)

def template_state_machine(
    sm_def: str
) -> dict:
    """
    Templates out the CloudFormation for creating a state machine.

    Parameters:
        sm_def (str) = the state machine definition as a JSON string in the AWS States Language.

    Returns:
        templated_cf (dict) = the CloudFormation template as a dictionary.
    """
    
    templated_cf = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Creates the Step Function State Machine and associated IAM roles and policies",
        "Parameters": {
            "StateMachineName": {
                "Description": "The name of the State Machine",
                "Type": "String"
            }
        },
        "Resources": {
            "StateMachineLambdaRole": {
                "Type": "AWS::IAM::Role",
                "Properties": {
                    "AssumeRolePolicyDocument": {
                        "Version": "2012-10-17",
                        "Statement": [
                            {
                                "Effect": "Allow",
                                "Principal": {
                                    "Service": "states.amazonaws.com"
                                },
                                "Action": "sts:AssumeRole"
                            }
                        ]
                    },
                    "Policies": [
                        {
                            "PolicyName": {
                                "Fn::Sub": "States-Lambda-Execution-${AWS::StackName}-Policy"
                            },
                            "PolicyDocument": {
                                "Version": "2012-10-17",
                                "Statement": [
                                    {
                                        "Effect": "Allow",
                                        "Action": [
                                            "logs:CreateLogStream",
                                            "logs:CreateLogGroup",
                                            "logs:PutLogEvents",
                                            "sns:*"             
                                        ],
                                        "Resource": "*"
                                    },
                                    {
                                        "Effect": "Allow",
                                        "Action": [
                                            "lambda:InvokeFunction"
                                        ],
                                        "Resource": "*"
                                    }
                                ]
                            }
                        }
                    ]
                }
            },
            "StateMachine": {
                "Type": "AWS::StepFunctions::StateMachine",
                "Properties": {
                    "DefinitionString": sm_def,
                    "RoleArn": {
                        "Fn::GetAtt": [
                            "StateMachineLambdaRole",
                            "Arn"
                        ]
                    },
                    "StateMachineName": {
                        "Ref": "StateMachineName"
                    }
                }
            }
        }
    }

    return templated_cf


sm_def = read_sm_def(
    sm_def_file='sm_def.json'
)

print(sm_def)

cfm_sm_def = template_state_machine(
    sm_def=sm_def
)

with open("sm_cfm.json", "w") as f:
    f.write(json.dumps(cfm_sm_def))
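
If you want to experiment with the generated template outside the pipeline, a quick way to deploy sm_cfm.json is with boto3. The stack and state machine names below are placeholders, and this is only a local-testing sketch rather than part of the pipeline itself:

# deploy_sm_cfm.py - hypothetical helper for deploying the generated template manually.
import boto3

cfn = boto3.client("cloudformation")

with open("sm_cfm.json", "r") as f:
    template_body = f.read()

# Create a test stack from the templated state machine definition.
cfn.create_stack(
    StackName="sm-deployment-test",  # placeholder stack name
    TemplateBody=template_body,
    Parameters=[
        {
            "ParameterKey": "StateMachineName",
            "ParameterValue": "my-test-state-machine",  # placeholder name
        }
    ],
    Capabilities=["CAPABILITY_IAM"],  # the template creates an IAM role
)

# Wait until the stack is fully created before running any tests against it.
cfn.get_waiter("stack_create_complete").wait(StackName="sm-deployment-test")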

Deploying the test pipeline

To verify the full functionality of the state machine, stand up a test copy so that it can be tested appropriately. This test stack is an exact replica of what you will deploy to production, but it is a completely separate stack; the production stack is deployed only after the appropriate end-to-end tests and approvals have passed. You can take advantage of the AWS CloudFormation deploy action supported by CodePipeline. Note the configuration in the following screenshot, which shows how to configure this step in the AWS console:

Deploy test pipeline

End-to-end testing

To validate that the entire state machine works and executes without issues after any change, feed it sample inputs and make assertions on specific output values. If the assertions pass and you get the output you expect, you can proceed to the manual approval phase.

e2e_tests_buildspec.yaml file:

version: 0.2
env:
  git-credential-helper: yes
phases:
  install:
    runtime-versions:
      python: 3.8
    commands:
      - pip3 install -r tests/requirements.txt

  build:
    commands:
      - pytest -s -vvv tests/e2e/ --junitxml=reports/e2e.xml

reports:
  StateMachineReports:
    files:
      - "**/*"
    base-directory: "reports"
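
The following is a sketch of what a test in the tests/e2e directory could look like. It assumes the deployed state machine's ARN is made available through an environment variable, and the sample input and expected output match the hypothetical Divide workflow used earlier; adjust both to your own state machine:

# tests/e2e/test_state_machine.py - end-to-end test sketch; the ARN comes from an assumed env var.
import json
import os
import time

import boto3

sfn = boto3.client("stepfunctions")

def test_state_machine_end_to_end():
    state_machine_arn = os.environ["TEST_STATE_MACHINE_ARN"]  # assumed to be set by the pipeline

    # Start an execution with a known sample input.
    execution = sfn.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps({"numerator": 10, "denominator": 2}),
    )

    # Poll until the execution reaches a terminal state.
    while True:
        result = sfn.describe_execution(executionArn=execution["executionArn"])
        if result["status"] != "RUNNING":
            break
        time.sleep(2)

    # Assert on the final status and the output payload.
    assert result["status"] == "SUCCEEDED"
    assert json.loads(result["output"]) == 5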

Manual approval (SNS topic notification)

Before the CI/CD pipeline proceeds to a production deployment, there should be a formal approval phase. Using the Manual Approval stage in AWS CodePipeline, you can configure the pipeline to halt and send a message to an Amazon SNS topic before moving on. The SNS topic can have a variety of subscribers, but in this case, subscribe an approver's email address to the topic so that they are notified whenever an approval is requested. Once the approver approves the change, the pipeline proceeds with deploying the production version of the Step Functions state machine.

This Manual Approval stage can be configured in the AWS console using a configuration similar to the following:

Manual approval

Deploying to Production

After the linting, unit testing, end-to-end testing, and Manual Approval phases have passed, you can move on to deploying the Step Functions state machine to production. This phase is similar to the Deploy Test Stage phase, except the name of your AWS CloudFormation stack is different. In this case, you again take advantage of the AWS CloudFormation deploy action in CodePipeline:

Deploy to production

After this stage completes successfully, your pipeline execution is complete.

Cleanup

After validating that the test state machine and Lambda functions work, include a CloudFormation step that tears down the test infrastructure, as it is no longer needed. This can be configured as a new CodePipeline step similar to the following configuration:

CloudFormation Template for cleaning up resources

Conclusion

You have linted and validated your AWS States Language definition, unit tested your Lambda function code, deployed a test AWS state machine, run end-to-end tests, received Manual Approval to deploy to Production, and deployed to Production. This gives you and your team confidence that any changes made to your state machine and surrounding Lambda function code perform correctly in Production.

 

About the Author

matt noyce profile photo

 

Matt Noyce is a Cloud Application Architect in Professional Services at Amazon Web Services.
He works with customers to architect, design, automate, and build solutions on AWS
for their business needs.

Building Windows containers with AWS CodePipeline and custom actions

Post Syndicated from Dmitry Kolomiets original https://aws.amazon.com/blogs/devops/building-windows-containers-with-aws-codepipeline-and-custom-actions/

Dmitry Kolomiets, DevOps Consultant, Professional Services

AWS CodePipeline and AWS CodeBuild are the primary AWS services for building CI/CD pipelines. AWS CodeBuild supports a wide range of build scenarios thanks to various built-in Docker images. It also allows you to bring in your own custom image in order to use different tools and environment configurations. However, there are some limitations in using custom images.

Considerations for custom Docker images:

  • AWS CodeBuild has to download a new copy of the Docker image for each build job, which may take a long time for large Docker images.
  • AWS CodeBuild provides a limited set of instance types to run the builds. You might have to use a custom image if the build job requires higher memory, CPU, graphical subsystems, or any other functionality that is not part of the out-of-the-box provided Docker image.

Windows-specific limitations

  • AWS CodeBuild supports Windows builds only in a limited number of AWS regions at this time.
  • AWS CodeBuild executes Windows Server containers using Windows Server 2016 hosts, which means that build containers are huge: it is not uncommon to have an image size of 15 GB or more (with the .NET Framework SDK installed). Windows Server 2019 containers, which are roughly half the size, cannot be used due to the host-container version mismatch.
  • AWS CodeBuild runs build jobs inside Docker containers. You must enable privileged mode in order to build and publish Linux Docker images as part of your build job. However, Docker-in-Docker (DinD) is not supported on Windows and, therefore, AWS CodeBuild cannot be used to build Windows Server container images.

The last point is the critical one for microservice-style applications based on Microsoft stacks (.NET Framework, Web API, IIS). The usual workflow for this kind of application is to build a Docker image, push it to Amazon ECR, and update the Amazon ECS or Amazon EKS cluster deployment.

Here is what I cover in this post:

  • How to address the limitations stated above by implementing AWS CodePipeline custom actions (applicable for both Linux and Windows environments).
  • How to use the created custom action to define a CI/CD pipeline for Windows Server containers.

CodePipeline custom actions

By using Amazon EC2 instances, you can address the limitations with Windows Server containers and enable Windows build jobs in the regions where AWS CodeBuild does not provide native Windows build environments. To accommodate the specific needs of a build job, you can pick one of the many Amazon EC2 instance types available.

The downside of this approach is additional management burden—neither AWS CodeBuild nor AWS CodePipeline support Amazon EC2 instances directly. There are ways to set up a Jenkins build cluster on AWS and integrate it with CodeBuild and CodeDeploy, but these options are too “heavy” for the simple task of building a Docker image.

There is a different way to tackle this problem: AWS CodePipeline provides APIs that allow you to extend a build action through custom actions. This example demonstrates how to add a custom action to offload a build job to an Amazon EC2 instance.

Here is the generic sequence of steps that the custom action performs:

  • Acquire EC2 instance (see the Notes on Amazon EC2 build instances section).
  • Download AWS CodePipeline artifacts from Amazon S3.
  • Execute the build command and capture any errors.
  • Upload output artifacts to be consumed by subsequent AWS CodePipeline actions.
  • Update the status of the action in AWS CodePipeline.
  • Release the Amazon EC2 instance.

Notice that most of these steps are the same regardless of the actual build job being executed. However, the following parameters will differ between CI/CD pipelines and, therefore, have to be configurable:

  • Instance type (t2.micro, t3.2xlarge, etc.)
  • AMI (builds could have different prerequisites in terms of OS configuration, software installed, Docker images downloaded, etc.)
  • Build command line(s) to execute (MSBuild script, bash, Docker, etc.)
  • Build job timeout

Serverless custom action architecture

CodePipeline custom build action can be implemented as an agent component installed on an Amazon EC2 instance. The agent polls CodePipeline for build jobs and executes them on the Amazon EC2 instance. There is an example of such an agent on GitHub, but this approach requires installation and configuration of the agent on all Amazon EC2 instances that carry out the build jobs.

Instead, I want to introduce an architecture that enables any Amazon EC2 instance to be a build agent without additional software and configuration required. The architecture diagram looks as follows:

Serverless custom action architecture

There are multiple components involved:

  1. An Amazon CloudWatch Event triggers an AWS Lambda function when a custom CodePipeline action is to be executed.
  2. The Lambda function retrieves the action’s build properties (AMI, instance type, etc.) from CodePipeline, along with location of the input artifacts in the Amazon S3 bucket.
  3. The Lambda function starts a Step Functions state machine that carries out the build job execution, passing all the gathered information as input payload.
  4. The Step Functions flow acquires an Amazon EC2 instance according to the provided properties, waits until the instance is up and running, and starts an AWS Systems Manager command. The Step Functions flow is also responsible for handling all the errors during build job execution and releasing the Amazon EC2 instance once the Systems Manager command execution is complete.
  5. The Systems Manager command runs on an Amazon EC2 instance, downloads CodePipeline input artifacts from the Amazon S3 bucket, unzips them, executes the build script, and uploads any output artifacts to the CodePipeline-provided Amazon S3 bucket.
  6. Polling Lambda updates the state of the custom action in CodePipeline once it detects that the Step Function flow is completed.

The whole architecture is serverless and requires no maintenance in terms of software installed on Amazon EC2 instances thanks to the Systems Manager command, which is essential for this solution. All the code, AWS CloudFormation templates, and installation instructions are available on the GitHub project. The following sections provide further details on the mentioned components.

Custom Build Action

The custom action type is defined as an AWS::CodePipeline::CustomActionType resource as follows:

  Ec2BuildActionType: 
    Type: AWS::CodePipeline::CustomActionType
    Properties: 
      Category: !Ref CustomActionProviderCategory
      Provider: !Ref CustomActionProviderName
      Version: !Ref CustomActionProviderVersion
      ConfigurationProperties: 
        - Name: ImageId 
          Description: AMI to use for EC2 build instances.
          Key: true 
          Required: true
          Secret: false
          Queryable: false
          Type: String
        - Name: InstanceType
          Description: Instance type for EC2 build instances.
          Key: true 
          Required: true
          Secret: false
          Queryable: false
          Type: String
        - Name: Command
          Description: Command(s) to execute.
          Key: true 
          Required: true
          Secret: false
          Queryable: false
          Type: String 
        - Name: WorkingDirectory 
          Description: Working directory for the command to execute.
          Key: true 
          Required: false
          Secret: false
          Queryable: false
          Type: String 
        - Name: OutputArtifactPath 
          Description: Path of the file(-s) or directory(-es) to use as custom action output artifact.
          Key: true 
          Required: false
          Secret: false
          Queryable: false
          Type: String 
      InputArtifactDetails: 
        MaximumCount: 1
        MinimumCount: 0
      OutputArtifactDetails: 
        MaximumCount: 1
        MinimumCount: 0 
      Settings: 
        EntityUrlTemplate: !Sub "https://${AWS::Region}.console.aws.amazon.com/systems-manager/documents/${RunBuildJobOnEc2Instance}"
        ExecutionUrlTemplate: !Sub "https://${AWS::Region}.console.aws.amazon.com/states/home#/executions/details/{ExternalExecutionId}"

The custom action type is uniquely identified by Category, Provider name, and Version.

Category defines the stage of the pipeline in which the custom action can be used, such as build, test, or deploy. Check the AWS documentation for the full list of allowed values.

Provider name and Version are the values used to identify the custom action type in the CodePipeline console or AWS CloudFormation templates. Once the custom action type is installed, you can add it to the pipeline, as shown in the following screenshot:

Adding custom action to the pipeline

The custom action type also defines a list of user-configurable properties—these are the properties identified above as specific for different CI/CD pipelines:

  • AMI Image ID
  • Instance Type
  • Command
  • Working Directory
  • Output artifacts

The properties are configurable in the CodePipeline console, as shown in the following screenshot:

Custom action properties

Note the last two settings in the Custom Action Type AWS CloudFormation definition: EntityUrlTemplate and ExecutionUrlTemplate.

EntityUrlTemplate defines the link to the AWS Systems Manager document that carries out the build actions. The link is visible in the AWS CodePipeline console, as shown in the following screenshot:

Custom action's EntityUrlTemplate link

ExecutionUrlTemplate defines the link to additional information related to a specific execution of the custom action. The link is also visible in the CodePipeline console, as shown in the following screenshot:

Custom action's ExecutionUrlTemplate link

This URL is defined as a link to the Step Functions execution details page, which provides high-level information about the custom build step execution, as shown in the following screenshot:

Custom build step execution

This page is a convenient visual representation of the custom action execution flow and may be useful for troubleshooting purposes, as it gives immediate access to error messages and logs.

The polling Lambda function

The Lambda function polls CodePipeline for custom actions when it is triggered by the following CloudWatch event:

  source: 
    - "aws.codepipeline"
  detail-type: 
    - "CodePipeline Action Execution State Change"
  detail: 
    state: 
      - "STARTED"

The event is triggered for every CodePipeline action started, so the Lambda function should verify if, indeed, there is a custom action to be processed.

The rest of the Lambda function is straightforward and relies on the following APIs to retrieve or update CodePipeline actions and to manage Step Functions state machine executions:

CodePipeline API

AWS Step Functions API

You can find the complete source of the Lambda function on GitHub.
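
A minimal sketch of that polling logic follows. It assumes the custom action's category, provider, version, and the state machine ARN are supplied through environment variables; the real implementation on GitHub handles job validation and error cases in more detail:

# Hypothetical sketch of the polling Lambda; environment variable names are assumptions.
import json
import os

import boto3

codepipeline = boto3.client("codepipeline")
sfn = boto3.client("stepfunctions")

def lambda_handler(event, context):
    # Ask CodePipeline whether a job is waiting for our custom action type.
    jobs = codepipeline.poll_for_jobs(
        actionTypeId={
            "category": os.environ["ACTION_CATEGORY"],  # e.g. "Build"
            "owner": "Custom",
            "provider": os.environ["ACTION_PROVIDER"],
            "version": os.environ["ACTION_VERSION"],
        },
        maxBatchSize=1,
    )["jobs"]

    for job in jobs:
        # Claim the job so no other worker picks it up.
        codepipeline.acknowledge_job(jobId=job["id"], nonce=job["nonce"])

        # Hand the job data (action configuration, artifact locations) to Step Functions.
        sfn.start_execution(
            stateMachineArn=os.environ["STATE_MACHINE_ARN"],
            input=json.dumps({"jobId": job["id"], "data": job["data"]}, default=str),
        )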

Step Functions state machine

The following diagram shows the complete Step Functions state machine. There are three main blocks on the diagram:

  • Acquiring an Amazon EC2 instance and waiting while the instance is registered with Systems Manager
  • Running a Systems Manager command on the instance
  • Releasing the Amazon EC2 instance

Note that it is necessary to release the Amazon EC2 instance if an error or exception occurs during Systems Manager command execution; the workflow relies on Fallback States to guarantee this.

You can find the complete definition of the Step Function state machine on GitHub.

Step Functions state machine

Systems Manager Document

The AWS Systems Manager Run Command does all the magic. The Systems Manager agent is pre-installed on AWS Windows and Linux AMIs, so no additional software is required. The Systems Manager run command executes the following steps to carry out the build job:

  1. Download input artifacts from Amazon S3.
  2. Unzip artifacts in the working folder.
  3. Run the command.
  4. Upload output artifacts to Amazon S3, if any; this makes them available for the following CodePipeline stages.

The preceding steps are operating-system agnostic, and both Linux and Windows instances are supported. The following code snippet shows the Windows-specific steps.

You can find the complete definition of the Systems Manager document on GitHub.

mainSteps:
  - name: win_enable_docker
    action: aws:configureDocker
    inputs:
      action: Install

  # Windows steps
  - name: windows_script
    precondition:
      StringEquals: [platformType, Windows]
    action: aws:runPowerShellScript
    inputs:
      runCommand:
        # Ensure that if a command fails the script does not proceed to the following commands
        - "$ErrorActionPreference = \"Stop\""

        - "$jobDirectory = \"{{ workingDirectory }}\""
        # Create temporary folder for build artifacts, if not provided
        - "if ([string]::IsNullOrEmpty($jobDirectory)) {"
        - "    $parent = [System.IO.Path]::GetTempPath()"
        - "    [string] $name = [System.Guid]::NewGuid()"
        - "    $jobDirectory = (Join-Path $parent $name)"
        - "    New-Item -ItemType Directory -Path $jobDirectory"
                # Set current location to the new folder
        - "    Set-Location -Path $jobDirectory"
        - "}"

        # Download/unzip input artifact
        - "Read-S3Object -BucketName {{ inputBucketName }} -Key {{ inputObjectKey }} -File artifact.zip"
        - "Expand-Archive -Path artifact.zip -DestinationPath ."

        # Run the build commands
        - "$directory = Convert-Path ."
        - "$env:PATH += \";$directory\""
        - "{{ commands }}"
        # We need to check exit code explicitly here
        - "if (-not ($?)) { exit $LASTEXITCODE }"

        # Compress output artifacts, if specified
        - "$outputArtifactPath  = \"{{ outputArtifactPath }}\""
        - "if ($outputArtifactPath) {"
        - "    Compress-Archive -Path $outputArtifactPath -DestinationPath output-artifact.zip"
                # Upload compressed artifact to S3
        - "    $bucketName = \"{{ outputBucketName }}\""
        - "    $objectKey = \"{{ outputObjectKey }}\""
        - "    if ($bucketName -and $objectKey) {"
                    # Don't forget to encrypt the artifact - CodePipeline bucket has a policy to enforce this
        - "        Write-S3Object -BucketName $bucketName -Key $objectKey -File output-artifact.zip -ServerSideEncryption aws:kms"
        - "    }"
        - "}"
      workingDirectory: "{{ workingDirectory }}"
      timeoutSeconds: "{{ executionTimeout }}"
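
Outside of the Step Functions flow, the same document can be exercised with a single Systems Manager API call, which is essentially what the workflow does. In this boto3 sketch the instance ID, document name, bucket names, and parameter values are placeholders:

# Sketch: invoking the build document directly; all identifiers below are placeholders.
import boto3

ssm = boto3.client("ssm")

response = ssm.send_command(
    InstanceIds=["i-0123456789abcdef0"],      # the acquired build instance
    DocumentName="RunBuildJobOnEc2Instance",  # placeholder document name
    TimeoutSeconds=3600,
    Parameters={
        "commands": ["docker build -t my-image ."],
        "workingDirectory": [""],
        "inputBucketName": ["my-codepipeline-artifact-bucket"],
        "inputObjectKey": ["pipeline/source/artifact.zip"],
        "outputArtifactPath": [""],
        "outputBucketName": [""],
        "outputObjectKey": [""],
        "executionTimeout": ["3600"],
    },
)

# The Step Functions flow polls for completion; the equivalent manual check is:
status = ssm.get_command_invocation(
    CommandId=response["Command"]["CommandId"],
    InstanceId="i-0123456789abcdef0",
)["Status"]
print(status)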

CI/CD pipeline for Windows Server containers

Once you have a custom action that offloads the build job to the Amazon EC2 instance, you may approach the problem stated at the beginning of this blog post: how to build and publish Windows Server containers on AWS.

With the custom action installed, the solution is quite straightforward. To build a Windows Server container image, you need to provide the value for Windows Server with Containers AMI, the instance type to use, and the command line to execute, as shown in the following screenshot:

Windows Server container custom action properties

This example executes the Docker build command on a Windows instance with the specified AMI and instance type, using the provided source artifact. In real life, you may want to keep the build script along with the source code and push the built image to a container registry. The following is a PowerShell script example that not only produces a Docker image but also pushes it to Amazon ECR:

# Authenticate with ECR
Invoke-Expression -Command (Get-ECRLoginCommand).Command

# Build and push the image
docker build -t <ecr-repository-url>:latest .
docker push <ecr-repository-url>:latest

return $LASTEXITCODE

You can find a complete example of the pipeline that produces the Windows Server container image and pushes it to Amazon ECR on GitHub.

Notes on Amazon EC2 build instances

There are a few ways to get Amazon EC2 instances for custom build actions. Let’s take a look at a couple of them below.

Start new EC2 instance per job and terminate it at the end

This is a reasonable default strategy that is implemented in this GitHub project. Each time the pipeline needs to process a custom action, you start a new Amazon EC2 instance, carry out the build job, and terminate the instance afterwards.

This approach is easy to implement. It works well for scenarios in which you don’t have many builds and/or builds take some time to complete (tens of minutes). In this case, the time required to provision an instance is amortized. Conversely, if the builds are fast, instance provisioning time could actually be longer than the time required to carry out the build job.

Use a pool of running Amazon EC2 instances

There are cases when it is required to keep builder instances “warm”, either due to complex initialization or merely to reduce the build duration. To support this scenario, you could maintain a pool of always-running instances. The “acquisition” phase takes a warm instance from the pool and the “release” phase returns it back without terminating or stopping the instance. A DynamoDB table can be used as a registry to keep track of “busy” instances and provide waiting or scaling capabilities to handle high demand.

This approach works well for scenarios in which there are many builds and demand is predictable (e.g. during work hours).
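
As a sketch of the acquisition step, a conditional update against the registry table guarantees that only one workflow claims a given instance. The table name, key schema, and attribute names below are assumptions for illustration:

# Hypothetical "acquire" helper for a DynamoDB-backed pool of warm build instances.
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client("dynamodb")

def acquire_instance(table_name="build-instance-registry"):
    # Find instances currently marked as free.
    free = dynamodb.scan(
        TableName=table_name,
        FilterExpression="#s = :free",
        ExpressionAttributeNames={"#s": "status"},
        ExpressionAttributeValues={":free": {"S": "FREE"}},
    )["Items"]

    for item in free:
        instance_id = item["instanceId"]["S"]
        try:
            # The condition expression ensures only one caller wins the race.
            dynamodb.update_item(
                TableName=table_name,
                Key={"instanceId": {"S": instance_id}},
                UpdateExpression="SET #s = :busy",
                ConditionExpression="#s = :free",
                ExpressionAttributeNames={"#s": "status"},
                ExpressionAttributeValues={":busy": {"S": "BUSY"}, ":free": {"S": "FREE"}},
            )
            return instance_id
        except ClientError as e:
            if e.response["Error"]["Code"] != "ConditionalCheckFailedException":
                raise  # unexpected error; surface it
    return None  # no free instance; the workflow can wait and retry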

Use a pool of stopped Amazon EC2 instances

This is an interesting approach, especially for Windows builds. All AWS Windows AMIs are generalized using the Sysprep tool. The important implication of this is that the first start time for Windows EC2 instances is quite long: it could easily take more than 5 minutes. This is generally unacceptable for short-lived build jobs (if your build takes just a minute, it is annoying to wait 5 minutes to start the instance).

Interestingly, once the Windows instance is initialized, subsequent starts take less than a minute. To utilize this, you could create a pool of initialized and stopped Amazon EC2 instances. In this case, for the acquisition phase, you start the instance, and when you need to release it, you stop or hibernate it.

This approach provides substantial improvements in terms of build start-up time.

The downside is that you reuse the same Amazon EC2 instance between builds, so it is not a completely clean environment. Build jobs have to be designed to expect the presence of artifacts from previous executions on the build instance.

Using an Amazon EC2 fleet with spot instances

Another variation of the previous strategies is to use Amazon EC2 Fleet to make use of cost-efficient spot instances for your build jobs.

Amazon EC2 Fleet makes it possible to combine on-demand instances with spot instances to deliver a cost-efficient solution for your build jobs. On-demand instances provide the minimum required capacity, and spot instances provide a cost-efficient way to improve the performance of your build fleet.

Note that since spot instances could be terminated at any time, the Step Functions workflow has to support Amazon EC2 instance termination and restart the build on a different instance transparently for CodePipeline.

Limits and Cost

The following are a few final thoughts.

Custom action timeouts

The default maximum execution time for CodePipeline custom actions is one hour. If your build jobs require more than an hour, you need to request a limit increase for custom actions.

Cost of running EC2 build instances

Custom Amazon EC2 instances could be even more cost effective than CodeBuild for many scenarios. However, it is difficult to compare the total cost of ownership of a custom-built fleet with CodeBuild. CodeBuild is a fully managed build service and you pay for each minute of using the service. In contrast, with Amazon EC2 instances you pay for the instance either per hour or per second (depending on instance type and operating system), EBS volumes, Lambda, and Step Functions. Please use the AWS Simple Monthly Calculator to get the total cost of your projected build solution.

Cleanup

If you ran the above steps as part of a workshop or for testing, delete the resources to avoid incurring further charges. All resources are deployed as part of an AWS CloudFormation stack: in the AWS Management Console, navigate to CloudFormation, select the stack, and choose Delete to remove it.

Conclusion

The CodePipeline custom action is a simple way to utilize Amazon EC2 instances for your build jobs and address a number of CodePipeline limitations.

With the AWS CloudFormation template available on GitHub, you can import the CodePipeline custom action with a simple Start/Terminate instance strategy into your account and start using the custom action in your pipelines right away.

An example of the pipeline that produces Windows Server containers and pushes them to Amazon ECR can also be found on GitHub.

I invite you to clone the repositories to play with the custom action, and to make any changes to the action definition, Lambda functions, or Step Functions flow.

Feel free to ask any questions or comments below, or file issues or PRs on GitHub to continue the discussion.

ICYMI: Serverless Q4 2019

Post Syndicated from Rob Sutter original https://aws.amazon.com/blogs/compute/icymi-serverless-q4-2019/

Welcome to the eighth edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

In case you missed our last ICYMI, check out what happened last quarter here.

The three months comprising the fourth quarter of 2019

AWS re:Invent

AWS re:Invent 2019

re:Invent 2019 dominated the fourth quarter at AWS. The serverless team presented a number of talks, workshops, and builder sessions to help customers increase their skills and deliver value more rapidly to their own customers.

Serverless talks from re:Invent 2019

Chris Munns presenting 'Building microservices with AWS Lambda' at re:Invent 2019

We presented dozens of sessions showing how customers can improve their architecture and agility with serverless. Here are some of the most popular.

Videos

Decks

You can also find decks for many of the serverless presentations and other re:Invent presentations on our AWS Events Content.

AWS Lambda

For developers needing greater control over performance of their serverless applications at any scale, AWS Lambda announced Provisioned Concurrency at re:Invent. This feature enables Lambda functions to execute with consistent start-up latency, making them ideal for building latency-sensitive applications.

As shown in the following graph, provisioned concurrency reduces tail latency, directly impacting response times and providing a more responsive end user experience.

Graph showing performance enhancements with AWS Lambda Provisioned Concurrency
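
As a sketch, provisioned concurrency can be configured for a published version or alias of a function with a single API call; the function name, alias, and concurrency value below are placeholders:

# Sketch: enabling Provisioned Concurrency on a function alias (placeholder names and values).
import boto3

lambda_client = boto3.client("lambda")

lambda_client.put_provisioned_concurrency_config(
    FunctionName="my-latency-sensitive-function",  # placeholder
    Qualifier="live",                              # published version or alias
    ProvisionedConcurrentExecutions=50,            # placeholder capacity
)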

Lambda rolled out enhanced VPC networking to 14 additional Regions around the world. This change brings dramatic improvements to startup performance for Lambda functions running in VPCs due to more efficient usage of elastic network interfaces.

Illustration of AWS Lambda VPC to VPC NAT

New VPC to VPC NAT for Lambda functions

Lambda now supports three additional runtimes: Node.js 12, Java 11, and Python 3.8. Each of these new runtimes has new version-specific features and benefits, which are covered in the linked release posts. Like the Node.js 10 runtime, these new runtimes are all based on an Amazon Linux 2 execution environment.

Lambda released a number of controls for both stream and async-based invocations:

  • You can now configure error handling for Lambda functions consuming events from Amazon Kinesis Data Streams or Amazon DynamoDB Streams. It’s now possible to limit the retry count, limit the age of records being retried, configure a failure destination, or split a batch to isolate a problem record. These capabilities help you deal with potential “poison pill” records that would previously cause streams to pause in processing.
  • For asynchronous Lambda invocations, you can now set the maximum event age and retry attempts on the event. If either configured condition is met, the event can be routed to a dead letter queue (DLQ), Lambda destination, or it can be discarded.

AWS Lambda Destinations is a new feature that allows developers to designate an asynchronous target for Lambda function invocation results. You can set separate destinations for success and failure. This unlocks new patterns for distributed event-based applications and can replace custom code previously used to manage routing results.

Illustration depicting AWS Lambda Destinations with success and failure configurations

Lambda Destinations
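
As a sketch, both the asynchronous invocation controls and the destinations can be configured with one API call; the function name and destination ARNs below are placeholders:

# Sketch: retry/age limits and success/failure destinations for async invocations.
import boto3

lambda_client = boto3.client("lambda")

lambda_client.put_function_event_invoke_config(
    FunctionName="my-function",      # placeholder
    MaximumRetryAttempts=1,          # retry failed async events once
    MaximumEventAgeInSeconds=3600,   # drop or route events older than an hour
    DestinationConfig={
        "OnSuccess": {"Destination": "arn:aws:sqs:us-east-1:123456789012:success-queue"},
        "OnFailure": {"Destination": "arn:aws:sns:us-east-1:123456789012:failure-topic"},
    },
)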

Lambda also now supports setting a Parallelization Factor, which allows you to set multiple Lambda invocations per shard for Kinesis Data Streams and DynamoDB Streams. This enables faster processing without the need to increase your shard count, while still guaranteeing the order of records processed.

Illustration of multiple AWS Lambda invocations per Kinesis Data Streams shard

Lambda Parallelization Factor diagram

Lambda introduced Amazon SQS FIFO queues as an event source. “First in, first out” (FIFO) queues guarantee the order of record processing, unlike standard queues. FIFO queues support message batching via a MessageGroupID attribute, which allows parallel Lambda consumers of a single FIFO queue, enabling high throughput of record processing by Lambda.

Lambda now supports Environment Variables in the AWS China (Beijing) Region and the AWS China (Ningxia) Region.

You can now view percentile statistics for the duration metric of your Lambda functions. Percentile statistics show the relative standing of a value in a dataset, and are useful when applied to metrics that exhibit large variances. They can help you understand the distribution of a metric, discover outliers, and find hard-to-spot situations that affect customer experience for a subset of your users.

Amazon API Gateway

Screen capture of creating an Amazon API Gateway HTTP API in the AWS Management Console

Amazon API Gateway announced the preview of HTTP APIs. In addition to significant performance improvements, most customers see an average cost savings of 70% when compared with API Gateway REST APIs. With HTTP APIs, you can create an API in four simple steps. Once the API is created, additional configuration for CORS and JWT authorizers can be added.

AWS SAM CLI

Screen capture of the new 'sam deploy' process in a terminal window

The AWS SAM CLI team simplified the bucket management and deployment process in the SAM CLI. You no longer need to manage a bucket for deployment artifacts – SAM CLI handles this for you. The deployment process has also been streamlined from multiple flagged commands to a single command, sam deploy.

AWS Step Functions

One powerful feature of AWS Step Functions is its ability to integrate directly with AWS services without you needing to write complicated application code. In Q4, Step Functions expanded its integration with Amazon SageMaker to simplify machine learning workflows. Step Functions also added a new integration with Amazon EMR, making EMR big data processing workflows faster to build and easier to monitor.

Screen capture of an AWS Step Functions step with Amazon EMR

Step Functions step with EMR

Step Functions now provides the ability to track state transition usage by integrating with AWS Budgets, allowing you to monitor trends and react to usage on your AWS account.

You can now view CloudWatch Metrics for Step Functions at a one-minute frequency. This makes it easier to set up detailed monitoring for your workflows. You can use one-minute metrics to set up CloudWatch Alarms based on your Step Functions API usage, Lambda functions, service integrations, and execution details.
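
For example, a one-minute alarm on failed executions could be configured as in the following sketch; the state machine ARN, alarm name, and threshold are placeholders:

# Sketch: a CloudWatch alarm on Step Functions execution failures at one-minute resolution.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="state-machine-execution-failures",  # placeholder
    Namespace="AWS/States",
    MetricName="ExecutionsFailed",
    Dimensions=[{
        "Name": "StateMachineArn",
        "Value": "arn:aws:states:us-east-1:123456789012:stateMachine:MyStateMachine",  # placeholder
    }],
    Statistic="Sum",
    Period=60,                 # one-minute metrics
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    TreatMissingData="notBreaching",
)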

Step Functions now supports higher throughput workflows, making it easier to coordinate applications with high event rates. This increases the limits to 1,500 state transitions per second and a default start rate of 300 state machine executions per second in US East (N. Virginia), US West (Oregon), and Europe (Ireland). Click the above link to learn more about the limit increases in other Regions.

Screen capture of choosing Express Workflows in the AWS Management Console

Step Functions released AWS Step Functions Express Workflows. With the ability to support event rates greater than 100,000 per second, this feature is designed for high-performance workloads at a reduced cost.

Amazon EventBridge

Illustration of the Amazon EventBridge schema registry and discovery service

Amazon EventBridge announced the preview of the Amazon EventBridge schema registry and discovery service. This service allows developers to automate discovery and cataloging event schemas for use in their applications. Additionally, once a schema is stored in the registry, you can generate and download a code binding that represents the schema as an object in your code.

Amazon SNS

Amazon SNS now supports the use of dead letter queues (DLQ) to help capture unhandled events. By enabling a DLQ, you can catch events that are not processed and re-submit them or analyze to locate processing issues.

Amazon CloudWatch

Amazon CloudWatch announced Amazon CloudWatch ServiceLens to provide a “single pane of glass” to observe health, performance, and availability of your application.

Screenshot of Amazon CloudWatch ServiceLens in the AWS Management Console

CloudWatch ServiceLens

CloudWatch also announced a preview of a capability called Synthetics. CloudWatch Synthetics allows you to test your application endpoints and URLs using configurable scripts that mimic what a real customer would do. This enables the outside-in view of your customers’ experiences, and your service’s availability from their point of view.

CloudWatch introduced Embedded Metric Format, which helps you ingest complex high-cardinality application data as logs and easily generate actionable metrics. You can publish these metrics from your Lambda function by using the PutLogEvents API or using an open source library for Node.js or Python applications.

Finally, CloudWatch announced a preview of Contributor Insights, a capability to identify who or what is impacting your system or application performance by identifying outliers or patterns in log data.

AWS X-Ray

AWS X-Ray announced trace maps, which enable you to map the end-to-end path of a single request. Identifiers show issues and how they affect other services in the request’s path. These can help you to identify and isolate service points that are causing degradation or failures.

X-Ray also announced support for Amazon CloudWatch Synthetics, currently in preview. CloudWatch Synthetics on X-Ray support tracing canary scripts throughout the application, providing metrics on performance or application issues.

Screen capture of AWS X-Ray Service map in the AWS Management Console

X-Ray Service map with CloudWatch Synthetics

Amazon DynamoDB

Amazon DynamoDB announced support for customer-managed customer master keys (CMKs) to encrypt data in DynamoDB. This allows customers to bring your own key (BYOK) giving you full control over how you encrypt and manage the security of your DynamoDB data.

It is now possible to add global replicas to existing DynamoDB tables to provide enhanced availability across the globe.

Another new DynamoDB capability to identify frequently accessed keys and database traffic trends is currently in preview. With this, you can now more easily identify “hot keys” and understand usage of your DynamoDB tables.

Screen capture of Amazon CloudWatch Contributor Insights for DynamoDB in the AWS Management Console

CloudWatch Contributor Insights for DynamoDB

DynamoDB also released adaptive capacity. Adaptive capacity helps you handle imbalanced workloads by automatically isolating frequently accessed items and shifting data across partitions to rebalance them. This helps reduce cost by enabling you to provision throughput for a more balanced workload instead of over provisioning for uneven data access patterns.

Amazon RDS

Amazon Relational Database Services (RDS) announced a preview of Amazon RDS Proxy to help developers manage RDS connection strings for serverless applications.

Illustration of Amazon RDS Proxy

The RDS Proxy maintains a pool of established connections to your RDS database instances. This pool enables you to support a large number of application connections so your application can scale without compromising performance. It also increases security by enabling IAM authentication for database access and enabling you to centrally manage database credentials using AWS Secrets Manager.

AWS Serverless Application Repository

The AWS Serverless Application Repository (SAR) now offers Verified Author badges. These badges enable consumers to quickly and reliably know who you are. The badge appears next to your name in the SAR and links to your GitHub profile.

Screen capture of SAR Verified developer badge in the AWS Management Console

SAR Verified developer badges

AWS Developer Tools

AWS CodeCommit launched the ability for you to enforce rule workflows for pull requests, making it easier to ensure that code has passed through specific rule requirements. You can now create an approval rule specifically for a pull request, or create approval rule templates to be applied to all future pull requests in a repository.

AWS CodeBuild added beta support for test reporting. With test reporting, you can now view the detailed results, trends, and history for tests executed on CodeBuild for any framework that supports the JUnit XML or Cucumber JSON test format.

Screen capture of AWS CodeBuild

CodeBuild test trends in the AWS Management Console

Amazon CodeGuru

AWS announced a preview of Amazon CodeGuru at re:Invent 2019. CodeGuru is a machine learning based service that makes code reviews more effective and aids developers in writing code that is more secure, performant, and consistent.

AWS Amplify and AWS AppSync

AWS Amplify added iOS and Android as supported platforms. Now developers can build iOS and Android applications using the Amplify Framework with the same category-based programming model that they use for JavaScript apps.

Screen capture of 'amplify init' for an iOS application in a terminal window

The Amplify team has also improved offline data access and synchronization by announcing Amplify DataStore. Developers can now create applications that allow users to continue to access and modify data, without an internet connection. Upon connection, the data synchronizes transparently with the cloud.

For a summary of Amplify and AppSync announcements before re:Invent, read: “A round up of the recent pre-re:Invent 2019 AWS Amplify Launches”.

Illustration of AWS AppSync integrations with other AWS services

Q4 serverless content

Blog posts

October

November

December

Tech talks

We hold several AWS Online Tech Talks covering serverless tech talks throughout the year. These are listed in the Serverless section of the AWS Online Tech Talks page.

Here are the ones from Q4:

Twitch

October

There are also a number of other helpful video series covering Serverless available on the AWS Twitch Channel.

AWS Serverless Heroes

We are excited to welcome some new AWS Serverless Heroes to help grow the serverless community. We look forward to some amazing content to help you with your serverless journey.

AWS Serverless Application Repository (SAR) Apps

In this edition of ICYMI, we are introducing a section devoted to SAR apps written by the AWS Serverless Developer Advocacy team. You can run these applications and review their source code to learn more about serverless and to see examples of suggested practices.

Still looking for more?

The Serverless landing page has much more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials. We’re also kicking off a fresh series of Tech Talks in 2020 with new content providing greater detail on everything new coming out of AWS for serverless application developers.

Throughout 2020, the AWS Serverless Developer Advocates are crossing the globe to tell you more about serverless, and to hear more about what you need. Follow this blog to keep up on new launches and announcements, best practices, and examples of serverless applications in action.

You can also follow all of us on Twitter to see latest news, follow conversations, and interact with the team.

Chris Munns: @chrismunns
Eric Johnson: @edjgeek
James Beswick: @jbesw
Moheeb Zara: @virgilvox
Ben Smith: @benjamin_l_s
Rob Sutter: @rts_rob
Julian Wood: @julian_wood

Happy coding!

Orchestrating a security incident response with AWS Step Functions

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/orchestrating-a-security-incident-response-with-aws-step-functions/

In this post I will show how to implement the callback pattern of an AWS Step Functions Standard Workflow. This is used to add a manual approval step into an automated security incident response framework. The framework could be extended to remediate automatically, according to the individual policy actions defined. For example, applying alternative actions, or restricting actions to specific ARNs.

The application uses Amazon EventBridge to trigger a Step Functions Standard Workflow on an IAM policy creation event. The workflow compares the policy action against a customizable list of restricted actions. It uses AWS Lambda and Step Functions to roll back the policy temporarily, then notify an administrator and wait for them to approve or deny.

Figure 1: High-level architecture diagram.

Important: the application uses various AWS services, and there are costs associated with these services after the Free Tier usage. Please see the AWS pricing page for details.

You can deploy this application from the AWS Serverless Application Repository. You then create a new IAM Policy to trigger the rule and run the application.

Deploy the application from the Serverless Application Repository

  1. Find the “Automated-IAM-policy-alerts-and-approvals” app in the Serverless Application Repository.
  2. Complete the required application settings
    • Application name: an identifiable name for the application.
    • EmailAddress: an administrator’s email address for receiving approval requests.
    • restrictedActions: the IAM Policy actions you want to restrict.

      Figure 2 Deployment Fields

  3. Choose Deploy.

Once the deployment process is completed, 21 new resources are created. This includes:

  • Five Lambda functions that contain the business logic.
  • An Amazon EventBridge rule.
  • An Amazon SNS topic and subscription.
  • An Amazon API Gateway REST API with two resources.
  • An AWS Step Functions state machine

To receive Amazon SNS notifications as the application administrator, you must confirm the subscription to the SNS topic. To do this, choose the Confirm subscription link in the verification email that was sent to you when deploying the application.

EventBridge receives new events in the default event bus. Here, the event is compared with associated rules. Each rule has an event pattern defined, which acts as a filter to match inbound events to their corresponding rules. In this application, a matching event rule triggers an AWS Step Functions execution, passing in the event payload from the policy creation event.
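
The deployed application creates this rule for you, but as an illustration the same wiring could be expressed with boto3 roughly as follows; the event pattern shape, ARNs, and names are assumptions shown only to make the routing concrete:

# Sketch: an EventBridge rule matching IAM CreatePolicy calls and targeting the state machine.
import json

import boto3

events = boto3.client("events")

rule_name = "iam-policy-created"  # placeholder

events.put_rule(
    Name=rule_name,
    EventPattern=json.dumps({
        "source": ["aws.iam"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {"eventSource": ["iam.amazonaws.com"], "eventName": ["CreatePolicy"]},
    }),
)

events.put_targets(
    Rule=rule_name,
    Targets=[{
        "Id": "policy-approval-workflow",
        "Arn": "arn:aws:states:us-east-1:123456789012:stateMachine:PolicyApproval",    # placeholder
        "RoleArn": "arn:aws:iam::123456789012:role/EventBridgeInvokeStepFunctions",    # placeholder
    }],
)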

Running the application

Trigger the application by creating a policy either via the AWS Management Console or with the AWS Command Line Interface.

Using the AWS CLI

First install and configure the AWS CLI, then run the following command:

aws iam create-policy --policy-name my-bad-policy1234 --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketObjectLockConfiguration",
                "s3:DeleteObjectVersion",
                "s3:DeleteBucket"
            ],
            "Resource": "*"
        }
    ]
}'

Using the AWS Management Console

  1. Go to Services > Identity Access Management (IAM) dashboard.
  2. Choose Create policy.
  3. Choose the JSON tab.
  4. Paste the following JSON:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketObjectLockConfiguration",
                    "s3:DeleteObjectVersion",
                    "s3:DeleteBucket"
                ],
                "Resource": "*"
            }
        ]
    }
  5. Choose Review policy.
  6. In the Name field, enter my-bad-policy.
  7. Choose Create policy.

Either of these methods creates a policy with the permissions required to delete Amazon S3 buckets. Deleting an S3 bucket is one of the restricted actions set when the application is deployed:

Figure 3 default restricted actions

This sends the event to EventBridge, which then triggers the Step Functions state machine. The Step Functions state machine holds each state object in the workflow. Some of the state objects use the Lambda functions created during deployment to process data.

Others use the Amazon States Language (ASL), enabling the application to conditionally branch, wait, and transition to the next state. Using a state machine decouples the business logic from the compute functionality.

After triggering the application, go to the Step Functions dashboard and choose the newly created state machine. Then choose the currently running execution from the executions table.

Figure 4 State machine executions.

You see a visual representation of the current execution, with the workflow paused at the AskUser state.

Figure 5 Workflow Paused

These are the states in the workflow:

ModifyData
State Type: Pass
Re-structures the input data into an object that is passed throughout the workflow.

ValidatePolicy
State type: Task. Services: AWS Lambda
Invokes the ValidatePolicy Lambda function that checks the new policy document against the restricted actions.

ChooseAction
State type: Choice
Branches depending on input from ValidatePolicy step.

TempRemove
State type: Task. Service: AWS Lambda
Creates a new default version of the policy with only permissions for Amazon CloudWatch Logs and deletes the previously created policy version.

AskUser
State type: Task. Service: AWS Lambda
Sends an approval email to user via SNS, with the task token that initiates the callback pattern.

UsersChoice
State type: Choice
Branch based on the user action to approve or deny.

Denied
State type: Pass
Ends the execution with no further action.

Approved
State type: Task. Service: AWS Lambda
Restores the initial policy document by creating as a new version.

AllowWithNotification
State type: Task. Services: AWS Lambda
With no restricted actions detected, the user is still notified of change (via an email from SNS) before execution ends.
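
As an illustration, the core of the ValidatePolicy check can be reduced to a few lines. The event shape and the restrictedActions environment variable below are assumptions based on the deployment parameters, not the exact implementation:

# Hypothetical sketch of the restricted-action check performed by ValidatePolicy.
import json
import os

def lambda_handler(event, context):
    # Comma-separated list supplied at deployment, e.g. "s3:DeleteBucket,s3:DeleteObjectVersion".
    restricted = set(os.environ["restrictedActions"].split(","))

    # The new policy document is assumed to arrive in the event payload as a JSON string.
    document = json.loads(event["policy"]["policyDocument"])

    found = []
    for statement in document.get("Statement", []):
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        found.extend(action for action in actions if action in restricted)

    # The ChooseAction Choice state branches on this result.
    return {"action": "remedy" if found else "allow", "restrictedActions": found}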

The callback pattern

An important feature of this application is the ability for an administrator to approve or deny a new policy. The Step Functions callback pattern makes this possible.

The callback pattern allows a workflow to pause during a task and wait for an external process to return a task token. The task token is generated when the task starts. When the AskUser function is invoked, it is passed a task token. The task token is published to the SNS topic along with the API resources for approval and denial. These API resources are created when the application is first deployed.

When the administrator clicks the approve or deny link, the task token is passed with the API request to the receiveUser Lambda function. This Lambda function uses the incoming task token to resume the AskUser state.

The lifecycle of the task token as it transitions through each service is shown below:

Figure 6 Task token lifecycle

  1. To invoke this callback pattern, the askUser state definition is declared using the .waitForTaskToken identifier, with the task token passed into the Lambda function as a payload parameter:
    "AskUser":{
     "Type": "Task",
     "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
     "Parameters":{  
     "FunctionName": "${AskUser}",
     "Payload":{  
     "token.$":"$$.Task.Token"
      }
     },
      "ResultPath":"$.taskresult",
      "Next": "usersChoice"
      },
  2. The askUser Lambda function can then access this token within the event object:
    exports.handler = async (event,context) => {
        let approveLink = `${process.env.APIAllowEndpoint}?token=${JSON.stringify(event.token)}`
        let denyLink = `${process.env.APIDenyEndpoint}?token=${JSON.stringify(event.token)}`
    //code continues
  3. The task token is published to an SNS topic along with the message text parameter:
        let params = {
     TopicArn: process.env.Topic,
     Message: `A restricted Policy change has been detected Approve:${approveLink} Or Deny:${denyLink}` 
    }
     let res = await sns.publish(params).promise()
    //code continues
  4. The administrator receives an email with two links, one to approve and one to deny. The task token is appended to these links as a request query string parameter named token:

    Figure 7 Approve / deny email.

  5. Using the Amazon API Gateway proxy integration, the task token is passed directly to the receiveUser Lambda function from the API resource, and is accessible from within the function code as part of the event’s queryStringParameters object:
    exports.handler = async(event, context) => {
    //some code
        let taskToken = event.queryStringParameters.token
    //more code
    
  6. The token is then sent back to the askUser state via an API call from within the receiveUser Lambda function. This API call also defines the next course of action for the workflow to take.
    //some code 
    let params = {
            output: JSON.stringify({"action":NextAction}),
            taskToken: taskTokenClean
        }
    let res = await stepfunctions.sendTaskSuccess(params).promise()
    //code continues
    

Each Step Functions execution can last for up to a year, allowing for long wait periods for the administrator to take action. There is no extra cost for a longer wait time as you pay for the number of state transitions, and not for the idle wait time.

Conclusion

Using EventBridge to route IAM policy creation events directly to AWS Step Functions reduces the need for unnecessary communication layers. It helps promote good use of compute resources, ensuring Lambda is used to transform data, and not transport or orchestrate.

Using Step Functions to invoke services sequentially has two important benefits for this application. First, you can identify the use of restricted policies quickly and automatically. Also, these policies can be removed and held in a ‘pending’ state until approved.

Step Functions Standard Workflow’s callback pattern can create a robust orchestration layer that allows administrators to review each change before approving or denying.

For the full code base see the GitHub repository https://github.com/bls20AWS/AutomatedPolicyOrchestrator.

For more information on other Step Functions patterns, see our documentation on integration patterns.

ICYMI: Serverless re:Invent re:Cap 2019

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/icymi-serverless-reinvent-recap-2019/

Thank you for attending re:Invent 2019

In the week before AWS re:Invent 2019 we wrote about a number of service and feature launches leading up to the biggest event of the year for us at AWS. These included new features for AWS Lambda, integrations for AWS Step Functions, and other exciting service and feature launches for related product areas. But this was just the warm-up: AWS re:Invent 2019 itself saw several new serverless or serverless-related announcements.

Here’s what’s new.

AWS Lambda

For developers needing greater control over performance of their serverless applications at any scale, AWS Lambda announced Provisioned Concurrency. This feature enables Lambda functions to execute with consistent start-up latency making them ideal for building latency sensitive applications.

AWS Step Functions

Express Workflows

AWS Step Functions released AWS Step Functions Express Workflows. With the ability to support event rates greater than 100,000 per second, this feature is designed for high performance workloads at a reduced cost.

Amazon EventBridge

EventBridge schema registry and discovery

Amazon EventBridge announced the preview of the Amazon EventBridge schema registry and discovery service. This service allows developers to automate the discovery and cataloging of event schemas for use in their applications. Additionally, once a schema is stored in the registry, you can generate and download a code binding that represents the schema as an object in your code.

Amazon API Gateway

HTTP API

Amazon API Gateway announced the preview of HTTP APIs. With HTTP APIs, most customers will see an average cost saving of up to 70% when compared to API Gateway REST APIs. In addition, you will see significant performance improvements in the API Gateway service overhead. With HTTP APIs, you can create an API in four simple steps. Once the API is created, additional configuration for CORS and JWT authorizers can be added.

Databases

Amazon Relational Database Service (RDS) announced a preview of Amazon RDS Proxy to help developers manage RDS connection strings for serverless applications.

RDS Proxy

The RDS proxy maintains a pool of established connections to your RDS database instances. This pool enables you to support a large number of application connections so your application can scale without compromising performance. It also increases security by enabling IAM authentication for database access and enabling you to centrally manage database credentials using AWS Secrets Manager.

AWS Amplify

Amplify platform choices

AWS Amplify has expanded their delivery platforms to include iOS and Android. Developers can now build iOS and Android applications using the Amplify Framework with the same category-based programming model that they use for JavaScript apps.

The Amplify team has also improved offline data access and synchronization by announcing Amplify DataStore. Developers can now create applications that allow users to continue to access and modify data, without an internet connection. Upon connection, the data synchronizes transparently with the cloud.

Amazon CodeGuru

Whether you are a team of one or an enterprise with thousands of developers, code review can be difficult. At re:Invent 2019, AWS announced a preview of Amazon CodeGuru, a machine learning based service to help make code reviews more effective and aid developers in writing code that is secure, performant, and consistent.

Serverless talks from re:Invent 2019

re:Invent presentation recordings

We presented dozens of sessions showing how customers can improve their architecture and agility with serverless. Here are some of the most popular.

Videos

Decks

You can also find decks for many of the serverless presentations and other re:Invent presentations on our AWS Events Content page.

Conclusion

Prior to AWS re:Invent, AWS serverless had many service and feature launches and the pace continued throughout re:Invent itself. As we head towards 2020, follow this blog to keep up on new launches and announcements, best practices, and examples of serverless applications in action.

Additionally, the AWS Serverless Developer Advocates will be crossing the globe to tell you more about serverless, and to hear more about what you need. You can also follow all of us on Twitter to see the latest news, follow conversations, and interact with the team.

Chris Munns: @chrismunns
Eric Johnson: @edjgeek
James Beswick: @jbesw
Moheeb Zara: @virgilvox
Ben Smith: @benjamin_l_s
Rob Sutter: @rts_rob
Julian Wood: @julian_wood

Happy coding!

New Express Workflows for AWS Step Functions

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/new-express-workflows-for-aws-step-functions/

Today, AWS is introducing Express Workflows for AWS Step Functions. This is a new workflow type to orchestrate AWS services at a higher-throughput than existing workflows.

Developers have been using AWS Step Functions since December 2016 to assemble long running workflows to orchestrate AWS Lambda Functions and other serverless services. Customers were looking for better ways to optimize for workloads that require higher event rates and shorter durations.

The new AWS Step Functions Express Workflows type uses fast, in-memory processing for high-event-rate workloads of up to 100,000 state transitions per second, for a total workflow duration of up to 5 minutes. Express Workflows are suited to streaming data processing, IoT data ingestion, mobile backends, and other high-throughput use-cases. Existing workflows in AWS Step Functions are now called Standard Workflows.

Getting started

You can build and run Express Workflows using the AWS Management Console, AWS CLI, or AWS CloudFormation. The steps below explain how to build an Express Workflow from within the AWS Management Console:

1.    In the AWS Step Functions console, choose Create state machine.

Figure 1 Creating a state machine from the AWS Step Functions console

2.    In the Type section, choose Express.

Figure 2 New Express Workflow option

3.    Enter a name for the Express Workflow in the State machine name field. In the Definition section, you can generate code snippets from a selection of example use cases.

4.    Leave the default definition and choose Next.

Figure 3 Define your workflow with ASL

You can create a new role for the Express Workflow execution or use an existing role.

5.    Leave Create New Role selected, and give it a logical name in the Role name field.

A new section named Logging configuration appears in the AWS Management Console. Here you can choose the level of logging sent to CloudWatch Logs. For Express Workflows, you must enable logging to inspect and debug executions.

6.    Choose ALL for Log level and leave the defaults for Include execution data and CloudWatch log group fields.

Figure 4 Logging configuration

7.    Choose Create state machine.

Once the Express Workflow is created, the Details section shows the type as Express.

Figure 5 New ‘type’ in details section

Executing an Express Workflow from within the AWS Management Console is the same process as with a Standard Workflow.

Combining workflows

Applications may require a combination of both long-running and high-event-rate workflows. For example, the initial step in a workflow may involve ingesting and processing IoT stream data (Express Workflow), followed by executing a long-running machine learning model to derive insights (Standard Workflow).

Applications can benefit from the wait state of a Step Functions Standard Workflow and the shorter duration cost efficiencies of an Express Workflow when used together. An example support ticket automation application shows the SaaS integration capabilities of Amazon EventBridge, using Amazon Comprehend for sentiment analysis on support tickets.

Figure 6 Support ticket automation application architecture

You can improve part of the workflow by moving the short-duration Lambda ‘Set tags’, ’Set category’, and ‘Escalate priority’ functions into a nested Express Workflow to orchestrate the recurring ‘Process Ticket’ stage. The Step Functions Standard Workflow uses the ‘Wait’ task to pause for a set amount of time until the Express Workflow completes. The Express Workflow is a distinct child workflow with its own Success or Fail task state from within the parent workflow. The child workflow must complete within the parent’s duration limit and uses the parent retry policy.
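As a rough illustration of this composition, the following boto3 sketch creates a parent Standard Workflow whose definition starts a nested Express Workflow with the states:startExecution service integration and then pauses with a Wait state. The child state machine ARN, IAM role, state names, and 30-second wait are placeholder assumptions, not values from the application above.

# A minimal sketch, assuming a hypothetical child Express Workflow ARN and role;
# the real application's states and field names will differ.
import json

import boto3

CHILD_EXPRESS_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:ProcessTicketExpress"  # assumed

parent_definition = {
    "StartAt": "StartProcessTicket",
    "States": {
        "StartProcessTicket": {
            "Type": "Task",
            # Service integration that starts another state machine execution
            "Resource": "arn:aws:states:::states:startExecution",
            "Parameters": {
                "StateMachineArn": CHILD_EXPRESS_ARN,
                "Input.$": "$"   # pass the parent's input through to the child
            },
            "ResultPath": "$.childExecution",
            "Next": "WaitForChild"
        },
        "WaitForChild": {
            "Type": "Wait",
            "Seconds": 30,       # pause the parent while the Express child runs
            "Next": "Done"
        },
        "Done": {"Type": "Succeed"}
    }
}

sfn = boto3.client("stepfunctions")
sfn.create_state_machine(
    name="SupportTicketParent",
    definition=json.dumps(parent_definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsExecutionRole",  # assumed role
    type="STANDARD",
)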

Figure 7 Improved support ticket automation application

Monitoring and logging with Express Workflows

Standard Workflows show execution history and visual debugging in the Step Functions console. Express Workflows send execution history to CloudWatch Logs. Two new tabs, Monitoring and Logging, have been added to the AWS Step Functions console to gain visibility into Express Workflow executions.

The Monitoring tab shows six different graphs with CloudWatch metrics for Execution errors, Execution Succeeded, Execution Duration, Billed Duration, Billed Memory, and Executions Started.

The Logging tab shows the logging configuration with a quick link to CloudWatch Logs.

Figure 8 New logging and monitoring tabs

State execution details are visible in CloudWatch Logs. CloudWatch Logs Insights provides search and log analysis with a purpose-built query language.
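For example, here is a minimal boto3 sketch that runs a CloudWatch Logs Insights query over an Express Workflow log group; the log group name and query string are assumptions for illustration.

# A minimal sketch: run a CloudWatch Logs Insights query over an (assumed)
# Express Workflow log group and fetch the results once the query completes.
import time

import boto3

logs = boto3.client("logs")

LOG_GROUP = "/aws/vendedlogs/states/MyExpressWorkflow-Logs"  # assumed log group name

query = logs.start_query(
    logGroupName=LOG_GROUP,
    startTime=int(time.time()) - 3600,   # last hour
    endTime=int(time.time()),
    queryString="fields @timestamp, type, details.name | sort @timestamp desc | limit 20",
)

# Poll until the query finishes, then print each result row.
while True:
    response = logs.get_query_results(queryId=query["queryId"])
    if response["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(1)

for row in response["results"]:
    print({field["field"]: field["value"] for field in row})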

API changes

The Step Functions API has been updated to integrate Express Workflows for both creating and updating a state machine.

New parameter: Type, String
State machines have a new field called Type, which can be Standard or Express. The default type is Standard, to maintain backwards compatibility.

New parameter: Log-level, Integer
Defines the log detail level for CloudWatch Logs.

New exception: InvalidStateMachineType
This is thrown when the type is not Express or Standard, or when the definition contains an Activity task or a .sync or .waitForTaskToken connector that Express Workflows do not support.

Creating state machine with these new parameters:

–        API call without using Type returns a Standard Workflow state machine (the default).

$ aws stepfunctions create-state-machine \
 --definition "{}" \
 --name "FlightTicketHandler" \
 --execution-role-arn "arn:aws:iam:::role:roleName/user" \
 --log-level 1 \
 --tags "Key=CreatedBy,Value=Diego,Key=stack,Value=Production"

–        API Call using Type Express

$ aws stepfunctions create-state-machine \
 --definition "{}" \
 --name "FlightTicketHandler" \
 --execution-role-arn "arn:aws:iam:::role:roleName/user" \
 --log-level 1 \
 --tags "Key=CreatedBy,Value=Diego,Key=stack,Value=Production" \
 --type "EXPRESS"

–        API Call using Type Standard

$ aws stepfunctions create-state-machine \
 --definition "{}" \
 --name "FlightTicketHandler" \
 --execution-role-arn "arn:aws:iam:::role:roleName/user" \
 --log-level 1 \
 --tags "Key=CreatedBy,Value=Diego,Key=stack,Value=Production" \
 --type "STANDARD"

–        Updating the log level of a state machine

$ aws stepfunctions update-state-machine \
 --state-machine-arn "arn:aws:states:us-east-1:123456789012:stateMachine:${StateMachineName}" \
 --log-level 0 

For more information regarding the API changes, refer to the API documentation.
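The same calls are available from the AWS SDKs. Here is a minimal boto3 sketch that creates an Express state machine with logging enabled; the role ARN, log group ARN, and the trivial Pass-state definition are placeholder assumptions.

# A minimal boto3 sketch: create an Express state machine with CloudWatch Logs
# enabled. The role ARN, log group ARN, and Pass-state definition are assumed.
import json

import boto3

sfn = boto3.client("stepfunctions")

definition = {
    "StartAt": "HelloExpress",
    "States": {"HelloExpress": {"Type": "Pass", "End": True}},
}

sfn.create_state_machine(
    name="FlightTicketHandlerExpress",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsExpressRole",  # assumed
    type="EXPRESS",
    loggingConfiguration={
        "level": "ALL",
        "includeExecutionData": True,
        "destinations": [
            {
                "cloudWatchLogsLogGroup": {
                    # assumed log group ARN
                    "logGroupArn": "arn:aws:logs:us-east-1:123456789012:log-group:/aws/vendedlogs/states/FlightTicketHandler:*"
                }
            }
        ],
    },
    tags=[{"key": "CreatedBy", "value": "Diego"}],
)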

Comparing Standard and Express Workflows

Both workflow types use the declarative code semantics of Amazon States Language to build the workflow definition. Here are some of the key differences:

  • Supported execution start rate: Express – over 100,000 per second; Standard – over 2,000 per second
  • Max run time: Express – 5 minutes; Standard – 1 year
  • Execution guarantee: Express – at least once; Standard – exactly once
  • Execution logging: Express – available in CloudWatch Logs; Standard – in the Step Functions service
  • Pricing: Express – $1.00 per million invocations, plus tiered pricing based on memory and duration; Standard – $25.00 per million state transitions
  • Service integrations and patterns: Express – supports all service integrations, but does not support Job-run (.sync) or Callback (.waitForTaskToken) integration patterns; Standard – supports all service integrations and patterns

Conclusion

Many customers are already using AWS Step Functions Standard Workflows to orchestrate long-running, auditable workloads. The addition of a new Express Workflow type adds a lower-priced, higher-throughput workflow capability to AWS Step Functions.

Express Workflows complement AWS Step Functions Standard Workflows. Developers now have the power to choose the workflow type that best suits their needs, or choose to mix and blend as appropriate.

You can get started with Express Workflows via the AWS Management Console, AWS CLI, or AWS CloudFormation. It is available in all AWS Regions where AWS Step Functions is available.

For more information on where AWS Step Functions is available, see the AWS Region Table. For pricing for Express Workflows, see pricing.

New – AWS Step Functions Express Workflows: High Performance & Low Cost

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-aws-step-functions-express-workflows-high-performance-low-cost/

We launched AWS Step Functions at re:Invent 2016, and our customers took to the service right away, using it as a core element of their multi-step workflows. Today, we see customers building serverless workflows that orchestrate machine learning training, report generation, order processing, IT automation, and many other multi-step processes. These workflows can run for up to a year, and are built around a workflow model that includes checkpointing, retries for transient failures, and detailed state tracking for auditing purposes.

Based on usage and feedback, our customers really like the core Step Functions model. They love the declarative specifications and the ease with which they can build, test, and scale their workflows. In fact, customers like Step Functions so much that they want to use them for high-volume, short-duration use cases such as IoT data ingestion, streaming data processing, and mobile application backends.

New Express Workflows
Today we are launching Express Workflows as an option to the existing Standard Workflows. The Express Workflows use the same declarative specification model (the Amazon States Language) but are designed for those high-volume, short-duration use cases. Here’s what you need to know:

Triggering – You can use events and read/write API calls associated with a long list of AWS services to trigger execution of your Express Workflows.

Execution Model – Express Workflows use an at-least-once execution model, and will not attempt to automatically retry any failed steps, but you can use Retry and Catch, as described in Error Handling. The steps are not checkpointed, so per-step status information is not available. Successes and failures are logged to CloudWatch Logs, and you have full control over the logging level.
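As a minimal illustration of explicit retries, the sketch below shows a task state with Retry and Catch blocks expressed as a Python dictionary in Amazon States Language form; the Lambda ARN and the HandleFailure state are placeholders.

# A minimal ASL fragment, written as a Python dict, showing explicit Retry and
# Catch on a Lambda task. The Lambda ARN and "HandleFailure" state are assumed.
import json

ingest_state = {
    "IngestRecord": {
        "Type": "Task",
        "Resource": "arn:aws:lambda:us-east-1:123456789012:function:IngestRecord",
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 1,
                "MaxAttempts": 2,
                "BackoffRate": 2.0
            }
        ],
        "Catch": [
            {"ErrorEquals": ["States.ALL"], "Next": "HandleFailure"}
        ],
        "End": True
    }
}

print(json.dumps(ingest_state, indent=2))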

Workflow Steps – Express Workflows support many of the same service integrations as Standard Workflows, with the exception of Activity Tasks. You can initiate long-running services such as AWS Batch, AWS Glue, and Amazon SageMaker, but you cannot wait for them to complete.

Duration – Express Workflows can run for up to five minutes of wall-clock time. They can invoke other Express or Standard Workflows, but cannot wait for them to complete. You can also invoke Express Workflows from Standard Workflows, composing both types in order to meet the needs of your application.

Event Rate – Express Workflows are designed to support a per-account invocation rate greater than 100,000 events per second. Accounts are configured for 6,000 events per second by default and we will, as usual, raise it on request.

Pricing – Standard Workflows are priced based on the number of state transitions. Express Workflows are priced based on the number of invocations and a GB/second charge based on the amount of memory used to track the state of the workflow during execution. While the pricing models are not directly comparable, Express Workflows will be far more cost-effective at scale. To learn more, read about AWS Step Functions Pricing.

As you can see, most of what you already know about Standard Workflows also applies to Express Workflows! You can replace some of your Standard Workflows with Express Workflows, and you can use Express Workflows to build new types of applications.

Using Express Workflows
I can create an Express Workflow and attach it to any desired events with just a few minutes of work. I simply choose the Express type in the console:

Then I define my state machine:

I configure the CloudWatch logging, and add a tag:

Now I can attach my Express Workflow to my event source. I open the EventBridge Console and create a new rule:

I define a pattern that matches PutObject events on a single S3 bucket:

I select my Express Workflow as the event target, add a tag, and click Create:

The particular event will occur only if I have a CloudTrail trail that is set up to record object-level activity:

Then I upload an image to my bucket, and check the CloudWatch Logs group to confirm that my workflow ran as expected:

As a more realistic test, I can upload several hundred images at once and confirm that my Lambda functions are invoked with high concurrency:

I can also use the new Monitoring tab in the Step Functions console to view the metrics that are specific to the state machine:

Available Now
You can create and use AWS Step Functions Express Workflows today in all AWS Regions!

Jeff;

Decoupled Serverless Scheduler To Run HPC Applications At Scale on EC2

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/decoupled-serverless-scheduler-to-run-hpc-applications-at-scale-on-ec2/

This post is written by Ludvig Nordstrom and Mark Duffield | on November 27, 2019

In this blog post, we dive in to a cloud native approach for running HPC applications at scale on EC2 Spot Instances, using a decoupled serverless scheduler. This architecture is ideal for many workloads in the HPC and EDA industries, and can be used for any batch job workload.

At the end of this blog post, you will have two takeaways.

  1. A highly scalable environment that can run on hundreds of thousands of cores across EC2 Spot Instances.
  2. A fully serverless architecture for job orchestration.

We discuss deploying and running a pre-built serverless job scheduler that can run both Windows and Linux applications using any executable file format for your application. This environment provides high performance, scalability, cost efficiency, and fault tolerance. We introduce best practices and benefits to creating this environment, and cover the architecture, running jobs, and integration in to existing environments.

A quick note about the term cloud native: we use the term loosely in this blog. Here, cloud native means we use AWS services (including serverless and microservices) to build out our compute environment, instead of a traditional lift-and-shift method.

Let’s get started!

 

Solution overview

This blog goes over the deployment process, which leverages AWS CloudFormation. This allows you to use infrastructure as code to automatically build out your environment. There are two parts to the solution: the Serverless Scheduler and Resource Automation. Below are quick summaries of each part of the solutions.

Part 1 – The serverless scheduler

This first part of the blog builds out a serverless workflow to get jobs from SQS and run them across EC2 instances. The CloudFormation template being used for Part 1 is serverless-scheduler-app.template, and here is the Reference Architecture:

 


    Figure 1: Serverless Scheduler Reference Architecture (grayed-out area is covered in Part 2).

Read the GitHub Repo if you want to look at the Step Functions workflow contained in preceding images. The walkthrough explains how the serverless application retrieves and runs jobs on its worker, updates DynamoDB job monitoring table, and manages the worker for its lifetime.

 

Part 2 – Resource automation with serverless scheduler


This part of the solution relies on the serverless scheduler built in Part 1 to run jobs on EC2.  Part 2 simplifies submitting and monitoring jobs, and retrieving results for users. Jobs are spread across our cost-optimized Spot Instances. AWS Autoscaling automatically scales up the compute resources when jobs are submitted, then terminates them when jobs are finished. Both of these save you money.

The CloudFormation template used in Part 2 is resource-automation.template. Building on Figure 1, the additional resources launched with Part 2 are noted in the following image, they are an S3 Bucket, AWS Autoscaling Group, and two Lambda functions.


 

Figure 2: Resource Automation using Serverless Scheduler

                               

Introduction to decoupled serverless scheduling

HPC schedulers traditionally run in a classic master and worker node configuration. A scheduler on the master node orchestrates jobs on worker nodes. This design has been successful for decades, however many powerful schedulers are evolving to meet the demands of HPC workloads. This scheduler design evolved from a necessity to run orchestration logic on one machine, but there are now options to decouple this logic.

What are the possible benefits that decoupling this logic could bring? First, we avoid a number of shortfalls in the environment such as the need for all worker nodes to communicate with a single master node. This single source of communication limits scalability and creates a single point of failure. When we split the scheduler into decoupled components both these issues disappear.

Second, in an effort to work around these pain points, traditional schedulers had to create extremely complex logic to manage all workers concurrently in a single application. This stifled the ability to customize and improve the code – restricting changes to be made by the software provider’s engineering teams.

Serverless services, such as AWS Step Functions and AWS Lambda fix these major issues. They allow you to decouple the scheduling logic to have a one-to-one mapping with each worker, and instead share an Amazon Simple Queue Service (SQS) job queue. We define our scheduling workflow in AWS Step Functions. Then the workflow scales out to potentially thousands of “state machines.” These state machines act as wrappers around each worker node and manage each worker node individually.  Our code is less complex because we only consider one worker and its job.

We illustrate the differences between a traditional shared scheduler and decoupled serverless scheduler in Figures 3 and 4.

 


Figure 3: Traditional Scheduler Model

 


Figure 4: Decoupled Serverless Scheduler on each instance

 

Each decoupled serverless scheduler will:

  • Retrieve and pass jobs to its worker
  • Monitor its worker’s health and take action if needed
  • Confirm job success by checking output logs and retry jobs if needed
  • Terminate the worker when job queue is empty just before also terminating itself

With this new scheduler model, there are many benefits. Decoupling schedulers into smaller schedulers increases fault tolerance because any issue only affects one worker. Additionally, each scheduler consists of independent AWS Lambda functions, which maintains the state on separate hardware and builds retry logic into the service.  Scalability also increases, because jobs are not dependent on a master node, which enables the geographic distribution of jobs. This geographic distribution allows you to optimize use of low-cost Spot Instances. Also, when decoupling the scheduler, workflow complexity decreases and you can customize scheduler logic. You can leverage lower latency job monitoring and customize automated responses to job events as they happen.

 

Benefits

  • Fully managed – With Part 2 (Resource Automation) deployed, resources for a job are managed for you. When a job is submitted, resources launch and run the job. When the job is done, worker nodes automatically shut down. This prevents you from incurring continuous costs.

 

  • Performance – Your application runs on EC2, which means you can choose any of the high performance instance types. Input files are automatically copied from Amazon S3 into local Amazon EC2 Instance Store for high performance storage during execution. Result files are automatically moved to S3 after each job finishes.

 

  • Scalability – A worker node combined with a scheduler state machine become a stateless entity. You can spin up as many of these entities as you want, and point them to an SQS queue. You can even distribute worker and state machine pairs across multiple AWS regions. These two components paired with fully managed services optimize your architecture for scalability to meet your desired number of workers.

 

  • Fault Tolerance – The solution is completely decoupled, which means each worker has its own state machine that handles scheduling for that worker. Likewise, each state machine is decoupled into the Lambda functions that make it up. Additionally, the scheduler workflow includes a Lambda function that confirms each successful job or resubmits jobs.

 

  • Cost Efficiency – This fault tolerant environment is perfect for EC2 Spot Instances. This means you can save up to 90% on your workloads compared to On-Demand Instance pricing. The scheduler workflow ensures little to no idle time of workers by closely monitoring and sending new jobs as jobs finish. Because the scheduler is serverless, you only incur costs for the resources required to launch and run jobs. Once the job is complete, all are terminated automatically.

 

  • Agility – You can use AWS fully managed Developer Tools to quickly release changes and customize workflows. The reduced complexity of a decoupled scheduling workflow means that you don’t have to spend time managing a scheduling environment, and can instead focus on your applications.

 

 

Part 1 – serverless scheduler as a standalone solution

 

If you use the serverless scheduler as a standalone solution, you can build clusters and leverage shared storage such as FSx for Lustre, EFS, or S3. Additionally, you can use AWS CloudFormation to deploy more complex compute architectures that suit your application. So, the EC2 instances that run the serverless scheduler can be launched in any number of ways. The scheduler only requires the instance ID and the SQS job queue name.

 

Submitting Jobs Directly to serverless scheduler

The serverless scheduler app is a fully built AWS Step Functions workflow to pull jobs from an SQS queue and run them on an EC2 instance. The jobs submitted to SQS consist of an AWS Systems Manager Run Command, and work with any SSM document and command that you choose for your jobs. Examples of SSM Run Commands are ShellScript and PowerShell. Feel free to read more about Running Commands Using Systems Manager Run Command.

The following code shows the format of a job submitted to SQS in JSON.

  {
    "job_id": "jobId_0",
    "retry": "3",
    "job_success_string": " ",
    "ssm_document": "AWS-RunPowerShellScript",
    "commands":
        [
            "cd C:\\ProgramData\\Amazon\\SSM; mkdir Result",
            "Copy-S3object -Bucket my-bucket -KeyPrefix jobs/date/jobId_0 -LocalFolder .\\",
            "C:\\ProgramData\\Amazon\\SSM\\jobId_0.bat",
            "Write-S3object -Bucket my-bucket -KeyPrefix jobs/date/jobId_0 -Folder .\\Result\\"
        ]
  }

 

Any EC2 instance associated with a serverless scheduler receives jobs from its designated SQS queue until the queue is empty. Then, the EC2 resource automatically terminates. If a job fails, it is retried up to the number of times specified in the job definition. You can also include a specific string value so that the scheduler searches the job execution output for it and confirms that the job completed successfully.
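A minimal boto3 sketch of submitting a job in this format to the scheduler’s SQS queue is shown below; the queue name, bucket, commands, and job ID are placeholder assumptions (this variant uses a Linux shell script rather than the PowerShell example above).

# A minimal sketch: build a job definition in the scheduler's JSON format and
# send it to the SQS job queue. Queue name, bucket, and job ID are assumptions.
import json

import boto3

sqs = boto3.client("sqs")
queue_url = sqs.get_queue_url(QueueName="my-sqs-job-queue-name")["QueueUrl"]

job = {
    "job_id": "jobId_0",
    "retry": "3",
    "job_success_string": "Job finished OK",      # string expected in the job's output log
    "ssm_document": "AWS-RunShellScript",
    "commands": [
        "mkdir -p /tmp/jobId_0 && cd /tmp/jobId_0",
        "aws s3 sync s3://my-bucket/jobs/date/jobId_0 .",
        "bash jobId_0.sh",
        "aws s3 sync ./Result s3://my-bucket/jobs/date/jobId_0/Result",
    ],
}

sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps(job))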

 

Tagging EC2 workers to get a serverless scheduler state machine

In Part 1 of the deployment, you must manage your EC2 Instance launch and termination. When launching an EC2 Instance, tag it with a specific tag key that triggers a state machine to manage that instance. The tag value is the name of the SQS queue that you want your state machine to poll jobs from.

In the following example, “my-scheduler-cloudformation-stack-name” is the tag key that the serverless scheduler app looks for on any new EC2 instance that starts. “my-sqs-job-queue-name” is the default job queue created with the scheduler, but you can change this to any queue name you want the instance to retrieve jobs from when it is launched.

{"my-scheduler-cloudformation-stack-name":"my-sqs-job-queue-name"}

 

Monitor jobs in DynamoDB

You can monitor job status in the following DynamoDB table. In the table you can find the job_id, commands sent to Amazon EC2, job status, job output logs from Amazon EC2, and retry counts, among other things.

Alternatively, you can query DynamoDB for a given job_id via the AWS Command Line Interface:

aws dynamodb get-item --table-name job-monitoring \
                      --key '{"job_id": {"S": "/my-jobs/my-job-id.bat"}}'

 

Using the “job_success_string” parameter

For the prior DynamoDB table, we submitted two identical jobs using an example script that you can also use. The command sent to the instance is “echo Hello World.” The output from this job should be “Hello World.” We also specified three allowed job retries. In the following image, there are two jobs in the SQS queue before they ran. Look closely at the different “job_success_strings” for each and the identical command sent to both:

Figure: Example DynamoDB output with job information.

From the image we see that Job2 was successful and Job1 retried three times before being permanently labelled as failed. We forced this outcome to demonstrate how the job success string works by submitting Job1 with “job_success_string” as “Hello EVERYONE”, as that will not be in the job output “Hello World.” In “Job2” we set “job_success_string” as “Hello” because we knew this string will be in the output log.

Job outputs commonly have text that only appears if job succeeded. You can also add this text yourself in your executable file. With “job_success_string,” you can confirm a job’s successful output, and use it to identify a certain value that you are looking for across jobs.

 

Part 2 – Resource Automation with the serverless scheduler

The additional services we deploy in Part 2 integrate with existing architectures to launch resources for your serverless scheduler. These services allow you to submit jobs simply by uploading input files and executable files to an S3 bucket.

Likewise, these additional resources can use any executable file format you want, including proprietary application level scripts. The solution automates everything else. This includes creating and submitting jobs to SQS job queue, spinning up compute resources when new jobs come in, and taking them back down when there are no jobs to run. When jobs are done, result files are copied to S3 for the user to retrieve. Similar to Part 1, you can still view the DynamoDB table for job status.

This architecture makes it easy to scale out to different teams and departments, and you can submit potentially hundreds of thousands of jobs while you remain in control of resources and cost.

 

Deeper Look at the S3 Architecture

The following diagram shows how you can submit jobs, monitor progress, and retrieve results. To submit jobs, upload all the needed input files and an executable script to S3. The suffix of the executable file (uploaded last) triggers an S3 event to start the process, and this suffix is configurable.

The S3 key of the executable file acts as the job id, and is kept as a reference to that job in DynamoDB. The Lambda (#2 in diagram below) uses the S3 key of the executable to create three SSM Run Commands.

  1. Synchronize all files in the same S3 folder to a working directory on the EC2 Instance.
  2. Run the executable file on EC2 Instances within a specified working directory.
  3. Synchronize the EC2 Instances working directory back to the S3 bucket where newly generated result files are included.

This Lambda (#2) then places the job on the SQS queue using the schedulers JSON formatted job definition seen above.

IMPORTANT: Each set of job files should be given a unique job folder in S3 or more files than needed might be moved to the EC2 Instance.
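A minimal sketch of a Lambda handler along these lines, deriving the three commands from the executable’s S3 key, is shown below; the bucket layout, working directory, environment variable, and field names are assumptions rather than the repository’s exact code.

# A minimal sketch (not the repository's exact code): derive the three SSM Run
# Command steps from the uploaded executable's S3 key and queue the job.
import json
import os

import boto3

sqs = boto3.client("sqs")

def handler(event, context):
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]                  # e.g. jobs/2019/11/27/jobid/run.sh
    job_folder = os.path.dirname(key)
    workdir = f"/tmp/{os.path.basename(job_folder)}"

    job = {
        "job_id": key,
        "retry": "3",
        "job_success_string": "Job finished OK",
        "ssm_document": "AWS-RunShellScript",
        "commands": [
            # 1. Sync the job folder from S3 to a working directory on the instance
            f"mkdir -p {workdir} && aws s3 sync s3://{bucket}/{job_folder} {workdir}",
            # 2. Run the executable inside the working directory
            f"cd {workdir} && bash {os.path.basename(key)}",
            # 3. Sync the working directory (including results) back to S3
            f"aws s3 sync {workdir} s3://{bucket}/{job_folder}",
        ],
    }
    # JOB_QUEUE_URL is an assumed environment variable pointing at the SQS job queue
    sqs.send_message(QueueUrl=os.environ["JOB_QUEUE_URL"], MessageBody=json.dumps(job))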

 


Figure 5: Resource Automation using Serverless Scheduler – A deeper look

 

The EC2 and Step Functions workflow uses the Lambda function (#3 in the prior diagram) and the Auto Scaling group to scale out based on the number of jobs in the queue, up to the maximum number of workers (plus state machines) defined in the Auto Scaling group. When the job queue is empty, the number of running instances scales down to 0 as they finish their remaining jobs.

 

Process Submitting Jobs and Retrieving Results

  1. As seen in step 1, upload input file(s) and an executable file into a unique job folder in S3 (such as /year/month/day/jobid/~job-files). Upload the executable file last because it automatically starts the job. You can also use a script to upload multiple files at a time, but each job will need a unique directory. There are many ways to make S3 buckets available to users including AWS Storage Gateway, AWS Transfer for SFTP, AWS DataSync, the AWS Console or any one of the AWS SDKs leveraging S3 API calls.
  2. You can monitor job status by accessing the DynamoDB table directly via the AWS Management Console or use the AWS CLI to call DynamoDB via an API call.
  3. As seen in step 5, you can retrieve result files for jobs from the same S3 directory where you left the input files. The DynamoDB table confirms when jobs are done. The SQS output queue can be used by applications that must automatically poll and retrieve results.

You no longer need to create or access compute nodes as compute resources. These automatically scale up from zero when jobs come in, and then back down to zero when jobs are finished.

 

Deployment

Read the GitHub Repo for deployment instructions. Below are CloudFormation templates to help:

Launch Stack links are available for the following AWS Regions: eu-north-1, ap-south-1, eu-west-3, eu-west-2, eu-west-1, ap-northeast-3, ap-northeast-2, ap-northeast-1, sa-east-1, ca-central-1, ap-southeast-1, ap-southeast-2, eu-central-1, us-east-1, us-east-2, us-west-1, and us-west-2.

 

 

Additional Points on Usage Patterns

 

  • While the two solutions in this blog are aimed at HPC applications, they can be used to run any batch jobs. Many customers that run large data processing batch jobs in their data lakes could use the serverless scheduler.

 

  • You can build pipelines of different applications when the output of one job triggers another to do something else – an example being pre-processing, meshing, simulation, post-processing. You simply deploy the Resource Automation template several times, and tailor it so that the output bucket for one step is the input bucket for the next step.

 

  • You might look to use the “job_success_string” parameter for iteration/verification in cases where a shotgun approach is needed to run thousands of jobs, and only one has a chance of producing the right result. In this case the “job_success_string” would identify the successful job from the potentially hundreds of thousands pushed to the SQS job queue.

 

Scale-out across teams and departments

Because all services used are serverless, you can deploy as many run environments as needed without increasing overall costs. Serverless workloads only accumulate cost when the services are used. So, you could deploy ten job environments and run one job in each, and your costs would be the same if you had one job environment running ten jobs.

 

All you need is an S3 bucket to upload jobs to and an associated AMI that has the right applications and license configuration. Because a job configuration is passed to the scheduler at each job start, you can add new teams by creating an S3 bucket and pointing S3 events to a default Lambda function that pulls configurations for each job start.

 

Setup CI/CD pipeline to start continuous improvement of scheduler

If you are advanced, we encourage you to clone the git repo and customize this solution. The serverless scheduler is less complex than other schedulers, because you only think about one worker and the process of one job’s run.

Ways you could tailor this solution:

  • Add intelligent job scheduling using Amazon SageMaker – It is hard to find data as ready for ML as log data, because every job you run has different run times and resource consumption. So, you could tailor this solution to use ML to predict the best instance to use when workloads are submitted.
  • Add Custom Licensing Checkout Logic – Simply add one Lambda function to your Step Functions workflow to make an API call to a license server before continuing with one or more jobs. You can start a new worker when a license is checked out; if a license is not available, the instance can terminate to avoid incurring costs while waiting for licenses (a minimal sketch follows this list).
  • Add Custom Metrics to DynamoDB – You can easily add metrics to DynamoDB because the solution already has baseline logging and monitoring capabilities.
  • Run on other AWS Services – There is a Lambda function in the Step Functions workflow called “Start_Job”. You can tailor this Lambda to run your jobs on Amazon SageMaker, Amazon EMR, Amazon EKS, or Amazon ECS instead of EC2.
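Here is a minimal sketch of the license checkout idea from the list above: a Lambda function that the workflow could call before the Start_Job step. The license server URL, request format, and response fields are assumptions.

# A minimal sketch: check out a license from a (hypothetical) license server
# before a job starts. The URL, response format, and error handling here are
# assumptions for illustration only.
import json
import urllib.request

LICENSE_SERVER_URL = "https://license.example.internal/checkout"  # assumed endpoint

def handler(event, context):
    payload = json.dumps({"feature": event.get("feature", "solver"), "count": 1}).encode()
    request = urllib.request.Request(
        LICENSE_SERVER_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            body = json.loads(response.read())
    except Exception:
        # No license available: tell the workflow so the worker can terminate
        return {"licenseCheckedOut": False}

    return {"licenseCheckedOut": True, "licenseId": body.get("licenseId")}

A Choice state could then branch on licenseCheckedOut to either continue to the job or terminate the worker.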

 

Conclusion

 

Although HPC workloads and EDA flows may still be dependent on current scheduling technologies, we illustrated the possibilities of decoupling your workloads from your existing shared scheduling environments. This post went deep into decoupled serverless scheduling, and we understand that it is difficult to unwind decades of dependencies. However, leveraging numerous AWS Services encourages you to think completely differently about running workloads.

But more importantly, it encourages you to Think Big. With this solution you can get up and running quickly, fail fast, and iterate. You can do this while scaling to your required number of resources, when you want them, and only pay for what you use.

Serverless computing catalyzes change across all industries, but that change is not obvious in the HPC and EDA industries. This solution is an opportunity for customers to take advantage of the nearly limitless capacity that AWS offers.

Please reach out with questions about HPC and EDA on AWS. You now have the architecture and the instructions to build your Serverless Decoupled Scheduling environment.  Go build!


About the Authors and Contributors

Authors

Ludvig Nordstrom is a Senior Solutions Architect at AWS.

Mark Duffield is a Tech Lead in Semiconductors at AWS.

Contributors

Steve Engledow is a Senior Solutions Builder at AWS.

Arun Thomas is a Senior Solutions Builder at AWS.

Automating Zendesk With Amazon EventBridge and AWS Step Functions

Post Syndicated from benjasl original https://aws.amazon.com/blogs/compute/automating-zendesk-with-amazon-eventbridge-and-aws-step-functions/

In July 2019, AWS launched Amazon EventBridge, a serverless event bus that offers third-party software as a service (SaaS) integration capabilities. This service allows applications and AWS services to integrate with each other in near-real time via an event bus. Amazon EventBridge launched with a number of partner integrations, to enable you to quickly connect to some of your favorite SaaS solutions.

This post describes how to deploy an application from the AWS Serverless Application Repository that uses EventBridge to seamlessly integrate with and automate Zendesk. The application performs sentiment analysis on Zendesk support tickets with Amazon Comprehend. It then uses AWS Lambda and AWS Step Functions to categorize and orchestrate the escalation priority, based on configurable SLA wait times.

High-level architecture diagram

This application serves as a starter template for an automated ticket escalation policy. It could be extended to self-serve and remediate automatically, according to the individual tickets submitted. For example, creating database backups in response to release tickets, or creating new user accounts for user access requests.

Important: the application uses various AWS services, and there are costs associated with these services after the Free Tier usage. Please see the AWS pricing page for details. This application also requires a Zendesk account.

To show how AWS services integrate applications or third-party SaaS via EventBridge, you deploy this application from the AWS Serverless Application Repository. You then enable, connect, and configure the EventBridge rules from within the AWS Management Console before triggering the rule and running the application.

Before deploying this application from the AWS Serverless Application Repository, you must generate an API key from within Zendesk.

Creating the Zendesk API Resource

Use an API to execute events on your Zendesk account from AWS. It’s not currently possible to sync bidirectionally between Zendesk and AWS. Follow these steps to generate a Zendesk API Token that is used by the application to authenticate Zendesk API calls.

To generate an API token:

1. Log in to the Zendesk dashboard.

2. Choose the Admin icon in the sidebar, then select Channels > API.

3. Choose the Settings tab, and make sure that Token Access is enabled.

4. Choose the + button to the right of Active API Tokens.

Creating a Zendesk API token

5. Copy the token, and store it securely. Once you close this window, the full token will never be displayed again.

6. Choose Save to return to the API page, which shows a truncated version of the token.

Zendesk API token

Deploy the application from the Serverless Application Repository

1. Go to the deployment page on the Serverless Application Repository.

2. Fill out the required deployment fields:

  • ZenDeskDomain: this appears in the account’s URL: https://[yoursubdomain].zendesk.com.
  • ZenDeskPassword: the API key generated in the earlier step, “Creating the Zendesk API Resource.”
  • ZenDeskUsername: the account’s primary email address.

Deployment Fields

3. Choose Deploy.

Once the deployment process has completed, five new resources have been created. This includes four Lambda functions that perform the individual compute functionality, and one Step Functions state machine.

AWS Step Functions is a serverless orchestration service. It lets you easily coordinate multiple Lambda functions into flexible workflows that are easy to debug and easy to change. The state machine is used to manage the Lambda functions, together with business logic and wait times.

When EventBridge receives a new event, it’s directed into the pre-assigned event bus. Here, it’s compared with associated rules. Each rule has an event pattern defined, which acts as a filter to match inbound events to their corresponding rules. In this application, a matching event rule triggers an AWS Step Functions invocation, passing in the event payload from Zendesk.
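As a rough sketch, the same rule and target can also be created programmatically with boto3; the partner event bus name, account ID, state machine ARN, and role ARN below are placeholder assumptions.

# A minimal boto3 sketch: create a rule on the partner event bus and target the
# application's state machine. Bus name, account ID, and ARNs are assumed.
import json

import boto3

events = boto3.client("events")

EVENT_BUS = "aws.partner/zendesk.com/123456/default"   # assumed partner event bus name
ACCOUNT_ID = "123456789012"                             # assumed

events.put_rule(
    Name="new-zendesk-ticket",
    EventBusName=EVENT_BUS,
    EventPattern=json.dumps({
        "account": [ACCOUNT_ID],
        "detail-type": ["Support Ticket: Ticket Created"],
    }),
    State="ENABLED",
)

events.put_targets(
    Rule="new-zendesk-ticket",
    EventBusName=EVENT_BUS,
    Targets=[
        {
            "Id": "escalation-state-machine",
            "Arn": "arn:aws:states:us-east-1:123456789012:stateMachine:ZendeskEscalation",  # assumed
            # Role that allows EventBridge to start the state machine execution
            "RoleArn": "arn:aws:iam::123456789012:role/EventBridgeInvokeStepFunctions",     # assumed
        }
    ],
)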

To integrate a partner SaaS application with Amazon EventBridge, you must configure three components:

1. The event source

2. The event bus

3. The event rule and target

Configuring Zendesk with Amazon EventBridge

To send Zendesk events to EventBridge, you need access to the Zendesk Events connector early access program (EAP). You can register for this here.

Step 1. Configuring your Zendesk event source

1. Go to your Zendesk Admin Center and select Admin Center > Integrations.

Zendesk integrations

2. Choose Connect in Events Connector for Amazon EventBridge to open the page to configure your Zendesk event source.

3. Enter your AWS account ID in the Amazon Web Services account ID field, and select the Region to receive events.

4. Choose Save.

Step 2. Associate the Zendesk event source with a new event bus

1. Sign into the AWS Management Console and navigate to Services > Amazon EventBridge > Partner event sources.

New event source

2. Select the radio button next to the new event source and choose the Associate with event bus button.

Associating event source with event bus

3. Choose Associate.

4. Navigate to Amazon EventBridge > Events > Event buses.

Creating an event bus

5. You can see the newly-created event bus in the Custom event bus section.

Step 3 Create a new Rule for the event bus

1. Navigate to the rules page in the EventBridge Console, then select Events > Rules.

2. To select the new event bus, use the drop-down arrow in the Select event bus section.

Custom event bus

3. Choose Create Rule.

4. Enter a name for the new rule, such as “New Zendesk Ticket.”

5. In the Define Pattern section, choose Event pattern. Select Custom Pattern. A new input box appears that allows you to enter a pre-defined event pattern, represented as a JSON object. This is used to match relevant events.

6. Copy and paste this JSON object into the Event Pattern input box.

{
    "account": [
        "{YourAWSAccountNumber}"
    ],
    "detail-type": [
        "Support Ticket: Ticket Created"
    ]
 }

This event pattern can be found in the list of event schemas provided by Zendesk. It’s important to test the event pattern to ensure it correctly matches the event schema that EventBridge receives.
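One way to test a pattern programmatically is the EventBridge TestEventPattern API. The following minimal boto3 sketch checks the pattern against a made-up sample event (the detail payload is not a real Zendesk event body).

# A minimal sketch: verify that an event pattern matches a sample event using
# the EventBridge TestEventPattern API. The sample event is made up for
# illustration and is not a real Zendesk payload.
import json

import boto3

events = boto3.client("events")

pattern = {
    "account": ["123456789012"],
    "detail-type": ["Support Ticket: Ticket Created"],
}

sample_event = {
    "id": "7bf73129-1428-4cd3-a780-95db273d1602",
    "detail-type": "Support Ticket: Ticket Created",
    "source": "aws.partner/zendesk.com/123456/default",
    "account": "123456789012",
    "time": "2019-12-01T00:00:00Z",
    "region": "us-east-1",
    "resources": [],
    "detail": {"ticket_event": {"ticket": {"id": 35}}},
}

response = events.test_event_pattern(
    EventPattern=json.dumps(pattern),
    Event=json.dumps(sample_event),
)
print(response["Result"])   # True if the pattern matches the sample event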

7. Choose Save.

Each event has the option to forward the data input (or a filtered version) onto a wide selection of targets. This application invokes a Step Functions state machine and passes in the Zendesk event data.

8. In the Targets Section drop-down, select Step Functions state machine. Select the application’s step function.

Event target selector

9. Scroll down and choose Create.

Running the application

Once EventBridge is configured to receive Zendesk events, it’s possible to trigger the application by creating a new ticket in Zendesk. This sends the event to EventBridge, which then triggers the Step Functions state machine:

Step Function Orchestration

The Step Functions state machine holds each state object in the workflow. Some of the state objects use the Lambda functions created in the earlier steps to process data. Others use Amazon States Language (ASL) enabling the application to conditionally branch, wait, and transition to the next state.

Using a state machine this way ensures that the business logic is decoupled from the Lambda compute functionality. Each of the Step Functions states are detailed below:

ZenDeskGetFullTicket

State Type: Task, service: AWS Lambda

This function receives a ticket ID and invokes the Zendesk API to retrieve a complete record of ticket metadata. This is used for the subsequent lifecycle of the AWS Step Functions state machine.

ZenDeskDemoGetSentiment

State type: Task. Services: AWS Lambda, Amazon Comprehend

This function uses Amazon Comprehend, a natural language processing (NLP) service using machine learning to find insights and relationships in text. For this use case, the DetectSentiment operation determines the sentiment of a Zendesk ticket. The function accepts a single text string as its input and returns a JSON object containing a sentiment score.
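A minimal boto3 sketch of this call; the ticket text is a made-up example.

# A minimal sketch: detect the sentiment of a support ticket's text with
# Amazon Comprehend. The ticket text here is a made-up example.
import boto3

comprehend = boto3.client("comprehend")

ticket_text = "My instance has been down for two hours and nobody is responding!"

response = comprehend.detect_sentiment(Text=ticket_text, LanguageCode="en")

print(response["Sentiment"])        # e.g. NEGATIVE
print(response["SentimentScore"])   # per-sentiment confidence scores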

isNegative

State type: Choice

This choice state adds branching logic to the state machine. It uses a “choice rule” to determine if the string input from the preceding task is equal to Negative. If true, it branches on to the next task. If false, the state machine’s execution ends.

SetTags

State type: Task. Service: AWS Lambda

This task invokes the “ZenDeskDemoSetTags” Lambda function. The function calls a Zendesk API resource to set a new tag on the ticket before passing the returned output on to the next state.

isClosed

State type: Choice

This compares the current status input to the string “Open” to check if a ticket has been actioned or closed. If a ticket status remains “Open”, the state machine continues along the true branch to the “GetSLAWaitTime” state. Otherwise it exits along the false branch and ends execution.

GetSLAWaitTime

State type: Choice

This state conditionally branches to a different SLA wait time, depending on the ticket’s current priority status.

SLAUrgentWait, SLAHighWait, SLANormalWait

State type: Wait

These three states delay the state machine from continuing for a set period of time dependent on the urgency of the ticket, allowing the ticket to be actioned by a Zendesk agent.  The wait time is specified when deploying the application.

ZenDeskDemoSetPriority

State type: Task. Services: AWS Lambda

This Lambda function receives a ticket ID and priority value, then invokes Zendesk’s API to escalate the ticket to a higher priority value.

closedOrNotNegative

State type: Pass

This state passes its input to its output, without performing work. Pass states are useful when constructing and debugging state machines.

FinalEscalation

State type: Success

This stops the execution successfully.

The sequence shows an accelerated version of the ticket’s lifecycle in Zendesk:

Zendesk ticket lifecycle

The application runs entirely in the background. Each Step Functions invocation can last for up to a year, allowing for long wait periods before automatically escalating the ticket’s priority. There is no extra cost associated with longer wait time – you only pay for the number of state transitions and not for the idle wait time.

Conclusion

Using EventBridge to route an event directly to AWS Step Functions has reduced the need for unnecessary communication layers. It helps promote good use of compute resources, ensuring Lambda is used to transform data and not transport or orchestrate.

The implementation of AWS Step Functions adds resiliency to the orchestration layer and allows the compute processes to remain decoupled from the business logic. This application demonstrates how EventBridge can be used as a management layer for event ingestion and routing. Additional Zendesk events such as “Comment Created”, “Priority Changed”, or any of the events listed in the Zendesk events schema can be added using a rule.

By adding a single connection point from Zendesk to AWS, you can extend and automate your support ticketing system with a serverless application that is performant, cost-efficient, and scalable.

Combining the functionality of your favorite SaaS solutions with the power of AWS, EventBridge has the potential to trigger a new wave of serverless applications. What will you integrate with first?

New – Using Step Functions to Orchestrate Amazon EMR workloads

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-using-step-functions-to-orchestrate-amazon-emr-workloads/

AWS Step Functions allows you to add serverless workflow automation to your applications. The steps of your workflow can run anywhere, including in AWS Lambda functions, on Amazon Elastic Compute Cloud (EC2), or on-premises. To simplify building workflows, Step Functions is directly integrated with multiple AWS Services: Amazon ECS, AWS Fargate, Amazon DynamoDB, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), AWS Batch, AWS Glue, Amazon SageMaker, and (to run nested workflows) with Step Functions itself.

Starting today, Step Functions connects to Amazon EMR, enabling you to create data processing and analysis workflows with minimal code, saving time, and optimizing cluster utilization. For example, building data processing pipelines for machine learning is time consuming and hard. With this new integration, you have a simple way to orchestrate workflow capabilities, including parallel executions and dependencies from the result of a previous step, and handle failures and exceptions when running data processing jobs.

Specifically, a Step Functions state machine can now:

  • Create or terminate an EMR cluster, including the possibility to change the cluster termination protection. In this way, you can reuse an existing EMR cluster for your workflow, or create one on-demand during execution of a workflow.
  • Add or cancel an EMR step for your cluster. Each EMR step is a unit of work that contains instructions to manipulate data for processing by software installed on the cluster, including tools such as Apache Spark, Hive, or Presto.
  • Modify the size of an EMR cluster instance fleet or group, allowing you to manage scaling programmatically depending on the requirements of each step of your workflow. For example, you may increase the size of an instance group before adding a compute-intensive step, and reduce the size just after it has completed.

When you create or terminate a cluster or add an EMR step to a cluster, you can use synchronous integrations to move to the next step of your workflow only when the corresponding activity has completed on the EMR cluster.

Reading the configuration or the state of your EMR clusters is not part of the Step Functions service integration. In case you need that, the EMR List* and Describe* APIs can be accessed using Lambda functions as tasks.
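For example, here is a minimal Lambda handler sketch that reads a cluster’s state with the EMR DescribeCluster API so a downstream Choice state can branch on it; the ClusterId input field name is an assumption about the workflow’s input.

# A minimal sketch of a Lambda task that reads an EMR cluster's state with the
# DescribeCluster API, so a downstream Choice state can branch on it. The
# "ClusterId" input field name is an assumption about the workflow's input.
import boto3

emr = boto3.client("emr")

def handler(event, context):
    cluster = emr.describe_cluster(ClusterId=event["ClusterId"])["Cluster"]
    return {
        "ClusterId": cluster["Id"],
        "State": cluster["Status"]["State"],          # e.g. WAITING, RUNNING, TERMINATED
        "InstanceCollectionType": cluster.get("InstanceCollectionType"),
    }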

Building a Workflow with EMR and Step Functions
On the Step Functions console, I create a new state machine. The console renders it visually, so that is much easier to understand:

To create the state machine, I use the following definition using the Amazon States Language (ASL):

{
  "StartAt": "Should_Create_Cluster",
  "States": {
    "Should_Create_Cluster": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.CreateCluster",
          "BooleanEquals": true,
          "Next": "Create_A_Cluster"
        },
        {
          "Variable": "$.CreateCluster",
          "BooleanEquals": false,
          "Next": "Enable_Termination_Protection"
        }
      ],
      "Default": "Create_A_Cluster"
    },
    "Create_A_Cluster": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
      "Parameters": {
        "Name": "WorkflowCluster",
        "VisibleToAllUsers": true,
        "ReleaseLabel": "emr-5.28.0",
        "Applications": [{ "Name": "Hive" }],
        "ServiceRole": "EMR_DefaultRole",
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "LogUri": "s3://aws-logs-123412341234-eu-west-1/elasticmapreduce/",
        "Instances": {
          "KeepJobFlowAliveWhenNoSteps": true,
          "InstanceFleets": [
            {
              "InstanceFleetType": "MASTER",
              "TargetOnDemandCapacity": 1,
              "InstanceTypeConfigs": [
                {
                  "InstanceType": "m4.xlarge"
                }
              ]
            },
            {
              "InstanceFleetType": "CORE",
              "TargetOnDemandCapacity": 1,
              "InstanceTypeConfigs": [
                {
                  "InstanceType": "m4.xlarge"
                }
              ]
            }
          ]
        }
      },
      "ResultPath": "$.CreateClusterResult",
      "Next": "Merge_Results"
    },
    "Merge_Results": {
      "Type": "Pass",
      "Parameters": {
        "CreateCluster.$": "$.CreateCluster",
        "TerminateCluster.$": "$.TerminateCluster",
        "ClusterId.$": "$.CreateClusterResult.ClusterId"
      },
      "Next": "Enable_Termination_Protection"
    },
    "Enable_Termination_Protection": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:setClusterTerminationProtection",
      "Parameters": {
        "ClusterId.$": "$.ClusterId",
        "TerminationProtected": true
      },
      "ResultPath": null,
      "Next": "Add_Steps_Parallel"
    },
    "Add_Steps_Parallel": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "Step_One",
          "States": {
            "Step_One": {
              "Type": "Task",
              "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
              "Parameters": {
                "ClusterId.$": "$.ClusterId",
                "Step": {
                  "Name": "The first step",
                  "ActionOnFailure": "CONTINUE",
                  "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": [
                      "hive-script",
                      "--run-hive-script",
                      "--args",
                      "-f",
                      "s3://eu-west-1.elasticmapreduce.samples/cloudfront/code/Hive_CloudFront.q",
                      "-d",
                      "INPUT=s3://eu-west-1.elasticmapreduce.samples",
                      "-d",
                      "OUTPUT=s3://MY-BUCKET/MyHiveQueryResults/"
                    ]
                  }
                }
              },
              "End": true
            }
          }
        },
        {
          "StartAt": "Wait_10_Seconds",
          "States": {
            "Wait_10_Seconds": {
              "Type": "Wait",
              "Seconds": 10,
              "Next": "Step_Two (async)"
            },
            "Step_Two (async)": {
              "Type": "Task",
              "Resource": "arn:aws:states:::elasticmapreduce:addStep",
              "Parameters": {
                "ClusterId.$": "$.ClusterId",
                "Step": {
                  "Name": "The second step",
                  "ActionOnFailure": "CONTINUE",
                  "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": [
                      "hive-script",
                      "--run-hive-script",
                      "--args",
                      "-f",
                      "s3://eu-west-1.elasticmapreduce.samples/cloudfront/code/Hive_CloudFront.q",
                      "-d",
                      "INPUT=s3://eu-west-1.elasticmapreduce.samples",
                      "-d",
                      "OUTPUT=s3://MY-BUCKET/MyHiveQueryResults/"
                    ]
                  }
                }
              },
              "ResultPath": "$.AddStepsResult",
              "Next": "Wait_Another_10_Seconds"
            },
            "Wait_Another_10_Seconds": {
              "Type": "Wait",
              "Seconds": 10,
              "Next": "Cancel_Step_Two"
            },
            "Cancel_Step_Two": {
              "Type": "Task",
              "Resource": "arn:aws:states:::elasticmapreduce:cancelStep",
              "Parameters": {
                "ClusterId.$": "$.ClusterId",
                "StepId.$": "$.AddStepsResult.StepId"
              },
              "End": true
            }
          }
        }
      ],
      "ResultPath": null,
      "Next": "Step_Three"
    },
    "Step_Three": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
      "Parameters": {
        "ClusterId.$": "$.ClusterId",
        "Step": {
          "Name": "The third step",
          "ActionOnFailure": "CONTINUE",
          "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
              "hive-script",
              "--run-hive-script",
              "--args",
              "-f",
              "s3://eu-west-1.elasticmapreduce.samples/cloudfront/code/Hive_CloudFront.q",
              "-d",
              "INPUT=s3://eu-west-1.elasticmapreduce.samples",
              "-d",
              "OUTPUT=s3://MY-BUCKET/MyHiveQueryResults/"
            ]
          }
        }
      },
      "ResultPath": null,
      "Next": "Disable_Termination_Protection"
    },
    "Disable_Termination_Protection": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:setClusterTerminationProtection",
      "Parameters": {
        "ClusterId.$": "$.ClusterId",
        "TerminationProtected": false
      },
      "ResultPath": null,
      "Next": "Should_Terminate_Cluster"
    },
    "Should_Terminate_Cluster": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.TerminateCluster",
          "BooleanEquals": true,
          "Next": "Terminate_Cluster"
        },
        {
          "Variable": "$.TerminateCluster",
          "BooleanEquals": false,
          "Next": "Wrapping_Up"
        }
      ],
      "Default": "Wrapping_Up"
    },
    "Terminate_Cluster": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:terminateCluster.sync",
      "Parameters": {
        "ClusterId.$": "$.ClusterId"
      },
      "Next": "Wrapping_Up"
    },
    "Wrapping_Up": {
      "Type": "Pass",
      "End": true
    }
  }
}

I let the Step Functions console create a new AWS Identity and Access Management (IAM) role for the executions of this state machine. The role automatically includes all permissions required to access EMR.

This state machine can either use an existing EMR cluster, or create a new one. I can use the following input to create a new cluster that is terminated at the end of the workflow:

{
  "CreateCluster": true,
  "TerminateCluster": true
}

To use an existing cluster, I need to provide the cluster ID in the input, using this syntax:

{
  "CreateCluster": false,
  "TerminateCluster": false,
  "ClusterId": "j-..."
}
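
If you prefer to start executions programmatically instead of from the console, a minimal sketch with the AWS SDK for Python (Boto3) could look like the following; the state machine ARN below is a placeholder, not a value from this walkthrough:

import json
import boto3

sfn = boto3.client("stepfunctions")

response = sfn.start_execution(
    # Placeholder ARN: replace with the ARN of your state machine.
    stateMachineArn="arn:aws:states:eu-west-1:123412341234:stateMachine:EMRWorkflow",
    input=json.dumps({
        "CreateCluster": True,
        "TerminateCluster": True
    })
)
print(response["executionArn"])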

Let’s see how that works. As the workflow starts, the Should_Create_Cluster Choice state looks at the input to decide whether to enter the Create_A_Cluster state. There, I use a synchronous call (elasticmapreduce:createCluster.sync) to wait for the new EMR cluster to reach the WAITING state before progressing to the next workflow state. The AWS Step Functions console shows the resource being created, with a link to the EMR console:

After that, the Merge_Results Pass state merges the input state with the cluster ID of the newly created cluster to pass it to the next step in the workflow.

Before starting to process any data, I use the Enable_Termination_Protection state (elasticmapreduce:setClusterTerminationProtection) to help ensure that the EC2 instances in my EMR cluster are not shut down by accident or error.

Now I am ready to do something with the EMR cluster. I have three EMR steps in the workflow. For the sake of simplicity, these steps are all based on this Hive tutorial. For each step, I use Hive’s SQL-like interface to run a query on some sample CloudFront logs and write the results to Amazon Simple Storage Service (S3). In a production use case, you’d probably have a combination of EMR tools processing and analyzing your data in parallel (two or more steps running at the same time) or with some dependencies (the output of one step is required by another step). Let’s try to do something similar.

First I execute Step_One and Step_Two inside a Parallel state:

  • Step_One is running the EMR step synchronously as a job (elasticmapreduce:addStep.sync). That means that the execution waits for the EMR step to be completed (or cancelled) before moving on to the next step in the workflow. You can optionally add a timeout to monitor that the execution of the EMR step happens within an expected time frame.
  • Step_Two is adding an EMR step asynchronously (elasticmapreduce:addStep). In this case, the workflow moves to the next step as soon as EMR replies that the request has been received. After a few seconds, to try another integration, I cancel Step_Two (elasticmapreduce:cancelStep). This integration can be really useful in production use cases. For example, you can cancel an EMR step if another step running in parallel fails and makes it pointless to continue with this one; a minimal sketch of the equivalent API call follows this list.
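
Outside of Step Functions, the same cancellation can be done directly against the EMR API. This is a minimal Boto3 sketch; the cluster and step IDs are placeholders:

import boto3

emr = boto3.client("emr")

# Placeholder IDs: use your cluster ID and the step ID returned when the step was added.
# Cancellation behavior depends on the step's state and the cluster's EMR release.
response = emr.cancel_steps(
    ClusterId="j-XXXXXXXXXXXXX",
    StepIds=["s-XXXXXXXXXXXXX"]
)
for result in response["CancelStepsInfoList"]:
    print(result["StepId"], result["Status"])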

After both parallel branches have completed, I execute Step_Three as a job, similarly to what I did for Step_One. When Step_Three has completed, I enter the Disable_Termination_Protection step, because I am done using the cluster for this workflow.

Depending on the input state, the Should_Terminate_Cluster Choice state is going to enter the Terminate_Cluster state (elasticmapreduce:terminateCluster.sync) and wait for the EMR cluster to terminate, or go straight to the Wrapping_Up state and leave the cluster running.

Finally, I have the Wrapping_Up state. I am not doing much in this final Pass state, but a workflow execution can’t end on a Choice state, so a terminal state is needed.

In the EMR console I see the status of my cluster and of the EMR steps:

Using the AWS Command Line Interface (CLI), I find the results of my query in the S3 bucket configured as output for the EMR steps:

aws s3 ls s3://MY-BUCKET/MyHiveQueryResults/
...

Based on my input, the EMR cluster is still running at the end of this workflow execution. I follow the resource link in the Create_A_Cluster step to go to the EMR console and terminate it. If you are following along with this demo, be careful not to leave your EMR cluster running when you no longer need it; you can also clean up programmatically, as sketched below.
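
If you prefer to clean up from code rather than from the console, a minimal Boto3 sketch could disable termination protection and then terminate the cluster; the cluster ID below is a placeholder:

import boto3

emr = boto3.client("emr")

cluster_id = "j-XXXXXXXXXXXXX"  # Placeholder: use the cluster ID from the workflow output.

# Make sure termination protection is off before terminating
# (a no-op if the workflow already disabled it).
emr.set_termination_protection(JobFlowIds=[cluster_id], TerminationProtected=False)
emr.terminate_job_flows(JobFlowIds=[cluster_id])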

Available Now
Step Functions integration with EMR is available in all regions. There is no additional cost for using this feature on top of the usual Step Functions and EMR pricing.

You can now use Step Functions to quickly build complex workflows for executing EMR jobs. A workflow can include parallel executions, dependencies, and exception handling. Step Functions makes it easy to retry failed jobs and terminate workflows after critical errors, because you can specify what happens when something goes wrong. Let me know what you are going to use this feature for!

Danilo

ICYMI: Serverless Q3 2019

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/icymi-serverless-q3-2019/

This post is courtesy of Julian Wood, Senior Developer Advocate – AWS Serverless

Welcome to the seventh edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

In case you missed our last ICYMI, check out what happened last quarter here.

ICYMI calendar

Launches/New products

Amazon EventBridge was technically launched in this quarter, but we were so excited to let you know that we squeezed it into the Q2 2019 update. If you missed it, EventBridge is the serverless event bus that connects application data from your own apps, SaaS, and AWS services. This allows you to create powerful event-driven serverless applications using a variety of event sources.

The AWS Bahrain Region has opened; its official name is Middle East (Bahrain) and its API name is me-south-1. The AWS Cloud now spans 22 geographic Regions with 69 Availability Zones around the world.

AWS Lambda

In September we announced dramatic improvements in cold starts for Lambda functions inside a VPC. With this announcement, you see faster function startup performance and more efficient usage of elastic network interfaces, drastically reducing VPC cold starts.

VPC to VPC NAT

These improvements are rolling out to all existing and new VPC functions at no additional cost. Rollout is ongoing; you can track the status in the announcement post.

AWS Lambda now supports a custom batch window for Kinesis and DynamoDB event sources, which helps you fine-tune Lambda invocation for cost optimization.
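
As a rough sketch of how the batching window can be configured with Boto3 (the stream ARN and function name below are placeholders), you set MaximumBatchingWindowInSeconds on the event source mapping:

import boto3

lambda_client = boto3.client("lambda")

# Placeholder stream ARN and function name.
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:kinesis:eu-west-1:123412341234:stream/my-stream",
    FunctionName="my-function",
    StartingPosition="LATEST",
    BatchSize=100,
    MaximumBatchingWindowInSeconds=30  # Wait up to 30 seconds to fill a batch before invoking.
)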

You can now deploy Amazon Machine Images (AMIs) and Lambda functions together from the AWS Marketplace using AWS CloudFormation with just a few clicks.

AWS IoT Events actions now support AWS Lambda as a target. Previously, you could only define actions to publish messages to SNS and MQTT. Now you can define actions that invoke AWS Lambda functions, send to even more targets such as Amazon Simple Queue Service and Amazon Kinesis Data Firehose, and republish messages to IoT Events.

The AWS Lambda Console now shows recent invocations using CloudWatch Logs Insights. From the monitoring tab in the console, you can view duration, billing, and memory statistics for the 10 most recent invocations.

AWS Step Functions

AWS Step Functions example

AWS Step Functions has now been extended to support probably its most requested feature, Dynamic Parallelism, which allows steps within a workflow to be executed in parallel, with a new Map state type.

One way to use the new Map state is for fan-out or scatter-gather messaging patterns in your workflows:

  • Fan-out is applied when delivering a message to multiple destinations, and can be useful in workflows such as order processing or batch data processing. For example, you can retrieve arrays of messages from Amazon SQS and have the Map state send each message to a separate AWS Lambda function, as sketched after this list.
  • Scatter-gather broadcasts a single message to multiple destinations (scatter), and then aggregates the responses back for the next steps (gather). This is useful in file processing and test automation. For example, you can transcode ten 500-MB media files in parallel, and then join them to create a 5-GB file.
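
As a rough sketch of what a fan-out Map state might look like (expressed here as a Python dictionary so it can be printed as ASL; the state names, ItemsPath, and Lambda ARN are placeholders):

import json

# Minimal sketch of a Map state that fans out an array of items to a Lambda function.
map_state = {
    "Process_Messages": {
        "Type": "Map",
        "ItemsPath": "$.messages",  # Placeholder: the array of items in the state input.
        "MaxConcurrency": 10,
        "Iterator": {
            "StartAt": "Process_One_Message",
            "States": {
                "Process_One_Message": {
                    "Type": "Task",
                    "Resource": "arn:aws:lambda:eu-west-1:123412341234:function:process-message",
                    "End": True
                }
            }
        },
        "End": True
    }
}

print(json.dumps(map_state, indent=2))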

Another important update is that AWS Step Functions now supports nested workflows, which allow you to orchestrate more complex processes by composing modular, reusable workflows.

AWS Amplify

A new Predictions category has been added to the Amplify Framework to quickly add machine learning capabilities to your web and mobile apps.

Amplify framework

With a few lines of code, you can add and configure AI/ML services so that your app can:

  • Identify text, entities, and labels in images using Amazon Rekognition, or identify text in scanned documents to get the contents of fields in forms and information stored in tables using Amazon Textract.
  • Convert text into a different language using Amazon Translate, text to speech using Amazon Polly, and speech to text using Amazon Transcribe.
  • Interpret text to find the dominant language, the entities, the key phrases, the sentiment, or the syntax of unstructured text using Amazon Comprehend.

The AWS Amplify CLI (part of the open-source Amplify Framework) has added local mocking and testing. This allows you to mock some of the most common cloud services and test your application 100% locally.

For this first release, you can start mocking those services locally with the following command:

amplify mock

AWS CloudFormation

The CloudFormation team has released the much-anticipated CloudFormation Coverage Roadmap.

Styled after the popular AWS Containers Roadmap, the CloudFormation Coverage Roadmap provides transparency about our priorities, and the opportunity to provide your input.

The roadmap contains four columns:

  • Shipped – Available for use in production in all public AWS Regions.
  • Coming Soon – Generally a few months out.
  • We’re Working On It – Work in progress, but further out.
  • Researching – We’re thinking about the right way to implement the coverage.

AWS CloudFormation roadmap

Amazon DynamoDB

NoSQL Workbench for Amazon DynamoDB has been released in preview. This is a free, client-side application available for Windows and macOS. It helps you more easily design and visualize your data model, run queries on your data, and generate the code for your application.

Amazon Aurora

Amazon Aurora Serverless is a dynamically scaling version of Amazon Aurora. It automatically starts up, shuts down, and scales up or down, based on your application workload.

Aurora Serverless has had a MySQL-compatible edition for a while; now we’re excited to bring more serverless joy to databases with the PostgreSQL-compatible version, which is now generally available.

We also have a useful post on Reducing Aurora PostgreSQL storage I/O costs.

AWS Serverless Application Repository

The AWS Serverless Application Repository has had some useful SAR apps added by Serverless Developer Advocate James Beswick.

  • S3 Auto Translator, which automatically translates uploaded objects into other languages specified by the user, using Amazon Translate.
  • Serverless S3 Uploader, which allows you to upload JPG files to Amazon S3 buckets from your web applications using presigned URLs.

Serverless posts

July

August

September

Tech talks

We hold several AWS Online Tech Talks covering serverless topics throughout the year. These are listed in the Serverless section of the AWS Online Tech Talks page.

Here are the ones from Q3:

Twitch

July

August

September

There are also a number of other helpful video series covering Serverless available on the AWS Twitch Channel.

AWS re:Invent

December 2 – 6 in Las Vegas, Nevada is peak AWS learning time with AWS re:Invent 2019. Join tens of thousands of AWS customers to learn, share ideas, and see exciting keynote announcements.

Be sure to take a look at the growing catalog of serverless sessions this year. Make sure to book time for Builders Sessions, Chalk Talks, and Workshops, as these sessions will fill up quickly. The schedule is updated regularly, so if your session is currently fully booked, a repeat may be scheduled.

Register for AWS re:Invent now!

What did we do at AWS re:Invent 2018? Check out our recap here: AWS re:Invent 2018 Recap at the San Francisco Loft.

Our friends at IOPipe have written 5 tips for avoiding serverless FOMO at this year’s re:Invent.

AWS Serverless Heroes

We are excited to welcome some new AWS Serverless Heroes to help grow the serverless community. We look forward to some amazing content to help you with your serverless journey.

Still looking for more?

The Serverless landing page has much more information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials.