Tag Archives: automation

Integrating AWS Device Farm with your CI/CD pipeline to run cross-browser Selenium tests

Post Syndicated from Mahesh Biradar original https://aws.amazon.com/blogs/devops/integrating-aws-device-farm-with-ci-cd-pipeline-to-run-cross-browser-selenium-tests/

Continuously building, testing, and deploying your web application helps you release new features sooner and with fewer bugs. In this blog, you will create a continuous integration and continuous delivery (CI/CD) pipeline for a web app using AWS CodeStar services and AWS Device Farm’s desktop browser testing service.  AWS CodeStar is a suite of services that help you quickly develop and build your web apps on AWS.

AWS Device Farm’s desktop browser testing service helps developers improve the quality of their web apps by executing their Selenium tests on different desktop browsers hosted in AWS. For each test executed on the service, Device Farm generates action logs, web driver logs, video recordings, to help you quickly identify issues with your app. The service offers pay-as-you-go pricing so you only pay for the time your tests are executing on the browsers with no upfront commitments or additional costs. Furthermore, Device Farm provides a default concurrency of 50 Selenium sessions so you can run your tests in parallel and speed up the execution of your test suites.

After you’ve completed the steps in this blog, you’ll have a working pipeline that will build your web app on every code commit and test it on different versions of desktop browsers hosted in AWS.

Solution overview

In this solution, you use AWS CodeStar to create a sample web application and a CI/CD pipeline. The following diagram illustrates our solution architecture.

Figure 1: Deployment pipeline architecture

Figure 1: Deployment pipeline architecture

Prerequisites

Before you deploy the solution, complete the following prerequisite steps:

  1. On the AWS Management Console, search for AWS CodeStar.
  2. Choose Getting Started.
  3. Choose Create Project.
  4. Choose a sample project.

For this post, we choose a Python web app deployed on Amazon Elastic Compute Cloud (Amazon EC2) servers.

  1. For Templates, select Python (Django).
  2. Choose Next.
Figure 2: Choose a template in CodeStar

Figure 2: Choose a template in CodeStar

  1. For Project name, enter Demo CICD Selenium.
  2. Leave the remaining settings at their default.

You can choose from a list of existing key pairs. If you don’t have a key pair, you can create one.

  1. Choose Next.
  2. Choose Create project.
 Figure 3: Verify the project configuration

Figure 3: Verify the project configuration

AWS CodeStar creates project resources for you as listed in the following table.

Service Resource Created
AWS CodePipeline project demo-cicd-selen-Pipeline
AWS CodeCommit repository Demo-CICD-Selenium
AWS CodeBuild project demo-cicd-selen
AWS CodeDeploy application demo-cicd-selen
AWS CodeDeploy deployment group demo-cicd-selen-Env
Amazon EC2 server Tag: Environment = demo-cicd-selen-WebApp
IAM role AWSCodeStarServiceRole, CodeStarWorker*

Your EC2 instance needs to have access to run AWS Device Farm to run the Selenium test scripts. You can use service roles to achieve that.

  1. Attach policy AWSDeviceFarmFullAccess to the IAM role CodeStarWorker-demo-cicd-selen-WebApp.

You’re now ready to create an AWS Cloud9 environment.

Check the Pipelines tab of your AWS CodeStar project; it should show success.

  1. On the IDE tab, under Cloud 9 environments, choose Create environment.
  2. For Environment name, enter Demo-CICD-Selenium.
  3. Choose Create environment.
  4. Wait for the environment to be complete and then choose Open IDE.
  5. In the IDE, follow the instructions to set up your Git and make sure it’s up to date with your repo.

The following screenshot shows the verification that the environment is set up.

Figure 4: Verify AWS Cloud9 setup

Figure 4: Verify AWS Cloud9 setup

You can now verify the deployment.

  1. On the Amazon EC2 console, choose Instances.
  2. Choose demo-cicd-selen-WebApp.
  3. Locate its public IP and choose open address.
Figure 5: Locating IP address of the instance

Figure 5: Locating IP address of the instance

 

A webpage should open. If it doesn’t, check your VPN and firewall settings.

Now that you have a working pipeline, let’s move on to creating a Device Farm project for browser testing.

Creating a Device Farm project

To create your browser testing project, complete the following steps:

  1. On the Device Farm console, choose Desktop browser testing project.
  2. Choose Create a new project.
  3. For Project name, enter Demo cicd selenium.
  4. Choose Create project.
Figure 6: Creating AWS Device Farm project

Figure 6: Creating AWS Device Farm project

 

Note down the project ARN; we use this in the Selenium script for the remote web driver.

Figure 7: Project ARN for AWS Device Farm project

Figure 7: Project ARN for AWS Device Farm project

 

Testing the solution

This solution uses the following script to run browser testing. We call this script in the validate service lifecycle hook of the CodeDeploy project.

  1. Open your AWS Cloud9 IDE (which you made as a prerequisite).
  2. Create a folder tests under the project root directory.
  3. Add the sample Selenium script browser-test-sel.py under tests folder with the following content (replace <sample_url> with the url of your web application refer pre-requisite step 18):
import boto3
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

devicefarm_client = boto3.client("devicefarm", region_name="us-west-2")
testgrid_url_response = devicefarm_client.create_test_grid_url(
    projectArn="arn:aws:devicefarm:us-west-2:<your project ARN>",
    expiresInSeconds=300)

driver = webdriver.Remote(testgrid_url_response["url"],
                          webdriver.DesiredCapabilities.FIREFOX)
try:
    driver.implicitly_wait(30)
    driver.maximize_window()
    driver.get("<sample_url>")
    if driver.find_element_by_id("Layer_1"):
        print("graphics generated in full screen")
    assert driver.find_element_by_id("Layer_1")
    driver.set_window_position(0, 0) and driver.set_window_size(1000, 400)
    driver.get("<sample_url>")
    tower = driver.find_element_by_id("tower1")
    if tower.is_displayed():
        print("graphics generated after resizing")
    else:
        print("graphics not generated at this window size")
       # this is where you can fail the script with error if you expect the graphics to load. And pipeline will terminate
except Exception as e:
    print(e)
finally:
    driver.quit()

This script launches a website (in Firefox) created by you in the prerequisite step using the AWS CodeStar template and verifies if a graphic element is loaded.

The following screenshot shows a full-screen application with graphics loaded.

Figure 8: Full screen application with graphics loaded

Figure 8: Full screen application with graphics loaded

 

The following screenshot shows the resized window with no graphics loaded.

Figure 9: Resized window application with No graphics loaded

Figure 9: Resized window application with No graphics loaded

  1. Create the file validate_service in the scripts folder under the root directory with the following content:
#!/bin/bash
if [ "$DEPLOYMENT_GROUP_NAME" == "demo-cicd-selen-Env" ]
then
cd /home/ec2-user
source environment/bin/activate
python tests/browser-test-sel.py
fi

This script is part of CodeDeploy scripts and determines whether to stop the pipeline or continue based on the output from the browser testing script from the preceding step.

  1. Modify the file appspec.yml under the root directory, add the tests files and ValidateService hook , the file should look like following:
version: 0.0
os: linux
files:
 - source: /ec2django/
   destination: /home/ec2-user/ec2django
 - source: /helloworld/
   destination: /home/ec2-user/helloworld
 - source: /manage.py
   destination: /home/ec2-user
 - source: /supervisord.conf
   destination: /home/ec2-user
 - source: /requirements.txt
   destination: /home/ec2-user
 - source: /requirements/
   destination: /home/ec2-user/requirements
 - source: /tests/
   destination: /home/ec2-user/tests

permissions:
  - object: /home/ec2-user/manage.py
    owner: ec2-user
    mode: 644
    type:
      - file
  - object: /home/ec2-user/supervisord.conf
    owner: ec2-user
    mode: 644
    type:
      - file
hooks:
  AfterInstall:
    - location: scripts/install_dependencies
      timeout: 300
      runas: root
    - location: scripts/codestar_remote_access
      timeout: 300
      runas: root
    - location: scripts/start_server
      timeout: 300
      runas: root

  ApplicationStop:
    - location: scripts/stop_server
      timeout: 300
      runas: root

  ValidateService:
    - location: scripts/validate_service
      timeout: 600
      runas: root

This file is used by AWS CodeDeploy service to perform the deployment and validation steps.

  1. Modify the artifacts section in the buildspec.yml file. The section should look like the following:
artifacts:
  files:
    - 'template.yml'
    - 'ec2django/**/*'
    - 'helloworld/**/*'
    - 'scripts/**/*'
    - 'tests/**/*'
    - 'appspec.yml'
    - 'manage.py'
    - 'requirements/**/*'
    - 'requirements.txt'
    - 'supervisord.conf'
    - 'template-configuration.json'

This file is used by AWS CodeBuild service to package the code

  1. Modify the file Common.txt in the requirements folder under the root directory, the file should look like the following:
# dependencies common to all environments 
Django=2.1.15 
selenium==3.141.0 
boto3 >= 1.10.44
pytest

  1. Save All the changes, your folder structure should look like the following:
── README.md
├── appspec.yml*
├── buildspec.yml*
├── db.sqlite3
├── ec2django
├── helloworld
├── manage.py
├── requirements
│   ├── common.txt*
│   ├── dev.txt
│   ├── prod.txt
├── requirements.txt
├── scripts
│   ├── codestar_remote_access
│   ├── install_dependencies
│   ├── start_server
│   ├── stop_server
│   └── validate_service**
├── supervisord.conf
├── template-configuration.json
├── template.yml
├── tests
│   └── browser-test-sel.py**

**newly added files
*modified file

Running the tests

The Device Farm desktop browsing project is now integrated with your pipeline. All you need to do now is commit the code, and CodePipeline takes care of the rest.

  1. On Cloud9 terminal, go to project root directory.
  2. Run git add . to stage all changed files for commit.
  3. Run git commit -m “<commit message>” to commit the changes.
  4. Run git push to push the changes to repository, this should trigger the Pipeline.
  5. Check the Pipelines tab of your AWS CodeStar project; it should show success.
  6. Go to AWS Device Farm console and click on Desktop browser testing projects.
  7. Click on your Project Demo cicd selenium.

You can verify the running of your Selenium test cases using the recorded run steps shown on the Device Farm console, the video of the run, and the logs, all of which can be downloaded on the Device Farm console, or using the AWS SDK and AWS Command Line Interface (AWS CLI).

The following screenshot shows the project run details on the console.

Figure 10: Viewing AWS Device Farm project run details

Figure 10: Viewing AWS Device Farm project run details

 

To Test the Selenium script locally you can run the following commands.

1. Create a Python virtual environment for your Django project. This virtual environment allows you to isolate this project and install any packages you need without affecting the system Python installation. At the terminal, go to project root directory and type the following command:

$ python3 -m venv ./venv

2. Activate the virtual environment:

$ source ./venv/bin/activate

3. Install development Python dependencies for this project:

$ pip install -r requirements/dev.txt

4. Run Selenium Script:

$ python tests/browser-test-sel.py

Testing the failure scenario (optional)

To test the failure scenario, you can modify the sample script browser-test-sel.py at the else statement.

The following code shows the lines to change:

else:
print("graphics was not generated at this form size")
# this is where you can fail the script with error if you expect the graphics to load. And pipeline will terminate

The following is the updated code:

else:
exit(1)
# this is where you can fail the script with error if you expect the graphics to load. And pipeline will terminate

Commit the change, and the pipeline should fail and stop the deployment.

Conclusion

Integrating Device Farm with CI/CD pipelines allows you to control deployment based on browser testing results. Failing the Selenium test on validation failures can stop and roll back the deployment, and a successful testing can continue the pipeline to deploy the solution to the final stage. Device Farm offers a one-stop solution for testing your native and web applications on desktop browsers and real mobile devices.

Mahesh Biradar is a Solutions Architect at AWS. He is a DevOps enthusiast and enjoys helping customers implement cost-effective architectures that scale..

Automate Amazon EC2 instance isolation by using tags

Post Syndicated from Jose Obando original https://aws.amazon.com/blogs/security/automate-amazon-ec2-instance-isolation-by-using-tags/

Containment is a crucial part of an overall Incident Response Strategy, as this practice allows time for responders to perform forensics, eradication and recovery during an Incident. There are many different approaches to containment. In this post, we will be focusing on isolation—the ability to keep multiple targets separated so that each target only sees and affects itself—as a containment strategy.

I’ll show you how to automate isolation of an Amazon Elastic Compute Cloud (Amazon EC2) instance by using an AWS Lambda function that’s triggered by tag changes on the instance, as reported by Amazon CloudWatch Events.

CloudWatch Event Rules deliver a near real-time stream of system events that describe changes in Amazon Web Services (AWS) resources. See also Amazon EventBridge.

Preparing for an incident is important as outlined in the Security Pillar of the AWS Well-Architected Framework.

Out of the 7 Design Principles for Security in the Cloud, as per the Well-Architected Framework, this solution will cover the following:

  • Enable traceability: Monitor, alert, and audit actions and changes to your environment in real time. Integrate log and metric collection with systems to automatically investigate and take action.
  • Automate security best practices: Automated software-based security mechanisms can improve your ability to securely scale more rapidly and cost-effectively. Create secure architectures, including through the implementation of controls that can be defined and managed by AWS as code in version-controlled templates.
  • Prepare for security events: Prepare for an incident by implementing incident management and investigation policy and processes that align to your organizational requirements. Run incident response simulations and use tools with automation to help increase your speed for detection, investigation, and recovery.

After detecting an event in the Detection phase and analyzing in the Analysis phase, you can automate the process of logically isolating an instance from a Virtual Private Cloud (VPC) in Amazon Web Services (AWS).

In this blog post, I describe how to automate EC2 instance isolation by using the tagging feature that a responder can use to identify instances that need to be isolated. A Lambda function then uses AWS API calls to isolate the instances by performing the actions described in the following sections.

Use cases

Your organization can use automated EC2 instance isolation for scenarios like these:

  • A security analyst wants to automate EC2 instance isolation in order to respond to security events in a timely manner.
  • A security manager wants to provide their first responders with a way to quickly react to security incidents without providing too much access to higher security features.

High-level architecture and design

The example solution in this blog post uses a CloudWatch Events rule to trigger a Lambda function. The CloudWatch Events rule is triggered when a tag is applied to an EC2 instance. The Lambda code triggers further actions based on the contents of the event. Note that the CloudFormation template includes the appropriate permissions to run the function.

The event flow is shown in Figure 1 and works as follows:

  1. The EC2 instance is tagged.
  2. The CloudWatch Events rule filters the event.
  3. The Lambda function is invoked.
  4. If the criteria are met, the isolation workflow begins.

When the Lambda function is invoked and the criteria are met, these actions are performed:

  1. Checks for IAM instance profile associations.
  2. If the instance is associated to a role, the Lambda function disassociates that role.
  3. Applies the isolation role that you defined during CloudFormation stack creation.
  4. Checks the VPC where the EC2 instance resides.
    • If there is no isolation security group in the VPC (if the VPC is new, for example), the function creates one.
  5. Applies the isolation security group.

Note: If you had a security group with an open (0.0.0.0/0) outbound rule, and you apply this Isolation security group, your existing SSH connections to the instance are immediately dropped. On the other hand, if you have a narrower inbound rule that initially allows the SSH connection, the existing connection will not be broken by changing the group. This is known as Connection Tracking.

Figure 1: High-level diagram showing event flow

Figure 1: High-level diagram showing event flow

For the deployment method, we will be using an AWS CloudFormation Template. AWS CloudFormation gives you an easy way to model a collection of related AWS and third-party resources, provision them quickly and consistently, and manage them throughout their lifecycles, by treating infrastructure as code.

The AWS CloudFormation template that I provide here deploys the following resources:

  • An EC2 instance role for isolation – this is attached to the EC2 Instance to prevent further communication with other AWS Services thus limiting the attack surface to your overall infrastructure.
  • An Amazon CloudWatch Events rule – this is used to detect changes to an AWS EC2 resource, in this case a “tag change event”. We will use this as a trigger to our Lambda function.
  • An AWS Identity and Access Management (IAM) role for Lambda functions – this is what the Lambda function will use to execute the workflow.
  • A Lambda function for automation – this function is where all the decision logic sits, once triggered it will follow a set of steps described in the following section.
  • Lambda function permissions – this is used by the Lambda function to execute.
  • An IAM instance profile – this is a container for an IAM role that you can use to pass role information to an EC2 instance.

Supporting functions within the Lambda function

Let’s dive deeper into each supporting function inside the Lambda code.

The following function identifies the virtual private cloud (VPC) ID for a given instance. This is needed to identify which security groups are present in the VPC.

def identifyInstanceVpcId(instanceId):
    instanceReservations = ec2Client.describe_instances(InstanceIds=[instanceId])['Reservations']
    for instanceReservation in instanceReservations:
        instancesDescription = instanceReservation['Instances']
        for instance in instancesDescription:
            return instance['VpcId']

The following function modifies the security group of an EC2 instance.

def modifyInstanceAttribute(instanceId,securityGroupId):
    response = ec2Client.modify_instance_attribute(
        Groups=[securityGroupId],
        InstanceId=instanceId)

The following function creates a security group on a VPC that blocks all egress access to the security group.

def createSecurityGroup(groupName, descriptionString, vpcId):
    resource = boto3.resource('ec2')
    securityGroupId = resource.create_security_group(GroupName=groupName, Description=descriptionString, VpcId=vpcId)
    securityGroupId.revoke_egress(IpPermissions= [{'IpProtocol': '-1','IpRanges': [{'CidrIp': '0.0.0.0/0'}],'Ipv6Ranges': [],'PrefixListIds': [],'UserIdGroupPairs': []}])
    return securityGroupId.

Deploy the solution

To deploy the solution provided in this blog post, first download the CloudFormation template, and then set up a CloudFormation stack that specifies the tags that are used to trigger the automated process.

Download the CloudFormation template

To get started, download the CloudFormation template from Amazon S3. Alternatively, you can launch the CloudFormation template by selecting the following Launch Stack button:

Select the Launch Stack button to launch the template

Deploy the CloudFormation stack

Start by uploading the CloudFormation template to your AWS account.

To upload the template

  1. From the AWS Management Console, open the CloudFormation console.
  2. Choose Create Stack.
  3. Choose With new resources (standard).
  4. Choose Upload a template file.
  5. Select Choose File, and then select the YAML file that you just downloaded.
Figure 2: CloudFormation stack creation

Figure 2: CloudFormation stack creation

Specify stack details

You can leave the default values for the stack as long as there aren’t any resources provisioned already with the same name, such as an IAM role. For example, if left with default values an IAM role named “SecurityIsolation-IAMRole” will be created. Otherwise, the naming convention is fully customizable from this screen and you can enter your choice of name for the CloudFormation stack, and modify the parameters as you see fit. Figure 3 shows the parameters that you can set.

The Evaluation Parameters section defines the tag key and value that will initiate the automated response. Keep in mind that these values are case-sensitive.

Figure 3: CloudFormation stack parameters

Figure 3: CloudFormation stack parameters

Choose Next until you reach the final screen, shown in Figure 4, where you acknowledge that an IAM role is created and you trust the source of this template. Select the check box next to the statement I acknowledge that AWS CloudFormation might create IAM resources with custom names, and then choose Create Stack.

Figure 4: CloudFormation IAM notification

Figure 4: CloudFormation IAM notification

After you complete these steps, the following resources will be provisioned, as shown in Figure 5:

  • EC2IsolationRole
  • EC2TagChangeEvent
  • IAMRoleForLambdaFunction
  • IsolationLambdaFunction
  • IsolationLambdaFunctionInvokePermissions
  • RootInstanceProfile
Figure 5: CloudFormation created resources

Figure 5: CloudFormation created resources

Testing

To start your automation, tag an EC2 instance using the tag defined during the CloudFormation setup. If you’re using the Amazon EC2 console, you can apply tags to resources by using the Tags tab on the relevant resource screen, or you can use the Tags screen, the AWS CLI or an AWS SDK. A detailed walkthrough for each approach can be found in the Amazon EC2 Documentation page.

Reverting Changes

If you need to remove the restrictions applied by this workflow, complete the following steps.

  1. From the EC2 dashboard, in the Instances section, check the box next to the instance you want to modify.

    Figure 6: Select the instance to modify

    Figure 6: Select the instance to modify

  2. In the top right, select Actions, choose Instance settings, and then choose Modify IAM role.

    Figure 7: Choose Actions > Instance settings > Modify IAM role

    Figure 7: Choose Actions > Instance settings > Modify IAM role

  3. Under IAM role, choose the IAM role to attach to your instance, and then select Save.

    Figure 8: Choose the IAM role to attach

    Figure 8: Choose the IAM role to attach

  4. Select Actions, choose Networking, and then choose Change security groups.

    Figure 9: Choose Actions > Networking > Change security groups

    Figure 9: Choose Actions > Networking > Change security groups

  5. Under Associated security groups, select Remove and add a different security group with the access you want to grant to this instance.

Summary

Using the CloudFormation template provided in this blog post, a Security Operations Center analyst could have only tagging privileges and isolate an EC2 instance based on this tag. Or a security service such as Amazon GuardDuty could trigger a lambda to apply the tag as part of a workflow. This means the Security Operations Center analyst wouldn’t need administrative privileges over the EC2 service.

This solution creates an isolation security group for any new VPCs that don’t have one already. The security group would still follow the naming convention defined during the CloudFormation stack launch, but won’t be part of the provisioned resources. If you decide to delete the stack, manual cleanup would be required to remove these security groups.

This solution terminates established inbound Secure Shell (SSH) sessions that are associated to the instance, and isolates the instance from new connections either inbound or outbound. For outbound connections that are already established (for example, reverse shell), you either need to shut down the network interface card (NIC) at the operating system (OS) level, restart the instance network stack at the OS level, terminate the established connections, or apply a network access control list (network ACL).

For more information, see the following documentation:

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Jose Obando

Jose is a Security Consultant on the Global Financial Services team. He helps the world’s top financial institutions improve their security posture in the cloud. He has a background in network security and cloud architecture. In his free time, you can find him playing guitar or training in Muay Thai.

Growth Engineering at Netflix — Automated Imagery Generation

Post Syndicated from Netflix Technology Blog original https://netflixtechblog.com/growth-engineering-at-netflix-automated-imagery-generation-5a105fd51569

Growth Engineering at Netflix — Automated Imagery Generation

by Eric Eiswerth

Background

There’s a good chance you’ve probably visited the Netflix homepage. In the Growth Engineering team, we refer to this as the top of the signup funnel. For more background on the signup funnel and Growth Engineering’s role in the signup funnel, please read our initial post on the topic: Growth Engineering at Netflix — Accelerating Innovation. The primary focus of this post will be the top of the signup funnel. In particular, the Netflix homepage:

As discussed in our previous post, Growth Engineering owns the business logic and protocols that allow our UI partners to build lightweight and flexible applications for almost any platform. In some cases, like the homepage, this even involves providing appropriate imagery (e.g., the background image shown above). In this post, we’ll take a deep dive into the journey of content-based imagery on the Netflix homepage.

Motivation

At Netflix we do one thing — entertainment — and we aim to do it really well. We live and breathe TV shows and films, and we want everyone to be able to enjoy them too. That’s why we aspire to have best in class stories, across genres and believe people should have access to new voices, cultures and perspectives. The member-focused teams at Netflix are responsible for making sure the member experience is relevant and personalized, ensuring that this content is shown to the right people at the right time. But what about non-members; those who are simply interested in signing up for Netflix, how should we highlight our content and convey our value propositions to them?

The Solution

The main mechanism for highlighting our content in the signup flow is through content-based imagery. Before designing a solution it’s important to understand the main product requirements for such a feature:

  • The content needs to be new, relevant, and regional (not all countries have the same catalogue).
  • The artwork needs to appeal to a broader audience. The non-member homepage serves a very broad audience and is not personalized to the extent of the member experience.
  • The imagery needs to be localized.
  • We need to be able to easily determine what imagery is present for a given platform, region, and language.
  • The homepage needs to load in a reasonable amount of time, even in poor network conditions.

Unpacking Product Requirements

Given the scale we require and the product requirements listed above, there are a number of technical requirements:

  • A list of titles for the asset, in some order.
  • Ensure the titles are appropriate for a broad audience, which means all titles need to be tagged with metadata.
  • Localized images for each of the titles.
  • Different assets for different device types and screen sizes.
  • Server-generated assets, since client-side generation would require the retrieval of many individual images, which would increase latency and time-to-render.
  • To reduce latency, assets should be generated in an offline fashion and not in real time.
  • The assets need to be compressed, without reducing quality significantly.
  • The assets will need to be stored somewhere and we’ll need to generate URLs for each of them.
  • We’ll need to figure out how to provide the correct asset URL for a given request.
  • We’ll need to build a search index so that the assets can be searchable.

Given this set of requirements, we can effectively break this work down into 3 functional buckets:

The Design

For our design, we decided to build 3 separate microservices, mapping to the aforementioned functional buckets. Let’s take a look at each of these services in turn.

Asset Generation

The Asset Generation Service is responsible for generating groups of assets. We call these groups of assets, asset groups. Each request will generate a single asset group that will contain one or more assets. To support the demands of our stakeholders we designed a Domain Specific Language (DSL) that we call an asset generation recipe. An asset generation request contains a recipe. Below is an example of a simple recipe:

{
"titleIds": [12345, 23456, 34567, …],
"countries": [“US”],
"type": “perspective”, // this is the design of the asset
"rows": 10, // the number of rows and columns can control density
"cols": 15,
"padding": 10, // padding between individual images
"columnOffsets": [0, 0, 0, 0…], // the y-offset for each column
"rowOffsets": [0, -100, 0, -100, …], // the x-offset for each row
"size": [1920, 1080] // size in pixels
}

This recipe can then be issued via an HTTP POST request to the Asset Generation Service. The recipe can then be translated into ImageMagick commands that can do the heavy lifting. At a high level, the following diagram captures the necessary steps required to build an asset.

Generating a single localized asset is a big achievement, but we still need to store the asset somewhere and have the ability to search for it. This requires an asset storage solution.

Asset Storage

We refer to asset storage and management simply as asset management. We felt it would be beneficial to create a separate microservice for asset management for 2 reasons. First, asset generation is CPU intensive and bursty. We can leverage high performance VMs in AWS to generate the assets. We can scale up when generation is occurring and scale down when there is no batch in the queue. However, it would be cost-inefficient to leverage this same hardware for lightweight and more consistent traffic patterns that an asset management service requires.

Let’s take a look at the internals of the Asset Management Service.

At this point we’ve laid out all the details in order to generate a content-based asset and have it stored as part of an asset group, which is persisted and indexed. The next thing to consider is, how do we retrieve an asset in real time and surface it on the Netflix homepage?

If you recall in our previous blog post, Growth Engineering owns a service called the Orchestration Service. It is a mid-tier service that emits a custom JSON data structure that contains fields that are consumed by the UI. The UI can then use these fields to control the presentation in the UI layer. There are two approaches for adding fields to the Orchestration Service’s response. First, the fields can be coded by hand. Second, fields can be added via configuration via a service we call the Customization Service. Since assets will need to be periodically refreshed and we want this process to be entirely automated, it makes sense to pursue the configuration-based approach. To accomplish this, the Asset Management Service needs to translate an asset group into a rule definition for the Customization Service.

Customization Service

Let’s review the Orchestration Service and introduce the Customization Service. The Orchestration Service emits fields in response to upstream requests. For the homepage, there are typically only a small number of fields provided by the Orchestration Service. The following fields are supplied by application code. For example:

{
“fields”: {
“email” : {
“type”: “StringField”,
“value”: “”
},
“nextAction”: {
“type”: “Action”,
“withFields” [“email”]
}
}
}

The Orchestration Service also supports fields supplied by configuration. We call these adaptive fields. Adaptive fields are provided by the Customization Service. The Customization Service is a rules engine that emits the adaptive fields. For example, a rule to provide the background image for the homepage in the en-US locale would look as follows:

{
“country”: “US”,
“language”: “en”,
“platform”: “browser”,
“resolution”: “high”
}

The corresponding payload for such a rule might look as follows:

{
“backgroundImage”: “https://cdn.netflix.com/bgimageurl.jpg”
}

Bringing this all together, the response from the Orchestration Service would now look as follows:

{
“fields”: {
“email” : {
“type”: “StringField”,
“value”: “”
},
“nextAction”: {
“type”: “Action”,
“withFields” [“email”]
}
},
“adaptiveFields”: {
“backgroundImage”: “https://cdn.netflix.com/bgimageurl.jpg”
}
}

At this point, we are now able to generate an asset, persist it, search it, and generate customization rules for it. The generated rules then enable us to return a particular asset for a particular request. Let’s put it all together and review the system interaction diagram.

We now have all the pieces in place to automatically generate artwork and have that artwork appear on the Netflix homepage for a given request. At least one open question remains, how can we scale asset generation?

Scaling Asset Generation

Arguably, there are a number of approaches that could be used to scale asset generation. We decided to opt for an all-or-nothing approach. Meaning, all assets for a given recipe need to be generated as a single asset group. This enables smooth rollback in case of any errors. Additionally, asset generation is CPU intensive and each recipe can produce 1000s of assets as a result of the number of platform, region, and language permutations. Even with high performance VMs, generating 1000s of assets can take a long time. As a result, we needed to find a way to distribute asset generation across multiple VMs. Here’s what the final architecture looked like.

Briefly, let’s review the steps:

  1. The batch process is initiated by a cron job. The job executes a script that contains an asset generation recipe.
  2. The Asset Generation Service receives the request and creates asset generation tasks that can be distributed across any number of Asset Generation Worker nodes. One of the nodes is elected as the leader via Zookeeper. Its job is to coordinate asset generation across the other workers and ensure all assets get generated.
  3. Once the primary worker node has all the assets, it creates an asset group in the Asset Management Service. The Asset Management Service persists, indexes, and uploads the assets to the CDN.
  4. Finally, the Asset Management Service creates rules from the asset group and pushes the rules to the Customization Service. Once the data is published in the Customization Service, the Orchestration Service can supply the correct URLs in its JSON response by invoking the Customization Service with a request context that matches a given set of rules.

Conclusion

Automated asset generation has proven to be an extremely valuable investment. It is low-maintenance, high-leverage, and has allowed us to experiment with a variety of different types of assets on different platforms and on different parts of the product. This project was technically challenging and highly rewarding, both to the engineers involved in the project, and to the business. The Netflix homepage has come a long way over the last several years.

We’re hiring! Join Growth Engineering and help us build the future of Netflix.


Growth Engineering at Netflix — Automated Imagery Generation was originally published in Netflix TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

How to deploy the AWS Solution for Security Hub Automated Response and Remediation

Post Syndicated from Ramesh Venkataraman original https://aws.amazon.com/blogs/security/how-to-deploy-the-aws-solution-for-security-hub-automated-response-and-remediation/

In this blog post I show you how to deploy the Amazon Web Services (AWS) Solution for Security Hub Automated Response and Remediation. The first installment of this series was about how to create playbooks using Amazon CloudWatch Events, AWS Lambda functions, and AWS Security Hub custom actions that you can run manually based on triggers from Security Hub in a specific account. That solution requires an analyst to directly trigger an action using Security Hub custom actions and doesn’t work for customers who want to set up fully automated remediation based on findings across one or more accounts from their Security Hub master account.

The solution described in this post automates the cross-account response and remediation lifecycle from executing the remediation action to resolving the findings in Security Hub and notifying users of the remediation via Amazon Simple Notification Service (Amazon SNS). You can also deploy these automated playbooks as custom actions in Security Hub, which allows analysts to run them on-demand against specific findings. You can deploy these remediations as custom actions or as fully automated remediations.

Currently, the solution includes 10 playbooks aligned to the controls in the Center for Internet Security (CIS) AWS Foundations Benchmark standard in Security Hub, but playbooks for other standards such as AWS Foundational Security Best Practices (FSBP) will be added in the future.

Solution overview

Figure 1 shows the flow of events in the solution described in the following text.

Figure 1: Flow of events

Figure 1: Flow of events

Detect

Security Hub gives you a comprehensive view of your security alerts and security posture across your AWS accounts and automatically detects deviations from defined security standards and best practices.

Security Hub also collects findings from various AWS services and supported third-party partner products to consolidate security detection data across your accounts.

Ingest

All of the findings from Security Hub are automatically sent to CloudWatch Events and Amazon EventBridge and you can set up CloudWatch Events and EventBridge rules to be invoked on specific findings. You can also send findings to CloudWatch Events and EventBridge on demand via Security Hub custom actions.

Remediate

The CloudWatch Event and EventBridge rules can have AWS Lambda functions, AWS Systems Manager automation documents, or AWS Step Functions workflows as the targets of the rules. This solution uses automation documents and Lambda functions as response and remediation playbooks. Using cross-account AWS Identity and Access Management (IAM) roles, the playbook performs the tasks to remediate the findings using the AWS API when a rule is invoked.

Log

The playbook logs the results to the Amazon CloudWatch log group for the solution, sends a notification to an Amazon Simple Notification Service (Amazon SNS) topic, and updates the Security Hub finding. An audit trail of actions taken is maintained in the finding notes. The finding is updated as RESOLVED after the remediation is run. The security finding notes are updated to reflect the remediation performed.

Here are the steps to deploy the solution from this GitHub project.

  • In the Security Hub master account, you deploy the AWS CloudFormation template, which creates an AWS Service Catalog product along with some other resources. For a full set of what resources are deployed as part of an AWS CloudFormation stack deployment, you can find the full set of deployed resources in the Resources section of the deployed AWS CloudFormation stack. The solution uses the AWS Service Catalog to have the remediations available as a product that can be deployed after granting the users the required permissions to launch the product.
  • Add an IAM role that has administrator access to the AWS Service Catalog portfolio.
  • Deploy the CIS playbook from the AWS Service Catalog product list using the IAM role you added in the previous step.
  • Deploy the AWS Security Hub Automated Response and Remediation template in the master account in addition to the member accounts. This template establishes AssumeRole permissions to allow the playbook Lambda functions to perform remediations. Use AWS CloudFormation StackSets in the master account to have a centralized deployment approach across the master account and multiple member accounts.

Deployment steps for automated response and remediation

This section reviews the steps to implement the solution, including screenshots of the solution launched from an AWS account.

Launch AWS CloudFormation stack on the master account

As part of this AWS CloudFormation stack deployment, you create custom actions to configure Security Hub to send findings to CloudWatch Events. Lambda functions are used to provide remediation in response to actions sent to CloudWatch Events.

Note: In this solution, you create custom actions for the CIS standards. There will be more custom actions added for other security standards in the future.

To launch the AWS CloudFormation stack

  1. Deploy the AWS CloudFormation template in the Security Hub master account. In your AWS console, select CloudFormation and choose Create new stack and enter the S3 URL.
  2. Select Next to move to the Specify stack details tab, and then enter a Stack name as shown in Figure 2. In this example, I named the stack SO0111-SHARR, but you can use any name you want.
     
    Figure 2: Creating a CloudFormation stack

    Figure 2: Creating a CloudFormation stack

  3. Creating the stack automatically launches it, creating 21 new resources using AWS CloudFormation, as shown in Figure 3.
     
    Figure 3: Resources launched with AWS CloudFormation

    Figure 3: Resources launched with AWS CloudFormation

  4. An Amazon SNS topic is automatically created from the AWS CloudFormation stack.
  5. When you create a subscription, you’re prompted to enter an endpoint for receiving email notifications from Amazon SNS as shown in Figure 4. To subscribe to that topic that was created using CloudFormation, you must confirm the subscription from the email address you used to receive notifications.
     
    Figure 4: Subscribing to Amazon SNS topic

    Figure 4: Subscribing to Amazon SNS topic

Enable Security Hub

You should already have enabled Security Hub and AWS Config services on your master account and the associated member accounts. If you haven’t, you can refer to the documentation for setting up Security Hub on your master and member accounts. Figure 5 shows an AWS account that doesn’t have Security Hub enabled.
 

Figure 5: Enabling Security Hub for first time

Figure 5: Enabling Security Hub for first time

AWS Service Catalog product deployment

In this section, you use the AWS Service Catalog to deploy Service Catalog products.

To use the AWS Service Catalog for product deployment

  1. In the same master account, add roles that have administrator access and can deploy AWS Service Catalog products. To do this, from Services in the AWS Management Console, choose AWS Service Catalog. In AWS Service Catalog, select Administration, and then navigate to Portfolio details and select Groups, roles, and users as shown in Figure 6.
     
    Figure 6: AWS Service Catalog product

    Figure 6: AWS Service Catalog product

  2. After adding the role, you can see the products available for that role. You can switch roles on the console to assume the role that you granted access to for the product you added from the AWS Service Catalog. Select the three dots near the product name, and then select Launch product to launch the product, as shown in Figure 7.
     
    Figure 7: Launch the product

    Figure 7: Launch the product

  3. While launching the product, you can choose from the parameters to either enable or disable the automated remediation. Even if you do not enable fully automated remediation, you can still invoke a remediation action in the Security Hub console using a custom action. By default, it’s disabled, as highlighted in Figure 8.
     
    Figure 8: Enable or disable automated remediation

    Figure 8: Enable or disable automated remediation

  4. After launching the product, it can take from 3 to 5 minutes to deploy. When the product is deployed, it creates a new CloudFormation stack with a status of CREATE_COMPLETE as part of the provisioned product in the AWS CloudFormation console.

AssumeRole Lambda functions

Deploy the template that establishes AssumeRole permissions to allow the playbook Lambda functions to perform remediations. You must deploy this template in the master account in addition to any member accounts. Choose CloudFormation and create a new stack. In Specify stack details, go to Parameters and specify the Master account number as shown in Figure 9.
 

Figure 9: Deploy AssumeRole Lambda function

Figure 9: Deploy AssumeRole Lambda function

Test the automated remediation

Now that you’ve completed the steps to deploy the solution, you can test it to be sure that it works as expected.

To test the automated remediation

  1. To test the solution, verify that there are 10 actions listed in Custom actions tab in the Security Hub master account. From the Security Hub master account, open the Security Hub console and select Settings and then Custom actions. You should see 10 actions, as shown in Figure 10.
     
    Figure 10: Custom actions deployed

    Figure 10: Custom actions deployed

  2. Make sure you have member accounts available for testing the solution. If not, you can add member accounts to the master account as described in Adding and inviting member accounts.
  3. For testing purposes, you can use CIS 1.5 standard, which is to require that the IAM password policy requires at least one uppercase letter. Check the existing settings by navigating to IAM, and then to Account Settings. Under Password policy, you should see that there is no password policy set, as shown in Figure 11.
     
    Figure 11: Password policy not set

    Figure 11: Password policy not set

  4. To check the security settings, go to the Security Hub console and select Security standards. Choose CIS AWS Foundations Benchmark v1.2.0. Select CIS 1.5 from the list to see the Findings. You will see the Status as Failed. This means that the password policy to require at least one uppercase letter hasn’t been applied to either the master or the member account, as shown in Figure 12.
     
    Figure 12: CIS 1.5 finding

    Figure 12: CIS 1.5 finding

  5. Select CIS 1.5 – 1.11 from Actions on the top right dropdown of the Findings section from the previous step. You should see a notification with the heading Successfully sent findings to Amazon CloudWatch Events as shown in Figure 13.
     
    Figure 13: Sending findings to CloudWatch Events

    Figure 13: Sending findings to CloudWatch Events

  6. Return to Findings by selecting Security standards and then choosing CIS AWS Foundations Benchmark v1.2.0. Select CIS 1.5 to review Findings and verify that the Workflow status of CIS 1.5 is RESOLVED, as shown in Figure 14.
     
    Figure 14: Resolved findings

    Figure 14: Resolved findings

  7. After the remediation runs, you can verify that the Password policy is set on the master and the member accounts. To verify that the password policy is set, navigate to IAM, and then to Account Settings. Under Password policy, you should see that the account uses a password policy, as shown in Figure 15.
     
    Figure 15: Password policy set

    Figure 15: Password policy set

  8. To check the CloudWatch logs for the Lambda function, in the console, go to Services, and then select Lambda and choose the Lambda function and within the Lambda function, select View logs in CloudWatch. You can see the details of the function being run, including updating the password policy on both the master account and the member account, as shown in Figure 16.
     
    Figure 15: Lambda function log

    Figure 16: Lambda function log

Conclusion

In this post, you deployed the AWS Solution for Security Hub Automated Response and Remediation using Lambda and CloudWatch Events rules to remediate non-compliant CIS-related controls. With this solution, you can ensure that users in member accounts stay compliant with the CIS AWS Foundations Benchmark by automatically invoking guardrails whenever services move out of compliance. New or updated playbooks will be added to the existing AWS Service Catalog portfolio as they’re developed. You can choose when to take advantage of these new or updated playbooks.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security Hub forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Ramesh Venkataraman

Ramesh is a Solutions Architect who enjoys working with customers to solve their technical challenges using AWS services. Outside of work, Ramesh enjoys following stack overflow questions and answers them in any way he can.

Tracking the latest server images in Amazon EC2 Image Builder pipelines

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/tracking-the-latest-server-images-in-amazon-ec2-image-builder-pipelines/

This post courtesy of Anoop Rachamadugu, Cloud Architect at AWS

The Amazon EC2 Image Builder service helps users to build and maintain server images. The images created by EC2 Image Builder can be used with Amazon Elastic Compute Cloud (EC2) and on-premises. Image Builder reduces the effort of keeping images up-to-date and secure by providing a graphical interface, built-in automation, and AWS-provided security settings. Customers have told us that they manage multiple server images and are looking for ways to track the latest server images created by the pipelines.

In this blog post, I walk through a solution that uses AWS Lambda and AWS Systems Manager (SSM) Parameter Store. It tracks and updates the latest Amazon Machine Image (AMI) IDs every time an Image Builder pipeline is run. With Lambda, you pay only for what you use. You are charged based on the number of requests for your functions and the time it takes for your code to run. In this case, the Lambda function is invoked upon the completion of the image builder pipeline. Standard SSM parameters are available at no additional charge.

Users can reference the SSM parameters in automation scripts and AWS CloudFormation templates providing access to the latest AMI ID for your EC2 infrastructure. Consider the use case of updating Amazon Machine Image (AMI) IDs for the EC2 instances in your CloudFormation templates. Normally, you might map AMI IDs to specific instance types and Regions. Then to update these, you would manually change them in each of your templates. With the SSM parameter integration, your code remains untouched and a CloudFormation stack update operation automatically fetches the latest Parameter Store value.

Overview

This solution uses a Lambda function written in Python that subscribes to an Amazon Simple Notification Service (SNS) topic. The Lambda function and the SNS topic are deployed using AWS SAM CLI. Once deployed, the SNS topic must be configured in an existing Image Builder pipeline. This results in the Lambda function being invoked at the completion of the Image Builder pipeline.

When a Lambda function subscribes to an SNS topic, it is invoked with the payload of the published messages. The Lambda function receives the message payload as an input parameter. The Lambda function first checks the message payload to see if the image status is available. If the image state is available, it retrieves the AMI ID from the message payload and updates the SSM parameter.

EC2 Image builder architecture diagram

EC2 Image builder architecture diagram

Prerequisites

To get started with this solution, the following is required:

Deploying the solution

The solution consists of two files, which can be downloaded from the amazon-ec2-image-builder GitHub repository.

  1. The Python file image-builder-lambda-update-ssm.py contains the code for the Lambda function. It first checks the SNS message payload to determine if the image is available. If it’s available, it extracts the AMI ID from the SNS message payload and updates the SSM parameter specified.The ‘ssm_parameter_name’ variable specifies the SSM parameter path where the AMI ID should be stored and updated. The Lambda function finishes by adding tags to the SSM parameter.
  2. The template.yaml file is an AWS SAM template. It deploys the Lambda function, SNS topic, and IAM role required for the Lambda function. I use Python 3.7 as the runtime and assign a memory of 256 MB for the Lambda function. The IAM policy gives the Lambda function permissions to retrieve and update SSM parameters. Deploy this application using the AWS SAM CLI guided deploy:
    sam deploy --guided

After deploying the application, note the ARN of the created SNS topic. Next, update the infrastructure settings of an existing Image Builder pipeline with this newly created SNS topic. This results in the Lambda function being invoked upon the completion of the image builder pipeline.

Configuration details

Configuration details

Verifying the solution

After the completion of the image builder pipeline, use the AWS CLI or check the AWS Management Console to verify the updated SSM parameter. To verify via AWS CLI, run the following commands to retrieve and list the tags attached to the SSM parameter:

aws ssm get-parameter --name ‘/ec2-imagebuilder/latest’
aws ssm list-tags-for-resource --resource-type "Parameter" --resource-id ‘/ec2-imagebuilder/latest’

To verify via the AWS Management Console, navigate to the Parameter Store under AWS Systems Manager. Search for the parameter /ec2-imagebuilder/latest:

AWS Systems Manager: Parameter Store

AWS Systems Manager: Parameter Store

Select the Tags tab to view the tags attached to the SSM parameter:

Image builder tags list

Image builder tags list

Referencing the SSM Parameter in CloudFormation templates

Users can reference the SSM parameters in automation scripts and AWS CloudFormation templates providing access to the latest AMI ID for your EC2 infrastructure. This sample code shows how to reference the SSM parameter in a CloudFormation template.

Parameters :
  LatestAmiId :
    Type : 'AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>'
    Default: ‘/ec2-imagebuilder/latest’

Resources :
  Instance :
    Type : 'AWS::EC2::Instance'
    Properties :
      ImageId : !Ref LatestAmiId

Conclusion

In this blog post, I demonstrate a solution that allows users to track and update the latest AMI ID created by the Image Builder pipelines. The Lambda function retrieves the AMI ID of the image created by a pipeline and update an AWS Systems Manager parameter. This Lambda function is triggered via an SNS topic configured in an Image Builder pipeline.

The solution is deployed using AWS SAM CLI. I also note how users can reference Systems Manager parameters in AWS CloudFormation templates providing access to the latest AMI ID for your EC2 infrastructure.

The amazon-ec2-image-builder-samples GitHub repository provides a number of examples for getting started with EC2 Image Builder. Image Builder can make it easier for you to build virtual machine (VM) images.

Getting started with DevOps automation

Post Syndicated from Jared Murrell original https://github.blog/2020-10-29-getting-started-with-devops-automation/

This is the second post in our series on DevOps fundamentals. For a guide to what DevOps is and answers to common DevOps myths check out part one.

What role does automation play in DevOps?

First things first—automation is one of the key principles for accelerating with DevOps. As noted in my last blog post, it enables consistency, reliability, and efficiency within the organization, making it easier for teams to discover and troubleshoot problems. 

However, as we’ve worked with organizations, we’ve found not everyone knows where to get started, or which processes can and should be automated. In this post, we’ll discuss a few best practices and insights to get teams moving in the right direction.

A few helpful guidelines

The path to DevOps automation is continually evolving. Before we dive into best practices, there are a few common guidelines to keep in mind as you’re deciding what and how you automate. 

  • Choose open standards. Your contributors and team may change, but that doesn’t mean your tooling has to. By maintaining tooling that follows common, open standards, you can simplify onboarding and save time on specialized training. Community-driven standards for packaging, runtime, configuration, and even networking and storage—like those found in Kubernetes—also become even more important as DevOps and deployments move toward the cloud.
  • Use dynamic variables. Prioritizing reusable code will reduce the amount of rework and duplication you have, both now and in the future. Whether in scripts or specialized tools, securely using externally-defined variables is an easy way to apply your automation to different environments without needing to change the code itself.
  • Use flexible tooling you can take with you. It’s not always possible to find a tool that fits every situation, but using a DevOps tool that allows you to change technologies also helps reduce rework when companies change direction. By choosing a solution with a wide ecosystem of partner integrations that works with any cloud, you’ll be able to  define your unique set of best practices and reach your goals—without being restricted by your toolchain.

DevOps automation best practices

Now that our guidelines are in place, we can evaluate which sets of processes we need to automate. We’ve broken some best practices for DevOps automation into four categories to help you get started. 

1. Continuous integration, continuous delivery, and continuous deployment

We often think of the term “DevOps” as being synonymous with “CI/CD”. At GitHub we recognize that DevOps includes so much more, from enabling contributors to build and run code (or deploy configurations) to improving developer productivity. In turn, this shortens the time it takes to build and deliver applications, helping teams add value and learn faster. While CI/CD and DevOps aren’t precisely the same, CI/CD is still a core component of DevOps automation.

  • Continuous integration (CI) is a process that implements testing on every change, enabling users to see if their changes break anything in the environment. 
  • Continuous delivery (CD) is the practice of building software in a way that allows you to deploy any successful release candidate to production at any time.
  • Continuous deployment (CD) takes continuous delivery a step further. With continuous deployment, every successful change is automatically deployed to production. Since some industries and technologies can’t immediately release new changes to customers (think hardware and manufacturing), adopting continuous deployment depends on your organization and product.

Together, continuous integration and continuous delivery (commonly referred to as CI/CD) create a collaborative process for people to work on projects through shared ownership. At the same time, teams can maintain quality control through automation and bring new features to users with continuous deployment. 

2. Change management

Change management is often a critical part of business processes. Like the automation guidelines, there are some common principles and tooling that development and operations teams can use to create consistency.  

  • Version control: The practice of using version control has a long history rooted in helping people revert changes and learn from past decisions. From RCS to SVN, CVS to Perforce, ClearCase to Git, version control is a staple for enabling teams to collaborate by providing a common workflow and code base for individuals to work with. 
  • Change control: Along with maintaining your code’s version history, having a system in place to coordinate and facilitate changes helps to maintain product direction, reduces the probability of harmful changes to your code, and encourages a collaborative process.
  • Configuration management: Configuration management makes it easier for everyone to manage complex deployments through templates and manage changes at scale with proper controls and approvals.

3. ‘X’ as code

By now, you also may have heard of “infrastructure as code,” “configuration as code,” “policy as code,” or some of the other “as code” models. These models provide a declarative framework for managing different aspects of your operating environments through high level abstractions. Stated another way, you provide variables to a tool and the output is consistently the same, allowing you to recreate your resources consistently. DevOps implements the “as code” principle with several goals, including: an auditable change trail for compliance, collaborative change process via version control, a consistent, testable and reliable way of deploying resources, and as a way to lower the learning curve for new team members. 

  • Infrastructure as code (IaC) provides a declarative model for creating immutable infrastructure using the same versioning and workflow that developers use for source code. As changes are introduced to your infrastructure requirements, new infrastructure is defined, tested, and deployed with new configurations through automated declarative pipelines.
  • Platform as code (PaC) provides a declarative model for services similar to how infrastructure as code provides a framework for recreating the same infrastructure—allowing you to rapidly deploy services to existing infrastructure with high-level abstractions.
  • Configuration as code (CaC) brings the next level of declarative pipelining by defining the configuration of your applications as versioned resources.
  • Policy as code brings versioning and the DevOps workflow to security and policy management. 

4. Continuous monitoring

Operational insights are an invaluable component of any production environment. In order to understand the behaviors of your software in production, you need to have information about how it operates. Continuous monitoring—the processes and technology that monitor performance and stability of applications and infrastructure throughout the software lifecycle—provides operations teams with data to help troubleshoot, and development teams the information needed to debug and patch. This also leads into an important aspect of security, where DevSecOps takes on these principles with a security focus. Choosing the right monitoring tools can be the difference between a slight service interruption and a major outage. When it comes to gaining operational insights, there are some important considerations: 

  • Logging gives you a continuous stream of data about your business’ critical components. Application logs, infrastructure logs, and audit logs all provide important data that helps teams learn and improve products.
  • Monitoring provides a level of intelligence and interpretation to the raw data provided in logs and metrics. With advanced tooling, monitoring can provide teams with correlated insights beyond what the raw data provides.
  • Alerting provides proactive notifications to respective teams to help them stay ahead of major issues. When effectively implemented, these alerts not only let you know when something has gone wrong, but can also provide teams with critical debugging information to help solve the problem quickly.
  • Tracing takes logging a step further, providing a deeper level of application performance and behavioral insights that can greatly impact the stability and scalability of applications in production environments.

Putting DevOps automation into action

At this point, we’ve talked much about automation in the DevOps space, so is DevOps all about automation? Put simply, no. Automation is an important means to accomplishing this work efficiently between teams. Whether you’re new to DevOps or migrating from another set of automation solutions, testing new tooling with a small project or process is a great place to start. It will lay the foundation for scaling and standardizing automation across your entire organization, including how to measure effectiveness and progression toward your goals. 

Regardless of which toolset you choose to automate your DevOps workflow, evaluating your teams’ current workflows and the information you need to do your work will help guide you to your tool and platform selection, and set the stage for success. Here are a few more resources to help you along the way:

Want to see what DevOps automation looks like in practice? See how engineers at Wiley build faster and more securely with GitHub Actions.

Building, bundling, and deploying applications with the AWS CDK

Post Syndicated from Cory Hall original https://aws.amazon.com/blogs/devops/building-apps-with-aws-cdk/

The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework to model and provision your cloud application resources using familiar programming languages.

The post CDK Pipelines: Continuous delivery for AWS CDK applications showed how you can use CDK Pipelines to deploy a TypeScript-based AWS Lambda function. In that post, you learned how to add additional build commands to the pipeline to compile the TypeScript code to JavaScript, which is needed to create the Lambda deployment package.

In this post, we dive deeper into how you can perform these build commands as part of your AWS CDK build process by using the native AWS CDK bundling functionality.

If you’re working with Python, TypeScript, or JavaScript-based Lambda functions, you may already be familiar with the PythonFunction and NodejsFunction constructs, which use the bundling functionality. This post describes how to write your own bundling logic for instances where a higher-level construct either doesn’t already exist or doesn’t meet your needs. To illustrate this, I walk through two different examples: a Lambda function written in Golang and a static site created with Nuxt.js.

Concepts

A typical CI/CD pipeline contains steps to build and compile your source code, bundle it into a deployable artifact, push it to artifact stores, and deploy to an environment. In this post, we focus on the building, compiling, and bundling stages of the pipeline.

The AWS CDK has the concept of bundling source code into a deployable artifact. As of this writing, this works for two main types of assets: Docker images published to Amazon Elastic Container Registry (Amazon ECR) and files published to Amazon Simple Storage Service (Amazon S3). For files published to Amazon S3, this can be as simple as pointing to a local file or directory, which the AWS CDK uploads to Amazon S3 for you.

When you build an AWS CDK application (by running cdk synth), a cloud assembly is produced. The cloud assembly consists of a set of files and directories that define your deployable AWS CDK application. In the context of the AWS CDK, it might include the following:

  • AWS CloudFormation templates and instructions on where to deploy them
  • Dockerfiles, corresponding application source code, and information about where to build and push the images to
  • File assets and information about which S3 buckets to upload the files to

Use case

For this use case, our application consists of front-end and backend components. The example code is available in the GitHub repo. In the repository, I have split the example into two separate AWS CDK applications. The repo also contains the Golang Lambda example app and the Nuxt.js static site.

Golang Lambda function

To create a Golang-based Lambda function, you must first create a Lambda function deployment package. For Go, this consists of a .zip file containing a Go executable. Because we don’t commit the Go executable to our source repository, our CI/CD pipeline must perform the necessary steps to create it.

In the context of the AWS CDK, when we create a Lambda function, we have to tell the AWS CDK where to find the deployment package. See the following code:

new lambda.Function(this, 'MyGoFunction', {
  runtime: lambda.Runtime.GO_1_X,
  handler: 'main',
  code: lambda.Code.fromAsset(path.join(__dirname, 'folder-containing-go-executable')),
});

In the preceding code, the lambda.Code.fromAsset() method tells the AWS CDK where to find the Golang executable. When we run cdk synth, it stages this Go executable in the cloud assembly, which it zips and publishes to Amazon S3 as part of the PublishAssets stage.

If we’re running the AWS CDK as part of a CI/CD pipeline, this executable doesn’t exist yet, so how do we create it? One method is CDK bundling. The lambda.Code.fromAsset() method takes a second optional argument, AssetOptions, which contains the bundling parameter. With this bundling parameter, we can tell the AWS CDK to perform steps prior to staging the files in the cloud assembly.

Breaking down the BundlingOptions parameter further, we can perform the build inside a Docker container or locally.

Building inside a Docker container

For this to work, we need to make sure that we have Docker running on our build machine. In AWS CodeBuild, this means setting privileged: true. See the following code:

new lambda.Function(this, 'MyGoFunction', {
  code: lambda.Code.fromAsset(path.join(__dirname, 'folder-containing-source-code'), {
    bundling: {
      image: lambda.Runtime.GO_1_X.bundlingDockerImage,
      command: [
        'bash', '-c', [
          'go test -v',
          'GOOS=linux go build -o /asset-output/main',
      ].join(' && '),
    },
  })
  ...
});

We specify two parameters:

  • image (required) – The Docker image to perform the build commands in
  • command (optional) – The command to run within the container

The AWS CDK mounts the folder specified as the first argument to fromAsset at /asset-input inside the container, and mounts the asset output directory (where the cloud assembly is staged) at /asset-output inside the container.

After we perform the build commands, we need to make sure we copy the Golang executable to the /asset-output location (or specify it as the build output location like in the preceding example).

This is the equivalent of running something like the following code:

docker run \
  --rm \
  -v folder-containing-source-code:/asset-input \
  -v cdk.out/asset.1234a4b5/:/asset-output \
  lambci/lambda:build-go1.x \
  bash -c 'GOOS=linux go build -o /asset-output/main'

Building locally

To build locally (not in a Docker container), we have to provide the local parameter. See the following code:

new lambda.Function(this, 'MyGoFunction', {
  code: lambda.Code.fromAsset(path.join(__dirname, 'folder-containing-source-code'), {
    bundling: {
      image: lambda.Runtime.GO_1_X.bundlingDockerImage,
      command: [],
      local: {
        tryBundle(outputDir: string) {
          try {
            spawnSync('go version')
          } catch {
            return false
          }

          spawnSync(`GOOS=linux go build -o ${path.join(outputDir, 'main')}`);
          return true
        },
      },
    },
  })
  ...
});

The local parameter must implement the ILocalBundling interface. The tryBundle method is passed the asset output directory, and expects you to return a boolean (true or false). If you return true, the AWS CDK doesn’t try to perform Docker bundling. If you return false, it falls back to Docker bundling. Just like with Docker bundling, you must make sure that you place the Go executable in the outputDir.

Typically, you should perform some validation steps to ensure that you have the required dependencies installed locally to perform the build. This could be checking to see if you have go installed, or checking a specific version of go. This can be useful if you don’t have control over what type of build environment this might run in (for example, if you’re building a construct to be consumed by others).

If we run cdk synth on this, we see a new message telling us that the AWS CDK is bundling the asset. If we include additional commands like go test, we also see the output of those commands. This is especially useful if you wanted to fail a build if tests failed. See the following code:

$ cdk synth
Bundling asset GolangLambdaStack/MyGoFunction/Code/Stage...
✓  . (9ms)
✓  clients (5ms)

DONE 8 tests in 11.476s
✓  clients (5ms) (coverage: 84.6% of statements)
✓  . (6ms) (coverage: 78.4% of statements)

DONE 8 tests in 2.464s

Cloud Assembly

If we look at the cloud assembly that was generated (located at cdk.out), we see something like the following code:

$ cdk synth
Bundling asset GolangLambdaStack/MyGoFunction/Code/Stage...
✓  . (9ms)
✓  clients (5ms)

DONE 8 tests in 11.476s
✓  clients (5ms) (coverage: 84.6% of statements)
✓  . (6ms) (coverage: 78.4% of statements)

DONE 8 tests in 2.464s

It contains our GolangLambdaStack CloudFormation template that defines our Lambda function, as well as our Golang executable, bundled at asset.01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952/main.

Let’s look at how the AWS CDK uses this information. The GolangLambdaStack.assets.json file contains all the information necessary for the AWS CDK to know where and how to publish our assets (in this use case, our Golang Lambda executable). See the following code:

{
  "version": "5.0.0",
  "files": {
    "01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952": {
      "source": {
        "path": "asset.01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952",
        "packaging": "zip"
      },
      "destinations": {
        "current_account-current_region": {
          "bucketName": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}",
          "objectKey": "01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952.zip",
          "assumeRoleArn": "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/cdk-hnb659fds-file-publishing-role-${AWS::AccountId}-${AWS::Region}"
        }
      }
    }
  }
}

The file contains information about where to find the source files (source.path) and what type of packaging (source.packaging). It also tells the AWS CDK where to publish this .zip file (bucketName and objectKey) and what AWS Identity and Access Management (IAM) role to use (assumeRoleArn). In this use case, we only deploy to a single account and Region, but if you have multiple accounts or Regions, you see multiple destinations in this file.

The GolangLambdaStack.template.json file that defines our Lambda resource looks something like the following code:

{
  "Resources": {
    "MyGoFunction0AB33E85": {
      "Type": "AWS::Lambda::Function",
      "Properties": {
        "Code": {
          "S3Bucket": {
            "Fn::Sub": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}"
          },
          "S3Key": "01cf34ff646d380829dc4f2f6fc93995b13277bde7db81c24ac8500a83a06952.zip"
        },
        "Handler": "main",
        ...
      }
    },
    ...
  }
}

The S3Bucket and S3Key match the bucketName and objectKey from the assets.json file. By default, the S3Key is generated by calculating a hash of the folder location that you pass to lambda.Code.fromAsset(), (for this post, folder-containing-source-code). This means that any time we update our source code, this calculated hash changes and a new Lambda function deployment is triggered.

Nuxt.js static site

In this section, I walk through building a static site using the Nuxt.js framework. You can apply the same logic to any static site framework that requires you to run a build step prior to deploying.

To deploy this static site, we use the BucketDeployment construct. This is a construct that allows you to populate an S3 bucket with the contents of .zip files from other S3 buckets or from a local disk.

Typically, we simply tell the BucketDeployment construct where to find the files that it needs to deploy to the S3 bucket. See the following code:

new s3_deployment.BucketDeployment(this, 'DeployMySite', {
  sources: [
    s3_deployment.Source.asset(path.join(__dirname, 'path-to-directory')),
  ],
  destinationBucket: myBucket
});

To deploy a static site built with a framework like Nuxt.js, we need to first run a build step to compile the site into something that can be deployed. For Nuxt.js, we run the following two commands:

  • yarn install – Installs all our dependencies
  • yarn generate – Builds the application and generates every route as an HTML file (used for static hosting)

This creates a dist directory, which you can deploy to Amazon S3.

Just like with the Golang Lambda example, we can perform these steps as part of the AWS CDK through either local or Docker bundling.

Building inside a Docker container

To build inside a Docker container, use the following code:

new s3_deployment.BucketDeployment(this, 'DeployMySite', {
  sources: [
    s3_deployment.Source.asset(path.join(__dirname, 'path-to-nuxtjs-project'), {
      bundling: {
        image: cdk.BundlingDockerImage.fromRegistry('node:lts'),
        command: [
          'bash', '-c', [
            'yarn install',
            'yarn generate',
            'cp -r /asset-input/dist/* /asset-output/',
          ].join(' && '),
        ],
      },
    }),
  ],
  ...
});

For this post, we build inside the publicly available node:lts image hosted on DockerHub. Inside the container, we run our build commands yarn install && yarn generate, and copy the generated dist directory to our output directory (the cloud assembly).

The parameters are the same as described in the Golang example we walked through earlier.

Building locally

To build locally, use the following code:

new s3_deployment.BucketDeployment(this, 'DeployMySite', {
  sources: [
    s3_deployment.Source.asset(path.join(__dirname, 'path-to-nuxtjs-project'), {
      bundling: {
        local: {
          tryBundle(outputDir: string) {
            try {
              spawnSync('yarn --version');
            } catch {
              return false
            }

            spawnSync('yarn install && yarn generate');

       fs.copySync(path.join(__dirname, ‘path-to-nuxtjs-project’, ‘dist’), outputDir);
            return true
          },
        },
        image: cdk.BundlingDockerImage.fromRegistry('node:lts'),
        command: [],
      },
    }),
  ],
  ...
});

Building locally works the same as the Golang example we walked through earlier, with one exception. We have one additional command to run that copies the generated dist folder to our output directory (cloud assembly).

Conclusion

This post showed how you can easily compile your backend and front-end applications using the AWS CDK. You can find the example code for this post in this GitHub repo. If you have any questions or comments, please comment on the GitHub repo. If you have any additional examples you want to add, we encourage you to create a Pull Request with your example!

Our code also contains examples of deploying the applications using CDK Pipelines, so if you’re interested in deploying the example yourself, check out the example repo.

 

About the author

Cory Hall

Cory is a Solutions Architect at Amazon Web Services with a passion for DevOps and is based in Charlotte, NC. Cory works with enterprise AWS customers to help them design, deploy, and scale applications to achieve their business goals.

Complete CI/CD with AWS CodeCommit, AWS CodeBuild, AWS CodeDeploy, and AWS CodePipeline

Post Syndicated from Nitin Verma original https://aws.amazon.com/blogs/devops/complete-ci-cd-with-aws-codecommit-aws-codebuild-aws-codedeploy-and-aws-codepipeline/

Many organizations have been shifting to DevOps practices, which is the combination of cultural philosophies, practices, and tools that increases your organization’s ability to deliver applications and services at high velocity; for example, evolving and improving products at a faster pace than organizations using traditional software development and infrastructure management processes.

DevOps-Feedback-Flow

An integral part of DevOps is adopting the culture of continuous integration and continuous delivery/deployment (CI/CD), where a commit or change to code passes through various automated stage gates, all the way from building and testing to deploying applications, from development to production environments.

This post uses the AWS suite of CI/CD services to compile, build, and install a version-controlled Java application onto a set of Amazon Elastic Compute Cloud (Amazon EC2) Linux instances via a fully automated and secure pipeline. The goal is to promote a code commit or change to pass through various automated stage gates all the way from development to production environments, across AWS accounts.

AWS services

This solution uses the following AWS services:

  • AWS CodeCommit – A fully-managed source control service that hosts secure Git-based repositories. CodeCommit makes it easy for teams to collaborate on code in a secure and highly scalable ecosystem. This solution uses CodeCommit to create a repository to store the application and deployment codes.
  • AWS CodeBuild – A fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy, on a dynamically created build server. This solution uses CodeBuild to build and test the code, which we deploy later.
  • AWS CodeDeploy – A fully managed deployment service that automates software deployments to a variety of compute services such as Amazon EC2, AWS Fargate, AWS Lambda, and your on-premises servers. This solution uses CodeDeploy to deploy the code or application onto a set of EC2 instances running CodeDeploy agents.
  • AWS CodePipeline – A fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. This solution uses CodePipeline to create an end-to-end pipeline that fetches the application code from CodeCommit, builds and tests using CodeBuild, and finally deploys using CodeDeploy.
  • AWS CloudWatch Events – An AWS CloudWatch Events rule is created to trigger the CodePipeline on a Git commit to the CodeCommit repository.
  • Amazon Simple Storage Service (Amazon S3) – An object storage service that offers industry-leading scalability, data availability, security, and performance. This solution uses an S3 bucket to store the build and deployment artifacts created during the pipeline run.
  • AWS Key Management Service (AWS KMS) – AWS KMS makes it easy for you to create and manage cryptographic keys and control their use across a wide range of AWS services and in your applications. This solution uses AWS KMS to make sure that the build and deployment artifacts stored on the S3 bucket are encrypted at rest.

Overview of solution

This solution uses two separate AWS accounts: a dev account (111111111111) and a prod account (222222222222) in Region us-east-1.

We use the dev account to deploy and set up the CI/CD pipeline, along with the source code repo. It also builds and tests the code locally and performs a test deploy.

The prod account is any other account where the application is required to be deployed from the pipeline in the dev account.

In summary, the solution has the following workflow:

  • A change or commit to the code in the CodeCommit application repository triggers CodePipeline with the help of a CloudWatch event.
  • The pipeline downloads the code from the CodeCommit repository, initiates the Build and Test action using CodeBuild, and securely saves the built artifact on the S3 bucket.
  • If the preceding step is successful, the pipeline triggers the Deploy in Dev action using CodeDeploy and deploys the app in dev account.
  • If successful, the pipeline triggers the Deploy in Prod action using CodeDeploy and deploys the app in the prod account.

The following diagram illustrates the workflow:

cicd-overall-flow

 

Failsafe deployments

This example of CodeDeploy uses the IN_PLACE type of deployment. However, to minimize the downtime, CodeDeploy inherently supports multiple deployment strategies. This example makes use of following features: rolling deployments and automatic rollback.

CodeDeploy provides the following three predefined deployment configurations, to minimize the impact during application upgrades:

  • CodeDeployDefault.OneAtATime – Deploys the application revision to only one instance at a time
  • CodeDeployDefault.HalfAtATime – Deploys to up to half of the instances at a time (with fractions rounded down)
  • CodeDeployDefault.AllAtOnce – Attempts to deploy an application revision to as many instances as possible at once

For OneAtATime and HalfAtATime, CodeDeploy monitors and evaluates instance health during the deployment and only proceeds to the next instance or next half if the previous deployment is healthy. For more information, see Working with deployment configurations in CodeDeploy.

You can also configure a deployment group or deployment to automatically roll back when a deployment fails or when a monitoring threshold you specify is met. In this case, the last known good version of an application revision is automatically redeployed after a failure with the new application version.

How CodePipeline in the dev account deploys apps in the prod account

In this post, the deployment pipeline using CodePipeline is set up in the dev account, but it has permissions to deploy the application in the prod account. We create a special cross-account role in the prod account, which has the following:

  • Permission to use fetch artifacts (app) rom Amazon S3 and deploy it locally in the account using CodeDeploy
  • Trust with the dev account where the pipeline runs

CodePipeline in the dev account assumes this cross-account role in the prod account to deploy the app.

Do I need multiple accounts?
If you answer “yes” to any of the following questions you should consider creating more AWS accounts:

  • Does your business require administrative isolation between workloads? Administrative isolation by account is the most straightforward way to grant independent administrative groups different levels of administrative control over AWS resources based on workload, development lifecycle, business unit (BU), or data sensitivity.
  • Does your business require limited visibility and discoverability of workloads? Accounts provide a natural boundary for visibility and discoverability. Workloads cannot be accessed or viewed unless an administrator of the account enables access to users managed in another account.
  • Does your business require isolation to minimize blast radius? Separate accounts help define boundaries and provide natural blast-radius isolation to limit the impact of a critical event such as a security breach, an unavailable AWS Region or Availability Zone, account suspensions, and so on.
  • Does your business require a particular workload to operate within AWS service limits without impacting the limits of another workload? You can use AWS account service limits to impose restrictions on a business unit, development team, or project. For example, if you create an AWS account for a project group, you can limit the number of Amazon Elastic Compute Cloud (Amazon EC2) or high performance computing (HPC) instances that can be launched by the account.
  • Does your business require strong isolation of recovery or auditing data? If regulatory requirements require you to control access and visibility to auditing data, you can isolate the data in an account separate from the one where you run your workloads (for example, by writing AWS CloudTrail logs to a different account).

Prerequisites

For this walkthrough, you should complete the following prerequisites:

  1. Have access to at least two AWS accounts. For this post, the dev and prod accounts are in us-east-1. You can search and replace the Region and account IDs in all the steps and sample AWS Identity and Access Management (IAM) policies in this post.
  2. Ensure you have EC2 Linux instances with the CodeDeploy agent installed in all the accounts or VPCs where the sample Java application is to be installed (dev and prod accounts).
    • To manually create EC2 instances with CodeDeploy agent, refer Create an Amazon EC2 instance for CodeDeploy (AWS CLI or Amazon EC2 console). Keep in mind the following:
      • CodeDeploy uses EC2 instance tags to identify instances to use to deploy the application, so it’s important to set tags appropriately. For this post, we use the tag name Application with the value MyWebApp to identify instances where the sample app is installed.
      • Make sure to use an EC2 instance profile (AWS Service Role for EC2 instance) with permissions to read the S3 bucket containing artifacts built by CodeBuild. Refer to the IAM role cicd_ec2_instance_profile in the table Roles-1 below for the set of permissions required. You must update this role later with the actual KMS key and S3 bucket name created as part of the deployment process.
    • To create EC2 Linux instances via AWS Cloudformation, download and launch the AWS CloudFormation template from the GitHub repo: cicd-ec2-instance-with-codedeploy.json
      • This deploys an EC2 instance with AWS CodeDeploy agent.
      • Inputs required:
        • AMI : Enter name of the Linux AMI in your region. (This template has been tested with latest Amazon Linux 2 AMI)
        • Ec2SshKeyPairName: Name of an existing SSH KeyPair
        • Ec2IamInstanceProfile: Name of an existing EC2 instance profile. Note: Use the permissions in the template cicd_ec2_instance_profile_policy.json to create the policy for this EC2 Instance Profile role. You must update this role later with the actual KMS key and S3 bucket name created as part of the deployment process.
        • Update the EC2 instance Tags per your need.
  3. Ensure required IAM permissions. Have an IAM user with an IAM Group or Role that has the following access levels or permissions:

    AWS Service / Components  Access Level Accounts Comments
    AWS CodeCommit Full (admin) Dev Use AWS managed policy AWSCodeCommitFullAccess.
    AWS CodePipeline Full (admin) Dev Use AWS managed policy AWSCodePipelineFullAccess.
    AWS CodeBuild Full (admin) Dev Use AWS managed policy AWSCodeBuildAdminAccess.
    AWS CodeDeploy Full (admin) All

    Use AWS managed policy

    AWSCodeDeployFullAccess.

    Create S3 bucket and bucket policies Full (admin) Dev IAM policies can be restricted to specific bucket.
    Create KMS key and policies Full (admin) Dev IAM policies can be restricted to specific KMS key.
    AWS CloudFormation Full (admin) Dev

    Use AWS managed policy

    AWSCloudFormationFullAccess.

    Create and pass IAM roles Full (admin) All Ability to create IAM roles and policies can be restricted to specific IAM roles or actions. Also, an admin team with IAM privileges could create all the required roles. Refer to the IAM table Roles-1 below.
    AWS Management Console and AWS CLI As per IAM User permissions All To access suite of Code services.

     

  4. Create Git credentials for CodeCommit in the pipeline account (dev account). AWS allows you to either use Git credentials or associate SSH public keys with your IAM user. For this post, use Git credentials associated with your IAM user (created in the previous step). For instructions on creating a Git user, see Create Git credentials for HTTPS connections to CodeCommit. Download and save the Git credentials to use later for deploying the application.
  5. Create all AWS IAM roles as per the following tables (Roles-1). Make sure to update the following references in all the given IAM roles and policies:
    • Replace the sample dev account (111111111111) and prod account (222222222222) with actual account IDs
    • Replace the S3 bucket mywebapp-codepipeline-bucket-us-east-1-111111111111 with your preferred bucket name.
    • Replace the KMS key ID key/82215457-e360-47fc-87dc-a04681c91ce1 with your KMS key ID.

Table: Roles-1

Service IAM Role Type Account IAM Role Name (used for this post) IAM Role Policy (required for this post) IAM Role Permissions
AWS CodePipeline Service role Dev (111111111111)

cicd_codepipeline_service_role

Select Another AWS Account and use this account as the account ID to create the role.

Later update the trust as follows:
“Principal”: {“Service”: “codepipeline.amazonaws.com”},

Use the permissions in the template cicd_codepipeline_service_policy.json to create the policy for this role. This CodePipeline service role has appropriate permissions to the following services in a local account:

  • Manage CodeCommit repos
  • Initiate build via CodeBuild
  • Create deployments via CodeDeploy
  • Assume cross-account CodeDeploy role in prod account to deploy the application
AWS CodePipeline IAM role Dev (111111111111)

cicd_codepipeline_trigger_cwe_role

Select Another AWS Account and use this account as the account ID to create the role.

Later update the trust as follows:
“Principal”: {“Service”: “events.amazonaws.com”},

Use the permissions in the template cicd_codepipeline_trigger_cwe_policy.json to create the policy for this role. CodePipeline uses this role to set a CloudWatch event to trigger the pipeline when there is a change or commit made to the code repository.
AWS CodePipeline IAM role Prod (222222222222)

cicd_codepipeline_cross_ac_role

Choose Another AWS Account and use the dev account as the trusted account ID to create the role.

Use the permissions in the template cicd_codepipeline_cross_ac_policy.json to create the policy for this role. This role is created in the prod account and has permissions to use CodeDeploy and fetch from Amazon S3. The role is assumed by CodePipeline from the dev account to deploy the app in the prod account. Make sure to set up trust with the dev account for this IAM role on the Trust relationships tab.
AWS CodeBuild Service role Dev (111111111111)

cicd_codebuild_service_role

Choose CodeBuild as the use case to create the role.

Use the permissions in the template cicd_codebuild_service_policy.json to create the policy for this role. This CodeBuild service role has appropriate permissions to:

  • The S3 bucket to store artefacts
  • Stream logs to CloudWatch Logs
  • Pull code from CodeCommit
  • Get the SSM parameter for CodeBuild
  • Miscellaneous Amazon EC2 permissions
AWS CodeDeploy Service role Dev (111111111111) and Prod (222222222222)

cicd_codedeploy_service_role

Choose CodeDeploy as the use case to create the role.

Use the built-in AWS managed policy AWSCodeDeployRole for this role. This CodeDeploy service role has appropriate permissions to:

  • Miscellaneous Amazon EC2 Auto Scaling
  • Miscellaneous Amazon EC2
  • Publish Amazon SNS topic
  • AWS CloudWatch metrics
  • Elastic Load Balancing
EC2 Instance Service role for EC2 instance profile Dev (111111111111) and Prod (222222222222)

cicd_ec2_instance_profile

Choose EC2 as the use case to create the role.

Use the permissions in the template cicd_ec2_instance_profile_policy.json to create the policy for this role.

This is set as the EC2 instance profile for the EC2 instances where the app is deployed. It has appropriate permissions to fetch artefacts from Amazon S3 and decrypt contents using the KMS key.

 

You must update this role later with the actual KMS key and S3 bucket name created as part of the deployment process.

 

 

Setting up the prod account

To set up the prod account, complete the following steps:

  1. Download and launch the AWS CloudFormation template from the GitHub repo: cicd-codedeploy-prod.json
    • This deploys the CodeDeploy app and deployment group.
    • Make sure that you already have a set of EC2 Linux instances with the CodeDeploy agent installed in all the accounts where the sample Java application is to be installed (dev and prod accounts). If not, refer back to the Prerequisites section.
  2. Update the existing EC2 IAM instance profile (cicd_ec2_instance_profile):
    • Replace the S3 bucket name mywebapp-codepipeline-bucket-us-east-1-111111111111 with your S3 bucket name (the one used for the CodePipelineArtifactS3Bucket variable when you launched the CloudFormation template in the dev account).
    • Replace the KMS key ARN arn:aws:kms:us-east-1:111111111111:key/82215457-e360-47fc-87dc-a04681c91ce1 with your KMS key ARN (the one created as part of the CloudFormation template launch in the dev account).

Setting up the dev account

To set up your dev account, complete the following steps:

  1. Download and launch the CloudFormation template from the GitHub repo: cicd-aws-code-suite-dev.json
    The stack deploys the following services in the dev account:

    • CodeCommit repository
    • CodePipeline
    • CodeBuild environment
    • CodeDeploy app and deployment group
    • CloudWatch event rule
    • KMS key (used to encrypt the S3 bucket)
    • S3 bucket and bucket policy
  2. Use following values as inputs to the CloudFormation template. You should have created all the existing resources and roles beforehand as part of the prerequisites.

    Key Example Value Comments
    CodeCommitWebAppRepo MyWebAppRepo Name of the new CodeCommit repository for your web app.
    CodeCommitMainBranchName master Main branch name on your CodeCommit repository. Default is master (which is pushed to the prod environment).
    CodeBuildProjectName MyCBWebAppProject Name of the new CodeBuild environment.
    CodeBuildServiceRole arn:aws:iam::111111111111:role/cicd_codebuild_service_role ARN of an existing IAM service role to be associated with CodeBuild to build web app code.
    CodeDeployApp MyCDWebApp Name of the new CodeDeploy app to be created for your web app. We assume that the CodeDeploy app name is the same in all accounts where deployment needs to occur (in this case, the prod account).
    CodeDeployGroupDev MyCICD-Deployment-Group-Dev Name of the new CodeDeploy deployment group to be created in the dev account.
    CodeDeployGroupProd MyCICD-Deployment-Group-Prod Name of the existing CodeDeploy deployment group in prod account. Created as part of the prod account setup.

    CodeDeployGroupTagKey

     

    Application Name of the tag key that CodeDeploy uses to identify the existing EC2 fleet for the deployment group to use.

    CodeDeployGroupTagValue

     

    MyWebApp Value of the tag that CodeDeploy uses to identify the existing EC2 fleet for the deployment group to use.
    CodeDeployConfigName CodeDeployDefault.OneAtATime

    Desired Code Deploy config name. Valid options are:

    CodeDeployDefault.OneAtATime

    CodeDeployDefault.HalfAtATime

    CodeDeployDefault.AllAtOnce

    For more information, see Deployment configurations on an EC2/on-premises compute platform.

    CodeDeployServiceRole arn:aws:iam::111111111111:role/cicd_codedeploy_service_role

    ARN of an existing IAM service role to be associated with CodeDeploy to deploy web app.

     

    CodePipelineName MyWebAppPipeline Name of the new CodePipeline to be created for your web app.
    CodePipelineArtifactS3Bucket mywebapp-codepipeline-bucket-us-east-1-111111111111 Name of the new S3 bucket to be created where artifacts for the pipeline are stored for this web app.
    CodePipelineServiceRole arn:aws:iam::111111111111:role/cicd_codepipeline_service_role ARN of an existing IAM service role to be associated with CodePipeline to deploy web app.
    CodePipelineCWEventTriggerRole arn:aws:iam::111111111111:role/cicd_codepipeline_trigger_cwe_role ARN of an existing IAM role used to trigger the pipeline you named earlier upon a code push to the CodeCommit repository.
    CodeDeployRoleXAProd arn:aws:iam::222222222222:role/cicd_codepipeline_cross_ac_role ARN of an existing IAM role in the cross-account for CodePipeline to assume to deploy the app.

    It should take 5–10 minutes for the CloudFormation stack to complete. When the stack is complete, you can see that CodePipeline has built the pipeline (MyWebAppPipeline) with the CodeCommit repository and CodeBuild environment, along with actions for CodeDeploy in local (dev) and cross-account (prod). CodePipeline should be in a failed state because your CodeCommit repository is empty initially.

  3. Update the existing Amazon EC2 IAM instance profile (cicd_ec2_instance_profile):
    • Replace the S3 bucket name mywebapp-codepipeline-bucket-us-east-1-111111111111 with your S3 bucket name (the one used for the CodePipelineArtifactS3Bucket parameter when launching the CloudFormation template in the dev account).
    • Replace the KMS key ARN arn:aws:kms:us-east-1:111111111111:key/82215457-e360-47fc-87dc-a04681c91ce1 with your KMS key ARN (the one created as part of the CloudFormation template launch in the dev account).

Deploying the application

You’re now ready to deploy the application via your desktop or PC.

  1. Assuming you have the required HTTPS Git credentials for CodeCommit as part of the prerequisites, clone the CodeCommit repo that was created earlier as part of the dev account setup. Obtain the name of the CodeCommit repo to clone, from the CodeCommit console. Enter the Git user name and password when prompted. For example:
    $ git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/MyWebAppRepo my-web-app-repo
    Cloning into 'my-web-app-repo'...
    Username for 'https://git-codecommit.us-east-1.amazonaws.com/v1/repos/MyWebAppRepo': xxxx
    Password for 'https://[email protected]/v1/repos/MyWebAppRepo': xxxx

  2. Download the MyWebAppRepo.zip file containing a sample Java application, CodeBuild configuration to build the app, and CodeDeploy config file to deploy the app.
  3. Copy and unzip the file into the my-web-app-repo Git repository folder created earlier.
  4. Assuming this is the sample app to be deployed, commit these changes to the Git repo. For example:
    $ cd my-web-app-repo 
    $ git add -A 
    $ git commit -m "initial commit" 
    $ git push

For more information, see Tutorial: Create a simple pipeline (CodeCommit repository).

After you commit the code, the CodePipeline will be triggered and all the stages and your application should be built, tested, and deployed all the way to the production environment!

The following screenshot shows the entire pipeline and its latest run:

 

Troubleshooting

To troubleshoot any service-related issues, see the following:

Cleaning up

To avoid incurring future charges or to remove any unwanted resources, delete the following:

  • EC2 instance used to deploy the application
  • CloudFormation template to remove all AWS resources created through this post
  •  IAM users or roles

Conclusion

Using this solution, you can easily set up and manage an entire CI/CD pipeline in AWS accounts using the native AWS suite of CI/CD services, where a commit or change to code passes through various automated stage gates all the way from building and testing to deploying applications, from development to production environments.

FAQs

In this section, we answer some frequently asked questions:

  1. Can I expand this deployment to more than two accounts?
    • Yes. You can deploy a pipeline in a tooling account and use dev, non-prod, and prod accounts to deploy code on EC2 instances via CodeDeploy. Changes are required to the templates and policies accordingly.
  2. Can I ensure the application isn’t automatically deployed in the prod account via CodePipeline and needs manual approval?
  3. Can I use a CodeDeploy group with an Auto Scaling group?
    • Yes. Minor changes required to the CodeDeploy group creation process. Refer to the following Solution Variations section for more information.
  4. Can I use this pattern for EC2 Windows instances?

Solution variations

In this section, we provide a few variations to our solution:

Author bio

author-pic

 Nitin Verma

Nitin is currently a Sr. Cloud Architect in the AWS Managed Services(AMS). He has many years of experience with DevOps-related tools and technologies. Speak to your AWS Managed Services representative to deploy this solution in AMS!

 

Automated CloudFormation Testing Pipeline with TaskCat and CodePipeline

Post Syndicated from Raleigh Hansen original https://aws.amazon.com/blogs/devops/automated-cloudformation-testing-pipeline-with-taskcat-and-codepipeline/

Researchers at Academic Medical Centers (AMCs) use programs such as Observational Health Data Sciences and Informatics (OHDSI) and Research Electronic Data Capture (REDCap) to interact with healthcare data. Our internal team at AWS has provided solutions such as OHDSI-on-AWS and REDCap environments on AWS to help clinicians analyze healthcare data in the AWS Cloud. Occasionally, these solutions break due to a change in some portion of the solution (e.g. updated services). The Automated Solutions Testing Pipeline enables our team to take a proactive approach to discovering these breaks and their cause in order to expedite the repair process.

OHDSI-on-AWS provides these AMCs with the ability to store and analyze observational health data in the AWS cloud. REDCap is a web application for managing surveys and databases with HIPAA-compliant environments. Using our solutions, these programs can be spun up easily on the AWS infrastructure using AWS CloudFormation templates.

Updates to AWS services and other program libraries can cause the CloudFormation template to fail during deployment. Other times, the outputs may not be operating correctly, or the template may not work on every AWS region. This can create a negative customer experience. Some customers may discover this kind of break and decide to not move forward with using the solution. Other customers may not even realize the solution is broken, so they might be unknowingly working with an uncooperative environment. Furthermore, we cannot always provide fast support to the customers who contact us about broken solutions. To meet our team’s needs and the needs of our customers, we decided to focus our efforts on taking a CI/CD approach to maintain these solutions. We developed the Automated Testing Pipeline which regularly tests solution deployment and changes to source files.

This post shows the features of the Automated Testing Pipeline and provides resources to help you get started using it with your AWS account.

Overview of Automated Testing Pipeline Solution

The Automated Testing Pipeline solution as a whole is designed to automatically deploy CloudFormation templates, run tests against the deployed environments, send notifications if an issue is discovered, and allow for insightful testing data to be easily explored.

CloudFormation templates to be tested are stored in an Amazon S3 bucket. Custom test scripts and TaskCat deployment configuration are stored in an AWS CodeCommit repository.

The pipeline is triggered in one of three ways: an update to the CloudFormation Template in S3, an Amazon CloudWatch events rule, and an update to the testing source code repository. Once the pipeline has been triggered, AWS CodeBuild pulls the source code to deploy the CloudFormation template, test the deployed environment, and store the results in an S3 bucket. If any failures are discovered, subscribers to the failure topic are notified. The following diagram shows its overall architecture.

Diagram of Automated Testing Pipeline architecture

Diagram of Automated Testing Pipeline architecture

In order to create the Automated Testing Pipeline, two interns collaborated over the course of 5 weeks to produce the architecture and custom test scripts. We divided the work of constructing a serverless architecture and writing out test scripts for the output urls for OHDSI-on-AWS and REDCap environments on AWS.

The following tasks were completed to build out the Automated Testing Pipeline solution:

  • Setup AWS IAM roles for accessing AWS resources securely
  • Create CloudWatch events to trigger AWS CodePipeline
  • Setup CodePipeline and CodeBuild to run TaskCat and testing scripts
  • Configure TaskCat to deploy CloudFormation solutions in various AWS Regions
  • Write test scripts to interact with CloudFormation solutions’ deployed environments
  • Subscribe to receive emails detailing test results
  • Create a CloudFormation template for the Automated Testing Pipeline

The architecture can be extended to test any CloudFormation stack. For this particular use case, we wrote the test scripts specifically to test the urls output by the CloudFormation solutions. The Automated Testing Pipeline has the following features:

  • Deployed in a single AWS Region, with the exception of the tested CloudFormation solution
  • Has a serverless architecture operating at the AWS Region level
  • Deploys a pipeline which can deploy and test the CloudFormation solution
  • Creates CloudWatch events to activate the pipeline on a schedule or when the solution is updated
  • Creates an Amazon SNS topic for notifying subscribers when there are errors
  • Includes code for running TaskCat and scripts to test solution functionality
  • Built automatically in minutes
  • Low in cost with free tier benefits

The pipeline is triggered automatically when an event occurs. These events include a change to the CloudFormation solution template, a change to the code in the testing repository, and an alarm set off by a regular schedule. Additional events can be added in the CloudWatch console.

When the pipeline is triggered, the testing environment is set up by CodeBuild. CodeBuild uses a build specification file kept within our source repository to set up the environment and run the test scripts. We created a CodeCommit repository to host the test scripts alongside the build specification. The build specification includes commands run TaskCat — an open-source tool for testing the deployment of CloudFormation templates. TaskCat provides the ability to test the deployment of the CloudFormation solution, but we needed custom test scripts to ensure that we can interact with the deployed environment as expected. If the template is successfully deployed, CodeBuild handles running the test scripts against the CloudFormation solution environment. In our case, the environment is accessed via urls output by the CloudFormation solution.

We used a Selenium WebDriver for interacting with the web pages given by the output urls. This allowed us to programmatically navigate a headless web browser in the serverless environment and gave us the ability to use text output by JavaScript functions to understand the state of the test. You can see this interaction occurring in the code snippet below.

def log_in(driver, user, passw, link, btn_path, title):
    """Enter username and password then submit to log in

        :param driver: webdriver for Chrome page
        :param user: username as String
        :param passw: password as String
        :param link: url for page being tested as String
        :param btn_path: xpath to submit button
        :param title: expected page title upon successful sign in
        :return: success String tuple if log in completed, failure description tuple String otherwise
    """
    try:
        # post username and password data
        driver.find_element_by_xpath("//input[ @name='username' ]").send_keys(user)
        driver.find_element_by_xpath("//input[ @name='password' ]").send_keys(passw)

        # click sign in button and wait for page update
        driver.find_element_by_xpath(btn_path).click()
    except NoSuchElementException:
        return 'FAILURE', 'Unable to access page elements'

    try:
        WebDriverWait(driver, 20).until(ec.url_changes(link))
        WebDriverWait(driver, 20).until(ec.title_is(title))
    except TimeoutException as e:
        print("Timeout occurred (" + e + ") while attempting to sign in to " + driver.current_url)
        if "Sign In" in driver.title or "invalid user" in driver.page_source.lower():
            return 'FAILURE', 'Incorrect username or password'
        else:
            return 'FAILURE', 'Sign in attempt timed out'

    return 'SUCCESS', 'Sign in complete'

We store the test results in JSON format for ease of parsing. TaskCat generates a dashboard which we customize to display these test results. We are able to insert our JSON results into the dashboard in order to make it easy to find errors and access log files. This dashboard is a static html file that can be hosted on an S3 bucket. In addition, messages are published to topics in SNS whenever an error occurs which provide a link to this dashboard.

Dashboard containing descriptions of tests and their results

Customized TaskCat dashboard

In true CI/CD fashion, this end-to-end design automatically performs tasks that would otherwise be performed manually. We have shown how deploying solutions, testing solutions, notifying maintainers, and providing a results dashboard are all actions handled entirely by the Automated Testing Pipeline.

Getting Started with the Automated Testing Pipeline

Prerequisite tasks to complete before deploying the pipeline:

Once the prerequisite tasks are completed, the pipeline is ready to be deployed. Detailed information about deployment, altering the source code to fit your use case, and troubleshooting issues can be found at the GitHub page for the Automated Testing Pipeline.

For those looking to jump right into deployment, click the Launch Stack button below.

Button to click to deploy the Automated Testing Pipeline via CloudFormation

Tasks to complete after deployment:

  • Subscribe to SNS topic for error messages
  • Update the code to match the parameters and CloudFormation template that were chosen
  • Skip this step if you are testing OHDSI-on-AWS. Upload the desired CloudFormation template to the created source S3 Bucket
  • Push the source code to the created CodeCommit Repository

After the code is pushed to the CodeCommit repository and the CloudFormation template has been uploaded to S3, the pipeline will run automatically. You can visit the CodePipeline console to confirm that the pipeline is running with an “in progress” status.

You may desire to alter various aspects of the Automated Testing Pipeline to better fit your use case. Listed below are some actions you can take to modify the solution to fit your needs:

  • Go to CloudWatch Events and update rules for automatically started the pipeline.
  • Scale out testing by providing custom testing scripts or altering the existing ones.
  • Test a different CloudFormation template by uploading it to the source S3 bucket created and configuring the pipeline accordingly. Custom test scripts will likely be required for this use case.

Challenges Addressed by the Automated Testing Pipeline

The Automated Testing Pipeline directly addresses the challenges we faced with maintaining our OHDSI and REDCap solutions. Additionally, the pipeline can be used whenever there is a need to test CloudFormation templates that are being used on a regular basis or are distributed to other users. Listed below is the set of specific challenges we faced maintaining CloudFormation solutions and how the pipeline addresses them.

Table describing challenges faced with their direct solution offered by Testing Pipeline

The desire to better serve our customers guided our decision to create the Automated Testing Pipeline. For example, we know that source code used to build the OHDSI-on-AWS environment changes on occasion. Some of these changes have caused the environment to stop functioning correctly. This left us with cases where our customers had to either open an issue on GitHub or reach out to AWS directly for support. Our customers depend on OHDSI-on-AWS functioning properly, so fixing issues is of high priority to our team. The ability to run tests regularly allows us to take action without depending on notice from our customers. Now, we can be the first ones to know if something goes wrong and get to fixing it sooner.

“This automation will help us better monitor the CloudFormation-based projects our customers depend on to ensure they’re always in working order.” — James Wiggins, EDU HCLS SA Manager

Cleaning Up

If you decide to quit using the Automated Testing Pipeline, follow the steps below to get rid of the resources associated with it in your AWS account.

  • Delete CloudFormation solution root Stack
  • Delete pipeline CloudFormation Stack
  • Delete ATLAS S3 Bucket if OHDSI-on-AWS was chosen

Deleting the pipeline CloudFormation stack handles removing the resources associated with its architecture. Depending on the CloudFormation template chosen for testing, additional resources associated with it may need to be removed. Visit our GitHub page for more information on removing resources.

Conclusion

The ability to continuously test preexisting solutions on AWS has great benefits for our team and our customers. The automated nature of this testing frees up time for us and our customers, and the dashboard makes issues more visible and easier to resolve. We believe that sharing this story can benefit anyone facing challenges maintaining CloudFormation solutions in AWS. Check out the Getting Started with the Automated Testing Pipeline section of this post to deploy the solution.

Additional Resources

More information about the key services and open-source software used in our pipeline can be found at the following documentation pages:

About the Authors

Raleigh Hansen is a former Solutions Architect Intern on the Academic Medical Centers team at AWS. She is passionate about solving problems and improving upon existing systems. She also adores spending time with her two cats.

Dan Le is a former Solutions Architect Intern on the Academic Medical Centers team at AWS. He is passionate about technology and enjoys doing art and music.

Amazon Supplier Fraud

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/08/amazon_supplier.html

Interesting story of an Amazon supplier fraud:

According to the indictment, the brothers swapped ASINs for items Amazon ordered to send large quantities of different goods instead. In one instance, Amazon ordered 12 canisters of disinfectant spray costing $94.03. The defendants allegedly shipped 7,000 toothbrushes costing $94.03 each, using the code for the disinfectant spray, and later billed Amazon for over $650,000.

In another instance, Amazon ordered a single bottle of designer perfume for $289.78. In response, according to the indictment, the defendants sent 927 plastic beard trimmers costing $289.79 each, using the ASIN for the perfume. Prosecutors say the brothers frequently shipped and charged Amazon for more than 10,000 units of an item when it had requested fewer than 100. Once Amazon detected the fraud and shut down their accounts, the brothers allegedly tried to open new ones using fake names, different email addresses, and VPNs to obscure their identity.

It all worked because Amazon is so huge that everything is automated.

Building an automated knowledge repo with Amazon EventBridge and Zendesk

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/building-an-automated-knowledge-repo-with-amazon-eventbridge-and-zendesk/

Zendesk Guide is a smart knowledge base that helps customers harness the power of institutional knowledge. It enables users to build a customizable help center and customer portal.

This post shows how to implement a bidirectional event orchestration pattern between AWS services and an Amazon EventBridge third-party integration partner. This example uses support ticket events to build a customer self-service knowledge repository. It uses the EventBridge partner integration with Zendesk to accelerate the growth of a customer help center.

The examples in this post are part of a serverless application called FreshTracks. This is built in Vue.js and demonstrates SaaS integrations with Amazon EventBridge. To test this example, ask a question on the Fresh Tracks application.

The backend components for this EventBridge integration with Zendesk have been extracted into a separate example application in this GitHub repo.

How the application works

Routing Zendesk events with Amazon EventBridge.

Routing Zendesk events with Amazon EventBridge.

  1. A user searches the knowledge repository via a widget embedded in the web application.
  2. If there is no answer, the user submits the question via the web widget.
  3. Zendesk receives the question as a support ticket.
  4. Zendesk emits events when the support ticket is resolved.
  5. These events are streamed into a custom SaaS event bus in EventBridge.
  6. Event rules match events and send them downstream to an AWS Step Functions Express Workflow.
  7. The Express Workflow orchestrates Lambda functions to retrieve additional information about the event with the Zendesk API.
  8. A Lambda function uses the Zendesk API to publish a new help article from the support ticket data.
  9. The new article is searchable on the website widget for other users to read.

Before deploying this application, you must generate an API key from within Zendesk.

Creating the Zendesk API resource

Use an API to execute events on your Zendesk account from AWS. Follow these steps to generate a Zendesk API token. This is used by the application to authenticate Zendesk API calls.

To generate an API token

  1. Log in to the Zendesk dashboard.
  2. Click the Admin icon in the sidebar, then select Channels > API.
  3. Click the Settings tab, and make sure that Token Access is enabled.
  4. Click the + button to the right of Active API Tokens.

    Creating a Zendesk API token.

    Creating a Zendesk API token.

  5. Copy the token, and store it securely. Once you close this window, the full token is not displayed again.
  6. Click Save to return to the API page, which shows a truncated version of the token.

    Zendesk API token.

    Zendesk API token.

Configuring Zendesk with Amazon EventBridge

Step 1. Configuring your Zendesk event source.

  1. Go to your Zendesk Admin Center and select Admin Center > Integrations.
  2. Choose Connect in events Connector for Amazon EventBridge to open the page to configure your Zendesk event source.

    Zendesk integrations

    Zendesk integrations

  3. Enter your AWS account ID in the Amazon Web Services account ID field, and select the Region to receive events.
  4. Choose Save.

    Zendesk Amazon EventBridge configuration.

Step 2. Associate the Zendesk event source with a new event bus: 

  1. Log into the AWS Management Console and navigate to services > Amazon EventBridge > Partner event sources
    New event source

    New event source

     

  2. Select the radio button next to the new event source and choose Associate with event bus.
    Associating event source with event bus.

    Associating event source with event bus.

     

  3. Choose Associate.

Deploying the backend application

After associating the Event source with a new partner event bus, you can deploy backend services to receive events.

To set up the example application, visit the GitHub repo and follow the instructions in the README.md file.

When deploying the application stack, make sure to provide the custom event bus name, and Zendesk API credentials with --parameter-overrides.

sam deploy --parameter-overrides ZendeskEventBusName=aws.partner/zendesk.com/123456789/default ZenDeskDomain=MydendeskDomain ZenDeskPassword=myAPITOken ZenDeskUsername=myZendeskAgentUsername

You can find the name of the new Zendesk custom event bus in the custom event bus section of the EventBridge console.

Routing events with rules

When a support ticket is updated in Zendesk, a number of individual events are streamed to EventBridge, these include an event for each of:

  • Agent Assignment Changed
  • Comment Created
  • Status Changed
  • Brand Changed
  • Subject Changed

An EventBridge rule is used filter for events. The AWS Serverless Application Model (SAM) template defines the rule with the `AWS::Events::Rule` resource type. This routes the event downstream to an AWS Step Functions Express Workflow.  The EventPattern is shown below:

  ZendeskNewWebQueryClosed: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "New Web Query"
      EventBusName: 
         Ref: ZendeskEventBusName
      EventPattern: 
        account:
        - !Sub '${AWS::AccountId}'
        detail-type: 
        - "Support Ticket: Comment Created"
        detail:
          ticket_event:
            ticket:
              status: 
              - solved
              tags:
              - web_widget
              tags: 
              - guide
      Targets: 
        - RoleArn: !GetAtt [ MyStatesExecutionRole, Arn ]
          Arn: !Ref FreshTracksZenDeskQueryMachine
          Id: NewQuery

The tickets must have two specific tags (web_widget and guide) for this pattern to match. These are defined as separate fields to create an AND  matching rule, instead of declaring within the same array field to create an OR rule. A new comment on a support ticket triggers the event.

The Step Functions Express Workflow

The application routes events to a Step Functions Express Workflow that is defined in the application’s SAM template:

FreshTracksZenDeskQueryMachine:
    Type: "AWS::StepFunctions::StateMachine"
    Properties:
      StateMachineType: EXPRESS
      DefinitionString: !Sub |
               {
                    "Comment": "Create a new article from a zendeskTicket",
                    "StartAt": "GetFullZendeskTicket",
                    "States": {
                      "GetFullZendeskTicket": {
                      "Comment": "Get Full Ticket Details",
                      "Type": "Task",
                      "ResultPath": "$.FullTicket",
                      "Resource": "${GetFullZendeskTicket.Arn}",
                      "Next": "GetFullZendeskUser"
                      },
                      "GetFullZendeskUser": {
                      "Comment": "Get Full User Details",
                      "Type": "Task",
                      "ResultPath": "$.FullUser",
                      "Resource": "${GetFullZendeskUser.Arn}",
                      "Next": "PublishArticle"
                      },
                      "PublishArticle": {
                      "Comment": "Publish as an article",
                      "Type": "Task",
                      "Resource": "${CreateZendeskArticle.Arn}",
                      "End": true
                      }
                    }
                }
      RoleArn: !GetAtt [ MyStatesExecutionRole, Arn ]

This application is suited for a Step Functions Express Workflow because it is orchestrating short duration, high-volume, event-based workloads.  Each workflow task is idempotent and stateless. The Express Workflow carries the workload’s state by passing the output of one task to the input of the next. The Amazon States Language ResultPath definition is used to control where each tasks output is appended to workflow’s state before it is passed to the next task.

 

AWS StepFunctions Express workflow

AWS StepFunctions Express workflow

Lambda functions

Each task in this Express Workflow invokes a Lambda function defined within the example application’s SAM template. The Lambda functions use the Node.js Axios package to make a request to Zendesk’s API.  The Zendesk API credentials are stored in the Lambda function’s environment variables and accessible via ‘process.env’.

The first two Lambda functions in the workflow make a GET request to Zendesk. This retrieves additional data about the support ticket, the author, and the agent’s response.

The final Lambda function makes a POST request to Zendesk API. This creates and publishes a new article using this data.  The permission_group and section defined in this function must be set to your Zendesk account’s default permission group ID and FAQ section ID.

AWS Lambda function code

AWS Lambda function code

Integrating with your front-end application

Follow the instructions in the Fresh Tracks repo on GitHub to deploy the front-end application. This application includes Zendesk’s web widget script in the index.html page. The widget has been customized using Zendesk’s javascript API. This is implemented in the navigation component to insert custom forms into the widget and prefill the email address field for authenticated users. The backend application starts receiving Zendesk emitted events immediately.

The video below demonstrates the implementation from end to end.

Conclusion

This post explains how to set up EventBridge’s third-party integration with Zendesk to capture events. The example backend application demonstrates how to filter these events, and send downstream to a Step Functions Express Workflow. The Express Workflow orchestrates a series of stateless Lambda functions to gather additional data about the event. Zendesk’s API is then used to publish a new help guide article from this data.

This pattern provides a framework for bidirectional event orchestration between AWS services, custom web applications and third party integration partners. This can be replicated and applied to any number of third party integration partners.

This is implemented with minimal code to provide near real-time streaming of events and without adding latency to your application.

The possibilities are vast. I am excited to see how builders use this bidirectional serverless pattern to add even more value to their third party services.

Start here to learn about other SaaS integrations with Amazon EventBridge.

Automatic Instacart Bots

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/04/automatic_insta.html

Instacart is taking legal action against bots that automatically place orders:

Before it closed, to use Cartdash users first selected what items they want from Instacart as normal. Once that was done, they had to provide Cartdash with their Instacart email address, password, mobile number, tip amount, and whether they prefer the first available delivery slot or are more flexible. The tool then checked that their login credentials were correct, logged in, and refreshed the checkout page over and over again until a new delivery window appeared. It then placed the order, Koch explained.

I think I am writing a new book about hacking in general, and want to discuss this. First, does this count as a hack? I feel like it is, since it’s a way to subvert the Instacart ordering system.

When asked if this tool may give people an unfair advantage over those who don’t use the tool, Koch said, “at this point, it’s a matter of awareness, not technical ability, since people who can use Instacart can use Cartdash.” When pushed on how, realistically, not every user of Instacart is going to know about Cartdash, even after it may receive more attention, and the people using Cartdash will still have an advantage over people who aren’t using automated tools, Koch again said, “it’s a matter of awareness, not technical ability.”

Second, should Instacart take action against this? On the one hand, it isn’t “fair” in that Cartdash users get an advantage in finding a delivery slot. But it’s not really any different than programs that “snipe” on eBay and other bidding platforms.

Third, does Instacart even stand a chance in the long run. As various AI technologies give us more agents and bots, this is going to increasingly become the new normal. I think we need to figure out a fair allocation mechanism that doesn’t rely on the precise timing of submissions.

Introducing Dispatch

Post Syndicated from Netflix Technology Blog original https://netflixtechblog.com/introducing-dispatch-da4b8a2a8072

By Kevin Glisson, Marc Vilanova, Forest Monsen

Netflix is pleased to announce the open-source release of our crisis management orchestration framework: Dispatch!

Okay, but what is Dispatch? Put simply, Dispatch is:

All of the ad-hoc things you’re doing to manage incidents today, done for you, and a bunch of other things you should’ve been doing, but have not had the time!

Dispatch helps us effectively manage security incidents by deeply integrating with existing tools used throughout an organization (Slack, GSuite, Jira, etc.,) Dispatch leverages the existing familiarity of these tools to provide orchestration instead of introducing another tool.

This means you can let Dispatch focus on creating resources, assembling participants, sending out notifications, tracking tasks, and assisting with post-incident reviews; allowing you to focus on actually fixing the issue! Sounds interesting? Continue reading!

The Challenge of Crisis Management

Managing incidents is a stressful job. You are dealing with many questions all at once: What’s the scope? Who can help me? Who do I need to engage? How do I manage all of this?

In general, every incident is unique and extraordinary, if the same incidents are happening over and over you’re firefighting.

There are four main components to Crisis Management that we are attempting to address:

  1. Resource Management — The management of not only data collected about the incident itself but all of the metadata about the response.
  2. Individual Engagement — Understanding the best way to engage individuals and teams, and doing so based on incident context.
  3. Life Cycle Management — Providing the Incident Commander (IC) tools to easily manage the life cycle of the incident.
  4. Incident Learning — Building on past incidents to speed up the resolution of future incidents.

We will use the following terminology throughout the rest of the discussion:

  • Incident Commanders are individuals that are responsible for driving the incident to resolution.
  • Incident Participants are individuals that are Subject Matter Experts (SMEs) that have been engaged to help resolve the incident.
  • Resources are documents, screenshots, logs or any other piece of digital information that is used during an incident.

The Checklist

For an average incident, there are quite a few steps to managing an incident and much of it is typically handled on an ad-hoc basis by a human. Let’s enumerate them:

  1. Declare an Incident — There are many different entry points to a potential incident: automated alerts, an internal notification, or an external notification.
  2. Determine Incident Commander — Determining the sole individual responsible for driving a particular incident to resolution based on the incident source, type, and priority.
  3. Create Communication Channels — Communication during incidents is key. Establishing dedicated and standardized channels for communication prevents the creation of communication silos.
  4. Create Incident Document — The central document responsible for containing up-to-date incident information, including a description of the incident, links to resources, rough notes from in-person meetings, open questions, action items, and timeline information.
  5. Engage Individual Resources — An incident commander will not be able to resolve an incident by themselves, they must identify and engage additional resources within the organization to help them.
  6. Orient Individual Resources — Engaging additional resources is not enough, the Incident Commander needs to orient these resources to the situation at hand.
  7. Notify Key Stakeholders — For any given incident, key stakeholders not directly involved in resolving the incident need to be made aware of the incident.
  8. Drive Incident to Resolution — The actual resolution of the incident, creating tasks, asking questions, and tracking answers. Making note of key learnings to be addressed after resolution.
  9. Perform Post Incident Review (PIR) — Review how the incident process was performed, tracking actions to be performed after the incident, and driving learning through structuring informal knowledge.

Each of these steps has the incident commander and incident participants moving through various systems and interfaces. Each context switch adds to the cognitive load on the responder and distracting them from resolving the incident itself.

Toward Better Crisis Management

Crisis management is not a new challenge, tools like Jira, PagerDuty, VictorOps are all helping organizations manage and respond to incidents. When setting out to automate our incident management process we had two main goals:

  1. Re-use existing tools users were already familiar with; reducing the learning curve to contributing to incidents.
  2. Catalog, store and analyze our incident data to speed up resolution.

Meet Dispatch!

Dispatch

Dispatch is a crisis management orchestration framework that manages incident metadata and resources. It uses tools already in use throughout an organization, providing incident participants a comprehensive crisis management toolset, allowing them to focus on resolving the incident.

Unlike many of our tools Dispatch is not tightly bound to AWS, Dispatch does not use any AWS APIs at all! While Dispatch doesn’t use AWS APIs, it leverages multiple APIs that are deeply embedded into the organization (e.g. Slack, GSuite, PagerDuty, etc.,). In addition to all of the built-in integrations, Dispatch provides multiple integration points that allow it to fit into just about any existing environment.

Although developed as a tool to help Netflix manage security incidents, nothing about Dispatch is specific to a security use-case. At its core, Dispatch aims to manage the entire lifecycle of an incident, focusing on engaging individuals and providing them the context they need to drive the incident to resolution.

Workflow

Let’s take a look at what an incident commander’s new workflow would look like using Dispatch:

Some key benefits of the new workflow are:

  • The incident commander no longer needs to manage access to resources or multiple data streams.
  • Communications are standardized (both in style and interval) across incidents.
  • Incident participants are automatically engaged based on the type, priority, and description of the incident.
  • Incident tasks are tracked and owners are reminded if they’re not completed on time.
  • All incident data is centrally tracked.
  • A common API is provided for internal users and tools.

We want to make reporting incidents as frictionless as possible, giving users a straightforward path to engage the resources they need in a time of crisis.

Jumping between different tools, ensuring data is correct and in sync is a low-value exercise for an incident commander. Instead, we centralized on two common tools to manage the entire lifecycle. Slack for managing incident metadata (e.g. status, title, description, priority, etc,.) and Google Doc and Google Drive for managing data itself.

When teams need to look across many incidents, Dispatch provides an Admin UI. This interface is also where incident knowledge is managed. From common terms and their definitions, individuals, teams, and services. The Admin UI is how we manage incident knowledge for use in future incidents.

Architecture

Dispatch makes use of the following components:

  • Python 3.8 with FastAPI (including helper packages)
  • VueJS UI
  • Postgres

We’re shipping Dispatch with built-in plugins that allow you to create and manage resources with GSuite (Docs, Drive, Sheets, Calendar, Groups), Jira, PagerDuty, and Slack. But the plugin architecture allows for integrations with whatever tools your organization is already using.

Getting Started

Dispatch is available now on the Netflix Open Source site. You can try out Dispatch using Docker. Detailed instructions on setup and configuration are available in our docs.

Interested in Contributing?

Feel free to reach out or submit pull requests if you have any suggestions. We’re looking forward to seeing what new plugins you create to make Dispatch work for you! We hope you’ll find Dispatch as useful as we do!

Oh, and we’re hiring — if you’d like to help us solve these sorts of problems, take a look at https://jobs.netflix.com/teams/security, and reach out!


Introducing Dispatch was originally published in Netflix TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Automated Response and Remediation with AWS Security Hub

Post Syndicated from Jonathan Rau original https://aws.amazon.com/blogs/security/automated-response-and-remediation-with-aws-security-hub/

AWS Security Hub is a service that gives you aggregated visibility into your security and compliance status across multiple AWS accounts. In addition to consuming findings from Amazon services and integrated partners, Security Hub gives you the option to create custom actions, which allow a customer to manually invoke a specific response or remediation action on a specific finding. You can send custom actions to Amazon CloudWatch Events as a specific event pattern, allowing you to create a CloudWatch Events rule that listens for these actions and sends them to a target service, such as a Lambda function or Amazon SQS queue.

By creating custom actions mapped to specific finding type and by developing a corresponding Lambda function for that custom action, you can achieve targeted, automated remediation for these findings. This allows a customer to specifically decide if he or she wants to invoke a remediation action on a specific finding. A customer can also use these Lambda functions as the target of fully automated remediation actions that do not require any human review.

In this blog post, I’ll show you how to build custom actions, CloudWatch Event rules, and Lambda functions for a dozen targeted actions that can help you remediate CIS AWS Foundations Benchmark-related compliance findings. I’ll also cover use cases for sending findings to an issue management system and for automating security patching. To promote rapid deployment and adoption of this solution, you’ll deploy a majority of the necessary components via AWS CloudFormation.

Note: The full repository for current and future response and remediation templates is hosted on GitHub and includes additional technical guidance for expanding the solution provided in this post.

Solution architecture

Figure 1 - Solution Architecture Overview

Figure 1 – Solution Architecture Overview

Figure 1 shows how a finding travels from an integrated service to a custom action:

  1. Integrated services send their findings to Security Hub.
  2. From the Security Hub console, you’ll choose a custom action for a finding. Each custom action is then emitted as a CloudWatch Event.
  3. The CloudWatch Event rule triggers a Lambda function. This function is mapped to a custom action based on the custom action’s ARN.
  4. Dependent on the particular rule, the Lambda function that is invoked will perform a remediation action on your behalf.

For the purpose of this blog post, I’ll refer to the end-to-end combination of a custom action, a CloudWatch Event rule, a Lambda function, plus any supporting services needed to perform a specific action as a “playbook.” To demonstrate how a remediation solution works end-to-end, I’ll show you how to build your first playbook manually. You’ll deploy the remainder of the playbooks via CloudFormation.

I’ll also show you how to modify four of the playbooks (three of which are appended by an asterisk), as they use AWS Lambda environment variables to perform their actions, we’ll walk through populating these later.

Based on feedback from Security Hub customers, the following controls from the CIS AWS Foundations Benchmark will be supported by this blog post:

  • 1.3 – “Ensure credentials unused for 90 days or greater are disabled”
  • 1.4 – “Ensure access keys are rotated every 90 days or less”
  • 1.5 – “Ensure IAM password policy requires at least one uppercase letter”
  • 1.6 – “Ensure IAM password policy requires at least one lowercase letter”
  • 1.7 – “Ensure IAM password policy requires at least one symbol”
  • 1.8 – “Ensure IAM password policy requires at least one number”
  • 1.9 – “Ensure IAM password policy requires a minimum length of 14 or greater”
  • 1.10 – “Ensure IAM password policy prevents password reuse”
  • 1.11 – “Ensure IAM password policy expires passwords within 90 days or less”
  • 2.2 – “Ensure CloudTrail log file validation is enabled”
  • 2.3 – “Ensure the S3 bucket CloudTrail logs to is not publicly accessible”
  • 2.4 – “Ensure CloudTrail trails are integrated with Amazon CloudWatch Logs”*
  • 2.6 – “Ensure S3 bucket access logging is enabled on the CloudTrail S3 bucket”*
  • 2.7 – “Ensure CloudTrail logs are encrypted at rest using AWS KMS CMKs”
  • 2.8 – “Ensure rotation for customer created CMKs is enabled”
  • 2.9 – “Ensure VPC flow logging is enabled in all VPCs”*
  • 4.1 – “Ensure no security groups allow ingress from 0.0.0.0/0 to port 22”
  • 4.2 – “Ensure no security groups allow ingress from 0.0.0.0/0 to port 3389”
  • 4.3 – “Ensure the default security group of every VPC restricts all traffic”

You’ll also deploy and modify an additional playbook, “Send findings to JIRA.” You can find the high-level description of each playbook in each custom action creation script, as well as in the CloudFormation resource descriptions.

Note: If you want to send Security Hub findings to a security information and event management tool (SIEM) such as Amazon ElasticSearch Service or a third-party solution, you must change the CloudWatch Events event pattern to match all findings and use different targets such as Amazon Kinesis Data Streams to Kinesis Data Firehose to load your SIEM. This process is out of scope for this post.

Prerequisites

Ensure you have Security Hub and AWS Config turned on in your Region. Also, note that the solution in this blog post is meant to support a single account and will not support cross-account remediation as deployed. Refer to the Knowledge Center article How can I configure a Lambda function to assume a role from another AWS account? for basic information on cross-account roles for Lambda.

For the playbook “Apply Security Patches,” your EC2 instances must be managed by Systems Manager. For more information on managed instances, see AWS Systems Manager Managed Instances in the AWS Systems Manager User Guide.

Manually create a remediation playbook

To demonstrate the end-to-end process of building a playbook, I’ll first show you how to create one manually, before you deploy the remaining playbooks via CloudFormation. You’ll build a playbook to remediate Control 2.7 of the AWS Foundations Benchmark, “ensure CloudTrail logs are encrypted at rest using AWS Key Management Service (KMS) Customer Managed Keys (CMK).” Configuring CloudTrail to use KMS encryption (called SSE-KMS) provides additional confidentiality controls on you log data. To access your CloudTrail logs, users must not only have S3 read permissions for the corresponding log bucket, they now must be granted decrypt permissions by the KMS key policy.

Important note: The way this remediation, and all other remediation code is written, you can only target one finding at a time via the Action Menu.

You’ll achieve automated remediation by using a Lambda function to create a new KMS CMK and alias which identifies the non-compliant CloudTrail trail. You’ll then attach a KMS key policy that only allows the AWS account that owns the trail to decrypt the logs by using the IAM condition for StringEquals: kms:CallerAccount. You only need to run this playbook once per non-compliant CloudTrail trail.

To get started, follow these steps:

  1. Navigate to the Security Hub console, select Settings from the navigation pane, then select the Custom Actions tab.
  2. Choose Create custom action and enter values for Action name, Description, and Custom action ID, then choose Create custom action again, as shown in Figure 2.

    For the purpose of this blog post, I’ll refer to my action name as “CIS 2.7 RR” where the “RR” stands for “Response and remediation.”
     

    Figure 2 - Create custom action

    Figure 2 – Create custom action

  3. Copy the Amazon resource number (ARN) down, as you’ll need it in step 11.
  4. Navigate to the Lambda console and select Create function.
  5. Enter a function name, choose Python 3.7 runtime, and under Permissions select Create a new role with basic Lambda permissions. Then choose Create function.
  6. Scroll down to Execution role and select the hyperlink under Existing role. This will open a new tab in the IAM console.
  7. From the IAM console, select Add inline policy, then select the JSON tab, paste in the below IAM policy JSON, and select Review Policy.
    
    {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "kmssid",
            "Action": [
              "kms:CreateAlias",
              "kms:CreateKey",
              "kms:PutKeyPolicy"
            ],
            "Effect": "Allow",
            "Resource": "*"
          },
          {
            "Sid": "cloudtrailsid",
            "Action": [
              "cloudtrail:UpdateTrail"
            ],
            "Effect": "Allow",
            "Resource": "*"
          }
        ]
      }
    

  8. Give the in-line policy a name and select Create policy.
  9. Back in the Lambda console, increase Timeout to 1 minute and Memory to 256MB. Scroll up to Function code, paste in the below code, and select Save.
    
     # Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
     # SPDX-License-Identifier: MIT-0
     #
     # Permission is hereby granted, free of charge, to any person obtaining a copy of this
     # software and associated documentation files (the "Software"), to deal in the Software
     # without restriction, including without limitation the rights to use, copy, modify,
     # merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
     # permit persons to whom the Software is furnished to do so.
     #
     # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
     # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
     # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
     # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
     # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
     # SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
    
    import boto3
    import json
    import time
    
    def lambda_handler(event, context):
        # parse non-compliant trail from Security Hub finding
        noncompliantTrail = str(event['detail']['findings'][0]['Resources'][0]['Details']['Other']['name'])
        # parse account ID from Security Hub finding, will be needed for Key Policy
        accountID = str(event['detail']['findings'][0]['AwsAccountId'])
        
        # import boto3 clients for KMS and CloudTrail
        kms = boto3.client('kms')
        cloudtrail = boto3.client('cloudtrail')
    
        # create a new KMS CMK to encrypt the non-compliant trail
        try:
            createKey = kms.create_key(
            Description='Generated by Security Hub to remediate CIS 2.7 Ensure CloudTrail logs are encrypted at rest using KMS CMKs',
            KeyUsage='ENCRYPT_DECRYPT',
            Origin='AWS_KMS'
            )
            # save key id as a variable
            cloudtrailKey = str(createKey['KeyMetadata']['KeyId'])
            print("Created Key" + " " + cloudtrailKey)
        except Exception as e:
            print(e)
            print("KMS CMK creation failed")
            raise
            
        # wait 2 seconds for key creation to propogate
        time.sleep(2)
    
        # attach an alias for easy identification to the key - must always begin with "alias/"
        try:
            createAlias = kms.create_alias(
            AliasName='alias/' + noncompliantTrail + '-CMK',
            TargetKeyId=cloudtrailKey
            )
            print(createAlias)
        except Exception as e:
            print(e)
            print("Failed to create KMS Alias")
            raise
        
        # wait 1 second
        time.sleep(1)
    
        # policy name for PutKeyPolicy is always "default"
        policyName = 'default'
        # set Key Policy as JSON object
        keyPolicy={
            "Version": "2012-10-17",
            "Id": "Key policy created by CloudTrail",
            "Statement": [
                {
                    "Sid": "Enable IAM User Permissions",
                    "Effect": "Allow",
                    "Principal": {
                        "AWS": [ "arn:aws:iam::" + accountID + ":root" ]
                    },
                    "Action": "kms:*",
                    "Resource": "*"
                },
                {
                    "Sid": "Allow CloudTrail to encrypt logs",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "cloudtrail.amazonaws.com"
                    },
                    "Action": "kms:GenerateDataKey*",
                    "Resource": "*",
                    "Condition": {
                        "StringLike": {
                            "kms:EncryptionContext:aws:cloudtrail:arn": "arn:aws:cloudtrail:*:" + accountID + ":trail/" + noncompliantTrail
                        }
                    }
                },
                {
                    "Sid": "Allow CloudTrail to describe key",
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "cloudtrail.amazonaws.com"
                    },
                    "Action": "kms:DescribeKey",
                    "Resource": "*"
                },
                {
                    "Sid": "Allow principals in the account to decrypt log files",
                    "Effect": "Allow",
                    "Principal": {
                        "AWS": "*"
                    },
                    "Action": [
                        "kms:Decrypt",
                        "kms:ReEncryptFrom"
                    ],
                    "Resource": "*",
                    "Condition": {
                        "StringEquals": {
                            "kms:CallerAccount": accountID
                        },
                        "StringLike": {
                            "kms:EncryptionContext:aws:cloudtrail:arn": "arn:aws:cloudtrail:*:" + accountID + ":trail/" + noncompliantTrail
                        }
                    }
                },
                {
                    "Sid": "Allow alias creation during setup",
                    "Effect": "Allow",
                    "Principal": {
                        "AWS": "*"
                    },
                    "Action": "kms:CreateAlias",
                    "Resource": "*",
                    "Condition": {
                        "StringEquals": {
                            "kms:CallerAccount": accountID
                        }
                    }
                }
            ]
        }
        # attaches above key policy to key
        try:
            attachKeyPolicy = kms.put_key_policy(
                KeyId=cloudtrailKey,
                Policy=json.dumps(keyPolicy),
                PolicyName=policyName
            )
            print(attachKeyPolicy)
        except Exception as e:
            print(e)
            print("Failed to attach key policy to Key:" + " " + cloudtrailKey)
    
        # update CloudTrail with the new CMK
        try:
            encryptTrail = cloudtrail.update_trail(
            Name=noncompliantTrail,
            KmsKeyId=cloudtrailKey
            )
            print(encryptTrail)
            print("CloudTrail trail" + " " + noncompliantTrail + " " + "has been successfully encrypted!")
        except Exception as e:
            print(e)
            print("Failed to attach KMS CMK to CloudTrail")
    

  10. Navigate to the CloudWatch console, and choose Events, then Create rule.
  11. On the left, next to Event Pattern Preview, select Edit.
  12. Under Event Source, find Build custom event pattern and paste the JSON below into the text box. Replace the ARN under “resources” with the ARN of the custom action you created in step 3.
    
    {
      "source": [
        "aws.securityhub"
      ],
      "detail-type": [
        "Security Hub Findings - Custom Action"
      ],
      "resources": [
        "arn:aws:securityhub:us-west-2:123456789012:action/custom/test-action1"
      ]
    }
    

  13. On the same screen, under Targets on the right, select add Target, choose the Lambda function you created in step 4, then select Configure details.
  14. On the next screen, enter values for Name and Description, then select Create rule.
  15. Back in the Security Hub console, select Compliance standards from the menu on the left, then select View results to see the CIS AWS Benchmarks.
  16. Find rule 2.7 Ensure CloudTrail logs are encrypted at rest using KMS CMKs and select the hyperlink to see all findings related to that control.
  17. Select a finding that is FAILED, with a Resource type that reads AwsCloudTrailTrail.
  18. From the top right, select the Actions dropdown menu, then select CIS 2.7 RR (or whatever you named this action in step 2), as shown in Figure 3.

    Selecting this action will execute the rule you created in step 13, which will invoke the Lambda function you created in step 9.
     

    Figure 3 - Security Hub Custom Actions

    Figure 3 – Security Hub Custom Actions

  19. Navigate to the CloudTrail console to ensure your CloudTrail trail is updated with SSE-KMS.

This creation flow via the console is universal for all playbook development with Security Hub. You’ll use the Actions menu in the same way to trigger the playbooks you deploy via CloudFormation in the next section of this walkthrough.

Note: To monitor the actions that are taken by the playbooks’ Lambda functions, refer to the functions’ logs. Both success and error messages will appear to help you diagnose as needed.

Deploy remediation playbooks via CloudFormation

Download the CloudFormation template from GitHub and create a CloudFormation stack. For more information about how to create a CloudFormation stack, see Getting Started with AWS CloudFormation in the AWS CloudFormation User Guide.

After your stack has finished deploying, navigate to the Resources tab and select the hyperlink for each resource to be taken to their respective consoles. The logical ID for each resource will be prepended by the action the resource corresponds to. For example, in figure 4, CIS13RRCWEPermissions denotes CloudWatch Event permissions for AWS CIS Benchmark Control 1.3. This logical ID structure is used throughout the template.
 

Figure 4 - CloudFormation Resources

Figure 4 – CloudFormation Resources

Next, I’ll show you how to modify the Lambda functions associated with the following playbooks: “Send findings to JIRA,” CIS 2.4, CIS 2.6, and CIS 2.9.

Playbook modification: send findings to JIRA

Note: If you don’t currently have a JIRA Software Data Center deployment set up in your account, you can deploy one with a free evaluation period by following this Quick Start. If you are not interested in using JIRA or you use a different issue management tool, you can skip this section.

This playbook works by using the associated Lambda function to execute a Systems Manager automation document called AWS-CreateJiraIssue.

This Systems Manager document will in turn deploy a CloudFormation stack and create a custom Lambda function to map environmental variables (listed below) into JIRA via your playbook’s Lambda function.

Once complete, the CloudFormation stack will self-delete and the automation will be complete. To see this flow, refer to figure 5, below:
 

Figure 5 - Send to JIRA Architecture Diagram

Figure 5 – Send to JIRA Architecture Diagram

  1. A finding is selected and the custom action “Create JIRA Issue” is invoked, which triggers a CloudWatch Event.
  2. The CloudWatch Event rule will trigger a Lambda function.
  3. Lambda will invoke the AWS-CreateJiraIssue document via Systems Manager Automation.
  4. Systems Manager Automation will pass your Lambda environmental variables and Security Hub finding as document parameters.
  5. The document will create a CloudFormation Stack that contains another custom Lambda to invoke JIRA APIs for creating issues.
  6. The document’s Lambda function uses your parameters to create an issue in JIRA.
  7. JIRA will send back a response to note failure or success, and the CloudFormation stack will self-delete.

To get started, navigate to the Lambda console and find the function named SendToJIRA, then scroll down to Environmental variables: You should see 4, with the text “placeholder” next to each. The following steps walk you through how to populate them:

  1. Fill out the JIRA_API_PARAMETER field:
    1. Refer to Atlassian’s instructions to generate a JIRA API token
    2. Follow the Systems Manager Parameter Store walkthrough to create a parameter.
      1. In Step 6 of the linked instructions, choose Secure String and choose the default KMS key.
      2. In Step 7 of the linked instructions, paste in your newly generated JIRA API token.
    3. Copy the name of the parameter and paste it as the value of the JIRA_API_PARAMETER field.
  2. Fill out the JIRA_PROJECT field by pasting in your JIRA Software project’s Project Key.
  3. Fill out the JIRA_SECURITY_ISSUE_USER field:
    1. This value maps to a user in JIRA Software; refer to Atlassian’s instructions for how to create a user. Use the API key you generated in step 1.A as this user’s password.
    2. Paste the user name into the JIRA_SECURITY_ISSUE_USER field.
  4. Fill out the JIRA_URL field:
    1. From JIRA Software navigate to the System sub-menu of JIRA Administration and look for the Base URL field, underneath Settings – General Settings (see figure 6).
    2. Copy the Base URL value and paste it into the JIRA_URL field.
       
      Figure 6 - Base URL in JIRA

      Figure 6 – Base URL in JIRA

  5. When finished, select Save at the top of the Lambda console, then navigate to the Security Hub console and select Findings from the left-hand menu.
  6. Select any finding, and then, from the Actions dropdown in the top right, select Create JIRA Issue.
  7. Navigate to your JIRA instance and wait a few minutes for the new issue to appear, as shown in Figure 7.
     
    Figure 7 - Security Hub Finding in JIRA

    Figure 7 – Security Hub Finding in JIRA

    Note: If your issue hasn’t populated in JIRA after a few minutes, refer to the Systems Manager Automation console or CloudFormation console to find the stack that was created by Systems Manager and refer to the failure messages to troubleshoot further.

CIS 2.4 response & remediation playbook modification

CIS Control 2.4 is “Ensure CloudTrail trails are integrated with Amazon CloudWatch Logs.” The intent of this recommendation is to ensure that API activity recorded by CloudTrail is available to query in near real-time for the purpose of troubleshooting or security incident investigation with CloudWatch.

To send your CloudTrail logs to CloudWatch, the Lambda function for this playbook will create a brand new CloudWatch Logs group that has the name of the non-compliant CloudTrail trail in it for easy identification and then update your non-compliant CloudTrail trail to send its logs to the newly created log group.

To accomplish this, CloudTrail needs an IAM role and permissions to be allowed to publish logs to CloudWatch. To avoid creating multiple new IAM roles and policies via Lambda, you’ll populate the ARN of this IAM role in the Lambda environmental variables for this playbook.

Note: If you don’t currently have an IAM role for CloudTrail, follow these instructions from the CloudTrail user guide to create one.

To update this playbook:

  1. Navigate to the Lambda console and find the function named CIS_2-4_RR, then scroll down to Environmental variables.
  2. Find the variable CLOUDTRAIL_CW_LOGGING_ROLE_ARN with text that reads “placeholder” as the value.
  3. Paste the ARN of the IAM role into the field for this value, then select Save.

CIS 2.6 response & remediation playbook modification

CIS Control 2.6 is “Ensure S3 bucket access logging is enabled on the CloudTrail S3 bucket.” An access log record contains details about the request, such as the request type, the resources specified in the request worked, and the time and date the request was processed. AWS recommends that you enable bucket access logging on the CloudTrail S3 bucket. By enabling S3 bucket logging on target S3 buckets, you can capture all the events that might affect objects in the bucket. Configuring logs to be placed in a separate bucket enables centralized collection of access log information, which can be useful in security and incident response workflows.

Note: Security Hub supports CIS AWS Foundations controls only on resources in the same Region and owned by the same account as the one in which Security Hub is enabled and being used. For example, if you’re using Security Hub in the us-east-2 Region, and you’re storing CloudTrail logs in a bucket in the us-west-2 Region, Security Hub cannot find the bucket in the us-west-2 Region. The control returns a warning that the resource cannot be located. Similarly, if you’re aggregating logs from multiple accounts into a single bucket, the CIS control fails for all accounts except the account that owns the bucket.

To ensure the S3 bucket that contains your CloudTrail logs has access logging enabled, the Lambda function for this playbook invokes the Systems Manager document AWS-ConfigureS3BucketLogging. This document will enable access logging for that bucket. To avoid statically populating your S3 access logging bucket in the Lambda function’s code, you’ll pass that value in via an environmental variable.

Note: If you do not currently have an S3 bucket configured to receive access logs, follow the directions from the S3 user guide to create one.

To update this playbook:

  1. Navigate to the Lambda console and find the function named CIS_2-6_RR, then scroll down to Environmental variables.
  2. Find the variable ACCESS_LOGGING_BUCKET with text that reads ‘placeholder’ as the value.
  3. Paste the name of the S3 bucket that will receive the access logs into the field for this value and select Save.

CIS 2.9 response & remediation playbook modification

CIS Control 2.9 is “Ensure VPC flow logging is enabled in all VPCs.” The Amazon Virtual Private Cloud (VPC) flow logs feature enables you to capture information about the IP traffic going to and from network interfaces in your VPC. After you’ve created a flow log, you can view and retrieve its data in CloudWatch Logs. AWS recommends that you enable flow logging for packet rejects for VPCs. Flow logs provide visibility into network traffic that traverses the VPC and can detect anomalous traffic or provide insight into your security workflow.

To enable VPC flow logging for rejected packets, the Lambda function for this playbook will create a new CloudWatch Logs group. For easy identification, the name of the group will include the non-compliant VPC name. The Lambda function will programmatically update your VPC to enable flow logs to be sent to the newly created log group.

Similar to CloudTrail logging, VPC flow log need an IAM role and permissions to be allowed to publish logs to CloudWatch. To avoid creating multiple new IAM roles and policies via Lambda, you’ll populate the ARN of this IAM role in the Lambda environmental variables for this playbook.

Note: If you don’t currently have an IAM role that VPC flow logs can use to deliver logs to CloudWatch, follow the directions from the VPC user guide to create one.

To update this playbook:

  1. Navigate to the Lambda console and find the function named CIS_2-9_RR, then scroll down to Environmental variables.
  2. Find the variable flowLogRoleARN with text that reads ‘placeholder’ as the value.
  3. Paste in the ARN of the VPC flow logs IAM role the field for this value, and select Save.

Conclusion

In this blog post, I showed you how to create, deploy, and execute response and remediation playbooks for Security Hub. By combining custom actions, CloudWatch Event rules, and Lambda functions, you can quickly remediate non-compliant resources. I also showed you how to take pre-defined actions such as sending findings to JIRA, and how to deploy an additional seven playbooks via CloudFormation.

You can create playbooks that take other actions, such as updating Network Access Control Lists to help block malicious traffic from a TOR exit node via the UnauthorizedAccess:EC2/TorIPCaller finding from GuardDuty. Using playbooks with Security Hub is one more way to build toward the security and compliance of your AWS resources.

To avoid incurring additional charges from AWS resources, delete the CloudFormation stack you deployed as well as the resources you manually created for the CIS 2.7 response and remediation playbook.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the Security Hub forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Jonathon Rau

Jonathan Rau

Jonathan is the Senior TPM for AWS Security Hub. He holds an AWS Certified Specialty-Security certification and is extremely passionate about cyber security, data privacy, and new emerging technologies, such as blockchain. He devotes personal time into research and advocacy about those same topics.

How FactSet automated exporting data from Amazon DynamoDB to Amazon S3 Parquet to build a data analytics platform

Post Syndicated from Arvind Godbole original https://aws.amazon.com/blogs/big-data/how-factset-automated-exporting-data-from-amazon-dynamodb-to-amazon-s3-parquet-to-build-a-data-analytics-platform/

This is a guest post by Arvind Godbole, Lead Software Engineer with FactSet and Tarik Makota, AWS Principal Solutions Architect. In their own words “FactSet creates flexible, open data and software solutions for tens of thousands of investment professionals around the world, which provides instant access to financial data and analytics that investors use to make crucial decisions. At FactSet, we are always working to improve the value that our products provide.”

One area that we’ve been looking into is the relevancy of search results for our clients. Given the wide variety of client use cases and the large number of searches per day, we needed a platform to store anonymized usage data and allow us to analyze that data to boost results using our custom scoring algorithm. Amazon EMR was the obvious choice to host the calculations, but the question arose on how to get our anonymized data into a form that Amazon EMR could use. We worked with AWS and chose to use Amazon DynamoDB to prepare the data for usage in Amazon EMR.

This post walks you through how FactSet takes data from a DynamoDB table and converts that data into Apache Parquet. We store the Parquet files in Amazon S3 to enable near real-time analysis with Amazon EMR. Along the way, we encountered challenges related to data type conversion, which we will explain and show how we were able to overcome these.

Workflow overview

Our workflow contained the following steps:

  1. Anonymized log data is stored into DynamoDB tables. These entries have different fields, depending on how the logs were generated. Whenever we create items in the tables, we use DynamoDB Streams to write out a record. The stream records contain information from a single item in a DynamoDB table.
  2. An AWS Lambda function is hooked into the DynamoDB stream to capture the new items stored in a DynamoDB table. We built our Lambda function off of the lambda-streams-to-firehose project on GitHub to convert the DynamoDB stream image to JSON, which we stringify and push to Amazon Kinesis Data Firehose.
  3. Kinesis Data Firehose transforms the JSON data into Parquet using data contained within an AWS Glue Data Catalog table.
  4. Kinesis Data Firehose stores the Parquet files in S3.
  5. An AWS Glue crawler discovers the schema of DynamoDB items and stores the associated metadata into the Data Catalog.

The following diagram illustrates this workflow.

AWS Glue provides tools to help with data preparation and analysis. A crawler can run on a DynamoDB table to take inventory of the table data and store that information in a Data Catalog. Other services can use the Data Catalog as an index to the location, schema, and types of the table data. There are other ways to add metadata into a Data Catalog, but the key idea is that you can update and modify the metadata easily. For more information, see Populating the AWS Glue Data Catalog.

Problem: Data type disparities

Using a variety of technologies to build a solution often requires mapping and converting data types between these technologies. The cloud is no exception. In our case, log items stored in DynamoDB contained attributes of type String Set. String Set values caused data conversion exceptions when Kinesis tried to transform the data to Parquet. After investigating the problem, we found the following:

  • As the crawler indexes the DynamoDB table, Set data types (StringSet, NumberSet) are stored in the Glue metadata catalog as set<string> and set<bigint>.
  • Kinesis Data Firehose uses that same catalog when it performs the conversion to Apache Parquet. The conversion requires valid Hive data types.
  • set<string> and set<bigint> are not valid Hive data types, so the conversion fails, and an exception is generated. The exception looks similar to the following code:
    [{
       "lastErrorCode": "DataFormatConversion.InvalidSchema",
       "lastErrorMessage": "The schema is invalid. Error parsing the schema: Error: type expected at the position 38 of 'array,used:bigint>>' but 'set' is found."
    }]

Solution: Construct data mapping

While working with the AWS team, we confirmed that the Kinesis Data Firehose converter needs valid Hive data types in the Data Catalog to succeed. When it comes to complex data types, Hive doesn’t support set<data_type>, but it does support the following:

  • ARRAY<data_type>
  • MAP<primitive_type, data_type
  • STRUCT<col_name : data_type [COMMENT col_comment], ...>
  • UNIONTYPE<data_type, data_type, ...>

In our case, this meant that we must convert set<string> and set<bigint> into array<string> and array<bigint>. Our first step was to manually change the types directly in the Data Catalog. After we updated the Data Catalog to change all occurrences of set<data_type> to array<data_type>, the Kinesis transformation to Parquet completed successfully.

Our business case calls for a data store that can store items with different attributes in the same table and the addition of new attributes on-the-fly. We took advantage of DynamoDB’s schema-less nature and ability to scale up and down on-demand so we could focus on our functionality and not the management of the underlying infrastructure. For more information, see Should Your DynamoDB Table Be Normalized or Denormalized?

If our data had a static schema, a manual change would be good enough. Given our business case, a manual solution wasn’t going to scale. Every time we introduced new attributes to the DynamoDB table, we needed to run the crawler, which re-created the metadata and overwrote the change.

Serverless event architecture

To automate the data type updates to the Data Catalog, we used Amazon EventBridge and Lambda to implement the modifications to the data type mapping. EventBridge is a serverless event bus that connects applications using events. An event is a signal that a system’s state has changed, such as the status of a Data Catalog table.

The following diagram shows the previous workflow with the new architecture.

  1. The crawler stays as-is and crawls the DynamoDB table to obtain the metadata.
  2. The metadata obtained by the crawler is stored in the Data Catalog. Previous metadata is updated or removed, and changes (manual or automated) are overwritten.
  3. The event GlueTableChanged in EventBridge listens to any changes to the Data Catalog tables. After we receive the event that there was a change to the table, we trigger the Lambda function.
  4. The Lambda function uses AWS SDK to update the Glue Catalog table using the glue.update_table() API to replace occurrences of set<data_type> with array<data_type>.

To set up EventBridge, we set Event pattern to be “Pre-defined pattern by service”. For service provider, we selected AWS and Glue as service. Event Type we selected “Glue Data Catalog Table State Change”. The following screenshot shows the EventBridge configuration that sends events to the Lambda function that updates the Data Catalog.

The following is the baseline Lambda code:

# This is NOT production worthy code please modify and implement error handling routines as appropriate
import json
import logging
import boto3

glue = boto3.client('glue')

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Define subsegments manually
def table_contains_set(databaseName, tableName):
    
    # returns Glue Catalog description for Table structure
    response = glue.get_table( DatabaseName=databaseName,Name=tableName)
    logger.info(response)  
    
    # loop thru all the Columns of the table 
    isModified = False
    for i in response['Table']['StorageDescriptor']['Columns']: 
        logger.info("## Column: " + str(i['Name']))
        # if Column datatype starts with set< then change it to array<
        if i['Type'].find("set<") != -1:
            i['Type'] = i['Type'].replace("set<", "array<")
            isModified = True
            logger.info(i['Type'])
    
    if isModified:
        # following 3 statements simply clean up the response JSON so that update_table API call works
        del response['Table']['DatabaseName']
        del response['Table']['CreateTime']
        del response['Table']['UpdateTime']
        glue.update_table(DatabaseName=databaseName,TableInput=response['Table'],SkipArchive=True)
        
    logger.info("============ ### =============") 
    logger.info(response)
    
    return True
    
def lambda_handler(event, context):
    logger.info('## EVENT')
    # logger.info(event)
    # This is Sample of the event payload that would be received
    # { 'version': '0', 
    #   'id': '2b402842-21f5-1d76-1a9a-c90076d1d7da', 
    #   'detail-type': 'Glue Data Catalog Table State Change', 
    #   'source': 'aws.glue', 
    #   'account': '1111111111', 
    #   'time': '2019-08-18T02:53:41Z', 
    #   'region': 'us-east-1', 
    #   'resources': ['arn:aws:glue:us-east-1:111111111:table/ddb-glue-fh/ddb_glu_fh_sample'], 
    #   'detail': {
    #           'databaseName': 'ddb-glue-fh', 
    #           'changedPartitions': [], 
    #           'typeOfChange': 'UpdateTable', 
    #           'tableName': 'ddb_glu_fh_sample'
    #    }
    # }
    
    # get the database and table name of the Glue table triggered the event
    databaseName = event['detail']['databaseName']
    tableName = event['detail']['tableName']
    logger.info("DB: " + databaseName + " | Table: " + tableName)
    
    table_contains_set(databaseName, tableName)
   
    # TODO implement and modify
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }

The Lambda function is straightforward; this post provides a basic skeleton. You can use this as a template to implement your own functionality for your specific data.

Conclusion

Simple things such as data type conversion and mapping can create unexpected outcomes and challenges when data crosses service boundaries. One of the advantages of AWS is the wide variety of tools with which you can create robust and scalable solutions tailored to your needs. Using event-driven architecture, we solved our data type conversion errors and automated the process to eliminate the issue as we move forward.

 


About the Authors

Arvind Godbole is a Lead Software Engineer at FactSet Research Systems. He has experience in building high-performance, high-availability client facing products and services, ranging from real-time financial applications to search infrastructure. He is currently building an analytics platform to gain insights into client workflows. He holds a B.S. in Computer Engineering from the University of California, San Diego

 

 

 

Tarik Makota is a Principal Solutions Architect with the Amazon Web Services. He provides technical guidance, design advice and thought leadership to AWS’ customers across US Northeast. He holds an M.S. in Software Development and Management from Rochester Institute of Technology.

 

 

Protect your veggies from hail with a Raspberry Pi Zero W

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/protect-your-veggies-from-hail-with-a-raspberry-pi-zero-w/

Tired of losing vegetable crops to frequent summertime hail storms, Nick Rogness decided to build something to protect them. And the result is brilliant!

Digital Garden with hail protection

Tired of getting your garden destroyed by hail storms? I was, so I did something about it…maker style!

“I live in a part of the country where hail and severe weather are commonplace during the summer months,” Nick explains in his Hackster tutorial. “I was getting frustrated every year when my wife’s garden was get demolished by the nightly hail storms losing our entire haul of vegetable goodies!”

Nick drew up plans for a solution to his hail problem, incorporating liner actuators bolted to a 12ft × 12ft frame that surrounds the vegetable patch. When a storm is on the horizon, the actuators pull a heavy-duty tarp over the garden.

Nick connected two motor controllers to a Raspberry Pi Zero W. The Raspberry Pi then controls the actuators to pull the tarp, either when a manual rocker switch is flipped or when it’s told to do so via weather-controlled software.

“Software control of the garden was accomplished by using a Raspberry Pi and MQTT to communicate via Adafruit IO to reach the mobile app on my phone,” Nick explains. The whole build is powered by a 12V Marine deep-cycle battery that’s charged using a solar panel.

You can view the full tutorial on Hackster, including the code for the project.

The post Protect your veggies from hail with a Raspberry Pi Zero W appeared first on Raspberry Pi.

Orchestrating a security incident response with AWS Step Functions

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/orchestrating-a-security-incident-response-with-aws-step-functions/

In this post I will show how to implement the callback pattern of an AWS Step Functions Standard Workflow. This is used to add a manual approval step into an automated security incident response framework. The framework could be extended to remediate automatically, according to the individual policy actions defined. For example, applying alternative actions, or restricting actions to specific ARNs.

The application uses Amazon EventBridge to trigger a Step Functions Standard Workflow on an IAM policy creation event. The workflow compares the policy action against a customizable list of restricted actions. It uses AWS Lambda and Step Functions to roll back the policy temporarily, then notify an administrator and wait for them to approve or deny.

Figure 1: High-level architecture diagram.

Important: the application uses various AWS services, and there are costs associated with these services after the Free Tier usage. Please see the AWS pricing page for details.

You can deploy this application from the AWS Serverless Application Repository. You then create a new IAM Policy to trigger the rule and run the application.

Deploy the application from the Serverless Application Repository

  1. Find the “Automated-IAM-policy-alerts-and-approvals” app in the Serverless Application Repository.
  2. Complete the required application settings
    • Application name: an identifiable name for the application.
    • EmailAddress: an administrator’s email address for receiving approval requests.
    • restrictedActions: the IAM Policy actions you want to restrict.

      Figure 2 Deployment Fields

  3. Choose Deploy.

Once the deployment process is completed, 21 new resources are created. This includes:

  • Five Lambda functions that contain the business logic.
  • An Amazon EventBridge rule.
  • An Amazon SNS topic and subscription.
  • An Amazon API Gateway REST API with two resources.
  • An AWS Step Functions state machine

To receive Amazon SNS notifications as the application administrator, you must confirm the subscription to the SNS topic. To do this, choose the Confirm subscription link in the verification email that was sent to you when deploying the application.

EventBridge receives new events in the default event bus. Here, the event is compared with associated rules. Each rule has an event pattern defined, which acts as a filter to match inbound events to their corresponding rules. In this application, a matching event rule triggers an AWS Step Functions execution, passing in the event payload from the policy creation event.

Running the application

Trigger the application by creating a policy either via the AWS Management Console or with the AWS Command Line Interface.

Using the AWS CLI

First install and configure the AWS CLI, then run the following command:

aws iam create-policy --policy-name my-bad-policy1234 --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetBucketObjectLockConfiguration",
                "s3:DeleteObjectVersion",
                "s3:DeleteBucket"
            ],
            "Resource": "*"
        }
    ]
}'

Using the AWS Management Console

  1. Go to Services > Identity Access Management (IAM) dashboard.
  2. Choose Create policy.
  3. Choose the JSON tab.
  4. Paste the following JSON:
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketObjectLockConfiguration",
                    "s3:DeleteObjectVersion",
                    "s3:DeleteBucket"
                ],
                "Resource": "*"
            }
        ]
    }
  5. Choose Review policy.
  6. In the Name field, enter my-bad-policy.
  7. Choose Create policy.

Either of these methods creates a policy with the permissions required to delete Amazon S3 buckets. Deleting an S3 bucket is one of the restricted actions set when the application is deployed:

Figure 3 default restricted actions

This sends the event to EventBridge, which then triggers the Step Functions state machine. The Step Functions state machine holds each state object in the workflow. Some of the state objects use the Lambda functions created during deployment to process data.

Others use Amazon States Language (ASL) enabling the application to conditionally branch, wait, and transition to the next state. Using a state machine decouples the business logic from the compute functionality.

After triggering the application, go to the Step Functions dashboard and choose the newly created state machine. Choose the current running state machine from the executions table.

Figure 4 State machine executions.

You see a visual representation of the current execution with the workflow is paused at the AskUser state.

Figure 5 Workflow Paused

These are the states in the workflow:

ModifyData
State Type: Pass
Re-structures the input data into an object that is passed throughout the workflow.

ValidatePolicy
State type: Task. Services: AWS Lambda
Invokes the ValidatePolicy Lambda function that checks the new policy document against the restricted actions.

ChooseAction
State type: Choice
Branches depending on input from ValidatePolicy step.

TempRemove
State type: Task. Service: AWS Lambda
Creates a new default version of the policy with only permissions for Amazon CloudWatch Logs and deletes the previously created policy version.

AskUser
State type: Choice
Sends an approval email to user via SNS, with the task token that initiates the callback pattern.

UsersChoice
State type: Choice
Branch based on the user action to approve or deny.

Denied
State type: Pass
Ends the execution with no further action.

Approved
State type: Task. Service: AWS Lambda
Restores the initial policy document by creating as a new version.

AllowWithNotification
State type: Task. Services: AWS Lambda
With no restricted actions detected, the user is still notified of change (via an email from SNS) before execution ends.

The callback pattern

An important feature of this application is the ability for an administrator to approve or deny a new policy. The Step Functions callback pattern makes this possible.

The callback pattern allows a workflow to pause during a task and wait for an external process to return a task token. The task token is generated when the task starts. When the AskUser function is invoked, it is passed a task token. The task token is published to the SNS topic along with the API resources for approval and denial. These API resources are created when the application is first deployed.

When the administrator clicks on the approve or deny links, it passes the token with the API request to the receiveUser Lambda function. This Lambda function uses the incoming task token to resume the AskUser state.

The lifecycle of the task token as it transitions through each service is shown below:

Figure 6 Task token lifecycle

  1. To invoke this callback pattern, the askUser state definition is declared using the .waitForTaskToken identifier, with the task token passed into the Lambda function as a payload parameter:
    "AskUser":{
     "Type": "Task",
     "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
     "Parameters":{  
     "FunctionName": "${AskUser}",
     "Payload":{  
     "token.$":"$$.Task.Token"
      }
     },
      "ResultPath":"$.taskresult",
      "Next": "usersChoice"
      },
  2. The askUser Lambda function can then access this token within the event object:
    exports.handler = async (event,context) => {
        let approveLink = `process.env.APIAllowEndpoint?token=${JSON.stringify(event.token)}`
        let denyLink = `process.env.APIDenyEndpoint?token=${JSON.stringify(event.token)}
    //code continues
  3. The task token is published to an SNS topic along with the message text parameter:
        let params = {
     TopicArn: process.env.Topic,
     Message: `A restricted Policy change has been detected Approve:${approveLink} Or Deny:${denyLink}` 
    }
     let res = await sns.publish(params).promise()
    //code continues
  4. The administrator receives an email with two links, one to approve and one to deny. The task token is appended to these links as a request query string parameter named token:

    Figure 7 Approve / deny email.

  5. Using the Amazon API Gateway proxy integration, the task token is passed directly to the recieveUser Lambda function from the API resource, and accessible from within in the function code as part of the event’s queryStringParameter object:
    exports.handler = async(event, context) => {
    //some code
        let taskToken = event.queryStringParameters.token
    //more code
    
  6.  The token is then sent back to the askUser state via an API call from within the recieveUser Lambda function.  This API call also defines the next course of action for the workflow to take.
    //some code 
    let params = {
            output: JSON.stringify({"action":NextAction}),
            taskToken: taskTokenClean
        }
    let res = await stepfunctions.sendTaskSuccess(params).promise()
    //code continues
    

Each Step Functions execution can last for up to a year, allowing for long wait periods for the administrator to take action. There is no extra cost for a longer wait time as you pay for the number of state transitions, and not for the idle wait time.

Conclusion

Using EventBridge to route IAM policy creation events directly to AWS Step Functions reduces the need for unnecessary communication layers. It helps promote good use of compute resources, ensuring Lambda is used to transform data, and not transport or orchestrate.

Using Step Functions to invoke services sequentially has two important benefits for this application. First, you can identify the use of restricted policies quickly and automatically. Also, these policies can be removed and held in a ‘pending’ state until approved.

Step Functions Standard Workflow’s callback pattern can create a robust orchestration layer that allows administrators to review each change before approving or denying.

For the full code base see the GitHub repository https://github.com/bls20AWS/AutomatedPolicyOrchestrator.

For more information on other Step Functions patterns, see our documentation on integration patterns.

Multi-branch CodePipeline strategy with event-driven architecture

Post Syndicated from Henrique Bueno original https://aws.amazon.com/blogs/devops/multi-branch-codepipeline-strategy-with-event-driven-architecture/

Henrique Bueno, DevOps Consultant, Professional Services

This blog post presents a solution for automated pipelines creation in AWS CodePipeline when a new branch is created in an AWS CodeCommit repository. A use case for this solution is when a GitFlow approach using CodePipeline is required. The strategy presented here is used by AWS customers to enable the use of GitFlow using only AWS tools.

CodePipeline is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. CodePipeline automates the build, test, and deploy phases of your release process every time there is a code change, based on the release model you define.

CodeCommit is a fully managed source control service that hosts secure Git-based repositories. It makes it easy for teams to collaborate on code in a secure and highly scalable ecosystem.

GitFlow is a branching model designed around the project release. This provides a robust framework for managing larger projects. Gitflow is ideally suited for projects that have a scheduled release cycle.

Applicability

When using CodePipeline to orchestrate pipelines and CodeCommit as a code source, in addition to setting a repository, you must also set which branch will trigger the pipeline. This configuration works perfectly for the trunk-based strategy, in which you have only one main branch and all the developers interact with this single branch. However, when you need to work with a multi-branching strategy like GitFlow, the requirement to set a pipeline for each branch brings additional challenges.

It’s important to note that trunk-based is, by far, the best strategy for taking full advantage of a DevOps approach; this is the branching strategy that AWS recommends to its customers. On the other hand, many customers like to work with multiple branches and believe it justifies the effort and complexity in dealing with branching merges. This solution is for these customers.

Solution Overview

One of the great benefits of working with Infrastructure as Code is the ability to create multiple identical environments through a single template. This example uses AWS CloudFormation templates to provision pipelines and other necessary resources, as shown in the following diagram.

Multi-branch CodePipeline strategy with event-driven architecture

Multi-branch CodePipeline strategy with event-driven architecture

The template is hosted in an Amazon S3 bucket. An AWS Lambda function deploys a new AWS CloudFormation stack based on this template. This Lambda function is trigged for an Amazon CloudWatch Events rule that looks for events at the CodePipeline repository.

The CreatePipeline Events rule

The AWS CloudFormation snippet that creates the Events rule follows. The Events rule monitors create and delete branches events in all repositories, triggering the CreatePipeline Lambda function.

#----------------------------------------------------------------------#
# EventRule to trigger LambdaPipeline lambda
#----------------------------------------------------------------------#
  CreatePipelineRule:
    Type: AWS::Events::Rule
    Properties: 
      Description: "EventRule"
      EventPattern:
        source:
          - aws.codecommit
        detail-type:
          - 'CodeCommit Repository State Change'
        detail:
          event:
              - referenceDeleted
              - referenceCreated
          referenceType:
            - branch
      State: ENABLED
      Targets: 
      - Arn: !GetAtt CreatePipeline.Arn
        Id: CreatePipeline

 

The CreatePipeline Lambda function

The Lambda function receives the event details, parses the variables, and executes the appropriate actions. If the event is referenceCreated, then the stack is created; otherwise the stack is deleted. The stack name created or deleted is the junction of the repository name plus the new branch name. This is a very simple function.

#----------------------------------------------------------------------#
# Lambda for Stack Creation
#----------------------------------------------------------------------#
import boto3
def lambda_handler(event, context):
    Region = event['region']
    Account = event['account']
    RepositoryName = event['detail']['repositoryName']
    NewBranch = event['detail']['referenceName']
    Event = event['detail']['event']
    if NewBranch == "master":
        quit()
    if Event == "referenceCreated":
        cf_client = boto3.client('cloudformation')
        cf_client.create_stack(
            StackName=f'Pipeline-{RepositoryName}-{NewBranch}',
            TemplateURL=f'https://s3.amazonaws.com/{Account}-templates/TemplatePipeline.yaml',
            Parameters=[
                {
                    'ParameterKey': 'RepositoryName',
                    'ParameterValue': RepositoryName,
                    'UsePreviousValue': False
                },
                {
                    'ParameterKey': 'BranchName',
                    'ParameterValue': NewBranch,
                    'UsePreviousValue': False
                }
            ],
            OnFailure='ROLLBACK',
            Capabilities=['CAPABILITY_NAMED_IAM']
        )
    else:
        cf_client = boto3.client('cloudformation')
        cf_client.delete_stack(
            StackName=f'Pipeline-{RepositoryName}-{NewBranch}'
        )

The logic for creating only the CI or the CI+CD is on the AWS CloudFormation template. The Conditions section of AWS CloudFormation analyzes the new branch name.

Conditions: 
    BranchMaster: !Equals [ !Ref BranchName, "master" ]
    BranchDevelop: !Equals [ !Ref BranchName, "develop"]
    Setup: !Equals [ !Ref Setup, true ]
  • If the new branch is named master, then a stack will be created containing CI+CD pipelines, with deploy stages in the homologation and production environments.
  • If the new branch is named develop, then a stack will be created containing CI+CD pipelines, with a deploy stage in the Dev environment.
  • If the new branch has any other name, then the stack will be created with only a CI pipeline.

Since the purpose of this blog post is to present only a sample of automated pipelines creation, the pipelines used here are for examples only: they don’t deploy to any environment.

 

Applicability

This event-driven strategy permits pipelines to be created or deleted along with the branches. Since the entire environment is created using Infrastructure as Code and the template is the same for all pipelines, there is no possibility of different configuration issues between environments and pipeline stages.

A GitFlow simulation could resemble that shown in the following diagram:

GitFlow Diagram

GitFlow Diagram

  1. First, a CodeCommit repository is created, along with the master branch and its respective pipeline (CI+CD).
  2. The developer creates a branch called develop based on the master branch. The pipeline (CI+CD at Dev) is automatically created.
  3. The developer creates a feature-branch called feature-a based on the develop branch. The CI pipeline for this branch is automatically created.
  4. The developer creates a Pull Request from the feature-a branch to the develop branch. As soon as the Pull Request is accepted and merged and the feature-a branch is deleted, its pipeline is automatically deleted.

The same process can be followed for the release branch and hotfix branch. Once the branch is created, a new pipeline is created for it which follows its branch lifecycle.

Pipelines Stacks

 

Implementation

Before you start, make sure that the AWS CLI is installed and configured on your machine by following these steps:

  1. Clone the repository.
  2. Create the prerequisites stack.
  3. Copy the AWS CloudFormation template to Amazon S3.
  4. Copy the seed.zip file to the Amazon S3 bucket.
  5. Create the first repository and its pipeline.
  6. Create the develop branch.
  7. Create the first feature branch.
  8. Create the first Pull Request.
  9. Execute the Pull Request approval.
  10. Cleanup.

 

1. Clone the repository

Clone the repository with the sample code.

The main files are:

  • Setup.yaml: an AWS CloudFormation template for creating pipeline prerequisites.
  • TemplatePipeline.yaml: an AWS CloudFormation template for pipeline creation.
  • seed/buildspec/CIAction.yaml: a configuration file for an AWS CodeBuild project at the CI stage.
  • seed/buildspec/CDAction.yaml: a configuration file for a CodeBuild project at the CD stage.
# Command to clone the repository
git clone https://github.com/aws-samples/aws-codepipeline-multi-branch-strategy.git
cd aws-codepipeline-multi-branch-strategy

 

2. Create the prerequisite stack

The Setup stack creates the resources that are prerequisites for pipeline creation, as shown in the following chart.

These resources are created only once and they fit all the pipelines created in this example.

Resources

# Command to create Setup stack
aws cloudformation deploy --stack-name Setup-Pipeline \
--template-file Setup.yaml --region us-east-1 --capabilities CAPABILITY_NAMED_IAM

 

3. Copy the AWS CloudFormation template to Amazon S3

For the Lambda function to deploy a new pipeline stack, it needs to get the AWS CloudFormation template from somewhere. To enable it to do so, you need to save the template inside the Amazon S3 bucket that you just created at the Setup stack.

# Command that copy Template to S3 Bucket
aws s3 cp TemplatePipeline.yaml s3://"$(aws sts get-caller-identity --query Account --output text)"-templates/ --acl private

 

4. Copy the seed.zip file to the Amazon S3 bucket

CodeCommit permits you to populate a repository at the moment of its creation as a first commit. The content of this first commit can be saved in a .zip file in an Amazon S3 bucket. Use this CodeCommit option to populate your repository with BuildSpec files for CodeBuild.

# Command to create zip file with the Buildspec folder content.
zip -r seed.zip buildspec

# Command that copy seed.zip file to S3 Bucket.
aws s3 cp seed.zip s3://"$(aws sts get-caller-identity --query Account --output text)"-templates/ --acl private

 

5. Create the first repository and its pipeline

Now that the Setup stack is created and the seed file is stored in an Amazon S3 bucket, create the first CodeCommit repository. Every time that you want to create a new repository, execute the command below to create a new stack.

# Command to create the stack with the CodeCommit repository,
# CodeBuild Projects and the Pipeline for the master branch.
# Note: Change "myapp" by the name you want.

RepoName="myapp"
aws cloudformation deploy --stack-name Repo-$RepoName --template-file TemplatePipeline.yaml \
--parameter-overrides RepositoryName=$RepoName Setup=true \
--region us-east-1 --capabilities CAPABILITY_NAMED_IAM

When the stack is created, in addition to the CodeCommit repository, the CodeBuild projects and the master branch pipeline are also created. By default, a CodeCommit repository is created empty, with no branch. When the repository is populated with the seed.zip file, the master branch is created.

Resources

Access the CodeCommit repository to see the seed files at the master branch. Access the CodePipeline console to see that there’s a new pipeline with the name as the repository. This pipeline contains the CI+CD stages (homolog and prod).

 

6. Create the develop branch

To simulate a real development scenario, create a new branch called develop based on the master branch. In the GitFlow concept these two (master and develop) branches are fixed and never deleted.

When this new branch is created, the Events rule identifies that there’s a change on this repository and triggers the CreatePipeline Lambda function to create a new pipeline for this branch. Access the CodePipeline console to see that there’s a new pipeline with the name of the repository plus the branch name. This pipeline contains the CI+CD stages (Dev).

# Configure Git Credentials using AWS CLI Credential Helper
mkdir myapp
cd myapp
git config --global credential.helper '!aws codecommit credential-helper [email protected]'
git config --global credential.UseHttpPath true

# Clone the CodeCommit repository
# You can get the URL in the CodeCommit Console
git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/myapp .

# Create the develop branch
# For more details: https://docs.aws.amazon.com/codecommit/latest/userguide/how-to-create-branch.html
git checkout -b develop
git push origin develop

 

7. Create the first feature branch

Now that there are two main and fixed branches (master and develop), you can create a feature branch. In the GitFlow concept, feature branches have a short lifetime and are frequently merged to the develop branch. This type of branch only exists during the development period. When the feature development is finished, it is merged to the develop branch and the feature branch is deleted.

# Create the feature-branch branch
# make sure that you are at develop branch
git checkout -b feature-abc
git push origin feature-abc

Just as with the develop branch, when this new branch is created, the Events rule triggers the CreatePipeline Lambda function to create a new pipeline for this branch. Access the CodePipeline console to see that there’s a new pipeline with the name of the repository plus the branch name. This pipeline contains only the CI stage, without a CD stage.

Pipelines-2

 

8. Create the first Pull Request

It’s possible to simulate the end of a feature development, when the branch is ready to be merged with the develop branch. Keep in mind that the feature branch has a short lifecycle:  when the merge is done, the feature branch is deleted along with its pipeline.

# Create the Pull Request
aws codecommit create-pull-request --title "My Pull Request" \
--description "Please review these changes by Tuesday" \
--targets repositoryName=$RepoName,sourceReference=feature-abc,destinationReference=develop \
--region us-east-1

 

9. Execute the Pull Request approval

To merge the feature branch to the develop branch, the Pull Request needs to be approved. In a real scenario, a teammate should do a peer review before approval.

# Accept the Pull Request
# You can get the Pull-Request-ID in the json output of the create-pull-request command
aws codecommit merge-pull-request-by-fast-forward --pull-request-id <PULL_REQUEST_ID_FROM_PREVIOUS_COMMAND> \
--repository-name $RepoName --region us-east-1

# Delete the feature-branch
aws codecommit delete-branch --repository-name $RepoName --branch-name feature-abc --region us-east-1

The new code is integrated with the develop branch. If there’s a conflict, it needs to be solved. After that, the featurebranch is deleted together with its pipeline. The Event Rule triggers the CreatePipeline Lambda function to delete the pipeline for its branch. Access the CodePipeline console to see that the pipeline for the feature branch is deleted.

Pipelines

 

10. Cleanup

To remove the resources created as part of this blog post, follow these steps:

Delete Pipeline Stacks

# Delete all the pipeline Stacks created by CreatePipeline Lambda
Pipelines=$(aws cloudformation list-stacks --stack-status-filter --region us-east-1 --query 'StackSummaries[? StackStatus==`CREATE_COMPLETE` && starts_with(StackName, `Pipeline`) == `true`].[StackName]' --output text)
while read -r Pipeline rest; do aws cloudformation delete-stack --stack-name $Pipeline --region us-east-1 ; done <<< $Pipelines

Delete Repository Stacks

# Delete all the Repository stacks Stacks created by Step 5.
Repos=$(aws cloudformation list-stacks --stack-status-filter --region us-east-1 --query 'StackSummaries[? StackStatus==`CREATE_COMPLETE` && starts_with(StackName, `Repo-`) == `true`].[StackName]' --output text)
while read -r Repo rest; do aws cloudformation delete-stack --stack-name $Repo --region us-east-1 ; done <<< $Repos

Delete Setup-Pipeline Stack

# Cleaning Bucket before Stack deletion 
aws s3 rm s3://"$(aws sts get-caller-identity --query Account --output text)"-templates --recursive

# Delete Setup Stack
aws cloudformation delete-stack --stack-name Setup-Pipeline --region us-east-1 

 

Conclusion

This blog post discussed how you can work with event-driven strategy and Infrastructure as Code to implement a multi-branch pipeline flow using CodePipeline. It demonstrated how an Events rule and Lambda function can be used to fully orchestrate the creation and deletion of pipelines.

 

About the Author

Henrique Bueno

Henrique Bueno

Henrique is a DevOps Consultant in the Brazilian Professional Services Team at Amazon Web Services. He has helped AWS customers design, build and deploy cloud native applications by following the Twelve-Factor App methodology.

Integrating SonarQube as a pull request approver on AWS CodeCommit

Post Syndicated from David Jackson original https://aws.amazon.com/blogs/devops/integrating-sonarqube-as-a-pull-request-approver-on-aws-codecommit/

Integrating SonarQube as a pull request approver on AWS CodeCommit

On Nov 25th, AWS CodeCommit launched a new feature that allows customers to configure approval rules on pull requests. Approval rules act as a gate on your source code changes. Pull requests which fail to satisfy the required approvals cannot be merged into your important branches. Additionally, CodeCommit launched the ability to create approval rule templates, which are rulesets that can automatically be applied to all pull requests created for one or more repositories in your AWS account. With templates, it becomes simple to create rules like “require one approver from my team” for any number of repositories in your AWS account.

A common problem for software developers is accidentally or unintentionally merging code with bugs, defects, or security vulnerabilities into important master branches. Once bad code is merged into a master branch, it can be difficult to remove. It’s also potentially costly if the code is deployed into production environments and causes outages or other serious issues. Using CodeCommit’s new features, adding required approvers to your repository pull requests can help identify and mitigate those issues before they are merged into your master branches.

The most rudimentary use of required approvers is to require at least one team member to approve each pull request. While adding human team members as approvers is an important part of the pull request workflow, this feature can also be used to require ‘robot’ approvers of your pull requests, and you can trigger them automatically on each new or updated pull request. Robotic approvers can help find issues that humans miss and enforce best practices regarding code style, test coverage, and more.

Customers have been asking us how we can integrate code review tools with AWS CodeCommit pull requests. I encourage you to check out Amazon CodeGuru Reviewer, which is a service that uses program analysis and machine learning to detect potential defects that are difficult for developers to find and recommends fixes in your Java code, and was launched in preview at the AWS Re:Invent 2019 conference. Another popular tool is SonarQube, which is an open-source platform for performing code quality analysis. It helps detect defects, bugs, and security vulnerabilities in your pull requests. This blog post shows you how to integrate SonarQube into the pull requests workflow.

This post shows…

Time to read 10 minutes
Time to complete 20 minutes
Cost to complete (estimated) $0.40/month for secret, ~$0.02 per build on CodeBuild. $0-1 for CodeCommit user depending on current free tier status. (at publication time)
Learning level Intermediate (200)
Services used AWS CodeCommit, AWS CodeBuild, AWS CloudFormation, Amazon Elastic Compute Cloud (EC2), AWS CloudWatch Events, AWS Identity and Access Management, AWS Secrets Manager

Solution overview

In this solution, you create a CodeCommit repository that requires a successful SonarQube quality analysis before pull requests can be merged. You can create the required AWS resources in your account by using the provided AWS CloudFormation template. This template creates the following resources:

  • A new CodeCommit repository, containing a starter Java project that uses the Apache Maven build system, as well as a custom buildspec.yml file to facilitate communication with SonarQube and CodeCommit.
  • An AWS CodeBuild project which invokes your SonarQube instance on build, then reports the status of the analysis back to CodeCommit.
  • An Amazon CloudWatch Events Rule, which listens for pullRequestCreated and pullRequestSourceBranchUpdated events from CodeCommit, and invokes your CodeBuild project.
  • An AWS Secrets Manager secret, which securely stores and provides the username and password of your SonarQube user to the CodeBuild project on-demand.
  • IAM roles for CodeBuild and CloudWatch events.

Although this tutorial showcases a Java project with Maven, the design principles should also apply for other languages and build systems with SonarQube integrations.

Design

The following diagram shows the flow of data, starting with a new or updated pull request on CodeCommit. CloudWatch Events listens for these events and invokes your CodeBuild project. The CodeBuild container clones your repository source commit, performs a Maven install, and invokes the quality analysis on SonarQube, using the credentials obtained from AWS Secrets Manager. When finished, CodeBuild leaves a comment on your pull request, and potentially approves your pull request.

 

Diagram showing the flow of data between the AWS service components, as well as the SonarQube.

Prerequisites

For this walkthrough, you require:

  • An AWS account
  • A SonarQube server instance (Optional setup instructions included if you don’t have one already)

SonarQube instance setup (Optional)

This tutorial shows a basic setup of SonarQube on Amazon EC2 for informational purposes only. It does not include details about securing your Amazon EC2 instance or SonarQube installation. Please be sure you have secured your environments before placing sensitive data on them.

  1. To start, get a SonarQube server instance up and running. If you are already using SonarQube, feel free to skip these instructions and just note down your host URL and port number for later. If you don’t have one already, I recommend using a fresh Amazon EC2 instance for the job. You can get up and running quickly in just a few commands. I’ve selected an Amazon Linux 2 AMI for my EC2 instance.
  2. Download and install the latest JDK 11 module. Because I am using an Amazon Linux 2 EC2 instance, I can directly install Amazon Corretto 11 with yum.

$ sudo yum install java-11-amazon-corretto-headless

  1. After it’s installed, verify you’re using this version of Java:

$ sudo alternatives --config java

  1. Choose the Java 11 version you just installed.
  2. Download the latest SonarQube installation.
  3. Copy the zip-file onto your Amazon EC2 instance.
  4. Unzip the file into your home directory:

$ unzip sonarqube-8.0.zip -d ~/

This will copy the files into a directory like /home/ec2-user/sonarqube-8.0.

Now, start the server!

$ ~/sonarqube-8.0/bin/linux-x86-64/sonar.sh start

This should start a SonarQube server running on an address like http://<instance-address>:9000. It may take a few moments for the server to start.

Steps

Follow these steps to create automated pull request approvals.

Create a SonarQube User

Get started by creating a SonarQube user from your SonarQube webpage. This user is the identity used by the robot caller to your SonarQube for this workflow.

  1. Go to the Administration tab on your SonarQube instance.
  2. Choose Security, then Users, as shown in the following screenshot.Screenshot showing where to find the user management options inside SonarQube.
  3. Choose Create User. Fill in the form, and note down the Login and Password You will need to provide these values when creating the following AWS resources.
  4. Choose Create.

Create AWS resources

For this integration, you need to create some AWS resources:

  • AWS CodeCommit repository
  • AWS CodeBuild project
  • Amazon CloudWatch Events rule (to trigger builds when pull requests are created or updated)
  • IAM role (for CodeBuild to assume)
  • IAM role (for CloudWatch Events to assume and invoke CodeBuild)
  • AWS Secrets Manager secret (to store and manage your SonarQube user credentials)

I have created an AWS CloudFormation template to provision these resources for you. You can download the template from the sample repository on GitHub for this blog demo. This repository also contains the sample code which will be uploaded to your CodeCommit repository. The contents of this GitHub repository will automatically be copied into your new CodeCommit repository for you when you create this CloudFormation stack. This is because I’ve conveniently uploaded a zip-file of the contents into a publicly-readable S3 bucket, and am using it within this CloudFormation template.

  1. Download or copy the CloudFormation template from GitHub and save it as template.yaml on your local computer.
  2. At the CloudFormation console, choose Create Stack (with new resources).
  3. Choose Upload a template file.
  4. Choose Choose file and select the template.yaml file you just saved.
  5. Choose Next.
  6. Give your stack a name, optionally update the CodeCommit repository name and description, and paste in the username and password of the SonarQube user you created.
  7. Choose Next.
  8. Review the stack options and choose Next.
  9. On Step 4, review your stack, acknowledge the required capabilities, and choose Create Stack.
  10. Wait for the stack creation to complete before proceeding.
  11. Before leaving the AWS CloudFormation console, choose the Resources tab and note down the newly created CodeBuildRole’s Physical Id, as shown in the following screenshot. You need this in the next step. Screenshot showing the Physical Id of the CodeBuild role created through CloudFormation.

Create an Approval Rule Template

Now that your resources are created, create an Approval Rule Template in the CodeCommit console. This template allows you to define a required approver for new pull requests on specific repositories.

  1. On the CodeCommit console home page, choose Approval rule templates in the left panel. Choose Create template.
  2. Give the template a name (like Require SonarQube approval) and optionally, a description.
  3. Set the number of approvals needed as 1.
  4. Under Approval pool members, choose Add.
  5. Set the approver type to Fully qualified ARN. Since the approver will be the identity obtained by assuming the CodeBuild execution role, your approval pool ARN should be the following string:
    arn:aws:sts::<Your AccountId>:assumed-role/<Your CodeBuild IAM role name>/*
    The CodeBuild IAM role name is the Physical Id of the role you created and noted down above. You can also find the full name either in the IAM console or the AWS CloudFormation stack details. Adding this role to the approval pool allows any identity assuming your CodeBuild role to satisfy this approval rule.
  6. Under Associated repositories, find and choose your repository (PullRequestApproverBlogDemo). This ensures that any pull requests subsequently created on your repository will have this rule by default.
  7. Choose Create.

Update the repository with a SonarQube endpoint URL

For this step, you update your CodeCommit repository code to include the endpoint URL of your SonarQube instance. This allows CodeBuild to know where to go to invoke your SonarQube.

You can use the AWS Management Console to make this code change.

  1. Head back to the CodeCommit home page and choose your repository name from the Repositories list.
  2. You need a new branch on which to update the code. From the repository page, choose Branches, then Create branch.
  3. Give the new branch a name (such as update-url) and make sure you are branching from master. Choose Create branch.
  4. You should now see two branches in the table. Choose the name of your new branch (update-url) to start browsing the code on this branch. On the update-url branch, open the buildspec.yml file by choosing it.
  5. Choose Edit to make a change.
  6. In the pre_build steps, modify line 17 with your SonarQube instance url and listen port number, as shown in the following screenshot.Screenshot showing buildspec yaml code.
  7. To save, scroll down and fill out the author, email, and commit message. When you’re happy, commit this by choosing Commit changes.

Create a Pull Request

You are now ready to create a pull request!

  1. From the CodeCommit console main page, choose Repositories and PullRequestApproverBlogDemo.
  2. In the left navigation panel, choose Pull Requests.
  3. Choose Create pull request.
  4. Select master as your destination branch, and your new branch (update-url) as the source branch.
  5. Choose Compare.
  6. Give your pull request a title and description, and choose Create pull request.

It’s time to see the magic in action. Now that you’ve created your pull request, you should already see that your pull request requires one approver but is not yet approved. This rule comes from the template you created and associated earlier.

You’ll see images like the following screenshot if you browse through the tabs on your pull request:

Screenshot showing that your pull request has 0 of 1 rule satisfied, with 0 approvals. Screenshot showing a table of approval rules on this pull request which were applied by a template. Require SonarQube approval is listed but not yet satisfied.

Thanks to the CloudWatch Events Rule, CodeBuild should already be hard at work cloning your repository, performing a build, and invoking your SonarQube instance. It is able to find the SonarQube URL you provided because CodeBuild is cloning the source branch of your pull request. If you choose to peek at your project in the CodeBuild console, you should see an in-progress build.

Once the build has completed, head back over to your CodeCommit pull request page. If all went well, you’ll be able to see that SonarQube approved your pull request and left you a comment. (Or alternatively, failed and also left you a comment while not approving).

The Activity tab should resemble that in the following screenshot:

Screenshot showing that a comment was made by SonarQube through CodeBuild, and that the quality gate passed. The comment includes a link back to the SonarQube instance.

The Approvals tab should resemble that in the following screenshot:

Screenshot of Approvals tab on the pull request. The approvals table shows an approval by the SonarQube and that the rule to require SonarQube approval is satisfied.

Suppose you need to make a change to your pull request. If you perform updates to your source branch, the approval status will be reset. As your push completes, a new SonarQube analysis will begin just as it did the first time.

Once your SonarQube thresholds are satisfied and your pull request is approved, feel free to merge it!

Cleanup

To avoid incurring additional charges, you may want to delete the AWS resources you created for this project. To do this, simply navigate to the CloudFormation console, select the stack you created above, and choose Delete. If you are sure you want to delete, confirm by choosing Delete stack. CloudFormation will delete all the resources you created with this stack.

Conclusion

In this tutorial, you created a workflow to watch for pull request changes to your repository, triggered a CodeBuild project execution which invoked your SonarQube for code quality analysis, and then reported back to CodeCommit to approve your pull request.

I hope this guide illustrates the potential power of combining pull request approval rules with robotic approvers. While this example is specifically about integrating SonarQube, the same pattern can be used to invoke other robotic approvers using CodeBuild, or by invoking an AWS Lambda function instead.

This tutorial was written and tested using SonarQube Version 8.0 (build 29455).

Transforming DevOps at Broadridge on AWS

Post Syndicated from Som Chatterjee original https://aws.amazon.com/blogs/devops/transforming-devops-for-a-fintech-on-aws/

by Tom Koukourdelis (Broadridge – Vice President, Head of Global Cloud Platform Development and Engineering), Sreedhar Reddy (Broadridge – Vice President, Enterprise Cloud Architecture)

We have seen large enterprises in all industry segments meaningfully utilizing AWS to build new capabilities and deliver business value. While doing so, enterprises have to balance existing systems, processes, tools, and culture while innovating at pace with industry disruptors. Broadridge Financial Solutions, Inc. (NYSE: BR) is no exception. Broadridge is a $4 billion global FinTech leader and a leading provider of investor communications and technology-driven solutions to banks, broker-dealers, asset and wealth managers, and corporate issuers.

This blog post explores how we adopted AWS at scale while being secured and compliant, as well as delivering a high degree of productivity for our builders on AWS. It also describes the steps we took to create technical (a cloud solution as a foundation based on AWS) and procedural (organizational) capabilities by leveraging AWS cloud adoption constructs. The improvement in our builder productivity and agility directly contributes to rolling out differentiated business capabilities addressing our customer needs in a timely manner. In this post, we share real-life learnings and takeaways to adopt AWS at scale, transform business and application team experiences, and deliver customer delight.

Background

At Broadridge we have number of distributed and mainframe systems supporting multiple financial services domains and sub-domains such as post trade, proxy communications, financial and regulatory reporting, portfolio management, and financial operations. The majority of these systems were built and deployed years ago at on-premises data centers all over the US and abroad.

Builder personas at Broadridge are diverse in terms of location, culture, and the technology stack they use to build and support applications (we use a number of front-end JS frameworks; .NET; Java; ColdFusion for web development; ORMs for data entity relational mapping; IBM MQ; Apache Camel for messaging; databases like SQL, Oracle, Sybase, and other open source stacks for transaction management; databases, and batch processing on virtualized and bare metal instances). With more than 200 on-premises distributed applications and mainframe systems across front-, mid-, and back-office ecosystems, we wanted to leverage AWS to improve efficiency and build agility, and to reduce costs. The ability to reach customers at new geographies, reduced time to market, and opportunities to build new business competencies were key parameters as well.

Broadridge’s core tenents for cloud adoption

When AWS adoption within Broadridge attained a critical mass (known as the Foundation stage of adoption), the business and technology leadership teams defined our posture of cloud adoption and shared them with teams across the organization using the following tenets. Enterprises looking to adopt AWS at scale should define similar tenets fit for their organizations in plain language understandable by everyone across the board.

  • Iterate: Understanding that we cannot disrupt ongoing initiatives, small and iterative approach of moving workloads to cloud in waves— rinse and repeat— was to be adopted. Staying away from long-drawn, capital-intensive big bangs were to be avoided.
  • Fully automate: Starting from infrastructure deployment to application build, test, and release, we decided early on that automation and no-touch deployment are the right approach both to leverage cloud capabilities and to fuel a shift toward a matured DevOps culture.
  • Trust but verify only exceptions: Security and regulatory compliance are paramount for an organization like Broadridge. Guardrails (such as service control policies, managed AWS Config rules, multi-account strategy) and controls (such as PCI, NIST control frameworks) are iteratively developed to baseline every AWS account and AWS resource deployed. Manual security verification of workloads isn’t needed unless an exception is raised. Defense in depth (distancing attack surface from sensitive data and resources using multi-layered security) strategies were to be applied.
  • Go fast; re-hosting is acceptable: Not every workload needs to go through years of rewriting and refactoring before it is deemed suitable for the cloud. Minor tweaking (light touch re-platforming) to go fast (such as on-premises Oracle to RDS for Oracle) is acceptable.
  • Timeliness and small wins are key: Organizations spend large sums of capital to completely rewrite applications and by the time they are done, the business goal and customer expectation will have changed. That leads to material dissatisfaction with customers. We wanted to avoid that by setting small, measurable targets.
  • Cloud fluency: Investment in training and upskilling builders and leaders across the organization (developers, infra-ops, sec-ops, managers, salesforce, HR, and executive leadership) were to be to made to build fluency on the cloud.

The first milestone

The first milestone in our adoption journey was synonymous with Project stage of adoption and had the following characteristics.

A controlled sprawl of shadow IT

We first gave small teams with little to no exposure to critical business functions (such as customer data and SLA-oriented workloads) sandboxes to test out proofs of concepts (PoC) on AWS. We created the cloud sandboxes with least privilege, and added additional privileges upon request after verification. During this time, our key AWS usage characteristics were:

  • Manual AWS account setup with least privilege
  • Manual IAM role creation with role boundaries and authentication and authorization from the existing enterprise Active Directory
  • Integration with existing Security Information and Event Management (SIEM) tools to audit role sprawl and config changes
  • Proofs of concepts only
  • Account tagging for chargeback and tracking purposes
  • No automated build, test, deploy, or integration with existing delivery pipeline
  • Small and definitive timeframes for PoCs with defined goals

A typical AWS environment at this stage will resemble that shown in the following diagram:

Representative AWS usage during first milestone

As shown above, at this time the corporate assets were connected to a highly restrictive AWS environment through VPN. The access to the AWS environment were setup based on AWS Identity primitives or IAM roles mapped to and federated with the on-premises Active Directory. There was a single VPC setup for a sandbox account with no egress to the internet. There were no customer data hosted on this AWS environment and the AWS environment was connected with our SIEM of choice.

Early adopters became first educators and mentors

Members of the first teams to carry out proofs of concept on AWS shared learnings with each other and with the leadership team within Broadridge. This helped build communities of practices (CoPs) over time. Initial CoPs established were for networking and security, and were later extended to various practices like Terraform, Chef, and Jenkins.

Tech PMO team within Broadridge as the quasi-central cloud team

Ownership is vital no matter how small the effort and insignificant the impact of risky experimentation. The ownership of account setup, role creation, integration with on-premises AD and SIEM, and oversight to ensure that the experimentation does not pose any risk to the brand led us to build a central cloud team with experienced AWS and infrastructure practitioners. This team created a process for cloud migration with first manual guardrails of allowed and disallowed actions, manual interventions, and checkpoints built in every step.

At this stage, a representative pattern of work products across teams resembles what is shown below.

Work products across teams during initial stages of AWS usage

As the diagram suggests, individual application teams built overlapping—and, in many cases, identical—technical building blocks across the teams. This was acceptable as the teams were experimenting and running PoCs on AWS. In an actual production application delivery, the blocks marked with a * would be considered technical and functional waste—that is, undifferentiated lift which increases the cost of doing business.

The second milestone

In hindsight, this is perhaps the most important milestone in our cloud adoption journey. This step was marked with following key characteristics:

  • Every new team doing PoCs are rebuilding the same building blocks: This includes networking (VPCs and security groups), identity primitives (account, roles, and policies), monitoring (Amazon CloudWatch setup and custom metrics), and compute (images with org-mandated security patches).
  • The teams usually asking the same first fundamental questions: These include questions such as: What is an ideal CIDR block range? How do we integrate with SIEM? How do we spin up web servers on Amazon EC2? How do we secure access to data? How do we setup workload monitoring?
  • Security reviews rarely finding new security gaps but adding time to the process: A central security group as part of the central cloud team reviewed every new account request and every new service usage request without finding new security gaps when the application team used the baseline guardrails.
  • Manual effort is spent on tagging, chargeback, and other approvals: A portion of the application PoC/minimum viable product (MVP) lifecycle was spent on housekeeping. While housekeeping was necessary, the effort spent was undifferentiated.

The follow diagram represents the efforts for every team during the first phase.

Team wise efforts showing duplicative work

As shown above, every application team spent effort on building nearly the same capabilities before they could begin developing their team specific application functionalities and assets. The common blocks of work are undifferentiated and leads to spending effort which also varies depending on the efficiency of the team.

During this step, learnings from the PoCs led us to establish the tenets shared earlier in this post. To address the learnings, Broadridge established a cloud platform team. The cloud platform team, also referred to as the cloud enablement engine (CEE), is a team of builders who create the foundational building blocks on AWS that address common infrastructure, security, monitoring, auditing, and break-glass controls. At the same time, we established a cloud business office (CBO) as a liaison between the application and business teams and the CEE. CBO exists to manage and prioritize foundational requirements from multiple application teams as they go online on AWS and helps create the product backlog for CEE.

Cloud Enablement Engine Responsibilities:

  • Build out foundational building blocks utilizing AWS multi-account strategy
  • Build security guardrails, compliance controls, infrastructure as code automation, auditing and monitoring controls
  • Implement cloud platform backlog that funnel from CBO as common asks from app teams
  • Work with our AWS team to understand service roadmap, future releases, and provide feedback

Cloud Business Office Responsibilities:

  • Identify and prioritize repeating technical building blocks that cuts across multiple teams
  • Establish acceptable architecture patterns based on application use cases
  • Manage cloud programs to ensure CEE deliverables and business expectations align
  • Identify skilling needs, budget, and track spend
  • Contribute to the cloud platform backlog
  • Work with AWS team to understand service roadmap, future releases, and provide feedback

These teams were set up to scale AWS adoption, put building blocks into the hands of the applications teams, and ultimately deliver differentiated capabilities to Broadridge’s business teams and end customers. The following diagram translates the relationship and modus operandi among the teams:

CEE and CBO working model

Upon establishing the conceptual working model, the CBO and CEE teams looked at solutions from AWS to enable them to achieve the working model quickly. The starting point was AWS Landing Zone (ALZ). ALZ is an AWS solution based on the AWS multi-account strategy. It is a set of vetted constructs and best practices that we use as mechanisms to accelerate AWS adoption.

AWS multi-account strategy

The multi-account strategy employs best practices around separation of concerns, reduction of blast radius, account setup based on Software Development Life Cycle (SDLC) phases, and base operational roles for auditing, monitoring, security, and compliance, as shown in the above diagram. This strategy defines the need for having centralized shared or core accounts, which works as the master account for monitoring, governance, security, and auditing. A number of AWS services like Amazon GuardDuty, AWS Security Hub, and AWS Config configurations are set in these centralized accounts. Spoke or child accounts are vended as per a team’s requirement which are spun up with these governance, monitoring, and security defaults connected to the centralized account for log capturing, threat detection, configuration management, and security management.

The third milestone

The third milestone is synonymous with the Foundation stage of adoption

Using the ALZ construct, our CEE team developed a core set of principles to be used by every application team. Based on our core tenets, the CEE team built out an entry point (a web-based UI workflow application). This web UI was the entry point for any application team requesting an environment within AWS for experimentation or to begin the application development life cycle. Simplistically, the web UI sat on top of an automation engine built using APIs from AWS, ALZ components (Account Vending Machine, Shared Services Account, Logging Account, Security Account, default security groups, default IAM roles, and AD groups), and Terraform based code. The CBO team helped establish the common architecture patterns that was codified into this engine.

Team on-boarding workflow using foundational building blocks on AWS

An Angular based web UI is the starting point for application team to request for the AWS accounts. The web UI entry point asks a number of questions validating the type of account requested along with its intended purpose, ingress/egress requirements, high availability and disaster recovery requirements, business unit for charge back and ownership purposes. Once all information is entered, it sends out a notification based on a preset organization dispatch matrix rule. Upon receiving the request, the approver has the option to approve it or asks further clarification question. Once satisfactorily answered the approver approves the account vending request and a Terraform code is kicked in to create the default account.

When an account is created through this process, the following defaults are set up for a secure environment for development, testing, and staging. Similar guardrails are deployed in the production accounts as well.

  • Creates a new account under an existing AWS Organizational Unit (OU) based on the input parameters. Tags the chargeback codes, custom tags, and also integrates the resources with existing CMDB
  • Connects the new account to the master shared services and logging account as per the AWS Landing Zone constructs
  • Integrates with the CloudWatch event bus as a sender account
  • Runs stsAssumeRole commands on the new account to create infosec cross-account roles
  • Defines actions, conditions, role limits, and account policies
  • Creates environment variables related to the account in the parameter store within AWS Systems Manager
  • Connects the new account to TrendMicro for AV purposes
  • Attaches the default VPC of the new account to an existing AWS Transit Gateway
  • Generates a Splunk key for the account to store in the Splunk KV store
  • Uses AWS APIs to attach Enterprise support to the new account
  • Creates or amends a new AD group based on the IAM role
  • Integrates as an Amazon Macie member account
  • Enables AWS Security Hub for the account by running an enable-security-hub call
  • Sets up Chef runner for the new account
  • Runs account setting lock procedures to set Amazon S3 public settings, EBS default encryption setting
  • Enable firewall by setting AWS WAF rules for the account
  • Integrates the newly created account with CloudHealth and Dome9

Deploying all these guardrails in any new accounts removes the need for manual setup and intervention. This gives application developers the needed freedom to stop worrying about infrastructure and access provisioning while giving them a higher speed to value.

Using these technical and procedural cloud adoption constructs, we have been able to reduce application onboarding time. This has led to quicker delivery of business capability with the application teams focusing only on what differentiates their business rather than repeatedly building undifferentiated work products. This has also led to creation of mature building blocks over time for use of the application teams. Using these building blocks the teams are also modernizing applications by iteratively replacing old application blocks.

Conclusion

In summary, we are able to deliver better business outcomes and differentiated customer experience by:

  • Building common asks as reusable and automated enterprise assets and improving the overall enterprise-wide maturity by indexing on and growing these assets.
  • Depending on an experienced team to deliver baseline operational controls and guardrails.
  • Improving their security posture with higher-level and managed AWS security services instead of rebuilding everything from the ground up.
  • Using the Cloud Business Office to improve funneling of common asks. This helps the next team on AWS to benefit from a readily available set of approved services and application blueprints.

We will continue to build on and maturing these reusable building blocks by using AWS services and new feature releases.

 

The content and opinions in this blog are those of the third-party author and AWS is not responsible for the content or accuracy of this post.