Tag Archives: AWS Cloud9

Updating opt-in status for Amazon Pinpoint channels

Post Syndicated from Varinder Dhanota original https://aws.amazon.com/blogs/messaging-and-targeting/updating-opt-in-status-for-amazon-pinpoint-channels/

In many real-world scenarios, customers are using home-grown or 3rd party systems to manage their campaign related information. This includes user preferences, segmentation, targeting, interactions, and more. To create customer-centric engagement experiences with such existing systems, migrating or integrating into Amazon Pinpoint is needed. Luckily, many AWS services and mechanisms can help to streamline this integration in a resilient and cost-effective way.

In this blog post, we demonstrate a sample solution that captures changes from an on-premises application’s database by utilizing AWS Integration and Transfer Services and updates Amazon Pinpoint in real-time.

If you are looking for a serverless, mobile-optimized preference center allowing end users to manage their Pinpoint communication preferences and attributes, you can also check the Amazon Pinpoint Preference Center.

Architecture

Architecture

In this scenario, users’ SMS opt-in/opt-out preferences are managed by a home-grown customer application. Users interact with the application over its web interface. The application, saves the customer preferences on a MySQL database.

This solution’s flow of events is triggered with a change (insert / update / delete) happening in the database. The change event is then captured by AWS Database Migration Service (DMS) that is configured with an ongoing replication task. This task continuously monitors a specified database and forwards the change event to an Amazon Kinesis Data Streams stream. Raw events that are buffered in this stream are polled by an AWS Lambda function. This function transforms the event, and makes it ready to be passed to Amazon Pinpoint API. This API call will in turn, change the opt-in/opt-out subscription status of the channel for that user.

Ongoing replication tasks are created against multiple types of database engines, including Oracle, MS-SQL, Postgres, and more. In this blog post, we use a MySQL based RDS instance to demonstrate this architecture. The instance will have a database we name pinpoint_demo and one table we name optin_status. In this sample, we assume the table is holding details about a user and their opt-in preference for SMS messages.

userid phone optin lastupdate
user1 +12341111111 1 1593867404
user2 +12341111112 1 1593867404
user2 +12341111113 1 1593867404

Prerequisites

  1. AWS CLI is configured with an active AWS account and appropriate access.
  2. You have an understanding of Amazon Pinpoint concepts. You will be using Amazon Pinpoint to create a segment, populate endpoints, and validate phone numbers. For more details, see the Amazon Pinpoint product page and documentation.

Setup

First, you clone the repository that contains a stack of templates to your local environment. Make sure you have configured your AWS CLI with AWS credentials. Follow the steps below to deploy the CloudFormation stack:

  1. Clone the git repository containing the CloudFormation templates:
    git clone https://github.com/aws-samples/amazon-pinpoint-rds-integration.git
    cd amazon-pinpoint-rds-integration
  2. You need an S3 Bucket to hold the template:
    aws s3 create-bucket –bucket <YOUR-BUCKET-NAME>
  3. Run the following command to package the CloudFormation templates:
    aws cloudformation package --template-file template_stack.yaml --output-template-file template_out.yaml --s3-bucket <YOUR-BUCKET-NAME>
  4. Deploy the stack with the following command:
    aws cloudformation deploy --template-file template_out.yaml --stack-name pinpointblogstack --capabilities CAPABILITY_AUTO_EXPAND CAPABILITY_NAMED_IAM

The AWS CloudFormation stack will create and configure resources for you. Some of the resources it will create are:

  • Amazon RDS instance with MySQL
  • AWS Database Migration Service replication instance
  • AWS Database Migration Service source endpoint for MySQL
  • AWS Database Migration Service target endpoint for Amazon Kinesis Data Streams
  • Amazon Kinesis Data Streams stream
  • AWS Lambda Function
  • Amazon Pinpoint Application
  • A Cloud9 environment as a bastion host

The deployment can take up to 15 minutes. You can track its progress in the CloudFormation console’s Events tab.

Populate RDS data

A CloudFormation stack will output the DNS address of an RDS endpoint and Cloud9 environment upon completion. The Cloud9 environment acts as a bastion host and allows you to reach the RDS instance endpoint deployed into the private subnet by CloudFormation.

  1. Open the AWS Console and navigate to the Cloud9 service.
    Cloud9Console
  2. Click on the Open IDE button to reach your IDE environment.
    Cloud9Env
  3. At the console pane of your IDE, type the following to login to your RDS instance. You can find the RDS Endpoint address at the outputs section of the CloudFormation stack. It is under the key name RDSInstanceEndpoint.
    mysql -h <YOUR_RDS_ENDPOINT> -uadmin -pmypassword
    use blog_db;
  4. Issue the following command to create a table that holds the user’s opt-in status:
    create table optin_status (
      userid varchar(50) not null,
      phone varchar(50) not null,
      optin tinyint default 1,
      lastupdate TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
  5. Next, load sample data into the table. The following inserts nine users for this demo:
    
    INSERT INTO optin_status (userid, phone, optin) VALUES ('user1', '+12341111111', 1);
    INSERT INTO optin_status (userid, phone, optin) VALUES ('user2', '+12341111112', 1);
    INSERT INTO optin_status (userid, phone, optin) VALUES ('user3', '+12341111113', 1);
    INSERT INTO optin_status (userid, phone, optin) VALUES ('user4', '+12341111114', 1);
    INSERT INTO optin_status (userid, phone, optin) VALUES ('user5', '+12341111115', 1);
    INSERT INTO optin_status (userid, phone, optin) VALUES ('user6', '+12341111116', 1);
    INSERT INTO optin_status (userid, phone, optin) VALUES ('user7', '+12341111117', 1);
    INSERT INTO optin_status (userid, phone, optin) VALUES ('user8', '+12341111118', 1);
    INSERT INTO optin_status (userid, phone, optin) VALUES ('user9', '+12341111119', 1);
  6. The table’s opt-in column holds the SMS opt-in status and phone number for a specific user.

Start the DMS Replication Task

Now that the environment is ready, you can start the DMS replication task and start watching the changes in this table.

  1. From the AWS DMS Console, go to the Database Migration Tasks section.
    DMSMigTask
  2. Select the Migration task named blogreplicationtask.
  3. From the Actions menu, click on Restart/Resume to start the migration task. Wait until the task’s Status transitions from Ready to Starting and Replication ongoing.
  4. At this point, all the changes on the source database are replicated into a Kinesis stream. Before introducing the AWS Lambda function that will be polling this stream, configure the Amazon Pinpoint application.

Inspect the AWS Lambda Function

An AWS Lambda function has been created to receive the events. The Lambda function uses Python and Boto3 to read the records delivered by Kinesis Data Streams. It then performs the update_endpoint API calls in order to add, update, or delete endpoints in the Amazon Pinpoint application.

Lambda code and configuration is accessible through the Lambda Functions Console. In order to inspect the Python code, click the Functions item on the left side. Select the function starting with pinpointblogstack-MainStack by clicking on the function name.

Note: The PINPOINT_APPID under the Environment variables section. This variable provides the Lambda function with the Amazon Pinpoint application ID to make the API call.

LambdaPPAPPID

Inspect Amazon Pinpoint Application in Amazon Pinpoint Console

A Pinpoint application is needed by the Lambda Function to update the endpoints. This application has been created with an SMS Channel by the CloudFormation template. Once the data from the RDS database has been imported into Pinpoint as SMS endpoints, you can validate this import by creating a segment in Pinpoint.

PinpointProject

Testing

With the Lambda function ready, you now test the whole solution.

  1. To initiate the end-to-end test, go to the Cloud9 terminal. Perform the following SQL statement on the optin_table:
    UPDATE optin_status SET optin=0 WHERE userid='user1';
    UPDATE optin_status SET optin=0 WHERE userid='user2';
    UPDATE optin_status SET optin=0 WHERE userid='user3';
    UPDATE optin_status SET optin=0 WHERE userid='user4';
  2. This statement will cause four changes in the database which is collected by DMS and passed to Kinesis Data Streams stream.
  3. This triggers the Lambda function that construct an update_endpoint API call to the Amazon Pinpoint application.
  4. The update_endpoint operation is an upsert operation. Therefore, if the endpoint does not exist on the Amazon Pinpoint application, it creates one. Otherwise, it updates the current endpoint.
  5. In the initial dataset, all the opt-in values are 1. Therefore, these endpoints will be created with an OptOut value of NONE in Amazon Pinpoint.
  6. All OptOut=NONE typed endpoints are considered as active endpoints. Therefore, they are available to be used within segments.

Create Amazon Pinpoint Segment

  1. In order to see these changes, go to the Pinpoint console. Click on PinpointBlogApp.
    PinpointConsole
  2. Click on Segments on the left side. Then click Create a segment.
    PinpointSegment
  3. For the segment name, enter US-Segment.
  4. Select Endpoint from the Filter dropdown.
  5. Under the Choose an endpoint attribute dropdown, select Country.
  6. For Choose values enter US.
    Note: As you do this, the right panel Segment estimate will refresh to show the number of endpoints eligible for this segment filter.
  7. Click Create segment at the bottom of the page.
    PinpointSegDetails
  8. Once the new segment is created, you are directed to the newly created segment with configuration details. You should see five eligible endpoints corresponding to database table rows.
    PinpointSegUpdate
  9. Now, change one row by issuing the following SQL statement. This simulates a user opting out from SMS communication for one of their numbers.
    UPDATE optin_status SET optin=0 WHERE userid='user5';
  10. After the update, go to the Amazon Pinpoint console. Check the eligible endpoints again. You should only see four eligible endpoints.

PinpointSegUpdate

Cleanup

If you no longer want to incur further charge, delete the Cloudformation stack named pinpointblogstack. Select it and click Delete.

PinpointCleanup

Conclusion

This solution walks you through how opt-in change events are delivered from Amazon RDS to Amazon Pinpoint. You can use this solution in other use cases as well. Some examples are importing segments from a 3rd party application like Salesforce and importing other types of channels like e-mail, push, and voice. To learn more about Amazon Pinpoint, visit our website.

Agile website delivery with Hugo and AWS Amplify

Post Syndicated from Nigel Harris original https://aws.amazon.com/blogs/devops/agile-website-delivery-with-hugo-and-aws-amplify/

In this post, we show how you can rapidly configure and deploy a website using Hugo (an AWS Cloud9 integrated development environment (IDE) for content editing), AWS CodeCommit for source code control, and AWS Amplify to implement a source code-controlled, automated deployment process.

When hosting a website on AWS, you can choose from several options. One popular option is to use Amazon Simple Storage Service (Amazon S3) to host a static website. If you prefer full access to the infrastructure hosting your website, you can use the NGINX Quick Start to quickly deploy web server infrastructure using AWS CloudFormation.

Static website generators such as Hugo and MkDocs accelerate the website content generation process, and can be a valuable tool when trying to rapidly deliver technical documentation or similar content. Typically, the content creation process requires programming in HTML and CSS.

Hugo is written in Go and available under the Apache 2.0 license. It provides several themes (collections of layouts) that accelerate website creation by drastically reducing the need to focus on format. You can author content in Markdown and output in multiple languages and formats (including ebook formats). Excellent examples of public websites built using Hugo include Digital.gov and Kubernetes.io.

 

Solution overview

This solution illustrates how to provision a hosted, source code-controlled Hugo generated website using CodeCommit and Amplify Console. The provisioned website is configured with a custom subdomain and an SSL certificate. We use an AWS Cloud9 IDE to enable content creation in the cloud.

 

Setting up an AWS Cloud9 IDE

Start by provisioning an AWS Cloud9 IDE. AWS Cloud9 environments run using Amazon Elastic Compute Cloud (Amazon EC2). You need to provision your AWS Cloud9 environment into an existing public subnet in an Amazon Virtual Private Cloud (Amazon VPC) within your AWS account. You can complete this in the following steps:

1. Access your AWS account using with an identity with administrative privileges. If you don’t have an AWS account, you can create one.

2. Create a new AWS Cloud9 environment using the wizard on the AWS Cloud9 console.

3. Enter a name for your desktop and an optional description.

4. Choose Next step.

Naming your Cloud 9 environment

5. In the Environment settings section, for Environment type, select Create a new EC2 instance for environment (direct access).

6. For Instance type, select your preferred instance type (the default, t2.micro, works for this use case)

7. Under Network settings, for Network (VPC), choose a VPC that you wish to deploy your AWS Cloud9 instance into. You may wish to use your default VPC, which is suitable for the purpose of this tutorial.

8. Choose a public subnet from this VPC for deployment.

Cloud9 Settings

9. Leave all other settings unchanged and choose Next step.
10. Review your choices and choose Create environment.

Environment creation takes a few minutes to complete. When the environment is ready, you receive access to the AWS Cloud9 IDE in your browser. We return to it shortly to develop content for your Hugo website.

Your Cloud9 Desktop

Configuring a source code repository to track content changes

Static website generators enable rapid changes to website content and layout. Source control management (SCM) systems provide a revision history for your code, and allow you to revert to previous versions of a project when unintended changes are introduced. SCM systems become increasingly important as the velocity of change and the number of team members introducing change increases.

You now create a source code repository to track changes to your content. You use CodeCommit, a fully-managed source control service that hosts secure Git-based repositories.

1. In a new browser, sign in to the CodeCommit Console and create a new repository.

2. For Repository name, enter amplify-website.

3. For Description, enter an appropriate description.

4. Choose Create.

Create repository

Repository creation takes just a few moments.

5. In the Connection steps section, choose the appropriate method to connect to your repository based on how you accessed your AWS account.

For this post, I signed in to my AWS account using federated access, so I choose the HTTPS Git Remote CodeCommit (HTTPS-GRC) tab. This is the recommended connection method for this sign-in type. You can also configure a connection to your repository using SSH or Git credentials over HTTPS. SSH and Git credentials over HTTPS are appropriate methods if you have signed in to your AWS account as an AWS Identity and Access Management (IAM) user. The Amazon CodeCommit console provides additional information regarding each of these connection types, including links to supporting documentation.

Connect to Repo

 

Configuring and deploying an example website

You’re now ready to configure and deploy your website.

1. Return to the browser with your AWS Cloud9 IDE and place your cursor in the lower terminal pane of the IDE.

The terminal pane provides Bash shell access on the EC2 instance running AWS Cloud9.

You now create a Hugo website. The website design is based on Hugo-theme-learn. Themes are collections of Hugo layouts that take all the hassle out of building your website. Learn is a multilingual-ready theme authored by Mathieu Cornic, designed for building technical documentation websites.

Hugo provides a variety of themes on their website. Many of the themes include bundled example website content that you can easily adapt by following the accompanying theme documentation.

2. Enter the following code to download an existing example website stored as a .zip file, extract it, and commit the contents into CodeCommit from your AWS Cloud9 IDE:

cd ~/environment
aws s3 cp s3://ee-assets-prod-us-east-1/modules/3c5ba9cb6ff44465b96993d210f67147/v1/example-website.zip ~/environment/example-website.zip
unzip example-website.zip
rm example-website.zip

The following screenshot shows your output.

example website copy commands

 

Next, we run commands to create a directory to host your website and copy files into place from the example website to get started. We then create a new default branch called main (formerly referred to as the master branch), local to our AWS Cloud9 instance. We then copy files into place from the example website. After adding and committing them locally, we push all our changes to the remote Amazon Codecommit repository.

3. Enter the following code:

mkdir ~/environment/amplify-website/
cd ~/environment/amplify-website/
git init
git remote add origin codecommit::us-east-1://amplify-website
git remote -v
git checkout -b main
cp -rp ~/environment/example-website/* ~/environment/amplify-website/
git add *
git commit -am "first commit"
git push -u origin main

Deployment and hosting is achieved by using Amplify Console, a static web hosting service that accelerates your application release cycle by providing a simple CI/CD workflow for building and deploying static web applications.

4. On the Amplify console, under Deploy, choose Get Started.

Amplify banner

5. On the Get started with the Amplify Console page, select AWS CodeCommit as your source code repository.

6. Choose Continue.

Amplify get started page

7. On the Add repository branch page, for Recently updated repositories, choose your repository.

8. For Branch, choose main.

9. Choose Next.

add branch

On the Configure build settings page, Amplify automatically uses the amplify.yml file for build settings for your deployment. You committed this into your source code repository in the previous step. The amplify.yml file is detected from the root of your website directory structure.

10. Choose Next.

Amplify configure build settings

11. On the review page, choose Save and deploy.

Amplify builds and deploys your Amplify website within minutes, and shows you its progress. When deployment is complete, you can access the website to see the sample content.

amplify website

The following screenshot shows your example website.

sample website

 

Promoting changes to the website

We can now update the line of text in the home page and commit and publish this change.

1. Return to the browser with your AWS Cloud9 IDE and place your cursor in the lower terminal pane of the IDE.

2. On the navigation pane, choose the file ~/environment/amplify-website/workshop/content/_index.en.md.

The contents of the file open under a new tab in the upper pane.

3. Change the string First Line of Text to First Update to Website.

content change

4. From the File menu, choose Save to save the changes you have made to the _index.en.md file.

save content changes

5. Commit the changes and push to CodeCommit by running the following command in the lower terminal pane in AWS Cloud9:

git add *; git commit -am "homepage update"; git push origin main

The output in your AWS Cloud9 terminal should appear similar to the following screenshot.

commit output

6. Return to the Amplify Console and observe how the committed change in CodeCommit is automatically detected. Amplify runs deployment steps to push your changes to the website.

amplify deploy changes

7. Access the URL of your website after this update is complete to verify that the first line of text on your home page has changed.

updated website

You can repeat this process to make source-code controlled, automated changes to your website.

Adding a custom domain

Adding a custom domain to your Amplify configuration makes it easier for clients to access your content. You can register new domains using Amazon Route 53 or, if you have an existing domain registered outside of AWS, you can integrate it with Route 53 and Amplify. For our use case, the domain www.hugoonamplify.com is a registered a domain name using a third-party registrar (NameCheap). You can manage DNS configurations for domains registered outside of AWS using Route 53.

Start by configuring a public hosted zone in Route 53.

1. On the Route 53 console, choose Hosted zones.

2. Choose Create hosted zone.

hosted zones

3. For Domain name, enter hugoonamplify.com.

4. For Description, enter an appropriate description.

5. For Type, select Public hosted zone.

hosted zones configuration

6. Choose Create hosted zone.

7. Save the addresses of the name servers that respond to client DNS lookup requests for the custom domain.

create hosted zone

8. In a separate browser, access the console of your DNS registrar.

9. Configure a custom DNS name servers setting on the console of the third-party domain name registrar.

This configuration specifies the Route 53 assigned name servers as authoritative DNS for our custom domain. For this use case, propagation of this change may take up to 48 hours.

namecheap console

10. Use https://who.is to verify that the AWS name servers are listed correctly for your custom domain to internet clients.

whois lookup

You can now set up your custom domain in Amplify. Amplify helps you configure DNS and set up SSL for your desired custom domain.

domain management

11. On the Amplify Console, under App settings, choose Domain management.

12. Choose Add domain.

13. For Domain, enter your custom domain name (hugoonamplify.com).

14. Choose Configure domain.

15. For Subdomain, I only want to set up www and choose to exclude the root of my custom domain.

16. Choose Save.

Amplify begins the process of creating the SSL certificates. Amplify sends a notification that it’s issuing an SSL certificate to secure traffic to the custom domain.

ssl domain management

After a few moments, it proceeds to SSL configuration and indicates that ownership of domain is in progress.

ssl domain management configuration

Amplify verifies domain ownership by creating a sample CNAME record in your hosted zone file. When ownership is verified, the domain is propagated onto an Amazon CloudFront distribution managed by the Amplify service, and domain activation is complete.

ssl domain management configured

Clients can now access the website using the custom domain name www.hugoonaplify.com.

access website via custom domain

 

Establishing a subdomain for development

You can create a development website in Amplify that is aligned to a development code branch in CodeCommit that enables testing changes prior to production release.

1. Access the AWS Cloud9 IDE and use the terminal to enter the following commands to create a development branch and push changes to CodeCommit using the current content from the main branch with a single content change:

git checkout -b development
git branch
git remote -v
git add *; git commit -am "first development commit";
git push -u origin development

2. Open and edit the file ~/environment/amplify-website/workshop/content/_index.en.md and change the string Update to Website to something else.

Alternatively, run the following Unix sed command from the terminal in AWS Cloud9 to make that content change:

sed -i 's/Update to Website/Update to Development/g' ~/environment/amplify-website/workshop/content/_index.en.md

3. Commit and push your change with the following code:

git add *; git commit -am "second development commit"; git push -u origin development

You now configure a subdomain in Amplify to allow developers to review changes.

4. Return to the amplify-website app.

5. Choose Connect branch.

connect branch

6. For Branch, choose the development branch you created and committed code into.

7. Choose Next.

add development branch

Amplify builds a second website based on the contents of the development branch. You can see the instance of your website matched to the development code branch on Amplify Console.

amplify two branches

8. Access the domain management menu item in your Amplify application to add a friendly subdomain.

9. Edit the domain and add a subdomain item with a name of your choice (for example, dev).

10. Associate it to the development branch containing the committed code and content changes.

11. Choose Add.

add dev domain

You can access the subdomain to verify the changes.

verify domain

Controlling access to development

You may wish to restrict access to new content as it’s deployed into the development website.

1. On Amplify Console, choose your application.

2. Choose Access control.

3. Under Access control settings, choose your preferred settings.

You have the option to restrict access globally or on a branch-by-branch basis. For this use case, we create a simple password protection for a user named developer on the development branch and site.

access control settings

 

Cleaning up

Unless you plan to keep the website you have constructed, you can quickly clean up provisioned assets and avoid any unnecessary costs.

1. On Amplify Console, select the app you created.

2. From the Actions drop-down menu, choose Delete app.

3. In the pop-up window, confirm the deletion.

4. On the CodeCommit dashboard, select the repository you created.

5. Choose Delete.

6. In the pop-up window, confirm the deletion.

7. On the AWS Cloud9 dashboard, select the IDE you created.

8. Choose Delete.

9. In the pop-up window, confirm the deletion.

 

Conclusion

Hugo is a powerful tool that enables accelerated delivery of content in a variety of formats including image portfolios, online resume presentation, blogging, and technical documentation. Amplify Console provides a convenient, easy-to-use, static web hosting service that can greatly accelerate delivery of static content.

When combining Hugo with Amplify Console, you can rapidly deploy websites in minutes with features such as friendly URLS, environments matched to code branches, and encryption (SSL). Visit gohugo.io to find out more about Hugo. For more information about how Amplify Console can help you rapidly deploy Hugo and other modern web applications, see the AWS Amplify Console User Guide.

Nigel Harris

Nigel Harris

Nigel Harris is an Enterprise Solutions Architect at Amazon Web Services. He works with AWS customers to provide guidance and technical assistance on AWS architectures.

Isolating network access to your AWS Cloud9 environments

Post Syndicated from Brandon Wu original https://aws.amazon.com/blogs/security/isolating-network-access-to-your-aws-cloud9-environments/

In this post, I show you how to create isolated AWS Cloud9 environments for your developers without requiring ingress (inbound) access from the internet. I also walk you through optional steps to further isolate your AWS Cloud9 environment by removing egress (outbound) access. Until recently, AWS Cloud9 required you to allow ingress Secure Shell (SSH) access from authorized AWS Cloud9 IP addresses. Now AWS Cloud 9 allows you to create and run your development environments within your isolated Amazon Virtual Private Cloud (Amazon VPC), without direct connectivity from the internet, adding an additional layer of security.

AWS Cloud9 is an integrated development environment (IDE) that lets you write, run, edit, and debug code using only a web browser. Developers who use AWS Cloud9 have access to an isolated environment where they can innovate, experiment, develop, and perform early testing without impacting the overall security and stability of other environments. By using AWS Cloud9, you can store your code securely in a version control system (like AWS CodeCommit), configure your AWS Cloud9 EC2 development environments to use encrypted Amazon Elastic Block Store (Amazon EBS) volumes, and share your environments within the same account.

Solution overview

Before enhanced virtual private cloud (VPC) support was available, AWS Cloud9 required you to allow ingress Secure Shell (SSH) access from authorized AWS Cloud9 IP addresses in order to use the IDE. The addition of private VPC support enables you to create and run AWS Cloud9 environments in private subnets without direct connectivity from the internet. You can use VPC security groups to configure the ingress and egress traffic that you allow, or choose to disallow all traffic.

Since this feature uses AWS Systems Manager to support using AWS Cloud9 in private subnets, it’s worth taking a minute to read and understand a bit about it before you continue. Systems Manager Session Manager provides an interactive shell connection between AWS Cloud9 and its associated Amazon Elastic Compute Cloud (Amazon EC2) instance in the Amazon Virtual Private Cloud (Amazon VPC). The AWS Cloud9 instance initiates an egress connection to the Session Manager service using the pre-installed Systems Manager agent. In order to use this feature, your developers must have access to instances managed by Session Manager in their IAM policy.

When you create an AWS Cloud9 no-ingress EC2 instance (with access via Systems Manager) into a private subnet, its security group doesn’t have an ingress rule to allow incoming network traffic. The security group does, however, have an egress rule that permits egress traffic from the instance. AWS Cloud9 requires this to download packages and libraries to keep the AWS Cloud9 IDE up to date.

If you want to prevent egress connectivity in addition to ingress traffic for the instance, you can configure Systems Manager to use an interface VPC endpoint. This allows you to restrict egress connections from your environment and ensure the encrypted connections between the AWS Cloud9 EC2 instance and Systems Manager are carried over the AWS global network. The architecture of accessing your AWS Cloud9 instance using Systems Manager and interface VPC endpoints is shown in Figure 1.
 

Figure 1: Accessing AWS Cloud9 environment via AWS Systems Manager and Interface VPC Endpoints

Figure 1: Accessing AWS Cloud9 environment via AWS Systems Manager and Interface VPC Endpoints

Note: The use of interface VPC endpoints incurs an additional charge for each hour your VPC endpoints remain provisioned. This is in addition to the AWS Cloud9 EC2 instance cost.

Prerequisites

You must have a VPC configured with an attached internet gateway, public and private subnets, and a network address translation (NAT) gateway created in your public subnet. Your VPC must also have DNS resolution and DNS hostnames options enabled. To learn more, you can visit Working with VPCs and subnets, Internet gateways, and NAT gateways.

You must also give your developers access to their AWS Cloud9 environments managed by Session Manager.

AWS Cloud9 requires egress access to the internet for some features, including downloading required libraries or packages needed for updates to the IDE and running AWS Lambda functions. If you don’t want to allow egress internet access for your environment, you can create your VPC without an attached internet gateway, public subnet, and NAT gateway.

Implement the solution

To set up AWS Cloud9 with access via Systems Manager:

  1. Optionally, if no egress access is required, set up interface VPC endpoints for Session Manager
  2. Create a no-ingress Amazon EC2 instance for your AWS Cloud9 environment

(Optional) Set up interface VPC endpoints for Session Manager

Note: For no-egress environments only.

You can skip this step if you don’t need your VPC to restrict egress access. If you need your environment to restrict egress access, continue.

Start by using the AWS Management Console to configure Systems Manager to use an interface VPC endpoint (powered by AWS PrivateLink). If you’d prefer, you can use this custom AWS CloudFormation template to configure the VPC endpoints.

Interface endpoints allow you to privately access Amazon EC2 and System Manager APIs by using a private IP address. This also restricts all traffic between your managed instances, Systems Manager, and Amazon EC2 to the Amazon network. Using the interface VPC endpoint, you don’t need to set up an internet gateway, a NAT device, or a virtual private gateway.

To set up interface VPC endpoints for Session Manager

  1. Create a VPC security group to allow ingress access over HTTPS (port 443) from the subnet where you will deploy your AWS Cloud9 environment. This is applied to your interface VPC endpoints to allow connections from your AWS Cloud9 instance to use Systems Manager.
  2. Create a VPC endpoint.
  3. In the list of Service Names, select com.amazonaws.<region>.ssm service as shown in Figure 2.
     
    Figure 2: AWS PrivateLink service selection filter

    Figure 2: AWS PrivateLink service selection filter

  4. Select your VPC and private Subnets you want to associate the interface VPC endpoint with.
  5. Choose Enable for this endpoint for the Enable DNS name setting.
  6. Select the security group you created in Step 1.
  7. Add any optional tags for the interface VPC endpoint.
  8. Choose Create endpoint.
  9. Repeat Steps 2 through 8 to create interface VPC endpoints for the com.amazonaws.<region>.ssmmessages and com.amazonaws.<region>.ec2messages services.
  10. When all three interface VPC endpoints have a status of available, you can move to the next procedure.

Create a no-ingress Amazon EC2 instance for your AWS Cloud9 environment

Deploy a no-ingress Amazon EC2 instance for your AWS Cloud9 environment using the console. Optionally, you can use this custom AWS CloudFormation template to create the no-ingress Amazon EC2 instance. You can also use the AWS Command Line Interface, or AWS Cloud9 API to set up your AWS Cloud9 environment with access via Systems Manager.

As part of this process, AWS Cloud9 automatically creates three IAM resources pre-configured with the appropriate permissions:

  • An IAM service-linked role (AWSServiceRoleForAWSCloud9)
  • A service role (AWSCloud9SSMAccessRole)
  • An instance profile (AWSCloud9SSMInstanceProfile)

The AWSCloud9SSMAccessRole and AWSCloud9SSMInstanceProfile are attached to your AWS Cloud9 EC2 instance. This service role for Amazon EC2 is configured with the minimum permissions required to integrate with Session Manager. By default, AWS Cloud9 makes managed temporary AWS access credentials available to you in the environment. If you need to grant additional permissions to your AWS Cloud9 instance to access other services, you can create a new role and instance profile and attach it to your AWS Cloud9 instance.

By default, your AWS Cloud9 environment is created with a VPC security group with no ingress access and allowing egress access so the AWS Cloud9 IDE can download required libraries or packages needed for urgent updates to IDE plugins. You can optionally configure your AWS Cloud9 environment to restrict egress access by removing the egress rules in the security group. If you restrict egress access, some features won’t work (for example, the AWS Lambda plugin and updates to IDE plugins).

To use the console to create your AWS Cloud9 environment

  1. Navigate to the AWS Cloud9 console.
  2. Select Create environment on the top right of the console.
  3. Enter a Name and Description.
  4. Select Next step.
  5. Select Create a new no-ingress EC2 instance for your environment (access via Systems Manager) as shown in Figure 3.
     
    Figure 3: AWS Cloud9 environment settings

    Figure 3: AWS Cloud9 environment settings

  6. Select your preferred Instance type, Platform, and Cost-saving setting.
  7. You can optionally configure the Network settings to select the Network (VPC) and private Subnet to create your AWS Cloud9 instance.
  8. Select Next step.

Your AWS Cloud9 environment is ready to use. You can access your AWS Cloud9 environment console via Session Manager using encrypted connections over the AWS global network as shown in Figure 4.
 

Figure 4: AWS Cloud9 instance console access

Figure 4: AWS Cloud9 instance console access

You can see that this AWS Cloud9 connection is using Session Manager by navigating to the Session Manager console and viewing the active sessions as shown in Figure 5.
 

Figure 5: AWS Systems Manager Session Manager active sessions

Figure 5: AWS Systems Manager Session Manager active sessions

Summary

Security teams are charged with providing secure operating environments without inhibiting developer productivity. With the ability to deploy your AWS Cloud9 environment instances in a private subnet, you can provide a seamless experience for developing applications using the AWS Cloud9 IDE while enabling security teams to enforce key security controls to protect their corporate networks and intellectual property.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Cloud9 forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Brandon Wu

Brandon is a security solutions architect helping financial services organizations secure their critical workloads on AWS. In his spare time, he enjoys exploring outdoors and experimenting in the kitchen.

Jump-starting your serverless development environment

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/jump-starting-your-serverless-development-environment/

Developers building serverless applications often wonder how they can jump-start their local development environment. This blog post provides a broad guide for those developers wanting to set up a development environment for building serverless applications.

serverless development environment

AWS and open source tools for a serverless development environment .

To use AWS Lambda and other AWS services, create and activate an AWS account.

Command line tooling

Command line tools are scripts, programs, and libraries that enable rapid application development and interactions from within a command line shell.

The AWS CLI

The AWS Command Line Interface (AWS CLI) is an open source tool that enables developers to interact with AWS services using a command line shell. In many cases, the AWS CLI increases developer velocity for building cloud resources and enables automating repetitive tasks. It is an important piece of any serverless developer’s toolkit. Follow these instructions to install and configure the AWS CLI on your operating system.

AWS enables you to build infrastructure with code. This provides a single source of truth for AWS resources. It enables development teams to use version control and create deployment pipelines for their cloud infrastructure. AWS CloudFormation provides a common language to model and provision these application resources in your cloud environment.

AWS Serverless Application Model (AWS SAM CLI)

AWS Serverless Application Model (AWS SAM) is an extension for CloudFormation that further simplifies the process of building serverless application resources.

It provides shorthand syntax to define Lambda functions, APIs, databases, and event source mappings. During deployment, the AWS SAM syntax is transformed into AWS CloudFormation syntax, enabling you to build serverless applications faster.

The AWS SAM CLI is an open source command line tool used to locally build, test, debug, and deploy serverless applications defined with AWS SAM templates.

Install AWS SAM CLI on your operating system.

Test the installation by initializing a new quick start project with the following command:

$ sam init
  1. Choose 1 for the “Quick Start Templates
  2. Choose 1 for the “Node.js runtime
  3. Use the default name.

The generated /sam-app/template.yaml contains all the resource definitions for your serverless application. This includes a Lambda function with a REST API endpoint, along with the necessary IAM permissions.

Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      CodeUri: hello-world/
      Handler: app.lambdaHandler
      Runtime: nodejs12.x
      Events:
        HelloWorld:
          Type: Api # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
          Properties:
            Path: /hello
            Method: get

Deploy this application using the AWS SAM CLI guided deploy:

$ sam deploy -g

Local testing with AWS SAM CLI

The AWS SAM CLI requires Docker containers to simulate the AWS Lambda runtime environment on your local development environment. To test locally, install Docker Engine and run the Lambda function with following command:

$ sam local invoke "HelloWorldFunction" -e events/event.json

The first time this function is invoked, Docker downloads the lambci/lambda:nodejs12.x container image. It then invokes the Lambda function with a pre-defined event JSON file.

Helper tools

There are a number of open source tools and packages available to help you monitor, author, and optimize your Lambda-based applications. Some of the most popular tools are shown in the following list.

Template validation tooling

CloudFormation Linter is a validation tool that helps with your CloudFormation development cycle. It analyses CloudFormation YAML and JSON templates to resolve and validate intrinsic functions and resource properties. By analyzing your templates before deploying them, you can save valuable development time and build automated validation into your deployment release cycle.

Follow these instructions to install the tool.

Once, installed, run the cfn-lint command with the path to your AWS SAM template provided as the first argument:

cfn-lint template.yaml
AWS SAM template validation with cfn-lint

AWS SAM template validation with cfn-lint

The following example shows that the template is not valid because the !GettAtt function does not evaluate correctly.

IDE tooling

Use AWS IDE plugins to author and invoke Lambda functions from within your existing integrated development environment (IDE). AWS IDE toolkits are available for PyCharm, IntelliJ. Visual Studio.

The AWS Toolkit for Visual Studio Code provides an integrated experience for developing serverless applications. It enables you to invoke Lambda functions, specify function configurations, locally debug, and deploy—all conveniently from within the editor. The toolkit supports Node.js, Python, and .NET.

The AWS Toolkit for Visual Studio Code

From Visual Studio Code, choose the Extensions icon on the Activity Bar. In the Search Extensions in Marketplace box, enter AWS Toolkit and then choose AWS Toolkit for Visual Studio Code as shown in the following example. This opens a new tab in the editor showing the toolkit’s installation page. Choose the Install button in the header to add the extension.

AWS Toolkit extension for Visual Studio Code

AWS Toolkit extension for Visual Studio Code

AWS Cloud9

Another option to build a development environment without having to install anything locally is to use AWS Cloud9. AWS Cloud9 is a cloud-based integrated development environment (IDE) for writing, running, and debugging code from within the browser.

It provides a seamless experience for developing serverless applications. It has a preconfigured development environment that includes AWS CLI, AWS SAM CLI, SDKs, code libraries, and many useful plugins. AWS Cloud9 also provides an environment for locally testing and debugging AWS Lambda functions. This eliminates the need to upload your code to the Lambda console. It allows developers to iterate on code directly, saving time, and improving code quality.

Follow this guide to set up AWS Cloud9 in your AWS environment.

Advanced tooling

Efficient configuration of Lambda functions is critical when expecting optimal cost and performance of your serverless applications. Lambda allows you to control the memory (RAM) allocation for each function.

Lambda charges based on the number of function requests and the duration, the time it takes for your code to run. The price for duration depends on the amount of RAM you allocate to your function. A smaller RAM allocation may reduce the performance of your application if your function is running compute-heavy workloads. If performance needs outweigh cost, you can increase the memory allocation.

Cost and performance optimization tooling

AWS Lambda power tuner is an open source tool that uses an AWS Step Functions state machine to suggest cost and performance optimizations for your Lambda functions. It invokes a given function with multiple memory configurations. It analyzes the execution log results to determine and suggest power configurations that minimize cost and maximize performance.

To deploy the tool:

  1. Clone the repository as follows:
    $ git clone https://github.com/alexcasalboni/aws-lambda-power-tuning.git
  2. Create an Amazon S3 bucket and enter the deployment configurations in /scripts/deploy.sh:
    # config
    BUCKET_NAME=your-sam-templates-bucket
    STACK_NAME=lambda-power-tuning
    PowerValues='128,512,1024,1536,3008'
  3. Run the deploy.sh script from your terminal, this uses the AWS SAM CLI to deploy the application:
    $ bash scripts/deploy.sh
  4. Run the power tuning tool from the terminal using the AWS CLI:
    aws stepfunctions start-execution \
    --state-machine-arn arn:aws:states:us-east-1:0123456789:stateMachine:powerTuningStateMachine-Vywm3ozPB6Am \
    --input "{\"lambdaARN\": \"arn:aws:lambda:us-east-1:1234567890:function:testytest\", \"powerValues\":[128,256,512,1024,2048],\"num\":50,\"payload\":{},\"parallelInvocation\":true,\"strategy\":\"cost\"}" \
    --output json
  5. The Step Functions execution output produces a link to a visual summary of the suggested results:

    AWS Lambda power tuning results

    AWS Lambda power tuning results

Monitoring and debugging tooling

Sls-dev-tools is an open source serverless tool that delivers serverless metrics directly to the terminal. It provides developers with feedback on their serverless application’s metrics and key bindings that deploy, open, and manipulate stack resources. Bringing this data directly to your terminal or IDE, reduces context switching between the developer environment and the web interfaces. This can increase application development speed and improve user experience.

Follow these instructions to install the tool onto your development environment.

To open the tool, run the following command:

$ Sls-dev-tools

Follow the in-terminal interface to choose which stack to monitor or edit.

The following example shows how the tool can be used to invoke a Lambda function with a custom payload from within the IDE.

Invoke an AWS Lambda function with a custom payload using sls-dev-tools

Invoke an AWS Lambda function with a custom payload using sls-dev-tools

Serverless database tooling

NoSQL Workbench for Amazon DynamoDB is a GUI application for modern database development and operations. It provides a visual IDE tool for data modeling and visualization with query development features to help build serverless applications with Amazon DynamoDB tables. Define data models using one or more tables and visualize the data model to see how it works in different scenarios. Run or simulate operations and generate the code for Python, JavaScript (Node.js), or Java.

Choose the correct operating system link to download and install NoSQL Workbench on your development machine.

The following example illustrates a connection to a DynamoDB table. A data scan is built using the GUI, with Node.js code generated for inclusion in a Lambda function:

Connecting to an Amazon DynamoBD table with NoSQL Workbench for AmazonDynamoDB

Connecting to an Amazon DynamoDB table with NoSQL Workbench for Amazon DynamoDB

Generating query code with NoSQL Workbench for Amazon DynamoDB

Generating query code with NoSQL Workbench for Amazon DynamoDB

Conclusion

Building serverless applications allows developers to focus on business logic instead of managing and operating infrastructure. This is achieved by using managed services. Developers often struggle with knowing which tools, libraries, and frameworks are available to help with this new approach to building applications. This post shows tools that builders can use to create a serverless developer environment to help accelerate software development.

This list represents AWS and open source tools but does not include our APN Partners. For partner offers, check here.

Read more to start building serverless applications.

Load testing a web application’s serverless backend

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/load-testing-a-web-applications-serverless-backend/

Many web applications experience high levels of traffic and spiky load patterns. The goal of load testing is to ensure that the architecture and system design works for the amount of traffic expected. It can help developers find bottlenecks and unexpected behavior before a system is deployed to production. This post uses the Ask Around Me application as an example to show how to test load in a serverless architecture.

In Ask Around Me, users ask and answer questions in their local geographic area. The expected hourly load is 1,000 new questions, 10,000 new answers, and 50,000 question lookup queries. I use these numbers as a baseline for the tests. This is the architecture of the Ask Around Me backend application:

Ask Around Me backend architecture

Focus areas for load testing

In serverless architectures using AWS services, you can perform a round-trip test from an API endpoint. You can also isolate areas in the design where you should test performance. API testing provides the best approximation of the performance that users experience but it may not always be possible. You can also isolate microservices consuming from SQS queue or receive events from Amazon EventBridge, and test only those parts of the infrastructure.

While AWS services are built to withstand high levels of traffic, it’s important to consider the effect of Service Quotas on your application. Service Quotas are applied at the Region and account levels depending upon the service. You can view all your quotas in one place from the Service Quotas console. These are designed to protect you and other customers if your applications use more resources than planned. These quotas consist of hard and soft limits. For soft limits, you can request quota increases by opening a support ticket.

You must also consider downstream services. While serverless services like Lambda scale on your behalf, you may use downstream services that could be overwhelmed when traffic increases. Load testing can help identify these areas. You can implement mechanisms like queuing, caching, or pooling to protect those non-serverless parts of your infrastructure. If you are using Amazon RDS, for example, you might implement Amazon RDS Proxy to help pool and scale resources.

Finally, load testing can help identify custom code in Lambda functions that may not run efficiently as traffic scales up. Typically, these issues are caused by the code itself or the function configuration. For example, code may process event batches effectively or may not be configured with the appropriate concurrency or memory configuration. Frequently these issues are unnoticed in development but resurface in a load test.

Load testing tools

Load testing serverless infrastructure can be both inexpensive and systematic. There are several tools available for serverless developers to perform this task. One of the most popular is Artillery Community Edition, which is an open-source tool for testing serverless APIs. You configure the number of requests per second and overall test duration, and it uses a headless Chromium browser to run its test flows.

The performance report measures the roundtrip time from the client device, so can be affected by your machine’s performance and network. One way to eliminate your local network’s impact on the results is to use AWS Cloud9 to run the tests remotely.

For Artillery, the maximum number of concurrent tests is constrained by your local computing resources and network. To achieve higher throughput, you can use Serverless Artillery, which runs the Artillery package on Lambda functions. As a result, this tool can scale up to a significantly higher number of tests.

The Ask Around Me application is deployed in my AWS account – see the application’s blog series to learn more about the deployment process. I use an AWS Cloud9 instance to run these API tests:

  1. Adding 1,000 questions per hour using the POST /questions API.
  2. Adding 10,000 answers per hour using the POST /answers API.
  3. Fetching 50,000 questions per hour based upon random geo-location using the GET /questions API.

You can find the test scripts and Artillery configurations in the testing directory of the application’s GitHub repo.

Artillery also enables you to specify custom functions to provide randomized data and custom query parameters, as required by your API. The loadTestFunction.js file contains a function to return randomized geo-point and rating data per test:

// Sets a bounding box around an area in Virginia, USA
const bounds = {
  latMax: 38.735083,
  latMin: 40.898677,
  lngMax: -77.109339,
  lngMin: -81.587841
}

const generateRandomData = (userContext, events, done) => {
  const randomLat = ((bounds.latMax-bounds.latMin) * Math.random()) + bounds.latMin
  const randomLng = ((bounds.lngMax-bounds.lngMin) * Math.random()) + bounds.lngMin

  const id = parseInt(Math.random()*1000000)+1  //random 0-1000000
  const rating = parseInt(Math.random()*5)+1    //returns 1-5

  userContext.vars.lat = randomLat.toFixed(7)
  userContext.vars.lng = randomLng.toFixed(7)
  userContext.vars.id = id
  userContext.vars.rating = rating

  return done()
}

module.exports = { generateRandomData }

Test #1: Adding 1,000 questions per hour

The POST questions API has the following architecture:

POST questions architecture

The Artillery configuration file 1-test.yaml is set to create three requests per second over a 5-minute duration. This equates to 10,800 questions per hour, significantly higher than the estimated load for this function. The scenario specifies the JSON payload expected by the questions API:

config:
  target: 'https://abcd1234567.execute-api.us-east-1.amazonaws.com'
  phases:
    - duration: 300
      arrivalRate: 3
  processor: "./loadTestFunction.js"          
  defaults:
    headers:
      Authorization: 'Bearer <<enter your valid JWT token>>'
scenarios:
  - flow:
    - function: "generateRandomData"
    - post:
        url: "/questions"
        json:
          question: "This is a load test question - #{{ id }}"
          type: "Star rating"
          position:
            latitude: {{ lat }}
            longitude: {{ lng }}
    - log: "Sent POST request to / with {{ lat }}, {{ lng }}"

You execute the Artillery test with the command artillery run ./1-test.yaml. My test concludes with the following results:

Artillery test results

Over 300 requests, the median response time is 114 ms. The p95 response time shows that 95% of all responses are served within 376 ms. The slowest response of 1401 ms is caused by cold starts when the Lambda service scales up the underlying function due to load.

As this process writes to a DynamoDB table, I can also see how many write capacity units (WCUs) are consumed by the test. From the DynamoDB console, select the table aamQuestions, then choose the Metrics tab. This shows the Write capacity metric:

CloudWatch DynamoDB metrics

Test #2: Adding 10,000 answers per hour.

The POST answers API has the following architecture:

POST answers architecture

The Artillery configuration in 2-test.yaml creates 10 answers per second over a 5-minute duration. This equates to 36,000 per hour, much higher than the estimated load. The scenario defines the randomized rating used by the testing process:

config:
  target: 'https://abcd1234567.execute-api.us-east-1.amazonaws.com'
  phases:
    - duration: 300
      arrivalRate: 10
  processor: "./loadTestFunction.js"          
  defaults:
    headers:
      Authorization: 'Bearer <<enter your valid JWT token>>’
scenarios:
  - flow:
    - function: "generateRandomData"
    - post:
        url: "/answers"
        json:
          type: "Star"
          rating: "{{ rating }}"
          question: 
            type: "Star"
            latitude: 39.08259127440097
            longitude: -77.46246339003038
            rangeKey: "testuser|1-1589380702281"
    - log: "Sent POST request to / with {{ rating }}"

The test results show a median response time of 111 ms with a p95 time of 218 ms. In the worst case, a request took 1102 ms to complete:

Artillery summary report

Checking the Metrics tab for the aaAnswers table, this test consumed just under 11 WCUs at peak:

CloudWatch DynamoDB metrics

Test #3: Fetching 50,000 questions per hour

The GET questions API invokes a Lambda function that uses the Geo Library for Amazon DynamoDB:

GET questions architecture

This process is read-intensive on the underlying DynamoDB table. The testing configuration simulates 20 queries per second over 2 minutes for random locations in a bounding box around Virginia, USA:

config:
  target: 'https://abcd1234567.execute-api.us-east-1.amazonaws.com'
  phases:
    - duration: 120
      arrivalRate: 20
  processor: "./loadTestFunction.js"          
  defaults:
    headers:
      Authorization: 'Bearer <<enter your valid JWT token>>’
scenarios:
  - flow:
    - function: "generateRandomData"
    - get:
        url: "/questions"
        qs:
          lat: "{{ lat }}"
          lng: "{{ lng }}"
    - log: "Sent POST request to / with {{ lat }}, {{ lng }}"

This is a synchronous API so the performance directly impacts the user’s experience of the application. This test shows that the median response time is 165 ms with a p95 time of 201 ms:

Artillery performance results

This level of load equates to 72,000 queries per hour, almost 50% above the expected usage. The DynamoDB metrics show a peak consumption of 82 read capacity units:

CloudWatch monitoring details

Testing authenticated routes

These API routes are protected from public access and require authorization. This application uses HTTP APIs, which accepts JWT tokens, and it uses Auth0 in the frontend application to generate these tokens. When you are load testing API Gateway routes with custom authorizers, you have a number of options.

At the early development stage, you may choose to remove the authentication to perform load tests. This simplifies the process but is not recommended beyond research and prototyping. If you turn off authentication for testing, there is a risk that it is not enabled again for production. This would leave your routes open to the public.

A better approach is to create a test user in your identity provider and use the JWT token for testing. Auth0 allows you to obtain a token manually, and use this in the Artillery configuration for the authorization header:

Embedding authorization token in test script

Since custom code frequently uses the decoded identity in processing, supplying a test token provides the closest simulation of actual usage. You must refresh this token in the test scripts periodically, and you can change scopes as needed.

The testing directory in the GitHub repo also includes a script for testing functions that consume from SQS queues. This allows you to test microservices further down in your infrastructure stack. This script injects messages into the SQS queue, simulating upstream processes.

Conclusion

In this post, I discuss focus areas for load testing of serverless applications, and highlight two tools commonly used. I show how to configure Artillery with customized functions, and how to run tests to simulate load on the Ask Around Me application.

I cover some of the options for testing authenticated API Gateway routes and how you can use JWT tokens in your load testing configuration. You can also test microservices within a serverless architecture by injecting messages into SQS queues to simulate upstream load.

To learn more about the Ask Around Me serverless applications, read the blog series.

Using Amazon EFS for AWS Lambda in your serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-amazon-efs-for-aws-lambda-in-your-serverless-applications/

Serverless applications are event-driven, using ephemeral compute functions to integrate services and transform data. While AWS Lambda includes a 512-MB temporary file system for your code, this is an ephemeral scratch resource not intended for durable storage.

Amazon EFS is a fully managed, elastic, shared file system designed to be consumed by other AWS services, such as Lambda. With the release of Amazon EFS for Lambda, you can now easily share data across function invocations. You can also read large reference data files, and write function output to a persistent and shared store. There is no additional charge for using file systems from your Lambda function within the same VPC.

EFS for Lambda makes it simpler to use a serverless architecture to implement many common workloads. It opens new capabilities, such as building and importing large code libraries directly into your Lambda functions. Since the code is loaded dynamically, you can also ensure that the latest version of these libraries is always used by every new execution environment. For appending to existing files, EFS is also a preferred option to using Amazon S3.

This blog post shows how to enable EFS for Lambda in your AWS account, and walks through some common use-cases.

Capabilities and behaviors of Lambda with EFS

EFS is built to scale on demand to petabytes of data, growing and shrinking automatically as files are written and deleted. When used with Lambda, your code has low-latency access to a file system where data is persisted after the function terminates.

EFS is a highly reliable NFS-based regional service, with all data stored durably across multiple Availability Zones. It is cost-optimized, due to no provisioning requirements, and no purchase commitments. It uses built-in lifecycle management to optimize between SSD-performance class and an infrequent access class that offer 92% lower cost.

EFS offers two performance modes – general purpose and MaxIO. General purpose is suitable for most Lambda workloads, providing lower operational latency and higher performance for individual files.

You also choose between two throughput modes – bursting and provisioned. The bursting mode uses a credit system to determine when a file system can burst. With bursting, your throughput is calculated based upon the amount of data you are storing. Provisioned throughput is useful when you need more throughout than provided by the bursting mode. Total throughput available is divided across the number of concurrent Lambda invocations.

The Lambda service mounts EFS file systems when the execution environment is prepared. This adds minimal latency when the function is invoked for the first time, often within hundreds of milliseconds. When the execution environment is already warm from previous invocations, the EFS mount is already available.

EFS can be used with Provisioned Concurrency for Lambda. When the reserved capacity is prepared, the Lambda service also configures and mounts EFS file system. Since Provisioned Concurrency executes any initialization code, any libraries or packages consumed from EFS at this point are downloaded. In this use-case, it’s recommended to use provisioned throughout when configuring EFS.

The EFS file system is shared across Lambda functions as it scales up the number of concurrent executions. As files are written by one instance of a Lambda function, all other instances can access and modify this data, depending upon the access point permissions. The EFS file system scales with your Lambda functions, supporting up to 25,000 concurrent connections.

Creating an EFS file system

Configuring EFS for Lambda is straight-forward. I show how to do this in the AWS Management Console but you can also use the AWS CLI, AWS SDK, AWS Serverless Application Model (AWS SAM), and AWS CloudFormation. EFS file systems are always created within a customer VPC, so Lambda functions using the EFS file system must all reside in the same VPC.

To create an EFS file system:

  1. Navigate to the EFS console.
  2. Choose Create File System.
    EFS: Create File System
  3. On the Configure network access page, select your preferred VPC. Only resources within this VPC can access this EFS file system. Accept the default mount targets, and choose Next Step.
  4. On Configure file system settings, you can choose to enable encryption of data at rest. Review this setting, then accept the other defaults and choose Next Step. This uses bursting mode instead of provisioned throughout.
  5. On the Configure client access page, choose Add access point.
    EFS: Add access point
  6. Enter the following parameters. This configuration creates a file system with open read/write permissions – read more about settings to secure your access points. Choose Next Step.EFS: Access points
  7. On the Review and create page, check your settings and choose Create File System.
  8. In the EFS console, you see the new file system and its configuration. Wait until the Mount target state changes to Available before proceeding to the next steps.

Alternatively, you can use CloudFormation to create the EFS access point. With the AWS::EFS::AccessPoint resource, the preceding configuration is defined as follows:

  AccessPointResource:
    Type: 'AWS::EFS::AccessPoint'
    Properties:
      FileSystemId: !Ref FileSystemResource
      PosixUser:
        Uid: "1000"
        Gid: "1000"
      RootDirectory:
        CreationInfo:
          OwnerGid: "1000"
          OwnerUid: "1000"
          Permissions: "0777"
        Path: "/lambda"

For more information, see the example setup template in the code repository.

Working with AWS Cloud9 and Amazon EC2

You can mount EFS access points on Amazon EC2 instances. This can be useful for browsing file systems contents and downloading files from other locations. The EFS console shows customized mount instructions directly under each created file system:

EFS customized mount instructions

The instance must have access to the same security group and reside in the same VPC as the EFS file system. After connecting via SSH to the EC2 instance, you mount the EFS mount target to a directory. You can also mount EFS in AWS Cloud9 instances using the terminal window.

Any files you write into the EFS file system are available to any Lambda functions using the same EFS file system. Similarly, any files written by Lambda functions are available to the EC2 instance.

Sharing large code packages with Lambda

EFS is useful for sharing software packages or binaries that are otherwise too large for Lambda layers. You can copy these to EFS and have Lambda use these packages as if there are installed in the Lambda deployment package.

For example, on EFS you can install Puppeteer, which runs a headless Chromium browser, using the following script run on an EC2 instance or AWS Cloud9 terminal:

  mkdir node && cd node
  npm init -y
  npm i puppeteer --save

Building packages in EC2 for EFS

You can then use this package from a Lambda function connected to this folder in the EFS file system. You include the Puppeteer package with the mount path in the require declaration:

const puppeteer = require ('/mnt/efs/node/node_modules/puppeteer')

In Node.js, to avoid changing declarations manually, you can add the EFS mount path to the Node.js module search path by using app-module-path. Lambda functions support a range of other runtimes, including Python, Java, and Go. Many other runtimes offer similar ways to add the EFS path to the list of default package locations.

There is an important difference between using packages in EFS compared with Lambda layers. When you use Lambda layers to include packages, these are downloaded to an immutable code package. Any changes to the underlying layer do not affect existing functions published using that layer.

Since EFS is a dynamic binding, any changes or upgrades to packages are available immediately to the Lambda function when the execution environment is prepared. This means you can output a build process to an EFS mount, and immediately consume any new versions of the build from a Lambda function.

Configuring AWS Lambda to use EFS

Lambda functions that access EFS must run from within a VPC. Read this guide to learn more about setting up Lambda functions to access resources from a VPC. There are also sample CloudFormation templates you can use to configure private and public VPC access.

The execution role for Lambda function must provide access to the VPC and EFS. For development and testing purposes, this post uses the AWSLambdaVPCAccessExecutionRole and AmazonElasticFileSystemClientFullAccess managed policies in IAM. For production systems, you should use more restrictive policies to control access to EFS resources.

Once your Lambda function is configured to use a VPC, next configure EFS in Lambda:

  1. Navigate to the Lambda console and select your function from the list.
  2. Scroll down to the File system panel, and choose Add file system.
    EFS: Add file system
  3. In the File system configuration:
  • From the EFS file system dropdown, select the required file system. From the Access point dropdown, choose the required EFS access point.
  • In the Local mount path, enter the path your Lambda function uses to access this resource. Enter an absolute path.
  • Choose Save.
    EFS: Add file system

The File system panel now shows the configuration of the EFS mount, and the function is ready to use EFS. Alternatively, you can use an AWS Serverless Application Model (SAM) template to add the EFS configuration to a function resource:

AWSTemplateFormatVersion: '2010-09-09'
Resources:
  MyLambdaFunction:
    Type: AWS::Serverless::Function
    Properties:
	...
      FileSystemConfigs:
      - Arn: arn:aws:elasticfilesystem:us-east-1:xxxxxx:accesspoint/
fsap-123abcdef12abcdef
        LocalMountPath: /mnt/efs

To learn more, see the SAM documentation on this feature.

Example applications

You can view and download these examples from this GitHub repository. To deploy, follow the instructions in the repo’s README.md file.

1. Processing large video files

The first example uses EFS to process a 60-minute MP4 video and create screenshots for each second of the recording. This uses to the FFmpeg Linux package to process the video. After copying the MP4 to the EFS file location, invoke the Lambda function to create a series of JPG frames. This uses the following code to execute FFmpeg and pass the EFS mount path and input file parameters:

const os = require('os')

const inputFile = process.env.INPUT_FILE
const efsPath = process.env.EFS_PATH

const { exec } = require('child_process')

const execPromise = async (command) => {
	console.log(command)
	return new Promise((resolve, reject) => {
		const ls = exec(command, function (error, stdout, stderr) {
		  if (error) {
		    console.log('Error: ', error)
		    reject(error)
		  }
		  console.log('stdout: ', stdout);
		  console.log('stderr: ' ,stderr);
		})
		
		ls.on('exit', function (code) {
		  console.log('Finished: ', code);
		  resolve()
		})
	})
}

// The Lambda handler
exports.handler = async function (eventObject, context) {
	await execPromise(`/opt/bin/ffmpeg -loglevel error -i ${efsPath}/${inputFile} -s 240x135 -vf fps=1 ${efsPath}/%d.jpg`)
}

In this example, the process writes more than 2000 individual JPG files back to the EFS file system during a single invocation:

Console output from sample application

2. Archiving large numbers of files

Using the output from the first application, the second example creates a single archive file from the JPG files. The code uses the Node.js archiver package for processing:

const outputFile = process.env.OUTPUT_FILE
const efsPath = process.env.EFS_PATH

const fs = require('fs')
const archiver = require('archiver')

// The Lambda handler
exports.handler = function (event) {

  const output = fs.createWriteStream(`${efsPath}/${outputFile}`)
  const archive = archiver('zip', {
    zlib: { level: 9 } // Sets the compression level.
  })
  
  output.on('close', function() {
    console.log(archive.pointer() + ' total bytes')
  })
  
  output.on('end', function() {
    console.log('Data has been drained')
  })
  
  archive.pipe(output)  

  // append files from a glob pattern
  archive.glob(`${efsPath}/*.jpg`)
  archive.finalize()
}

After executing this Lambda function, the resulting ZIP file is written back to the EFS file system:

Console output from second sample application.

3. Unzipping archives with a large number of files

The last example shows how to unzip an archive containing many files. This uses the Node.js unzipper package for processing:

const inputFile = process.env.INPUT_FILE
const efsPath = process.env.EFS_PATH
const destinationDir = process.env.DESTINATION_DIR

const fs = require('fs')
const unzipper = require('unzipper')

// The Lambda handler
exports.handler = function (event) {

  fs.createReadStream(`${efsPath}/${inputFile}`)
    .pipe(unzipper.Extract({ path: `${efsPath}/${destinationDir}` }))

}

Once this Lambda function is executed, the archive is unzipped into a destination direction in the EFS file system. This example shows the screenshots unzipped into the frames subdirectory:

Console output from third sample application.

Conclusion

EFS for Lambda allows you to share data across function invocations, read large reference data files, and write function output to a persistent and shared store. After configuring EFS, you provide the Lambda function with an access point ARN, allowing you to read and write to this file system. Lambda securely connects the function instances to the EFS mount targets in the same Availability Zone and subnet.

EFS opens a range of potential new use-cases for Lambda. In this post, I show how this enables you to access large code packages and binaries, and process large numbers of files. You can interact with the file system via EC2 or AWS Cloud9 and pass information to and from your Lambda functions.

EFS for Lambda is supported at launch in APN Partner solutions, including Epsagon, Lumigo, Datadog, HashiCorp Terraform, and Pulumi. To learn more about how to use EFS for Lambda, see the AWS News Blog post and read the documentation.

Deploying a serverless application using AWS CDK

Post Syndicated from Georges Leschener original https://aws.amazon.com/blogs/devops/deploying-a-serverless-application-using-aws-cdk/

There are multiple ways to deploy API endpoints, such as this example, in which you could use an application running on Amazon EC2 to demonstrate how to integrate Amazon ElastiCache with Amazon DocumentDB (with MongoDB capability). While the approach in this example help achieve great performance and reliability through the elasticity and the ability to scale up or down the number of EC2 instances in order to accommodate the load on the application, there is still however some operational overhead you still have to manage the EC2 instances yourself. One way of addressing the operational overhead issue and related costs could be to transform the application into a serverless architecture.

The example in this blog post uses an application that provides a similar use case, leveraging a serverless architecture showcasing some of the tools that are being leveraged by customers transitioning from lift-and-shift to building cloud-native applications. It uses Amazon API Gateway to provide the REST API endpoint connected to an AWS Lambda function to provide the business logic to read and write from an Amazon Aurora Serverless database. It also showcases the deployment of most of the infrastructure with the AWS Cloud Development Kit, known as the CDK. By moving your applications to cloud native architecture like the example showcased in this blog post, you will be able to realize a number of benefits including:

  • Fast and clean deployment of your application thereby achieving fast time to market
  • Reduce operational costs by serverless and managed services

Architecture Diagram

At the end of this blog, you have an AWS Cloud9 instance environment containing a CDK project which deploys an API Gateway and Lambda function. This Lambda function leverages a secret stored in your AWS Secrets Manager to read and write from your Aurora Serverless database through the data API, as shown in the following diagram.

 

Architecture diagram for deploying a serverless application using AWS CDK

This above architecture diagram showcases the resources to be deployed in your AWS Account

Through the blog post you will be creating the following resources:

  1. Deploy an Amazon Aurora Serverless database cluster
  2. Secure the cluster credentials in AWS Secrets Manager
  3. Create and populate your database in the AWS Console
  4. Deploy an AWS Cloud9 instance used as a development environment
  5. Initialize and configure an AWS Cloud Development Kit project including the definition of your Amazon API Gateway endpoint and AWS Lambda function
  6. Deploy an AWS CloudFormation template through the AWS Cloud Development Kit

Prerequisites

In order to deploy the CDK application, there are a few prerequisites that need to be met:

  1. Create an AWS account or use an existing account.
  2. Install Postman for testing purposes

Amazon Aurora serverless cluster creation

To begin, navigate to the AWS console to create a new Amazon RDS database.

  1. Select Create Database from the Amazon RDS service.
  2. Select Standard Create under Choose a database creation method.
  3. Select Serverless under Database features.
  4. Select Amazon Aurora as the engine type under Engine options.
  5. Enter db-blog for your DB Cluster Identifier.
  6. Expand the Additional Connectivity section and select the Data API option. This functionality enables you to access Aurora Serverless with web services-based applications. It also allows you to use the query editor feature for Aurora Serverless in order to run SQL queries against your database instance.
  7. Leave the default selection for everything else and choose Create Database.

Your database instance is created in a single availability zone (AZ), but an Aurora Serverless database cluster has a capability known as automatic multi-AZ failover, which enables Aurora to recreate the database instance in a different AZ should the current database instance or the AZ become unavailable. The storage volume for the cluster is spread across multiple AZs, since Aurora separates computation capacity and storage. This allows for data to remain available even if the database instance or the associated AZ is affected by an outage.

Securing database credentials with AWS Secrets Manager

After creating the database instance, the next step is to store your secrets for your database in AWS Secrets Manager.

  • Navigate to AWS Secrets Manager, and select Store a New Secret.
  • Leave the default selection (Credentials for RDS database) for the secret type. Enter your database username and password and then select the radio button for the database you created in the previous step (in this example, db-blog), as shown in the following screenshot.

database search in aws secrets manager

  •  Choose Next.
  • Enter a name and optionally a description. For the name, make sure to add the prefix rds-db-credentials/ as shown in the following screenshot.

AWS Secrets Manager Store a new secret window

  • Choose Next and leave the default selection.
  • Review your settings on the last page and choose Store to have your secrets created and stored in AWS Secrets Manager, which you can now use to connect to your database.

Creating and populating your Amazon Aurora Serverless database

After creating the DB cluster, create the database instance; create your tables and populate them; and finally, test a connection to ensure that you can query your database.

  • Navigate to the Amazon RDS service from the AWS console, and select your db-blog database cluster.
  • Select Query under Actions to open the Connect to database window as shown in the screenshot below . Enter your database connection details. You can copy your secret manager ARN from the Secrets Manager service and paste it into the corresponding field in the database connection window.

Amazon RDS connect to database window

  • To create the DB instance run the following SQL query: CREATE DATABASE recordstore;from the Query editor shown in the screenshot below:

 

Amazon RDS Query editor

  • Before you can run the following commands, make sure you are using the Recordstore database you just created by running the command:
USE recordstore;
  • Create a records table using the following command:
CREATE TABLE IF NOT EXISTS records (recordid INT PRIMARY KEY, title VARCHAR(255) NOT NULL, release_date DATE);
  • Create a singers table using the following command:
CREATE TABLE IF NOT EXISTS singers (id INT PRIMARY KEY, name VARCHAR(255) NOT NULL, nationality VARCHAR(255) NOT NULL, recordid INT NOT NULL, FOREIGN KEY (recordid) REFERENCES records (recordid) ON UPDATE RESTRICT ON DELETE CASCADE);
  • Add a record to your records table and a singer to your singers table.
INSERT INTO records(recordid,title,release_date) VALUES(001,'Liberian Girl','2012-05-03');
INSERT INTO singers(id,name,nationality,recordid) VALUES(100,'Michael Jackson','American',001);

If you have the AWS CLI set up on your computer, you can connect to your database and retrieve records.

To test it, use the rds-data execute-statement API within the AWS CLI to connect to your database via the data API web service and query the singers table, as shown below:

aws rds-data execute-statement —secret-arn "arn:aws:secretsmanager:REGION:xxxxxxxxxxx:secret:rds-db-credentials/xxxxxxxxxxxxxxx" —resource-arn "arn:aws:rds:us-east-1:xxxxxxxxxx:cluster:db-blog" —database demodb —sql "select * from singers" —output json

You should see the following result:

    "numberOfRecordsUpdated": 0,
    "records": [
        [
            {
                "longValue": 100
            },
            {
                "stringValue": "Michael Jackson"
            },
            {
                "stringValue": "American"
            },
            {
                "longValue": 1
            }
        ]
    ]
}

Creating a Cloud9 instance

To create a Cloud9 instance:

  1. Navigate to the Cloud9 console and select Create Environment.
  2. Name your environment AuroraServerlessBlog.
  3. Keep the default values under the Environment Settings.

Once your instance is launched, you see the screen shown in the following screenshot:

AWS Cloud9

 

You can now install the CDK in your environment. Run the following command inside your bash terminal on the blue section at the bottom of your screen:

npm install -g [email protected]

For the next section of this example, you mostly work on the command line of your Cloud9 terminal and on your file explorer.

Creating the CDK deployment

The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework to model and provision your cloud application resources using familiar programming languages. If you would like to familiarize yourself the CDKWorkshop is a great place to start.

First, create a working directory called RecordsApp and initialize a CDK project from a template.

Run the following commands:

mkdir RecordsApp
cd RecordsApp
cdk init app --language typescript
mkdir resources
npm install @aws-cdk/[email protected] @aws-cdk/[email protected] @aws-cdk/[email protected]

Now your instance should look like the example shown in the following screenshot:

AWS Cloud9 shell

 

You are mainly working in two directories:

  • Resources
  • Lib

Your initial set up is ready, and you can move into creating specific services and deploying them to your account.

Creating AWS resources using the CDK

  1. Follow these steps to create AWS resources using the CDK:
  2. Under the /lib folder,  create a new file called records_service.ts.
    • Inside of your new file, paste the following code with these changes:
    • Replace the dbARN with the ARN of your AuroraServerless DB ARN from the previous steps.

Replace the dbSecretARN with the ARN of your Secrets Manager secret ARN from the previous steps.

import core = require("@aws-cdk/core");
import apigateway = require("@aws-cdk/aws-apigateway");
import lambda = require("@aws-cdk/aws-lambda");
import iam = require("@aws-cdk/aws-iam");

//REPLACE THIS
const dbARN = "arn:aws:rds:XXXX:XXXX:cluster:aurora-serverless-blog";
//REPLACE THIS
const dbSecretARN = "arn:aws:secretsmanager:XXXXX:XXXXX:secret:rds-db-credentials/XXXXX";

export class RecordsService extends core.Construct {
  constructor(scope: core.Construct, id: string) {
    super(scope, id);

    const lambdaRole = new iam.Role(this, 'AuroraServerlessBlogLambdaRole', {
      assumedBy: new iam.ServicePrincipal('lambda.amazonaws.com'),
      managedPolicies: [
            iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonRDSDataFullAccess'),
            iam.ManagedPolicy.fromAwsManagedPolicyName('service-role/AWSLambdaBasicExecutionRole')
        ]
    });

    const handler = new lambda.Function(this, "RecordsHandler", {
     role: lambdaRole,
     runtime: lambda.Runtime.NODEJS_12_X, // So we can use async in widget.js
     code: lambda.Code.asset("resources"),
     handler: "records.main",
     environment: {
       TABLE: dbARN,
       TABLESECRET: dbSecretARN,
       DATABASE: "recordstore"
     }
   });

    const api = new apigateway.RestApi(this, "records-api", {
      restApiName: "Records Service",
      description: "This service serves records."
   });

    const getRecordsIntegration = new apigateway.LambdaIntegration(handler, {
      requestTemplates: { "application/json": '{ "statusCode": 200 }' }
    });

    api.root.addMethod("GET", getRecordsIntegration); // GET /

    const record = api.root.addResource("{id}");
    const postRecordIntegration = new apigateway.LambdaIntegration(handler);
    const getRecordIntegration = new apigateway.LambdaIntegration(handler);

    record.addMethod("POST", postRecordIntegration); // POST /{id}
    record.addMethod("GET", getRecordIntegration); // GET/{id}
  }
}

This snippet of code will instruct the AWS CDK to create the following resources:

  • IAM role: AuroraServerlessBlogLambdaRole containing the following managed policies:
    • AmazonRDSDataFullAccess
    • service-role/AWSLambdaBasicExecutionRole
  • Lambda function: RecordsHandler, which has a Node.js 8.10 runtime and three environmental variables
  • API Gateway: Records Service, which has the following characteristics:
    • GET Method
      • GET /
    • { id } Resource
      • GET method
        • GET /{id}
      • POST method
        • POST /{id}

Now that you have a service, you need to add it to your stack under the /lib directory.

  1. Open the records_app-stack.ts
  2. Replace the contents of this file with the following:
import cdk = require('@aws-cdk/core'); 
import records_service = require('../lib/records_service'); 
export class RecordsAppStack extends cdk.Stack { 
  constructor(scope: cdk.Construct, id: string, props?
: cdk.StackProps) { 
    super(scope, id, props); 
    new records_service.RecordsService(this, 'Records'
); 
  } 
}
  1. Create the Lambda code that is invoked from the API Gateway endpoint. Under the /resources directory, create a file called records.js and paste the following code in this file
const AWS = require('aws-sdk');
var rdsdataservice = new AWS.RDSDataService();

exports.main = async function(event, context) {
  try {
    var method = event.httpMethod;
    var recordName = event.path.startsWith('/') ? event.path.substring(1) : event.path;
// Defining parameters for rdsdataservice
    var params = {
      resourceArn: process.env.TABLE,
      secretArn: process.env.TABLESECRET,
      database: process.env.DATABASE,
   }
   if (method === "GET") {
      if (event.path === "/") {
       //Here is where we are defining the SQL query that will be run at the DATA API
       params['sql'] = 'select * from records';
       const data = await rdsdataservice.executeStatement(params).promise();
       var body = {
           records: data
       };
       return {
         statusCode: 200,
         headers: {},
         body: JSON.stringify(body)
       };
     }
     else if (recordName) {
       params['sql'] = `SELECT singers.id, singers.name, singers.nationality, records.title FROM singers INNER JOIN records on records.recordid = singers.recordid WHERE records.title LIKE '${recordName}%';`
       const data = await rdsdataservice.executeStatement(params).promise();
       var body = {
           singer: data
       };
       return {
         statusCode: 200,
         headers: {},
         body: JSON.stringify(body)
       };
     }
   }
   else if (method === "POST") {
     var payload = JSON.parse(event.body);
     if (!payload) {
       return {
         statusCode: 400,
         headers: {},
         body: "The body is missing"
       };
     }

     //Generating random IDs
     var recordId = uuidv4();
     var singerId = uuidv4();

     //Parsing the payload from body
     var recordTitle = `${payload.recordTitle}`;
     var recordReleaseDate = `${payload.recordReleaseDate}`;
     var singerName = `${payload.singerName}`;
     var singerNationality = `${payload.singerNationality}`;

      //Making 2 calls to the data API to insert the new record and singer
      params['sql'] = `INSERT INTO records(recordid,title,release_date) VALUES(${recordId},"${recordTitle}","${recordReleaseDate}");`;
      const recordsWrite = await rdsdataservice.executeStatement(params).promise();
      params['sql'] = `INSERT INTO singers(recordid,id,name,nationality) VALUES(${recordId},${singerId},"${singerName}","${singerNationality}");`;
      const singersWrite = await rdsdataservice.executeStatement(params).promise();

      return {
        statusCode: 200,
        headers: {},
        body: JSON.stringify("Your record has been saved")
      };

    }
    // We got something besides a GET, POST, or DELETE
    return {
      statusCode: 400,
      headers: {},
      body: "We only accept GET, POST, and DELETE, not " + method
    };
  } catch(error) {
    var body = error.stack || JSON.stringify(error, null, 2);
    return {
      statusCode: 400,
      headers: {},
      body: body
    }
  }
}
function uuidv4() {
  return 'xxxx'.replace(/[xy]/g, function(c) {
    var r = Math.random() * 16 | 0, v = c == 'x' ? r : (r & 0x3 | 0x8);
    return v;
  });
}

Take a look at what this Lambda function is doing. You have two functions inside of your Lambda function. The first is the exported handler, which is defined as an asynchronous function. The second is a unique identifier function to generate four-digit random numbers you use as UIDs for your database records. In your handler function, you handle the following actions based on the event you get from API Gateway:

  • Method GETwith empty path /:
    • This calls the data API executeStatement method with the following SQL query:
SELECT * from records
  • Method GET with a record name in the path /{recordName}:
    • This calls the data API executeStatmentmethod with the following SQL query:
SELECT singers.id, singers.name, singers.nationality, records.title FROM singers INNER JOIN records on records.recordid = singers.recordid WHERE records.title LIKE '${recordName}%';
  • Method POST with a payload in the body:
    • This makes two calls to the data API executeStatement with the following SQL queries:
INSERT INTO records(recordid,titel,release_date) VALUES(${recordId},"${recordTitle}",“${recordReleaseDate}”);&lt;br /&gt;INSERT INTO singers(recordid,id,name,nationality) VALUES(${recordId},${singerId},"${singerName}","${singerNationality}");

Now you have all the pieces you need to deploy your endpoint and Lambda function by running the following commands:

npm run build
cdk synth
cdk bootstrap
cdk deploy

If you change the Lambda code or add aditional AWS resources to your CDK deployment, you can redeploy the application by running all four commands in a single line:

npm run build; cdk synth; cdk bootstrap; cdk deploy

Testing with Postman

Once it’s done, you can test it using Postman:

GET = ‘RecordName’ in the path

  • example:
    • ENDPOINT/RecordName

POST = Payload in the body

  • example:
{
   "recordTitle" : "BlogTest",
   "recordReleaseDate" : "2020-01-01",
   "singerName" : "BlogSinger",
   "singerNationality" : "AWS"
}

Clean up

To clean up the resources created by the CDK, run the following command in your Cloud9 instance:

cdk destroy

To clean up the resources created manually, run the following commands:

aws rds delete-db-cluster --db-cluster-identifier Serverless-blog --skip-final-snapshot
aws secretsmanager delete-secret --secret-id XXXXX --recovery-window-in-days 7

Conclusion

This blog post demonstrated how to transform an application running on Amazon EC2 from a previous blog into serverless architecture by leveraging services such as Amazon API Gateway, Lambda, Cloud 9, AWS CDK, and Aurora Serverless. The benefit of serverless architecture is that it takes away the overhead of having to manage a server and helps reduce costs, as you only pay for the time in which your code executes.

This example used a record-store application written in Node.js that allows users to find their favorite singer’s record titles, as well as the dates when they were released. This example could be expanded, for instance, by adding a payment gateway and a shopping cart to allow users to shop and pay for their favorite records. You could then incorporate some machine learning into the application to predict user choice based on previous visits, purchases, or information provided through registration profiles.

 


 

About the Authors

Luis Lopez Soria is an AI/ML specialist solutions architect working with the AWS machine learning team. He works with AWS customers to help them with the adoption of Machine Learning on a large scale. He enjoys doing sports in addition to traveling around the world, exploring new foods and cultures.

 

 

 

 Georges Leschener is a Partner Solutions Architect in the Global System Integrator (GSI) team at Amazon Web Services. He works with our GSIs partners to help migrate customers’ workloads to AWS cloud, design and architect innovative solutions on AWS by applying AWS recommended best practices.

 

Identifying and resolving security code vulnerabilities using Snyk in AWS CI/CD Pipeline

Post Syndicated from Jay Yeras original https://aws.amazon.com/blogs/devops/identifying-and-resolving-vulnerabilities-in-your-code/

The majority of companies have embraced open-source software (OSS) at an accelerated rate even when building proprietary applications. Some of the obvious benefits for this shift include transparency, cost, flexibility, and a faster time to market. Snyk’s unique combination of developer-first tooling and best in class security depth enables businesses to easily build security into their continuous development process.

Even for teams building proprietary code, use of open-source packages and libraries is a necessity. In reality, a developer’s own code is often a small core within the app, and the rest is open-source software. While relying on third-party elements has obvious benefits, it also presents numerous complexities. Inadvertently introducing vulnerabilities into your codebase through repositories that are maintained in a distributed fashion and with widely varying levels of security expertise can be common, and opens up applications to effective attacks downstream.

There are three common barriers to truly effective open-source security:

  1. The security task remains in the realm of security and compliance, often perpetuating the siloed structure that DevOps strives to eliminate and slowing down release pace.
  2. Current practice may offer automated scanning of repositories, but the remediation advice it provides is manual and often un-actionable.
  3. The data generated often focuses solely on public sources, without unique and timely insights.

Developer-led application security

This blog post demonstrates techniques to improve your application security posture using Snyk tools to seamlessly integrate within the developer workflow using AWS services such as Amazon ECR, AWS Lambda, AWS CodePipeline, and AWS CodeBuild. Snyk is a SaaS offering that organizations use to find, fix, prevent, and monitor open source dependencies. Snyk is a developer-first platform that can be easily integrated into the Software Development Lifecycle (SDLC). The examples presented in this post enable you to actively scan code checked into source code management, container images, and serverless, creating a highly efficient and effective method of managing the risk inherent to open source dependencies.

Prerequisites

The examples provided in this post assume that you already have an AWS account and that your account has the ability to create new IAM roles and scope other IAM permissions. You can use your integrated development environment (IDE) of choice. The examples reference AWS Cloud9 cloud-based IDE. An AWS Quick Start for Cloud9 is available to quickly deploy to either a new or existing Amazon VPC and offers expandable Amazon EBS volume size.

Sample code and AWS CloudFormation templates are available to simplify provisioning the various services you need to configure this integration. You can fork or clone those resources. You also need a working knowledge of git and how to fork or clone within your source provider to complete these tasks.

cd ~/environment && \ 
git clone https://github.com/aws-samples/aws-modernization-with-snyk.git modernization-workshop 
cd modernization-workshop 
git submodule init 
git submodule update

Configure your CI/CD pipeline

The workflow for this example consists of a continuous integration and continuous delivery pipeline leveraging AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, Amazon ECR, and AWS Fargate, as shown in the following screenshot.

CI/CD Pipeline

For simplicity, AWS CloudFormation templates are available in the sample repo for services.yaml, pipeline.yaml, and ecs-fargate.yaml, which deploy all services necessary for this example.

Launch AWS CloudFormation templates

A detailed step-by-step guide can be found in the self-paced workshop, but if you are familiar with AWS CloudFormation, you can launch the templates in three steps. From your Cloud9 IDE terminal, change directory to the location of the sample templates and complete the following three steps.

1) Launch basic services

aws cloudformation create-stack --stack-name WorkshopServices --template-body file://services.yaml \
--capabilities CAPABILITY_NAMED_IAM until [[ `aws cloudformation describe-stacks \
--stack-name "WorkshopServices" --query "Stacks[0].[StackStatus]" \
--output text` == "CREATE_COMPLETE" ]]; do echo "The stack is NOT in a state of CREATE_COMPLETE at `date`"; sleep 30; done &&; echo "The Stack is built at `date` - Please proceed"

2) Launch Fargate:

aws cloudformation create-stack --stack-name WorkshopECS --template-body file://ecs-fargate.yaml \
--capabilities CAPABILITY_NAMED_IAM until [[ `aws cloudformation describe-stacks \ 
--stack-name "WorkshopECS" --query "Stacks[0].[StackStatus]" \ 
--output text` == "CREATE_COMPLETE" ]]; do echo "The stack is NOT in a state of CREATE_COMPLETE at `date`"; sleep 30; done &&; echo "The Stack is built at `date` - Please proceed"

3) From your Cloud9 IDE terminal, change directory to the location of the sample templates and run the following command:

aws cloudformation create-stack --stack-name WorkshopPipeline --template-body file://pipeline.yaml \
--capabilities CAPABILITY_NAMED_IAM until [[ `aws cloudformation describe-stacks \
--stack-name "WorkshopPipeline" --query "Stacks[0].[StackStatus]" \
--output text` == "CREATE_COMPLETE" ]]; do echo "The stack is NOT in a state of CREATE_COMPLETE at `date`"; sleep 30; done &&; echo "The Stack is built at `date` - Please proceed"

Improving your security posture

You need to sign up for a free account with Snyk. You may use your Google, Bitbucket, or Github credentials to sign up. Snyk utilizes these services for authentication and does not store your password. Once signed up, navigate to your name and select Account Settings. Under API Token, choose Show, which will reveal the token to copy, and copy this value. It will be unique for each user.

Save your password to the session manager

Run the following command, replacing abc123 with your unique token. This places the token in the session parameter manager.

aws ssm put-parameter --name "snykAuthToken" --value "abc123" --type SecureString

Set up application scanning

Next, you need to insert testing with Snyk after maven builds the application. The simplest method is to insert commands to download, authorize, and run the Snyk commands after maven has built the application/dependency tree.

The sample Dockerfile contains an environment variable from a value passed to the docker build command, which contains the token for Snyk. By using an environment variable, Snyk automatically detects the token when used.

#~~~~~~~SNYK Variable~~~~~~~~~~~~ 
# Declare Snyktoken as a build-arg ARG snyk_auth_token
# Set the SNYK_TOKEN environment variable ENV
SNYK_TOKEN=${snyk_auth_token}
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Download Snyk, and run a test, looking for medium to high severity issues. If the build succeeds, post the results to Snyk for monitoring and reporting. If a new vulnerability is found, you are notified.

# package the application
RUN mvn package -Dmaven.test.skip=true

#~~~~~~~SNYK test~~~~~~~~~~~~
# download, configure and run snyk. Break build if vulns present, post results to `https://snyk.io/`
RUN curl -Lo ./snyk "https://github.com/snyk/snyk/releases/download/v1.210.0/snyk-linux"
RUN chmod -R +x ./snyk
#Auth set through environment variable
RUN ./snyk test --severity-threshold=medium
RUN ./snyk monitor

Set up docker scanning

Later in the build process, a docker image is created. Analyze it for vulnerabilities in buildspec.yml. First, pull the Snyk token snykAuthToken from the parameter store.

env:
  parameter-store:
    SNYK_AUTH_TOKEN: "snykAuthToken"

Next, in the prebuild phase, install Snyk.

phases:
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - aws --version
      - $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
      - REPOSITORY_URI=$(aws ecr describe-repositories --repository-name petstore_frontend --query=repositories[0].repositoryUri --output=text)
      - COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
      - IMAGE_TAG=${COMMIT_HASH:=latest}
      - PWD=$(pwd)
      - PWDUTILS=$(pwd)
      - curl -Lo ./snyk "https://github.com/snyk/snyk/releases/download/v1.210.0/snyk-linux"
      - chmod -R +x ./snyk

Next, in the build phase, pass the token to the docker compose command, where it is retrieved in the Dockerfile code you set up to test the application.

build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - cd modules/containerize-application
      - docker build --build-arg snyk_auth_token=$SNYK_AUTH_TOKEN -t $REPOSITORY_URI:latest.

You can further extend the build phase to authorize the Snyk instance for testing the Docker image that’s produced. If it passes, you can pass the results to Snyk for monitoring and reporting.

build:
    commands:
      - $PWDUTILS/snyk auth $SNYK_AUTH_TOKEN
      - $PWDUTILS/snyk test --docker $REPOSITORY_URI:latest
      - $PWDUTILS/snyk monitor --docker $REPOSITORY_URI:latest
      - docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG

For reference, a sample buildspec.yaml configured with Snyk is available in the sample repo. You can either copy this file and overwrite your existing buildspec.yaml or open an editor and replace the contents.

Testing the application

Now that services have been provisioned and Snyk tools have been integrated into your CI/CD pipeline, any new git commit triggers a fresh build and application scanning with Snyk detects vulnerabilities in your code.

In the CodeBuild console, you can look at your build history to see why your build failed, identify security vulnerabilities, and pinpoint how to fix them.

Testing /usr/src/app...
✗ Medium severity vulnerability found in org.primefaces:primefaces
Description: Cross-site Scripting (XSS)
Info: https://snyk.io/vuln/SNYK-JAVA-ORGPRIMEFACES-31642
Introduced through: org.primefaces:[email protected]
From: org.primefaces:[email protected]
Remediation:
Upgrade direct dependency org.primefaces:[email protected] to org.primefaces:[email protected] (triggers upgrades to org.primefaces:[email protected])
✗ Medium severity vulnerability found in org.primefaces:primefaces
Description: Cross-site Scripting (XSS)
Info: https://snyk.io/vuln/SNYK-JAVA-ORGPRIMEFACES-31643
Introduced through: org.primefaces:[email protected]
From: org.primefaces:[email protected]
Remediation:
Upgrade direct dependency org.primefaces:[email protected] to org.primefaces:[email protected] (triggers upgrades to org.primefaces:[email protected])
Organisation: sample-integrations
Package manager: maven
Target file: pom.xml
Open source: no
Project path: /usr/src/app
Tested 37 dependencies for known vulnerabilities, found 2 vulnerabilities, 2 vulnerable paths.
The command '/bin/sh -c ./snyk test' returned a non-zero code: 1
[Container] 2020/02/14 03:46:22 Command did not exit successfully docker build --build-arg snyk_auth_token=$SNYK_AUTH_TOKEN -t $REPOSITORY_URI:latest . exit status 1
[Container] 2020/02/14 03:46:22 Phase complete: BUILD Success: false
[Container] 2020/02/14 03:46:22 Phase context status code: COMMAND_EXECUTION_ERROR Message: Error while executing command: docker build --build-arg snyk_auth_token=$SNYK_AUTH_TOKEN -t $REPOSITORY_URI:latest .. Reason: exit status 1

Remediation

Once you remediate your vulnerabilities and check in your code, another build is triggered and an additional scan is performed by Snyk. This time, you should see the build pass with a status of Succeeded.

You can also drill down into the CodeBuild logs and see that Snyk successfully scanned the Docker Image and found no package dependency issues with your Docker container!

[Container] 2020/02/14 03:54:14 Running command $PWDUTILS/snyk test --docker $REPOSITORY_URI:latest
Testing 300326902600.dkr.ecr.us-west-2.amazonaws.com/petstore_frontend:latest...
Organisation: sample-integrations
Package manager: rpm
Docker image: 300326902600.dkr.ecr.us-west-2.amazonaws.com/petstore_frontend:latest
✓ Tested 190 dependencies for known vulnerabilities, no vulnerable paths found.

Reporting

Snyk provides detailed reports for your imported projects. You can navigate to Projects and choose View Report to set the frequency with which the project is checked for vulnerabilities. You can also choose View Report and then the Dependencies tab to see which libraries were used. Snyk offers a comprehensive database and remediation guidance for known vulnerabilities in their Vulnerability DB. Specifics on potential vulnerabilities that may exist in your code would be contingent on the particular open source dependencies used with your application.

Cleaning up

Remember to delete any resources you may have created in order to avoid additional costs. If you used the AWS CloudFormation templates provided here, you can safely remove them by deleting those stacks from the AWS CloudFormation Console.

Conclusion

In this post, you learned how to leverage various AWS services to build a fully automated CI/CD pipeline and cloud IDE development environment. You also learned how to utilize Snyk to seamlessly integrate with AWS and secure your open-source dependencies and container images. If you are interested in learning more about DevSecOps with Snyk and AWS, then I invite you to check out this workshop and watch this video.

 

About the Author

Author Photo

 

Jay is a Senior Partner Solutions Architect at AWS bringing over 20 years of experience in various technical roles. He holds a Master of Science degree in Computer Information Systems and is a subject matter expert and thought leader for strategic initiatives that help customers embrace a DevOps culture.

 

 

Running Simcenter STAR-CCM+ on AWS with AWS ParallelCluster, Elastic Fabric Adapter and Amazon FSx for Lustre

Post Syndicated from Bala Thekkedath original https://aws.amazon.com/blogs/compute/running-simcenter-star-ccm-on-aws/

This post is contributed by Anh Tran, Senior HPC Specialist Solution Architect, AWS

Introduction

AWS recently introduced many HPC services that boost the performance and scalability of Computational Fluid Dynamics (CFD) workloads on AWS. These services include: Amazon FSx for Lustre, Elastic Fabric Adapter (EFA), and AWS ParallelCluster 2.5.1. In this technical post, I walk through these three services. Additionally, I outline an example of using AWS ParallelCluster to set up an HPC system with EFA and Amazon FSx Lustre to run a CFD workload.  The CFD application that you will set up during this blog post is Simcenter STAR-CCM+  – the predominant CFD application from Siemens.

Service and solution overview

Services

This blog primarily uses three services – Amazon FSx for Lustre, EFA, and AWS ParallelCluster. Let’s dig into each of these services before reviewing the solution.

Amazon FSx for Lustre

In December 2018, AWS released Amazon FSx for Lustre. This is a fully managed, high-performance file system, optimized for fast processing workloads, like HPC. Amazon FSx for Lustre allows users to access and alter data from either Amazon S3 or on-premises seamlessly and exceptionally fast. For example, you can launch and run a file system that provides low latency access to your data. Additionally, you can read and write data at speeds of up to hundreds of gigabytes per second of throughput, and millions of IOPS. This speed and low-latency unleashes innovation at an unparalleled pace.  This blog post uses the latest version of Amazon FSx for Lustre which recently added a new API for moving data in and out of Amazon S3. This API also includes POSIX support, which allows files to mount with the same user id. Additionally, the latest version also includes a new backup feature that allows you to back up your files to an S3 bucket. I go into more detail of how to take advantage of this at the end of the blog.

Elastic Fabric Adapter

In April of 2019, AWS released EFA. This enables you to run applications requiring high levels of inter-node communications at scale on AWS.

AWS ParallelCluster 2.5.1.

AWS ParallelCluster is an open source cluster management tool that simplifies deploying and managing HPC clusters with Amazon FSx for Lustre, EFA, a variety of job schedulers, and the MPI library of your choice. AWS ParallelCluster simplifies cluster orchestration on AWS so that HPC environments become easy-to-use even for if you’re new to the cloud.  AWS recently released AWS ParallelCluster 2.5.1 – which is the version we will use for this blog.

These three AWS HPC components are optimal for CFD applications. Together, they provide simple deployment of HPC systems on AWS, low latency network communication for MPI workloads, and a fast, parallel filesystem. Now, let’s take a look at how these services come together and seamlessly run a real CFD application: Simcenter STAR-CCM+.

Solution

AWS has a long-standing collaboration with Siemens. AWS and Siemens are dedicated to enhancing Siemens’ customer experiences when they run Simcenter STAR-CCM+ apps on AWS. I am excited to walk you through the steps and the best practices for running Simcenter STAR-CCM+, the predominant CFD application from Siemens.

The Simcenter STAR-CCM+ application runs on an HPC system. This system is optimized with EFA and Amazon FSx for Lustre — all of which is managed by AWS ParallelCluster. AWS ParallelCluster simplifies the deployment process to such an extent that you can set up your HPC cluster with a high throughput parallel file system (Amazon FSx for Lustre), a high-throughput and low-latency network interface (EFA), and high-bandwidth network interconnects (100 Gbps using C5n instances) in less than 15 minutes.

Now that you have the services and solution overviews, we can get started. This blog post includes the following steps:

  1. Creating an HPC infrastructure stack on AWS, which will include:
    • How to set up AWS ParallelCluster for best performance
    • How to enable EFA
    • How to enable the Amazon FSx Lustre file system and how to use some basic Amazon FSx for Lustre for STAR-CCM+
    • How to connect to a remote desktop session using NICE DCV
  2. Installing the Simcenter STAR-CCM+ application and how to submit a Simcenter STAR-CCM+ job to HPC cluster.

The following diagram outlines the steps

Steps

Setting up the HPC infrastructure stack

Before you can run Simcenter STAR-CCM+, you need to build an HPC cluster first.  Some best practices that you should consider when setting up a cluster on AWS include:

  • Turn off Hyper-Threading (HT): AWS instances have HT turned on by default.
  • Use EFA and a cluster placement group for your compute fleet to minimize the latency between nodes.
  • Select the right instance type for your compute fleet. Here, I use C5n.18xlarge because of its high-performance CPU, high-bandwidth bandwidth networking, and the EFA network interface capabilities.

With HPC best practices in mind, you can set up your AWS ParallelCluster

Set up AWS ParallelCluster:

$aws s3 mb s3://benchmark-starccm
make_bucket: benchmark-starccm

*Note: As an example, I create an S3 bucket benchmark-starccm, however you should create an S3 bucket with a different name of your choice because the S3 bucket name must be globally unique.

Let’s download the STAR-CCM+ installation file and a case file, then upload them to the S3 bucket that we just created.

  • Download latest Simcenter STAR-CCM+ package from the Siemens portal. It will look something like this: STAR-CCM+15.02.003_01_linux-x86_64-2.12_gnu7.1.zip
  • Download the Le Mans case file. The Le Mans case is 104 Million cells, which even today is considered large for a CFD case.  The file will look something like this:  LeMans_104M.sim

After downloading the STAR-CCM+ software and LeMans case file, upload them to the S3 bucket created above

aws s3 cp STAR-CCM+15.02.003_01_linux-x86_64-2.12_gnu7.1.zip s3://benchmark-starccm
aws s3 cp LeMans_104M.sim s3://benchmark-starccm/

We will use this same S3 bucket to install the Simcenter STAR-CCM+ application later in this tutorial

With all the ground work done, we can now build our HPC cluster. For more detailed instructions, you can consult Getting Started with AWS ParallelCluster.

  • Install AWS ParallelCluster
pip install aws-parallelcluster
  • Configure AWS ParallelCluster with some basic network information such as AWS Region ID, VPC ID, Subnet ID

pcluster configure

Modify your ~/.parallelcluster/config file to include a cluster section that minimally includes the following:

[aws]
aws_region_name = us-east-2

[cluster default]
vpc_settings = public
key_name = <Key-Name>
initial_queue_size = 2
max_queue_size = 100
maintain_initial_size = true
placement_group = DYNAMIC
placement = cluster
master_instance_type = c5.xlarge
compute_instance_type = c5n.18xlarge
cluster_type = ondemand

base_os = centos7
tags = {“Name” : “STARCCM”}

enable_efa = compute
fsx_settings = fsxshared
disable_hyperthreading = true

dcv_settings = hpc-dcv

[vpc public]
master_subnet_id = subnet-<Subnet-ID>
vpc_id = vpc-<VPC-ID>

[global]
update_check = true
sanity_check = true
cluster_template = default

[dcv hpc-dcv]

enable = master

[fsx fsxshared]
shared_dir = /fsx
storage_capacity = 1200
imported_file_chunk_size = 1024
import_path = s3://benchmark-starccm

export_path = s3://benchmark-starccm

[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}

  • Now, create your first HPC cluster with the name starccmby running

pcluster create starccm

Your HPC cluster should be ready in about 15 minutes.

While you’re waiting for your cluster to be ready, let’s take a deeper look at what some of the different parameters we used mean for our HPC cluster:

initial_queue_size: We will start with two compute instances after the HPC cluster is up.

max_queue_size: We will limit the maximum compute fleet to 100 instances. This allows us room to scale our jobs up to a large number of cores while putting a limit on the number of compute nodes to help control costs.

base_os: For this blog, we will select centos 7 as a base os. Currently we support Amazon Linux (alinux), Centos 7 (centos7), Ubuntu 16.04 (ubuntu1604), and Ubuntu 18.04 (ubuntu1804) with EFA.

master_instance_type: This can be any instance type. Here we choose c5.xlarge because it is inexpensive and relatively fast for the head node.

compute_instance_type: We select C5n.18xlarge because it is optimized for compute-intensive workloads and supports EFA for better scaling of HPC. Note that EFA is currently only available on c5n.18xlarge, c5n.metal, i3en.24xlarge, p3dn.24xlarge, inf1.24xlarge, m5dn.24xlarge, m5n.24xlarge, r5dn.24xlarge, r5n.24xlarge, and p3dn.24xlarge. See the docs for Currently supported instances.

placement_group: We use placement_group to ensure our instances are located as physically close to one another as possible to minimize the latency between compute nodes and take advantage of EFA’s low latency networking.

enable_efa: with just one configuration line, we can easily turn on EFA support for our HPC cluster.

dcv_settings = hpc-dcv: With AWS ParallelCluster 2.5.1 you can use NICE DCV to support your remote visualization needs.

disable_hyperthreading: This setting turns off hyper-threading on the cluster

[fsx fsxshared]: This section contains the settings to define your FSx for Lustre parallel file system, including the location where the shared directory will be mounted, the storage capacity for the filesystem, the chunk size for files to be imported, and the location from which the data will be imported. You can read more about FSx for Lustre here.

[dcv hpc-dcv]: This section contains the settings to define your remote visualization setup. You can read more about DCV with AWS ParallelCluster here.

  •  After you set up your config file for AWS ParallelCluster, log in and verify that you can access the cluster’s head node
pcluster ssh starccm -i ~/path/to/ssh_key
  • Verify the compute nodes are up. We should see two c5n.18xlarge nodes.
$ qhost
HOSTNAME ARCH NCPU NSOC NCOR NTHR LOAD MEMTOT MEMUSE SWAPTO SWAPUS
global - - - - - - - - - -
ip-172-31-14-220 lx-amd64 36 2 36 36 0.49 184.6G 11.7G 0.0 0.0
ip-172-31-2-137 lx-amd64 36 2 36 36 0.45 184.6G 11.7G 0.0 0.0
  • Verify the EFA driver has been loaded successfully.

In order to verify if EFA is installed correctly you will need to ssh into one of compute nodes and run :

[[email protected] ~]$ which mpirun
/opt/amazon/openmpi/bin/mpirun

[[email protected] ~]$ fi_info -p efa

provider: efa
fabric: EFA-fe80::97:9fff:fe1e:4e78
domain: efa_0-rdm
version: 2.0
type: FI_EP_RDM
protocol: FI_PROTO_EFA

provider: efa
fabric: EFA-fe80::97:9fff:fe1e:4e78
domain: efa_0-dgrm
version: 2.0
type: FI_EP_DGRAM
protocol: FI_PROTO_EFA
provider: efa;ofi_rxd
fabric: EFA-fe80::97:9fff:fe1e:4e78
domain: efa_0-dgrm
version: 1.0
type: FI_EP_RDM
protocol: FI_PROTO_RXD

At this point, EFA is verified.

Install Simcenter STAR-CCM+ application

Now that the HPC cluster using AWS ParallelCluster is set up, it’s time to install the Simcenter STAR-CCM+ application.  In the prior steps, you uploaded a Simcenter STAR-CCM+ application and a case file to S3 bucket and used that S3 bucket as a source for the Amazon FSx for Lustre /fsx storage. As soon as the cluster created, the installation file and the case file will be available in /fsx

[[email protected] fsx]$ cd /fsx
[[email protected] fsx]$ ls
LeMans_104M.sim  STAR-CCM-14.06.013_01_linux-x86_64.tar.gz  STAR-CCM+14.06.013_01_linux-x86_64.tar.gz

As you can see, Amazon FSx for Lustre has already downloaded the case file from the S3 bucket to the /fsx partition, so now you can start install Simcenter STAR-CCM+ software using the following steps.

  • Install Simcenter STAR-CCM+: install Simcenter STAR-CCM+ on /fsx  – a 1.2TB Lustre filesystem that you configured in a previous step.
cd /fsx
sudo unzip STAR-CCM+15.02.003_01_linux-x86_64-2.12_gnu7.1.zip
cd STAR-CCM+15.02.003_01_linux-x86_64-2.12_gnu7.1
./STAR-CCM+15.02.003_01_linux-x86_64-2.12_gnu7.1.sh
Select Installation Location : /fsx/Siemens

After following all the standard installation steps from Simcenter STAR-CCM+, the application should be installed at the following location:

/fsx/Siemens/15.02.003/STAR-CCM+15.02.003/

  • Test the installation
[[email protected] fsx]$ /fsx/Siemens/15.02.003/STAR-CCM+15.02.003/star/bin/starccm+ -version
Simcenter STAR-CCM+ 2020.1 Build 15.02.003 (linux-x86_64-2.12/gnu7.1)

When the above code shows up, you correctly installed Simcenter STAR-CCM+, so now you can run the application.

Running Simcenter STAR-CCM+ on AWS ParallelCluster

Before I move on, let’s recap what you’ve have done so far.

  1. You set up an HPC cluster using AWS ParallelCluster with compute-optimized C5n instances, an Amazon FSx for Lustre filesystem, and EFA-enabled networking.
  2. You installed Simcenter STAR-CCM+ application

Now, let us create an SGE submission script and submit a job to the HPC cluster.

  • Create an SGE job submission script :
cd /fsx/

vi star-ccm.qsub

#!/bin/bash
#$ -N check
#$ -cwd
#$ -j Y
#$ -pe mpi 252

date

your_pod_key="your license key"
ccmp=" /fsx/Siemens/15.02.003/STAR-CCM+15.02.003/star/bin/starccm+”
case="/fsx/LeMans_104M.sim"

fabric="OFI"
tag="fsx-${fabric}"

${ccmp} \
-power \
-podkey $your_pod_key \
-licpath [email protected] \
-mpi openmpi \
-bs sge \
-benchmark:"-preits 40 -nits 20 -nps ${NSLOTS} -tag ${tag}" \
${case}

date

mkdir ${NSLOTS}new
mv *xml ${NSLOTS}new
  • Submit your Simcenter STAR-CCM+ job to your HPC cluster
qsub -V star-ccm.qsub

Now you have submitted an HPC job that requests 252 cores of c5n.18xlarge.

  • Check the status of your jobs by running
qstat -f

Simcenter STAR-CCM+ result analysis

Here are sample scaling results for the LeMans 104M cell benchmark case. As you can see, the Simcenter STAR-CCM+ result shows exciting performance and scaling results running on AWS. The performance is fast and the simulation scales very well with EFA-based networking and great CPU performance.

Connect to a NICE DCV session

AWS ParallelCluster is now natively integrated with NICE DCV. You can configure a NICE DCV session to visualize your STAR-CCM+ result or connect to the application remotely.

As you will recall from when we configured our AWS ParallelCluster in the previous section, I named the DCV server as hpc-dcv.  To create and connect to a NICE DCV session, just run:

pcluster dcv connect hpc-dcv -k <Key-Name>

After you connect to NICE DCV session, you will be able to access to a Linux desktop to work with STAR-CCM+ application.

Backup SIMCENTER STAR-CCM+ result to S3 bucket

After you finish your STAR-CCM+ simulation, you can backup data in /fsx to your S3 bucket that you created from running your application. You can now use Data Repository Tasks. Data Repository Tasks represent bulk operations between your Amazon FSx for Lustre file system and your S3 bucket. One of the jobs is to export your changed file system contents back to its linked S3 bucket.

*Note: in order to use new Amazon FSx for Lustre feature, you will need to have AWS CLI version 1.16.309 or above.

In this case, I select to export the  STAR-CCM+ application directory Siemens as an example.

  • Exit the HPC head node, and go back to your laptop or Cloud9 environment where you have configured your AWS CLI. Find out your Amazon FSx Lustre ID by running:
aws fsx describe-file-systems
  • After you find the Amazon FSx for Lustre ID, which looks simiar to fs-0d72d520f620d765a, create a backup of the data by running:
create-data-repository-task

aws fsx create-data-repository-task --file-system-id fs-0d72d520f620d765a --type EXPORT_TO_REPOSITORY --paths Siemens,testfsx --report Enabled=true,Scope=FAILED_FILES_ONLY,Format=REPORT_CSV_20191124,Path=s3://benchmark-starccm/

Explanation:

–file-system-id: your file system ID

–type EXPORT_TO_REPOSITORY: we will export the data back to the S3 bucket

–paths Siemens,testfsx: the directories you want to export to S3 bucket

Format=REPORT_CSV_20191124:  note this is only name the Amazon FSx Lustre supports. Please keep it the same.

  • Check the status of the backup by running describe-create-data-repository-task
aws fsx describe-data-repository-tasks

As you can see, I can now use FSx for Lustre to install the Simcenter STAR-CCM+ application, and seamlessly move case data between on-premise to Amazon S3 and AWS HPC system. I can create a file system linked to a S3 bucket, create a FSx for Lustre filesystem, and export data back to their S3 bucket after running the CFD app.

Conclusion

Give it a try, set up an HPC environment, and let us know how it goes! If you need more information about running CFD and HPC cases on AWS you can find it on our HPC home page. Please feel free to contact us with questions that you might have.