
Load testing a web application’s serverless backend

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/load-testing-a-web-applications-serverless-backend/

Many web applications experience high levels of traffic and spiky load patterns. The goal of load testing is to ensure that the architecture and system design works for the amount of traffic expected. It can help developers find bottlenecks and unexpected behavior before a system is deployed to production. This post uses the Ask Around Me application as an example to show how to test load in a serverless architecture.

In Ask Around Me, users ask and answer questions in their local geographic area. The expected hourly load is 1,000 new questions, 10,000 new answers, and 50,000 question lookup queries. I use these numbers as a baseline for the tests. This is the architecture of the Ask Around Me backend application:

Ask Around Me backend architecture

Focus areas for load testing

In serverless architectures using AWS services, you can perform a round-trip test from an API endpoint, or you can isolate areas in the design where you should test performance. API testing provides the best approximation of the performance that users experience, but it may not always be possible. Alternatively, you can isolate microservices that consume from SQS queues or receive events from Amazon EventBridge, and test only those parts of the infrastructure.

While AWS services are built to withstand high levels of traffic, it’s important to consider the effect of Service Quotas on your application. Service Quotas are applied at the Region and account levels depending upon the service. You can view all your quotas in one place from the Service Quotas console. These are designed to protect you and other customers if your applications use more resources than planned. These quotas consist of hard and soft limits. For soft limits, you can request quota increases by opening a support ticket.
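
You can also inspect applied quotas programmatically. The following is a minimal boto3 sketch (my addition for illustration, not part of the original walkthrough); "lambda" is one example service code:

import boto3

quotas_client = boto3.client("service-quotas")
paginator = quotas_client.get_paginator("list_service_quotas")

# Print every applied quota for the Lambda service
for page in paginator.paginate(ServiceCode="lambda"):
    for quota in page["Quotas"]:
        print(f'{quota["QuotaName"]}: {quota["Value"]} (adjustable: {quota["Adjustable"]})')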

You must also consider downstream services. While serverless services like Lambda scale on your behalf, you may use downstream services that could be overwhelmed when traffic increases. Load testing can help identify these areas. You can implement mechanisms like queuing, caching, or pooling to protect those non-serverless parts of your infrastructure. If you are using Amazon RDS, for example, you might implement Amazon RDS Proxy to help pool and scale resources.

Finally, load testing can help identify custom code in Lambda functions that may not run efficiently as traffic scales up. Typically, these issues are caused by the code itself or the function configuration. For example, code may process event batches inefficiently, or a function may not be configured with the appropriate concurrency or memory settings. Frequently these issues go unnoticed in development but surface in a load test.

Load testing tools

Load testing serverless infrastructure can be both inexpensive and systematic. There are several tools available for serverless developers to perform this task. One of the most popular is Artillery Community Edition, which is an open-source tool for testing serverless APIs. You configure the number of requests per second and overall test duration, and it uses a headless Chromium browser to run its test flows.

The performance report measures the round-trip time from the client device, so it can be affected by your machine’s performance and network. One way to eliminate your local network’s impact on the results is to use AWS Cloud9 to run the tests remotely.

For Artillery, the maximum number of concurrent tests is constrained by your local computing resources and network. To achieve higher throughput, you can use Serverless Artillery, which runs the Artillery package on Lambda functions. As a result, this tool can scale up to a significantly higher number of tests.

The Ask Around Me application is deployed in my AWS account – see the application’s blog series to learn more about the deployment process. I use an AWS Cloud9 instance to run these API tests:

  1. Adding 1,000 questions per hour using the POST /questions API.
  2. Adding 10,000 answers per hour using the POST /answers API.
  3. Fetching 50,000 questions per hour based upon random geo-location using the GET /questions API.

You can find the test scripts and Artillery configurations in the testing directory of the application’s GitHub repo.

Artillery also enables you to specify custom functions to provide randomized data and custom query parameters, as required by your API. The loadTestFunction.js file contains a function to return randomized geo-point and rating data per test:

// Sets a bounding box around an area in Virginia, USA
const bounds = {
  latMax: 40.898677,
  latMin: 38.735083,
  lngMax: -77.109339,
  lngMin: -81.587841
}

const generateRandomData = (userContext, events, done) => {
  const randomLat = ((bounds.latMax-bounds.latMin) * Math.random()) + bounds.latMin
  const randomLng = ((bounds.lngMax-bounds.lngMin) * Math.random()) + bounds.lngMin

  const id = parseInt(Math.random()*1000000)+1  // random integer 1-1000000
  const rating = parseInt(Math.random()*5)+1    // returns 1-5

  userContext.vars.lat = randomLat.toFixed(7)
  userContext.vars.lng = randomLng.toFixed(7)
  userContext.vars.id = id
  userContext.vars.rating = rating

  return done()
}

module.exports = { generateRandomData }

Test #1: Adding 1,000 questions per hour

The POST questions API has the following architecture:

POST questions architecture

The Artillery configuration file 1-test.yaml is set to create three requests per second over a 5-minute duration. This equates to 10,800 questions per hour, significantly higher than the estimated load for this function. The scenario specifies the JSON payload expected by the questions API:

config:
  target: 'https://abcd1234567.execute-api.us-east-1.amazonaws.com'
  phases:
    - duration: 300
      arrivalRate: 3
  processor: "./loadTestFunction.js"          
  defaults:
    headers:
      Authorization: 'Bearer <<enter your valid JWT token>>'
scenarios:
  - flow:
    - function: "generateRandomData"
    - post:
        url: "/questions"
        json:
          question: "This is a load test question - #{{ id }}"
          type: "Star rating"
          position:
            latitude: "{{ lat }}"
            longitude: "{{ lng }}"
    - log: "Sent POST request to / with {{ lat }}, {{ lng }}"

You execute the Artillery test with the command artillery run ./1-test.yaml. My test concludes with the following results:

Artillery test results

Over the five-minute test, the median response time is 114 ms. The p95 response time shows that 95% of all responses are served within 376 ms. The slowest response, at 1401 ms, is caused by cold starts that occur when the Lambda service scales up the underlying function due to load.

As this process writes to a DynamoDB table, I can also see how many write capacity units (WCUs) are consumed by the test. From the DynamoDB console, select the table aamQuestions, then choose the Metrics tab. This shows the Write capacity metric:

CloudWatch DynamoDB metrics
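
If you prefer to pull the same metric programmatically, a short boto3 sketch like the following works as well (the 10-minute window is an assumption sized to cover the 5-minute test):

from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client("cloudwatch")

# Query consumed WCUs for the aamQuestions table over the last 10 minutes
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/DynamoDB",
    MetricName="ConsumedWriteCapacityUnits",
    Dimensions=[{"Name": "TableName", "Value": "aamQuestions"}],
    StartTime=datetime.utcnow() - timedelta(minutes=10),
    EndTime=datetime.utcnow(),
    Period=60,
    Statistics=["Sum"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"] / 60)  # approximate average WCUs per second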

Test #2: Adding 10,000 answers per hour

The POST answers API has the following architecture:

POST answers architecture

The Artillery configuration in 2-test.yaml creates 10 answers per second over a 5-minute duration. This equates to 36,000 answers per hour, much higher than the estimated load. The scenario defines the randomized rating used by the testing process:

config:
  target: 'https://abcd1234567.execute-api.us-east-1.amazonaws.com'
  phases:
    - duration: 300
      arrivalRate: 10
  processor: "./loadTestFunction.js"          
  defaults:
    headers:
      Authorization: 'Bearer <<enter your valid JWT token>>'
scenarios:
  - flow:
    - function: "generateRandomData"
    - post:
        url: "/answers"
        json:
          type: "Star"
          rating: "{{ rating }}"
          question: 
            type: "Star"
            latitude: 39.08259127440097
            longitude: -77.46246339003038
            rangeKey: "testuser|1-1589380702281"
    - log: "Sent POST request to / with {{ rating }}"

The test results show a median response time of 111 ms with a p95 time of 218 ms. In the worst case, a request took 1102 ms to complete:

Artillery summary report

Checking the Metrics tab for the aamAnswers table, this test consumed just under 11 WCUs at peak:

CloudWatch DynamoDB metrics

Test #3: Fetching 50,000 questions per hour

The GET questions API invokes a Lambda function that uses the Geo Library for Amazon DynamoDB:

GET questions architecture

This process is read-intensive on the underlying DynamoDB table. The testing configuration simulates 20 queries per second over 2 minutes for random locations in a bounding box around Virginia, USA:

config:
  target: 'https://abcd1234567.execute-api.us-east-1.amazonaws.com'
  phases:
    - duration: 120
      arrivalRate: 20
  processor: "./loadTestFunction.js"          
  defaults:
    headers:
      Authorization: 'Bearer <<enter your valid JWT token>>'
scenarios:
  - flow:
    - function: "generateRandomData"
    - get:
        url: "/questions"
        qs:
          lat: "{{ lat }}"
          lng: "{{ lng }}"
    - log: "Sent POST request to / with {{ lat }}, {{ lng }}"

This is a synchronous API, so the performance directly impacts the user’s experience of the application. This test shows that the median response time is 165 ms with a p95 time of 201 ms:

Artillery performance results

This level of load equates to 72,000 queries per hour, almost 50% above the expected usage. The DynamoDB metrics show a peak consumption of 82 read capacity units:

CloudWatch monitoring details

Testing authenticated routes

These API routes are protected from public access and require authorization. This application uses HTTP APIs, which accept JWT tokens, and it uses Auth0 in the frontend application to generate these tokens. When you are load testing API Gateway routes that require authorization, you have a number of options.

At the early development stage, you may choose to remove the authentication to perform load tests. This simplifies the process but is not recommended beyond research and prototyping. If you turn off authentication for testing, there is a risk that it is not enabled again for production. This would leave your routes open to the public.

A better approach is to create a test user in your identity provider and use the JWT token for testing. Auth0 allows you to obtain a token manually, and use this in the Artillery configuration for the authorization header:

Embedding authorization token in test script

Since custom code frequently uses the decoded identity in processing, supplying a test token provides the closest simulation of actual usage. You must refresh this token in the test scripts periodically, and you can change scopes as needed.
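
If refreshing the token manually becomes tedious, you can fetch one programmatically. The following Python sketch uses Auth0’s client credentials grant; the tenant domain, client credentials, and audience are placeholders rather than values from this application:

import requests

def get_test_token() -> str:
    # Client credentials grant against the Auth0 token endpoint
    response = requests.post(
        "https://YOUR_TENANT.auth0.com/oauth/token",
        json={
            "client_id": "YOUR_CLIENT_ID",
            "client_secret": "YOUR_CLIENT_SECRET",
            "audience": "YOUR_API_AUDIENCE",
            "grant_type": "client_credentials",
        },
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["access_token"]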

The testing directory in the GitHub repo also includes a script for testing functions that consume from SQS queues. This allows you to test microservices further down in your infrastructure stack. This script injects messages into the SQS queue, simulating upstream processes.
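
As a rough illustration of that approach, a minimal boto3 script could inject messages like the following (the queue URL and message shape are assumptions, not the repo’s actual script):

import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue"  # placeholder

# Send a burst of test messages to simulate upstream producers
for i in range(100):
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"test": True, "messageId": i}),
    )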

Conclusion

In this post, I discuss focus areas for load testing of serverless applications, and highlight two tools commonly used. I show how to configure Artillery with customized functions, and how to run tests to simulate load on the Ask Around Me application.

I cover some of the options for testing authenticated API Gateway routes and how you can use JWT tokens in your load testing configuration. You can also test microservices within a serverless architecture by injecting messages into SQS queues to simulate upstream load.

To learn more about the Ask Around Me serverless applications, read the blog series.

Test your home network performance

Post Syndicated from Achiel van der Mandele original https://blog.cloudflare.com/test-your-home-network-performance/


With many people being forced to work from home, there’s increased load on consumer ISPs. You may be asking yourself: how well is my ISP performing with even more traffic? Today we’re announcing the general availability of speed.cloudflare.com, a way to gain meaningful insights into exactly how well your network is performing.

We’ve seen a massive shift from users accessing the Internet from busy office districts to spread-out urban areas.

Although there are a slew of speed testing tools out there, none of them give you precise insights into how they came to those measurements and how they map to real-world performance. With speed.cloudflare.com, we give you insights into what we’re measuring and how exactly we calculate the scores for your network connection. Best of all, you can easily download the measurements from right inside the tool if you’d like to perform your own analysis.

We also know you care about privacy. We believe that you should know what happens with the results generated by this tool. Many other tools sell the data to third parties. Cloudflare does not sell your data. Performance data is collected and anonymized and is governed by the terms of our Privacy Policy. The data is used anonymously to determine how we can improve our network, both in terms of capacity as well as to help us determine which Internet Service Providers to peer with.


The test has three main components: download, upload, and a latency test. Each measures a different aspect of your network connection.

Down

For starters we run you through a basic download test. We start off downloading small files and progressively move up to larger and larger files until the test has saturated your Internet downlink. Small files (we start off with 10KB, then 100KB and so on) are a good representation of how websites will load, as these typically encompass many small files such as images, CSS stylesheets and JSON blobs.

For each file size, we show you the measurements inside a table, allowing you to drill down. Each dot in the bar graph represents one of the measurements, with the thin line delineating the range of speeds we’ve measured. The slightly thicker block represents the set of measurements between the 25th and 75th percentile.


Getting up to the larger file sizes, we can see true maximum throughput: how much bandwidth do you really have? You may be wondering why we have to use progressively larger files. The reason is that download speeds start off slow (this is aptly called slow start) and then progressively get faster. If we were to use only small files, we would never reach the maximum throughput that your network provider supports, which should be close to the Internet speed your ISP quoted you when you signed up for service.

The maximum throughput on larger files will be indicative of how fast you can download large files such as games (GTA V is almost 100 GB to download!) or the maximum quality that you can stream video on (lower download speed means you have to use a lower resolution to get continuous playback). We only increase download file sizes up to the absolute minimum required to get accurate measurements: no wasted bandwidth.

Up

Upload is the opposite of download: we send data from your browser to the Internet. This metric is more important nowadays with many people working from home: it directly affects live video conferencing. A faster upload speed means your microphone and video feed can be of higher quality, meaning people can see and hear you more clearly on videoconferences.

Measurements for upload operate in the same manner: we progressively try to upload larger and larger files up until the point we notice your connection is saturated.

Speed measurements are never 100% consistent, which is why we repeat them. An easy way for us to report your speed would be to simply report the fastest speed we see. The problem is that this would not be representative of your real-world experience: latency and packet loss constantly fluctuate, meaning you can’t expect to see your maximum measured performance all the time.

To compensate for this, we take the 90th percentile of measurements, or p90, and report that instead of the absolute maximum speed that we measured. Taking the 90th percentile discounts peak outliers, which makes it a much closer approximation of the speeds you can expect in the real world.
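
As a simplified sketch of the idea (the tool’s exact estimator may differ, and the sample values are made up):

def p90(samples):
    # Simplified nearest-rank percentile over the sorted measurements
    ordered = sorted(samples)
    return ordered[int(0.9 * (len(ordered) - 1))]

speeds_mbps = [92.1, 88.4, 95.0, 71.3, 90.2, 94.7, 89.9, 93.5]
print(f"reported: {p90(speeds_mbps)} Mbps (maximum seen: {max(speeds_mbps)} Mbps)")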

Latency and Jitter

Download and upload are important metrics but don’t paint the entire picture of the quality of your Internet connection. Many of us find ourselves interacting with work and friends over videoconferencing software more than ever. Although speeds matter, video is also very sensitive to the latency of your Internet connection. Latency represents the time an IP packet needs to travel from your device to the service you’re using on the Internet and back. High latency means that when you’re talking on a video conference, it will take longer for the other party to hear your voice.

But latency only paints half the picture. Imagine yourself in a conversation where you have some delay before you hear what the other person says. That may be annoying, but after a while you get used to it. What would be even worse is if the delay changed constantly: sometimes the audio is almost in sync and sometimes it has a delay of a few seconds. You can imagine how often this would result in two people starting to talk at the same time. This is directly related to how stable your latency is and is represented by the jitter metric. Jitter is the average variation found in consecutive latency measurements. A lower number means that the latencies measured are more consistent, meaning your media streams will have the same delay throughout the session.
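
Here is a small sketch of that jitter calculation in Python (the sample values are made up):

latencies_ms = [21.0, 24.5, 20.8, 22.1, 23.9]

# Average absolute difference between consecutive latency samples
diffs = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
jitter_ms = sum(diffs) / len(diffs)
print(f"jitter: {jitter_ms:.1f} ms")  # 2.6 ms for these samples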


We’ve designed speed.cloudflare.com to be as transparent as possible: you can click into any of the measurements to see the average, median, minimum, maximum measurements, and more. If you’re interested in playing around with the numbers, there’s a download button that will give you the raw results we measured.


The entire speed.cloudflare.com backend runs on Workers, meaning all logic runs entirely on the Cloudflare edge and in your browser; no server necessary! If you’re interested in seeing how the benchmarks take place, we’ve open-sourced the code: feel free to take a peek at our GitHub repository.

We hope you’ll enjoy adding this tool to your set of network debugging tools. We love being transparent and our tools reflect this: your network performance is more than just one number. Give it a whirl and let us know what you think.

Automating your API testing with AWS CodeBuild, AWS CodePipeline, and Postman

Post Syndicated from Juan Lamadrid original https://aws.amazon.com/blogs/devops/automating-your-api-testing-with-aws-codebuild-aws-codepipeline-and-postman/

Today, enterprises of all shapes and sizes are engaged in some form of digital transformation. Many recognize that successful digital transformation requires continuous evolution powered by a robust API strategy. APIs enable the creation of new products, improvement of the customer experience, transformation of business processes, and ultimately, the agility needed to create sustainable business value. Hence, it stands to reason that adopting DevOps best practices such as Continuous Integration into your API development lifecycle helps improve the quality of your APIs and accelerate your API strategy.

In this post, we highlight how to automate API testing using serverless technologies, including AWS CodePipeline and AWS CodeBuild, along with Postman. AWS CodePipeline is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates. CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages ready to deploy without the need to provision, manage, and scale your own build servers.

We take advantage of a new feature in CodeBuild called Reports that allows us to view the reports generated by functional or integration tests. We keep an eye on valuable metrics such as Pass Rate %, Test Run Duration, and the number of Passed versus Failed/Error test cases.

Postman is an industry-recognized tool used for API development that makes it easy to both develop and test your APIs. Postman also includes command-line integration with its command-line Collection Runner, Newman. Newman can easily be integrated with your continuous integration servers and build systems. Our CodePipeline pipeline uses CodeBuild to invoke the Newman command line interface and execute tests created with Postman. We cover the steps in detail below.

Solution Overview

In this post, we demonstrate how to automate the deployment and testing of the Pet Store API that is available as a sample API with API Gateway. This is a simple API that integrates via HTTP proxy to a demo Pet Store API. The API contains endpoints to list pets, get a pet by specific id, and add a pet.

The following diagram depicts the architecture of this simple Pet Store demo API.

Simple PetStore API Architecture

The following diagram depicts the AWS CodePipeline pipeline architecture we use to test the PetStore API.

The AWS CodePipeline pipeline architecture we use to test our API.

After execution of this pipeline, you have a fully operational API that has been tested for specific functional requirements. These test cases and their results are visible in the Reports section of the CodeBuild console.

Building the PetStore API Pipeline

To get started, follow these steps:

Step 1. Fork the Github repository

Log into your GitHub account and fork the following repository: https://github.com/aws-samples/aws-codepipeline-codebuild-with-postman

Step 2. Clone the forked repository

Clone the forked repository into your local development environment.
git clone https://github.com/<YOUR_GITHUB_USERNAME>/aws-codepipeline-codebuild-with-postman

Step 3. Create an Amazon S3 bucket

This bucket contains resources related to this project. We refer to this bucket as the project’s root bucket.
Using the AWS CLI: aws s3 mb s3://<REPLACE_ME_WITH_UNIQUE_BUCKET_NAME>

Step 4. Edit the buildspec file

The buildspec file petstore-api-buildspec.yml contains the instructions to package the resources defined in your SAM template, petstore-api.yaml. This build spec is executed by CodeBuild within the build stage (BuildPetStoreAPI) of the pipeline.

1. Replace the following text REPLACE_ME_WITH_UNIQUE_BUCKET_NAME in the petstore-api-buildspec.yml with the bucket name created above in step 3.

2. Commit this change to your repository.

Step 5. Store Postman collection and environment files in S3

1. Navigate to the directory 02postman

For this project we included a Postman collection file, PetStoreAPI.postman_collection.json, that validates the PetStore API’s functionality. You can import the collection and environment file into Postman using the instructions here to see the tests associated with each API endpoint.

The following screenshot is an example specific to testing a GET request to the /pets endpoint (1). We make sure the GET request returns a JSON array (2) along with the inclusion of a Content-Type header (3) and a response time of less than 200 ms (4). In the Test results tab (5), you can see we passed these tests when calling the API.

Postman screenshot showing tests for specific endpoint.

2. Save the Postman collection file in S3 using the AWS CLI

aws s3 cp PetStoreAPI.postman_collection.json \
s3://<REPLACE_ME_WITH_UNIQUE_BUCKET_NAME>/postman-env-files/PetStoreAPI.postman_collection.json

3. Save the Postman environment file to S3 using the AWS CLI

aws s3 cp PetStoreAPIEnvironment.postman_environment.json \
s3://<REPLACE_ME_WITH_UNIQUE_BUCKET_NAME>/postman-env-files/PetStoreAPIEnvironment.postman_environment.json

Step 6. Create the PetStore API pipeline

We now create the AWS CodePipeline PetStoreAPI pipeline that both deploys and tests our API. We use an AWS CloudFormation template (petstore-api-pipeline.yaml) to define the pipeline and required stages, as noted in our pipeline architecture diagram.

1. Navigate back to the project’s root directory.

To launch this template, you need to fill in a few parameters:

  • BucketRoot: the unique bucket name you created above in step 3
  • GitHubBranch: master
  • GitHubRepositoryName: aws-codepipeline-codebuild-with-postman
  • GitHubToken: your GitHub personal access token. You can create your token here (for select scopes, check repo and admin:repo_hook)
  • GitHubUser: your GitHub username

2. Use the AWS CLI to deploy the AWS CloudFormation template as follows

aws cloudformation create-stack --stack-name petstore-api-pipeline \
--template-body file://./petstore-api-pipeline.yaml \
--parameters \
ParameterKey=BucketRoot,ParameterValue=<REPLACE_ME_WITH_UNIQUE_BUCKET_NAME> \
ParameterKey=GitHubBranch,ParameterValue=<REPLACE_ME_GITHUB_BRANCH> \
ParameterKey=GitHubRepositoryName,ParameterValue=<REPLACE_ME_GITHUB_REPO> \
ParameterKey=GitHubToken,ParameterValue=<REPLACE_ME_GITHUB_TOKEN> \
ParameterKey=GitHubUser,ParameterValue=<REPLACE_ME_GITHUB_USERNAME> \
--capabilities CAPABILITY_NAMED_IAM

This command creates a CodePipeline pipeline and required stages to deploy and test our API using CodeBuild and Newman. Open the CodePipeline console to watch your pipeline execute and monitor the different stages, as shown in the following screenshot.

PetStore API AWS CodePipeline
The last stage of the pipeline uses CodeBuild and Newman to execute the tests created with Postman. You should now have a fully functional API visible in the Amazon API Gateway console.

Review AWS CodeBuild configuration

For this pipeline, we use CodeBuild to both deploy our API in the build stage and to test our API in the test stage of the pipeline. For the deploy stage, CodeBuild uses AWS Serverless Application Model (SAM) to build and deploy our API. We focus on the test stage and how we use CodeBuild to run functional tests against our API.

Take a look at the buildspec file (postman-newman-buildspec.yml) that CodeBuild uses to execute the tests. Recall that our goal for this stage is to execute the functional tests that we created earlier using Postman and to visualize these test results in CodeBuild Reports:

version: 0.2

env:
  variables:
    key: "S3_BUCKET"

phases:
  install:
    runtime-versions:
      nodejs: 10
    commands:
      - npm install -g newman
      - yum install -y jq

  pre_build:
    commands:
      - aws s3 cp "s3://${S3_BUCKET}/postman-env-files/PetStoreAPIEnvironment.postman_environment.json" ./02postman/
      - aws s3 cp "s3://${S3_BUCKET}/postman-env-files/PetStoreAPI.postman_collection.json" ./02postman/
      - cd ./02postman
      - ./update-postman-env-file.sh

  build:
    commands:
      - echo Build started on `date` from dir `pwd`
      - newman run PetStoreAPI.postman_collection.json --environment PetStoreAPIEnvironment.postman_environment.json -r junit

reports:
  JUnitReports: # CodeBuild creates a report group based on this name
    files: # store all of the report files
      - '**/*'
    base-directory: '02postman/newman' # location of the reports

In the install phase, we install the required Newman library. Recall this is the library that uses the Postman collection and environment files to execute tests from the CLI. We also install jq, a command-line tool for querying JSON.

In the pre_build phase, we execute commands that set up our test environment. In this case, we need to grab the Postman collection and environment files from Amazon S3. Then we use a shell script, update-postman-env-file.sh, to update the Postman environment file with the API Gateway URL for the API created in the build stage. Let’s take a look at the shell script executed by CodeBuild:

#!/usr/bin/env bash

#This shell script updates Postman environment file with the API Gateway URL created
# via the api gateway deployment

echo "Running update-postman-env-file.sh"

api_gateway_url=`aws cloudformation describe-stacks \
  --stack-name petstore-api-stack \
  --query "Stacks[0].Outputs[*].{OutputValueValue:OutputValue}" --output text`

echo "API Gateway URL:" ${api_gateway_url}

jq -e --arg apigwurl "$api_gateway_url" '(.values[] | select(.key=="apigw-root") | .value) = $apigwurl' \
  PetStoreAPIEnvironment.postman_environment.json > PetStoreAPIEnvironment.postman_environment.json.tmp \
  && cp PetStoreAPIEnvironment.postman_environment.json.tmp PetStoreAPIEnvironment.postman_environment.json \
  && rm PetStoreAPIEnvironment.postman_environment.json.tmp

echo "Updated PetStoreAPIEnvironment.postman_environment.json"

cat PetStoreAPIEnvironment.postman_environment.json

This shell script wraps AWS API commands to get the required API Gateway URL from the AWS CloudFormation stack output and uses this value to update the Postman environment file. Notice how we also use the jq library installed earlier.

Once this is done, we move on to the build phase in postman-newman-buildspec.yml. Note in the commands section how we execute the Newman command-line runner by passing in the required Postman collection and environment files. Also, notice how we specify to Newman that we want these reports in JUnit-style output. This is very important, as it allows CodeBuild Reports to consume and visualize the output.

Once our test run is complete, we specify in our buildspec file where our test results JUnit files are available. This allows CodeBuild Reports to consume our JUnit test results for visualization.

You can accomplish all of this without having to provision, manage, and scale your own build servers.

Working with CodeBuild’s test reporting feature

CodeBuild announced a new reporting feature that allows you to view the reports generated by functional or integration tests. You can use your test reports to view trends and test failure rates to help you optimize builds. The test file format can be JUnit XML or Cucumber JSON. You can create your test cases with any test framework that can create files in one of those formats (for example, the Surefire JUnit plugin, TestNG, or Cucumber).

Using this feature, you can see the history of your test runs and see duration for the entire report, as shown in the following screenshots:

test run trends

test run summary information

It also provides details for individual test cases within a report, as shown in the following screenshot.

 

details for individual test cases within a report

You can select any individual test case to see its details. The following screenshot shows details of a failed test case.

details of a failed test case

Please note that at the time of this publication, the CodeBuild reporting feature is in preview.

Cleanup

After the tests are completed, we recommend the following steps to clean up the resources created in this post and avoid any charges.

1. Delete the AWS CloudFormation stack petstore-api-stack to delete the PetStore API deployed by the pipeline stack

2. Delete the pipeline artifact bucket created by the petstore-api-pipeline stack.

This is the bucket referred to as CodePipelineArtifactBucket in the resources tab of the petstore-api-pipeline stack and begins with the name: petstore-api-pipeline-codepipeline-artifact-bucket. This bucket needs to be deleted in order to delete the pipeline stack.

3. Delete the AWS CloudFormation stack petstore-api-pipeline to delete the AWS CodePipeline pipeline that builds and deploys the PetStore API.

Conclusion

Continuous Integration is a DevOps best practice that helps improve software quality. In this blog post, we showed how you can use AWS Services such as CodeBuild and CodePipeline with Postman, a powerful API testing and development tool, to easily adopt Continuous Integration and DevOps best practices into your API development process.

Tackling UI test execution time imbalance for Xcode parallel testing

Post Syndicated from Grab Tech original https://engineering.grab.com/tackling-ui-test-execution-time-imbalance-for-xcode-parallel-testing

Introduction

Testing is a common practice to ensure that code logic is not easily broken during development and refactoring. Having tests run as part of Continuous Integration (CI) infrastructure is essential, especially with a large codebase contributed to by many engineers. However, the more tests we add, the longer they take to execute. In the context of iOS development, the execution time of the whole test suite might be significantly affected by the increasing number of tests written. Running CI pre-merge pipelines against a change would then cost us more time. Therefore, reducing test execution time is a long-term epic we have to tackle in order to build a good CI infrastructure.

Apart from splitting tests into subsets and running each of them in a CI job, we can also make use of the Xcode parallel testing feature to achieve parallelism within one single CI job. However, due to platform-specific implementations, there are some constraints that prevent parallel testing from working efficiently. One constraint we found is that tests of the same Swift class run on the same simulator. In this post, we will discuss this constraint in detail and introduce a tip to overcome it.

Background

Xcode parallel testing

The parallel testing feature was shipped as part of the Xcode 10 release. This support enables us to easily configure test setup:

  • There is no need to care about how to split a given test suite.
  • The number of workers (i.e. parallel runners/instances) is configurable. We can pass this value in the xcodebuild CLI via the -parallel-testing-worker-count option.
  • Xcode takes care of cloning and starting simulators accordingly.

However, the distribution logic under the hood is a black-box. We do not really know how tests are assigned to each worker or simulator, and in which order.

Three simulators running tests in parallel

It is worth mentioning that even without the Xcode parallel testing support, we can still achieve similar improvements by running subsets of tests in different child processes. But it takes more effort to dispatch tests to each child process in an efficient way, and to handle the output from each test process appropriately.

Test time imbalance

Generally, a parallel execution system is at its best efficiency if each parallel task executes in roughly the same duration and ends at roughly the same time.

If the time spent on each parallel task is significantly different, it will take more time than expected to execute all tasks. For example, in the following image, it takes the system on the left 13 mins to finish 3 tasks. Whereas, the one on the right takes only 10.5 mins to finish those 3 tasks.

Bad parallelism vs. good parallelism

Assume there are N workers, and the i-th worker executes its tasks in tᵢ minutes. In the left plot, t₁ = 10 mins, t₂ = 7 mins, t₃ = 13 mins.

We define the test time imbalance metric as the difference between the min and max end time:

max(tᵢ) – min(tᵢ)

For the example above, the test time imbalance is 13 mins – 7 mins = 6 mins.

Contributing factors in test time imbalance

There are several factors causing test time imbalance. The top two prominent factors are:

  1. Tests vary in execution time.
  2. Tests of the same class run on the same simulator.

An example of the first factor: in our project, around 50% of tests execute in a range of 20-40 seconds. Some tests take under 15 seconds to run while several take up to 2 minutes. Long execution times are sometimes inevitable since such tests usually touch many flows, which cannot be split. If such tests run last, the test time imbalance may increase.

However, this issue generally does not matter that much because long-running tests do not always run last.

Regarding the second factor, there is no official Apple documentation that explicitly states this constraint. When Apple first introduced parallel testing support in Xcode 10, they only mentioned that test classes are distributed across runner processes:

“Test parallelization occurs by distributing the test classes in a target across multiple runner processes. Use the test log to see how your test classes were parallelized. You will see an entry in the log for each runner process that was launched, and below each runner you will see the list of classes that it executed.”

For example, we have a test class JobFlowTests that includes five tests and another test class TutorialTests that has only one single test.

final class JobFlowTests: BaseXCTestCase {
  func testHappyFlow() { ... }
  func testRecoverFlow() { ... }
  func testJobIgnoreByDax() { ... }
  func testJobIgnoreByTimer() { ... }
  func testForceClearBooking() { ... }
}
...
final class TutorialTests: BaseXCTestCase {
  func testOnboardingFlow() { ... }
}

When executing the two tests with two simulators running in parallel, the actual run is like the one shown on the left side of the following image, but ideally it should work like the one on the right side.

Tests of the same class currently run on the same simulator (left), but ideally they should be able to run on different simulators (right).

Diving deep into Xcode parallel testing

Demystifying Xcode scheduling log

As mentioned above, Xcode distributes tests to simulators/workers in a black-box manner. However, by looking at the scheduling log generated when running tests, we can understand how Xcode parallel testing works.

When running UI tests via the xcodebuild command:

$ xcodebuild -workspace Driver/Driver.xcworkspace \
    -scheme Driver \
    -configuration Debug \
    -sdk 'iphonesimulator' \
    -destination 'platform=iOS Simulator,id=EEE06943-7D7B-4E76-A3E0-B9A5C1470DBE' \
    -derivedDataPath './DerivedData' \
    -parallel-testing-enabled YES \
    -parallel-testing-worker-count 2 \
    -only-testing:DriverUITests/JobFlowTests \    # 👈👈👈👈👈
    -only-testing:DriverUITests/TutorialTests \
    test-without-building

The log can be found inside the *.xcresult folder under DerivedData/Logs/Test. For example: DerivedData/Logs/Test/Test-Driver-2019.11.04\_23-31-34-+0800.xcresult/1\_Test/Diagnostics/DriverUITests-144D9549-FD53-437B-BE97-8A288855E259/scheduling.log

Scheduling log under the xcresult folder
2019-11-05 03:55:00 +0000: Received worker from worker provider: 0x7fe6a684c4e0 [0: Clone 1 of DaxIOS-XC10-1-iP7-1 (3D082B53-3159-4004-A798-EA5553C873C4)]
2019-11-05 03:55:13 +0000: Worker 0x7fe6a684c4e0 [4985: Clone 1 of DaxIOS-XC10-1-iP7-1 (3D082B53-3159-4004-A798-EA5553C873C4)] finished bootstrapping
2019-11-05 03:55:13 +0000: Parallelization enabled; test execution driven by the IDE
2019-11-05 03:55:13 +0000: Skipping test class discovery
2019-11-05 03:55:13 +0000: Executing tests {(	# 👈👈👈👈👈
    DriverUITests/JobFlowTests,
    DriverUITests/TutorialTests
)}; skipping tests {(
)}
2019-11-05 03:55:13 +0000: Load balancer requested an additional worker
2019-11-05 03:55:13 +0000: Dispatching tests {(  # 👈👈👈👈👈
    DriverUITests/JobFlowTests
)} to worker: 0x7fe6a684c4e0 [4985: Clone 1 of DaxIOS-XC10-1-iP7-1 (3D082B53-3159-4004-A798-EA5553C873C4)]
2019-11-05 03:55:13 +0000: Received worker from worker provider: 0x7fe6a1582e40 [0: Clone 2 of DaxIOS-XC10-1-iP7-1 (F640C2F1-59A7-4448-B700-7381949B5D00)]
2019-11-05 03:55:39 +0000: Dispatching tests {(  # 👈👈👈👈👈
    DriverUITests/TutorialTests
)} to worker: 0x7fe6a684c4e0 [4985: Clone 1 of DaxIOS-XC10-1-iP7-1 (3D082B53-3159-4004-A798-EA5553C873C4)]
...

Looking at the log below, we know that once a test class is dispatched or distributed to a worker/simulator, all tests of that class will be executed in that simulator.

2019-11-05 03:55:39 +0000: Dispatching tests {(
    DriverUITests/TutorialTests
)} to worker: 0x7fe6a684c4e0 [4985: Clone 1 of DaxIOS-XC10-1-iP7-1 (3D082B53-3159-4004-A798-EA5553C873C4)]

Even if we customize a test suite (by swizzling some XCTestSuite class methods or variables) to split it into multiple suites, it does not work, because the made-up test suite is only initialized after tests are dispatched to a given worker.

Therefore, any hook to bypass this constraint must be done early on.

Passing the -only-testing argument to the xcodebuild command

Now we pass tests (instead of test classes) to the -only-testing argument.

$ xcodebuild -workspace Driver/Driver.xcworkspace \
    # ...
    -only-testing:DriverUITests/JobFlowTests/testJobIgnoreByTimer \
    -only-testing:DriverUITests/JobFlowTests/testRecoverFlow \
    -only-testing:DriverUITests/JobFlowTests/testJobIgnoreByDax \
    -only-testing:DriverUITests/JobFlowTests/testHappyFlow \
    -only-testing:DriverUITests/JobFlowTests/testForceClearBooking \
    -only-testing:DriverUITests/TutorialTests/testOnboardingFlow \
    test-without-building

But still, the scheduling log shows that tests are grouped by test class before being dispatched to workers (see the following log for reference). This grouping is done automatically by Xcode, which is exactly what we want to avoid.

2019-11-05 04:21:42 +0000: Executing tests {(	# 👈
    DriverUITests/JobFlowTests/testJobIgnoreByTimer,
    DriverUITests/JobFlowTests/testRecoverFlow,
    DriverUITests/JobFlowTests/testJobIgnoreByDax,
    DriverUITests/TutorialTests/testOnboardingFlow,
    DriverUITests/JobFlowTests/testHappyFlow,
    DriverUITests/JobFlowTests/testForceClearBooking
)}; skipping tests {(
)}
2019-11-05 04:21:42 +0000: Load balancer requested an additional worker
2019-11-05 04:21:42 +0000: Dispatching tests {(  # 👈 ❌
    DriverUITests/JobFlowTests/testJobIgnoreByTimer,
    DriverUITests/JobFlowTests/testForceClearBooking,
    DriverUITests/JobFlowTests/testJobIgnoreByDax,
    DriverUITests/JobFlowTests/testHappyFlow,
    DriverUITests/JobFlowTests/testRecoverFlow
)} to worker: 0x7fd781261940 [6300: Clone 1 of DaxIOS-XC10-1-iP7-1 (93F0FCB6-C83F-4419-9A75-C11765F4B1CA)]
......

Overcoming grouping logic in Xcode parallel testing

Tweaking the -only-testing argument values

Based on our observation, we can imagine how Xcode runs tests in parallel. See below.

Step 1.   tests = detect_tests_to_run()  # parse -only-testing arguments
Step 2.   groups_of_tests = group_tests_by_test_class(tests)
Step 3.   while groups_of_tests is not empty:
Step 3.1.     worker = find_free_worker()
Step 3.2.     if worker is not None:
                  dispatch_tests_to_worker(worker, groups_of_tests.pop())

In the pseudo-code above, we do not have much control over step 2 since that grouping logic is implemented by Xcode. But we have a good guess that Xcode groups tests by the first two components (the class name) only (for example, DriverUITests/JobFlowTests). In other words, tests having the same class name run together on one simulator.

The trick to break this constraint is simple. We can tweak the input (test names) so that each group contains only one test. By inserting a random token into the class name, all class names in the tests passed via the -only-testing argument become different.

For example, instead of passing:

-only-testing:DriverUITests/JobFlowTests/testJobIgnoreByTimer \
-only-testing:DriverUITests/JobFlowTests/testRecoverFlow \

We rather use:

-only-testing:DriverUITests/JobFlowTests_AxY132z8/testJobIgnoreByTimer \
-only-testing:DriverUITests/JobFlowTests_By8MTk7l/testRecoverFlow \

Or we can use the test name itself as the token:

-only-testing:DriverUITests/JobFlowTests_testJobIgnoreByTimer/testJobIgnoreByTimer \
-only-testing:DriverUITests/JobFlowTests_testRecoverFlow/testRecoverFlow \
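
Generating these tweaked identifiers by hand is tedious, so we can script it. Below is a small Python helper sketch (an illustration, not our exact tooling):

def tweak_test_id(test_id: str) -> str:
    """'Target/Class/testName' -> 'Target/Class_testName/testName'."""
    target, cls, test = test_id.split("/")
    return f"{target}/{cls}_{test}/{test}"

tests = [
    "DriverUITests/JobFlowTests/testJobIgnoreByTimer",
    "DriverUITests/JobFlowTests/testRecoverFlow",
]
# Each item becomes e.g. -only-testing:DriverUITests/JobFlowTests_testRecoverFlow/testRecoverFlow
args = [f"-only-testing:{tweak_test_id(t)}" for t in tests]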

After that, looking at the scheduling log, we see that the trick bypasses the grouping logic. Now, only one test at a time is dispatched to a worker once that worker is ready.

2019-11-05 06:06:56 +0000: Dispatching tests {(	# 👈 ✅
    DriverUITests/JobFlowTests_testJobIgnoreByDax/testJobIgnoreByDax
)} to worker: 0x7fef7952d0e0 [13857: Clone 2 of DaxIOS-XC10-1-iP7-1 (9BA030CD-C90F-4B7A-B9A7-D12F368A5A64)]
2019-11-05 06:06:58 +0000: Dispatching tests {(	# 👈 ✅
    DriverUITests/TutorialTests_testOnboardingFlow/testOnboardingFlow
)} to worker: 0x7fef7e85fd70 [13719: Clone 1 of DaxIOS-XC10-1-iP7-1 (584F99FE-49C2-4536-B6AC-90B8A10F361B)]
2019-11-05 06:07:07 +0000: Dispatching tests {(	# 👈 ✅
    DriverUITests/JobFlowTests_testRecoverFlow/testRecoverFlow
)} to worker: 0x7fef7952d0e0 [13857: Clone 2 of DaxIOS-XC10-1-iP7-1 (9BA030CD-C90F-4B7A-B9A7-D12F368A5A64)]

Handling tweaked test names

When a worker/simulator receives a request to run a test, the app (could be the runner app or the hosting app) initializes an XCTestSuite corresponding to the test name. In order for the test suite to be properly made up, we need to remove the inserted token.

This can be done easily by swizzling XCTestSuite.init(forTestCaseWithName:). Inside the swizzled function, we remove the token and then call the original init function.

extension XCTestSuite {
  /// For 'Selected tests' suite
  @objc dynamic class func swizzled_init(forTestCaseWithName maskedName: String) -> XCTestSuite {
    /// Recover the original test name
    /// - masked: UITestCaseA_testA1/testA1      	--> recovered: UITestCaseA/testA1
    /// - masked: Driver/UITestCaseA_testA1/testA1   --> recovered: Driver/UITestCaseA/testA1
    guard let testBaseName = maskedName.split(separator: "/").last else {
      return swizzled_init(forTestCaseWithName: maskedName)
    }
    let recoveredName = maskedName.replacingOccurrences(of: "_\(testBaseName)/", with: "/") // 👈 remove the token
    return swizzled_init(forTestCaseWithName: recoveredName) // 👈 call the original init
  }
}
Swizzle function to run tests properly

Test class discovery

In order to adopt this tip, we need to know in advance which test classes we are going to run. Although Apple does not provide an API to obtain this list before running tests, it can be done in several ways. One approach is to generate the list of test classes using Sourcery. Another alternative is to parse the binaries inside the .xctest bundles (in build products) to look for symbols related to tests.
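
As a rough sketch of the symbol-parsing approach (one possible implementation; the demangling step and the regular expression are assumptions):

import re
import subprocess

def discover_test_classes(xctest_binary: str) -> set:
    # Dump symbols from the test binary and demangle them to readable Swift names
    nm = subprocess.run(["nm", xctest_binary], capture_output=True, text=True)
    demangled = subprocess.run(
        ["xcrun", "swift-demangle"],
        input=nm.stdout, capture_output=True, text=True,
    ).stdout
    # Demangled test methods look like Module.ClassName.testSomething() -> ()
    matches = re.findall(r"(\w+)\.(\w+)\.(test\w+)\(\)", demangled)
    return {f"{module}/{cls}" for module, cls, _ in matches}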

Conclusion

In this article, we identified some factors causing test execution time imbalance in Xcode parallel testing (particularly for UI tests).

We also looked into how Xcode distributes tests in parallel testing and introduced a trick to mitigate a constraint in which tests within the same class run on the same simulator. The trick not only reduces the imbalance but also gives us more confidence in adding more tests to a class without worrying about whether doing so affects our CI infrastructure.

Below is the metric about test time imbalance recorded when running UI tests. After adopting the trick, we saw a decrease in the metric (which is a good sign). As of now, the metric stabilizes at around 0.4 mins.

Tracking data of UI test time imbalance (in minutes) in our project, collected by multiple runs

Join us

Grab is more than just the leading ride-hailing and mobile payments platform in Southeast Asia. We use data and technology to improve everything from transportation to payments and financial services across a region of more than 620 million people. We aspire to unlock the true potential of Southeast Asia and look for like-minded individuals to join us on this ride.

If you share our vision of driving South East Asia forward, apply to join our team today.

Testing and creating CI/CD pipelines for AWS Step Functions

Post Syndicated from Matt Noyce original https://aws.amazon.com/blogs/devops/testing-and-creating-ci-cd-pipelines-for-aws-step-functions-using-aws-codepipeline-and-aws-codebuild/

AWS Step Functions allow users to easily create workflows that are highly available, serverless, and intuitive. Step Functions natively integrate with a variety of AWS services including, but not limited to, AWS Lambda, AWS Batch, AWS Fargate, and Amazon SageMaker. It offers the ability to natively add error handling, retry logic, and complex branching, all through an easy-to-use JSON-based language known as the Amazon States Language.

AWS CodePipeline is a fully managed continuous delivery service that allows for easy and highly configurable methods for automating release pipelines. CodePipeline gives the end user the ability to build, test, and deploy their most critical applications and infrastructure in a reliable and repeatable manner.

AWS CodeCommit is a fully managed and secure source control repository service. It eliminates the need to support and scale infrastructure to support highly available and critical code repository systems.

This blog post demonstrates how to create a CI/CD pipeline to comprehensively test an AWS Step Function state machine from start to finish using CodeCommit, AWS CodeBuild, CodePipeline, and Python.

CI/CD pipeline steps

The pipeline contains the following steps, as shown in the following diagram.

CI/CD pipeline steps

  1. Pull the source code from source control.
  2. Lint any configuration files.
  3. Run unit tests against the AWS Lambda functions in the codebase.
  4. Deploy the test pipeline.
  5. Run end-to-end tests against the test pipeline.
  6. Clean up test state machine and test infrastructure.
  7. Send approval to approvers.
  8. Deploy to Production.

Prerequisites

In order to get started building this CI/CD pipeline, there are a few prerequisites that must be met:

  1. Create or use an existing AWS account (instructions on creating an account can be found here).
  2. Define or use the example AWS Step Functions state machine definition in Amazon States Language (found below).
  3. Write the appropriate unit tests for your Lambda functions.
  4. Determine the end-to-end tests to be run against the AWS Step Function state machine (a minimal sketch follows this list).
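
For the last prerequisite, an end-to-end test can be as simple as starting an execution and asserting on its final status. The following Python sketch illustrates the idea; the state machine ARN and input shape are placeholders, not values from this post’s pipeline:

import json
import time

import boto3

sfn = boto3.client("stepfunctions")
STATE_MACHINE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:CalculationStateMachine-test"  # placeholder

def test_state_machine_succeeds():
    execution = sfn.start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        input=json.dumps({"value": 5}),  # assumed input shape
    )
    # Poll until the execution leaves the RUNNING state
    result = sfn.describe_execution(executionArn=execution["executionArn"])
    while result["status"] == "RUNNING":
        time.sleep(2)
        result = sfn.describe_execution(executionArn=execution["executionArn"])
    assert result["status"] == "SUCCEEDED"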

The CodePipeline project

The following screenshot depicts what the CodePipeline project looks like, including the set of stages run in order to securely, reliably, and confidently deploy the AWS Step Function state machine to Production.

CodePipeline project

Creating a CodeCommit repository

To begin, navigate to the AWS console to create a new CodeCommit repository for your state machine.

CodeCommit repository

In this example, the repository is named CalculationStateMachine, as it contains the contents of the state machine definition, Python tests, and CodeBuild configurations.

CodeCommit structure

Breakdown of repository structure

In the CodeCommit repository above we have the following folder structure:

  1. config – this is where all of the Buildspec files will live for our AWS CodeBuild jobs.
  2. lambdas – this is where we will store all of our AWS Lambda functions.
  3. tests – this is the top-level folder for unit and end-to-end tests. It contains two sub-folders (unit and e2e).
  4. cloudformation – this is where we will add any extra CloudFormation templates.

Defining the state machine

Inside of the CodeCommit repository, create a State Machine Definition file called sm_def.json that defines the state machine in Amazon States Language.

This example creates a state machine that invokes a collection of Lambda functions to perform calculations on the given input values. Take note that it also performs a check against a specific value and, through the use of a Choice state, either continues the pipeline or exits it.

sm_def.json file:

{
  "Comment": "CalulationStateMachine",
  "StartAt": "CleanInput",
  "States": {
    "CleanInput": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "CleanInput",
        "Payload": {
          "input.$": "$"
        }
      },
      "Next": "Multiply"
    },
    "Multiply": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "Multiply",
        "Payload": {
          "input.$": "$.Payload"
        }
      },
      "Next": "Choice"
    },
    "Choice": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.Payload.result",
          "NumericGreaterThanEquals": 20,
          "Next": "Subtract"
        }
      ],
      "Default": "Notify"
    },
    "Subtract": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "Subtract",
        "Payload": {
          "input.$": "$.Payload"
        }
      },
      "Next": "Add"
    },
    "Notify": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "Parameters": {
        "TopicArn": "arn:aws:sns:us-east-1:657860672583:CalculateNotify",
        "Message.$": "$$",
        "Subject": "Failed Test"
      },
      "End": true
    },
    "Add": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "Add",
        "Payload": {
          "input.$": "$.Payload"
        }
      },
      "Next": "Divide"
    },
    "Divide": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": {
        "FunctionName": "Divide",
        "Payload": {
          "input.$": "$.Payload"
        }
      },
      "End": true
    }
  }
}

This will yield the following AWS Step Function state machine after the pipeline completes:

State machine

CodeBuild Spec files

The CI/CD pipeline uses a collection of CodeBuild BuildSpec files chained together through CodePipeline. The following sections demonstrate what these BuildSpec files look like and how they can be used to chain together and build a full CI/CD pipeline.

AWS States Language linter

In order to determine whether or not the State Machine Definition is valid, include a stage in your CodePipeline configuration to evaluate it. Through the use of a Ruby Gem called statelint, you can verify the validity of your state machine definition as follows:

lint_buildspec.yaml file:

version: 0.2
env:
  git-credential-helper: yes
phases:
  install:
    runtime-versions:
      ruby: 2.6
    commands:
      - yum -y install rubygems
      - gem install statelint

  build:
    commands:
      - statelint sm_def.json

If your configuration is valid, you do not see any output messages. If the configuration is invalid, you receive a message telling you that the definition is invalid and the pipeline terminates.

Lambda unit testing

In order to test your Lambda function code, you need to evaluate whether or not it passes a set of tests. You can test each individual Lambda function deployed and used inside of the state machine. You can feed various inputs into your Lambda functions and assert that the output is what you expect it to be. In this case, you use Python pytest to kick off tests and validate results.

unit_test_buildspec.yaml file:

version: 0.2
env:
  git-credential-helper: yes
phases:
  install:
    runtime-versions:
      python: 3.8
    commands:
      - pip3 install -r tests/requirements.txt

  build:
    commands:
      - pytest -s -vvv tests/unit/ --junitxml=reports/unit.xml

reports:
  StateMachineUnitTestReports:
    files:
      - "**/*"
    base-directory: "reports"

Take note that the CodeCommit repository includes a directory called tests/unit, which contains a collection of unit tests that are run and validated against your Lambda function code. Another very important part of this BuildSpec file is the reports section, which generates reports and metrics about the results, trends, and overall success of your tests.
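
For illustration, a unit test for one of the state machine’s Lambda functions might look like the following sketch; the import path, event shape, and expected behavior are assumptions, not the repository’s actual tests:

from lambdas.multiply.handler import lambda_handler  # hypothetical import path

def test_multiply_returns_product():
    event = {"input": {"result": 7}}  # assumed event shape
    response = lambda_handler(event, None)
    assert response["result"] == 14  # assumed multiply-by-two behavior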

CodeBuild test reports

After running the unit tests, you are able to see reports about the results of the run. Take note of the reports section of the BuildSpec file, along with the --junitxml=reports/unit.xml flag passed to the pytest command. This generates a set of reports that can be visualized in CodeBuild.

Navigate to the specific CodeBuild project you want to examine and click on the specific execution of interest. There is a tab called Reports, as seen in the following screenshot:

Test reports

Select the specific report of interest to see a breakdown of the tests that have run, as shown in the following screenshot:

Test visualization

With Report Groups, you can also view an aggregated list of tests that have run over time. This report includes various features such as the number of average test cases that have run, average duration, and the overall pass rate, as shown in the following screenshot:

Report groups

The AWS CloudFormation template step

The following BuildSpec file is used to generate an AWS CloudFormation template that injects the state machine definition into AWS CloudFormation.

template_sm_buildspec.yaml file:

version: 0.2
env:
  git-credential-helper: yes
phases:
  install:
    runtime-versions:
      python: 3.8

  build:
    commands:
      - python template_statemachine_cf.py

The following Python script templates the AWS CloudFormation needed to deploy the state machine definition, given the sm_def.json file in your repository:

template_statemachine_cf.py file:

import sys
import json

def read_sm_def(
    sm_def_file: str
) -> str:
    """
    Reads the state machine definition from a file and returns it as a JSON string.

    Parameters:
        sm_def_file (str) = the name of the state machine definition file.

    Returns:
        sm_def (str) = the state machine definition as a JSON string.
    """

    try:
        with open(sm_def_file, "r") as f:
            return f.read()
    except IOError as e:
        print("Path does not exist!")
        print(e)
        sys.exit(1)

def template_state_machine(
    sm_def: str
) -> dict:
    """
    Templates out the CloudFormation for creating a state machine.

    Parameters:
        sm_def (str) = the state machine definition as a JSON string
                       (in the AWS States Language).

    Returns:
        templated_cf (dict) = a dictionary representing the CloudFormation template.
    """
    
    templated_cf = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Creates the Step Function State Machine and associated IAM roles and policies",
        "Parameters": {
            "StateMachineName": {
                "Description": "The name of the State Machine",
                "Type": "String"
            }
        },
        "Resources": {
            "StateMachineLambdaRole": {
                "Type": "AWS::IAM::Role",
                "Properties": {
                    "AssumeRolePolicyDocument": {
                        "Version": "2012-10-17",
                        "Statement": [
                            {
                                "Effect": "Allow",
                                "Principal": {
                                    "Service": "states.amazonaws.com"
                                },
                                "Action": "sts:AssumeRole"
                            }
                        ]
                    },
                    "Policies": [
                        {
                            "PolicyName": {
                                "Fn::Sub": "States-Lambda-Execution-${AWS::StackName}-Policy"
                            },
                            "PolicyDocument": {
                                "Version": "2012-10-17",
                                "Statement": [
                                    {
                                        "Effect": "Allow",
                                        "Action": [
                                            "logs:CreateLogStream",
                                            "logs:CreateLogGroup",
                                            "logs:PutLogEvents",
                                            "sns:*"             
                                        ],
                                        "Resource": "*"
                                    },
                                    {
                                        "Effect": "Allow",
                                        "Action": [
                                            "lambda:InvokeFunction"
                                        ],
                                        "Resource": "*"
                                    }
                                ]
                            }
                        }
                    ]
                }
            },
            "StateMachine": {
                "Type": "AWS::StepFunctions::StateMachine",
                "Properties": {
                    "DefinitionString": sm_def,
                    "RoleArn": {
                        "Fn::GetAtt": [
                            "StateMachineLambdaRole",
                            "Arn"
                        ]
                    },
                    "StateMachineName": {
                        "Ref": "StateMachineName"
                    }
                }
            }
        }
    }

    return templated_cf


sm_def = read_sm_def(
    sm_def_file='sm_def.json'
)

print(sm_def)

cfm_sm_def = template_state_machine(
    sm_def=sm_def
)

with open("sm_cfm.json", "w") as f:
    f.write(json.dumps(cfm_sm_def))

Deploying the test pipeline

To verify the full functionality of the state machine, stand it up so that it can be tested appropriately. This test stack is an exact replica of what you will deploy to Production, but a completely separate stack from the actual production stack, which is only deployed after passing the appropriate end-to-end tests and approvals. You can take advantage of the AWS CloudFormation target supported by CodePipeline. The following screenshot shows how to configure this step in the AWS console:

Deploy test pipeline

End-to-end testing

To validate that the entire state machine executes without issues after any given change, feed it sample inputs and make assertions on the output values. If the assertions pass and you get the output you expect, you can proceed to the manual approval phase.
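
For example, an end-to-end test might start an execution of the test state machine with boto3 and assert on its final status. The following is only a sketch: the STATE_MACHINE_ARN environment variable and the sample input are assumptions about how the pipeline passes information to the tests, not taken from the sample repository.

# tests/e2e/test_state_machine.py - a minimal sketch; the environment variable
# and sample input are assumptions, not taken from the sample repository.
import json
import os
import time

import boto3


def test_state_machine_executes_successfully():
    sfn = boto3.client("stepfunctions")
    execution = sfn.start_execution(
        stateMachineArn=os.environ["STATE_MACHINE_ARN"],        # assumed to be set by the pipeline
        input=json.dumps({"numerator": 10, "denominator": 2}),  # assumed sample input
    )

    # Poll until the execution leaves the RUNNING state.
    while True:
        result = sfn.describe_execution(executionArn=execution["executionArn"])
        if result["status"] != "RUNNING":
            break
        time.sleep(2)

    assert result["status"] == "SUCCEEDED"
    assert json.loads(result["output"])  # the output should be valid JSON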

e2e_tests_buildspec.yaml file:

version: 0.2
env:
  git-credential-helper: yes
phases:
  install:
    runtime-versions:
      python: 3.8
    commands:
      - pip3 install -r tests/requirements.txt

  build:
    commands:
      - pytest -s -vvv tests/e2e/ --junitxml=reports/e2e.xml

reports:
  StateMachineReports:
    files:
      - "**/*"
    base-directory: "reports"

Manual approval (SNS topic notification)

Before deploying to Production, the CI/CD pipeline should include a formal approval phase. Using the Manual Approval stage in AWS CodePipeline, you can configure the pipeline to halt and send a message to an Amazon SNS topic before moving on. The SNS topic can have a variety of subscribers; in this case, subscribe an approver's email address to the topic so that they are notified whenever an approval is requested. Once the approver approves, the pipeline proceeds with deploying the production version of the Step Functions state machine.

This Manual Approval stage can be configured in the AWS console using a configuration similar to the following:

Manual approval

Deploying to Production

After the linting, unit testing, end-to-end testing, and Manual Approval phases have passed, you can move on to deploying the Step Functions state machine to Production. This phase is similar to the Deploy Test Stage phase, except that the name of your AWS CloudFormation stack is different. Here again, you take advantage of the AWS CloudFormation target supported by CodePipeline:

Deploy to production

After this stage completes successfully, your pipeline execution is complete.

Cleanup

After validating that the test state machine and Lambda functions work, include a CloudFormation step that tears down the test infrastructure, as it is no longer needed. This can be configured as a new CodePipeline step similar to the following configuration:

CloudFormation Template for cleaning up resources

Conclusion

You have linted and validated your AWS States Language definition, unit tested your Lambda function code, deployed a test AWS state machine, run end-to-end tests, received Manual Approval to deploy to Production, and deployed to Production. This gives you and your team confidence that any changes made to your state machine and surrounding Lambda function code perform correctly in Production.

 

About the Author

matt noyce profile photo

 

Matt Noyce is a Cloud Application Architect in Professional Services at Amazon Web Services.
He works with customers to architect, design, automate, and build solutions on AWS
for their business needs.

Marionette – Enabling E2E user-scenario simulation

Post Syndicated from Grab Tech original https://engineering.grab.com/marionette-enabling-e2e-user-scenario-simulation

Introduction

A plethora of interconnected microservices powers the Grab app. These microservices work behind the scenes to delight millions of our customers in Southeast Asia, so it is a no-brainer that we emphasize strong testing tools to ensure the app performs flawlessly and continuously meets our customers’ needs.

Background

We have a microservices-based architecture, in which microservices are interconnected to numerous other microservices. Each passing day sees teams within Grab updating their microservices, which in turn enhances the overall app. If any of the microservices fail after changes are rolled out, it may lead to the whole app getting into an unstable state or worse. This is a major risk and that’s why we stress on conducting “end-to-end (E2E) testing” as an integral part of our software test life-cycle.

E2E tests are done for all crucial workflows in the app, but not for every detail. For that we have conventional tests such as unit tests, component tests, functional tests, etc. Consider E2E testing as the final approval in the quality assurance of the app.

Writing E2E tests in the microservices world is not a trivial task. We are not testing just a single monolithic application. To test a workflow on the app from a user’s perspective, we need to traverse multiple microservices, which communicate through different protocols such as HTTP/HTTPS and TCP. E2E testing gets even more challenging with the continuous addition of microservices. Over the years, we have grown tremendously with hundreds of microservices working in the background to power our app.

Some major challenges in writing E2E tests for the microservices-based apps are:

  • Availability

    Getting all microservices together for E2E testing is tough. Each development team works independently and is responsible only for its microservices. Teams use different programming languages, data stores, etc for each microservice. It’s hard to assemble all the pieces into a complete app in a common test environment for each round of E2E testing.

  • Data or resource set up

    E2E testing requires comprehensive data setup. Otherwise, test results are affected by data constraints rather than by any recent changes to the underlying microservices. For example, we need to create realistic driver accounts, passenger accounts, etc, and creating those depends on other internal systems that manage user accounts. Further, data and booking generation should be robust enough to replicate real-world scenarios as closely as possible.

  • Access and authentication

    Usually, the test cases require sequential execution in E2E testing. In a microservices architecture, it is difficult to test a workflow which requires access and permissions to several resources or data that should remain available throughout the test execution.

  • Resource and time intensive

    It is expensive and time consuming to run E2E tests; significant time is involved in deploying new changes, configuring all the necessary test data, etc.

Though there are several challenges, we had to find a way to overcome them and test workflows from the beginning to the end in our app.

Our approach to overcome challenges

We knew what our challenges were and what we wanted to achieve from E2E testing, so we started thinking about how to develop a platform for E2E tests. To begin with, we determined that the scope of E2E testing that we’re going to primarily focus on is Grab’s transport domain — the microservices powering the driver and passenger apps.

One approach is to “simulate” user scenarios through a single platform before any new versions of these microservices are released. Ideally, the platform should also have the capabilities to set up the data required for these simulations. For example, ride booking requires data set up such as driver accounts, passenger accounts, location coordinates, geofencing, etc.

We wanted to create a single platform that multiple teams could use to set up their test data and run E2E user-scenario simulations easily. We put ourselves to work on that idea, which resulted in the creation of an internal platform called “Marionette”. It simulates actions performed by Grab’s passenger and driver apps as they are expected to behave in the real world. The objective is to ensure that all standard user workflows are tested before deploying new app versions.

Introducing Marionette

Marionette enables Grabbers (developers and QAs) to run E2E user-scenario simulations without depending on the actual passenger and driver apps. Grabbers can set up and configure data such as drivers, passengers, taxi types, etc to mimic real-world behavior.

Let’s look at the overall architecture to understand Marionette better:

Overall Architecture

Grabbers can interact with Marionette through three channels: the UI, the SDK, and RESTful API endpoints called from their test scripts. All requests are routed through a load balancer to the Marionette platform. The Marionette platform in turn talks to the required microservices to create test data and run the simulations.

The benefits

With Marionette, Grabbers now have the ability to:

  • Simulate the whole booking flow including customer and driver behavior as well as transition through the booking life cycle including pick-up, drop-off, cancellation, etc. For example, developers can make passenger booking from the UI and configure pick-up points, drop-off points, taxi types, and other parameters easily. They can define passenger behaviour such as “make bookings after a specified time interval”, “cancel each booking”, etc. They can also set driver locations, define driver behaviour such as “always accept booking manually”, “decline received bookings”, etc.
  • Simulate bookings in all cities where Grab operates. Further, developers can run simulations for multiple Grab taxi types such as JustGrab, GrabShare, etc.
  • Visualize passengers, drivers, and ride transitions on the UI, which lets them easily test their workflows.
  • Save efforts and time spent on installing third-party android or iOS emulators, troubleshooting or debugging .apk installation files, etc before testing workflows.
  • Conduct E2E testing without real mobile devices and installed apps.
  • Run automatic simulations, in which a particular set of scenarios are run continuously, thus helping developers with exploratory testing.

How we isolated simulations among users

It is important to have independent simulations for each user. Otherwise, simulations don’t yield correct results. This was one of the challenges we faced when we first started running simulations on Marionette.

To resolve this issue, we came up with the idea of “cohorts”. A cohort is a logical group of passengers and drivers who are located in a particular city. Each simulation on Marionette is run using a “cohort” containing the number of drivers and passengers required for that simulation. When a passenger/driver needs to interact with other passengers/drivers (such as for ride bookings), Marionette ensures that the interaction is constrained to resources within the cohort. This ensures that drivers and passengers are not shared in different test cases/simulations, resulting in more consistent test runs.

How to interact with Marionette

Let’s take a look at how to interact with Marionette starting with its user interface first.

User Interface

The Marionette UI is designed to provide the same level of granularity as available on the real passenger and driver apps.

Generally, the UI is used in the following scenarios:

  • To test common user scenarios/workflows after deploying a change on staging.
  • To test the end-to-end booking flow right from the point where a passenger makes a booking till drop-off at the destination.
  • To simulate functionality of other teams within Grab – the passenger app developers can simulate the driver app for their testing and vice versa. Usually, teams work independently and the ability to simulate the dependent app for testing allows developers to work independently.
  • To perform E2E testing (such as by QA teams) without writing any test scripts.

The Marionette UI also allows Grabbers to create and set up data. All that needs to be done is to specify the necessary resources such as number of drivers, number of passengers, city to run the simulation, etc. Running E2E simulations involves just the click of a button after data set up. Reports generated at the end of running simulations provide a graphical visualization of the results. Visual reports save developers’ time, which otherwise is spent on browsing through logs to ascertain errors.

SDK

Marionette also provides an SDK, written in the Go programming language.

It lets developers:

  • Create resources such as passengers, drivers, and cohorts for simulating booking flows.
  • Create booking simulations in both staging and production.
  • Set bookings to specific states as needed for simulation through customizable driver and passenger behaviour.
  • Make HTTP requests and receive responses that matter in tests.
  • Run load tests by scaling up booking requests to match the required workload (QPS).

Let’s look at a high-level booking test case example to understand the simulation workflow.

Assume we want to run an E2E booking test with this driver behavior type — “accepts passenger bookings and transits between booking states according to defined behavior parameters”. This is just one of the driver behavior types in Marionette; other behavior types are also supported. Similarly, passengers also have behaviour types.

To write the E2E test for this example case, we first define the driver behavior in a function like this:

Driver behaviour definition (code snippet)

Then, we handle the booking request for the driver like this:

Booking request handling (code snippet)

The SDK client makes the handling of passengers, drivers, and bookings very easy as developers don’t need to worry about hitting multiple services and multiple APIs to set up their required driver and passenger actions. Instead, teams can focus on testing their use cases.

To ensure that passengers and drivers are isolated in our test, we need to group them together in a cohort before running the E2E test.

Cohort creation (code snippet)

In summary, we have defined the driver’s behavior, created the booking request, created the SDK client and associated the driver and passenger to a cohort. Now, we just have to trigger the E2E test from our IDE. It’s just that simple and easy!

Previously, developers had to write boilerplate code to make HTTP requests and parse returned HTTP responses. With the Marionette SDK in place, developers don’t have to write any boilerplate code saving significant time and effort in E2E testing.

RESTful APIs in test scripts

Marionette provides several RESTful API endpoints that cover different simulation areas such as resource or data creation APIs, driver APIs, passenger APIs, etc. APIs are particularly suitable for scripted testing. Developers can directly call these APIs in their test scripts to facilitate their own tests such as load tests, integration tests, E2E tests, etc.

Developers use these APIs with their preferred programming languages to run simulations. They don’t need to worry about any underlying complexities when using the APIs. For example, developers in Grab have created custom libraries using Marionette APIs in Python, Java, and Bash to run simulations.

What’s next

Currently, we cover E2E tests for our transport domain (microservices for the passenger and driver apps) through Marionette. The next phase is to expand into a full-fledged platform that can test microservices in other Grab domains such as Food, Payments, and so on. Going forward, we are also looking to further simplify the writing of E2E tests and running them as a part of the CD pipeline for seamless testing before deployment.

In conclusion

We had an idea of creating a simulation platform that can run and facilitate E2E testing of microservices. With Marionette, we have achieved this objective. Marionette has helped us understand how end users use our apps, allowing us to make improvements to our services. Further, Marionette ensures there are no breaking changes and provides additional visibility into potential bugs that might be introduced as a result of any changes to microservices.

If you have any comments or questions about Marionette, please leave a comment below.

Thermal testing Raspberry Pi 4

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/thermal-testing-raspberry-pi-4/

Raspberry Pi 4 just got a lot cooler! The last four months of firmware updates have taken over half a watt out of idle power and nearly a watt out of fully loaded power. For The MagPi magazine, Gareth Halfacree gets testing.

Raspberry Pi 4 Model B

Raspberry Pi 4 launched with a wealth of new features to tempt users into upgrading: a more powerful CPU and GPU, more memory, Gigabit Ethernet, and USB 3.0 support. More processing power means more electrical power, and Raspberry Pi 4 is the most power-hungry member of the family.

The launch of a new Raspberry Pi model is only the beginning of the story. Development is continuous, with new software and firmware improving each board long after it has rolled off the factory floor.

Raspberry Pi 4 updates

Raspberry Pi 4 is no exception: since launch, there has been a series of updates which have reduced its power needs and, in doing so, enabled it to run considerably cooler. These updates apply to any Raspberry Pi 4, whether you picked one up on launch day or are only just now making a purchase.

This feature takes a look at how each successive firmware release has improved Raspberry Pi 4, using a synthetic workload designed – unlike a real-world task – to make the system-on-chip (SoC) get as hot as possible in as short a time as possible.

Read on to see what wonders a simple firmware update can work.

How we tested Raspberry Pi 4 firmware revisions

To test how well each firmware revision handles the heat, a power-hungry synthetic workload was devised to represent a worst-case scenario: the stress-ng CPU stress-testing utility places all four CPU cores under heavy and continuous load. Meanwhile, the glxgears tool exercises the GPU. Both tools can be installed by typing the following at the Terminal:

sudo apt install stress-ng mesa-utils

The CPU workload can be run with the following command:

stress-ng --cpu 0 --cpu-method fft

The command will run for a full day at default settings; to cancel, press CTRL+C on the keyboard.

To run the GPU workload, type:

glxgears -fullscreen

This will display a 3D animation of moving gears, filling the entire screen. To close it, press ALT+F4 on the keyboard.

For more information on how both tools work, type:

man stress-ng
man glxgears

During the testing for this feature, both of the above workloads are run simultaneously for ten minutes. Afterwards, Raspberry Pi is allowed to cool for five minutes.

The thermal imagery was taken at idle, then again after 60 seconds of the stress-ng load alone.
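
If you want to capture similar data yourself, the SoC temperature and ARM clock can be polled with the vcgencmd tool while the workloads run. The following Python sketch logs both once per second as CSV; the sample interval and output format are arbitrary choices, and the script runs until you press CTRL+C:

#!/usr/bin/env python3
# Logs SoC temperature and ARM clock once per second using vcgencmd.
# A rough sketch; the interval and CSV format are arbitrary. Stop with CTRL+C.
import subprocess
import time


def vcgencmd(*args):
    return subprocess.check_output(["vcgencmd", *args], text=True).strip()


print("seconds,temp_c,arm_mhz")
start = time.time()
while True:
    temp = vcgencmd("measure_temp")            # e.g. "temp=58.0'C"
    clock = vcgencmd("measure_clock", "arm")   # e.g. "frequency(48)=1500000000"
    temp_c = temp.split("=")[1].rstrip("'C")
    arm_mhz = int(clock.split("=")[1]) / 1_000_000
    print(f"{time.time() - start:.0f},{temp_c},{arm_mhz:.0f}")
    time.sleep(1)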

Baseline test: Raspberry Pi 3B+

Already well established, Raspberry Pi 3 Model B+ was the device to beat

Before Raspberry Pi 4 came on the scene, Raspberry Pi 3 Model B+ was the must-have single-board computer. Benefiting from all the work that had gone into the earlier Raspberry Pi 3 Model B alongside improved hardware, Raspberry Pi 3B+ was – and still is – a popular device. Let’s see how well it performs before testing Raspberry Pi 4.

Power draw

An efficient processor and an improved design for the power circuitry compared to its predecessor help keep Raspberry Pi 3B+ power draw down: at idle, the board draws just 1.91W; when running the synthetic workload, that increases to 5.77W.

Thermal imaging


A thermal camera shows where the power goes. At idle, the system-on-chip is relatively cool while the combined USB and Ethernet controller to the middle-right is a noticeable hot spot; at load, measured after 60 seconds of a CPU-intensive synthetic workload, the SoC is by far the hottest component at 58.1°C.

Thermal throttling

This chart measures Raspberry Pi 3B+ CPU speed and temperature during a ten-minute power-intensive synthetic workload. The test runs on both the CPU and GPU, and is followed by a five-minute cooldown. Raspberry Pi 3B+ quickly reaches the ‘soft throttle’ point of 60°C, designed to prevent the SoC hitting the hard-throttle maximum limit of 80°C, and the CPU remains throttled at 1.2GHz for the duration of the benchmark run.

Raspberry Pi 4 Launch Firmware

The fastest Raspberry Pi ever made demanded the most power

Raspberry Pi 4 Model B launched with a range of improvements over Raspberry Pi 3B+, including a considerably more powerful CPU, a new GPU, up to four times the memory, and USB 3.0 ports. All that new hardware came at a cost: higher power draw and heat output. So let’s see how Raspberry Pi 4 performed at launch.

Power Draw

There’s no denying it, Raspberry Pi 4 was a hungry beast at launch. Even idling at the Raspbian desktop, the board draws 2.89W, hitting a peak of 7.28W under a worst-case synthetic CPU and GPU workload – a hefty increase over Raspberry Pi 3 B+.

Thermal Imaging


Thermal imaging shows that Raspberry Pi 4, using the launch-day firmware, runs hot even at idle, with hot spots at the USB controller to the middle-right and power-management circuitry to the bottom-left. Under a heavy synthetic load, the SoC hits 72.1°C by the 60-second mark.

Thermal Throttling

Raspberry Pi 4 manages to go longer than Raspberry Pi 3 B+ before the synthetic workload causes it to throttle; but throttle it does after just 65 seconds. As the workload runs, the CPU drops from 1.5GHz to a stable 1GHz, then dips as low as 750MHz towards the end.

Raspberry Pi 4 VLI Firmware

USB power management brings some relief for Raspberry Pi heat

The first major firmware update developed for Raspberry Pi 4 brought power management features to the Via Labs Inc. (VLI) USB controller. The VLI controller is responsible for handling the two USB 3.0 ports, and the firmware update allowed it to run cooler.

Power Draw

Even without anything connected to Raspberry Pi 4’s USB 3.0 ports, the VLI firmware upgrade has a noticeable impact: idle power draw has dropped to 2.62W, while the worst-case draw under a heavy synthetic workload sits at 7.01W.

Thermal Imaging


The biggest impact on heat is seen, unsurprisingly, on the VLI chip to the middle-right; the VLI firmware helps keep the SoC in the centre and the power-management circuitry at the bottom-left cooler than the launch firmware. The SoC reached 71.4°C under load – a small, but measurable, improvement.

Thermal Throttling

Enabling power management on the VLI chip has a dramatic impact on performance in the worst-case synthetic workload: the throttle point is pushed back to 77 seconds, the CPU spends more time at its full 1.5GHz speed, and it doesn’t drop to 750MHz at all. The SoC also cools marginally more rapidly at the end of the test.

Raspberry Pi 4 VLI, SDRAM firmware

With VLI tamed, it’s the memory’s turn now

The next firmware update, designed to be used alongside the VLI power management features, changes how Raspberry Pi 4’s memory – LPDDR4 SDRAM – operates. While having no impact on performance, it helps to push the power draw down still further at both idle and load.

Power Draw

As with the VLI update, the SDRAM update brings a welcome drop in power draw at both idle and load. Raspberry Pi 4 now draws 2.47W at idle and 6.79W running a worst-case synthetic load – a real improvement from the 7.28W at launch.

Thermal Imaging


Thermal imaging shows the biggest improvement yet, with both the SoC and the power-management circuitry running considerably cooler at idle after the installation of this update. After 60 seconds of load, the SoC is noticeably cooler at 68.8°C – a drop of nearly 3°C over the VLI firmware alone.

Thermal Throttling

A cooler SoC means better performance: the throttle point under the worst-case synthetic workload is pushed back to 109 seconds, after which Raspberry Pi 4 continues to bounce between full 1.5GHz and throttled 1GHz speeds for the entire ten-minute benchmark run – bringing the average speed up considerably.

Raspberry Pi 4 VLI, SDRAM, Clocking, and Load-Step Firmware

September 2019’s firmware update includes several changes, while bringing with it the VLI power management and SDRAM firmware updates. The biggest change is how the BCM2711B0 SoC on Raspberry Pi 4 increases and decreases its clock-speed in response to demand and temperature.

Power Draw

The September firmware update has incremental improvements: idle power draw is down to 2.36W and load under the worst-case synthetic workload to a peak of 6.67W, all without any reduction in raw performance or loss of functionality.

Thermal Imaging


Improved processor clocking brings a noticeable drop in idle temperature throughout the circuit board. At load, everything’s improved – the SoC peaked at 65°C after 60 seconds of the synthetic workload, while both the VLI chip and the power-management circuitry are clearly cooler than under previous firmwares.

Thermal Throttling

With this firmware, Raspberry Pi 4’s throttle point under the worst-case synthetic workload is pushed back all the way to 155 seconds – more than double the time the launch-day firmware took to hit the same point. The overall average speed is also brought up, thanks to more aggressive clocking back up to 1.5GHz.

Raspberry Pi 4 Beta Firmware

Currently in testing, this beta release is cutting-edge

Nobody at Raspberry Pi is resting on their laurels. Beta firmware is in testing and due for public release soon. It brings with it many improvements, including finer-grained control over SoC operating voltages and optimised clocking for the HDMI video state machines.

To upgrade your Raspberry Pi to the latest firmware, open a Terminal window and enter:

sudo apt update
sudo apt full-upgrade

Now restart Raspberry Pi using:

sudo shutdown -r now

Power Draw

The beta firmware decreases power draw at idle to reduce overall power usage, while tweaking the voltage of the SoC to drop power draw at load without harming performance. The result: a drop to 2.1W idle, and 6.41W at load – the best yet.

Thermal Imaging


The improvements made at idle are clear to see on thermal imaging: the majority of Raspberry Pi 4’s circuit board is below the bottom 35°C measurement point for the first time. After 60 seconds of load, there’s a smaller but still measurable improvement, with a peak measured temperature of 64.8°C.

Thermal Throttling

While Raspberry Pi 4 does still throttle with the beta firmware, thanks to the heavy demands of the synthetic workload used for testing, it delivers the best results yet: throttling occurs at the 177s mark while the new clocking controls bring the average clock speed up markedly. The firmware also allows Raspberry Pi 4 to up-clock more at idle, improving the performance of background tasks.

Keep cool with Raspberry Pi 4 orientation

Firmware upgrades offer great gains, but what about putting Raspberry Pi on its side?

While running the latest firmware will result in considerable power draw and heat management improvements, there’s a trick to unlock even greater gains: adjusting the orientation of Raspberry Pi. For this test, Raspberry Pi 4 with the beta firmware installed was stood upright with the GPIO header at the bottom and the power and HDMI ports at the top.

Thermal Throttling

Simply moving Raspberry Pi 4 into a vertical orientation has an immediate impact: the SoC idles around 2°C lower than the previous best and heats a lot more slowly – allowing it to run the synthetic workload for longer without throttling and maintain a dramatically improved average clock speed.

There are several factors at work: having the components oriented vertically improves convection, allowing the surrounding air to draw the heat away more quickly, while lifting the rear of the board from a heat-insulating desk surface dramatically increases the available surface area for cooling.

Throttle Point Timing

This chart shows how long it took to reach the throttle point under the synthetic workload. Raspberry Pi 3B+ sits at the bottom, soft-throttling after just 19 seconds. Each successive firmware update for Raspberry Pi 4, meanwhile, pushes the throttle point further and further – though the biggest impact can be achieved simply by adjusting Raspberry Pi’s orientation.

Real World Testing

Synthetic benchmarks aside, how do the boards perform with real workloads?

Looking at the previous pages, it’s hard to get a real idea of the difference in performance between Raspberry Pi 3B+ and Raspberry Pi 4. The synthetic benchmark chosen for the thermal throttle tests performs power-hungry operations which are rarely seen in real-world workloads, and repeats them over and over again with no end.

Compiling Linux

In this test, both Raspberry Pi 3B+ and Raspberry Pi 4 are given the task of compiling the Linux kernel from its source code. It’s a good example of a CPU-heavy workload which occurs in the real world, and is much more realistic than the deliberately taxing synthetic workload of earlier tests.

With this workload, Raspberry Pi 4 easily emerges the victor. Despite its CPU running only 100MHz faster than Raspberry Pi 3B+ at its full speed, it’s considerably more efficient – and, combined with the ability to run without hitting its thermal throttle point, completes the task in nearly half the time.

Kernel compile: Raspberry Pi 3B+

Raspberry Pi 3B+ throttles very early on in the benchmark compilation test and remains at a steady 1.2GHz until a brief period of cooling, as the compiler switches from a CPU-heavy workload to a storage-heavy workload, allows it to briefly spike back to its 1.4GHz default again. Compilation finished in 5097 seconds – one hour, 24 minutes, and 57 seconds.

Kernel compile: Raspberry Pi 4 model B

The difference between the synthetic and real-world workloads is clear to see: at no point during the compilation did Raspberry Pi 4 reach a high enough temperature to throttle, remaining at its full 1.5GHz throughout – bar, as with Raspberry Pi 3 B+, a brief period when a change in compiler workload allowed it to drop to its idle speeds. Compilation finished in 2660 seconds – 44 minutes and 20 seconds.

Get The MagPi magazine issue 88 now

This article is from today’s brand-new issue of The MagPi magazine, the official Raspberry Pi magazine. Buy it from all good newsagents, Raspberry Pi Press, and the Raspberry Pi Store, Cambridge.

Subscribe to pay less per issue and support our work, or download the free PDF to give it a try first.

The post Thermal testing Raspberry Pi 4 appeared first on Raspberry Pi.

Building and testing polyglot applications using AWS CodeBuild

Post Syndicated from Prakash Palanisamy original https://aws.amazon.com/blogs/devops/building-and-testing-polyglot-applications-using-aws-codebuild/

Prakash Palanisamy, Solutions Architect

Microservices are becoming the new normal, and it’s natural to use multiple different programming languages for different microservices in the same application. This blog post explains how easy it is to build polyglot applications, test them, and package them for deployment using a single AWS CodeBuild project.

CodeBuild adds support for Polyglot builds using runtime-versions. With CodeBuild, it is possible to specify multiple runtimes in the buildspec file as part of the install phase. Runtimes can be composed of different major versions of the same programming language, or different programming languages altogether. For a complete list of supported runtime versions and how they must be specified in the buildspec file, see the build specification reference for CodeBuild.

This post provides an example of a microservices application comprised of three different microservices written using different programming languages. The complete code is available in the GitHub repository aws-codebuild-polyglot-application.

  • A microservice providing a greeting message (microservice-greeting) is written in Python.
  • A microservice obtaining users’ details (microservice-name) from an Amazon DynamoDB table is written using Node.js.
  • Another microservice (microservice-webapp), which makes API calls to the above-mentioned name and greeting services and provides a personalized greeting, is written in Java.

All of these microservices are deployed as serverless functions in AWS Lambda and front-ended by an API interface using Amazon API Gateway. To enable automated packaging and deployment, use the AWS Serverless Application Model (AWS SAM) and the AWS SAM CLI. The complete application is deployed locally using DynamoDB Local and the sam local command. Automated UI testing uses the built-in headless browsers in the standard CodeBuild containers. Because DynamoDB Local and sam local run on the Docker runtime, privileged mode must be enabled in the CodeBuild project.

The example buildspec file contains the following code:

version: 0.2

env:
  variables:
    ARTIFACT_BUCKET: "polyglot-codebuild-artifact-eu-west-1"

phases:
  install:
    runtime-versions:	
      nodejs: 10
      java: corretto8
      python: 3.7
      docker: 18
    commands:
      - pip3 install --upgrade aws-sam-cli selenium pylint
  pre_build:
    commands:
      - python --version
      - pylint $CODEBUILD_SRC_DIR/microservices-greeting/greeting_handler.py
      - own_ip=`hostname -i`
      - sed -ie "s/CODEBUILD_IP/$own_ip/g" $CODEBUILD_SRC_DIR/env-webapp.json
  build:
    commands:
      - node --version
      - cd $CODEBUILD_SRC_DIR/microservices-name
      - npm install
      - npm run build
      - java -version
      - cd $CODEBUILD_SRC_DIR/microservices-webapp
      - mvn clean package -Plambda
      - cd $CODEBUILD_SRC_DIR
      - |
        sam package --template-file greeting-sam.yaml \
        --output-template-file greeting-sam-packaged.yaml \
        --s3-bucket $ARTIFACT_BUCKET
      - |
        sam package --template-file name-sam.yaml \
        --output-template-file name-sam-packaged.yaml \
        --s3-bucket $ARTIFACT_BUCKET
      - |
        sam package --template-file webapp-sam.yaml \
        --output-template-file webapp-sam-packaged.yaml \
        --s3-bucket $ARTIFACT_BUCKET
  post_build:
    commands:
      - cd $CODEBUILD_SRC_DIR
      - echo "Starting local DynamoDB instance"
      - docker network create codebuild-local
      - |
        docker run -d -p 8000:8000 --network codebuild-local \
        --name dynamodb amazon/dynamodb-local
      - |
        aws dynamodb create-table --endpoint-url http://127.0.0.1:8000 \
        --table-name NamesTable \
        --attribute-definitions AttributeName=Id,AttributeType=S \
        --key-schema AttributeName=Id,KeyType=HASH \
        --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5
      - |
        aws dynamodb --endpoint-url http://127.0.0.1:8000 batch-write-item \
        --cli-input-json file://ddb_names.json
      - echo "Starting APIs locally..."
      - |
        nohup sam local start-api --template greeting-sam.yaml \
        --port 3000 --host 0.0.0.0 &> greeting.log &
      - |
        nohup sam local start-api --template name-sam.yaml \
        --port 3001 --host 0.0.0.0 --env-vars env.json \
        --docker-network codebuild-local &> name.log &
      - |
        nohup sam local start-api --template webapp-sam.yaml \
        --port 3002 --host 0.0.0.0 \
        --env-vars env-webapp.json &> webapp.log &
      - cd $CODEBUILD_SRC_DIR/website
      - nohup python3 -m http.server 8080 &>webserver.log &
      - sleep 20
      - echo "Starting headless UI testing..."
      - python $CODEBUILD_SRC_DIR/tests/testsuite.py
artifacts:
  files:
    - greeting-sam-packaged.yaml
    - name-sam-packaged.yaml
    - webapp-sam-packaged.yaml

In the env section, an environment variable named ARTIFACT_BUCKET is declared and initialized. It contains the name of the S3 bucket to which the sam package command uploads the packaged artifacts referenced by the generated AWS SAM templates. This variable can be overridden either in the build project or while running the build. For more information about the order of precedence, see Run a Build (Console).

In the install phase, there are two sequences: runtime-versions and commands. The runtime-versions sequence specifies four runtimes: three programming language runtimes (Node.js, Java, and Python), which are required to build the microservices, and the Docker runtime, which is needed to deploy the application locally using the sam local command. In the commands sequence, the required packages are installed and updated.

In the pre_build phase, use pylint to lint the Python-based microservice. Get the IP address of the CodeBuild container, to be used later while running the application locally.

In the build phase, use the command sequence to build individual microservices based on their programming language. Use the sam package command to create the Lambda deployment package and generate an updated AWS SAM template.

In the post_build phase, start a DynamoDB Local container as a daemon, create a table in that database, and insert the user details in to that table. After that, start a local instance of greeting, name, and web app microservices using the sam local command. The repository also contains a simple static website that can make API calls to these microservices and provide a user-friendly response. This website is hosted locally using Python’s built-in HTTP server. After the application starts, execute the automated UI testing by running the testsuite.py script. This test script uses Selenium WebDriver. It runs UI tests on headless Firefox and Chrome browsers to validate whether the local deployment of the application works as expected.
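
To give a sense of what such a headless browser test looks like, here is a minimal Selenium sketch. The URL and the assertion are placeholders for illustration and are not taken from the repository's testsuite.py:

# A minimal headless-browser sketch; the URL and assertion are placeholders,
# not taken from the repository's testsuite.py.
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--headless")
options.add_argument("--no-sandbox")  # often needed inside build containers

driver = webdriver.Chrome(options=options)
try:
    driver.get("http://localhost:8080")              # the locally hosted static website
    assert "greeting" in driver.page_source.lower()  # assumed page content
finally:
    driver.quit()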

If all the phases complete successfully, the updated AWS SAM template files are stored as artifacts based on the specification in artifacts section.

The repository also contains an AWS CloudFormation template that creates a pipeline on AWS CodePipeline with AWS CodeCommit as sources. It builds the application using the previously mentioned buildspec in CodeBuild, and deploys the application using AWS CloudFormation.

Conclusion

In this post, you learned how to use the runtime-versions capability in CodeBuild to build a polyglot application. You also learned about automated UI testing using the built-in headless browsers in CodeBuild. Using the polyglot build capabilities of CodeBuild, you can efficiently build applications developed in multiple programming languages with a single build project, instead of creating and maintaining multiple build projects. Similarly, since all the dependent components are built as part of the same project, it is also easier to test them, including the integration between components.

Loki, a dynamic mock server for HTTP/TCP testing

Post Syndicated from Grab Tech original https://engineering.grab.com/loki-dynamic-mock-server-http-tcp-testing

Background

In a previous article we introduced Mockers – an innovative tool for local box testing at Grab. Mockers used a Shift Left testing strategy, making testing more effective and cheaper for development teams. Mockers’ popularity and success motivated us to create Loki – a one-stop dynamic mock server for local box testing of mobile apps.

There are some unique challenges in mobile app testing at Grab. End-to-end testing of an app is difficult due to high dependency on backend services and other apps. The staging environment, which hosts a plethora of backend services, is tough to manage and maintain. Issues such as staging downtime, configuration mismatches, and data corruption can affect it, adding to the testing woes. Moreover, our apps are fairly complex, utilizing multiple transport protocols such as HTTP, HTTPS, and TCP for various business flows.

The business flows are also complex, requiring exhaustive setup such as credit card payment configuration, location spoofing, etc, which results in high maintenance costs for automated testing. Loki simulates these flows so developers can easily test use cases that take longer to set up in a real backend staging environment.

Loki is our attempt to address challenges in mobile app testing by turning every developer local box into a full fledged pseudo backend environment where all mobile workflows can be tested without any external dependencies. It mocks backend services on developer local boxes, decoupling the mobile apps from real backend services, which provides several advantages such as:

No need to deploy frequently to staging

Testing is blocked if the app receives a bad response from staging. In these cases, code changes have to be deployed on staging to fix issues before resuming tests. In contrast, using Loki lets developers continue testing without any immediate need to deploy code changes to staging.

Allows parallel frontend and backend development

Loki acts as a mock backend service when the real backend is still evolving. It lets the frontend development run in parallel with backend development.

Overcome time limitations

In a one week regression-and-release scenario, testing time is limited. However, the application UI rendering and functionality still needs reasonable testing. Loki lets developers concentrate on testing in the available time instead of fixing dependencies on backend services.

Loki – Grab’s solution to simplify mobile apps testing

At Grab, we have multiple mobile apps that are dependent on each other. For example, our Passenger and Driver apps are two sides of a coin; the driver gets a job card only when a passenger requests a booking. These apps are developed by different teams, each with its own release cycle. This can make it tricky to confidently and repeatedly test the whole business flow across apps. Apps also depend on multiple backend services to execute a booking or food order and communicate over different protocols.

Here’s a look at how our mobile apps interact with backend services over different protocols:

Mobile app interaction with backend services

Loki is a dynamic mock server, written in Golang, running in a Docker container on the local box or in CI. It is easy to set up and run through standard Docker commands. In the context of mobile app testing, it plays the role of backend services, so you no longer need to set up an extensive staging environment.

The Loki architecture looks like this:

Loki architecture

The technical challenges we had to overcome

We wanted a comprehensive mocking solution so that teams don’t need to integrate multiple tools to achieve independent testing. It turned out that mocking TCP was most challenging because:

  • It is a long running client-server connection, and it doesn’t follow an HTTP-like request/response pattern.
  • Messages can be sent to the app without an incoming request as well, hence we had to expose a way via Loki to set a mock expectation which can send messages to the app without any request triggering it.
  • As TCP is a long running connection, we needed a way to delimit incoming requests so we know when we can truncate and deserialize the incoming request into JSON.

We engineered the Loki backend to support both HTTP and TCP protocols on different ports. Yet, the mock expectations are set up using RESTful APIs over HTTP for both protocols. A single point of entry for setting expectations made it more intuitive for our developers.

An in-memory cron implementation pushes scheduled messages to the app over a TCP connection. This enabled testing of complex use cases such as drivers getting new job cards, driver and passenger chat workflows, etc. The delimiter for TCP protocol is configurable at start up, so each team can decide when to truncate the request.

To enable Loki on our CI, we had to reduce its memory footprint. Hence, we built Loki with pluggable storages. MySQL is used when running on local and on CI we switch seamlessly to in-memory cache or Redis.

For testing apps locally, developers must validate complex use cases such as:

  • Payment related flows, which require the response to include the same payment ID as sent in the request. This is a case of simple mapping of request fields in the response JSON.

  • Flows requiring runtime logic execution. For example, a job card sent to a driver must have a valid timestamp, requiring runtime computation on Loki.

To support these cases and many more, we added JavaScript injection capability to Loki. So, when we set an expectation for an HTTP request/response pair or for TCP events, we can specify JavaScript for computing the dynamic response. This is executed in a sandbox by an in-house JS execution library.

Grab follows a transactional workflow for bookings. Over the life of a ride, bookings go through different statuses. So, Loki had to address multiple HTTP requests to the same endpoint returning different responses. This feature is required for successfully mocking a whole ride end-to-end.

Loki uses an HTTP API “httpTimesAndOrder” for this feature. For example, using “httpTimesAndOrder”, you can configure the same status endpoint (/ride/status) to return different ride statuses such as “PICKING” for the first five requests, “IN_RIDE” for the next three requests, and so on.

Now, let’s look at how to use Loki to mock HTTP requests and TCP events.

Mocking HTTP requests

To mock HTTP requests, developers first point their app to send requests to the Loki mock server. Then, they set up expectations for all requests sent to the Loki mock server.

Loki mock server

For example, the Passenger app calls an HTTP dependency GET /closeby/drivers/ to get nearby drivers. To mock it with Loki, you set an expected response on the Loki mock server. When the GET /closeby/drivers/ request is actually made from the Passenger app, Loki returns the set response.

This snippet shows how to set an expected response for the GET /closeby/drivers/ request:

Loki API: POST `/api/v1/expectations`

Request Body :

{
  "uriToMock": "/closeby/drivers",
  "method": "GET",
  "response": {
    "drivers": [
      1001,
      1002,
      1010
    ]
  }
}
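
As a rough illustration of how this is wired up from a test script, the expectation can be registered with a plain HTTP POST and the mocked endpoint then called just as the app would call it. The following Python sketch assumes the Loki container is reachable at localhost:8080; only the API path and payload come from the example above:

# A minimal sketch; the host and port are assumptions, and only the API path
# and payload come from the example above.
import requests

LOKI = "http://localhost:8080"  # assumed address of the Loki container

expectation = {
    "uriToMock": "/closeby/drivers",
    "method": "GET",
    "response": {"drivers": [1001, 1002, 1010]},
}

# Register the expectation with Loki.
requests.post(f"{LOKI}/api/v1/expectations", json=expectation).raise_for_status()

# The app (or the test) now receives the canned response.
drivers = requests.get(f"{LOKI}/closeby/drivers").json()
assert drivers == {"drivers": [1001, 1002, 1010]}  # assumes Loki returns the body as set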

Workflow for setting expectations and receiving responses

Workflow for setting expectations and receiving responses

Mocking TCP events

Developers point their app to Loki over a TCP connection and set up the TCP expectations. Loki then generates scheduled events such as sending push messages (job cards, notifications, etc) to the apps pointing at Loki.

For example, if the Driver app, after it starts, wants to get a job card, you can set an expectation in Loki to push a job card over the TCP connection to the Driver app after a scheduled time interval.

This snippet shows how to set the TCP expectation and schedule a push message:

Loki API: POST `/api/v1/tcp/expectations/pushmessage`

Request Body :

{
  "name": "samplePushMsg",
  "msgSequence": [
    {
      "messages": {
        "body": {
          "jobCardID": 1001
        }
      }
    },
    {
      "messages": {
        "body": {
          "jobCardID": 1002
        }
      }
    }
  ],
  "schedule": "@every 1m"
}

Workflow for scheduling a push message over TCP

Workflow for scheduling a push message over TCP

Some example use cases

Now that you know about Loki, let’s look at some example use cases.

Generating a custom response at runtime

Our first example is customizing a runtime response for both HTTP and TCP requests. This is helpful when developers need dynamic responses to requests. For example, you can add parameters from the request URL or request body to the runtime response.

It’s simple to implement this with a JavaScript function. Assume you want to embed a parameter from the request URL in the response. To do this, you first use a POST method to set up the expectation (in JSON format) for the request on Loki:

Loki API: POST `/api/v1/feature/expectations`

Request Body :

{
  "expectations": [{
    "name": "Sample call",
    "desc": "v1/test/{name}",
    "tags": "v1/test/{name}",
    "resource": "/v1/test?name=user1",
    "verb": "POST",
    "response": {
      "body": "{ \"msg\": \"Hi \"}",
      "status": 200
    },
    "clientOptions": {
"javascript": "function main(req, resp) { var url = req.RequestURI; var captured = /name=([^&]+)/.exec(url)[1]; resp.msg =  captured ? resp.msg + captured : resp.msg + 'myDefaultValue'; return resp }"
    },
    "isActive": 1
  }]
}

When Loki receives the request, the JavaScript function defined in the clientOptions key adds the name to the response at runtime. For example, this is the request’s fixed response:

{
    "msg": "Hi "
}

But, after using the JavaScript function to add the URL parameter, the dynamic response is:

{
    "msg": "Hi user1"
}

Similarly, you can use JavaScript to add other dynamic responses such as modifying the response’s JSON array, adding parameters to push messages, etc.

Defining a response sequence for mocked API endpoints

Here’s another interesting example – defining the response sequence for API endpoints.

A response sequence is useful when you need different responses from the same API endpoint. For example, a status endpoint should return different ride statuses such as ‘allocating’, ‘allocated’, ‘picking’, etc. depending on the stage of a ride.

To do this, developers set up their HTTP expectations on Loki. Then, they easily define the response sequence for an API endpoint using a Loki POST method.

In this example:

  • times – specifies the number of times the same response is returned.
  • after – specifies one or more expectations that must match before a specified expectation is matched.

Here, the expectations are matched in this sequence when a request is made to an endpoint – Allocating > Allocated > Pickuser > Completed. Further, Completed is set to two times, so Loki returns this response two times.

Loki API: POST `/api/v1/feature/sequence`

Request Body :
  "httpTimesAndOrder": [
      {
          "name": "Allocating",
          "times": 1
      },
      {
          "name": "Allocated",
          "times": 1,
          "after": ["Allocating"]
      },
      {
          "name": "Pickuser",
          "times": 1,
          "after": ["Allocated"]
      },
      {
          "name": "Completed",
          "times": 2,
          "after": ["Pickuser"]
      }
  ]
}

In conclusion

Since Loki’s inception, we have set up a full-range CI with proper end-to-end app UI tests and, to a great extent, decoupled our app releases from the staging backend. This has improved delivery cycles, enabling faster bug catching and more exhaustive testing. Moreover, both developers and QAs can easily play with the apps to perform exploratory testing as well as manual functional validations. Teams are also using Loki to run automated scripts (Espresso and XCUItests) for validating the mobile app pages.

Loki’s adoption is growing steadily at Grab. With our frequent release of new mobile app features, Loki helps teams meet our high quality bar and achieve huge productivity gains.

If you have any feedback or questions on Loki, please leave a comment.

Anatomy of a product quality issue: PoE HAT

Post Syndicated from James Adams original https://www.raspberrypi.org/blog/poe-hat-revision/

One of the neat new features of the Raspberry Pi 3 Model B+ is its support for IEEE 802.3af Power-over-Ethernet (PoE). This standard allows up to 13W of power to be delivered over the twisted pairs in an Ethernet cable without interfering with the transmission of data. The Raspberry Pi board itself provides a PoE-capable Ethernet jack and circuit protection components; the power regulation electronics, which would be too costly and bulky to include on the main board, live on a separate HAT.

Raspberry Pi PoE HAT Power over ethernet

The Raspberry Pi 3B+ wearing a PoE HAT

When we announced the 3B+, we revealed that an official Raspberry Pi PoE HAT was in the works and, after a few unforeseen production delays, we released this HAT at the end of August. Feedback was, and remains, generally very positive; but fairly quickly, we started to see some reports from users who were experiencing issues.

The problem

The problem they reported was this: when powering certain Raspberry Pi units via the PoE HAT, it was not possible to draw the full rated current from the USB ports.

Our 5V USB output, denoted VBUS, is fed by the main 5V rail via a current-limiting switch. This switch is designed to protect the system by detecting short-circuit, over-current, or reverse-voltage events, and disconnecting the USB ports in response. Our current-limiting switch is set to a limit of just over 1A.

Despite the PoE HAT’s ability to supply up to 2.5A, the experiments we ran in response to the reports suggested that, when it was used to supply some boards, the USB supply would trip out at a much lower current. Mice and keyboards worked fine, but higher-current devices such as wireless dongles and hard disks would fail.

Our initial theory was that the PoE HAT was injecting noise into the Pi via the 5V rail, and that this was somehow upsetting the switch. However, we were able to rule this out, since we found no evidence of high-frequency noise at the input to the switch. Another theory was that the flyback transformer’s close physical proximity to the switch was somehow coupling noise in. But we were able to rule this out as well: we showed that the behaviour persisted when the HAT was connected using a right-angle header, which moves the power electronics away from the Raspberry Pi.

What was happening?

The PoE HAT works by converting the incoming 48V from the Ethernet lines to 5V using a flyback transformer. In simple terms, the primary side of the transformer is switched across the 48V, and energy is stored in the transformer in the form of a magnetic field. The primary is then disconnected and the magnetic field collapses. This changing magnetic field induces a voltage (scaled based on the transformer turns ratio) in the secondary, which is rectified by a Schottky diode and smoothed by the output capacitance. This output capacitance is formed from the output capacitors on the PoE HAT itself, the capacitors on the Raspberry Pi 5V rail, and, when the switch is on, the VBUS reservoir capacitors.

The switching frequency of the flyback transformer is relatively low (~100 kHz). This means that when the system is under load, each switching cycle must transfer a relatively large amount of energy. During each cycle, the 5V rail is discharged according to the load on the system, and charged up again by the flyback’s secondary, dumping more energy into the caps. In each cycle, a spike of high current is pushed through the output diode into the capacitors.

To cut a long story short, putting a current probe on the input to the switch showed large current spikes, as energy from the flyback made its way into the VBUS reservoir capacitors. This was expected. However, it turned out that the switch was erroneously registering these spikes as true over-current events. The switch is supposed to have a filter that allows it to ignore brief spikes, but we discovered that only one of the two approved versions of the switch did this correctly.

Current into switch (yellow) and VBUS voltage (blue)

If it’s not been tested, it’s broken

It’s a truism that if you don’t test an aspect of a design, it will certainly be broken. Those of us with a Broadcom background sometimes refer to this as Alan Morgan’s rule, after its most enthusiastic proponent.

Extensive testing over all configurations, operating parameters, and use cases is the only way to minimise the likelihood of releasing a product with a hardware issue. Even relatively simple hardware can end up catching you out by throwing up some unexpected bug or issue. And even the big guys with huge development teams and test labs occasionally mess things up — anyone remember the Pentium FDIV bug?

We made several mistakes with the first version of the PoE HAT:

  • USB load testing was performed using boards that had the working switch
  • Our field testing programme was abbreviated because the product was late
  • We didn’t inquire as to whether our field testers were using high-current peripherals (they weren’t)

It’s embarrassing to have released a product with a bug like this, but it’s a lesson well-learned, and we will be improving our internal processes to prevent a recurrence.

The solution

Fortunately, this bug turned out to be easy to fix. We designed an L-C filter to apply further smoothing to the output current from the HAT. The filter consists of a little extra input and output capacitance and a 4.7µH inductor (chosen to have a suitable current rating and DC resistance), as well as a 330mR resistor in parallel to provide damping. We were even able to wrap the mod up in a little mezzanine PCB that fits neatly underneath the board.
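
As a rough back-of-the-envelope check (not from the original design notes), the corner frequency of such an L-C filter sits far below the ~100 kHz switching frequency, which is why it can smooth out the per-cycle current spikes. The capacitance below is an assumed, illustrative value; the post does not state the exact capacitance used on the mezzanine:

# Sketch: L-C corner frequency f = 1 / (2 * pi * sqrt(L * C))
# L = 4.7 uH (from the post); C = 100 uF is an assumed example value.
awk 'BEGIN { L = 4.7e-6; C = 100e-6;
             printf "corner frequency ~ %.1f kHz\n", 1 / (2 * 3.14159265 * sqrt(L * C)) / 1000 }'

With those example values the corner frequency comes out around 7 kHz, comfortably below the switching frequency, so the spikes are strongly attenuated before they reach the current-limiting switch.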

The original, un-modded board

Hand-modded board with L-C filter

Final board with mezzanine

Once we had confirmed that there was a problem with the PoE HAT, we took the product off sale, and recalled and reworked the outstanding units. We are now happy to announce that most Approved Resellers should now have the revised boards in stock. We believe that most people who have been affected by this issue have already returned their PoE HATs for a refund; if you’re experiencing issues and haven’t yet returned your product, you can get in touch with your reseller to arrange a replacement.

I’d like to thank the members of the Raspberry Pi engineering team, our contract manufacturing partners Taijie, our licensee partners and Approved Resellers, and also the community members who kindly tested prototypes of the fixed board design. This hasn’t been the easiest product launch in our history, but hopefully the lessons learned have set us up well for the future.

The post Anatomy of a product quality issue: PoE HAT appeared first on Raspberry Pi.

How to Test and Debug AWS CodeDeploy Locally Before You Ship Your Code

Post Syndicated from Kirankumar Chandrashekar original https://aws.amazon.com/blogs/devops/how-to-test-and-debug-aws-codedeploy-locally-before-you-ship-your-code/

AWS CodeDeploy is a powerful service for automating deployments to Amazon EC2, AWS Lambda, and on-premises servers. However, it can take some effort to get complex deployments up and running or to identify the error in your application when something goes wrong.

When I set up new deployments or debug existing ones, I like to test and debug locally for these reasons:

  • To speed up the iteration process.
  • To isolate potential issues.
  • To validate code.

You can test application code packages on any machine that has the CodeDeploy agent installed before you deploy it through the service. Likewise, to debug locally, you just need to install the CodeDeploy agent on any machine, including your local server or EC2 instance.

In this blog post, I will walk you through the steps to validate and debug a sample application package using the codedeploy-local command. You can find the sample package in this GitHub repository.

 

 

Prerequisites

Install the CodeDeploy agent on any supported instance type. For information, see Use the AWS CodeDeploy Agent to Validate a Deployment Package on a Local Machine in the AWS CodeDeploy User Guide.

Step 1

Verify the CodeDeploy agent is installed and ready for local testing. By default, codedeploy-local is installed in the following locations:

On Amazon Linux, RHEL, or Ubuntu Server:

/opt/codedeploy-agent/bin/codedeploy-local

On Windows Server:

C:\ProgramData\Amazon\CodeDeploy\bin

For simplicity, I am creating an alias, codedeploy-local, for /opt/codedeploy-agent/bin/codedeploy-local so that I don’t have to type the absolute path every time. This step is optional.

alias codedeploy-local='sudo /opt/codedeploy-agent/bin/codedeploy-local'

When I execute the codedeploy-local command on the Linux terminal, I get the following response from the agent, which indicates that the agent is installed:

[ec2-user@ip-… ~]$ codedeploy-local
ERROR: Expecting appspec file at location /home/ec2-user/appspec.yml but it is not found there. Please either run the CLI from within a directory containing the appspec.yml file or specify a bundle location containing an appspec.yml file in its root directory

If you receive an error that the codedeploy-local command is not available or the package was not found, go back to the prerequisites and install the agent.

Step 2
To test the sample application package using the codedeploy-local command, I have to make sure that the application package is available on the local machine. The sample package I am testing here is an Apache (httpd)-based application.

Use wget to download the package to the local machine.

wget https://s3.amazonaws.com/aws-codedeploy-us-east-1/samples/latest/SampleApp_Linux.zip

Now that the sample package is available locally, I can either unzip the package or use the zip file for testing with the codedeploy-local command.

To test the zip file (archive) package (SampleApp_Linux.zip) with the codedeploy-local command, use the -l or --bundle-location option along with the -t or --type option as shown:

On Linux server:

codedeploy-local --bundle-location /home/ec2-user/CodeDeployPackage/SampleApp_Linux.zip -t zip --deployment-group my-deployment-group

On Windows server:

codedeploy-local --bundle-location C:/path/to/local/bundle.zip --type zip --deployment-group my-deployment-group

To test an unarchived (extracted) package, either change the directory (cd) to its top-level directory or provide the absolute path to the application package.

The package can be executed by providing the absolute path to the content as shown here:

codedeploy-local --bundle-location /path/to/local/bundle/directory

Or by changing the directory (cd) to the location of the unarchived package and executing the following command:

codedeploy-local

Executing the codedeploy-local command in the directory where the sample package is unzipped shows whether the deployment was successful or failed.

Here is a successful deployment execution and result:

[ec2-user@ip-… CodeDeployPackage]$ ls -a
.  ..  appspec.yml  index.html  LICENSE.txt  SampleApp_Linux.zip  scripts

[ec2-user@ip-… CodeDeployPackage]$ codedeploy-local
Starting to execute deployment from within folder /opt/codedeploy-agent/deployment-root/default-local-deployment-group/d-H3OZK261S-local
See the deployment log at /opt/codedeploy-agent/deployment-root/default-local-deployment-group/d-H3OZK261S-local/logs/scripts.log for more details
AppSpec file valid. Local deployment successful

Step 3

Check the codedeploy-local logs and the deployment archive.

In the previous step, I was able to see that the local deployment was successful. The output included:

  • The log location.
  • The location of the deployment-archive, which is used as a staging directory for that deployment.

Because the --deployment-group (-g) option was not provided, a local deployment group folder was created in the following location:

/opt/codedeploy-agent/deployment-root/default-local-deployment-group/d-H3OZK261S-local

The following shows the listing of the files in the codedeploy-local deployment directory for a deployment:

[ec2-user@ip-… ~]$ ls /opt/codedeploy-agent/deployment-root/default-local-deployment-group/d-H3OZK261S-local
deployment-archive  logs

[ec2-user@ip-… deployment-archive]$ ls -a /opt/codedeploy-agent/deployment-root/default-local-deployment-group/d-H3OZK261S-local/deployment-archive/
.  ..  appspec.yml  index.html  LICENSE.txt  SampleApp_Linux.zip  scripts

[ec2-user@ip-… deployment-archive]$ ls -a /opt/codedeploy-agent/deployment-root/default-local-deployment-group/d-H3OZK261S-local/logs
.  ..  scripts.log

In the directory path generated for each deployment, default-local-deployment-group is the name of the deployment group and d-H3OZK261S-local is the deployment ID.

The scripts.log shows the execution logs for the codedeploy-local command for a deployment group and deployment ID. Here is an example of a scripts.log that shows the execution of each lifecycle event defined in the appspec.yml:

[ec2-user@ip-… deployment-archive]$ cat /opt/codedeploy-agent/deployment-root/default-local-deployment-group/d-H3OZK261S-local/logs/scripts.log
2018-03-13 23:02:37 LifecycleEvent - ApplicationStop
2018-03-13 23:02:37 Script - scripts/stop_server
2018-03-13 23:02:37 [stdout]Stopping httpd: [  OK  ]
2018-03-13 23:02:37 LifecycleEvent - BeforeInstall
2018-03-13 23:02:37 Script - scripts/install_dependencies
2018-03-13 23:02:37 [stdout]Loaded plugins: priorities, update-motd, upgrade-helper
2018-03-13 23:02:37 [stdout]Package httpd-2.2.34-1.16.amzn1.x86_64 already installed and latest version
2018-03-13 23:02:37 [stdout]Nothing to do
2018-03-13 23:02:37 Script - scripts/start_server
2018-03-13 23:02:37 [stdout]Starting httpd: [  OK  ]

There is another log file in this location that comes in handy when deploying the code on the local machine:

/var/log/aws/codedeploy-agent/codedeploy-local.log

You can enable verbose logging in the codedeploy-agent configuration file by setting the parameter :verbose: to true.

By default, the location of the configuration file is:

Amazon Linux, RHEL, or Ubuntu Server instances

/etc/codedeploy-agent/conf/codedeployagent.yml

Windows Server

C:/ProgramData/Amazon/CodeDeploy/conf.yml
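
For example, on a Linux instance you could flip that flag with a quick edit from the shell. This is only a sketch; it assumes the file still contains the default ":verbose: false" entry:

sudo sed -i 's/^:verbose: false/:verbose: true/' /etc/codedeploy-agent/conf/codedeployagent.yml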

Other features for debugging issues locally with codedeploy-local

The codedeploy-local command has other features that you can use to debug and troubleshoot issues.

Override the lifecycle hooks mentioned in the appspec.yml file

You can use codedeploy-local to override the lifecycle hooks provided in the appspec.yml. In this example, only the ApplicationStop lifecycle hook defined in the appspec.yml file will be executed. All other hooks will be ignored.

codedeploy-local -e ApplicationStop

In the same way, you can override the order in which the CodeDeploy agent executes multiple lifecycle hooks. This feature can help you determine and change the sequence before the deployment is performed on the server. For information, see AppSpec ‘hooks’ Section in the AWS CodeDeploy User Guide.

For example, this command executes the BeforeInstall lifecycle hook first and then executes the ApplicationStop lifecycle hook.

codedeploy-local -e BeforeInstall,ApplicationStop

Execute scripts specifically for codedeploy-local

If there are scripts that are used for local testing only and not required for the CodeDeploy deployment, then you can use the $DEPLOYMENT_GROUP_NAME variable, which has a value equal to LocalFleet.

Here are other environment variables and their values:

$APPLICATION_NAME: The location of the deployment package (for example, /home/ec2-user/CodeDeployPackage)

$DEPLOYMENT_ID: Unique per deployment (for example, d-LTVP5L6YY-local)

$DEPLOYMENT_GROUP_ID: The name of the deployment group. When the -g option is used for the command, this value will be passed. For example, in codedeploy-local -g testing, this value is testing. If this option is not set, the value of this environment variable is default-local-deployment-group

$LIFECYCLE_EVENT: The lifecycle hook that echoed this environment variable (for example, ApplicationStop)
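
Putting those variables to work, a lifecycle hook script could branch on the deployment group name so that local test runs skip steps that only make sense on a real fleet. This is only a sketch; the start_local_stub.sh helper is hypothetical:

#!/bin/bash
# scripts/start_server (sketch): behave differently during codedeploy-local runs
if [ "$DEPLOYMENT_GROUP_NAME" = "LocalFleet" ]; then
    echo "Local deployment $DEPLOYMENT_ID in group $DEPLOYMENT_GROUP_ID: starting stub server"
    ./scripts/start_local_stub.sh   # hypothetical local-only helper
else
    service httpd start
fi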

Override the CodeDeploy agent configuration

You can override the CodeDeploy agent configuration and use your own configuration file from a custom location. This functionality makes it possible to test multiple configurations with local deployments using the -c, --agent-configuration-file option while executing the codedeploy-local command. For the options to use, see AWS CodeDeploy Agent Configuration Reference in the AWS CodeDeploy User Guide.

By default, configuration files are stored in the following locations:

On Amazon Linux, RHEL, or Ubuntu Server:

/etc/codedeploy-agent/conf/codedeployagent.yml

On Windows Server:

C:/ProgramData/Amazon/CodeDeploy/conf.yml

Using custom configuration helps when verbose logging is required for package testing. You can do this just by using the -c or --agent-configuration-file option and without changing the default configuration file. Here is an example that shows the use of this option:

codedeploy-local -e BeforeInstall,ApplicationStop -c /<local-path>/

For example, on Amazon Linux, RHEL, or Ubuntu Server instances, when the config file is in /etc/codedeployagent.yml, the command is:

codedeploy-local -e BeforeInstall,ApplicationStop -c /etc/codedeployagent.yml

For example, on Windows Server instances, when the config file is in C:/ProgramData/conf.yml, the command is:

codedeploy-local -e BeforeInstall,ApplicationStop -c C:/ProgramData/conf.yml

Point to an application package in an S3 bucket or GitHub repository

If the application package is stored in an S3 bucket or GitHub repository, codedeploy-local can be executed without downloading the file onto the local machine. You can do this using the -l, --bundle-location and -t, --type options with the codedeploy-local command.

Here is an example for deploying a sample application package located in an S3 bucket:

codedeploy-local -l s3://aws-codedeploy-us-east-1/samples/latest/SampleApp_Linux.zip -t zip

Here is an example for deploying a sample application package from a public GitHub repository:

codedeploy-local --bundle-location https://api.github.com/repos/awslabs/aws-codedeploy-sample-tomcat/zipball/master --type zip

If you use GitHub, make sure that the application package with the appspec.yml is in the root of the repository. If these contents are in a subfolder, download the package to the local instance or server and then:

  • Execute codedeploy-local from the directory where the file exists.

-OR-

  • Use the -t, --type option with the value of directory and -l, --bundle-location as the local path, as shown in the following example.
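
For example, if the extracted contents live in a subfolder, the second option would look something like this (the path is a placeholder):

codedeploy-local --bundle-location /home/ec2-user/my-app-subfolder --type directory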

Troubleshooting common errors using codedeploy-local

The codedeploy-local command can be used to detect if the appspec.yml is in valid YAML format. If the format is invalid, you get the following error:

/usr/share/ruby/vendor_ruby/2.0/psych.rb:205:in `parse': (<unknown>): mapping values are not allowed in this context at line 10 column 13 (Psych::SyntaxError)

If there is an invalid lifecycle hook in the appspec.yml file, the deployment fails with this error:

ERROR: appspec.yml file contains unknown lifecycle events: ["BeforeInstall1"]

The name of a lifecycle hook is case-sensitive. The following error is returned because the BeforeInstall lifecycle hook was entered as Beforeinstall:

ERROR: appspec.yml file contains unknown lifecycle events: ["Beforeinstall"]

If there is any error in the scripts provided for execution in any lifecycle hooks (for example, a problem in the BeforeInstall script), the execution logs show something like this:

codedeploy-local -g testing
Starting to execute deployment from within folder /opt/codedeploy-agent/deployment-root/testing/d-6UBAIVVSK-local
Your local deployment failed while trying to execute your script at /opt/codedeploy-agent/deployment-root/testing/d-6UBAIVVSK-local/deployment-archive/scripts/install_dependencies
See the deployment log at /opt/codedeploy-agent/deployment-root/testing/d-6UBAIVVSK-local/logs/scripts.log for more details

For the preceding error, when you look at the logs in the deployment directory for the deployment group, you will see something like this:

cat /opt/codedeploy-agent/deployment-root/testing/d-6UBAIVVSK-local/logs/scripts.log
2018-03-21 03:34:04 LifecycleEvent - ApplicationStop
2018-03-21 03:34:04 Script - scripts/stop_server
2018-03-21 03:34:04 [stdout]LocalFleet
2018-03-21 03:34:04 [stdout]/home/ec2-user/CodeDeployPackage
2018-03-21 03:34:04 [stdout]d-6UBAIVVSK-local
2018-03-21 03:34:04 [stdout]testing
2018-03-21 03:34:04 [stdout]ApplicationStop
2018-03-21 03:34:04 [stdout]Stopping httpd: [  OK  ]
2018-03-21 03:34:04 LifecycleEvent - BeforeInstall
2018-03-21 03:34:04 Script - scripts/install_dependencies
2018-03-21 03:34:04 [stdout]Loaded plugins: priorities, update-motd, upgrade-helper
2018-03-21 03:34:04 [stdout]No package httpd1 available.
2018-03-21 03:34:04 [stderr]Error: Nothing to do

This log snippet shows that the install_dependencies script had a package called httpd1 that is not available for installation.

If the appspec.yml is not found in the root of the application package, you will see an error like this:

/opt/codedeploy-agent/lib/instance_agent/plugins/codedeploy/hook_executor.rb:213:in `parse_app_spec': The CodeDeploy agent did not find an AppSpec file within the unpacked revision directory at revision-relative path "appspec.yml". The revision was unpacked to directory "/opt/codedeploy-agent/deployment-root/default-local-deployment-group/d-BE59ORH9I-local/deployment-archive", and the AppSpec file was expected but not found at path "/opt/codedeploy-agent/deployment-root/default-local-deployment-group/d-BE59ORH9I-local/deployment-archive/appspec.yml". Consult the AWS CodeDeploy Appspec documentation for more information at http://docs.aws.amazon.com/codedeploy/latest/userguide/reference-appspec-file.html (RuntimeError)
    from /opt/codedeploy-agent/lib/instance_agent/plugins/codedeploy/hook_executor.rb:100:in `initialize'
    from /opt/codedeploy-agent/lib/instance_agent/plugins/codedeploy/command_executor.rb:147:in `new'
    from /opt/codedeploy-agent/lib/instance_agent/plugins/codedeploy/command_executor.rb:147:in `block (3 levels) in map'
    from /opt/codedeploy-agent/lib/instance_agent/plugins/codedeploy/command_executor.rb:146:in `each'
    from /opt/codedeploy-agent/lib/instance_agent/plugins/codedeploy/command_executor.rb:146:in `block (2 levels) in map'
    from /opt/codedeploy-agent/lib/instance_agent/plugins/codedeploy/command_executor.rb:68:in `execute_command'
    from /opt/codedeploy-agent/lib/aws/codedeploy/local/deployer.rb:85:in `block in execute_events'
    from /opt/codedeploy-agent/lib/aws/codedeploy/local/deployer.rb:84:in `each'
    from /opt/codedeploy-agent/lib/aws/codedeploy/local/deployer.rb:84:in `execute_events'
    from /opt/codedeploy-agent/bin/codedeploy-local:117:in `<main>'

Conclusion

The codedeploy-local command can be used to validate and debug an application package for deployments to Amazon EC2 instances or on-premises servers. With codedeploy-local, you can test and fix errors on a local machine during the code development phase. CodeDeploy local deployments also make it possible for you to change the order of the lifecycle hooks, so you can restructure the appspec.yml to add commands on the fly.

How to Run Headless Front-End Tests with AWS Cloud9 and AWS CodeBuild

Post Syndicated from Eric Z. Beard original https://aws.amazon.com/blogs/devops/how-to-run-headless-front-end-tests-with-aws-cloud9-and-aws-codebuild/

Automated testing is a critical component to a well-designed software development lifecycle. When you test front-end applications, you often use a browser in combination with testing frameworks. A headless browser is one that is used on a server that does not normally need to run visual applications. In this blog post, I will show you how to configure AWS Cloud9 and AWS CodeBuild to support testing an Angular application with the headless version of Chrome. AWS Cloud9 has deep integration with services such as AWS Lambda, and the environment is easily accessible anywhere, from any internet-connected device.

AWS Cloud9

By default, Cloud9 runs on an Amazon EC2 instance that is managed for you. You can also run it on any Linux machine that is accessible through SSH.

First, create a Cloud9 environment.

  1. Sign in to the AWS Management Console, scroll down to Developer Tools, and choose Cloud9.
  2. On the following page, choose Create Environment.
  3. Enter a name for your environment and then choose Next Step.
  4. On the following page, leave the defaults for the time being and click Next Step.
  5. On the following page, choose Create Environment.

It might take a few minutes for your environment to initialize. Behind the scenes, an EC2 instance is created for you in the region you have currently selected in the console. In the environment, press Alt-T to bring up a bash terminal tab. For the remaining steps in this post, you will enter commands into this tab.

There is a lot to take in if this is your first time using Cloud9. If you need help getting set up or want to learn more, see the Cloud9 User Guide.

Install and configure Angular

The first thing we will do in our new environment is to install and configure an Angular application.

  1. Upgrade Node to the latest version supported by AWS Lambda. (At the time of this writing, that’s 8.10.)
    nvm install 8.10
  2. Install the Angular CLI using npm, the Node Package Manager. Install it as a global package with the -g option so that it is available to run from anywhere in your environment.
    npm install -g @angular/cli
  3. Use the Angular CLI to create an Angular application.
    ng new my-app
    cd my-app/
  4. Run the application to make sure everything is working as expected. To preview a running application in Cloud9, the app must run on a specific port. With Angular, you must disable the default host header check.
    ng serve --port 8080 --host localhost --disable-host-check

     

    On the toolbar, next to Run, choose Preview and then choose Preview Running Application. You should see something like this:

  5. Press Ctrl-C to stop serving and then, in the my-app directory, try to test your application.
    ng test --watch=false

    That obviously doesn’t work the way you would expect it to on a regular workstation. The testing framework can’t find Chrome because we are running on a headless EC2 instance. To start addressing the problem, first install a package called Puppeteer as a development dependency in your application.

    I’d like to give credit here to Alex Bainter, a software developer who wrote a comprehensive blog post about replacing PhantomJS with headless Chromium and Karma. His post was extremely helpful to me when I had to figure this out for the first time.

  6. Install Puppeteer and its dependencies.
    npm i -D puppeteer
    npm i -D @angular-devkit/build-angular
  7. You can get a good look at the missing Chrome libraries by running the ldd command on the binary that comes with Puppeteer.
    cd node_modules/puppeteer/.local-chromium/linux-564778/chrome-linux/

    (By the time you read this post, the version number in that path will probably be different. Look in the puppeteer/.local-chromium directory to see what it is for your installation.)

    ldd chrome | grep not

    You should see output that looks like this:

    libXcursor.so.1 => not found
    libXdamage.so.1 => not found
    libXfixes.so.3 => not found
    libcups.so.2 => not found
    libXss.so.1 => not found
    libXrandr.so.2 => not found
    libpangocairo-1.0.so.0 => not found
    libpango-1.0.so.0 => not found
    libcairo.so.2 => not found
    libatk-1.0.so.0 => not found
    libatk-bridge-2.0.so.0 => not found
    libgtk-3.so.0 => not found
    libgdk-3.so.0 => not found
    libgdk_pixbuf-2.0.so.0 => not found

 

Install headless Chrome

Now comes the tricky part. Installing headless Chrome on an Amazon Linux EC2 instance is no simple task. One strategy is to install the various dependencies by compiling from source, but the chain of dependencies for Chrome, which includes gtk+ and glib, soon gets out of hand. I found another blogger who solved the problem by borrowing from the CentOS and Fedora package repositories. Thanks to Yuanyi for this part of the solution.

  1. Install yum packages to cover basic dependencies.
    sudo yum install -y libXcursor libXdamage libcups libXss libXrandr \
        cups-libs dbus-glib libXinerama cairo cairo-gobject pango
  2. Borrow packages from CentOS and Fedora.
    sudo rpm -ivh --nodeps http://mirror.centos.org/centos/7/os/x86_64/Packages/atk-2.22.0-3.el7.x86_64.rpm
    sudo rpm -ivh --nodeps http://mirror.centos.org/centos/7/os/x86_64/Packages/at-spi2-atk-2.22.0-2.el7.x86_64.rpm
    sudo rpm -ivh --nodeps http://mirror.centos.org/centos/7/os/x86_64/Packages/at-spi2-core-2.22.0-1.el7.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/g/GConf2-3.2.6-7.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/l/libXScrnSaver-1.2.2-6.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/l/libxkbcommon-0.3.1-1.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/l/libwayland-client-1.2.0-3.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/l/libwayland-cursor-1.2.0-3.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/20/Fedora/x86_64/os/Packages/g/gtk3-3.10.4-1.fc20.x86_64.rpm
    sudo rpm -ivh --nodeps http://dl.fedoraproject.org/pub/archive/fedora/linux/releases/16/Fedora/x86_64/os/Packages/gdk-pixbuf2-2.24.0-1.fc16.x86_64.rpm
  3. Edit src/karma.conf.js to require Puppeteer and set the CHROME_BIN environment variable. Here is the full content of that file after the changes.
    const puppeteer = require("puppeteer");
    process.env.CHROME_BIN = puppeteer.executablePath();

    module.exports = function (config) {
        config.set({
            basePath: '',
            frameworks: ['jasmine', '@angular-devkit/build-angular'],
            plugins: [
                require('karma-jasmine'),
                require('karma-chrome-launcher'),
                require('karma-jasmine-html-reporter'),
                require('karma-coverage-istanbul-reporter'),
                require('@angular-devkit/build-angular/plugins/karma')
            ],
            client: {
                clearContext: false // leave Jasmine Spec Runner output visible in browser
            },
            coverageIstanbulReporter: {
                reports: ['html', 'lcovonly'],
                fixWebpackSourcePaths: true
            },
            angularCli: {
                environment: 'dev'
            },
            reporters: ['progress', 'kjhtml'],
            port: 8080,
            colors: true,
            logLevel: config.LOG_INFO,
            autoWatch: true,
            browsers: ['ChromeHeadlessNoSandbox'],
            customLaunchers: {
                ChromeHeadlessNoSandbox: {
                    base: 'ChromeHeadless',
                    flags: ['--no-sandbox']
                }
            },
            singleRun: false
        });
    };
  4. Make a small adjustment to your test specification in src/app/app.component.spec.ts so that it is checking for the title in the test called "should render title in a h1 tag". Run ng test again.
    ng test --watch=false

If you see that green SUCCESS indicator, then you have done it! You installed Angular and created an application, installed Puppeteer, and by filling in the missing libraries for Chrome, you made it possible to run headless Chrome tests in Cloud9!

AWS CodeBuild

The next piece of the puzzle is your CI/CD pipeline. When a developer checks in new code, you want to test that code with a continuous integration tool like AWS CodeBuild. With CodeBuild, the problem related to headless Chrome is slightly different than it was with Cloud9, because the default build environment for Node apps is an Ubuntu image. You still need to install Chromium and its dependencies, but Ubuntu packages make it easier.

  1. Navigate to the CodeBuild console and create a new build project. Give it a name and configure the source repository. You will need to store your code for this exercise with one of the providers listed later so that CodeBuild knows where to find it when you start a build. Since you are already logged in to the AWS console, AWS CodeCommit is a good option, but you could also choose Amazon S3, Bitbucket, or GitHub.
  2. Configure the build environment. For Operating system, choose Ubuntu. For Runtime, choose Node.js. You can specify your own container image for the build, but the buildspec.yml described in step 3 works out of the box with the default image.
  3. For the build specification, provide the following buildspec.yml file in the root directory of the source code repository.
    
    version: 0.1
    phases:
      install:
        commands:
    
          # Install the Angular CLI
          - npm install -g @angular/cli
    
          # Install puppeteer as a dev dependency
          - npm i -D puppeteer
          - npm i -D @angular-devkit/build-angular
    
          # Print out missing libs
          - echo "Missing Libs" && ldd ./node_modules/puppeteer/.local-chromium/linux-549031/chrome-linux/chrome | grep not || true
    
          # Update the package lists
          - apt-get update

          # Upgrade installed packages
          - apt-get upgrade -y
    
          # Install apt-transport-https
          - apt-get install -y apt-transport-https
    
          # Use apt to install the Chrome dependencies
          - apt-get install -y libxcursor1
          - apt-get install -y libgtk-3-dev
          - apt-get install -y libxss1
          - apt-get install -y libasound2
          - apt-get install -y libnspr4
          - apt-get install -y libnss3
          - apt-get install -y libx11-xcb1
    
          # Print out missing libs
          - echo "Missing Libs" && ldd ./node_modules/puppeteer/.local-chromium/linux-549031/chrome-linux/chrome | grep not || true
    
          # Install project dependencies
          - npm install
    
      pre_build:
        commands:
          - echo "Nothing to pre_build"
    
      build:
        commands:
    
          - printenv 
    
          # Build the project
          - ng build
    
          # Run headless Chrome tests
          - ng test --watch=false
          - printenv
    
      post_build:
        commands:
    
          - printenv
    
          # Deploy the project to S3
    
          - if [ "${CODEBUILD_BUILD_SUCCEEDING}" = "1" ]; then aws s3 sync --delete dist/ "s3://${BUCKET_NAME}"; else echo "Skipping aws sync"; fi
    
    artifacts:
      files:
        - src/*
    
    

    Feel free to remove those ldd and printenv statements, but it is worth taking a look at the output to get a better understanding of what is going on with the build.

  4. Specify the location for artifacts. This step isn’t required, but it makes it possible to incorporate the build project into AWS CodePipeline.
  5. Expand Advanced Settings and configure an environment variable for the website bucket name.
  6. Configure the buckets. CodeBuild can’t write to your S3 buckets unless you give the service explicit permissions to do so. This is one of the most common causes of build failures for projects that involve S3. Choose Continue and then Save to create the build project. Next, navigate to the IAM console, search for the CodeBuild service role that was just created for you, and add the following policy as an inline policy.
    
    {
    	"Version": "2012-10-17",
    	"Statement": [
    		{
    			"Sid": "VisualEditor0",
    			"Effect": "Allow",
    			"Action": "s3:*",
    			"Resource": [
    				"arn:aws:s3:::YOUR_BUCKET_FOR_ARTIFACTS",
    				"arn:aws:s3:::YOUR_BUCKET_FOR_ARTIFACTS /*"
    			]
    		},
    		{
    			"Sid": "VisualEditor1",
    			"Effect": "Allow",
    			"Action": "s3:*",
    			"Resource": [
    				"arn:aws:s3:::YOUR_BUCKET_FOR_THE_WEBSITE",
    				"arn:aws:s3:::YOUR_BUCKET_FOR_THE_WEBSITE /*"
    			]
    		}
    	]
    }
    
    
  7. You should now be able to start the build and see that the compiled website has been copied to your S3 bucket after the build is complete.
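    If you prefer to start builds from the command line rather than the console, the AWS CLI equivalent is shown here; the project name is a placeholder:

    aws codebuild start-build --project-name my-angular-build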

 

Alternative Cloud9 installation using SSH and Ubuntu

You can run the Cloud9 IDE from a Linux machine that you create, rather than letting Cloud9 provision it for you. Create a Cloud9 environment and choose Connect and run in remote server. For more information about this type of setup, see Creating an SSH Environment in the AWS Cloud9 User Guide.

After you have configured the environment, the work you have to do is much simpler than on the Amazon Linux instance, because there are Ubuntu packages that install the required dependencies. Follow the instructions earlier in this post until you get to the “Install headless Chrome” section. Issue this command:

sudo apt install -y libxcursor1 libgtk-3-dev libxss1 libasound2 libnspr4 libnss3

You don’t need to borrow from any of the CentOS or Fedora repositories.

Make changes to karma.conf.js as described earlier and you should then be ready to test your application.

 

Summary

You are now able to run headless integration tests using Cloud9 by installing Puppeteer and filling in the required Chrome dependencies. You can also extend this to the container image used to test your application with CodeBuild. Automated testing is vital to a trustworthy DevOps pipeline, and Cloud9 opens up new possibilities for developers of all types, including front-end developers.

Happy coding! –EZB

[$] A filesystem “change journal” and other topics

Post Syndicated from jake original https://lwn.net/Articles/755277/rss

At the 2017 Linux Storage, Filesystem, and Memory-Management Summit (LSFMM), Amir Goldstein presented his work on adding a superblock watch mechanism to provide a scalable way to notify applications of changes in a filesystem. At the 2018 edition of LSFMM, he was back to discuss adding NTFS-like change journals to the kernel in support of backup solutions of various sorts. As a second topic for the session, he also wanted to discuss doing more performance-regression testing for filesystems.

Protecting coral reefs with Nemo-Pi, the underwater monitor

Post Syndicated from Janina Ander original https://www.raspberrypi.org/blog/coral-reefs-nemo-pi/

The German charity Save Nemo works to protect coral reefs, and they are developing Nemo-Pi, an underwater “weather station” that monitors ocean conditions. Right now, you can vote for Save Nemo in the Google.org Impact Challenge.

Nemo-Pi — Save Nemo

Save Nemo

The organisation says there are two major threats to coral reefs: divers and climate change. To make diving safer for reefs, Save Nemo installs buoy anchor points where diving tour boats can anchor without damaging corals in the process.

reef damaged by anchor
boat anchored at buoy

In addition, they provide dos and don’ts for how to behave on a reef dive.

The Nemo-Pi

To monitor the effects of climate change, and to help divers decide whether conditions are right at a reef while they’re still on shore, Save Nemo is also in the process of perfecting Nemo-Pi.

Nemo-Pi schematic — Nemo-Pi — Save Nemo

This Raspberry Pi-powered device is made up of a buoy, a solar panel, a GPS device, a Pi, and an array of sensors. Nemo-Pi measures water conditions such as current, visibility, temperature, carbon dioxide and nitrogen oxide concentrations, and pH. It also uploads its readings live to a public webserver.

Inside the Nemo-Pi device — Save Nemo

The Save Nemo team is currently doing long-term tests of Nemo-Pi off the coast of Thailand and Indonesia. They are also working on improving the device’s power consumption and durability, and testing prototypes with the Raspberry Pi Zero W.

web dashboard — Nemo-Pi — Save Nemo

The web dashboard showing live Nemo-Pi data

Long-term goals

Save Nemo aims to install a network of Nemo-Pis at shallow reefs (up to 60 metres deep) in South East Asia. Then diving tour companies can check the live data online and decide day-to-day whether tours are feasible. This will lower the impact of humans on reefs and help the local flora and fauna survive.

Coral reefs with fishes

A healthy coral reef

Nemo-Pi data may also be useful for groups lobbying for reef conservation, and for scientists and activists who want to shine a spotlight on the awful effects of climate change on sea life, such as coral bleaching caused by rising water temperatures.

Bleached coral

A bleached coral reef

Vote now for Save Nemo

If you want to help Save Nemo in their mission today, vote for them to win the Google.org Impact Challenge:

  1. Head to the voting web page
  2. Click “Abstimmen” in the footer of the page to vote
  3. Click “JA” in the footer to confirm

Voting is open until 6 June. You can also follow Save Nemo on Facebook or Twitter. We think this organisation is doing valuable work, and that their projects could be expanded to reefs across the globe. It’s fantastic to see the Raspberry Pi being used to help protect ocean life.

The post Protecting coral reefs with Nemo-Pi, the underwater monitor appeared first on Raspberry Pi.

Measuring the throughput for Amazon MQ using the JMS Benchmark

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/measuring-the-throughput-for-amazon-mq-using-the-jms-benchmark/

This post is courtesy of Alan Protasio, Software Development Engineer, Amazon Web Services

Just like compute and storage, messaging is a fundamental building block of enterprise applications. Message brokers (aka “message-oriented middleware”) enable different software systems, often written in different languages, on different platforms, running in different locations, to communicate and exchange information. Mission-critical applications, such as CRM and ERP, rely on message brokers to work.

A common performance consideration for customers deploying a message broker in a production environment is the throughput of the system, measured as messages per second. This is important to know so that application environments (hosts, threads, memory, etc.) can be configured correctly.

In this post, we demonstrate how to measure the throughput for Amazon MQ, a new managed message broker service for ActiveMQ, using JMS Benchmark. It should take 15–20 minutes to set up the environment and an hour to run the benchmark. We also provide some tips on how to configure Amazon MQ for optimal throughput.

Benchmarking throughput for Amazon MQ

ActiveMQ can be used for a number of use cases. These use cases range from simple fire-and-forget tasks (that is, asynchronous processing) and low-latency request-reply patterns to buffering requests before they are persisted to a database.

The throughput of Amazon MQ is largely dependent on the use case. For example, if you have non-critical workloads such as gathering click events for a non-business-critical portal, you can use ActiveMQ in a non-persistent mode and get extremely high throughput with Amazon MQ.

On the flip side, if you have a critical workload where durability is extremely important (meaning that you can’t lose a message), then you are bound by the I/O capacity of your underlying persistence store. We recommend using mq.m4.large for the best results. The mq.t2.micro instance type is intended for product evaluation. Performance is limited, due to the lower memory and burstable CPU performance.

Tip: To improve your throughput with Amazon MQ, make sure that you have consumers processing messages as fast as (or faster than) your producers are pushing them.

Because it’s impossible to talk about how the broker (ActiveMQ) behaves for each and every use case, we walk through how to set up your own benchmark for Amazon MQ using our favorite open-source benchmarking tool: JMS Benchmark. We are fans of the JMS Benchmark suite because it’s easy to set up and deploy, and comes with a built-in visualizer of the results.

Non-Persistent Scenarios – Queue latency as you scale producer throughput

JMS Benchmark nonpersistent scenarios

Getting started

At the time of publication, you can create an mq.m4.large single-instance broker for testing for $0.30 per hour (US pricing).

This walkthrough covers the following tasks:

  1. Create and configure the broker.
  2. Create an EC2 instance to run your benchmark.
  3. Configure the security groups.
  4. Run the benchmark.

Step 1 – Create and configure the broker
Create and configure the broker using Tutorial: Creating and Configuring an Amazon MQ Broker.

Step 2 – Create an EC2 instance to run your benchmark
Launch the EC2 instance using Step 1: Launch an Instance. We recommend choosing the m5.large instance type.

Step 3 – Configure the security groups
Make sure that all the security groups are correctly configured to let the traffic flow between the EC2 instance and your broker.

  1. Sign in to the Amazon MQ console.
  2. From the broker list, choose the name of your broker (for example, MyBroker)
  3. In the Details section, under Security and network, choose the name of your security group, or choose the expand icon.
  4. From the security group list, choose your security group.
  5. At the bottom of the page, choose Inbound, Edit.
  6. In the Edit inbound rules dialog box, add a rule to allow traffic between your instance and the broker:
    • Choose Add Rule.
    • For Type, choose Custom TCP.
    • For Port Range, type the ActiveMQ SSL port (61617).
    • For Source, leave Custom selected and then type the security group of your EC2 instance.
    • Choose Save.

Your broker can now accept the connection from your EC2 instance.
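
If you prefer to script this step, the same inbound rule can be added with the AWS CLI. This is a sketch; the two security group IDs are placeholders for your broker’s and your EC2 instance’s security groups:

aws ec2 authorize-security-group-ingress \
  --group-id {broker-security-group-id} \
  --protocol tcp \
  --port 61617 \
  --source-group {ec2-instance-security-group-id}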

Step 4 – Run the benchmark
Connect to your EC2 instance using SSH and run the following commands:

$ cd ~
$ curl -L https://github.com/alanprot/jms-benchmark/archive/master.zip -o master.zip
$ unzip master.zip
$ cd jms-benchmark-master
$ chmod a+x bin/*
$ env \
  SERVER_SETUP=false \
  SERVER_ADDRESS={activemq-endpoint} \
  ACTIVEMQ_TRANSPORT=ssl \
  ACTIVEMQ_PORT=61617 \
  ACTIVEMQ_USERNAME={activemq-user} \
  ACTIVEMQ_PASSWORD={activemq-password} \
  ./bin/benchmark-activemq

After the benchmark finishes, you can find the results in the ~/reports directory. As you may notice, the performance of ActiveMQ varies based on the number of consumers, producers, destinations, and message size.

Amazon MQ architecture

The last bit that’s important to know so that you can better understand the results of the benchmark is how Amazon MQ is architected.

Amazon MQ is architected to be highly available (HA) and durable. For HA, we recommend using the multi-AZ option. After a message is sent to Amazon MQ in persistent mode, the message is written to the highly durable message store that replicates the data across multiple nodes in multiple Availability Zones. Because of this replication, for some use cases you may see a reduction in throughput as you migrate to Amazon MQ. Customers have told us they appreciate the benefits of message replication as it helps protect durability even in the face of the loss of an Availability Zone.

Conclusion

We hope this gives you an idea of how Amazon MQ performs. We encourage you to run tests to simulate your own use cases.

To learn more, see the Amazon MQ website. You can try Amazon MQ for free with the AWS Free Tier, which includes up to 750 hours of a single-instance mq.t2.micro broker and up to 1 GB of storage per month for one year.

Practice Makes Perfect: Testing Campaigns Before You Send Them

Post Syndicated from Zach Barbitta original https://aws.amazon.com/blogs/messaging-and-targeting/practice-makes-perfect-testing-campaigns-before-you-send-them/

In an article we posted to Medium in February, we talked about how to determine the best time to engage your customers by using Amazon Pinpoint’s built-in session heat map. The session heat map allows you to find the times that your customers are most likely to use your app. In this post, we continue that discussion of best practices and look at how to appropriately test a campaign before it goes live.

In this post, we’ll talk about the old adage “practice makes perfect,” and how it applies to the campaigns you send using Amazon Pinpoint. Let’s take a scenario many of our customers encounter daily: creating a campaign to engage users by sending a push notification.

As you can see from the preceding screenshot, the segment we plan to target has nearly 1.7M recipients, which is a lot! Of course, before we got to this step, we had already applied several best practices. For example, we determined the best time to engage our audience, scheduled the message based on recipients’ local time zones, performed A/B/N testing, measured lift using a hold-out group, and personalized the content for maximum effectiveness. Now that we’re ready to send the notification, we should test the message before we send it to all of the recipients in our segment. The reason for testing the message is pretty straightforward: we want to make sure every detail of the message is accurate before we send it to all 1,687,575 customers.

Fortunately, Amazon Pinpoint makes it easy to test your messages—in fact, you don’t even have to leave the campaign wizard in order to do so. In step 3 of the campaign wizard, below the message editor, there’s a button labelled Test campaign.

When you choose the Test campaign button, you have three options: you can send the test message to a segment of 100 endpoints or less, or to a set of specific endpoint IDs (up to 10), or to a set of specific device tokens (up to 10), as shown in the following image.

In our case, we’ve already created a segment of internal recipients who will test our message. On the Test Campaign window, under Send a test message to, we choose A segment. Then, in the drop-down menu, we select our test segment, and then choose Send test message.

Because we’re sending the test message to a segment, Amazon Pinpoint automatically creates a new campaign dedicated to this test. This process executes a test campaign, complete with message analytics, which allows you to perform end-to-end testing as if you sent the message to your production audience. To see the analytics for your test campaign, go to the Campaigns tab, and then choose the campaign (the name of the test campaign contains the word “test”, followed by four random characters, followed by the name of the original campaign).

After you complete a successful test, you’re ready to launch your campaign. As a final check, the Review & Launch screen includes a reminder that indicates whether or not you’ve tested the campaign, as shown in the following image.

There are several other ways you can use this feature. For example, you could use it for troubleshooting a campaign, or for iterating on existing campaigns. To learn more about testing campaigns, see the Amazon Pinpoint User Guide.