Tag Archives: Compute

Building an electronic security lock using serverless

Post Syndicated from Moheeb Zara original https://aws.amazon.com/blogs/compute/building-an-electronic-security-lock-using-serverless/

In this guide I show how to build an electronic security lock for package delivery, securing physical documents, or granting access to a secret lab. This project uses AWS Serverless to create a touchscreen keypad lock that uses SMS to alert a recipient with a custom message and unlock code. Files are included for the lockbox shown, but the system can be installed in anything with a door.

CircuitPython is a lightweight version of Python that works on embedded hardware. It runs on an Adafruit PyPortal open-source IoT touch display. A relay wired to the PyPortal acts as an electronic switch to bridge power to an electronic solenoid lock.

I deploy the backend to the AWS Cloud using the AWS Serverless Application Repository. The code on the PyPortal makes REST calls to the backend to send a random four-digit code as a text message using Amazon Pinpoint. It also stores the lock state in AWS Systems Manager Parameter Store, a secure service for storing and retrieving sensitive information.

Prerequisites

You need the following to complete the project:

Deploy the backend application

An architecture diagram of the serverless backend.

The serverless backend consists of three Amazon API Gateway endpoints that invoke AWS Lambda functions. At boot, the PyPortal calls the FetchState function to fetch the lock state from Parameter Store in AWS Systems Manager. For example, if the returned state is:

{ "locked": true, "code": "1234" }

the PyPortal leaves the relay open so that the solenoid lock remains locked. Once the matching "1234" code is entered, the relay circuit closes and the solenoid lock opens. When unlocked, the PyPortal calls the UpdateState function to update the state to:

{ "locked": false, "code": "" }

In an unlocked state, the PyPortal requests a ten-digit phone number to be entered in order to lock. The SendCode function is called with the phone number so that it can generate a random four-digit code. A message is then sent to the recipient using Amazon Pinpoint, and the Parameter Store state is updated to "locked". The state is returned in the response, and the PyPortal opens the relay again and stores the unlock code locally.

Before deploying the backend, create an Amazon Pinpoint Project and request a long code. A long code is a dedicated phone number required for sending SMS.

  1. Navigate to the Amazon Pinpoint console.
  2. Ensure that you are in a Region where Amazon Pinpoint is supported. For the most up-to-date list, see AWS Service Endpoints.
  3. Choose Create Project.
  4. Name your project and choose Create.
  5. Choose Configure under SMS and Voice.
  6. Select Enable the SMS channel for this project and choose Save changes.

  7. Under Settings, SMS and Voice choose Request long codes.
  8. Enter the target country and select Transactional for Default call type. Choose Request long codes. This incurs a monthly cost of one dollar and can be canceled anytime. For a breakdown of costs, check out current pricing.
  9. Under Settings, General settings make a note of the Project ID.

I use the AWS Serverless Application Model (AWS SAM) to create the backend template. While it can be deployed using the AWS SAM CLI, you can also deploy from the AWS Management Console:

  1. Navigate to the aws-serverless-pyportal-lock application in the AWS Serverless Application Repository.
  2. Under Application settings, fill the parameters PinpointApplicationID and LockboxCustomMessage.
  3. Choose Deploy.
  4. Once complete, choose View CloudFormation Stack.
  5. Select the Outputs tab and make a note of the LockboxBaseApiUrl. This is required for configuring the PyPortal.
  6. Navigate to the URL listed as LockboxApiKey in the Outputs tab.
  7. Choose Show to reveal the API key. Make a note of this. This is required for authenticating requests from the PyPortal to the backend.
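Before setting up the PyPortal, you can optionally verify the backend from your workstation. The following is a minimal sketch using the Python requests library; the URL and key values are placeholders for the LockboxBaseApiUrl and LockboxApiKey noted in the previous steps:

import requests

BASE_API = "https://example.execute-api.us-east-1.amazonaws.com/Prod"  # placeholder: LockboxBaseApiUrl
API_KEY = "your-api-key"                                               # placeholder: LockboxApiKey

# The PyPortal calls GET /state on boot; the same request from a workstation
# should return the current lock state, for example {"locked": true, "code": "1234"}.
response = requests.get(f"{BASE_API}/state", headers={"x-api-key": API_KEY}, timeout=30)
print(response.status_code, response.json())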

PyPortal setup

The following instructions walk through installing the latest version of the Adafruit CircuitPython libraries and firmware.

  1. Follow these instructions from Adafruit to install the latest version of the CircuitPython bootloader. At the time of writing, the latest version is 5.3.0.
  2. Follow these instructions to install the latest Adafruit CircuitPython library bundle. I use bundle version 5.x.
  3. Optionally install the Mu Editor, a multi-platform code editor and serial debugger compatible with Adafruit CircuitPython boards. This can help with troubleshooting issues.

Wiring

Electronic solenoid locks come in varying shapes, sizes, and voltages. Choose one that works for your needs and wire it according to the following instructions for the PyPortal.

  1. Gather the PyPortal, a solenoid lock, relay module, JST connectors, jumper wire, and a power source that matches the solenoid being used. For this project, a six-volt solenoid is used with a four AA battery holder.
  2. Wire the system following this diagram.
  3. Splice female jumper wires to the exposed leads of a JST connector to connect the relay module.
  4. Insert the JST connector end to the port labeled D4 on the PyPortal.
  5. Power the PyPortal using USB or by feeding a five-volt supply to the port labeled D3.

Code PyPortal

As with regular Python, CircuitPython does not need to be compiled to execute. You can flash new firmware on the PyPortal by copying a Python file and necessary assets to a mounted volume. The bootloader runs code.py anytime the device starts or any files are updated.

  1. Use a USB cable to plug the PyPortal into your computer and wait until a new mounted volume CIRCUITPY is available.
  2. Download the project from GitHub. Inside the project, copy the contents of /circuit-python on to the CIRCUITPY volume.
  3. Inside the volume, open and edit the secrets.py file. Include your Wi-Fi credentials along with the LockboxApiKey and LockboxBaseApiURL API Gateway endpoint. These can be found under Outputs in the AWS CloudFormation stack created by the AWS Serverless Application Repository.
  4. Save the file, and the device restarts. It takes a moment to connect to Wi-Fi and make the first request to the FetchState function.
  5. Test that the system works by entering a phone number when prompted. An SMS message with the unlock code is sent to the provided number.
  6. Mount the system to the desired door or container, such as a 3D printed safe (files included in the GitHub project).

    Optionally, if you installed the Mu Editor, you can choose "Serial" to follow along with the device log.

 

Understanding the code

See circuit-python/code.py from the GitHub project; this is the main code for the PyPortal. When the PyPortal connects to Wi-Fi, the first thing it does is make a GET request to the API Gateway endpoint for the FetchState function.

def getState():
    endpoint = secrets['base-api'] + "/state"
    headers = {"x-api-key": secrets['x-api-key']}
    response = wifi.get(endpoint, headers=headers, timeout=30)
    handleState(response.json())
    response.close()

The FetchState Lambda function code, written in Python, gets the state from the Parameter Store and returns it in the response to the PyPortal.

import os
import json
import boto3

client = boto3.client('ssm')
parameterName = os.environ.get('PARAMETER_NAME')

def lambda_handler(event, context):
    response = client.get_parameter(
        Name=parameterName,
        WithDecryption=False
    )

    state = json.loads(response['Parameter']['Value'])

    return {
        "statusCode": 200,
        "body": json.dumps(state)
    }

The getState function in the CircuitPython code passes the returned state to the handleState function, which determines whether to physically lock or unlock the device.

def handleState(newState):
    print(state)
    state['code'] = newState['code']
    state['locked'] = newState['locked']
    print(state)
    if state['locked'] == True:
        lock()
    if state['locked'] == False:
        unlock()

When the device is unlocked, and a phone number is entered to lock the device, the CircuitPython command function is called.

def command(action, num):
    if action == "unlock":
        if num == state["code"]:
            unlock()
        else:
            number_label.text = "Wrong code!"
            playBeep()
    if action == "lock":
        if validate(num) == True:
            data = sendCode(num)
            handleState(data)

The CircuitPython sendCode function makes a POST request with the entered phone number to the API Gateway endpoint for the SendCode Lambda function.

def sendCode(num):
    endpoint = secrets['base-api'] + "/lock"
    headers = {"x-api-key": secrets['x-api-key']}
    data = { "number": num }
    response = wifi.post(endpoint, json=data, headers=headers, timeout=30)
    data = response.json()
    print("Code received: ", data)
    response.close()
    return data

This Lambda function generates a random four-digit number and adds it to the custom message stored as an environment variable. It then sends a text message to the provided phone number using Amazon Pinpoint, and saves the new state in the Parameter Store. The new state is returned in the response and is used by the handleState function in the CircuitPython code.

import os
import json
import boto3
import random

pinpoint = boto3.client('pinpoint')
ssm = boto3.client('ssm')

applicationId = os.environ.get('APPLICATION_ID')
parameterName = os.environ.get('PARAMETER_NAME')
message = os.environ.get('MESSAGE')

def lambda_handler(event, context):
    print(event)
    body = json.loads(event['body'])

    number = "+1" + str(body['number'])
    code = str(random.randint(1111,9999))

    addresses = {}
    addresses[number] = {'ChannelType': 'SMS'}
    pinpoint.send_messages(
        ApplicationId=applicationId,
        MessageRequest={
            'Addresses': addresses,
            'MessageConfiguration': {
                'SMSMessage': {
                    'Body': message + code,
                    'MessageType': 'TRANSACTIONAL'
                }
            }
        }
    )

    state = { "locked": True, "code": code }

    response = ssm.put_parameter(
        Name=parameterName,
        Value=json.dumps(state),
        Type='String',
        Overwrite=True
    )

    return {
        "statusCode": 200,
        "body": json.dumps(state)
    }

Entering the correct unlock code from the SMS message calls the unlock function. The unlock function closes the relay circuit to open the solenoid lock. It plays a beep sound and then calls the updateState function, which makes a POST request to the API Gateway endpoint for the UpdateState Lambda function.

def updateState(newState):
    endpoint = secrets['base-api'] + "/state"
    headers = {"x-api-key": secrets['x-api-key']}
    response = wifi.post(endpoint, json=newState, headers=headers, timeout=30)
    data = response.json()
    print("Updated state to: ", data)
    response.close()
    return data

def unlock():
    print("Unlocked!")
    number_label.text = "Enter Phone# to Lock"
    time.sleep(1)
    btn = find_button("Unlock")
    if btn is not None:
        btn.selected = True
        btn.label = "Lock"
    lock_relay.value = True
    playBeep()
    updateState({"locked": False, "code": ""})

The UpdateState Lambda function updates the Parameter Store whenever the state is changed. When the PyPortal loses power or restarts, the last known state is fetched, preventing a false locked or unlocked position.

import os
import json
import boto3

client = boto3.client('ssm')
parameterName = os.environ.get('PARAMETER_NAME')

def lambda_handler(event, context):
    state = json.loads(event['body'])

    response = client.put_parameter(
        Name=parameterName,
        Value=json.dumps(state),
        Type='String',
        Overwrite=True
    )

    return {
        "statusCode": 200,
        "body": json.dumps(state)
    }

Conclusion

I show how to build an electronic keypad lock system using a basic relay circuit and a microcontroller. The system is managed by a serverless backend API deployed using the AWS Serverless Application Repository. The backend uses API Gateway to provide a REST API for Lambda functions that handle fetching lock state, updating lock state, and sending a random four-digit code via SMS using Amazon Pinpoint. Language consistency is achieved by using CircuitPython on the PyPortal and Python 3.8 in the Lambda function code.

Use this project as a template to build out any solution that requires secure physical access control. It can be embedded in cabinet drawers to protect documents or can be used with a door solenoid to control room access. Try combining it with a serverless geohashing app to develop a treasure hunting experience. Explore how to further modify the serverless application in the GitHub project by learning about the AWS Serverless Application Model. Read my previous guide to learn how you can add voice to a CircuitPython project on a PyPortal.

 

Using WebSockets and Load Balancers Part two

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/using-websockets-and-load-balancers-part-two/

This post was written by Robert Zhu, Principal Developer Advocate at AWS. 

This article continues a blog I posted earlier about using Load Balancers on Amazon Lightsail. In this article, I demonstrate a few common challenges and solutions when combining stateful applications with load balancers. I start with a simple WebSocket application in Amazon Lightsail that counts the number of seconds the client has been connected. Then, I add a Lightsail Load Balancer, and show you how the application performs routing and retries. Let’s get started.

WebSockets

WebSockets are persistent, duplex sockets that enable bi-directional communication between a client and server. Applications often use WebSockets to provide real-time functionality such as chat and gaming. Let’s start with some sample code for a simple WebSocket server:

const WebSocket = require("ws");
const name = require("./randomName");
const server = require("http").createServer();
const express = require("express");
const app = express();

console.log(`This server is named: ${name}`);

// serve files from the public directory
server.on("request", app.use(express.static("public")));

// tell the WebSocket server to use the same HTTP server
const wss = new WebSocket.Server({
  server,
});

wss.on("connection", function connection(ws, req) {
  const clientId = req.url.replace("/?id=", "");
  console.log(`Client connected with ID: ${clientId}`);

  let n = 0;
  const interval = setInterval(() => {
    ws.send(`${name}: you have been connected for ${n++} seconds`);
  }, 1000);

  ws.on("close", () => {
    clearInterval(interval);
  });
});

const port = process.env.PORT || 80;
server.listen(port, () => {
  console.log(`Server listening on port ${port}`);
});

We serve static files from the public directory and WebSocket connection requests on the same port. An incoming HTTP request from a browser loads public/index.html, and a WebSocket connection initiated from the client triggers the wss.on("connection", ...) code. Upon receiving a WebSocket connection, I set up a recurring callback where I tell the client how long it has been connected. Now, let's take a look at the client code:

buttonConnect.onclick = async () => {
  const serverAddress = inputServerAddress.value;
  messages.innerHTML = "";
  instructions.parentElement.removeChild(instructions);

  appendMessage(`Connecting to ${serverAddress}`);

  try {
    let retries = 0;
    while (retries < 50) {
      appendMessage(`establishing connection... retry #${retries}`);
      await runSession(serverAddress);
      await sleep(1500);
      retries++;
    }

    appendMessage("Reached maximum retries, giving up.");
  } catch (e) {
    appendMessage(e.message || e);
  }
};

async function runSession(address) {
  const ws = new WebSocket(address);

  ws.addEventListener("open", () => {
    appendMessage("connected to server");
  });

  ws.addEventListener("message", ({ data }) => {
    console.log(data);
    appendMessage(data);
  });

  return new Promise((resolve) => {
    ws.addEventListener("close", () => {
      appendMessage("Connection lost with server.");
      resolve();
    });
  });
}

I use the WebSocket DOM API to connect to the server. Once connected, I append any received messages to the console and to the screen via the custom appendMessage function. If the client loses connectivity, it tries to reconnect up to 50 times. Let's run it:

"slimy-cardinal" is a randomly generated server name

Now, suppose I am running a very demanding real-time application, and need to scale the server capacity beyond a single host. How would I do this? I create two Ubuntu 18.04 instances. Once the instances are up, I SSH to each one, and run the following commands:

sudo apt-get update
sudo apt-get -y install nodejs npm
git clone https://github.com/robzhu/ws-time
cd ws-time && npm install
node server.js

During installation, select Yes when presented with the prompt:

Select "Yes" when prompted to install libssl for npm

Keep these SSH sessions open; you need them shortly. Next, create the Load Balancer in Amazon Lightsail and attach the instances:

screenshot of target instances for load balancers

Note: the Lightsail load balancer only works for port 80, which is part of the reason I use the same port for HTTP and WebSocket requests.

Copy the DNS name of the load balancer, open it in a new browser tab, and paste it into the WebSocket server address field using the format:

ws://<DNSName>

screenshot of what the correct server address should look like

Make sure the server address does not accidentally start with “ws://http://…”

Next, locate the SSH session that accepted the connection. It looks like this:

accepted ssh section

The server logs the client ID when it receives a connection.

If you kill this process, the client disconnects and runs its retry logic, hopefully causing the load balancer to route the client to a healthy node. Next, hit connect from the client. After a few seconds, kill the process on the server, and you should see the client reconnect to a healthy instance:

what you should see when you connect to a healthy instance

The client retry was routed to a healthy instance on the first attempt. This is due to the round-robin algorithm that the Lightsail load balancer uses. In production, you should not expect the load balancer to detect an unhealthy node immediately. If the load balancer continues to route incoming connections to an unhealthy node, the client needs more retry attempts before reconnecting. In a large-scale system, you will want to implement exponential backoff on the retry intervals to avoid overwhelming other nodes in the cluster (the thundering herd problem).
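The client above retries on a fixed 1.5-second interval. The following is a minimal sketch of exponential backoff with jitter, written in Python for illustration; the same pattern applies to the JavaScript client:

import random
import time

def backoff_delay(attempt, base=1.5, cap=60.0):
    # Grow the delay exponentially with the attempt number, cap it, and
    # randomize it ("full jitter") so clients do not retry in lockstep.
    return random.uniform(0, min(cap, base * (2 ** attempt)))

# Example: sleep before each of the first five reconnection attempts.
for attempt in range(5):
    delay = backoff_delay(attempt)
    print(f"attempt {attempt}: waiting {delay:.1f}s before reconnecting")
    time.sleep(delay)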

Notice that the message “you have been connected for X seconds” reset X to 0 after the client reconnected. What if you want to make the failover transparent to the user? The problem is that the connection duration (X) is stored in the NodeJS process that we killed. That state is lost if the process dies or if the host goes down. The solution is unsurprising: move the state off the WebSocket server and into a distributed cache, such as redis.

Deep health checks

When you attached your instances to the load balancer, your health checks passed because the Lightsail load balancer issues an HTTP request for the default path (where you serve index.html). However, if you expect most of your server load to come from I/O on the WebSocket connections, the ability to serve the index.html file is not a good health check. You might implement a better health check like so:

app.get('/healthcheck', (req, res) => {
  // Report healthy only while average network usage stays below 60%.
  const serverHasCapacity = getAverageNetworkUsage() < 0.6;
  if (serverHasCapacity) return res.status(200).send("ok");
  res.status(400).send("server is overloaded");
});

This causes the load balancer to consider a node "unhealthy" when the target node's network usage reaches the threshold value. In response, the load balancer stops routing new incoming connections to that node. However, note that the load balancer does not terminate existing connections to an over-subscribed node.

When working with persistent connections or sticky sessions, always leave some capacity buffer. For example, do not mark the server as unhealthy only when it reaches 100% capacity. This is because existing connections or sticky clients will continue to generate traffic for that node, and some workloads may increase server usage beyond its threshold (e.g. a chat room that suddenly gets very busy).

 

Conclusion

I hope this post has given you a clear idea of how to use load balancers to improve scalability for stateful applications, and how to implement such a solution using Amazon Lightsail instances and load balancers. Please feel free to leave comments, and try this solution for yourself.

 

Adding voice to a CircuitPython project using Amazon Polly

Post Syndicated from Moheeb Zara original https://aws.amazon.com/blogs/compute/adding-voice-to-a-circuitpython-project-using-amazon-polly/

An Adafruit PyPortal displaying a quote while synthesizing and playing speech using Amazon Polly.

As a natural means of communication, voice is a powerful way to humanize an experience. What if you could make anything talk? This guide walks through how to leverage the cloud to add voice to an off-the-shelf microcontroller. Use it to develop more advanced ideas, like a talking toaster that encourages healthy breakfast habits or a house plant that can express its needs.

This project uses an Adafruit PyPortal, an open-source IoT touch display programmed using CircuitPython, a lightweight version of Python that works on embedded hardware. You copy your code to the PyPortal like you would to a thumb drive and it runs. Random quotes from the PaperQuotes API are periodically displayed on the PyPortal LCD.

A microcontroller can’t do speech synthesis on its own so I use Amazon Polly, a natural text to speech synthesis service, to generate audio. Adding speech also extends accessibility to the visually impaired. This project includes an example for requesting arbitrary speech in addition to random quotes. Use this example to add a voice to any CircuitPython project.

An Adafruit PyPortal, an external speaker, and a microSD card.

I deploy the backend to the AWS Cloud using the AWS Serverless Application Repository. The code on the PyPortal makes a REST call to the backend to fetch a quote and synthesize speech audio for playback on the device.

Prerequisites

You need the following to complete the project:

Deploy the backend application

An architecture diagram of the serverless backend when requesting speech synthesis of a text string.

The serverless backend consists of an Amazon API Gateway endpoint that invokes an AWS Lambda function. If called with a JSON object containing text and voiceId attributes, it uses Amazon Polly to synthesize speech and uploads an MP3 file as a public object to Amazon S3. Upon completion, it returns the URL for downloading the audio file. It also processes the submitted text and adds line breaks so that the text appears wrapped when displayed on the PyPortal. For a full list of voices, see the Amazon Polly documentation.
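An example response contains the download URL and the wrapped text; the field names match what the PyPortal code reads later (url and text), and the values here are illustrative placeholders:

{
  "url": "https://<bucket-name>.s3.amazonaws.com/<file-id>.mp3",
  "text": "An example quote,\nwrapped for the PyPortal display."
}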

To fetch quotes instead of passing a text attribute, call the endpoint with a comma-separated list of tags as shown in the following diagram. The Lambda function then calls the PaperQuotes API. It fetches up to 50 quotes per tag and selects a random one to synthesize as speech. As with arbitrary text, it returns a URL and a text-wrapped representation of the quote.

An architecture diagram of the serverless backend when requesting a random quote from the PaperQuotes API to synthesize as speech.

I use the AWS Serverless Application Model (AWS SAM) to create the backend template. While it can be deployed using the AWS SAM CLI, you can also deploy from the AWS Management Console:

  1. Generate a free PaperQuotes API key at paperquotes.com. The serverless backend requires this to fetch quotes.
  2. Navigate to the aws-serverless-pyportal-polly application in the AWS Serverless Application Repository.
  3. Under Application settings, enter the parameter, PaperQuotesAPIKey.
  4. Choose Deploy.
  5. Once complete, choose View CloudFormation Stack.
  6. Select the Outputs tab and make a note of the SpeechApiUrl. This is required for configuring the PyPortal.
  7. Click the link listed for SpeechApiKey in the Outputs tab.
  8. Click Show to reveal the API key. Make a note of this. This is required for authenticating requests from the PyPortal to the SpeechApiUrl.

PyPortal setup

The following instructions walk through installing the latest version of the Adafruit CircuitPython libraries and firmware. They also show how to enable an external speaker module.

  1. Follow these instructions from Adafruit to install the latest version of the CircuitPython bootloader. At the time of writing, the latest version is 5.3.0.
  2. Follow these instructions to install the latest Adafruit CircuitPython library bundle. I use bundle version 5.x.
  3. Insert the microSD card in the slot located on the back of the device.
  4. Cut the jumper pad on the back of the device labeled A0. This enables you to use an external speaker instead of the built-in speaker.
  5. Plug the external speaker connector into the port labeled SPEAKER on the back of the device.
  6. Optionally install the Mu Editor, a multi-platform code editor and serial debugger compatible with Adafruit CircuitPython boards. This can help with troubleshooting issues.
  7. Optionally, if you have a 3D printer at home, you can print a case for your PyPortal. This can protect and showcase your project.

Code PyPortal

As with regular Python, CircuitPython does not need to be compiled to execute. You can flash new firmware on the PyPortal by copying a Python file and necessary assets to a mounted volume. The bootloader runs code.py anytime the device starts or any files are updated.

  1. Use a USB cable to plug the PyPortal into your computer and wait until a new mounted volume CIRCUITPY is available.
  2. Download the project from GitHub. Inside the project, copy the contents of /circuit-python on to the CIRCUITPY volume.
  3. Inside the volume, open and edit the secrets.py file. Include your Wi-Fi credentials along with the SpeechApiKey and SpeechApiUrl API Gateway endpoint. These can be found under Outputs in the AWS CloudFormation stack created by the AWS Serverless Application Repository.
  4. Save the file, and the device restarts. It takes a moment to connect to Wi-Fi and make the first request.
    Optionally, if you installed the Mu Editor, you can click on "Serial" to follow along with the device log.

The PyPortal takes a few moments to connect to the Wi-Fi network and make its first request. On success, you hear it greet you and describe itself. By default, it then displays and reads a new quote every five minutes.

Understanding the CircuitPython code

See the bottom of circuit-python/code.py from the GitHub project. When the PyPortal connects to Wi-Fi, the first thing it does is synthesize an arbitrary “hello world” text for display. It then begins periodically displaying and “speaking” quotes.

# Connect to WiFi
print("Connecting to WiFi...")
wifi.connect()
print("Connected!")

displayQuote("Ready!")

speakText('Hello world! I am an Adafruit PyPortal running Circuit Python speaking to you using AWS Serverless', 'Joanna')

while True:
    speakQuote('equality, humanity', 'Joanna')
    time.sleep(60*secrets['interval'])

Both the speakText and speakQuote functions call the synthesizeSpeech function. The difference is whether text or tags are passed to the API.

def speakText(text, voice):
    data = { "text": text, "voiceId": voice }
    synthesizeSpeech(data)

def speakQuote(tags, voice):
    data = { "tags": tags, "voiceId": voice }
    synthesizeSpeech(data)

The synthesizeSpeech function posts the data to the API Gateway endpoint. It then invokes the Lambda function and returns the MP3 URL and the formatted text. The downloadfile function is called to fetch the MP3 file and store it on the SD card. displayQuote is called to display the quote on the LCD. Finally, playMP3 opens the file and plays the speech audio using the built-in or external speaker.

def synthesizeSpeech(data):
    response = postToAPI(secrets['endpoint'], data)
    downloadfile(response['url'], '/sd/cache.mp3')
    displayQuote(response['text'])
    playMP3("/sd/cache.mp3")

Modifying the Lambda function

The serverless application includes a Lambda function, SynthesizeSpeechFunction, which can be modified directly in the Lambda console. The AWS SAM template used to deploy the AWS Serverless Application Repository application adds policies for accessing the S3 bucket where audio is stored and grants access to Amazon Polly for synthesizing speech. It also adds the PaperQuotes API token as an environment variable and sets API Gateway as an event source.

SynthesizeSpeechFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: lambda_functions/SynthesizeSpeech/
      Handler: app.lambda_handler
      Runtime: python3.8
      Policies:
        - S3FullAccessPolicy:
            BucketName: !Sub "${AWS::StackName}-audio"
        - Version: '2012-10-17'
          Statement:
            - Effect: Allow
              Action:
                - polly:*
              Resource: '*'
      Environment:
        Variables:
          BUCKET_NAME: !Sub "${AWS::StackName}-audio"
          PAPER_QUOTES_TOKEN: !Ref PaperQuotesAPIKey
      Events:
        Speech:
          Type: Api
          Properties:
            RestApiId: !Ref SpeechApi
            Path: /speech
            Method: post

To edit the Lambda function, navigate back to the CloudFormation stack and click on the SynthesizeSpeechFunction under the Resources tab.

From here, you can edit the Lambda function code directly. Clicking Save deploys the new code.

The getQuotes function is called to fetch quotes from the PaperQuotes API. You can change this to call from a different source, such as a custom selection of quotes. Try modifying it to fetch social media posts or study questions.
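For example, here is a minimal sketch of a replacement that serves quotes from a hard-coded list instead of calling the PaperQuotes API; the function name follows the text above, while the signature and quotes are placeholders to adapt to the deployed code:

import random

# Hypothetical replacement: return a quote from a local list instead of the
# PaperQuotes API. The tags argument is accepted for compatibility but ignored.
QUOTES = [
    "Stay curious.",
    "Small steps add up.",
    "Make it talk!",
]

def getQuotes(tags):
    return random.choice(QUOTES)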

Conclusion

I show how to add natural-sounding text-to-speech on a microcontroller using a serverless backend. This is accomplished by deploying an application through the AWS Serverless Application Repository. The deployed API uses API Gateway to securely invoke a Lambda function that fetches quotes from the PaperQuotes API and generates speech using Amazon Polly. The speech audio is uploaded to S3.

I then show how to program a microcontroller, the Adafruit PyPortal, using CircuitPython. The code periodically calls the serverless API to fetch a quote and to download speech audio for playback. The sample code also demonstrates synthesizing arbitrary text to speech, meaning it can be used for any project you can conceive. Check out my previous guide on using the PyPortal to create a Martian weather display for inspiration.

Low-latency computing with AWS Local Zones – Part 1

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/low-latency-computing-with-aws-local-zones-part-1/

This post was contributed by: Pranav Chachra, Rob Chen, Alan Goodman

AWS launched a Local Zone in Los Angeles (LA), California, in late 2019 at re:Invent. Since the launch, we have seen a lot of interest from you, and have worked to bring additional features and services based on your feedback.

In this blog series, we share best practices and recommendations for hosting your applications in Local Zones based on what has worked for customers. Part one focuses on sharing more about Local Zones and how customers are already using the LA Local Zone. In this blog, we also address key questions asked by customers since the launch of the LA Local Zone. In the upcoming parts of this blog series, we do a deep dive into specific use cases for which our customers are using the LA Local Zone. We also list best practices to host your applications in Local Zones and plan to provide related architectural considerations.

AWS Local Zones introduction

local zone base picture

Today, there are a significant number of applications that can run in the AWS Cloud. Most applications work well in our Regions. However, for workloads that require low-latency or local data processing, you need us to bring AWS infrastructure closer to you. Without local AWS presence, you have to procure, operate, and maintain IT infrastructure in your own data center or colocation facility for such workloads. And, you end up building and running such application components with a different set of APIs and tools than the other parts of your applications running in the AWS Cloud. This results in a lot of extra effort and expenses.

And that's why, based on your feedback, we launched AWS Local Zones, a new type of AWS infrastructure deployment that places compute, storage, and other select services closer to large cities. We designed Local Zones as a powerful construct that extends our existing Regions into new locations. This gives you the ability to run applications on AWS that require single-digit millisecond latencies to your end users or on-premises installations in the local area. Just like Regions, Local Zones are fully managed and supported by AWS, giving you the elasticity, scalability, and security benefits of running on AWS. The first Local Zone is generally available in LA, and is an extension of the US West (Oregon) Region (the parent Region).

Use cases

Like Regions, customers leverage the LA Local Zone for many different use cases. The top customer use cases include:

  1. Media and Entertainment (M&E) content creation: M&E customers are migrating expensive on-premises workstations to the LA Local Zone. They do this to accelerate content creation by getting rid of capacity constraints while improving security and operational efficiency. For artist workstations, latency is the key to having a jitter-free experience on a remote instance. These customers typically require less than five milliseconds of latency from their offices to virtual instances. With Direct Connect, most of these customers are able to achieve as low as 1-2 millisecond latency from their animation hubs in LA to the Local Zone, and run latency-sensitive workloads, such as live production and video editing.
  2. Enterprise migration with hybrid architecture: Enterprises have workloads running in their existing on-premises data centers in the LA metro area. These customers use the Local Zone to migrate complex legacy on-premises applications to AWS without expensive revamp of their architecture. Customers have told us that it can be daunting to migrate a portfolio of interdependent applications to the cloud. Now with a Direct Connect to the LA Local Zone, customers can establish a hybrid environment that provides ultra-low latency communication between applications running in the LA Local Zone and on-premises installations. In turn, this enables customers to migrate applications incrementally, simplifying migrations drastically and enabling on-going hybrid deployments in the LA area.
  3. Real-time multiplayer gaming: This category includes gaming companies that deploy game servers for multiplayer sessions all over the world to be closer to gamers. A latency of 20 milliseconds or less is considered ideal for good gameplay experience. Until now, these customers were using on-premises installations in the LA area to supplement AWS presence. However, now, customers are deploying latency-sensitive game servers in the LA Local Zone to run real-time and interactive multiplayer game sessions, enabling them to provide end users in the Southern California area with a great experience.

In the next parts of the blog series, we further dive into these use cases, cover respective best practices, and review architectural guidance for you.

Services and features

AWS launched the LA Local Zone with support for seven Nitro-based Amazon EC2 instance types (T3, C5, M5, R5, R5d, I3en, and G4), two EBS volume types (io1 and gp2), Amazon FSx for Windows File Server, Amazon FSx for Lustre, Application Load Balancer, Amazon VPC, and AWS Direct Connect. Since launch, AWS has added support for Amazon RDS, Amazon EMR, AWS Shield, and Amazon EC2 Dedicated Hosts, and is adding more services based on your feedback.

Other parts of AWS, like AWS CloudFormation templates, CloudWatch, IAM resources, and Organizations, will continue to work as expected, providing you a consistent experience. You can also leverage the full suite of services like Amazon S3 in the parent Region, US West (Oregon), over AWS’s private network backbone with ~20–30 milliseconds latency.

Getting started and using the Local Zone

Now that we reviewed some common use cases and services of Local Zones, we want to get you started using the Local Zone. First, enable the LA Local Zone from the new “Zone settings” section of the EC2 console, as shown in the following image:

console Zone Settings

Once enabled, a Local Zone looks and behaves similarly to an Availability Zone (AZ). You can access Local Zones through the parent Region's console and API endpoints. The following image shows that the LA Local Zone is visible as us-west-2-lax-1a along with other AZs in the EC2 console:

service health of zone

Once the LA Local Zone is enabled, you can extend your existing VPC from the parent Region to a Local Zone by creating a new VPC subnet assigned to the LA Local Zone:

creating subnet
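If you prefer to script this step, the following is a minimal boto3 sketch; the VPC ID and CIDR block are placeholders for your own values:

import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

# Create a subnet in the LA Local Zone by targeting it like an Availability Zone.
subnet = ec2.create_subnet(
    VpcId="vpc-0123456789abcdef0",       # placeholder: your VPC in US West (Oregon)
    CidrBlock="172.31.96.0/24",          # placeholder: a free range in that VPC
    AvailabilityZone="us-west-2-lax-1a",
)
print(subnet["Subnet"]["SubnetId"])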

Once a VPC subnet is established for the LA Local Zone, simply select the subnet while creating local resources. For example, you can launch an EC2 instance in the LA Local Zone by selecting the local subnet, as shown in the following image:

configuring instance details

Local resources are then ready within seconds. You can manage these resources in the LA Local Zone just like resources in AZs:

shows the local zone created

Just like Regions, you can set up an Internet Gateway to access your local resources in the LA Local Zone over the internet. Or you can use AWS Direct Connect to route your traffic over a private network connection from any Direct Connect location. To get the best latency performance, you should use one of the Direct Connect locations available in LA:

  • T5 at El Segundo, Los Angeles, CA (recommended for lowest latency to the LA Local Zone)
  • CoreSite LA1, Los Angeles, CA
  • Equinix LA3, El Segundo, CA

Pricing

From a pricing perspective, instances and other local AWS resources running in a Local Zone have their own prices, which might differ from those in the parent Region. Billing reports include a prefix, "LAX1," or location name, "US West (Los Angeles)," that is specific to a group of Local Zones in LA. EC2 instances in Local Zones are available in On-Demand and Spot form, and you can also purchase Savings Plans. For pricing information, you can visit the pricing section on the respective services and filter pricing information by choosing the Local Zone location as "US West (Los Angeles)" in the dropdown.

Data transfer charges in AWS Local Zones are the same as in the AZs in the parent Region today. For example, data transferred between Amazon EC2 instances in the LA Local Zone and Amazon S3 in the parent Region, US West (Oregon), is free. Similarly, data transferred in and out between Amazon EC2 in the Local Zone and Amazon EC2 in the parent Region is charged at $0.01/GB in each direction. You can learn more about data transfer prices "in" and "out" of Amazon EC2 here.
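For example, at that rate, moving 100 GB from an EC2 instance in the LA Local Zone to an EC2 instance in the parent Region and 100 GB back costs 100 GB × $0.01/GB × 2 = $2.00.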

Thinking ahead

Later this year, AWS plans to open a second Local Zone in LA (us-west-2-lax-1b). The two Local Zones in LA will be interconnected with high-bandwidth, low-latency AWS networking allowing you to architect your low-latency applications for high availability and fault tolerance. Based on your feedback, we are also working on adding Local Zones in other locations along with the availability of the additional services, including Amazon ECS, Amazon Elastic Kubernetes Service, Amazon ElastiCache, Amazon ES, and Amazon Managed Streaming for Apache Kafka.

Conclusion

Now that we have delivered the AWS Local Zones that you requested, we are really looking forward to seeing what you can do with them. AWS would love to get your advice on locations, additional local services and features, or other interesting use cases, so feel free to leave us your comments!

Introducing Instance Refresh for EC2 Auto Scaling

Post Syndicated from Ben Peven original https://aws.amazon.com/blogs/compute/introducing-instance-refresh-for-ec2-auto-scaling/

This post is contributed by: Ran Sheinberg – Principal EC2 Spot SA, and Isaac Vallhonrat – Sr. EC2 Spot Specialist SA

Today, we are launching Instance Refresh. This is a new feature in EC2 Auto Scaling that enables automatic deployments of instances in Auto Scaling Groups (ASGs), in order to release new application versions or make infrastructure updates.

Amazon EC2 Auto Scaling is used for a wide variety of workload types and applications. EC2 Auto Scaling helps you maintain application availability through a rich feature set. This feature set includes integration into Elastic Load Balancing, automatically replacing unhealthy instances, balancing instances across Availability Zones, provisioning instances across multiple pricing options and instance types, dynamically adding and removing instances, and more.

Many customers use an immutable infrastructure approach. This approach encourages replacing EC2 instances to update the application or configuration, instead of deploying into EC2 instances that are already running. This can be done by baking code and software in golden Amazon Machine Images (AMIs), and rolling out new EC2 Instances that use the new AMI version. Another common pattern for rolling out application updates is changing the package version that the instance pulls when it boots (via updates to instance user data). Or, keeping that pointer static, and pushing a new version to the code repository or another type of artifact (container, package on Amazon S3) to be fetched by an instance when it boots and gets provisioned.

Until today, EC2 Auto Scaling customers used different methods for replacing EC2 instances inside EC2 Auto Scaling groups when a deployment or operating system update was needed. For example, UpdatePolicy within AWS CloudFormation, create_before_destroy lifecycle in Hashicorp Terraform, using AWS CodeDeploy, or even custom scripts that call the EC2 Auto Scaling API.

Customers told us that they want native deployment functionality built into EC2 Auto Scaling to take away the heavy lifting of custom solutions, or deployments that are initiated from outside of Auto Scaling groups.

Introducing Instance Refresh in EC2 Auto Scaling

You can trigger an Instance Refresh using the EC2 Auto Scaling groups Management Console, or use the new StartInstanceRefresh API in the AWS CLI or any AWS SDK. All you need to do is specify the percentage of healthy instances to keep in the group while the ASG terminates and launches instances. Also specify the warm-up time, which is the time period that the ASG waits between the groups of instances that it refreshes via Instance Refresh. If your ASG uses health checks, the ASG waits for the instances in the group to be healthy before it continues to the next group of instances.

Instance Refresh in action

To get started with Instance Refresh in the AWS Management Console, click on an existing ASG in the EC2 Auto Scaling Management Console. Then click the Instance refresh tab.

When clicking the Start instance refresh button, I am presented with the following options:

start instance refresh

With the default configuration, ASG works to keep 90% of the instances running and does not proceed to the next group of instances if that percentage is not kept. After each group, ASG waits for the newly launched instances to transition into the healthy state, in addition to the 300-second warm-up time to pass before proceeding to the next group of instances.

I can also initiate the same action from the AWS CLI by using the following code:

aws autoscaling start-instance-refresh --auto-scaling-group-name ASG-Instance-Refresh --preferences MinHealthyPercentage=90,InstanceWarmup=300
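The same refresh can also be started from code; the following is a minimal boto3 sketch using the same preferences, with the group name taken from the CLI example above:

import boto3

autoscaling = boto3.client("autoscaling")

# Start an instance refresh, keeping at least 90% of instances healthy and
# waiting 300 seconds between groups of replaced instances.
response = autoscaling.start_instance_refresh(
    AutoScalingGroupName="ASG-Instance-Refresh",
    Preferences={
        "MinHealthyPercentage": 90,
        "InstanceWarmup": 300,
    },
)
print(response["InstanceRefreshId"])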

After initializing the instance refresh process, I can see ongoing instance refreshes in the console:

initialize instance refresh

The following image demonstrates how an active Instance refresh looks in the EC2 Instances console. Moreover, ASG strives to keep the capacity balanced between Availability Zones by terminating and launching instances in different Availability Zones in each group.

active instance refresh

Automate your workflow with Instance Refresh

You can now use this new functionality to create automations that work for your use case.

To get started quickly, we created an example solution based on AWS Lambda. Visit the solution page on GitHub and see the deployment instructions.

Here’s an overview of what the solution contains and how it works:

  • An EC2 Auto Scaling group with two instances running
  • An EC2 Image Builder pipeline, set up to build and test an AMI
  • An SNS topic that would get notified when the image build completes
  • A Lambda function that is subscribed to the SNS topic, which gets triggered when the image build completes
  • The Lambda function gets the new AMI ID from the SNS notification, creates a new Launch Template version, and then triggers an Instance Refresh in the ASG, which starts the rolling update of instances.
  • Because you can configure the ASG with LaunchTemplateVersion = $Latest, every new instance that is launched by the Instance Refresh process uses the new AMI from the latest version of the Launch Template.

See the automation flow in the following diagram.

instance refresh automation flow

Conclusion

We hope that the new Instance Refresh functionality in your ASGs allows for a more streamlined approach to launching and updating your application deployments running on EC2. You can now create automations that fit your use case. This allows you to more easily refresh the EC2 instances running in your Auto Scaling groups when deploying a new version of your application or when you must replace the AMI being used. Visit the user guide to learn more and get started.

New – A Shared File System for Your Lambda Functions

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-a-shared-file-system-for-your-lambda-functions/

I am very happy to announce that AWS Lambda functions can now mount an Amazon Elastic File System (EFS), a scalable and elastic NFS file system storing data within and across multiple Availability Zones (AZs) for high availability and durability. In this way, you can use a familiar file system interface to store and share data across all concurrent execution environments of one, or more, Lambda functions. EFS supports full file system access semantics, such as strong consistency and file locking.

To connect an EFS file system with a Lambda function, you use an EFS access point, an application-specific entry point into an EFS file system that includes the operating system user and group to use when accessing the file system and the file system permissions, and that can limit access to a specific path in the file system. This helps keep file system configuration decoupled from the application code.

You can access the same EFS file system from multiple functions, using the same or different access points. For example, using different EFS access points, each Lambda function can access different paths in a file system, or use different file system permissions.

You can share the same EFS file system with Amazon Elastic Compute Cloud (EC2) instances, containerized applications using Amazon ECS and AWS Fargate, and on-premises servers. Following this approach, you can use different computing architectures (functions, containers, virtual servers) to process the same files. For example, a Lambda function reacting to an event can update a configuration file that is read by an application running on containers. Or you can use a Lambda function to process files uploaded by a web application running on EC2.

In this way, some use cases are much easier to implement with Lambda functions. For example:

  • Processing or loading data larger than the space available in /tmp (512MB).
  • Loading the most updated version of files that change frequently.
  • Using data science packages that require storage space to load models and other dependencies.
  • Saving function state across invocations (using unique file names, or file system locks).
  • Building applications requiring access to large amounts of reference data.
  • Migrating legacy applications to serverless architectures.
  • Interacting with data intensive workloads designed for file system access.
  • Partially updating files (using file system locks for concurrent access).
  • Moving a directory and all its content within a file system with an atomic operation.

Creating an EFS File System
To mount an EFS file system, your Lambda functions must be connected to an Amazon Virtual Private Cloud that can reach the EFS mount targets. For simplicity, I am using here the default VPC that is automatically created in each AWS Region.

Note that, when connecting Lambda functions to a VPC, networking works differently. If your Lambda functions are using Amazon Simple Storage Service (S3) or Amazon DynamoDB, you should create a gateway VPC endpoint for those services. If your Lambda functions need to access the public internet, for example to call an external API, you need to configure a NAT Gateway. I usually don’t change the configuration of my default VPCs. If I have specific requirements, I create a new VPC with private and public subnets using the AWS Cloud Development Kit, or use one of these AWS CloudFormation sample templates. In this way, I can manage networking as code.

In the EFS console, I select Create file system and make sure that the default VPC and its subnets are selected. For all subnets, I use the default security group that gives network access to other resources in the VPC using the same security group.

In the next step, I give the file system a Name tag and leave all other options to their default values.

Then, I select Add access point. I use 1001 for the user and group IDs and limit access to the /message path. In the Owner section, used to create the folder automatically when first connecting to the access point, I use the same user and group IDs as before, and 750 for permissions. With these permissions, the owner can read, write, and execute files. Users in the same group can only read. Other users have no access.

I go on, and complete the creation of the file system.
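If you prefer to create the access point programmatically, the following is a minimal boto3 sketch with the same settings; the file system ID is a placeholder:

import boto3

efs = boto3.client("efs")

# Create an access point that maps connections to UID/GID 1001 and roots them
# at /message, creating that directory with 750 permissions on first use.
access_point = efs.create_access_point(
    FileSystemId="fs-0123456789abcdef0",  # placeholder: your file system ID
    PosixUser={"Uid": 1001, "Gid": 1001},
    RootDirectory={
        "Path": "/message",
        "CreationInfo": {
            "OwnerUid": 1001,
            "OwnerGid": 1001,
            "Permissions": "750",
        },
    },
)
print(access_point["AccessPointId"])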

Using EFS with Lambda Functions
To start with a simple use case, let’s build a Lambda function implementing a MessageWall API to add, read, or delete text messages. Messages are stored in a file on EFS so that all concurrent execution environments of that Lambda function see the same content.

In the Lambda console, I create a new MessageWall function and select the Python 3.8 runtime. In the Permissions section, I leave the default. This will create a new AWS Identity and Access Management (IAM) role with basic permissions.

When the function is created, in the Permissions tab I click on the IAM role name to open the role in the IAM console. Here, I select Attach policies to add the AWSLambdaVPCAccessExecutionRole and AmazonElasticFileSystemClientReadWriteAccess AWS managed policies. In a production environment, you can restrict access to a specific VPC and EFS access point.

Back in the Lambda console, I edit the VPC configuration to connect the MessageWall function to all subnets in the default VPC, using the same default security group I used for the EFS mount points.

Now, I select Add file system in the new File system section of the function configuration. Here, I choose the EFS file system and access point I created before. For the local mount point, I use /mnt/msg and Save. This is the path where the access point will be mounted, and corresponds to the /message folder in my EFS file system.

In the Function code editor of the Lambda console, I paste the following code and Save.

import os
import fcntl

MSG_FILE_PATH = '/mnt/msg/content'


def get_messages():
    try:
        with open(MSG_FILE_PATH, 'r') as msg_file:
            fcntl.flock(msg_file, fcntl.LOCK_SH)
            messages = msg_file.read()
            fcntl.flock(msg_file, fcntl.LOCK_UN)
    except:
        messages = 'No message yet.'
    return messages


def add_message(new_message):
    with open(MSG_FILE_PATH, 'a') as msg_file:
        fcntl.flock(msg_file, fcntl.LOCK_EX)
        msg_file.write(new_message + "\n")
        fcntl.flock(msg_file, fcntl.LOCK_UN)


def delete_messages():
    try:
        os.remove(MSG_FILE_PATH)
    except:
        pass


def lambda_handler(event, context):
    method = event['requestContext']['http']['method']
    if method == 'GET':
        messages = get_messages()
    elif method == 'POST':
        new_message = event['body']
        add_message(new_message)
        messages = get_messages()
    elif method == 'DELETE':
        delete_messages()
        messages = 'Messages deleted.'
    else:
        messages = 'Method unsupported.'
    return messages

I select Add trigger and in the configuration I select the Amazon API Gateway. I create a new HTTP API. For simplicity, I leave my API endpoint open.

With the API Gateway trigger selected, I copy the endpoint of the new API I just created.

I can now use curl to test the API:

$ curl https://1a2b3c4d5e.execute-api.us-east-1.amazonaws.com/default/MessageWall
No message yet.
$ curl -X POST -H "Content-Type: text/plain" -d 'Hello from EFS!' https://1a2b3c4d5e.execute-api.us-east-1.amazonaws.com/default/MessageWall
Hello from EFS!

$ curl -X POST -H "Content-Type: text/plain" -d 'Hello again :)' https://1a2b3c4d5e.execute-api.us-east-1.amazonaws.com/default/MessageWall
Hello from EFS!
Hello again :)

$ curl https://1a2b3c4d5e.execute-api.us-east-1.amazonaws.com/default/MessageWall
Hello from EFS!
Hello again :)

$ curl -X DELETE https://1a2b3c4d5e.execute-api.us-east-1.amazonaws.com/default/MessageWall
Messages deleted.

$ curl https://1a2b3c4d5e.execute-api.us-east-1.amazonaws.com/default/MessageWall
No message yet.

It would be relatively easy to add unique file names (or specific subdirectories) for different users and extend this simple example into a more complete messaging application. As a developer, I appreciate the simplicity of using a familiar file system interface in my code. However, depending on your requirements, EFS throughput configuration must be taken into account. See the section Understanding EFS performance later in the post for more information.
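As one possible direction, here is a minimal sketch that derives a per-user file path under the mount point; the userId parameter and the path-per-user layout are assumptions for illustration, not part of the API shown above:

import os
import re

MSG_DIR = '/mnt/msg'

def message_path_for(user_id):
    # Keep only safe characters so a user ID cannot escape the directory.
    safe_id = re.sub(r'[^A-Za-z0-9_-]', '', user_id)
    return os.path.join(MSG_DIR, f'content-{safe_id}')

# Hypothetical usage inside lambda_handler, assuming get_messages and
# add_message are modified to accept a file path argument:
# user_id = event['queryStringParameters']['userId']
# messages = get_messages(message_path_for(user_id))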

Now, let’s use the new EFS file system support in AWS Lambda to build something more interesting. For example, let’s use the additional space available with EFS to build a machine learning inference API processing images.

Building a Serverless Machine Learning Inference API
To create a Lambda function implementing machine learning inference, I need to be able, in my code, to import the necessary libraries and load the machine learning model. Often, when doing so, the overall size of those dependencies goes beyond the current AWS Lambda limits in the deployment package size. One way of solving this is to accurately minimize the libraries to ship with the function code, and then download the model from an S3 bucket straight to memory (up to 3 GB, including the memory required for processing the model) or to /tmp (up 512 MB). This custom minimization and download of the model has never been easy to implement. Now, I can use an EFS file system.

The Lambda function I am building this time needs access to the public internet to download a pre-trained model and the images to run inference on. So I create a new VPC with public and private subnets, and configure a NAT Gateway and the route table used by the private subnets to give access to the public internet. Using the AWS Cloud Development Kit, it's just a few lines of code.

I create a new EFS file system and an access point in the new VPC using similar configurations as before. This time, I use /ml for the access point path.

Then, I create a new MLInference Lambda function with the same set up as before for permissions and connect the function to the private subnets of the new VPC. Machine learning inference is quite a heavy workload, so I select 3 GB for memory and 5 minutes for timeout. In the File system configuration, I add the new access point and mount it under /mnt/inference.

The machine learning framework I am using for this function is PyTorch, and I need to put the libraries required to run inference in the EFS file system. I launch an Amazon Linux EC2 instance in a public subnet of the new VPC. In the instance details, I select one of the availability zones where I have an EFS mount point, and then Add file system to automatically mount the same EFS file system I am using for the function. For the security groups of the EC2 instance, I select the default security group (to be able to mount the EFS file system) and one that gives inbound access to SSH (to be able to connect to the instance).

I connect to the instance using SSH and create a requirements.txt file containing the dependencies I need:

torch
torchvision
numpy

The EFS file system is automatically mounted by EC2 under /mnt/efs/fs1. There, I create the /ml directory and change the owner of the path to the user and group I am using now that I am connected (ec2-user).

$ sudo mkdir /mnt/efs/fs1/ml
$ sudo chown ec2-user:ec2-user /mnt/efs/fs1/ml

I install Python 3 and use pip to install the dependencies in the /mnt/efs/fs1/ml/lib path:

$ sudo yum install python3
$ pip3 install -t /mnt/efs/fs1/ml/lib -r requirements.txt

Finally, I give ownership of the whole /ml path to the user and group I used for the EFS access point:

$ sudo chown -R 1001:1001 /mnt/efs/fs1/ml

Overall, the dependencies in my EFS file system are using about 1.5 GB of storage.

I go back to the MLInference Lambda function configuration. Depending on the runtime you use, you need to find a way to tell where to look for dependencies if they are not included with the deployment package or in a layer. In the case of Python, I set the PYTHONPATH environment variable to /mnt/inference/lib.

I am going to use PyTorch Hub to download this pre-trained machine learning model to recognize the kind of bird in a picture. The model I am using for this example is relatively small, about 200 MB. To cache the model on the EFS file system, I set the TORCH_HOME environment variable to /mnt/inference/model.
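
If you prefer not to rely on environment variables, an equivalent approach is to extend sys.path in the function code itself before importing the heavy dependencies. This is only a sketch of that alternative, using the same mount path configured above:

import sys

# Make the libraries installed on the EFS mount importable before importing them
sys.path.insert(0, '/mnt/inference/lib')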

All dependencies are now in the file system mounted by the function, and I can type my code straight in the Function code editor. I paste the following code to have a machine learning inference API:

import urllib.request
import json
import os

import torch
from PIL import Image
from torchvision import transforms

transform_test = transforms.Compose([
    transforms.Resize((600, 600), Image.BILINEAR),
    transforms.CenterCrop((448, 448)),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])

model = torch.hub.load('nicolalandro/ntsnet-cub200', 'ntsnet', pretrained=True,
                       **{'topN': 6, 'device': 'cpu', 'num_classes': 200})
model.eval()


def lambda_handler(event, context):
    url = event['queryStringParameters']['url']

    img = Image.open(urllib.request.urlopen(url))
    scaled_img = transform_test(img)
    torch_images = scaled_img.unsqueeze(0)

    with torch.no_grad():
        top_n_coordinates, concat_out, raw_logits, concat_logits, part_logits, top_n_index, top_n_prob = model(torch_images)

        _, predict = torch.max(concat_logits, 1)
        pred_id = predict.item()
        bird_class = model.bird_classes[pred_id]
        print('bird_class:', bird_class)

    return json.dumps({
        "bird_class": bird_class,
    })

I add API Gateway as a trigger, similarly to what I did before for the MessageWall function. Now, I can use the serverless API I just created to analyze pictures of birds. I am not really an expert in the field, so I looked for a couple of interesting images on Wikipedia: an Atlantic puffin and a western grebe.

I call the API to get a prediction for these two pictures:

$ curl https://1a2b3c4d5e.execute-api.us-east-1.amazonaws.com/default/MLInference?url=https://path/to/image/atlantic-puffin.jpg

{"bird_class": "106.Horned_Puffin"}

$ curl https://1a2b3c4d5e.execute-api.us-east-1.amazonaws.com/default/MLInference?url=https://path/to/image/western-grebe.jpg

{"bird_class": "053.Western_Grebe"}

It works! Looking at Amazon CloudWatch Logs for the Lambda function, I see that the first invocation, when the function loads and prepares the pre-trained model for inference on CPUs, takes about 30 seconds. To avoid a slow response, or a timeout from the API Gateway, I use Provisioned Concurrency to keep the function ready. The next invocations take about 1.8 seconds.
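
Provisioned Concurrency is applied to a published version or an alias. As a sketch, assuming the function has an alias named live, the setting could be applied with the AWS SDK for Python (Boto3):

import boto3

lambda_client = boto3.client('lambda')

# Keep one execution environment initialized for the 'live' alias (the alias name is an assumption)
lambda_client.put_provisioned_concurrency_config(
    FunctionName='MLInference',
    Qualifier='live',
    ProvisionedConcurrentExecutions=1,
)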

Understanding EFS Performance
When using EFS with your Lambda function, it is very important to understand how EFS performance works. For throughput, each file system can be configured to use bursting or provisioned mode.

When using bursting mode, all EFS file systems, regardless of size, can burst at least to 100 MiB/s of throughput. Those over 1 TiB in the standard storage class can burst to 100 MiB/s per TiB of data stored in the file system. EFS uses a credit system to determine when file systems can burst. Each file system earns credits over time at a baseline rate that is determined by the size of the file system that is stored in the standard storage class. A file system uses credits whenever it reads or writes data. The baseline rate is 50 KiB/s per GiB of storage.

You can monitor the use of credits in CloudWatch; each EFS file system has a BurstCreditBalance metric. If you see that you are consuming all credits, and the BurstCreditBalance metric is going to zero, you should enable provisioned throughput mode for the file system, from 1 to 1024 MiB/s. There is an additional cost when using provisioned throughput, based on how much throughput you are adding on top of the baseline rate.
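
As a sketch, switching a file system to provisioned throughput is a single API call with the AWS SDK for Python (Boto3); the file system ID and throughput value below are placeholders:

import boto3

efs = boto3.client('efs')

# Move the file system from bursting to provisioned throughput mode
efs.update_file_system(
    FileSystemId='fs-12345678',
    ThroughputMode='provisioned',
    ProvisionedThroughputInMibps=10.0,
)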

To avoid running out of credits, you should think of the throughput as the average you need during the day. For example, if you have a 10 GiB file system, you have 500 KiB/s of baseline rate, and every day you can read/write 500 KiB/s * 3600 seconds * 24 hours ≈ 41 GiB.

If the libraries and everything your function needs to load during initialization are about 2 GiB, and you only access the EFS file system during function initialization, like in the MLInference Lambda function above, that means you can initialize your function (for example, because of updates or scaling up activities) about 20 times per day. That’s not a lot, and you would probably need to configure provisioned throughput for the EFS file system.

If you have 10 MiB/s of provisioned throughput, then every day you have 10 MiB/s * 3600 seconds * 24 hours ≈ 844 GiB to read or write. If you only use the EFS file system at function initialization to read about 2 GiB of dependencies, that means you can have about 400 initializations per day. That may be enough for your use case.
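
The same back-of-the-envelope math is easy to script; this short sketch just restates the calculations above:

# Daily data budget (GiB) ≈ throughput (MiB/s) * seconds per day / 1024
baseline_budget = (500 / 1024) * 86400 / 1024   # ~41 GiB/day from a 10 GiB file system (500 KiB/s baseline)
provisioned_budget = 10 * 86400 / 1024          # ~844 GiB/day with 10 MiB/s of provisioned throughput
print(baseline_budget / 2, provisioned_budget / 2)  # ~21 and ~422 initializations/day reading 2 GiB each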

In the Lambda function configuration, you can also use the reserved concurrency control to limit the maximum number of execution environments used by a function.

If, by mistake, the BurstCreditBalance goes down to zero, and the file system is relatively small (for example, a few GiBs), there is the possibility that your function gets stuck and can’t execute fast enough before reaching the timeout. In that case, you should enable (or increase) provisioned throughput for the EFS file system, or throttle your function by setting the reserved concurrency to zero to avoid all invocations until the EFS file system has enough credits.
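
As a sketch, throttling all invocations while the burst credits recover can be done by setting the reserved concurrency to zero with the AWS SDK for Python (Boto3):

import boto3

lambda_client = boto3.client('lambda')

# A reserved concurrency of 0 stops all invocations until the value is raised again
lambda_client.put_function_concurrency(
    FunctionName='MLInference',
    ReservedConcurrentExecutions=0,
)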

Understanding Security Controls
When using EFS file systems with AWS Lambda, you have multiple levels of security controls. I’m doing a quick recap here because they should all be considered during the design and implementation of your serverless applications. You can find more info on using IAM authorization and access points with EFS in this post.

To connect a Lambda function to an EFS file system, you need:

  • Network visibility in terms of VPC routing/peering and security group.
  • IAM permissions for the Lambda function to access the VPC and mount (read only or read/write) the EFS file system.

On top of that, you can narrow down what the function can access:

  • You can specify in the IAM policy conditions which EFS access point the Lambda function can use.
  • The EFS access point can limit access to a specific path in the file system.
  • File system security (user ID, group ID, permissions) can limit read, write, or executable access for each file or directory mounted by a Lambda function.

The Lambda function execution environment and the EFS mount point use industry standard Transport Layer Security (TLS) 1.2 to encrypt data in transit. You can provision Amazon EFS to encrypt data at rest. Data encrypted at rest is transparently encrypted while being written, and transparently decrypted while being read, so you don’t have to modify your applications. Encryption keys are managed by the AWS Key Management Service (KMS), eliminating the need to build and maintain a secure key management infrastructure.

Available Now
This new feature is offered in all regions where AWS Lambda and Amazon EFS are available, with the exception of the regions in China, where we are working to make this integration available as soon as possible. For more information on availability, please see the AWS Region table. To learn more, please see the documentation.

EFS for Lambda can be configured using the console, the AWS Command Line Interface (CLI), the AWS SDKs, and the Serverless Application Model. This feature allows you to build data intensive applications that need to process large files. For example, you can now unzip a 1.5 GB file in a few lines of code, or process a 10 GB JSON document. You can also load libraries or packages that are larger than the 250 MB package deployment size limit of AWS Lambda, enabling new machine learning, data modelling, financial analysis, and ETL jobs scenarios.

Amazon EFS for Lambda is supported at launch in AWS Partner Network solutions, including Epsagon, Lumigo, Datadog, HashiCorp Terraform, and Pulumi.

There is no additional charge for using EFS from Lambda functions. You pay the standard price for AWS Lambda and Amazon EFS. Lambda execution environments always connect to the right mount target in an AZ and not across AZs. You can connect to EFS in the same AZ via cross-account VPC, but there can be data transfer costs for that. Cross-Region or cross-AZ connectivity between EFS and Lambda is not supported.

Danilo

Introducing the serverless LAMP stack – part 2 relational databases

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/introducing-the-serverless-lamp-stack-part-2-relational-databases/

In this post, you learn how to use an Amazon Aurora MySQL relational database in your serverless applications. I show how to pool and share connections to the database with Amazon RDS Proxy, and how to choose configurations. The code examples in this post are written in PHP and can be found in this GitHub repository. The concepts can be applied to any AWS Lambda supported runtime.

The serverless LAMP stack

The serverless LAMP stack

This serverless LAMP stack architecture was first discussed in this post. This architecture uses a PHP Lambda function (or multiple functions) to read and write to an Amazon Aurora MySQL database.

Amazon Aurora provides high performance and availability for MySQL and PostgreSQL databases. The underlying storage scales automatically to meet demand, up to 64 tebibytes (TiB). An Amazon Aurora DB instance is created inside a virtual private cloud (VPC) to prevent public access. To connect to the Aurora database instance from a Lambda function, that Lambda function must be configured to access the same VPC.

Database memory exhaustion can occur when connecting directly to an RDS database. This is caused by a surge in database connections or by a large number of connections opening and closing at a high rate. This can lead to slower queries and limited application scalability. Amazon RDS Proxy is implemented to solve this problem. RDS Proxy is a fully managed database proxy feature for Amazon RDS. It establishes a database connection pool that sits between your application and your relational database and reuses connections in this pool. This protects the database against oversubscription, without the memory and CPU overhead of opening a new database connection each time. Credentials for the database connection are securely stored in AWS Secrets Manager. They are accessed via an AWS Identity and Access Management (IAM) role. This enforces strong authentication requirements for database applications without a costly migration effort for the DB instances themselves.

The following steps show how to connect to an Amazon Aurora MySQL database running inside a VPC. The connection is made from a Lambda function running PHP. The Lambda function connects to the database via RDS Proxy. The database credentials that RDS Proxy uses are held in Secrets Manager and accessed via IAM authentication.

RDS Proxy with IAM Authentication

RDS Proxy with IAM authentication

Getting started

RDS Proxy is currently in preview and not recommended for production workloads. For a full list of available Regions, refer to the RDS Proxy pricing page.

Creating an Amazon RDS Aurora MySQL database

Before creating an Aurora DB cluster, you must meet the prerequisites, such as creating a VPC and an RDS DB subnet group. For more information on how to set this up, see DB cluster prerequisites.

  1. Call the create-db-cluster AWS CLI command to create the Aurora MySQL DB cluster.
    aws rds create-db-cluster \
    --db-cluster-identifier sample-cluster \
    --engine aurora-mysql \
    --engine-version 5.7.12 \
    --master-username admin \
    --master-user-password secret99 \
    --db-subnet-group-name default-vpc-6cc1cf0a \
    --vpc-security-group-ids sg-d7cf52a3 \
    --enable-iam-database-authentication true
  2. Add a new DB instance to the cluster.
    aws rds create-db-instance \
        --db-instance-class db.r5.large \
        --db-instance-identifier sample-instance \
        --engine aurora-mysql  \
        --db-cluster-identifier sample-cluster
  3. Store the database credentials as a secret in AWS Secrets Manager.
    aws secretsmanager create-secret \
    --name MyTestDatabaseSecret \
    --description "My test database secret created with the CLI" \
    --secret-string '{"username":"admin","password":"secret99","engine":"mysql","host":"<REPLACE-WITH-YOUR-DB-WRITER-ENDPOINT>","port":"3306","dbClusterIdentifier":"<REPLACE-WITH-YOUR-DB-CLUSTER-NAME>"}'

    Make a note of the resulting ARN for later.

    {
        "VersionId": "eb518920-4970-419f-b1c2-1c0b52062117", 
        "Name": "MySampleDatabaseSecret", 
        "ARN": "arn:aws:secretsmanager:eu-west-1:1234567890:secret:MySampleDatabaseSecret-JgEWv1"
    }

    This secret is used by RDS Proxy to maintain a connection pool to the database. To access the secret, the RDS Proxy service requires permissions to be explicitly granted.

  4. Create an IAM policy that provides secretsmanager permissions to the secret.
    aws iam create-policy \
    --policy-name my-rds-proxy-sample-policy \
    --policy-document '{
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "VisualEditor0",
          "Effect": "Allow",
          "Action": [
            "secretsmanager:GetResourcePolicy",
            "secretsmanager:GetSecretValue",
            "secretsmanager:DescribeSecret",
            "secretsmanager:ListSecretVersionIds"
          ],
          "Resource": [
            "<the-arn-of-the-secret>”
          ]
        },
        {
          "Sid": "VisualEditor1",
          "Effect": "Allow",
          "Action": [
            "secretsmanager:GetRandomPassword",
            "secretsmanager:ListSecrets"
          ],
          "Resource": "*"
        }
      ]
    }'
    

    Make a note of the resulting policy ARN, which you need to attach to a new role.

    {
        "Policy": {
            "PolicyName": "my-rds-proxy-sample-policy", 
            "PermissionsBoundaryUsageCount": 0, 
            "CreateDate": "2020-06-04T12:21:25Z", 
            "AttachmentCount": 0, 
            "IsAttachable": true, 
            "PolicyId": "ANPA6JE2MLNK3Z4EFQ5KL", 
            "DefaultVersionId": "v1", 
            "Path": "/", 
            "Arn": "arn:aws:iam::1234567890112:policy/my-rds-proxy-sample-policy", 
            "UpdateDate": "2020-06-04T12:21:25Z"
         }
    }
    
  5. Create an IAM Role that has a trust relationship with the RDS Proxy service. This allows the RDS Proxy service to assume this role to retrieve the database credentials.

    aws iam create-role --role-name my-rds-proxy-sample-role --assume-role-policy-document '{
     "Version": "2012-10-17",
     "Statement": [
      {
       "Sid": "",
       "Effect": "Allow",
       "Principal": {
        "Service": "rds.amazonaws.com"
       },
       "Action": "sts:AssumeRole"
      }
     ]
    }'
    
  6. Attach the new policy to the role:
    aws iam attach-role-policy \
    --role-name my-rds-proxy-sample-role \
    --policy-arn arn:aws:iam::123456789:policy/my-rds-proxy-sample-policy
    

Create an RDS Proxy

  1. Use the AWS CLI to create a new RDS Proxy. Replace the --role-arn and SecretArn values with those created in the previous steps.
    aws rds create-db-proxy \
    --db-proxy-name sample-db-proxy \
    --engine-family MYSQL \
    --auth '{
            "AuthScheme": "SECRETS",
            "SecretArn": "arn:aws:secretsmanager:eu-west-1:123456789:secret:exampleAuroraRDSsecret1-DyCOcC",
             "IAMAuth": "REQUIRED"
          }' \
    --role-arn arn:aws:iam::123456789:role/my-rds-proxy-sample-role \
    --vpc-subnet-ids  subnet-c07efb9a subnet-2bc08b63 subnet-a9007bcf
    

    To enforce IAM authentication for users of the RDS Proxy, the IAMAuth value is set to REQUIRED. This is a more secure alternative to embedding database credentials in the application code base.

    The Aurora DB cluster and its associated instances are referred to as the targets of that proxy.

  2. Add the database cluster to the proxy with the register-db-proxy-targets command.
    aws rds register-db-proxy-targets \
    --db-proxy-name sample-db-proxy \
    --db-cluster-identifiers sample-cluster
    

Deploying a PHP Lambda function with VPC configuration

This GitHub repository contains a Lambda function with a PHP runtime provided by a Lambda layer. The function uses the MySQLi PHP extension to connect to the RDS Proxy. The extension has been installed and compiled along with a PHP executable.

The PHP executable is packaged together with a Lambda bootstrap file to create a PHP custom runtime. More information on building your own custom runtime for PHP can be found in this post.

Deploy the application stack using the AWS Serverless Application Model (AWS SAM) CLI:

sam deploy -g

When prompted, enter the SecurityGroupIds and the SubnetIds for your Aurora DB cluster.

The SAM template attaches the SecurityGroupIds and SubnetIds parameters to the Lambda function using the VpcConfig sub-resource.

Lambda creates an elastic network interface for each combination of security group and subnet in the function’s VPC configuration. The function can only access resources (and the internet) through that VPC.

Adding RDS Proxy to a Lambda Function

  1. Go to the Lambda console.
  2. Choose the PHPHelloFunction that you just deployed.
  3. Choose Add database proxy at the bottom of the page.
  4. Choose existing database proxy then choose sample-db-proxy.
  5. Choose Add.

Using the RDS Proxy from within the Lambda function

The Lambda function imports three libraries from the AWS PHP SDK. These are used to generate a password token from the database credentials stored in Secrets Manager.

The AWS PHP SDK libraries are provided by the PHP-example-vendor layer. Using Lambda layers in this way creates a mechanism for incorporating additional libraries and dependencies as the application evolves.

The function’s handler, named index, is the entry point of the function code. First, getenv() is called to retrieve the environment variables set by the SAM application’s deployment. These are saved as local variables and are available for the duration of the Lambda function’s execution.

The AuthTokenGenerator class generates an RDS auth token for use with IAM authentication. This is initialized by passing in the credential provider to the SDK client constructor. The createToken() method is then invoked, with the Proxy endpoint, port number, Region, and database user name provided as method parameters. The resultant temporary token is then used to connect to the proxy.

The PHP mysqli class represents a connection between PHP and a MySQL database. The real_connect() method is used to open a connection to the database via RDS Proxy. Instead of providing the database host endpoint as the first parameter, the proxy endpoint is given. The database user name, temporary token, database name, and port number are also provided. The constant MYSQLI_CLIENT_SSL is set to ensure that the connection uses SSL encryption.

Once a connection has been established, the connection object can be used. In this example, a SHOW TABLES query is executed. The connection is then closed, and the result is encoded to JSON and returned from the Lambda function.

The output is the JSON-encoded list of tables returned by the SHOW TABLES query.
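
The function in this post is written in PHP, but the same flow translates to other runtimes. Purely as an illustrative sketch, a Python version using Boto3 and PyMySQL might look like the following; the proxy endpoint, user, database name, and certificate bundle path are placeholders:

import os

import boto3
import pymysql

PROXY_ENDPOINT = os.environ['PROXY_ENDPOINT']  # e.g. sample-db-proxy.proxy-xxxxxxxx.eu-west-1.rds.amazonaws.com
DB_USER = os.environ['DB_USER']
DB_NAME = os.environ['DB_NAME']

rds = boto3.client('rds')

def handler(event, context):
    # Generate a short-lived IAM authentication token instead of using a stored password
    token = rds.generate_db_auth_token(
        DBHostname=PROXY_ENDPOINT,
        Port=3306,
        DBUsername=DB_USER,
    )
    # IAM authentication through RDS Proxy requires TLS; the CA bundle path is an assumption
    connection = pymysql.connect(
        host=PROXY_ENDPOINT,
        user=DB_USER,
        password=token,
        db=DB_NAME,
        port=3306,
        ssl={'ca': '/opt/rds-combined-ca-bundle.pem'},
    )
    with connection.cursor() as cursor:
        cursor.execute('SHOW TABLES')
        tables = [row[0] for row in cursor.fetchall()]
    connection.close()
    return tables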

RDS Proxy monitoring and performance tuning

RDS Proxy allows you to monitor and adjust connection limits and timeout intervals without changing application code.

Limit the timeout wait period that is most suitable for your application with the connection borrow timeout option. This specifies how long to wait for a connection to become available in the connection pool before returning a timeout error.

Adjust the idle connection timeout interval to help your applications handle stale resources. This can save your application from mistakenly leaving open connections that hold important database resources.

Multiple applications using a single database can each use an RDS Proxy to divide the connection quotas across each application. Set the maximum proxy connections as a percentage of the max_connections configuration (for MySQL).

The following example shows how to change the MaxConnectionsPercent setting for a proxy target group.

aws rds modify-db-proxy-target-group \
--db-proxy-name sample-db-proxy \
--target-group-name default \
--connection-pool-config '{"MaxConnectionsPercent": 75 }'

Response:

{
    "TargetGroups": [
        {
            "DBProxyName": "sample-db-proxy",
            "TargetGroupName": "default",
            "TargetGroupArn": "arn:aws:rds:eu-west-1:####:target-group:prx-tg-03d7fe854604e0ed1",
            "IsDefault": true,
            "Status": "available",
            "ConnectionPoolConfig": {
                "MaxConnectionsPercent": 75,
                "MaxIdleConnectionsPercent": 50,
                "ConnectionBorrowTimeout": 120,
                "SessionPinningFilters": []
            },
            "CreatedDate": "2020-06-04T16:14:35.858000+00:00",
            "UpdatedDate": "2020-06-09T09:08:50.889000+00:00"
        }
    ]
}

When RDS Proxy detects a session state change that isn’t appropriate for reuse, it keeps the session on the same connection until the session ends. This behavior is called pinning. Performance tuning for RDS Proxy involves maximizing connection reuse by minimizing pinning.

The Amazon CloudWatch metric DatabaseConnectionsCurrentlySessionPinned can be monitored to see how frequently pinning occurs in your application.

Amazon CloudWatch collects and processes raw data from RDS Proxy into readable, near real-time metrics. Use these metrics to observe the number of connections and the memory associated with connection management. This can help identify if a database instance or cluster would benefit from using RDS Proxy. For example, if it is handling many short-lived connections, or opening and closing connections at a high rate.
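
As an illustrative sketch, the pinning metric can also be pulled programmatically with the AWS SDK for Python (Boto3); the AWS/RDS namespace and ProxyName dimension follow the RDS metric conventions, but verify them for your setup:

from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client('cloudwatch')

# Average number of pinned sessions over the last hour for the sample proxy
response = cloudwatch.get_metric_statistics(
    Namespace='AWS/RDS',
    MetricName='DatabaseConnectionsCurrentlySessionPinned',
    Dimensions=[{'Name': 'ProxyName', 'Value': 'sample-db-proxy'}],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,
    Statistics=['Average'],
)
for point in sorted(response['Datapoints'], key=lambda p: p['Timestamp']):
    print(point['Timestamp'], point['Average'])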

Conclusion

In this post, you learn how to create and configure an RDS Proxy to manage connections from a PHP Lambda function to an Aurora MySQL database. You see how to enforce strong authentication requirements by using Secrets Manager and IAM authentication. You deploy a Lambda function that uses Lambda layers to store the AWS PHP SDK as a dependency.

You can create secure, scalable, and performant serverless applications with relational databases. Do this by placing the RDS Proxy service between your database and your Lambda functions. You can also migrate your existing MySQL database to an Aurora DB cluster without altering the database. Using RDS Proxy and Lambda, you can build serverless PHP applications faster, with less code.

Find more PHP examples with the Serverless LAMP stack.

Amazon EKS Now Supports EC2 Inf1 Instances

Post Syndicated from Julien Simon original https://aws.amazon.com/blogs/aws/amazon-eks-now-supports-ec2-inf1-instances/

Amazon Elastic Kubernetes Service (EKS) has quickly become a leading choice for machine learning workloads. It combines the developer agility and the scalability of Kubernetes with the wide selection of Amazon Elastic Compute Cloud (EC2) instance types available on AWS, such as the C5, P3, and G4 families.

As models become more sophisticated, hardware acceleration is increasingly required to deliver fast predictions at high throughput. Today, we’re very happy to announce that AWS customers can now use the Amazon EC2 Inf1 instances on Amazon Elastic Kubernetes Service, for high performance and the lowest prediction cost in the cloud.

A primer on EC2 Inf1 instances
Inf1 instances were launched at AWS re:Invent 2019. They are powered by AWS Inferentia, a custom chip built from the ground up by AWS to accelerate machine learning inference workloads.

Inf1 instances are available in multiple sizes, with 1, 4, or 16 AWS Inferentia chips, with up to 100 Gbps network bandwidth and up to 19 Gbps EBS bandwidth. An AWS Inferentia chip contains four NeuronCores. Each one implements a high-performance systolic array matrix multiply engine, which massively speeds up typical deep learning operations such as convolution and transformers. NeuronCores are also equipped with a large on-chip cache, which helps cut down on external memory accesses, saving I/O time in the process. When several AWS Inferentia chips are available on an Inf1 instance, you can partition a model across them and store it entirely in cache memory. Alternatively, to serve multi-model predictions from a single Inf1 instance, you can partition the NeuronCores of an AWS Inferentia chip across several models.

Compiling Models for EC2 Inf1 Instances
To run machine learning models on Inf1 instances, you need to compile them to a hardware-optimized representation using the AWS Neuron SDK. All tools are readily available on the AWS Deep Learning AMI, and you can also install them on your own instances. You’ll find instructions in the Deep Learning AMI documentation, as well as tutorials for TensorFlow, PyTorch, and Apache MXNet in the AWS Neuron SDK repository.

In the demo below, I will show you how to deploy a Neuron-optimized model on an EKS cluster of Inf1 instances, and how to serve predictions with TensorFlow Serving. The model in question is BERT, a state of the art model for natural language processing tasks. This is a huge model with hundreds of millions of parameters, making it a great candidate for hardware acceleration.

Building an EKS Cluster of EC2 Inf1 Instances
First of all, let’s build a cluster with two inf1.2xlarge instances. I can easily do this with eksctl, the command-line tool to provision and manage EKS clusters. You can find installation instructions in the EKS documentation.

Here is the configuration file for my cluster. Eksctl detects that I’m launching a node group with an Inf1 instance type, and will start the worker nodes using the EKS-optimized Accelerated AMI.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: cluster-inf1
  region: us-west-2
nodeGroups:
  - name: ng1-public
    instanceType: inf1.2xlarge
    minSize: 0
    maxSize: 3
    desiredCapacity: 2
    ssh:
      allow: true

Then, I use eksctl to create the cluster. This process will take approximately 10 minutes.

$ eksctl create cluster -f inf1-cluster.yaml

Eksctl automatically installs the Neuron device plugin in your cluster. This plugin advertises Neuron devices to the Kubernetes scheduler, which can be requested by containers in a deployment spec. I can check with kubectl that the device plug-in container is running fine on both Inf1 instances.

$ kubectl get pods -n kube-system
NAME                                  READY STATUS  RESTARTS AGE
aws-node-tl5xv                        1/1   Running 0        14h
aws-node-wk6qm                        1/1   Running 0        14h
coredns-86d5cbb4bd-4fxrh              1/1   Running 0        14h
coredns-86d5cbb4bd-sts7g              1/1   Running 0        14h
kube-proxy-7px8d                      1/1   Running 0        14h
kube-proxy-zqvtc                      1/1   Running 0        14h
neuron-device-plugin-daemonset-888j4  1/1   Running 0        14h
neuron-device-plugin-daemonset-tq9kc  1/1   Running 0        14h

Next, I define AWS credentials in a Kubernetes secret. They will allow me to grab my BERT model stored in S3. Please note that both keys need to be base64-encoded; a small sketch for producing these values follows the manifest below.

apiVersion: v1 
kind: Secret 
metadata: 
  name: aws-s3-secret 
type: Opaque 
data: 
  AWS_ACCESS_KEY_ID: <base64-encoded value> 
  AWS_SECRET_ACCESS_KEY: <base64-encoded value>
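
The base64 values can be produced in any language; a minimal Python sketch (the key values are placeholders):

import base64

# Encode the credentials before pasting them into the secret manifest
print(base64.b64encode(b'YOUR_ACCESS_KEY_ID').decode())
print(base64.b64encode(b'YOUR_SECRET_ACCESS_KEY').decode())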

Finally, I store these credentials on the cluster.

$ kubectl apply -f secret.yaml

The cluster is correctly set up. Now, let’s build an application container storing a Neuron-enabled version of TensorFlow Serving.

Building an Application Container for TensorFlow Serving
The Dockerfile is very simple. We start from an Amazon Linux 2 base image. Then, we install the AWS CLI, and the TensorFlow Serving package available in the Neuron repository.

FROM amazonlinux:2
RUN yum install -y awscli
RUN echo $'[neuron] \n\
name=Neuron YUM Repository \n\
baseurl=https://yum.repos.neuron.amazonaws.com \n\
enabled=1' > /etc/yum.repos.d/neuron.repo
RUN rpm --import https://yum.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB
RUN yum install -y tensorflow-model-server-neuron

I build the image, create an Amazon Elastic Container Registry repository, and push the image to it.

$ docker build . -f Dockerfile -t tensorflow-model-server-neuron
$ docker tag tensorflow-model-server-neuron 123456789012.dkr.ecr.us-west-2.amazonaws.com/inf1-demo
$ aws ecr create-repository --repository-name inf1-demo
$ docker push 123456789012.dkr.ecr.us-west-2.amazonaws.com/inf1-demo

Our application container is ready. Now, let’s define a Kubernetes service that will use this container to serve BERT predictions. I’m using a model that has already been compiled with the Neuron SDK. You can compile your own using the instructions available in the Neuron SDK repository.

Deploying BERT as a Kubernetes Service
The deployment manages two containers: the Neuron runtime container and my application container. The Neuron runtime runs as a sidecar container, and is used to interact with the AWS Inferentia chips. At startup, the application container configures the AWS CLI with the appropriate security credentials. Then, it fetches the BERT model from S3. Finally, it launches TensorFlow Serving, loading the BERT model and waiting for prediction requests. For this purpose, the HTTP and gRPC ports are open. Here is the full manifest.

kind: Service
apiVersion: v1
metadata:
  name: eks-neuron-test
  labels:
    app: eks-neuron-test
spec:
  ports:
  - name: http-tf-serving
    port: 8500
    targetPort: 8500
  - name: grpc-tf-serving
    port: 9000
    targetPort: 9000
  selector:
    app: eks-neuron-test
    role: master
  type: ClusterIP
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: eks-neuron-test
  labels:
    app: eks-neuron-test
    role: master
spec:
  replicas: 2
  selector:
    matchLabels:
      app: eks-neuron-test
      role: master
  template:
    metadata:
      labels:
        app: eks-neuron-test
        role: master
    spec:
      volumes:
        - name: sock
          emptyDir: {}
      containers:
      - name: eks-neuron-test
        image: 123456789012.dkr.ecr.us-west-2.amazonaws.com/inf1-demo:latest
        command: ["/bin/sh","-c"]
        args:
          - "mkdir ~/.aws/ && \
           echo '[eks-test-profile]' > ~/.aws/credentials && \
           echo AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID >> ~/.aws/credentials && \
           echo AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY >> ~/.aws/credentials; \
           /usr/bin/aws --profile eks-test-profile s3 sync s3://jsimon-inf1-demo/bert /tmp/bert && \
           /usr/local/bin/tensorflow_model_server_neuron --port=9000 --rest_api_port=8500 --model_name=bert_mrpc_hc_gelus_b4_l24_0926_02 --model_base_path=/tmp/bert/"
        ports:
        - containerPort: 8500
        - containerPort: 9000
        imagePullPolicy: Always
        env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              key: AWS_ACCESS_KEY_ID
              name: aws-s3-secret
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              key: AWS_SECRET_ACCESS_KEY
              name: aws-s3-secret
        - name: NEURON_RTD_ADDRESS
          value: unix:/sock/neuron.sock

        resources:
          limits:
            cpu: 4
            memory: 4Gi
          requests:
            cpu: "1"
            memory: 1Gi
        volumeMounts:
          - name: sock
            mountPath: /sock

      - name: neuron-rtd
        image: 790709498068.dkr.ecr.us-west-2.amazonaws.com/neuron-rtd:1.0.6905.0
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN
            - IPC_LOCK

        volumeMounts:
          - name: sock
            mountPath: /sock
        resources:
          limits:
            hugepages-2Mi: 256Mi
            aws.amazon.com/neuron: 1
          requests:
            memory: 1024Mi

I use kubectl to create the service.

$ kubectl create -f bert_service.yml

A few seconds later, the pods are up and running.

$ kubectl get pods
NAME                           READY STATUS  RESTARTS AGE
eks-neuron-test-5d59b55986-7kdml 2/2   Running 0        14h
eks-neuron-test-5d59b55986-gljlq 2/2   Running 0        14h

Finally, I redirect service port 9000 to local port 9000, to let my prediction client connect locally.

$ kubectl port-forward svc/eks-neuron-test 9000:9000 &

Now, everything is ready for prediction, so let’s invoke the model.

Predicting with BERT on EKS and Inf1
The inner workings of BERT are beyond the scope of this post. This particular model expects a sequence of 128 tokens, encoding the words of two sentences we’d like to compare for semantic equivalence.

Here, I’m only interested in measuring prediction latency, so dummy data is fine. I build 100 prediction requests storing a sequence of 128 zeros. I send them to the TensorFlow Serving endpoint via grpc, and I compute the average prediction time.

import numpy as np
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
import time

if __name__ == '__main__':
    channel = grpc.insecure_channel('localhost:9000')
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'bert_mrpc_hc_gelus_b4_l24_0926_02'
    i = np.zeros([1, 128], dtype=np.int32)
    request.inputs['input_ids'].CopyFrom(tf.contrib.util.make_tensor_proto(i, shape=i.shape))
    request.inputs['input_mask'].CopyFrom(tf.contrib.util.make_tensor_proto(i, shape=i.shape))
    request.inputs['segment_ids'].CopyFrom(tf.contrib.util.make_tensor_proto(i, shape=i.shape))

    latencies = []
    for i in range(100):
        start = time.time()
        result = stub.Predict(request)
        latencies.append(time.time() - start)
        print("Inference successful: {}".format(i))
    print ("Ran {} inferences successfully. Latency average = {}".format(len(latencies), np.average(latencies)))

On average, prediction took 59.2 ms. As far as BERT goes, this is pretty good!

Ran 100 inferences successfully. Latency average = 0.05920819044113159

In real-life, we would certainly be batching prediction requests in order to increase throughput. If needed, we could also scale to larger Inf1 instances supporting several Inferentia chips, and deliver even more prediction performance at low cost.

Getting Started
Kubernetes users can deploy Amazon Elastic Compute Cloud (EC2) Inf1 instances on Amazon Elastic Kubernetes Service today in the US East (N. Virginia) and US West (Oregon) regions. As Inf1 deployment progresses, you’ll be able to use them with Amazon Elastic Kubernetes Service in more regions.

Give this a try, and please send us feedback either through your usual AWS Support contacts, on the AWS Forum for Amazon Elastic Kubernetes Service, or on the container roadmap on GitHub.

– Julien

OpenFOAM on Amazon EC2 C6g Arm-based Graviton2 Instances – up to 37% better price/performance

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/c6g-openfoam-better-price-performance/

This post is contributed to by: Neil Ashton (AWS) – Principal CFD Specialist SA, Karthik Raman (AWS) – Senior HPC Specialist SA, Oliver Perks (Arm) – Principal HPC Engineer

Over the past 30 years, Computational Fluid Dynamics (CFD) has become a key part of many engineering design processes. From aircraft design to modelling the blood flow inside our bodies, the ability to understand the behavior of fluids has enabled countless innovations and improved the time to market for many products. Typically, the modelling of the underlying fluid flow equations remains the major limit to accuracy; however, the scale and availability of HPC resources is arguably the main bottleneck for the typical CFD user. The need for more HPC resources has accelerated over recent years due to the move to higher-fidelity approaches, in addition to the growing use of optimization techniques and machine learning driven workflows. These workflows require the simulation of thousands or even millions of small jobs to explore a design space or train an improved turbulence model. AWS enables engineers to overcome this HPC bottleneck by allowing the quick deployment of a supercomputer. This can scale to practically any size, with the right hardware to match your needs (for example, CPUs or GPUs).

CFD users have a broad choice of instance types on AWS. Typically, most users select the compute-optimized Amazon EC2 C5 family of instances. These offer better price/performance than the Amazon EC2 M5 family of instances for the majority of CFD cases, because these cases need memory bandwidth more than total memory. This broad set of available instances allows users to easily incorporate Amazon EC2 M5 or Amazon EC2 R5 instances when higher memory is needed, for example, for pre- or post-processing.

The typical CFD user has two competing needs: turn-around time and cost. Depending on the underlying numerical methods, the speed of a simulation does not increase linearly with core count. At some point, the cost of network communication makes scaling non-linear, so the cost of adding cores is not matched by a similar decrease in runtime. Therefore, the user typically faces a choice between the fastest possible runtime and the most cost-efficient simulation.

The recently announced Amazon EC2 C6g instances, powered by AWS-designed Arm-based Graviton2 processors, bring in a new instance option. So, the logical question for any AWS customer is whether this is suitable for their CFD workload. In this blog, we provide benchmarking results for the open-source CFD code OpenFOAM, which can be easily compiled for and run on the Amazon EC2 C6g instance. We demonstrate that you can achieve up to 37% better price/performance for both single-node and multi-node cases, on meshes of more than 200 million cells run on thousands of cores.

 

AWS Graviton2-based Instances

AWS Graviton2 processors are custom built by AWS using the 64-bit Arm Neoverse cores to deliver great price performance for your cloud workloads running in Amazon EC2. The Graviton2-powered EC2 instances were announced at re:Invent 2019. The new Amazon EC2 C6g compute-optimized instances are available as part of the sixth generation EC2 offering. These instances are powered by AWS Graviton2 processors that use 64-bit Arm Neoverse N1 cores and custom silicon designed by AWS, built using advanced 7-nanometer manufacturing technology.

Graviton2 processor cores feature 64 KB of L1 cache and 1 MB of L2 cache, and include dual SIMD units that double the floating-point performance versus first-generation Graviton processors. This targets high performance computing workloads, and the cores also support int8/fp16 number formats to accelerate machine learning inference workloads. Every vCPU is a physical core (that is, no simultaneous multithreading – SMT). The instances are single socket, so there are no NUMA concerns since every core sees the same path to memory and other cores. There are 8 DDR4 memory channels running at 3200 MT/s, delivering over 200 GB/s of peak memory bandwidth.

The compute-optimized Amazon EC2 C6g instances are available in nine sizes with 1, 2, 4, 8, 16, 32, 48, or 64 vCPUs, or as bare metal instances. They support configurations with up to 128 GB of DDR4 memory, or 2 GB per vCPU. The instances support up to 25 Gbps of network bandwidth and 19 Gbps of Amazon EBS bandwidth. These instances are powered by the AWS Nitro System, a combination of dedicated hardware and a lightweight hypervisor.

Benchmarking Results

Single-Node Performance

In order to demonstrate the performance of the Amazon EC2 C6g instance, we have taken two test-cases that reflect the range of cases that CFD users run. For the first case, we simulate a 4 million cell motorbike case that is part of the standard OpenFOAM tutorial suite. It might seem that these cases do not represent the complexity of some production CFD workflows. However, these were chosen to allow readers to easily replicate the results. This case uses the OpenFOAM simpleFoam solver, which uses a steady-state incompressible segregated SIMPLE approach combined with a multigrid method for the linear solver. The k-ω SST turbulence model is used with second order upwinding for both momentum and turbulent equations. The scotch domain decomposition approach is used to ensure best parallel efficiency. However, for this first case the focus is on the single-node performance, given the low cell count.

Figures 1 and 2 show the time taken to run 5000 iterations as well as the cost for that simulation based upon On-Demand pricing. The C6g.16xlarge offers a clear cost per simulation benefit over both the C5n.18xlarge (37%) and C5.24xlarge (29%) instances. There is, however, an increase in total run time compared to the C5.24xlarge (33%) and C5n.18xlarge (13%), which may mean that the C5.24xlarge or C5n.18xlarge will still be preferred by those prioritizing turn-around time.

 

Figure 1 - Single-node OpenFOAM cost per simulation

Figure 1 – Single-node OpenFOAM cost per simulation

Figure 2 - Single-node OpenFOAM run-time performance

Figure 2 – Single-node OpenFOAM run-time performance

Single-Node Performance Profile

Profiling performed on the simpleFoam solver (used in the motorbike example) showed that it is limited by the sustained memory bandwidth on the instance. We therefore measured the peak sustainable memory bandwidth performance (Figure 3) on the three instance types (C5n.18xlarge, C6g.16xlarge, C5.24xlarge) using STREAM Triad. We then calculated the corresponding bandwidth efficiencies, which are shown in Figure 4.

 

Figure 3 - Peak memory bandwidth using STREAM

Figure 3 – Peak memory bandwidth using STREAM

Figure 4 - Memory bandwidth efficiency using STREAM Triad

Figure 4 – Memory bandwidth efficiency using STREAM Triad

The C5n.18xlarge and C5.24xlarge instances have higher peak memory bandwidth due to more memory channels (dual socket) compared to the C6g.16xlarge instance. However, the Graviton2 based processor can achieve a higher percentage of peak (~86% bandwidth efficiency) compared to the x86 processors (~79% on C5.24xlarge). This, in addition to the cost benefits (47% vs. C5.24xlarge, 44% vs. C5n.18xlarge), is the reason for the improved cost per simulation with the C6g.16xlarge instance, as shown in Figure 1.

Multi-Node Performance

A further, larger mesh of 222 million cells was studied on the same motorbike geometry with the same underlying numerical methods. We did this to assess whether the same performance trends continue for much larger multi-node simulations. Figure 5 shows the number of iterations possible per minute for different numbers of cores for the C5.24xlarge, C5n.18xlarge, and C6g.16xlarge instances. You can see in Figure 5 that the scaling of the C5n.18xlarge instances is much better than that of the C5.24xlarge and C6g.16xlarge instances due to the use of the Elastic Fabric Adapter (EFA), which enables optimum scaling thanks to low-latency, high-bandwidth communication. However, while the scaling is much improved for C5n.18xlarge instances with EFA, the cost per simulation (Figure 6) shows up to 37% better price/performance for the C6g.16xlarge instances over the C5n.18xlarge and C5.24xlarge instances.

Figure 5 - Multi-node OpenFOAM Scaling Performance

Figure 5 – Multi-node OpenFOAM Scaling Performance

Figure 6 - Multi-node OpenFOAM cost per simulation

Figure 6 – Multi-node OpenFOAM cost per simulation

Conclusions

This blog has demonstrated the ease with which both single-node and multi-node CFD simulations can be ported to the new Amazon EC2 C6g instances powered by Arm-based AWS Graviton2 processors. With no code modifications, we have been able to port an x86 workload to the new C6g instances and achieve competitive single node and multi-node performance with up to 37% better price/performance over other C family instances. We encourage you to test out your applications for yourself and reach out to us if you have any questions!

OpenFOAM compilation on AWS Graviton2 Instances

OpenFOAM v1912 builds on modern Arm-based systems, with support in place for both the GCC compiler and the Arm Compiler for Linux (ACfL). No source code modifications are required to obtain a working OpenFOAM binary. The process for building OpenFOAM on Arm-based systems is equivalent to that on x86 systems: you specify the desired compiler in the OpenFOAM bashrc file.

The only external dependency is a working MPI library. For this experiment, we used Open MPI version 4.0.3, built with UCX 1.8 and GCC 9.2. We note that we have not applied any additional hardware-specific performance optimizations. Although deploying OpenFOAM on the Amazon EC2 C6g instances was trivial, there is documentation on the Arm community GitLab pages. This documentation covers the installation of different OpenFOAM versions with different compilers.

For the C5n.18xlarge and C5.24xlarge simulations, OpenFOAM v1912 was compiled using GCC 8.2 and IntelMPI 2019.7. For all simulations Amazon Linux 2 was the operating system. The HPC environment for all tests was created using AWS ParallelCluster, which will soon include official support for AWS’ latest Graviton2 instances, including C6g. You can stay up to date on the latest information and releases of AWS ParallelCluster on GitHub.

Instance Details:

Model | vCPU | Memory (GiB) | Instance Storage (GiB) | Network Bandwidth (Gbps) | EBS Bandwidth (Mbps)
C5n.18xlarge | 72 | 192 | EBS-only | 100 | 19,000
C6g.16xlarge | 64 | 128 | EBS-only | 25 | 19,000
C5.24xlarge | 96 | 192 | EBS-only | 25 | 19,000

 

Building for Cost optimization and Resilience for EKS with Spot Instances

Post Syndicated from Ben Peven original https://aws.amazon.com/blogs/compute/cost-optimization-and-resilience-eks-with-spot-instances/

This post is contributed by Chris Foote, Sr. EC2 Spot Specialist Solutions Architect

Running your Kubernetes and containerized workloads on Amazon EC2 Spot Instances is a great way to save costs. Kubernetes is a popular open-source container management system that allows you to deploy and manage containerized applications at scale. AWS makes it easy to run Kubernetes with Amazon Elastic Kubernetes Service (EKS), a managed Kubernetes service to run production-grade workloads on AWS. To cost optimize these workloads, run them on Spot Instances. Spot Instances are available at up to a 90% discount compared to On-Demand prices. These instances are best used for various fault-tolerant and instance type flexible applications. Spot Instances and containers are an excellent combination, because containerized applications are often stateless and instance flexible.

In this blog, I illustrate the best practices of using Spot Instances such as diversification, automated interruption handling, and leveraging Auto Scaling groups to acquire capacity. You then adapt these Spot Instance best practices to EKS with the goal of cost optimizing and increasing the resilience of container-based workloads.

Spot Instances Overview

Spot Instances are spare Amazon EC2 capacity that allows customers to save up to 90% over On-Demand prices. Spot capacity is split into pools determined by instance type, Availability Zone (AZ), and AWS Region. The Spot Instance price changes slowly, determined by long-term trends in supply and demand for a particular Spot capacity pool, as shown below:

Spot Instance pricing

Prices listed are an example, and may not represent current prices. Spot Instance pricing is illustrated in orange blocks, and On-Demand is illustrated in dark-blue.

When EC2 needs the capacity back, the Spot Instance service sends Spot interruption notifications to instances within the associated Spot capacity pool. This Spot interruption notification lands in both the EC2 instance metadata and Amazon EventBridge. Two minutes after the Spot interruption notification, the instance is reclaimed. You can set up your infrastructure to automate a response to this two-minute notification. Examples include draining containers, draining ELB connections, or post-processing.

Instance flexibility is important when following Spot Instance best practices, because it allows you to provision from many different pools of Spot capacity. Leveraging multiple Spot capacity pools helps reduce interruptions, depending on your defined Spot allocation strategy, and decreases the time to provision capacity. Tapping into multiple Spot capacity pools across instance types and AZs allows you to achieve your desired scale — even for applications that require 500K concurrent cores:

Spot capacity pools = (Availability Zones) * (Instance Types)

If your application is deployed across two AZs and uses only a c5.4xlarge, then you are only using (2 * 1 = 2) two Spot capacity pools. To follow Spot Instance best practices, consider using six AZs and allowing your application to use c5.4xlarge, c5d.4xlarge, c5n.4xlarge, and c4.4xlarge. This gives us (6 * 4 = 24) 24 Spot capacity pools, greatly increasing the stability and resilience of your application.
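
The same arithmetic as a quick sketch:

# Spot capacity pools = Availability Zones x instance types
azs = 6
instance_types = ['c5.4xlarge', 'c5d.4xlarge', 'c5n.4xlarge', 'c4.4xlarge']
print(azs * len(instance_types))  # 24 Spot capacity pools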

Auto Scaling groups support deploying applications across multiple instance types, and automatically replace instances if they become unhealthy, or terminated due to Spot interruption. To decrease the chance of interruption, use the capacity-optimized Spot allocation strategy. This automatically launches Spot Instances into the most available pools by looking at real-time capacity data, and identifying which are the most available.

Now that I’ve covered Spot best practices, you can apply them to Kubernetes and build an architecture for EKS with Spot Instances.

Solution architecture

The goals of this architecture are as follows:

  • Automatically scaling the worker nodes of Kubernetes clusters to meet the needs of the application
  • Leveraging Spot Instances to cost-optimize workloads on Kubernetes
  • Adapting Spot Instance best practices (like diversification) to EKS and Cluster Autoscaler

You achieve these goals via the following components:

Component | Role | Details | Deployment Method
Cluster Autoscaler | Scales EC2 instances automatically according to pods running in the cluster | Open Source | A DaemonSet via Helm on On-Demand Instances
EC2 Auto Scaling group | Provisions and maintains EC2 instance capacity | AWS | CloudFormation via eksctl
AWS Node Termination Handler | Detects EC2 Spot interruptions and automatically drains nodes | Open Source | A DaemonSet via Helm on Spot Instances

The architecture deploys the EKS worker nodes over three AZs, and leverages three Auto Scaling groups – two for Spot Instances, and one for On-Demand. The Kubernetes Cluster Autoscaler is deployed on On-Demand worker nodes, and the AWS Node Termination Handler is deployed on all worker nodes.

EKS worker node architecture

Additional detail on Kubernetes interaction with Auto Scaling group:

  • Cluster Autoscaler can be used to control scaling activities by changing the DesiredCapacity of the Auto Scaling group, and directly terminating instances. Auto Scaling groups can be used to find capacity, and automatically replace any instances that become unhealthy, or terminated through Spot Instance interruptions.
  • The Cluster Autoscaler can be provisioned as a Deployment of one pod to an instance of the On-Demand Auto Scaling group. The preceding diagram shows the pod in AZ1, however this may not necessarily be the case.
  • Each node group maps to a single Auto Scaling group. However, Cluster Autoscaler requires all instances within a node group to share the same number of vCPU and amount of RAM. To adhere to Spot Instance best practices and maximize diversification, you use multiple node groups. Each of these node groups is a mixed-instance Auto Scaling group with capacity-optimized Spot allocation strategy.

Kuberentes Node Groups

Autoscaling in Kubernetes Clusters

There are two common ways to scale Kubernetes clusters:

  1. Horizontal Pod Autoscaler (HPA) scales the pods in deployment or a replica set to meet the demand of the application. Scaling policies are based on observed CPU utilization or custom metrics.
  2. Cluster Autoscaler (CA) is a standalone program that adjusts the size of a Kubernetes cluster to meet the current needs. It increases the size of the cluster when there are pods that failed to schedule on any of the current nodes due to insufficient resources. It attempts to remove underutilized nodes, when its pods can run elsewhere.

When a pod cannot be scheduled due to lack of available resources, Cluster Autoscaler determines that the cluster must scale out and increases the size of the node group. When multiple node groups are used, Cluster Autoscaler chooses one based on the Expander configuration. Currently, the following strategies are supported: random, most-pods, least-waste, and priority.

You use the random placement strategy in this example for the Expander in Cluster Autoscaler. This is the default expander, and arbitrarily chooses a node group when the cluster must scale out. The random expander maximizes your ability to leverage multiple Spot capacity pools. However, you can evaluate the others and may find another more appropriate for your workload.

Atlassian Escalator:

An alternative to Cluster Autoscaler for batch workloads is Escalator. It is designed for large batch or job-based workloads that cannot be force-drained and moved when the cluster needs to scale down.

Auto Scaling group

Following best practices for Spot Instances means deploying a fleet across a diversified set of instance families, sizes, and AZs. An Auto Scaling group is one of the best mechanisms to accomplish this. Auto Scaling groups automatically replace Spot Instances that have been terminated due to a Spot interruption with an instance from another capacity pool.

Auto Scaling groups support launching capacity from multiple instances types, and using multiple purchase options. For the example in this blog post, I’m maintaining a 1:4 vCPU to memory ratio for all instances chosen. For your application, there may be a different set of requirements. The instances chosen for the two Auto Scaling groups are below:

  • 4vCPU / 16GB ASG: m5.xlarge, m5d.xlarge, m5n.xlarge, m5dn.xlarge, m5a.xlarge, m4.xlarge
  • 8vCPU / 32GB ASG: m5.2xlarge, m5d.2xlarge, m5n.2xlarge, m5dn.2xlarge, m5a.2xlarge, m4.2xlarge

In this example I use a total of 12 different instance types and three different AZs, for a total of (12 * 3 = 36) 36 different Spot capacity pools. The Auto Scaling group chooses which instance types to deploy based on the Spot allocation strategy. To minimize the chance of Spot Instance interruptions, you use the capacity-optimized allocation strategy. The capacity-optimized strategy automatically launches Spot Instances into the most available pools by looking at real-time capacity data and predicting which are the most available.

capacity-optimized Spot allocation strategy

Example of capacity-optimized Spot allocation strategy.

For Cluster Autoscaler, other cluster administration/management pods, and stateful workloads that run on EKS worker nodes, you create a third Auto Scaling group using On-Demand Instances. This ensures that Cluster Autoscaler is not impacted by Spot Instance interruptions. In Kubernetes, labels and nodeSelectors can be used to control where pods are placed. You use the nodeSelector to place Cluster Autoscaler on an instance in the On-Demand Auto Scaling group.

Note: The Auto Scaling groups continually work to balance the number of instances in each AZ they are deployed over. This may cause worker nodes to be terminated while the ASG is scaling in the number of instances within an AZ. You can disable this functionality by suspending the AZRebalance process, but this can result in capacity becoming unbalanced across AZs. Another option is running a tool to drain instances upon ASG scale-in, such as the EKS Node Drainer. The code provides an AWS Lambda function that integrates as an Amazon EC2 Auto Scaling Lifecycle Hook. When called, the Lambda function calls the Kubernetes API to cordon the node and evict all evictable pods from the node being terminated. It then waits until all pods are evicted before the Auto Scaling group continues to terminate the EC2 instance.
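
The cordon-and-evict step that Lambda function performs is roughly equivalent to running the following kubectl commands against the node being terminated (the node name is illustrative):

kubectl cordon ip-192-168-10-20.us-west-2.compute.internal
kubectl drain ip-192-168-10-20.us-west-2.compute.internal --ignore-daemonsets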

Spot Instance Interruption Handling

To mitigate the impact of potential Spot Instance interruptions, leverage the ‘node termination handler’. The DaemonSet deploys a pod on each Spot Instance to detect the Spot Instance interruption notification, so that it can gracefully terminate any pods running on that node, drain the node from load balancers, and allow the Kubernetes scheduler to reschedule the evicted pods elsewhere in the cluster.

The workflow can be summarized as:

  • Identify that a Spot Instance is about to be interrupted in two minutes.
  • Use the two-minute notification window to gracefully prepare the node for termination.
  • Taint the node and cordon it off to prevent new pods from being placed on it.
  • Drain connections on the running pods.

Consequently:

  • Controllers that manage K8s objects like Deployments and ReplicaSets will understand that one or more pods are not available and create new replicas.
  • Cluster Autoscaler and AWS Auto Scaling group will re-provision capacity as needed.
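
Under the hood, the termination handler polls the EC2 instance metadata service for the interruption notice. You can inspect the same endpoint yourself from any Spot worker node:

curl -s http://169.254.169.254/latest/meta-data/spot/instance-action

The endpoint returns a 404 until an interruption is scheduled; once one is, it returns a small JSON document with the action and the time at which it takes effect.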

Walkthrough

Getting started (launch EKS)

First, you use eksctl to create an EKS cluster with the name spotcluster-eksctl in combination with a managed node group. The managed node group will have three On-Demand t3.medium nodes and it will bootstrap with the labels lifecycle=OnDemand and intent=control-apps. Be sure to replace <YOUR REGION> with the region you’ll be launching your cluster into.

eksctl create cluster --version=1.15 --name=spotcluster-eksctl --node-private-networking --managed --nodes=3 --alb-ingress-access --region=<YOUR REGION> --node-type t3.medium --node-labels="lifecycle=OnDemand" --asg-access

This takes approximately 15 minutes. Once cluster creation is complete, test the node connectivity:

kubectl get nodes

Provision the worker nodes

You use eksctl create nodegroup and eksctl configuration files to add the new nodes to the cluster. First, create the configuration file spot_nodegroups.yml. Then, paste the code and replace <YOUR REGION> with the region you launched your EKS cluster in.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
    name: spotcluster-eksctl
    region: <YOUR REGION>
nodeGroups:
    - name: ng-4vcpu-16gb-spot
      minSize: 0
      maxSize: 5
      desiredCapacity: 1
      instancesDistribution:
        instanceTypes: ["m5.xlarge", "m5n.xlarge", "m5d.xlarge", "m5dn.xlarge","m5a.xlarge", "m4.xlarge"] 
        onDemandBaseCapacity: 0
        onDemandPercentageAboveBaseCapacity: 0
        spotAllocationStrategy: capacity-optimized
      labels:
        lifecycle: Ec2Spot
        intent: apps
        aws.amazon.com/spot: "true"
      tags:
        k8s.io/cluster-autoscaler/node-template/label/lifecycle: Ec2Spot
        k8s.io/cluster-autoscaler/node-template/label/intent: apps
      iam:
        withAddonPolicies:
          autoScaler: true
          albIngress: true
    - name: ng-8vcpu-32gb-spot
      minSize: 0
      maxSize: 5
      desiredCapacity: 1
      instancesDistribution:
        instanceTypes: ["m5.2xlarge", "m5n.2xlarge", "m5d.2xlarge", "m5dn.2xlarge","m5a.2xlarge", "m4.2xlarge"] 
        onDemandBaseCapacity: 0
        onDemandPercentageAboveBaseCapacity: 0
        spotAllocationStrategy: capacity-optimized
      labels:
        lifecycle: Ec2Spot
        intent: apps
        aws.amazon.com/spot: "true"
      tags:
        k8s.io/cluster-autoscaler/node-template/label/lifecycle: Ec2Spot
        k8s.io/cluster-autoscaler/node-template/label/intent: apps
      iam:
        withAddonPolicies:
          autoScaler: true
          albIngress: true

This configuration file adds two diversified Spot Instance node groups with 4vCPU/16GB and 8vCPU/32GB instance types. These node groups use the capacity-optimized Spot allocation strategy as described above. Last, you label all nodes created with the instance lifecycle “Ec2Spot” and later use nodeSelectors to guide your application front-end to your Spot Instance nodes. To create both node groups, run:

eksctl create nodegroup -f spot_nodegroups.yml

This takes approximately three minutes. Once done, confirm these nodes were added to the cluster:

kubectl get nodes --show-labels --selector=lifecycle=Ec2Spot

Install the Node Termination Handler

You can install the Node Termination Handler by applying the .yaml manifest from the official GitHub repository.

kubectl apply -f https://github.com/aws/aws-node-termination-handler/releases/download/v1.3.1/all-resources.yaml

This installs the Node Termination Handler to both Spot Instance and On-Demand nodes, which is helpful because the handler responds to both EC2 maintenance events and Spot Instance interruptions. However, if you are interested in limiting deployment to just Spot Instance nodes, the site has additional instructions to accomplish this.

Verify the Node Termination Handler is running:

kubectl get daemonsets --all-namespaces

Deploy the Cluster Autoscaler

For additional detail, see the EKS page here. Export the Cluster Autoscaler into a configuration file:

curl -LO https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

Open the file created and edit the cluster-autoscaler container command to replace <YOUR CLUSTER NAME> with your cluster’s name, and add the following options.

--balance-similar-node-groups
--skip-nodes-with-system-pods=false

You also need to change the expander configuration. Search for - --expander= and replace least-waste with random

Example:

    spec:
        containers:
        - command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=random
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<YOUR CLUSTER NAME>
            - --balance-similar-node-groups
            - --skip-nodes-with-system-pods=false

Save the file and then deploy the Cluster Autoscaler:

kubectl apply -f cluster-autoscaler-autodiscover.yaml

Next, add the cluster-autoscaler.kubernetes.io/safe-to-evict annotation to the deployment with the following command:

kubectl -n kube-system annotate deployment.apps/cluster-autoscaler cluster-autoscaler.kubernetes.io/safe-to-evict="false"

Open the Cluster Autoscaler releases page in a web browser and find the latest Cluster Autoscaler version that matches your cluster’s Kubernetes major and minor version. For example, if your cluster’s Kubernetes version is 1.16, find the latest Cluster Autoscaler release that begins with 1.16.

Set the Cluster Autoscaler image tag to this version using the following command, replacing 1.15.n with your own value. You can replace us with asia or eu:

kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.15.n

To view the Cluster Autoscaler logs, use the following command:

kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler

Deploy the sample application

Create a new file web-app.yaml, paste the following specification into it and save the file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-stateless
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        service: nginx
        app: nginx
    spec:
      containers:
      - image: nginx
        name: web-stateless
        resources:
          limits:
            cpu: 1000m
            memory: 1024Mi
          requests:
            cpu: 1000m
            memory: 1024Mi
      nodeSelector:    
        lifecycle: Ec2Spot
--- 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-stateful
spec:
  replicas: 2
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        service: redis
        app: redis
    spec:
      containers:
      - image: redis:3.2-alpine
        name: web-stateful
        resources:
          limits:
            cpu: 1000m
            memory: 1024Mi
          requests:
            cpu: 1000m
            memory: 1024Mi
      nodeSelector:
        lifecycle: OnDemand

This deploys three replicas, which land on one of the Spot Instance node groups because the nodeSelector chooses lifecycle: Ec2Spot. The “web-stateful” pods are not fault-tolerant, so they are not appropriate to deploy on Spot Instances. Instead, you use nodeSelector again and choose lifecycle: OnDemand. By guiding fault-tolerant pods to Spot Instance nodes and stateful pods to On-Demand nodes, you can even use this approach to support multi-tenant clusters.

To deploy the application:

kubectl apply -f web-app.yaml

Confirm that both deployments are running:

kubectl get deployment/web-stateless
kubectl get deployment/web-stateful

Now, scale out the stateless application:

kubectl scale --replicas=30 deployment/web-stateless

Check to see that there are pending pods. Wait approximately 5 minutes, then check again to confirm the pending pods have been scheduled:

kubectl get pods

Clean-Up

Remove the AWS Node Termination Handler:

kubectl delete daemonset aws-node-termination-handler -n kube-system

Remove the two Spot node groups (EC2 Auto Scaling groups) that you deployed in the tutorial.

eksctl delete nodegroup ng-4vcpu-16gb-spot --cluster spotcluster-eksctl
eksctl delete nodegroup ng-8vcpu-32gb-spot --cluster spotcluster-eksctl

If you used a new cluster and not your existing cluster, delete the EKS cluster.

eksctl confirms the deletion of the cluster’s CloudFormation stack immediately but the deletion could take up to 15 minutes. You can optionally track it in the CloudFormation Console.

eksctl delete cluster --name spotcluster-eksctl

Conclusion

By following best practices, Kubernetes workloads can be deployed onto Spot Instances, achieving both resilience and cost optimization. Instance and Availability Zone flexibility are the cornerstones of pulling from multiple capacity pools and obtaining the scale your application requires. In addition, there are pre-built tools to handle Spot Instance interruptions, if they do occur. EKS makes this even easier by reducing operational overhead through offering a highly-available managed control-plane and managed node groups. You’re now ready to begin integrating Spot Instances into your Kubernetes clusters to reduce workload cost, and if needed, achieve massive scale.

Proactively Monitoring System Performance on Amazon Lightsail Instances

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/proactively-monitoring-system-performance-on-amazon-lightsail-instances/

This post is contributed by Mike Coleman, AWS Senior Developer Advocate – Lightsail

I commonly hear from customers that they want to be able to proactively identify issues that could affect system performance before they become a problem. For instance, you may want to be alerted before an instance becomes unresponsive to a burst in traffic because it has exhausted all of its CPU burst capacity. Burst capacity can be consumed either by a workload that needs to operate in the burstable zone for long periods of time, or by unexpected CPU consumption from system processes. In either case, you’d want to be notified so you could take corrective action, such as moving to a larger instance or stopping errant processes. To that end, today Amazon Lightsail launched a new feature allowing you to set up custom alarms to be notified when your burst capacity is running low.

Amazon Lightsail instances use burstable CPUs. These CPUs operate in two different zones: the sustainable zone and the burstable zone. The sustainable zone is based on the CPU’s baseline performance. As long as the CPU utilization stays below this baseline, the system will perform with no impact to system responsiveness. The burstable zone is entered whenever the CPU utilization climbs above the baseline. The instance can only operate in this zone for a finite amount of time (more on that below) before system performance is negatively impacted.

Earlier this year, we announced the release of resource monitoring, alarms, and notifications for Amazon Lightsail instances. This feature introduced a graph that shows when an instance operates in the CPU’s burstable zone. However, this was only a partial solution since there was no easy way for you to know how long your system could effectively operate in the burstable zone.

With today’s new feature, Lightsail has augmented that functionality by allowing you to see how much burstable capacity is available to your system at any given time. Additionally, you can create alarms and be notified when that burstable capacity has dropped to a critical level. This allows users to be proactively notified that there is a potential performance issue developing.

The remainder of this blog runs through how to configure a burstable capacity alert so you can prevent system performance issues before they impact your users.

Burstable capacity overview

Before I begin, it’s important to understand how burst capacity minutes are calculated. One minute of the CPU running at 100% is one minute of CPU burst capacity. By the same token, one minute of the CPU running at 50% is 30 seconds of CPU burst capacity. So, if a system has 72 minutes of available burst capacity and it’s running at 50% CPU, it can run in that state for 144 minutes before system performance is impacted.

CPU burstable capacity can be displayed two ways: percentage available and minutes available, with the default being percentage. CPU burst capacity percentage is a simple calculation of dividing the available burst capacity (minutes) by the maximum available burst capacity (minutes). For example, an instance with 36 minutes of burst capacity left out of a maximum available limit of 72 minutes would be at 50%.
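
Expressed as a short Python sketch, using the numbers from the examples above rather than values pulled from a real instance:

burst_minutes_available = 36   # burst capacity remaining, in minutes
burst_minutes_maximum = 72     # maximum burst capacity for this instance size
cpu_utilization = 0.50         # instance currently running at 50% CPU

# Percentage of burst capacity remaining
print(burst_minutes_available / burst_minutes_maximum * 100)   # 50.0

# Minutes the instance can keep running at this utilization before capacity runs out
print(burst_minutes_available / cpu_utilization)               # 72.0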

Configuring a Burstable Capacity Alert

Let’s take a look at how to configure an alert for CPU burst capacity (percentage).

Note: If you’re not familiar with the general concepts around how to configure alerts and notifications in Lightsail, you can read this blog post.

  1. From the Lightsail home page click on the instance you wish to create the alert for.
  2. From the horizontal menu, choose Metrics.
    metrics tab of Lightsail console
  3. You should now see the graphs for both CPU utilization and burst capacity.

Notice how the CPU utilization graph shows the sustainable and the burstable zones.

The burst capacity graph shows you the current percentage (or minutes) of burst capacity you have remaining before system performance is impacted.

In the following graphs, you can see that the CPU is operating at about 15% and has just over 90% of remaining burst capacity.
cpu graphs

  4. Scroll down and click CPU burst capacity (percentage).
    cpu alarm option
  5. Click +Add alarm.
  6. For my alarm, I choose to be notified whenever the percentage of available CPU burst capacity drops below 25% for 10 consecutive minutes.
    cpu percentage alarm option
  7. At this point, you can enable notifications via email or SMS message (instructions on how to do this can be found in the blog post I linked earlier). Regardless of whether you choose to enable notifications, you’ll receive a banner in the Lightsail console whenever your alarm threshold is breached.
  8. Click Create to create your alarm.

Lightsail is now configured with a single alarm. I consider it a best practice to configure two alarms. The first is a warning level, and the second is the critical level. This ensures that you have additional time to respond to developing problems. If you’d like to create a second alarm, just repeat the previous steps.

Conclusion

In this blog, I covered the concepts behind Lightsail’s burstable CPUs, and how you create alarms to respond before your system runs out of burst capacity. If you see that your system frequently runs out of burst capacity, investigate the processes running on your instance to find any that are consuming extra CPU, or consider upgrading your instance to a larger plan. Read more about this latest feature here, and check out our Getting Started page for more tutorials and resources.

Running a high-performance SAS Grid Manager cluster on AWS with Amazon FSx for Lustre

Post Syndicated from Neelam original https://aws.amazon.com/blogs/big-data/running-a-high-performance-sas-grid-manager-cluster-on-aws-with-amazon-fsx-for-lustre/

SAS® is a software provider of data science and analytics used by enterprises and government organizations. SAS Grid is a highly available, fast processing analytics platform that offers centralized management that balances workloads across different compute nodes. This application suite is capable of data management, visual analytics, governance and security, forecasting and text mining, statistical analysis, and environment management. SAS and AWS recently performed testing using the Amazon FSx for Lustre shared file system to determine how well a standard workload performs on AWS using SAS Grid Manager. For more information about the results, see the whitepaper Accelerating SAS Using High-Performing File Systems on Amazon Web Services.

In this post, we take a look at an approach to deploy underlying AWS infrastructure to run SAS Grid with FSx for Lustre that you can also apply to similar applications with demanding I/O requirements.

System design overview

Running high-performance workloads that use throughput heavily, with sensitivity to network latency, requires approaches outside of typical applications. AWS generally recommends that applications span multiple Availability Zones for high availability. In the case of latency sensitivity, high throughput applications traffic should be local for optimal performance. To maximize throughput, you can do the following:

  • Run in a virtual private cloud (VPC), using instance types that support enhanced networking
  • Run instances in the same Availability Zone
  • Run instances within a placement group

The following diagram illustrates the SAS Grid with FSx for Lustre architecture on AWS.

SAS Grid architecture consists of mid-tier nodes, metadata servers, and Grid compute nodes. The mid-tier nodes are responsible for running the Platform Web Services (PWS) and Load Sharing Facility (LSF) components. These components dispatch jobs submitted and return the status of each job.

To effectively run PWS and LSF on mid-tier nodes, you need Amazon Elastic Compute Cloud (Amazon EC2) instances with high memory. For this use case, the r5 instance family would meet this requirement.

Metadata servers contain the metadata repository that stores the metadata definitions of all SAS Grid manager products, which the r5 instance family can also serve effectively. We recommend either meeting or exceeding the recommended memory requirement of 24 GB of RAM or 8 GB per physical core (whichever is larger). Metadata servers don’t need compute-intensive resources or high I/O bandwidth; therefore, you can choose the r5 instance family for a balance of price and performance.

SAS Grid nodes are responsible for executing the jobs received by the grid, and EC2 instances capable of handling these jobs depend on the size, complexity, and volume of the work the grid performs. To meet the minimum requirements of SAS Grid workloads, we recommend having a minimum of 8 GB of physical RAM per core and a robust I/O throughput of 100–125 MB/second per physical core. For this use case, EC2 instance families of m5n and r5n suffice in meeting the RAM and throughput requirements. You can host SASDATA, SASWORK, and UTILLOC libraries in a shared file system. If you choose to offload SASWORK to instance storage, the i3en instance family meets this need because it supports instance storage over 1.2 TB. In the next section, we take a look at how throughput testing was performed to arrive at the EC2 instance recommendations with FSx for Lustre.

Steps to maximize storage I/O performance

SAS Grid requires a shared file system, and we wanted to benchmark the performance of FSx for Lustre as the chosen shared file system against various EC2 instance families that meet the minimum requirements of 8 GB of physical RAM per core and 100–125 MB/second throughput per physical core.

FSx for Lustre is a fully managed file storage service designed for applications that require fast storage. As a POSIX-compliant file system, you can use FSx for Lustre with current Linux-based applications without having to make any changes. Although FSx for Lustre offers a choice between scratch and persistent type file systems, we recommend for SAS Grid to use persistent type FSx for Lustre file system because you need to store the SASWORK, SASDATA, and UTILLOC data and libraries for longer periods with high availability and data durability. To meet I/O throughput, make sure to select the appropriate storage capacity for throughput per unit of storage to achieve the desired range of 100–125 MB/second.
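
For reference, a persistent file system sized like the one used for the testing later in this post could be created from the AWS CLI along the following lines (the subnet ID is a placeholder, and 200 MB/s per TiB is one of the available throughput tiers):

aws fsx create-file-system \
    --file-system-type LUSTRE \
    --storage-capacity 100800 \
    --subnet-ids subnet-0123456789abcdef0 \
    --lustre-configuration DeploymentType=PERSISTENT_1,PerUnitStorageThroughput=200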

After setting up the file system, we recommend mounting FSx for Lustre with the flock mount option. The following code example is a mount command and mount option for FSx for Lustre:

$ sudo mount -t lustre -o noatime,flock fs-0123456789abcd.fsx.us-west-2.amazonaws.com@tcp:/za3atbmv /fsx
$ mount -t lustre
fs-0123456789abcd.fsx.us-west-2.amazonaws.com@tcp:/za3atbmv on /fsx type lustre (rw,noatime,seclabel,flock,lazystatfs)

Throughput testing and results

To select the best-placed EC2 instances for running SAS Grid with FSx for Lustre, we ran a series of highly parallel network throughput tests from individual EC2 instances against a 100.8 TiB persistent file system that had an aggregate throughput capacity of 19.688 GB/second. We ran these tests in multiple regions using multiple EC2 instance families (c5, c5n, i3, i3en, m5, m5a, m5ad, m5n, m5dn, r5, r5a, r5ad, r5n, and r5dn). The tests ran for 3 hours for each instance, and the DataWriteBytes metric of the file system was recorded every 1 minute. Only one instance was accessing the file system at a time, and the p99.9 results were captured. The metrics were consistent across all four Regions.

We observed that the i3en, m5n, m5dn, r5n, and r5dn EC2 instance families meet or exceed the minimum network performance and memory recommendations. For more information about the performance results, see the whitepaper Accelerating SAS Using High-Performing File Systems on Amazon Web Services. The i3 instance family is just shy of meeting the minimum network performance. If you want to use the instance storage for SASWORK and UTILLOC libraries, you can consider i3en instances.

M5n and r5n are a good blend of price and performance, and we recommend the m5n instance family for SAS Grid nodes. However, if your workload is memory bound, consider using r5n instances, which provide higher memory per physical core for a higher price point than m5n instances.

We also ran rhel_iotest.sh, which is available from the SAS technical support samples tool repository (SASTSST), using the same FSx for Lustre configuration as mentioned earlier. The following table shows the read and write performance per physical core for a variety of instances sizes in the m5n and r5n families.

Variable network performance peak per physical core:

Instance Type    Read (MB/second)    Write (MB/second)
m5n.large        850.20              357.07
m5n.xlarge       519.46              386.25
m5n.2xlarge      283.01              446.84
m5n.4xlarge      202.89              376.57
m5n.8xlarge      154.98              297.71
r5n.large        906.88              429.93
r5n.xlarge       488.36              455.76
r5n.2xlarge      256.96              471.65
r5n.4xlarge      203.31              390.03
r5n.8xlarge      149.63              299.45

To take advantage of the elasticity, scalability, and flexibility of the cloud, we recommend spreading the SAS Grid and compute workload over a larger number of smaller instances versus using a smaller number of larger instances. For mid-tier, use a minimum of two instances, and for metadata servers, we recommend a minimum of three instances for the SAS Grid architecture.

Conclusions

Before the FSx for Lustre file system, you either had to use Amazon Elastic File System (Amazon EFS) or a third-party file system from AWS Marketplace and Amazon Elastic Block Store (Amazon EBS) for the SASWORK, SASDATA, and UTILLOC libraries and storage data. Each storage option came with its own settings and limitations, which caused loss in performance. With FSx for Lustre, you have a single solution for all SAS Grid storage requirements, which allows you to focus on running your business instead of maintaining a file system. We recommend that SAS admins deploy SAS Grid compute nodes on m5n and r5n instances when accessing the FSx for Lustre file system.

If you have questions or suggestions, please leave a comment.

Simplifying application orchestration with AWS Step Functions and AWS SAM

Post Syndicated from Rob Sutter original https://aws.amazon.com/blogs/compute/simplifying-application-orchestration-with-aws-step-functions-and-aws-sam/

Modern software applications consist of multiple components distributed across many services. AWS Step Functions lets you define serverless workflows to orchestrate these services so you can build and update your apps quickly. Step Functions manages its own state and retries when there are errors, enabling you to focus on your business logic. Now, with support for Step Functions in the AWS Serverless Application Model (AWS SAM), you can easily create, deploy, and maintain your serverless applications.

The most recent AWS SAM update introduces the AWS::Serverless::StateMachine component that simplifies the definition of workflows in your application. Because the StateMachine is an AWS SAM component, you can apply AWS SAM policy templates to scope the permissions of your workflows. AWS SAM also provides configuration options for invoking your workflows based on events or a schedule that you specify.

Defining a simple state machine

The simplest way to begin orchestrating your applications with Step Functions and AWS SAM is to install the latest version of the AWS SAM CLI.

Creating a state machine with AWS SAM CLI

To create a state machine with the AWS SAM CLI, perform the following steps:

  1. From a command line prompt, enter sam init
  2. Choose AWS Quick Start Templates
  3. Select nodejs12.x as the runtime
  4. Provide a project name
  5. Choose the Hello World Example quick start application template

Screen capture showing the first execution of sam init selecting the Hello World Example quick start application template

The AWS SAM CLI downloads the quick start application template and creates a new directory with sample code. Change into the sam-app directory and replace the contents of template.yaml with the following code:


# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  SimpleStateMachine:
    Type: AWS::Serverless::StateMachine
    Properties:
      Definition:
        StartAt: Single State
        States:
          Single State:
            Type: Pass
            End: true
      Policies:
        - CloudWatchPutMetricPolicy: {}

This is a simple yet complete template that defines a Step Functions Standard Workflow with a single Pass state. The Transform: AWS::Serverless-2016-10-31 line indicates that this is an AWS SAM template and not a basic AWS CloudFormation template. This enables the AWS::Serverless components and policy templates such as CloudWatchPutMetricPolicy on the last line, which allows you to publish metrics to Amazon CloudWatch.

Deploying a state machine with AWS SAM CLI

To deploy your state machine with the AWS SAM CLI:

  1. Save your template.yaml file
  2. Delete any function code in the directory, such as hello-world
  3. Enter sam deploy --guided into the terminal and follow the prompts
  4. Enter simple-state-machine as the stack name
  5. Select the defaults for the remaining prompts

Screen capture showing the first execution of sam deploy --guided

For additional information on visualizing, executing, and monitoring your workflow, see the tutorial Create a Step Functions State Machine Using AWS SAM.

Refining your workflow

The StateMachine component not only simplifies creation of your workflows, but also provides powerful control over how your workflow executes. You can compose complex workflows from all available Amazon States Language (ASL) states. Definition substitution allows you to reference resources. Finally, you can manage access permissions using AWS Identity and Access Management (IAM) policies and roles.

Service integrations

Step Functions service integrations allow you to call other AWS services directly from Task states. The following example shows you how to use a service integration to store information about a workflow execution directly in an Amazon DynamoDB table. Replace the Resources section of your template.yaml file with the following code:


Resources:
  SAMTable:
    Type: AWS::Serverless::SimpleTable

  SimpleStateMachine:
    Type: AWS::Serverless::StateMachine
    Properties:
      Definition:
        StartAt: FirstState
        States:
          FirstState:
            Type: Pass
            Next: Write to DynamoDB
          Write to DynamoDB:
            Type: Task
            Resource: arn:aws:states:::dynamodb:putItem
            Parameters:
              TableName: !Ref SAMTable
              Item:
                id:
                  S.$: $$.Execution.Id
            ResultPath: $.DynamoDB
            End: true
      Policies:
        - DynamoDBWritePolicy: 
            TableName: !Ref SAMTable

The AWS::Serverless::SimpleTable is an AWS SAM component that creates a DynamoDB table with on-demand capacity and reasonable defaults. To learn more, see the SimpleTable component documentation.

The Write to DynamoDB state is a Task with a service integration to the DynamoDB PutItem API call. The above code stores a single item with a field id containing the execution ID, taken from the context object of the current workflow execution.

Notice that DynamoDBWritePolicy replaces the CloudWatchPutMetricPolicy policy from the previous workflow. This is another AWS SAM policy template that provides write access only to a named DynamoDB table.
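
After deploying the updated template with sam deploy, you can exercise the integration from the AWS CLI. The values below are placeholders for the state machine ARN and table name created by your stack:

aws stepfunctions start-execution --state-machine-arn <your-state-machine-arn>
aws dynamodb scan --table-name <your-table-name>

The scan should show one item per execution, each with an id equal to the execution ID.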

Definition substitutions

AWS SAM supports definition substitutions when defining a StateMachine resource. Definition substitutions work like template string substitution. First, you specify a Definition or DefinitionUri property of the StateMachine that contains variables specified in ${dollar_sign_brace} notation. Then you provide values for those variables as a map via the DefinitionSubstitution property.

The AWS SAM CLI provides a quick start template that demonstrates definition substitutions. To create a workflow using this template, perform the following steps:

  1. From a command line prompt in an empty directory, enter sam init
  2. Choose AWS Quick Start Templates
  3. Select your preferred runtime
  4. Provide a project name
  5. Choose the Step Functions Sample App (Stock Trader) quick start application template

Screen capture showing the execution of sam init selecting the Step Functions Sample App (Stock Trader) quick start application template

Change into the newly created directory and open the template.yaml file with your preferred text editor. Note that the state machine definition is now supplied as a file path through the DefinitionUri property, rather than as an inline Definition string as in your previous template. The DefinitionSubstitutions property is a map of key-value pairs. These pairs should match variables in the statemachine/stockTrader.asl.json file referenced under DefinitionUri.


      DefinitionUri: statemachine/stockTrader.asl.json
      DefinitionSubstitutions:
        StockCheckerFunctionArn: !GetAtt StockCheckerFunction.Arn
        StockSellerFunctionArn: !GetAtt StockSellerFunction.Arn
        StockBuyerFunctionArn: !GetAtt StockBuyerFunction.Arn
        DDBPutItem: !Sub arn:${AWS::Partition}:states:::dynamodb:putItem
        DDBTable: !Ref TransactionTable

Open the statemachine/stockTrader.asl.json file and look for the first state, Check Stock Value. The Resource property for this state is not a Lambda function ARN, but a replacement expression, “${StockCheckerFunctionArn}”. You see from DefinitionSubstitutions that this maps to the ARN of the StockCheckerFunction resource, an AWS::Serverless::Function also defined in template.yaml. AWS SAM CLI transforms these components into a complete, standard CloudFormation template at deploy time.

Separating the state machine definition into its own file allows you to benefit from integration with the AWS Toolkit for Visual Studio Code. With your state machine in a separate file, you can make changes and visualize your workflow within the IDE while still referencing it from your AWS SAM template.

Screen capture of a rendering of the AWS Step Functions workflow from the Step Functions Sample App (Stock Trader) quick start application template

Managing permissions and access

AWS SAM support allows you to apply policy templates to your state machines. AWS SAM policy templates provide pre-defined IAM policies for common scenarios. These templates appropriately limit the scope of permissions for your state machine while simultaneously simplifying your AWS SAM templates. You can also apply AWS managed policies to your state machines.

If AWS SAM policy templates and AWS managed policies do not suit your needs, you can also create inline policies or attach an IAM role. This allows you to tailor the permissions of your state machine to your exact use case.

Additional configuration

AWS SAM provides additional simplification for configuring event sources and logging.

Event sources

Event sources determine what events can start execution of your workflow. These sources can include HTTP requests to Amazon API Gateway REST APIs and Amazon EventBridge rules. For example, the below Events block creates an API Gateway REST API. Whenever that API receives an HTTP POST request to the path /request, it starts an execution of the state machine:


      Events:
        HttpRequest:
          Type: Api
          Properties:
            Method: POST
            Path: /request

Event sources can also start executions of your workflow on a schedule that you specify. The quick start template you created above provides the following example. When this event source is enabled, the workflow executes once every hour:


      Events:
        HourlyTradingSchedule:
          Type: Schedule 
          Properties:
            Enabled: False
            Schedule: "rate(1 hour)"

Architecture diagram for the Step Functions Sample App (Stock Trader) quick start application template

To learn more about schedules as event sources, see the AWS SAM documentation on GitHub.

Logging

Both Standard Workflows and Express Workflows support logging execution history to CloudWatch Logs. To enable logging for your workflow, you must define an AWS::Logs::LogGroup and add a Logging property to your StateMachine definition. You also must attach an IAM policy or role that provides sufficient permissions to create and publish logs. The following code shows how to add logging to an existing workflow:


Resources:
  SAMLogs:
    Type: AWS::Logs::LogGroup

  SimpleStateMachine:
    Type: AWS::Serverless::StateMachine
    Properties:
      Definition: {…}
      Logging:
        Destinations:
          - CloudWatchLogsLogGroup: 
              LogGroupArn: !GetAtt SAMLogs.Arn
        IncludeExecutionData: true
        Level: ALL
      Policies:
        - CloudWatchLogsFullAccess
      Type: EXPRESS

Conclusion

Step Functions workflows simplify orchestration of distributed services and accelerate application development. AWS SAM support for Step Functions compounds those benefits by helping you build, deploy, and monitor your workflows more quickly and more precisely. In this post, you learned how to use AWS SAM to define simple workflows and more complex workflows with service integrations. You also learned how to manage security permissions, event sources, and logging for your Step Functions workflows.

To learn more about building with Step Functions, see the AWS Step Functions playlist on the AWS Serverless YouTube channel. To learn more about orchestrating modern, event-driven applications with Step Functions, see the App 2025 playlist.

Now go build!

Enhancing site security with new Lightsail firewall features

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/enhancing-site-security-with-new-lightsail-firewall-features/

This post is contributed by Mike Coleman, AWS Senior Developer Advocate – Lightsail

Amazon Lightsail provides an easy way to get started with AWS for many customers. The service balances ease of use, security, and flexibility. The Lightsail firewall now offers additional features to help customers secure their Lightsail instances. This update offers three new capabilities:

  • The ability to specify source IP addresses for firewall rules
  • Explicitly allowing or disallowing remote access to instances via Lightsail’s web-based console
  • Support for PING

This blog explores each of these new features in detail, starting with source IP addresses.

Before this update, any open ports in the Lightsail firewall were open to the internet. In many cases, this is a reasonable approach. For example, for new WordPress servers, you likely need broad public access.

However, in some cases you want to restrict access to an instance. If you are staging a new website and it’s not ready for publication, you may want to limit access. One way to ensure that only certain people can visit the site is to only allow certain IP addresses to connect.

Another common use case is limiting remote access to an instance. With the new changes to the Lightsail firewall, you would be able to limit SSH or RDP access by source IP address. Additionally, you can now enable or disable remote access via Lightsail’s built-in web client.

Access can be restricted from one or more IP addresses (for example, the IP address for your home computer) or a continuous range of IP addresses (such as the address range for your corporate network).

Next, I review how you configure these options to restrict remote access via SSH to a single source IP address.

Finding your IP address

Most computers do not have an internet routable IP address assigned. Internet routable IP addresses are scarce and usually assigned to your internet gateway device. The devices on the network are assigned private IP addresses. To communicate between the private IP network and the internet, the network router typically uses network address translation (NAT).

This tutorial assumes you are using NAT. This means the IP address used to restrict SSH access is the internet routable address of your network gateway device (usually your wireless router). Consequently, this limits access to all devices on the network behind this IP address.

There are many ways to find your internet routable IP address. You can log into your network gateway device and find it there (consult your device’s user manual for more details). Alternatively, use one of several public services to determine your IP address – search online for “what is my IP” to find several options.
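
If you prefer the command line, one quick option is the AWS-hosted IP check endpoint:

curl https://checkip.amazonaws.com

The response is the internet routable IP address your traffic appears to come from, which is the value to use in the firewall rule.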

Restricting SSH access to a single IP address

  1. Start by creating a new Lightsail instance – you can select any blueprint.
  2. Once the instance state shows Running, choose the name of the instance to open the Instance details page.
    firewall test instance
  3. Choose Networking from the menu.
    networking tab
  4. Scroll down to find the current firewall settings. Under Allow connections from, it lists Any IP address for all of the applications. To change this, choose the edit icon for the SSH rule.
    IP address
  5. Check the box next to Restrict to IP address and enter your internet routable IP address under Source IP or range.
    restrict IP address
    Note: The next section shows how to restrict access from Lightsail’s browser-based SSH client. Currently, the Allow Lightsail browser SSH box is checked.
  6. Choose Save.

Now, SSH into your Lightsail instance from your local machine. You can learn more about how to connect to your Lightsail instance using SSH from our documentation.

You should be able to connect to your instance successfully. Next, you test the connection from a different IP address. You do this by restricting a different IP address, and attempting to connect again:

  1. Edit the SSH firewall rule again, following the instructions above. This time, under Source IP or range, enter 192.168.2.150.
  2. Choose Save.

Attempt to connect to your instance once more. The connection fails because your IP address does not match an IP address in the range.

Restricting access from the Lightsail browser-based SSH client

The browser-based SSH client makes it easy to access instances without needing to manage SSH keys locally. However, there may be cases where you must disable browser-based access.

To do this:

  1. Navigate to the firewall rules for the instance you created earlier.
  2. Choose the edit icon for the SSH rule.
    edit IP address
  3. Uncheck the box next to Allow Lightsail browser SSH. Choose Save.
  4. From the menu, choose Connect, then choose Connect using SSH. The browser window opens, but you are not connected to the instance.

PING Support

There is now support for PING, a command line utility used to check if a computer is reachable over the network. PING sends a packet to a remote computer, which sends a simple response back. Before this release, you could not PING Lightsail instances.

To activate this feature, add a firewall rule:

  1. Navigate to the networking page for your instance.
    add rule to firewall
  2. Under the firewall section, choose +Add rule.
  3. From the application list, choose PING (ICMP). Choose Save.
  4. From a terminal window on your local machine, send a ping command to your Lightsail instance’s IP address. You can find the IP address from the Connect tab of the instance details page or from instance card on the Lightsail home page.
ping -c 5 192.168.2.143

You see a response similar to:

ping -c 5 192.168.2.143
PING 192.168.2.143 (192.168.2.143): 56 data bytes
64 bytes from 192.168.2.143: icmp_seq=0 ttl=54 time=19.383 ms
64 bytes from 192.168.2.143: icmp_seq=1 ttl=54 time=16.821 ms
64 bytes from 192.168.2.143: icmp_seq=2 ttl=54 time=16.363 ms
64 bytes from 192.168.2.143: icmp_seq=3 ttl=54 time=27.335 ms
64 bytes from 192.168.2.143: icmp_seq=4 ttl=54 time=19.429 ms

--- 192.168.2.143 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 16.363/19.866/27.335/3.943 ms

Conclusion

In this blog I covered how you can increase the security of your Lightsail instances by taking advantage of three new features: source IP restrictions, limiting access to the Lightsail browser SSH and RDP clients, and the addition of PING (ICMP) as an application type. These new features provide you an extra level of flexibility and security when deploying applications on Lightsail.

To learn more about the Lightsail firewall, see the documentation. Additionally, there are Getting Started tutorials for Lightsail, including launching a LAMP stack application or .NET application.

Build a serverless Martian weather display with CircuitPython and AWS Lambda

Post Syndicated from Moheeb Zara original https://aws.amazon.com/blogs/compute/build-a-serverless-martian-weather-display-with-circuitpython-and-aws-lambda/

Build a standalone digital weather display of Mars showing the latest images from the Mars Curiosity Rover.

This project uses an Adafruit PyPortal, an open-source IoT touch display. Traditionally, a microcontroller is programmed with firmware compiled using various specific toolchains. Fortunately, the PyPortal is programmed using CircuitPython, a lightweight version of Python that works on embedded hardware. You just copy your code to the PyPortal like you would to a thumb drive and it runs.

I deploy the backend, the part in the cloud that does all the heavy lifting, using the AWS Serverless Application Repository (SAR). The code on the PyPortal makes a REST call to the backend to handle the requests to the NASA Mars Rover Photos API and InSight: Mars Weather Service API. It then converts and resizes the image before returning the information to the PyPortal for display.

An Adafruit PyPortal displaying the latest images from the Mars Curiosity Rover and weather data from InSight Mars Lander.

An Adafruit PyPortal displaying the latest images from the Mars Curiosity Rover and weather data from InSight Mars Lander.

Prerequisites

You need the following to complete the project:

Deploy the backend application

An architecture diagram of the serverless backend.

An architecture diagram of the serverless backend.

Using a serverless backend reduces the load on the PyPortal. The PyPortal makes a call to the backend API and receives a small JSON object with the relevant data. This allows you to change the logic of where and how to get the image and weather data without needing physical access to the device.

The backend API consists of an AWS Lambda function, written in Python, behind an Amazon API Gateway endpoint. When invoked, the FetchMarsData function makes requests to two separate NASA APIs. First it fetches the latest images from the Mars Curiosity Rover, typically from the previous day, and picks one at random. It resizes and converts the image to bitmap format before uploading to Amazon S3 with public read permissions. The PyPortal downloads the image from S3 later.
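
The exact implementation lives in the Lambda function source (src/app.py in the GitHub project linked later in this post). As a rough sketch of the resize-and-upload step, assuming the Pillow library is packaged with the function and using a hypothetical bucket name, it might look like this:

import io

import boto3
from PIL import Image

s3 = boto3.client("s3")
BUCKET = "pyportal-mars-images"  # hypothetical bucket name


def resize_and_upload(image_bytes, key="mars.bmp", size=(320, 240)):
    # Resize the rover photo to the PyPortal's 320x240 screen and convert to bitmap
    img = Image.open(io.BytesIO(image_bytes)).convert("RGB").resize(size)
    buffer = io.BytesIO()
    img.save(buffer, format="BMP")
    buffer.seek(0)
    # Upload with public read so the PyPortal can fetch the image directly
    s3.put_object(Bucket=BUCKET, Key=key, Body=buffer.getvalue(),
                  ACL="public-read", ContentType="image/bmp")
    return "https://{}.s3.amazonaws.com/{}".format(BUCKET, key)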

The function then calls the InSight: Mars Weather Service API. It retrieves the average air temperature, wind speed, pressure, season, solar day (sol), as well as the first and last timestamp of daily sampling. The API returns these values and the S3 image URL as a JSON object.

I use the AWS Serverless Application Model (SAM) to create the backend. While it can be deployed using the AWS SAM CLI, you can also deploy from the AWS Management Console:

  1. Generate a free NASA API key at api.nasa.gov. This is required to gain access to the NASA data APIs.
  2. Navigate to the aws-serverless-pyportal-mars-weather-display application in the Serverless Application Repository.
  3. Choose Deploy.
  4. On the next page, under Application Settings, enter the parameter, NasaApiKey.

  5. Once complete, choose View CloudFormation Stack.

  6. Select the Outputs tab and make a note of the MarsApiUrl. This is required for configuring the PyPortal.

  7. Navigate to the MarsApiKey URL listed in the Outputs tab.

  8. Click Show to reveal the API key. Make a note of this. This is required for authenticating requests from the PyPortal to the MarsApiUrl.

PyPortal setup

  1. Follow these instructions from Adafruit to install the latest version of the CircuitPython bootloader. At the time of writing, the latest version is 5.2.0.
  2. Follow these instructions to install the latest Adafruit CircuitPython library bundle. I use bundle version 5.x.
  3. Insert the microSD card in the slot located on the back of the device.
  4. Optionally install the Mu Editor, a multi-platform code editor and serial debugger compatible with Adafruit CircuitPython boards. This can help if you need to troubleshoot issues.
  5. Optionally if you have a 3D printer at home, you can print a case for your PyPortal. This can protect your project while also being a great way to display it on a desk.

Code PyPortal

As with regular Python, CircuitPython does not need to be compiled to execute. Flashing new firmware on the PyPortal is as simple as copying a Python file and necessary assets over to a mounted volume. The bootloader runs code.py anytime the device starts or any files are updated.

  1. Use a USB cable to plug the PyPortal into your computer and wait until a new mounted volume CIRCUITPY is available.
  2. Download the project from GitHub. Inside the project, copy the contents of /circuit-python on to the CIRCUITPY volume.
  3. Inside the volume, open and edit the secrets.py file. Include your Wi-Fi credentials along with the MarsApiKey and MarsApiUrl API Gateway endpoint, which can be found under Outputs in the AWS CloudFormation stack created by the Serverless Application Repository.
  4. Save the file, and the device restarts. It takes a moment to connect to Wi-Fi and make the first request.
    Optionally, if you installed the Mu Editor, you can click on “Serial” to follow along with the device log.
    Animated gif of the PyPortal device displaying a Mars rover image and Mars weather data.

Understanding how CircuitPython calls API Gateway

The main CircuitPython file is code.py. At the end of the file, the while loop periodically performs the operations necessary to display the photos from the Curiosity Rover and the InSight Mars lander weather data.

while True:
    data = callAPIEndpoint(secrets['mars_api_url'])
    downloadImage(data['image_url'])
    showDisplay(data['insight'],
                displayTime=60*interval_minutes)

First, it calls the API Gateway endpoint using the URL from the secrets.py file, and passes the returned JSON to helper functions. The callAPIEndpoint(url) function passes the MarsApiKey in the header and a timeout of 30 seconds to the wifi.get() method. The timeout is required for integrations with services like Lambda and API Gateway. Remember, the CircuitPython code is running on a microcontroller and sometimes must wait longer when making requests.

def callAPIEndpoint(mars_api_url):
    headers = {"x-api-key": secrets['mars_api_key']}
    response = wifi.get(mars_api_url, headers=headers, timeout=30)
    data = response.json()
    print("JSON Response: ", data)
    response.close()
    return data

The JSON object that is received by the PyPortal is defined in the handler of the Lambda function. In the GitHub project downloaded earlier, see src/app.py.

def lambda_handler(event, context):
    url = fetchRoverImage()
    imgData = fetchImageData(url)
    image_s3_url = resize_image(imgData)
    weatherData = getMarsInsightWeather()

    return {
        "statusCode": 200,
        "body": json.dumps({
            "image_url": image_s3_url,
            "insight": weatherData
        })
    }

Similar to the CircuitPython code, this uses helper functions to perform all the various operations needed to retrieve and craft the data. At completion, the returned JSON is passed as the response to the PyPortal.

A quick way to add a new property is to edit the Lambda function directly through the AWS Lambda Console. Here, a key “hello” is added with a value “world”:

In the CircuitPython code.py file, the key is now available in the JSON response from API Gateway. The following prints the key value, which can be seen using the Mu Editor Serial debugger.

data = callAPIEndpoint(secrets['mars_api_url'])

print(data['hello'])

The Lambda function is packaged with the AWS Python SDK, boto3, which provides methods for interacting with a variety of AWS services. The Python Requests library is also included to make calls to the NASA APIs. Try exploring how to incorporate other services or APIs into your project. To understand how to modify the visual display on the PyPortal itself, see the displayio guide from Adafruit.
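
For example, a minimal call to the NASA Mars Rover Photos API with the Requests library might look like the following sketch (the sol value and variable names are illustrative, not taken from the project code):

import random

import requests

NASA_API_KEY = "YOUR_NASA_API_KEY"  # placeholder for your api.nasa.gov key

response = requests.get(
    "https://api.nasa.gov/mars-photos/api/v1/rovers/curiosity/photos",
    params={"sol": 1000, "api_key": NASA_API_KEY},
    timeout=30,
)
photos = response.json()["photos"]
print(random.choice(photos)["img_src"])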

Conclusion

I show how to build a “live” Martian weather display using an Adafruit PyPortal, CircuitPython, and AWS Serverless technologies. Whether this is your first time using hardware or a serverless backend in the AWS Cloud, this project is simplified by the use of CircuitPython and the Serverless Application Model.

I also show how to make a request to API Gateway from the PyPortal. I then craft a response in Lambda for the PyPortal. Since both use variants of the Python programming language, much of the syntax stays the same.

To learn more, explore other devices supported by CircuitPython and the variety of community contributed libraries. Combined with the breadth of AWS services, you can push the boundaries of creativity.

Deep dive into Fargate Spot to run your ECS Tasks for up to 70% less

Post Syndicated from Ben Peven original https://aws.amazon.com/blogs/compute/deep-dive-into-fargate-spot-to-run-your-ecs-tasks-for-up-to-70-less/

Author: Pritam Pal, Sr. EC2 Spot Specialist SA

AWS launched AWS Fargate Spot during late 2019 for customers looking for a cost effective way to run containers. This blog dives deep into how to use ECS Fargate Spot and Fargate Tasks to lower the cost of your workloads. I explain existing concepts like Container Stop Timeout, catching SIGTERM for graceful shut down, and introduce you to new Amazon Elastic Container Service (Amazon ECS) concepts like capacity providers and service events.

Product overview

In 2017, AWS launched AWS Fargate for Amazon ECS. Fargate allows you to spend less time managing Amazon EC2 instances and more time building. Fargate Spot is a new purchase option that allows customers to launch Tasks on spare capacity with a steep discount. A Spot task is almost indistinguishable from an On-Demand Task with the following exceptions:

  • The price (per CPU-hour and GB-hour) of a Spot Task is variable, ranging between 50% and 70% off the price of an On-Demand Task.
  • A Fargate Spot Task may be interrupted (that is, stopped) when AWS needs the capacity back.

Fargate Spot runs on the same principle as Amazon EC2 Spot Instances. Your tasks run on spare capacity in the AWS Cloud. If you request to run your Task on Fargate Spot, your Tasks will run when capacity for Fargate Spot is available. As these Tasks run on spare capacity, you receive a two-minute notification when AWS needs capacity back, just like Spot Instances.

In this blog, I walk through how to launch a Fargate Spot Task using the AWS Management Console and command line interface (CLI), how to handle termination notices, what the Service Task Placement Failure Event looks like and other best practices to make sure you are a Fargate Spot champion.

ECS Fargate concepts

Before we go deep, let’s explore some ECS concepts, which we will use for Fargate Spot in this blog post.

StopTask and stopTimeout

Because a Fargate Spot Task may be interrupted with two minutes’ notice, you need to make sure your application exits gracefully. For this, you can use the stopTimeout parameter. When StopTask is called on a task, the equivalent of Docker stop is issued to the containers running in the task. If the container handles the SIGTERM signal gracefully and exits within 30 seconds of receiving it, no SIGKILL signal is sent. You typically use the stopTimeout parameter of the task definition to control this behavior. stopTimeout is the time duration (in seconds) to wait before the container is forcefully stopped if it doesn’t exit normally on its own. For Fargate 1.3.0 or later, the maximum stop timeout value is 120 seconds. If the parameter is not specified, the default value of 30 seconds is used.
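
For example, a long-running container process written in Python could catch SIGTERM along these lines (a minimal sketch, not the code of any particular image):

import signal
import sys
import time


def handle_sigterm(signum, frame):
    # Finish or checkpoint in-flight work, close connections, then exit cleanly
    print("SIGTERM received, shutting down gracefully")
    sys.exit(0)


signal.signal(signal.SIGTERM, handle_sigterm)

while True:
    # Placeholder for the container's real work loop
    time.sleep(1)

Pair a handler like this with a stopTimeout value in the container definition that gives it enough time to finish.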

Capacity providers

Capacity providers are a new way to manage compute capacity for containers. This tool allows the application to define its requirements for how it uses the capacity. With capacity providers, you can define flexible rules for how containerized workloads run on different types of compute capacity, and manage the scaling of the capacity. Capacity providers improve the availability, scalability, and cost of running tasks and services on ECS. As of now, each cluster can have up to six capacity providers and an optional default capacity provider strategy, which determines how the tasks are spread across the capacity providers. To run your tasks, you can either use the default capacity provider strategy or specify one of your own.

AWS Fargate and AWS Fargate Spot capacity providers do not need to be created. They are available to all accounts, and only need to be associated with a cluster to be available for use. When a new cluster is created using the Amazon ECS console, along with the Networking only cluster template, the FARGATE and FARGATE_SPOT capacity providers are associated with the new cluster automatically.

A cluster may contain a mix of FARGATE, FARGATE_SPOT, and Auto Scaling group (ASG) capacity providers; however, at this moment, a capacity provider strategy may only contain either Fargate capacity providers (FARGATE and FARGATE_SPOT) or Auto Scaling group capacity providers, but not both.

Now that I have covered these ECS Fargate concepts, let's jump into the technical walkthrough.

Launch ECS Fargate Spot Task using AWS Management Console

1. Open the Amazon ECS console

2. From the navigation bar, select the Region to use

3. In the navigation pane, choose Clusters

4. On the Clusters page, choose Create Cluster

5. Create a Networking only Cluster

ECS create cluster

With this option, you can launch a cluster with a new VPC to use for Fargate tasks. The FARGATE and FARGATE_SPOT capacity providers are automatically associated with the cluster, as shown in the following image.

ECS default capacity providers

6. Click Update Cluster at the top right-hand side to set a capacity provider strategy

In this example, I use a combination of FARGATE_SPOT and FARGATE capacity providers. I selected a Weight of 4 for FARGATE_SPOT and 1 for FARGATE. This means that for every five Tasks, four are started on FARGATE_SPOT and one on FARGATE. You can distribute this however you want. More tasks on Fargate Spot means more savings. But, if your workload requires high availability and you are not comfortable with interruptions, start with a ratio that works for you.

ECS update capacity provider strategy
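If you prefer the CLI, the same 4:1 default strategy can be applied with put-cluster-capacity-providers (a sketch; substitute your own cluster name):

aws ecs put-cluster-capacity-providers \
 --cluster <cluster name> \
 --capacity-providers FARGATE FARGATE_SPOT \
 --default-capacity-provider-strategy capacityProvider=FARGATE_SPOT,weight=4 capacityProvider=FARGATE,weight=1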

7. Let's create a Task Definition first. Here are some great Task definitions to start with. Find the Task Definitions link in the left navigation panel, click Create a New Task Definition, and choose the Fargate launch type. Scroll down and, near the bottom of the page, find the Configure via JSON button. Delete the pre-populated JSON entry, then copy and paste the sample Fargate web app task definition below. Click Save, then Create.

{
    "family": "webapp-fargate-task", 
    "networkMode": "awsvpc", 
    "containerDefinitions": [
        {
            "name": "fargate-app", 
            "image": "httpd:2.4", 
            "portMappings": [
                {
                    "containerPort": 80, 
                    "hostPort": 80, 
                    "protocol": "tcp"
                }
            ], 
            "essential": true, 
            "entryPoint": [
                "sh",
		"-c"
            ], 
            "command": [
                "/bin/sh -c \"echo '<html> <head> <title>Amazon ECS Sample App</title> <style>body {margin-top: 40px; background-color: #333;} </style> </head><body> <div style=color:white;text-align:center> <h1>Amazon ECS Sample App</h1> <h2>Congratulations!</h2> <p>Your application is now running on a container in Amazon ECS.</p> </div></body></html>' >  /usr/local/apache2/htdocs/index.html && httpd-foreground\""
            ]
        }
    ], 
    "requiresCompatibilities": [
        "FARGATE"
    ], 
    "cpu": "256", 
    "memory": "512"
}

8. Now that our Task definition is ready, we can run it. Select the Task definition you just created, click Actions, then Run Task. Enter the number of Tasks you want to run; in this case, I chose 10. After configuring the VPC and security groups, click Run Task.

Task definition

ECS run tasks list

The Run Task command from the last step starts ten Tasks, out of which eight launch on FARGATE_SPOT and two launch on FARGATE (the 4:1 ratio I set up earlier). You can see this by clicking on any Task and finding the "Capacity provider" value for that Task under the Details tab. Currently, there is no option to view all Tasks with a particular capacity provider in the Run Task console.

ECS Fargate Spot Task detail

This Task ran on the FARGATE capacity provider, whereas the Task mentioned earlier ran on FARGATE_SPOT.

ECS Fargate Task detail

 

In the next section, I explore how to launch Fargate Spot Tasks using the AWS CLI.

Launch ECS Fargate Spot Task using AWS CLI

Create a cluster with capacity providers

When creating a new cluster using the CLI, you must specify the capacity providers you want to use. In the following example, I specify two capacity providers, FARGATE and FARGATE_SPOT.

Enter the following code to specify the capacity providers:

aws ecs create-cluster \
    --cluster-name "Fargate-Spot-Deep-Dive" \
    --capacity-providers FARGATE FARGATE_SPOT

The output of this command should look like the following:

"cluster": {
        "status": "PROVISIONING", 
        "defaultCapacityProviderStrategy": [], 
        "statistics": [], 
        "capacityProviders": [
            "FARGATE", 
            "FARGATE_SPOT"
        ], 
        ... 

Launching a Task

Once the cluster is created, you can launch a Fargate Spot Task by calling RunTask and providing the Spot capacity provider in the --capacity-provider-strategy field. You also need to specify:

  • A task definition
  • Weight options for capacity providers
  • A network configuration like Subnets, Security Groups
  • How many Tasks you want to run

I defined these specifications in the following code:

aws ecs run-task \
 --capacity-provider-strategy capacityProvider=FARGATE,weight=1 capacityProvider=FARGATE_SPOT,weight=1 \
 --cluster "Fargate-Spot-Deep-Dive" \
 --task-definition task-def-family:revision \
 --network-configuration "awsvpcConfiguration={subnets=[string,string],securityGroups=[string,string],assignPublicIp=string}" \
 --count integer

If you specify a count of 10 and a weight of 1 for both providers, this starts five FARGATE_SPOT Tasks and five FARGATE Tasks.

Creating a service

In the example below, you create a service with FARGATE_SPOT only.

aws ecs create-service \
 --capacity-provider-strategy capacityProvider=FARGATE_SPOT,weight=1 \
 --cluster "Fargate-Spot-Deep-Dive" \
 --service-name FargateService \
 --task-definition task-def-family:revision \
 --network-configuration "awsvpcConfiguration={subnets=[string,string],securityGroups=[string,string],assignPublicIp=string}" \
 --desired-count integer

You can also create a service with a mix of Spot and On-Demand Tasks by calling CreateService and providing both Spot and On-Demand capacity providers in the capacity-provider-strategy field.

In the following example, the base attribute is an optional field that says there should be at least four On-Demand Tasks (the default base is 0, and you cannot specify more than one capacity provider with a non-zero base). The weight is another optional field that says that, for the six remaining Tasks not managed by the base attribute, there should be one On-Demand Task for every two Spot Tasks.

aws ecs create-service \
 --cluster "my-cluster" \
 --service-name "my-service" \
 --desired-count 10 \
 --capacity-provider-strategy capacityProvider=FARGATE_SPOT,weight=2,base=0 capacityProvider=FARGATE,weight=1,base=4 \
 ...

Add the Fargate and Fargate Spot capacity providers to an existing cluster

Note that put-cluster-capacity-providers replaces the cluster's current set of capacity providers, so include any existing providers (shown as placeholders) that you want to keep:

aws ecs put-cluster-capacity-providers \
 --cluster "Fargate-Spot-Deep-Dive" \
 --capacity-providers FARGATE FARGATE_SPOT existing_capacity_provider1 existing_capacity_provider2 \
 --default-capacity-provider-strategy existing_default_capacity_provider_strategy

How to handle Fargate Spot termination notices

By design, Fargate Spot is an interruptible service. When tasks using FARGATE_SPOT are stopped due to a Spot interruption, a two-minute warning is sent before a task is stopped.

The warning is sent as a task state change event to Amazon EventBridge and a SIGTERM signal to the running task. When using Fargate Spot as part of a service, the service scheduler will receive the interruption signal and attempt to launch additional Tasks on Fargate Spot if capacity is available.

To ensure that your containers exit gracefully before the Task stops, you can configure the following (see the sketch after this list):

  • A stopTimeout value of 120 seconds (2 minutes) or less can be specified in the container definition that the Task is using. Specifying a stopTimeout value gives you time between the moment the Task state change event is received and the point at which the container is forcefully stopped.
  • The SIGTERM signal must be handled from within the container to perform any cleanup actions.
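For illustration, a Node.js web server inside the container could catch the signal like this (a minimal sketch; your cleanup logic will differ):

const http = require("http");

const server = http.createServer((req, res) => res.end("OK"));
server.listen(80);

// ECS sends SIGTERM on a Spot interruption; SIGKILL follows once stopTimeout expires.
process.on("SIGTERM", () => {
  console.log("SIGTERM received, draining connections...");
  // Stop accepting new connections, let in-flight requests finish, then exit.
  server.close(() => process.exit(0));
});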

The following is a snippet of a Task state change event displaying the stopped reason and stop code for a Fargate Spot interruption.

{
  "version": "0",
  "id": "9bcdac79-b31f-4d3d-9410-fbd727c29fab",
  "detail-type": "ECS Task State Change",
  "source": "aws.ecs",
  "account": "111122223333",
  "resources": [
    "arn:aws:ecs:us-east-1:111122223333:task/b99d40b3-5176-4f71-9a52-9dbd6f1cebef"
  ],
  "detail": {
    "clusterArn": "arn:aws:ecs:us-east-1:111122223333:cluster/default",
    "createdAt": "2016-12-06T16:41:05.702Z",
    "desiredStatus": "STOPPED",
    "lastStatus": "RUNNING",
    "stoppedReason": "Your Spot Task was interrupted.",
    "stopCode": "TerminationNotice",
    "taskArn": "arn:aws:ecs:us-east-1:111122223333:task/b99d40b3-5176-4f71-9a52-9dbd6fEXAMPLE",
    ...
  }
}

Example service Task placement failure event

If FARGATE_SPOT can't place a Task due to capacity constraints, service Task placement failure events are delivered.

In the following example, the task attempted to use the FARGATE_SPOT capacity provider, but the service scheduler was unable to acquire any Fargate Spot capacity.

{
    "version": "0",
    "id": "ddca6449-b258-46c0-8653-e0e3a6d0468b",
    "detail-type": "ECS Service Action",
    "source": "aws.ecs",
    "account": "111122223333",
    "time": "2019-11-19T19:55:38Z",
    "region": "us-west-2",
    "resources": [
        "arn:aws:ecs:us-west-2:111122223333:service/default/servicetest"
    ],
    "detail": {
        "eventType": "ERROR",
        "eventName": "SERVICE_TASK_PLACEMENT_FAILURE",
        "clusterArn": "arn:aws:ecs:us-west-2:111122223333:cluster/default",
        "capacityProviderArns": [
            "arn:aws:ecs:us-west-2:111122223333:capacity-provider/FARGATE_SPOT"
        ],
        "reason": "RESOURCE:FARGATE",
        "createdAt": "2019-11-06T19:09:33.087Z"
    }
}

Amazon EventBridge enables you to automate your AWS services, and respond automatically to system events such as application availability issues or resource changes. Events from AWS services are delivered to EventBridge in near real-time. You can write simple rules to indicate which events are of interest to you and what automated actions to take when an event matches a rule.
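For example, a rule with an event pattern along these lines (a sketch) matches the Spot interruption notice shown earlier, so you can route it to a target such as an SNS topic or a Lambda function:

{
  "source": ["aws.ecs"],
  "detail-type": ["ECS Task State Change"],
  "detail": {
    "stopCode": ["TerminationNotice"]
  }
}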

More details on how to use Amazon ECS Events can be found here.

Fargate Spot pricing

With AWS Fargate, there are no upfront payments and you only pay for the resources that you use. You pay for the amount of vCPU and memory resources consumed by your containerized applications.

The price for Spot CPU-Hour and GB-Hour is the same across all Availability Zones and Task configurations; however, it can change over time. The latest price is available on the Fargate pricing page. Pricing is per second with a one-minute minimum. Duration is calculated from the time you start to download your container image (Docker pull) until the Task terminates, rounded up to the nearest second.

Fargate Spot best practices

As I wrap up, I want to focus on a few best practices about Fargate Spot.

  • Fargate Spot is great for stateless, fault-tolerant workloads, but don't rely solely on Spot Tasks for critical workloads; configure a mix of regular Fargate Tasks as well
  • Applications running on Fargate Spot should be fault-tolerant
  • Handle interruptions gracefully by catching SIGTERM signals

Conclusion

Fargate Spot is a great fit for parallelizable workloads like image rendering, Monte Carlo simulations, and genomic processing. However, customers can also use Fargate Spot for Tasks that run as a part of ECS services such as websites and APIs which require high availability.

ECS Fargate has already made it super easy to run containerized workloads without worrying about setting up and managing infrastructure. Fargate Spot makes it more affordable for your price-sensitive workloads. With the right mix of FARGATE and FARGATE_SPOT capacity providers, you can get the optimal capacity for your Tasks within budget.


Pritam is a Sr. Specialist Solutions Architect on the EC2 Spot team. For the last 13 years, he has evangelized DevOps and cloud adoption across industries and verticals. He likes to dive deep and find solutions to everyday problems.

Building an automated knowledge repo with Amazon EventBridge and Zendesk

Post Syndicated from Benjamin Smith original https://aws.amazon.com/blogs/compute/building-an-automated-knowledge-repo-with-amazon-eventbridge-and-zendesk/

Zendesk Guide is a smart knowledge base that helps customers harness the power of institutional knowledge. It enables users to build a customizable help center and customer portal.

This post shows how to implement a bidirectional event orchestration pattern between AWS services and an Amazon EventBridge third-party integration partner. This example uses support ticket events to build a customer self-service knowledge repository. It uses the EventBridge partner integration with Zendesk to accelerate the growth of a customer help center.

The examples in this post are part of a serverless application called FreshTracks. This is built in Vue.js and demonstrates SaaS integrations with Amazon EventBridge. To test this example, ask a question on the Fresh Tracks application.

The backend components for this EventBridge integration with Zendesk have been extracted into a separate example application in this GitHub repo.

How the application works

Routing Zendesk events with Amazon EventBridge.

  1. A user searches the knowledge repository via a widget embedded in the web application.
  2. If there is no answer, the user submits the question via the web widget.
  3. Zendesk receives the question as a support ticket.
  4. Zendesk emits events when the support ticket is resolved.
  5. These events are streamed into a custom SaaS event bus in EventBridge.
  6. Event rules match events and send them downstream to an AWS Step Functions Express Workflow.
  7. The Express Workflow orchestrates Lambda functions to retrieve additional information about the event with the Zendesk API.
  8. A Lambda function uses the Zendesk API to publish a new help article from the support ticket data.
  9. The new article is searchable on the website widget for other users to read.

Before deploying this application, you must generate an API key from within Zendesk.

Creating the Zendesk API resource

The application uses the Zendesk API to act on your Zendesk account from AWS. Follow these steps to generate a Zendesk API token, which the application uses to authenticate Zendesk API calls.

To generate an API token

  1. Log in to the Zendesk dashboard.
  2. Click the Admin icon in the sidebar, then select Channels > API.
  3. Click the Settings tab, and make sure that Token Access is enabled.
  4. Click the + button to the right of Active API Tokens.

    Creating a Zendesk API token.

  5. Copy the token, and store it securely. Once you close this window, the full token is not displayed again.
  6. Click Save to return to the API page, which shows a truncated version of the token.

    Zendesk API token.

Configuring Zendesk with Amazon EventBridge

Step 1. Configuring your Zendesk event source.

  1. Go to your Zendesk Admin Center and select Admin Center > Integrations.
  2. Choose Connect under Events Connector for Amazon EventBridge to open the page where you configure your Zendesk event source.

    Zendesk integrations

  3. Enter your AWS account ID in the Amazon Web Services account ID field, and select the Region to receive events.
  4. Choose Save.

    Zendesk Amazon EventBridge configuration.

Step 2. Associate the Zendesk event source with a new event bus: 

  1. Log into the AWS Management Console and navigate to services > Amazon EventBridge > Partner event sources
    New event source

     

  2. Select the radio button next to the new event source and choose Associate with event bus.
    Associating event source with event bus.

     

  3. Choose Associate.

Deploying the backend application

After associating the Event source with a new partner event bus, you can deploy backend services to receive events.

To set up the example application, visit the GitHub repo and follow the instructions in the README.md file.

When deploying the application stack, make sure to provide the custom event bus name, and Zendesk API credentials with --parameter-overrides.

sam deploy --parameter-overrides ZendeskEventBusName=aws.partner/zendesk.com/123456789/default ZenDeskDomain=myZendeskDomain ZenDeskPassword=myAPIToken ZenDeskUsername=myZendeskAgentUsername

You can find the name of the new Zendesk custom event bus in the custom event bus section of the EventBridge console.

Routing events with rules

When a support ticket is updated in Zendesk, a number of individual events are streamed to EventBridge. These include an event for each of the following:

  • Agent Assignment Changed
  • Comment Created
  • Status Changed
  • Brand Changed
  • Subject Changed

An EventBridge rule is used to filter these events. The AWS Serverless Application Model (SAM) template defines the rule with the `AWS::Events::Rule` resource type. This routes the event downstream to an AWS Step Functions Express Workflow. The EventPattern is shown below:

  ZendeskNewWebQueryClosed: 
    Type: AWS::Events::Rule
    Properties: 
      Description: "New Web Query"
      EventBusName: 
         Ref: ZendeskEventBusName
      EventPattern: 
        account:
        - !Sub '${AWS::AccountId}'
        detail-type: 
        - "Support Ticket: Comment Created"
        detail:
          ticket_event:
            ticket:
              status: 
              - solved
              tags:
              - web_widget
              tags: 
              - guide
      Targets: 
        - RoleArn: !GetAtt [ MyStatesExecutionRole, Arn ]
          Arn: !Ref FreshTracksZenDeskQueryMachine
          Id: NewQuery

The tickets must have two specific tags (web_widget and guide) for this pattern to match. These are defined as separate fields to create an AND matching rule, instead of being declared within the same array field, which would create an OR rule. A new comment on a support ticket triggers the event.

The Step Functions Express Workflow

The application routes events to a Step Functions Express Workflow that is defined in the application’s SAM template:

FreshTracksZenDeskQueryMachine:
    Type: "AWS::StepFunctions::StateMachine"
    Properties:
      StateMachineType: EXPRESS
      DefinitionString: !Sub |
               {
                    "Comment": "Create a new article from a zendeskTicket",
                    "StartAt": "GetFullZendeskTicket",
                    "States": {
                      "GetFullZendeskTicket": {
                      "Comment": "Get Full Ticket Details",
                      "Type": "Task",
                      "ResultPath": "$.FullTicket",
                      "Resource": "${GetFullZendeskTicket.Arn}",
                      "Next": "GetFullZendeskUser"
                      },
                      "GetFullZendeskUser": {
                      "Comment": "Get Full User Details",
                      "Type": "Task",
                      "ResultPath": "$.FullUser",
                      "Resource": "${GetFullZendeskUser.Arn}",
                      "Next": "PublishArticle"
                      },
                      "PublishArticle": {
                      "Comment": "Publish as an article",
                      "Type": "Task",
                      "Resource": "${CreateZendeskArticle.Arn}",
                      "End": true
                      }
                    }
                }
      RoleArn: !GetAtt [ MyStatesExecutionRole, Arn ]

This application is suited to a Step Functions Express Workflow because it orchestrates short-duration, high-volume, event-based workloads. Each workflow task is idempotent and stateless. The Express Workflow carries the workload's state by passing the output of one task to the input of the next. The Amazon States Language ResultPath definition is used to control where each task's output is appended to the workflow's state before it is passed to the next task.
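For example, if the state entering GetFullZendeskTicket is the matched event, a ResultPath of $.FullTicket appends that task's output under a FullTicket key while leaving the rest of the state intact (illustrative shape only, not the real payload):

{
  "detail": { "ticket_event": { "ticket": { "id": 123, "status": "solved" } } },
  "FullTicket": { "id": 123, "subject": "...", "description": "..." }
}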

 

AWS StepFunctions Express workflow

Lambda functions

Each task in this Express Workflow invokes a Lambda function defined within the example application's SAM template. The Lambda functions use the Node.js Axios package to make requests to Zendesk's API. The Zendesk API credentials are stored in the Lambda functions' environment variables and are accessible via process.env.

The first two Lambda functions in the workflow make a GET request to Zendesk. This retrieves additional data about the support ticket, the author, and the agent’s response.
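As an illustration, a retrieval function along these lines makes the authenticated GET call (a sketch, not the repository's actual code; the environment variable names and event shape are assumptions):

const axios = require("axios");

exports.handler = async (event) => {
  // Credentials are read from the function's environment variables.
  const { ZenDeskDomain, ZenDeskUsername, ZenDeskPassword } = process.env;

  // Assumed event shape, matching the rule's detail.ticket_event.ticket path.
  const ticketId = event.detail.ticket_event.ticket.id;
  const url = `https://${ZenDeskDomain}.zendesk.com/api/v2/tickets/${ticketId}.json`;

  const response = await axios.get(url, {
    // Zendesk token authentication uses "<agent email>/token" as the username.
    auth: { username: `${ZenDeskUsername}/token`, password: ZenDeskPassword },
  });

  // The workflow's ResultPath appends this return value to the state.
  return response.data.ticket;
};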

The final Lambda function makes a POST request to the Zendesk API. This creates and publishes a new article using this data. The permission_group and section defined in this function must be set to your Zendesk account's default permission group ID and FAQ section ID.

AWS Lambda function code

Integrating with your front-end application

Follow the instructions in the Fresh Tracks repo on GitHub to deploy the front-end application. This application includes Zendesk's web widget script in the index.html page. The widget has been customized using Zendesk's JavaScript API. This is implemented in the navigation component to insert custom forms into the widget and prefill the email address field for authenticated users. The backend application starts receiving Zendesk-emitted events immediately.

The video below demonstrates the implementation from end to end.

Conclusion

This post explains how to set up EventBridge’s third-party integration with Zendesk to capture events. The example backend application demonstrates how to filter these events, and send downstream to a Step Functions Express Workflow. The Express Workflow orchestrates a series of stateless Lambda functions to gather additional data about the event. Zendesk’s API is then used to publish a new help guide article from this data.

This pattern provides a framework for bidirectional event orchestration between AWS services, custom web applications and third party integration partners. This can be replicated and applied to any number of third party integration partners.

This is implemented with minimal code to provide near real-time streaming of events and without adding latency to your application.

The possibilities are vast. I am excited to see how builders use this bidirectional serverless pattern to add even more value to their third party services.

Start here to learn about other SaaS integrations with Amazon EventBridge.

Using AWS CodeDeploy and AWS CodePipeline to Deploy Applications to Amazon Lightsail

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/using-aws-codedeploy-and-aws-codepipeline-to-deploy-applications-to-amazon-lightsail/

This post is contributed by Mike Coleman | Developer Advocate for Lightsail | Twitter: @mikegcoleman

Introduction

Amazon Lightsail is the easiest way to get started in the cloud, allowing you to get your application running on your own virtual server in a matter of minutes. But, what do you do if you want to update that running application?

In order to automate the process of both deploying and updating software, many developers are turning to automated workflows. AWS has a full complement of tools that allow you to build deployment pipelines to cover a wide array of use cases.

This blog post provides guidance on how to configure Lightsail to work with AWS CodePipeline and AWS CodeDeploy to automatically deploy (or update) an application every time you push a change to GitHub. Even though this tutorial provides detailed, step-by-step instructions, you may still want to read some of the docs (CodeDeploy and CodePipeline) if you’re unfamiliar with deployment pipelines.

After completing this walkthrough you’ll be on your way to implementing more complex pipelines that might include build and test steps.

 

Prerequisites

You need the following to complete this walkthrough:

  • A GitHub account in which to fork the demo code
  • git installed on your local machine, and a basic understanding on how to use it
  • An AWS account with sufficient privileges to create AWS Identity and Access (IAM) users, policies, and service roles as well as to create resources with the following services: Amazon S3, CodePipeline, CodeDeploy and Lightsail
  • The AWS CLI installed and configured on your local machine

 

Document Important Values

As you go through this tutorial, you'll need to take note of a few key values. Open up a text editor of your choice, and copy and paste the template below into a new document. As you go through the guide, when instructed, copy the values into your text document.

S3 Bucket Name:

Access Key ID:

Secret Key:

IAM User ARN:

Note: Both the Access key ID and Secret access key should be protected in the same manner you protect any sensitive username / password pair.

 

Solution Overview

The rest of this blog guides you through the process of setting up your deployment pipeline. You start by creating a service role for CodeDeploy, an Amazon S3 bucket, and an IAM user. After deploying these services, you create a Lightsail instance. You also install and configure the CodeDeploy agent, as well as registering the instance with CodeDeploy. Finally, you create an application in CodeDeploy, and configure CodePipeline to kick off a new deployment whenever you push changes to GitHub.

 

Create a service role

In AWS a service role is a role that an AWS service assumes to perform actions on your behalf. The policies that you attach to the service role determine which AWS resources the service can access and what it can do with those resources. Below you use AWS IAM to create a service role for CodeDeploy that has the necessary permissions to work with Lightsail.
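For reference, the role you create below ends up with a trust policy like the following, which is what allows the CodeDeploy service to assume the role (the console creates this for you when you choose CodeDeploy as the service):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "codedeploy.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}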

  1. Sign in to the AWS Management Console and open the IAM console at https://console.aws.amazon.com/iam/.
  2. In the navigation pane, choose Roles, and then choose Create role.
  3. On the Create role page, choose AWS service, and from the Choose the service that will use this role list, choose CodeDeploy.
  4. Near the bottom of the screen under Select your use case, choose CodeDeploy and click Next: Permissions.
  5. On the Attached permissions policy page, the default permission policy (AWSCodeDeployRole) is displayed. You can click on the policy name if you'd like to review the details of the policy. Click Next: Tags, and then click Next: Review.
  6. On the Create Role page, in Role name, enter a name for the service role (for example, CodeDeployServiceRole).
  7. Click Create role.

 

Create an S3 bucket

In this section, you create an Amazon S3 bucket to store the deployment artifact created by CodeDeploy. This artifact is a compressed file containing your source code files, and any scripts that need to run as part of the installation or update process.
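Inside that artifact, CodeDeploy looks for an appspec.yml file that maps source files to their destinations and names the lifecycle scripts to run. A hypothetical example (not the demo repository's actual file; the script names and paths are assumptions) looks like this:

version: 0.0
os: linux
files:
  - source: /
    destination: /var/www/html
hooks:
  BeforeInstall:
    - location: scripts/install_dependencies.sh
      timeout: 300
      runas: root
  ApplicationStart:
    - location: scripts/start_server.sh
      timeout: 300
      runas: root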

  1. Sign in to the AWS Management Console, open the S3 console, and click Create Bucket.
  2. Enter a name under Bucket name. The name must be unique across all of S3.
    Be sure to copy the S3 bucket name into your text document.
  3. Ensure Block all public access is checked.
  4. Click Create bucket.

 

Create IAM policy

An IAM policy is a set of rules that, when attached to an identity or resource, defines their permissions. For this use case, you want a policy that only allows AWS CodeDeploy agent to read the S3 bucket you just created. In the next set of steps, you define the policy. Then in the subsequent section you apply that policy to an IAM user. That user ultimately is associated with the CodeDeploy agent running on your Lightsail instance.

  1. Sign in to the AWS Management Console and open the IAM console.
  2. In the navigation pane, choose Policies, and then choose Create policy.
  3. Click on the JSON tab.

Erase the content in the editor window, and paste in the code from below.

NOTE: Be sure to replace <S3 Bucket Name> with the name of the S3 bucket you created in the previous step.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:Get*",
        "s3:List*"
      ],
      "Resource": [
        "arn:aws:s3:::<S3 Bucket Name>/*"
      ]
    }
  ]
}

JSON editor revised

  4. Click Review policy.
  5. Enter CodeDeployS3BucketPolicy for the policy name.
  6. Click Create policy.

Create an IAM user

Because you cannot assign an IAM role to a Lightsail instance, you need to create your IAM user with the appropriate permissions. In this case the user will need to be able to list the contents of the S3 bucket you just created.

  1. Stay in the IAM console.
  2. In the navigation pane, choose Users, and then choose Add user.
  3. Enter LightSailCodeDeployUser for the User name and click Programmatic access under Select AWS access type (you use programmatic access since this user account will never need to log into the console). Click Next: permissions.
  4. Click Attach existing policies directly. Enter CodeDeployS3BucketPolicy in the search box, and check the box next to the CodeDeployS3BucketPolicy policy.
  5. Click Next: Tags. Click Next: Review. Click Create user.
  6. Copy the Access key ID and Secret access key into your text document. You will need to click Show to display the secret access key. Note: If you do not copy these values now, you cannot go back and retrieve them from the console. You will need to create a new set of credentials.

access key and secret access key

  7. Click Close.
  8. Click on the user you just created and copy the User ARN into your document.
At this point you have created an S3 bucket where CodeDeploy can store your build artifact, as well as the IAM components you need (a service role, an IAM policy, and an IAM user) to configure the CodeDeploy agent. In the next step, you deploy a Lightsail instance with the CodeDeploy agent, and then you register that instance with CodeDeploy.

Create a Lightsail instance and install the CodeDeploy agent

In this section you create the Lightsail instance where you want your code to run. In order for the instance to work with CodeDeploy you must install the CodeDeploy agent. The agent installation is done by providing a startup script that runs when the instance is first created.

  1. Log in to the AWS Management Console, and navigate to the Lightsail home page.
  2. Click Create instance.

    Lightsail instance location

  3. Ensure that you're creating your instance in the correct AWS Region.
  4. Under Pick your instance image click on Linux/Unix. Click on OS Only. Select Amazon Linux.

    instance image lightsail

  5. Scroll down and click + Add launch script. In the code below, paste in your Access key ID, Secret access key, and IAM User ARN from your text document. Also replace <Desired Region> with the Region that you deployed the instance into (e.g. us-west-2).

After you edit the code below, paste it into the launch script edit window. The configuration below allows the CodeDeploy agent to run with the permissions you assigned to the IAM user earlier. These permissions allow the CodeDeploy agent to download the deployment artifact created by the CodeDeploy service from the S3 bucket where it is stored. Additionally, the agent uses the information in the artifact to deploy or update your application.

mkdir /etc/codedeploy-agent/
mkdir /etc/codedeploy-agent/conf
cat <<EOT >> /etc/codedeploy-agent/conf/codedeploy.onpremises.yml
---
aws_access_key_id: <Access Key ID>
aws_secret_access_key: <Secret Access Key>
iam_user_arn: <IAM User ARN>
region: <Desired Region>
EOT
wget https://aws-codedeploy-us-west-2.s3.us-west-2.amazonaws.com/latest/install
chmod +x ./install
sudo ./install auto

  6. Enter codedeploy for the instance name under Identify your instance. Click Create Instance.

Verify the CodeDeploy agent

Wait about 5-10 minutes for the instance to boot up and run the startup script.

In this section you verify that the CodeDeploy agent is up and running, register your instance with CodeDeploy, and, finally, tag the instance.

Note: You will be using the command line for both your Lightsail instance and on your local machine. Pay careful attention to the instructions to ensure you’re issuing the commands on the right command line.

  1. Start an SSH session by clicking on the terminal icon next to the name of your instance.
  2. On the command line of the Lightsail terminal session enter the command below to verify the CodeDeploy agent is running.

sudo service codedeploy-agent status

You should see a response similar to the one below (the PID will be different):

The AWS CodeDeploy agent is running as PID 2783

  3. Enter the command below using the AWS CLI in a terminal session on your local machine to register your Lightsail instance with CodeDeploy.

NOTE: Replace <IAM User ARN> with the value in your document and <Desired Region> with the appropriate Region.
NOTE: If you did not name your Lightsail instance codedeploy, you will need to adjust the --instance-name parameter accordingly.
NOTE: The command does not provide any output.

aws deploy register-on-premises-instance --instance-name codedeploy --iam-user-arn <IAM User ARN> --region <Desired Region>

  4. Enter the following command using the AWS CLI in a terminal session on your local machine to tag your Lightsail instance in CodeDeploy. The tag will be used by CodeDeploy to know where to install your code.
    NOTE: If you did not name your Lightsail instance codedeploy, you will need to adjust the --instance-name parameter accordingly.
    NOTE: Replace <Desired Region> with the appropriate Region.
    NOTE: The command does not provide any output.

aws deploy add-tags-to-on-premises-instances --instance-names codedeploy --tags Key=Name,Value=CodeDeployLightsailDemo --region <Desired Region>

  5. Enter the command below using the AWS CLI in a terminal session on your local machine to verify your instance was successfully registered:
    NOTE: Replace <Desired Region> with the appropriate Region.

aws deploy list-on-premises-instances --region <Desired Region>

You should see output similar to:

{
    "instanceNames": [
        "codedeploy"
    ]
}

At this point you are now ready to set up your actual code deployment using CodePipeline and CodeDeploy.

Setup the application in CodeDeploy

  1. Navigate to the CodeDeploy console, make sure you're in the correct Region, and click Create application.
  2. Enter CodeDeployLightsailDemo for the Application name and select EC2/On-premises under Compute platform. Click Create application.
  3. In the Deployment groups section Click Create deployment group.
  4. Enter CodeDeployLightsailDemoDeploymentGroup for the Deployment group name.
  5. Click in the text box for Enter a service role and select the service role you created earlier (CodeDeployServiceRole).
  6. Under Environment configuration, check the box for On-premises instances. Under Key enter Name, and under Value enter CodeDeployLightsailDemo.
  7. Under Load balancer uncheck Enable load balancing.
  8. Click Create deployment group.

 

Fork the GitHub Repo

In this section, you connect your GitHub account to CodePipeline so that whenever you push a change to GitHub your new code will automatically be deployed. For this tutorial, I placed a small demo application in a GitHub repository. These next steps guide you through forking that code into your own GitHub account.

  1. Sign in to GitHub.
  2. Navigate to the demo repository: http://github.com/mikegcoleman/codedeploygithubdemo
  3. To the right of the repository name at the top, click Fork.
  4. Click on the account that you want to fork the repository into.
    After a few seconds the fork process completes, and you are redirected to the new repo in your account.

Setup CodePipeline

As the name implies, CodePipeline allows you to create an automated set of steps for your application deployment. For instance, build and test processes that must happen before a final deployment step. In this example you’re going to build a very simple pipeline that will redeploy your application when a change is pushed to the associated GitHub repository.

  1. Navigate to the CodePipeline console, ensure you’re in the correct Region, and click Create pipeline.
  2. Enter CodeDeployLightsailDemoPipeline for the Pipeline name.
  3. Click on Advanced Settings. Under Artifact store, click the radio button next to Custom Location. Click into the Bucket text box and select the S3 bucket you created earlier. Click Next.
  4. From the Source provider drop-down, choose GitHub. Click Connect to GitHub and follow any prompts to authorize CodePipeline to access your GitHub account.
    1. Note: If you've connected GitHub previously there will not be any additional prompts.
  5. Click in the Repository text box and select the repository you forked earlier. The name should be <your github username>/codedeploygithubdemo.
  6. Click in the Branch box and choose master. Click Next. Since there isn't a build stage, click Skip build stage and confirm by clicking Skip.
  7. Choose AWS CodeDeploy from the Deploy provider list. Ensure the appropriate Region is selected. Choose CodeDeployLightsailDemo from the Application name list. Choose CodeDeployLightsailDemoDeploymentGroup from the Deployment group list.
    1. Click Next.
    2. Click Create pipeline.codedeploy configuration
  8. You'll be taken to the details page for your pipeline, and can watch the status of the pipeline update. Once the Deploy step has a status of Succeeded, feel free to move on to the next section.

 

Test and Update the Application

In this final section, you verify that the application deployed, and then you make an update to the application to kick off a new deployment. Finally, you verify that the update was successfully pushed to your server.

  1. Navigate in your web browser to the IP address of your Lightsail instance. You can find the IP address on the card for your instance on the Lightsail home page. You should see a simple webpage displayed.
    ip address
  2. Move to the command line of your local machine and clone the GitHub repository, being sure to insert your GitHub username.
    NOTE: If you are using SSH to authenticate to GitHub, adjust the git command accordingly.

    git clone https://github.com/<your github username>/codedeploygithubdemo

    You should see output similar to the following:

    Cloning into 'codedeploygithubdemo'...
    remote: Enumerating objects: 49, done.
    remote: Counting objects: 100% (49/49), done.
    remote: Compressing objects: 100% (33/33), done.
    remote: Total 49 (delta 25), reused 36 (delta 12), pack-reused 0
    Unpacking objects: 100% (49/49), done.

  3. Change into the directory with the website code:

    cd codedeploygithubdemo

  4. Using an editor of your choice, edit the index.html file by changing the background color to purple:

    background-color: purple;

  5. Push the changes to GitHub by issuing each of the following commands one at a time:

    git add index.html
    git commit -m "new background color"
    git push origin master
  6. Navigate back to the CodePipeline console and click on the name of your pipeline. You should see that the change from GitHub has been picked up and the pipeline is deploying your website. Once the pipeline has successfully completed move to the next step.
  7. Reload the demo website to see that the background color has changed.

Conclusion

Congratulations on finishing this tutorial. You should now have an understanding of how to automate the deployment and updating of your application running on Lightsail using CodeDeploy and CodePipeline. As a next step, you might want to try and deploy a more complex application to Lightsail, including adding a build step. You can find information on how to do that in AWS CodeBuild documentation.

 

 

 

Deploying an ASP.NET Core web application to Amazon ECS using an Azure DevOps pipeline

Post Syndicated from John Formento original https://aws.amazon.com/blogs/devops/deploying-a-asp-net-core-web-application-to-amazon-ecs-using-an-azure-devops-pipeline/

For .NET developers, leveraging Team Foundation Server (TFS) has been the cornerstone for CI/CD over the years. As more and more .NET developers start to deploy onto AWS, they have been asking questions about using the same tools to deploy to the AWS cloud. By configuring a pipeline in Azure DevOps to deploy to the AWS cloud, you can easily use familiar Microsoft development tools to build great applications.

Solution overview

This blog post demonstrates how to create a simple Azure DevOps project, repository, and pipeline to deploy an ASP.NET Core web application to Amazon ECS using Azure DevOps. The following diagram shows the high-level architecture of the pipeline:

 

Solution Architecture Diagram

In this example, you perform the following steps:

  1. Create an Azure DevOps Project, clone project repo, and push ASP.NET Core web application.
  2. Create a pipeline in Azure DevOps
  3. Build an Amazon ECS Cluster, Task and Service.
  4. Kick off deployment of the ASP.NET Core web application using the newly created Azure DevOps pipeline.

 

Prerequisites

Ensure you have the following prerequisites set up:

  • An Amazon ECR repository
  • An IAM user with permissions for Amazon ECR and Amazon ECS (the user will need an access key and secret access key)

 

Create an Azure DevOps Project, clone project repo, and push ASP.NET Core web application

Follow these steps to deploy a .NET Core app onto your Amazon ECS cluster using the Azure DevOps (ADO) repository and pipeline:

 

  1. Login to dev.azure.com and navigate to the marketplace.
  2. Go to Visual Studio, search for “AWS”, and add the AWS Tools for Microsoft Visual Studio Team Services.
  3. Create a project in ADO: Provide a project name and choose Create.
  4. On the Project Summary page, choose Project Settings.
  5. In the Project Settings pane, navigate to the Service Connections page.
  6. Choose Create service connection, select AWS, and choose Next.
  7. Input an Access Key ID and Secret Access Key. (You’ll need an IAM user with permissions for Amazon ECR and Amazon ECS in order to deploy via the Azure DevOps pipeline.) Choose Save.
  8. Choose Repos in the left pane, then Clone in Visual Studio under Clone to your computer.
  9. Create a ASP.NET Core web application in Visual Studio, set the location to locally cloned repository, and check Enable Docker support.
  10. Once you’ve created the new project, perform an initial commit and push to the repository in Azure DevOps.

 

Creating a pipeline in Azure DevOps

Now that you have synced the repository, create a pipeline in Azure DevOps.

  1. Go to the pipeline page within Azure DevOps and choose Create Pipeline.
  2. Choose Use the classic editor.
  3. Select Azure Repos Git for the location of your code and select the repository you created earlier.
  4. On the Choose a Template page, select Docker Container and choose Apply.
  5. Remove the Push an image step.
  6. Add an Amazon ECR Push task by choosing the + symbol next to Agent job 1. You can search for “AWS” in the Add tasks pane to filter for all AWS tasks.

 

Now, configure each task:

  1. Choose the Build an image task and ensure that the action is set to Build an image. Additionally, you can modify the Image Name to your standards.
  2. Choose the Push Image task and provide the following
    • Enter a name under Display Name.
    • Select the AWS Credentials that you created in Service Connections.
    • Select the AWS Region.
    • Provide the source image name, which you can find in the setting for the Build an image task.
    • Enter the name of the repository in Amazon ECR to which the image is pushed.
  3. Choose Save and queue.

Build Amazon ECS Cluster, Task, and Service

The goal here is to test the pipeline up to the point of building the Docker image and ensuring it's pushed to Amazon ECR. Once the Docker image is in Amazon ECR, you can create the Amazon ECS cluster, task definition, and service leveraging the newly created Docker image.

  1. Create an Amazon ECS cluster.
  2. Create an Amazon ECS task definition. When you create the task definition and configure the container, use the Amazon ECR URI for the Docker image that was just pushed to Amazon ECR.
  3. Create an Amazon ECS service.

Go back and edit the pipeline:

  1. Add the last step by choosing the + symbol next to Agent job 1.
  2. Search for “AWS CLI” in the search bar and add the task.
  3. Choose AWS CLI and configure the task.
  4. Enter a name under Display Name, such as Update ECS Service.
  5. Select the AWS Credentials that you created in Service Connections.
  6. Select the AWS Region.
  7. Input the following command, which updates the Amazon ECS service after a new image is pushed to Amazon ECR. Replace <clustername> and <servicename> with your Amazon ECS cluster and service names. (The assembled command is shown in full after this list.)
    • Command: ecs
    • Subcommand: update-service
    • Options and parameters: --cluster <clustername> --service <servicename> --force-new-deployment
  8. Now choose the Triggers tab and select Enable continuous integration with the repository you created.
  9. Choose Save and queue.
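For reference, the task you configured above is equivalent to running the following AWS CLI command yourself:

aws ecs update-service --cluster <clustername> --service <servicename> --force-new-deployment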

 

At this point, your build pipeline kicks off and builds a Docker image from the source code in the repository you created, pushes the image to Amazon ECR, and updates the Amazon ECS service with the new image.

You can verify by viewing the build: choose Pipelines in Azure DevOps, select the entry for the latest run, and then choose the icon under the status column. Once it successfully completes, you can log in to the AWS console and view the updated image in Amazon ECR and the updated service in Amazon ECS.

Every time you commit and push your code through Visual Studio, this pipeline kicks off and builds and deploys your application to Amazon ECS.

Cleanup

At the end of this example, once you’ve completed all steps and are finished testing, follow these steps to disable or delete resources to avoid incurring costs:

  1. Go to the Amazon ECS console within the AWS Console.
  2. Navigate to the cluster you created, then choose the Tasks tab.
  3. Choose Stop all to turn off the tasks.

Conclusion

This blog post reviewed how to create a CI/CD pipeline in Azure DevOps to deploy a Docker Image to Amazon ECR and container to Amazon ECS. It provided detailed steps on how to set up a basic CI/CD pipeline, leveraging tools with which .NET developers are familiar and the steps needed to integrate with Amazon ECR and Amazon ECS.

I hope this post was informative and has helped you learn the basics of how to integrate Amazon ECR and Amazon ECS with Azure DevOps to create a robust CI/CD pipeline.

About the Authors

John Formento

 

 

John Formento is a Solution Architect at Amazon Web Services. He helps large enterprises achieve their goals by architecting secure and scalable solutions on the AWS Cloud.

Using Load Balancers on Amazon Lightsail

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/using-load-balancers-on-amazon-lightsail/

This post was written by Robert Zhu, Principal Developer Advocate at AWS. 

As your web application grows, it needs to scale up to maintain performance and improve availability. A load balancer is an important tool in any developer's tool belt. In this post, I show how to load balance a simple Node.js web application using the Amazon Lightsail Load Balancer. I chose Node.js as an example, but the concepts and techniques I discuss work with any other web application stack such as PHP, Django, .NET, or even plain old HTML. To demonstrate these concepts, I deploy a load balancer that splits traffic between two Amazon Lightsail instances using round-robin load balancing. Compared to other load balancing solutions on AWS, such as the Application Load Balancer and Network Load Balancer, the Lightsail Load Balancer is simpler, easier to use, and has a fixed cost of $18/month, regardless of how much bandwidth you use or how many connections are open.

Launch the Cluster

For the demo, first create a cluster consisting of two Lightsail instances. To do so, navigate to the Lightsail console, and launch two micro Lightsail instances using the Ubuntu 18.04 image.

After you select your Region and OS, select an instance plan and identify your instance. Make sure that your instances have enough memory and compute to run your application. For this demo, the smallest instance size suffices.

Instance plan and name

 

Once your instances are ready, click each instance and open an SSH session and run the following commands:

 

sudo apt-get update
sudo apt-get -y install nodejs npm
git clone https://github.com/robzhu/colornode
cd colornode && npm install

 

The above commands update the APT package registry and install Node.js and npm, which are needed to run the application. When installing npm, select "yes" when you see the following prompt:

npm prompt

Once installation is complete, you can run “node -v” and “npm -v” to make sure everything works. While this is an older version of Node.js, you should be able to use it without any issues. Remember to run these steps for both instances.

The application

Let’s take a tour of the application. Here’s the main.js source file:

 

const color = require("randomcolor")();
const name = require("./randomName");
const app = require("express")();

app.get("/", (_, res) => {
  res.send(`
    <html>
      <body bgcolor=${color}>
      <h1>${name}</h1>
    </html>
  `);
});

app.get("/health", (_, res) => {
  res.status(200).send("I'm OK");
});

app.listen(80, () => console.log(`${name} started on port 80`));

 

Upon startup, the application generates a random name and color. It then uses express-js to serve responses to requests at two routes: “/” and “/health”. When a client requests the “/” route, you respond with an HTML snippet that contains the name and color of the application. When a client requests the “/health” route, you respond with HTTP status code 200. Later, I use the health route to enable Lightsail to determine the health of a node and whether the load balancer should continue routing requests to that node.

 

You can start the application by running “sudo node main.js”. Do this for both instances, and you should see two distinct webpages at each instance’s IP address:

images of two instance urls

 

Launch the Load Balancer

To create a load balancer, open the Network tab in the Lightsail console and click Create load balancer.

Networking tab of Lightsail console

On the next page, name your load balancer and click Create load balancer. After that, a page is shown where you can add target instances to the load balancer:

target instances

From the dropdown menu under Target instances, select each instance and attach it to the load balancer, like so:

 

target instances 2

image of first loadbalancer

Once both instances are attached, copy the DNS name for your load balancer and paste it into a new browser session. When the load balancer receives the browser request, it routes the request to one of the two instances. As you refresh the browser, you should see the returned response alternate between the two instances. This should clearly illustrate the functionality of the load balancer.
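You can also watch the rotation from a terminal on your local machine (replace the placeholder with your load balancer's DNS name):

for i in 1 2 3 4; do curl -s http://<load balancer DNS name>/ | grep "<h1>"; done

Each request prints the <h1> line from the response, so the two instance names should alternate.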

Health Checks

Over time, your instances inevitably exhibit faulty behavior due to application errors, network partitions, reboots, etc. Load balancers need to detect faulty nodes in order to avoid routing traffic to them. The simplest way to implement a load balancer health check is to expose an HTTP endpoint on your nodes. By default, the Lightsail Load Balancer performs health checks by issuing an HTTP GET request against the node IP address directly. However, recall that the application serves a simple HTTP 200 response at the “/health” route. You can configure the Load Balancer to use this route to perform health checks on our nodes instead.

 

Within your Lightsail console, open your load balancer, click Customize health checking, and enter health for the route:

health check screenshot

 

Now, the Lightsail load balancer uses http://{instanceIP}/health to perform the health check. You can further customize your load balancer by using a custom domain and adding SSL termination.

 

The custom health check feature allows you to implement deeper health check logic. For example, if you want to make sure that the entire application stack is running, you can write a test value to the database and read it back. That gives your application a much higher assurance of functionality than simply returning status code 200 from an HTTP endpoint.
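For example, a deeper health route might look like this (a sketch that extends main.js above; the db module is a hypothetical data-store client with get and set methods):

const app = require("express")();   // same express app as in main.js
const db = require("./db");         // hypothetical data-store client

app.get("/health", async (_, res) => {
  try {
    const value = Date.now().toString();
    await db.set("healthcheck", value);            // write a test value...
    const readBack = await db.get("healthcheck");  // ...and read it back
    if (readBack !== value) throw new Error("read-back mismatch");
    res.status(200).send("I'm OK");
  } catch (err) {
    // A non-200 response causes the load balancer to mark this node unhealthy.
    res.status(500).send(`Unhealthy: ${err.message}`);
  }
});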

 

Conclusion

In summary, you created a Lightsail Load Balancer and used it to route traffic between two Lightsail instances serving the demo application. If one of these instances crashes or becomes unreachable, the load balancer automatically routes traffic to the remaining healthy nodes in the cluster based on a customizable HTTP health check. In comparison to other AWS load balancing solutions, the Lightsail Load Balancer is simpler, easier to use, and uses a fixed-pricing model. Please leave feedback in the comments or reach out to me on Twitter or email!

 

In a follow-up post, I walk through some more advanced concepts:

  • SSL termination
  • Custom Domains
  • Deep health check
  • Websocket connections

 

About the Author

Robert Zhu is a principal developer advocate at AWS.

Email: [email protected]

Twitter: @rbzhu