For .NET developers, Team Foundation Server (TFS) has long been the cornerstone of CI/CD. As more .NET developers deploy to AWS, they ask how to use those same tools to deploy to the AWS Cloud. By configuring a pipeline in Azure DevOps that deploys to AWS, you can keep using familiar Microsoft development tools to build great applications.
Solution overview
This blog post demonstrates how to create a simple Azure DevOps project, repository, and pipeline, and then use that pipeline to deploy an ASP.NET Core web application to Amazon ECS. The following diagram shows the high-level architecture of the pipeline:
In this example, you perform the following steps:
Create an Azure DevOps Project, clone project repo, and push ASP.NET Core web application.
Create a pipeline in Azure DevOps.
Build an Amazon ECS Cluster, Task, and Service.
Kick off deployment of the ASP.NET Core web application using the newly created Azure DevOps pipeline.
Prerequisites
Ensure you have the following prerequisites set up:
An IAM user with permissions for Amazon ECR and Amazon ECS (the user will need an access key and secret access key)
Create an Azure DevOps Project, clone project repo, and push ASP.NET Core web application
Follow these steps to deploy a .NET Core app onto your Amazon ECS cluster using the Azure DevOps (ADO) repository and pipeline:
Login to dev.azure.com and navigate to the marketplace.
In the Visual Studio Marketplace, search for “AWS” and add the AWS Tools for Microsoft Visual Studio Team Services extension.
Create a project in ADO: Provide a project name and choose Create.
On the Project Summary page, choose Project Settings.
In the Project Settings pane, navigate to the Service Connections page.
Choose Create service connection, select AWS, and choose Next.
Input an Access Key ID and Secret Access Key. (You’ll need an IAM user with permissions for Amazon ECR and Amazon ECS in order to deploy via the Azure DevOps pipeline.) Choose Save.
Choose Repos in the left pane, then Clone in Visual Studio under Clone to your computer.
Create an ASP.NET Core web application in Visual Studio, set the location to the locally cloned repository, and check Enable Docker support.
Once you’ve created the new project, perform an initial commit and push to the repository in Azure DevOps.
Creating a pipeline in Azure DevOps
Now that you have synced the repository, create a pipeline in Azure DevOps.
Go to the pipeline page within Azure DevOps and choose Create Pipeline.
Choose Use the classic editor.
Select Azure Repos Git for the location of your code and select the repository you created earlier.
On the Choose a Template page, select Docker Container and choose Apply.
Remove the Push an image step.
Add an Amazon ECR Push task by choosing the + symbol next to Agent job 1. You can search for “AWS” in the Add tasks pane to filter for all AWS tasks.
Now, configure each task:
Choose the Build an image task and ensure that the action is set to Build an image. Additionally, you can modify the Image Name to your standards.
Choose the Push Image task and provide the following:
Enter a name under Display Name.
Select the AWS Credentials that you created in Service Connections.
Select the AWS Region.
Provide the source image name, which you can find in the setting for the Build an image task.
Enter the name of the repository in Amazon ECR to which the image is pushed.
Choose Save and queue.
Build Amazon ECS Cluster, Task, and Service
The goal here is to test up to building the Docker image and ensure it’s pushed to Amazon ECR. Once the Docker image is in Amazon ECR, you can create the Amazon ECS cluster, task definition, and service leveraging the newly created Docker image.
Create an Amazon ECS task definition. When you create the task definition and configure the container, use the Amazon ECR URI for the Docker image that was just pushed to Amazon ECR.
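If you prefer to script the task definition instead of using the console, a minimal boto3 sketch might look like the following. The family, container, role, and image values are placeholders, and the image URI should point at the image your pipeline pushed to Amazon ECR:

import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

# Placeholder values -- substitute your own account ID, repository, and role.
image_uri = "123456789012.dkr.ecr.us-east-1.amazonaws.com/aspnetcore-web:latest"
execution_role_arn = "arn:aws:iam::123456789012:role/ecsTaskExecutionRole"

response = ecs.register_task_definition(
    family="aspnetcore-web",
    networkMode="awsvpc",
    requiresCompatibilities=["FARGATE"],
    cpu="256",
    memory="512",
    executionRoleArn=execution_role_arn,
    containerDefinitions=[
        {
            "name": "aspnetcore-web",
            "image": image_uri,
            "essential": True,
            "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
        }
    ],
)
print(response["taskDefinition"]["taskDefinitionArn"])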
Add the last step by choosing the + symbol next to Agent job 1.
Search for “AWS CLI” in the search bar and add the task.
Choose AWS CLI and configure the task.
Enter a name under Display Name, such as Update ECS Service.
Select the AWS Credentials that you created in Service Connections.
Select the AWS Region.
Input the following command, which updates the Amazon ECS service after a new image is pushed to Amazon ECR. Replace <clustername> and <servicename> with your Amazon ECS cluster and service names.
Command: ecs
Subcommand:update-service
Options and parameters: --cluster <clustername> --service <servicename> --force-new-deployment
Now choose the Triggers tab and select Enable continuous integration with the repository you created.
Choose Save and queue.
At this point, your build pipeline kicks off and builds a Docker image from the source code in the repository you created, pushes the image to Amazon ECR, and updates the Amazon ECS service with the new image.
You can verify by viewing the build: choose Pipelines in Azure DevOps, select the entry for the latest run, and then choose the icon under the Status column. Once it completes successfully, you can log in to the AWS console and view the updated image in Amazon ECR and the updated service in Amazon ECS.
Every time you commit and push your code through Visual Studio, this pipeline kicks off and builds and deploys your application to Amazon ECS.
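If you want to run the same service update outside the pipeline, for example while testing, the AWS CLI command above has a direct boto3 equivalent. This is a minimal sketch with placeholder cluster and service names:

import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

# Placeholders -- use your own cluster and service names.
response = ecs.update_service(
    cluster="my-cluster",
    service="my-service",
    forceNewDeployment=True,
)
print(response["service"]["deployments"][0]["status"])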
Cleanup
At the end of this example, once you’ve completed all steps and are finished testing, follow these steps to disable or delete resources to avoid incurring costs:
Go to the Amazon ECS console within the AWS Console.
Navigate to the cluster you created, then choose the Tasks tab.
Choose Stop all to turn off the tasks.
Conclusion
This blog post reviewed how to create a CI/CD pipeline in Azure DevOps that deploys a Docker image to Amazon ECR and a container to Amazon ECS. It provided detailed steps for setting up a basic CI/CD pipeline using tools that .NET developers are already familiar with, along with the steps needed to integrate with Amazon ECR and Amazon ECS.
I hope this post was informative and has helped you learn the basics of how to integrate Amazon ECR and Amazon ECS with Azure DevOps to create a robust CI/CD pipeline.
About the Authors
John Formento is a Solution Architect at Amazon Web Services. He helps large enterprises achieve their goals by architecting secure and scalable solutions on the AWS Cloud.
This is a guest post by Floating Point Group. In their own words, “Floating Point Group is on a mission to bring institutional-grade trading services to the world of cryptocurrency.”
The need and demand for financial infrastructure designed specifically for trading digital assets may not be obvious. There’s a rather pervasive narrative that these coins and tokens are effectively natively digital counterparts to traditional assets such as currencies, commodities, equities, and fixed income. This narrative often manifests in the form of pithy one-liners recycled by pundits attempting to communicate the value proposition of various projects in the space (such as, “Bitcoin is just a currency with an algorithmically controlled, tamper-proof monetary policy,” or, “Ether is just a commodity like gasoline that you can use to pay for computational work on a global computer.”). Unsurprisingly, we at FPG often hear the question, “What’s so special about cryptocurrencies that they warrant dedicated financial services? Why do we need solutions for problems that have already been solved?”
The truth is that these assets and the widespread public interest surrounding them are entirely unprecedented. The decentralized ledger technology that serves as an immutable record of network transactions, the clever use of proof-of-work algorithms to economically incentivize rational actors to help uphold the security of the network (the proof-of-work concept dates back at least as far as 1993, but it was not until bitcoin that the technology showed potential for widespread adoption), the irreversible nature of transactions that poses unique legal challenges in cases such as human error or extortion, the precariousness of self-custody (third-party custody solutions don’t exactly have track records that inspire trust), the regulatory uncertainties that come with the difficulty of both classifying these assets as well as arbitrating their exchange which must ultimately be reconciled by entities like the IRS, SEC, and CFTC—it is all very new, and very weird. With 24-hour market volume regularly exceeding $100 billion, we decided to direct our focus towards problems related specifically to trading these assets. Granted, crypto trading has undoubtedly matured since the days of bartering for bitcoin in web forums and witnessing 10% price spreads between international exchanges. But there is still a long path ahead.
One major pain point we are aiming to address for institutional traders involves liquidity (or, more precisely, the lack thereof). Simply put, the buying and selling of cryptocurrencies occurs across many different trading venues (exchanges), and liquidity (the offers to buy or sell a certain quantity of an asset at a certain price) continues to become more fragmented as new exchanges emerge. So say you’re trying to buy 100 bitcoins. You must buy from people who are willing to sell. As you take the best (cheapest) offers, you’re left with increasingly expensive offers. By the time you fill your order (in this example, buy all 100 bitcoins), you may have paid a much higher average price than, say, the price you paid for the first bitcoin of your order. This phenomenon is referred to as slippage. One easy way to minimize slippage is by expanding your search for offers. So rather than looking at the offers on just one exchange, look at the offers across hundreds of exchanges. This process, traditionally referred to as smart order routing (SOR), is one of the core services we provide. Our SOR service allows traders to easily submit orders that our system can match against the best offers available across multiple trading venues by actively monitoring liquidity across dozens of exchanges.
Fanning out large orders in search of the best prices is a rather intuitive and widely applicable concept—roughly 75% of equities are purchased and sold via SOR. But the value of such a service for crypto markets is particularly salient: a perpetual cycle of new exchanges surging in popularity while incumbents falter has resulted in a seemingly incessant fragmentation of liquidity across trading venues—yet traders tend to assume an exchange-agnostic mindset, concerned exclusively with finding the best price for a given quantity of an asset.
Access to both real-time and historical market data is essential to the functionality of our SOR service. The highest resolution data we could hope to obtain for a given market would include every trade and every change applied to the order book, effectively allowing us to recreate the state of a market at any given point in time. The updates provided through the WebSocket streams are not sufficient for reconstructing order books. We also need to periodically fetch snapshots of the order books and store those, which we can do using an exchange’s REST API. We can fetch a snapshot and apply the corresponding updates from the streams to “replay” the order book.
Fortunately, this data is freely available, because many exchanges offer real-time feeds of market data via WebSocket APIs. We found several third-party vendors selling subscriptions to these data sets, typically in the form of CSV dumps delivered at a weekly or monthly cadence. This presented the question of build vs. buy. Given that we felt capable of building a robust and reliable system for ingesting real-time market data in a relatively short amount of time and at a fraction of the cost of purchasing the data from a vendor, we were already leaning in favor of building. Further investigation made buying look like an increasingly unattractive option. Disclaimers that multiple vendors issued about their inability to guarantee data quality and consistency did not inspire confidence. Inspecting sample data sets revealed that some essential fields provided in the original data streams were missing—fields necessary for achieving our goal of recreating the state of a market at an arbitrary point in time. We also recognized that a weekly or monthly delivery schedule would restrict our ability to explore relatively recent market data.
This post provides a high-level overview of how we ingest and store real-time market data and how we use the AWS Data Exchange API to organize and publish our data sets programmatically. Our system’s functionality extends well beyond data ingestion, normalization, and persistence; we run dedicated services for data validation, caching the most recent trade and order book for every market, computing and storing derivative metrics, and other services that help safeguard data accuracy and minimize the latency of our trading systems.
Data ingestion
The WebSocket streams we connect to for data consumption are often the same APIs responsible for providing real-time updates to an exchange’s trading dashboard.
WebSocket connections transmit data as discrete messages. We can inspect the content of individual messages as they stream into the browser. For example, the following screenshot shows a batch of order book updates.
The updates are expressed as arrays of bids and asks that were either added to the book or removed from it. Client-side code processes each update, resulting in a real-time rendering of the market’s order book. In practice, our data ingestion service (Ingester) does not read a single stream, but rather thousands of different streams, covering various data feeds for all markets across multiple exchanges. All the connections required for such broad coverage and the resulting flood of incoming data raise some obvious concerns about data loss. We’ve taken several measures to mitigate such concerns, including a redundant system design that allows us to spin up an arbitrary number of instances of the Ingester service. Like most of our microservices, Ingester is a Dockerized service run on Amazon ECS and deployed via Terraform.
All these instances consume the same data feeds as each other while a downstream mechanism handles deduplication (this is covered in more detail later in this post). We also set up Amazon CloudWatch alerts to notify us when we detect non-contiguous messages, indicating a gap in the incoming data. The alerts don’t directly mitigate data loss, but they do serve the important function of prompting an investigation.
Ingester builds up separate buffers of incoming messages, split out by data-type/exchange/market. Then, after a fixed time interval, each buffer is flushed into Amazon S3 as a gzipped JSON file. The buffer-flush cycle repeats.
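Ingester's implementation isn't shown in this post, but the buffer-and-flush cycle can be sketched roughly as follows; the bucket name and key scheme are illustrative assumptions:

import gzip
import json
from collections import defaultdict

import boto3

s3 = boto3.client("s3")
BUCKET = "example-market-data"  # assumption: illustrative bucket name

# One buffer per (data_type, exchange, market) combination.
buffers = defaultdict(list)

def buffer_message(data_type, exchange, market, message):
    # Accumulate each incoming WebSocket message in its own buffer.
    buffers[(data_type, exchange, market)].append(message)

def flush_buffers(batch_start):
    # After each fixed interval, write every buffer to S3 as a gzipped JSON
    # file and clear it; batch_start identifies the interval (e.g. a minute).
    for (data_type, exchange, market), messages in buffers.items():
        if not messages:
            continue
        key = f"{data_type}/{exchange}/{market}/{batch_start}.json.gz"
        body = gzip.compress(json.dumps(messages).encode("utf-8"))
        s3.put_object(Bucket=BUCKET, Key=key, Body=body)
        messages.clear()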
The following screenshot shows a portion of the file content.
This code snippet is a single, pretty-printed JSON record from the file in the screenshot above.
Ingester handles additional functionality, such as applying pre-defined mappings of venue-specific field names to our internal field names. Data normalization is one of many processes necessary to enable our systems to build a holistic understanding of market dynamics.
As with most distributed system designs, our services are written with horizontal scalability as a first-order priority. We took the same approach in designing our data ingestion service, but it has some features that make it a bit different than the archetypical horizontally scalable microservice. The most common motivations for adjusting the number of instances of a given service are load-balancing and throttling throughput. Either your system is experiencing backpressure and a consumer service scales to alleviate that pressure, or the consumer is over-provisioned and you scale down the number of instances for the sake of parsimony. For our data ingestion service, however, our motivation for running multiple instances is to minimize data loss via redundancy. The CPU usage for each instance is independent of instance count, because each instance does identical work.
For example, rather than helping alleviate backpressure by pulling messages from a single queue, each instance of our data ingestion service connects to the same WebSocket streams and performs the same amount of work. Another somewhat unusual and confounding aspect of horizontally scaling our data ingestion service is related to state: we batch records in memory and flush the records to S3 every minute (based on the incoming message’s timestamp, not the system timestamp, because those would be inconsistent). Redundancy is our primary measure for minimizing data loss, but we also need each instance to write the files to S3 in such a way that we don’t end up with duplicate records. Our first thought was that we’d need a mechanism for coordinating activity across the instances, such as maintaining a cache that would allow us to check if a record had already been persisted. But we realized that we could perform this deduplication without any coordination between instances at all. Most of the message streams we consume publish messages with sequence IDs. We can combine the sequence IDs with the incoming message timestamp to achieve our deduplication mechanism: we can deterministically generate the same exact file names containing the exact same data by writing our service code to check that the message added to the batch has the appropriate sequence ID relative to the previous message in the batch and using the timestamp on the incoming message to determine the exact start and end of each batch (we typically get a UNIX timestamp and check when we’ve rolled over to the next clock minute). This allows us to simply rely on a key collision in S3 for deduplication.
AWS suggests a similar solution for a slightly different problem, relating to Amazon Kinesis Data Streams. For more information, see Handling Duplicate Records.
With this scheme, even if records are processed more than one time, the resulting Amazon S3 file has the same name and has the same data. The retries only result in writing the same data to the same file more than one time.
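As a simplified illustration of that idea (the message field names are assumptions), each instance can derive both the batch boundaries and the S3 key purely from the messages themselves, so identical input always yields an identical key and identical file contents, and later writes simply overwrite earlier ones:

from datetime import datetime, timezone

def batch_key(exchange, market, messages):
    # Messages are assumed to carry a monotonically increasing sequence ID
    # and a UNIX timestamp assigned by the exchange.
    for previous, current in zip(messages, messages[1:]):
        assert current["sequence"] == previous["sequence"] + 1, "gap detected"

    # Truncate the first message's timestamp to the start of its clock minute;
    # every instance consuming the same stream computes the same key.
    minute_start = datetime.fromtimestamp(
        messages[0]["timestamp"], tz=timezone.utc
    ).replace(second=0, microsecond=0)
    return f"trades/{exchange}/{market}/{minute_start:%Y-%m-%dT%H:%M}.json.gz"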
After we store the data, we can perform simple analytics queries on the billions of records we’ve stored in S3 using Amazon Athena, a query service that requires minimal configuration and zero infrastructure overhead. Athena has a concept of partitions (inherited from one of its underlying services, Apache Hive). Partitions are mappings between virtual columns (in our case: pair, year, month, and day) and the S3 directories in which the corresponding data is stored.
S3’s file system is not actually hierarchical. Files are prepended with long key prefixes that are rendered as directories in the AWS console when browsing a bucket’s contents. This has some non-trivial performance consequences when querying or filtering on large data sets.
The following screenshot illustrates a typical directory path.
By pointing Athena directly to a particular subset of data, a well-defined partitioning scheme can drastically reduce query run times and costs. Though the ability to perform ad hoc business analytics queries is primarily a convenience, taking time to choose a sane multi-level partitioning scheme for Athena based on some of our most common access patterns seemed worthwhile. A poorly designed partition structure can result in Athena unnecessarily scanning huge swaths of data and ultimately render the service unusable.
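As an illustration of how a day of data can be registered with one of these partitioned tables, a hedged sketch using boto3 and placeholder table, database, and bucket names might look like this:

import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Placeholder table name, partition values, and S3 locations.
ddl = """
ALTER TABLE trades ADD IF NOT EXISTS
PARTITION (pair = 'BTC-USD', year = '2020', month = '03', day = '01')
LOCATION 's3://example-market-data/trades/pair=BTC-USD/year=2020/month=03/day=01/'
"""

athena.start_query_execution(
    QueryString=ddl,
    QueryExecutionContext={"Database": "market_data"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)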
Data publication
Our pipeline for transforming thousands of small gzipped JSON files into clean CSVs and loading them into AWS Data Exchange involves three distinct jobs, each expressed as an AWS Lambda function.
Job 1
Job 1 is initiated shortly after midnight UTC by a cron-scheduled CloudWatch event. As mentioned previously, our data ingestion service’s batching mechanism flushes each batch to S3 at a regular time interval. A timestamp on the incoming message (applied server-side) determines the rollover from one interval to the next, as opposed to the ingestion service’s system timestamp, so in the rare case that a non-trivial amount of time elapses between the consumption of the final message of batch n and the first message of batch n+1, we kick off the first Lambda function 20 minutes after midnight UTC to minimize the likelihood of omitting data pending write.
Job 1 formats values for the date and data source into an Athena query template and outputs the query results as a CSV to a specified prefix path in S3. (Every Athena query produces a .metadata file and a CSV file of the query results, though DDL statements do not output a CSV.) This PUT request to S3 triggers an S3 event notification.
We run a full replica data ingestion system as an additional layer of redundancy. Using the coalesce conditional expression, the Athena query in Job 1 merges data from our primary system with the corresponding data from our replica system, and fills in any gaps while deduplicating redundant records.
We experimented fairly extensively with AWS Glue and PySpark for the ETL-related work performed in Job 1. When we realized that we could merge all the small source files into one, join the primary and replica data sets, and sort the results with a single Athena query, we decided to stick with this seemingly simpler and more elegant approach.
The following code sketches one of our Athena query templates.
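The table and column names below are illustrative assumptions rather than our exact production schema. The query merges the primary and replica tables with coalesce, deduplicates on sequence ID via a full outer join, and restricts the scan to a single day's partitions:

QUERY_TEMPLATE = """
SELECT
    coalesce(p.sequence_id, r.sequence_id) AS sequence_id,
    coalesce(p.ts, r.ts)                   AS ts,
    coalesce(p.price, r.price)             AS price,
    coalesce(p.size, r.size)               AS size,
    coalesce(p.side, r.side)               AS side
FROM
    (SELECT * FROM {source}_primary
     WHERE pair = '{pair}' AND year = '{year}' AND month = '{month}' AND day = '{day}') p
FULL OUTER JOIN
    (SELECT * FROM {source}_replica
     WHERE pair = '{pair}' AND year = '{year}' AND month = '{month}' AND day = '{day}') r
    ON p.sequence_id = r.sequence_id
ORDER BY sequence_id
"""

import boto3

athena = boto3.client("athena")
# Placeholder source, pair, date, and output location.
athena.start_query_execution(
    QueryString=QUERY_TEMPLATE.format(
        source="binance_trades", pair="BTC-USD", year="2020", month="03", day="01"
    ),
    QueryExecutionContext={"Database": "market_data"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/binance_trades/"},
)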
Job 2
Job 2 is triggered by the S3 event notification from Job 1. Job 2 simply copies the query results CSV file to a different key within the same S3 bucket.
The motivation for this step is twofold. First, we cannot dictate the name of an Athena query results CSV file; it is automatically set to the Athena query ID. Second, when adding an S3 object as an asset to an AWS Data Exchange revision, the asset’s name is automatically set to the S3 object’s key. So to dictate how the CSV file name appears in AWS Data Exchange, we must first rename it, which we accomplish by copying it to a specified S3 key.
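Expressed as a minimal sketch (bucket, prefix, and file names are placeholders), Job 2's handler amounts to a single copy_object call:

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # The S3 event notification from Job 1 tells us where Athena wrote the results.
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    results_key = event["Records"][0]["s3"]["object"]["key"]  # e.g. "<query-id>.csv"

    # Copy the results to a human-readable key; the asset name in AWS Data
    # Exchange is taken from this new key. The prefix and file name below are
    # placeholders.
    friendly_key = "publish/binance_trades/2020-03-01.csv"
    s3.copy_object(
        Bucket=bucket,
        Key=friendly_key,
        CopySource={"Bucket": bucket, "Key": results_key},
    )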
Job 3
Job 3 handles all work related to AWS Data Exchange and AWS Marketplace Catalog via their respective APIs. We use boto3, AWS’s Python SDK, to interface with these APIs. The AWS Marketplace Catalog API is necessary for adding data set revisions to products that have already been published. For more information, see Tutorial: Adding New Data Set Revisions to a Published Data Product.
Our code explicitly defines mappings with the following structure:
data source / DataSet / Product
The following code sketches how we configure relationships between data sources, data sets, and products.
Our data sources are typically represented by a trading venue and data type combination (such as Binance trades or CoinbasePro order books). Each new file for a given data source is delivered as a single asset within a single new revision for a particular data set.
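A simplified, hypothetical version of those mappings, shaped around the attributes the wrapper class below relies on (all IDs, ARNs, and names are placeholders), might look like this:

from collections import namedtuple

# Hypothetical stand-ins for our internal configuration objects.
Product = namedtuple("Product", ["id"])
DataSet = namedtuple("DataSet", ["id", "arn", "products"])

class S3DataSource:
    def __init__(self, name, target_data_sets):
        self.name = name
        self.lambda_s3_trigger_target_data_sets = target_data_sets

    def validate_object_name(self, obj_name):
        # Placeholder check: real validation enforces a naming convention.
        assert obj_name.endswith(".csv"), f"unexpected object name: {obj_name}"

# Keyed by the (bucket, prefix) of each S3 trigger; all values are placeholders.
LambdaS3TriggerMappings = {
    ("example-publish-bucket", "publish/binance_trades"): S3DataSource(
        name="binance_trades",
        target_data_sets=[
            DataSet(
                id="example-data-set-id",
                arn="arn:aws:dataexchange:us-east-1:123456789012:data-sets/example-data-set-id",
                products=[Product(id="example-product-id")],
            )
        ],
    ),
}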
An S3 trigger kicks off the Lambda function. The trigger is scoped to a specified prefix that maps to a single data set. The function alias feature of AWS Lambda allows us to define the unique S3 triggers for each data set while reusing the same underlying Lambda function. Job 3 carries out the following steps (note that steps 1 through 5 refer to the AWS Data Exchange API while steps 6 and 7 refer to the AWS Marketplace Catalog API):
Submits a request to create a new revision for the corresponding data set via CreateRevision.
Adds the file that was responsible for triggering the Lambda function to the newly created revision via CreateJob using the IMPORT_ASSETS_FROM_S3 job type. To submit this job, we need to supply a few values: the S3 bucket and key values for the file are pulled from the Lambda event message, while the RevisionID argument comes from the response to the CreateRevision call in the previous step.
Kicks off the job with StartJob, sourcing the JobID argument from the response to the CreateJob call in the previous step.
Polls the job’s status via GetJob (using the job ID from the response to the StartJob call in the previous step) to check that our file (the asset) was successfully added to the revision.
Finalizes the revision via UpdateRevision.
Requests a description of the marketplace entity using DescribeEntity, passing in the product ID stored in our hardcoded mappings as the EntityId.
Kicks off a change set via StartChangeSet, passing in the entity ID and entity type from the DescribeEntity response in the previous step, the revision ARN parsed from the response to our earlier CreateRevision call as RevisionArn, and the data set ARN as DataSetArn, which we fetch at the start of the code's runtime using the AWS Data Exchange API's GetDataSet.
Here’s a thin wrapper class we wrote to carry out the steps detailed above:
from time import sleep
import logging
import json

import boto3

from config import (
    DATA_EXCHANGE_REGION,
    MARKETPLACE_CATALOG_REGION,
    LambdaS3TriggerMappings
)

logger = logging.getLogger()


class CustomDataExchangeClient:
    def __init__(self):
        self._de_client = boto3.client('dataexchange', region_name=DATA_EXCHANGE_REGION)
        self._mc_client = boto3.client('marketplace-catalog', region_name=MARKETPLACE_CATALOG_REGION)

    def _get_s3_data_source(self, bucket, prefix):
        return LambdaS3TriggerMappings[(bucket, prefix)]

    # Job State can be one of: WAITING | IN_PROGRESS | ERROR | COMPLETED | CANCELLED | TIMED_OUT
    def _wait_for_de_job_completion(self, job_id):
        while True:
            get_job_resp = self._de_client.get_job(JobId=job_id)
            if get_job_resp['State'] == 'COMPLETED':
                logger.info(f"Job '{job_id}' succeeded:\n\t{get_job_resp}")
                break
            elif get_job_resp['State'] in ('ERROR', 'CANCELLED'):
                raise Exception(f"Job '{job_id}' failed:\n\t{get_job_resp}")
            else:
                sleep(5)
                logger.info(f"Still waiting on job {job_id}...")
        return get_job_resp

    # ChangeSet Status can be one of: PREPARING | APPLYING | SUCCEEDED | CANCELLED | FAILED
    def _wait_for_mc_change_set_completion(self, change_set_id):
        while True:
            describe_change_set_resp = self._mc_client.describe_change_set(
                Catalog='AWSMarketplace',
                ChangeSetId=change_set_id
            )
            if describe_change_set_resp['Status'] == 'SUCCEEDED':
                logger.info(
                    f"ChangeSet '{change_set_id}' succeeded:\n\t{describe_change_set_resp}"
                )
                break
            elif describe_change_set_resp['Status'] in ('FAILED', 'CANCELLED'):
                raise Exception(
                    f"ChangeSet '{change_set_id}' failed:\n\t{describe_change_set_resp}"
                )
            else:
                sleep(1)
                logger.info(f"Still waiting on ChangeSet {change_set_id}...")
        return describe_change_set_resp

    def process_s3_event(self, s3_event):
        source_bucket = s3_event['Records'][0]['s3']['bucket']['name']
        source_key = s3_event['Records'][0]['s3']['object']['key']
        source_prefix = '/'.join(source_key.split('/')[0:-1])
        s3_data_source = self._get_s3_data_source(source_bucket, source_prefix)

        obj_name = source_key.split('/')[-1]
        s3_data_source.validate_object_name(obj_name)

        for data_set in s3_data_source.lambda_s3_trigger_target_data_sets:
            # Create revision
            create_revision_resp = self._de_client.create_revision(
                DataSetId=data_set.id,
                Comment=obj_name
            )
            logger.debug(create_revision_resp)
            revision_id = create_revision_resp['Id']
            revision_arn = create_revision_resp['Arn']

            # Create job
            create_job_resp = self._de_client.create_job(
                Type='IMPORT_ASSETS_FROM_S3',
                Details={
                    'ImportAssetsFromS3': {
                        'AssetSources': [
                            {
                                'Bucket': source_bucket,
                                'Key': source_key
                            },
                        ],
                        'DataSetId': data_set.id,
                        'RevisionId': revision_id
                    }
                }
            )
            logger.debug(create_job_resp)

            # Start job
            job_id = create_job_resp['Id']
            start_job_resp = self._de_client.start_job(JobId=job_id)
            logger.debug(start_job_resp)

            # Wait for Data Exchange job completion
            get_job_resp = self._wait_for_de_job_completion(job_id)
            logger.debug(get_job_resp)

            # Finalize revision
            update_revision_resp = self._de_client.update_revision(
                DataSetId=data_set.id,
                RevisionId=revision_id,
                Finalized=True
            )
            logger.debug(update_revision_resp)

            # Ensure revision finalization succeeded
            finalized_status = update_revision_resp['Finalized']
            if finalized_status is not True:
                raise Exception(f"Failed to finalize revision:\n{update_revision_resp}")

            # Publish the new revision to each product associated with the data set
            for product in data_set.products:
                # Describe the AWS Marketplace entity corresponding to the Data Exchange product
                describe_entity_resp = self._mc_client.describe_entity(
                    Catalog='AWSMarketplace',
                    EntityId=product.id
                )
                logger.debug(describe_entity_resp)

                entity_type = describe_entity_resp['EntityType']
                entity_id = describe_entity_resp['EntityIdentifier']

                # Isolate the target data set in the DescribeEntity response
                describe_entity_resp_data_sets = json.loads(describe_entity_resp['Details'])['DataSets']
                describe_entity_resp_data_set = list(
                    filter(lambda ds: ds['DataSetArn'] == data_set.arn, describe_entity_resp_data_sets)
                )
                # We should get the data set of interest in describe_entity_resp and only that data set
                assert len(describe_entity_resp_data_set) == 1

                # Start a ChangeSet to add the newly finalized revision to an existing product
                start_change_set_resp = self._mc_client.start_change_set(
                    Catalog='AWSMarketplace',
                    ChangeSet=[
                        {
                            "ChangeType": "AddRevisions",
                            "Entity": {
                                "Identifier": entity_id,
                                "Type": entity_type
                            },
                            "Details": json.dumps({
                                "DataSetArn": data_set.arn,
                                "RevisionArns": [revision_arn]
                            })
                        }
                    ]
                )
                logger.debug(start_change_set_resp)

                # Wait for the ChangeSet workflow to complete
                change_set_id = start_change_set_resp['ChangeSetId']
                describe_change_set_resp = self._wait_for_mc_change_set_completion(change_set_id)
                logger.debug(describe_change_set_resp)
The following screenshot shows the S3 trigger for Job 3.
The following screenshot shows an example of CloudWatch logs for Job 3.
The following screenshot shows a CloudWatch alarm for Job 3.
Finally, we can verify that our revisions were successfully added to their corresponding data sets and products through the AWS console.
AWS Data Exchange allows you to create private offers for your AWS account IDs, providing a convenient means of checking that revisions show up in each product as expected.
Conclusion
This post demonstrated how you can integrate AWS Data Exchange into an existing data pipeline frictionlessly. We’re pleased to have been invited to participate in the AWS Data Exchange private preview, and even more pleased with the service itself, which has proven to be a sophisticated yet natural extension of our system.
I want to offer special thanks to both Kyle Patsen and Rafic Melhem of the AWS Data Exchange team for generously fielding my questions (and patiently enduring my ramblings) for the better part of the past year. I also want to thank Lucas Adams for helping me design the system discussed in this post and, more importantly, for his unwavering vote of confidence.
If you are interested in learning more about FPG, don’t hesitate to contact us.
In this blog post, I’ll show you how to integrate Prowler, an open-source security tool, with AWS Security Hub. Prowler provides dozens of security configuration checks related to services such as Amazon Redshift, Amazon ElastiCache, Amazon API Gateway, and Amazon CloudFront. Integrating Prowler with Security Hub will provide posture information about resources not currently covered by existing Security Hub integrations or compliance standards. You can use Prowler checks to supplement the existing CIS AWS Foundations compliance standard Security Hub already provides, as well as other compliance-related findings you may be ingesting from partner solutions.
In this post, I’ll show you how to containerize Prowler using Docker and host it on the serverless container service AWS Fargate. By running Prowler on Fargate, you no longer have to provision, configure, or scale infrastructure, and it will only run when needed. Containers provide a standard way to package your application’s code, configurations, and dependencies into a single object that can run anywhere. Serverless applications automatically run and scale in response to events you define, rather than requiring you to provision, scale, and manage servers.
Solution overview
The following diagram shows the flow of events in the solution I describe in this blog post.
The Lambda function maps Prowler findings into the AWS Security Finding Format (ASFF) before importing them to Security Hub.
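The Lambda function is created for you by the CloudFormation template described below, so its code isn't reproduced here. As a rough illustration of the mapping step, a sketch that reads DynamoDB stream records and imports them into Security Hub might look like the following; the field names mirror the loader script shown later, while the account ID, severity, and finding type values are assumptions:

import datetime
import boto3

securityhub = boto3.client("securityhub")

# Placeholders: the account ID and Region are normally derived at runtime.
ACCOUNT_ID = "123456789012"
REGION = "us-east-1"

def handler(event, context):
    findings = []
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue
        item = record["dynamodb"]["NewImage"]
        now = datetime.datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%SZ")
        findings.append({
            "SchemaVersion": "2018-10-08",
            "Id": item["TITLE_ID"]["S"],
            "ProductArn": f"arn:aws:securityhub:{REGION}:{ACCOUNT_ID}:product/{ACCOUNT_ID}/default",
            "GeneratorId": f"prowler-{item['TITLE_ID']['S']}",
            "AwsAccountId": ACCOUNT_ID,
            "Types": ["Software and Configuration Checks"],  # assumption
            "CreatedAt": now,
            "UpdatedAt": now,
            "Severity": {"Label": "MEDIUM"},  # assumption
            "Title": item["TITLE_TEXT"]["S"],
            "Description": item["NOTES"]["S"],
            "Resources": [{"Type": "AwsAccount", "Id": f"AWS::::Account:{ACCOUNT_ID}"}],
        })
    if findings:
        securityhub.batch_import_findings(Findings=findings)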
Except for an ECR repository, you’ll deploy all of the above via AWS CloudFormation. You’ll also need the following prerequisites to supply as parameters for the CloudFormation template.
Prerequisites
A VPC with at least two subnets that have access to the internet, plus a security group that allows access on port 443 (HTTPS).
An ECS task role with the permissions that Prowler needs to complete its scans. You can find more information about these permissions on the official Prowler GitHub page.
An ECS task execution IAM role to allow Fargate to publish logs to CloudWatch and to download your Prowler image from Amazon ECR.
Step 1: Create an Amazon ECR repository
In this step, you’ll create an ECR repository. This is where you’ll upload your Docker image for Step 2.
Navigate to the Amazon ECR Console and select Create repository.
Enter a name for your repository (I’ve named my example securityhub-prowler, as shown in figure 2), then choose Mutable as your image tag mutability setting, and select Create repository.
Figure 2: ECR Repository Creation
Keep the browser tab in which you created the repository open so that you can easily reference the Docker commands you’ll need in the next step.
Step 2: Build and push the Docker image
In this step, you’ll create a Docker image that contains scripts that will map Prowler findings into DynamoDB. Before you begin step 2, ensure your workstation has the necessary permissions to push images to ECR.
Create a Dockerfile via your favorite text editor, and name it Dockerfile.
FROM python:latest
# Declare Env Vars
ENV MY_DYANMODB_TABLE=MY_DYANMODB_TABLE
ENV AWS_REGION=AWS_REGION
# Install Dependencies
RUN \
apt update && \
apt upgrade -y && \
pip install awscli && \
apt install -y python3-pip
# Place scripts
ADD converter.py /root
ADD loader.py /root
ADD script.sh /root
# Installs prowler, moves scripts into prowler directory
RUN \
git clone https://github.com/toniblyx/prowler && \
mv root/converter.py /prowler && \
mv root/loader.py /prowler && \
mv root/script.sh /prowler
# Runs prowler, ETLs output with converter and loads DynamoDB with loader
WORKDIR /prowler
RUN pip3 install boto3
CMD bash script.sh
Create a new file called script.sh and paste in the below code. This script will call the remaining scripts, which you’re about to create in a specific order.
Note: Change the AWS Region in the Prowler command on line 3 to the region in which you’ve enabled Security Hub.
#!/bin/bash
echo "Running Prowler Scans"
./prowler -b -n -f us-east-1 -g extras -M csv > prowler_report.csv
echo "Converting Prowler Report from CSV to JSON"
python converter.py
echo "Loading JSON data into DynamoDB"
python loader.py
Create a new file called converter.py and paste in the below code. This Python script will convert the Prowler CSV report into JSON, and both versions will be written to the local storage of the Prowler container.
import csv
import json

# Set variables for within container
CSV_PATH = 'prowler_report.csv'
JSON_PATH = 'prowler_report.json'

# Reads prowler CSV output
csv_file = csv.DictReader(open(CSV_PATH, 'r'))

# Create empty JSON list, read out rows from CSV into it
json_list = []
for row in csv_file:
    json_list.append(row)

# Write the list out as a JSON file on the container's local storage
open(JSON_PATH, 'w').write(json.dumps(json_list))

# open newly converted prowler output
with open('prowler_report.json') as f:
    data = json.load(f)

# remove data not needed for Security Hub BatchImportFindings
for element in data:
    del element['PROFILE']
    del element['SCORED']
    del element['LEVEL']
    del element['ACCOUNT_NUM']
    del element['REGION']

# writes out to a new file, prettified
with open('format_prowler_report.json', 'w') as f:
    json.dump(data, f, indent=2)
Create your last file, called loader.py, and paste in the below code. This Python script will read values from the JSON file and send them to DynamoDB.
from __future__ import print_function  # Python 2/3 compatibility
import boto3
import json
import decimal
import os

awsRegion = os.environ['AWS_REGION']
prowlerDynamoDBTable = os.environ['MY_DYANMODB_TABLE']

dynamodb = boto3.resource('dynamodb', region_name=awsRegion)
table = dynamodb.Table(prowlerDynamoDBTable)

# CHANGE FILE AS NEEDED
with open('format_prowler_report.json') as json_file:
    findings = json.load(json_file, parse_float=decimal.Decimal)
    for finding in findings:
        TITLE_ID = finding['TITLE_ID']
        TITLE_TEXT = finding['TITLE_TEXT']
        RESULT = finding['RESULT']
        NOTES = finding['NOTES']

        print("Adding finding:", TITLE_ID, TITLE_TEXT)

        table.put_item(
            Item={
                'TITLE_ID': TITLE_ID,
                'TITLE_TEXT': TITLE_TEXT,
                'RESULT': RESULT,
                'NOTES': NOTES,
            }
        )
From the ECR console, within your repository, select View push commands to get operating system-specific instructions and additional resources to build, tag, and push your image to ECR. See Figure 3 for an example.
Figure 3: ECR Push Commands
Note: If you’ve built Docker images previously within your workstation, pass the --no-cache flag with your docker build command.
After you’ve built and pushed your Image, note the URI within the ECR console (such as 12345678910.dkr.ecr.us-east-1.amazonaws.com/my-repo), as you’ll need this for a CloudFormation parameter in step 3.
Step 3: Deploy the CloudFormation template
You’ll need the values you noted in Step 2 and during the “Solution overview” prerequisites. The description of each parameter is provided on the Parameters page of the CloudFormation deployment (see Figure 4).
Figure 4: CloudFormation Parameters
After the CloudFormation stack finishes deploying, click the Resources tab to find your Task Definition (called ProwlerECSTaskDefinition). You’ll need this during Step 4.
Figure 5: CloudFormation Resources
Step 4: Manually run ECS task
In this step, you’ll run your ECS Task manually to verify the integration works. (Once you’ve tested it, this step will be automatic based on CloudWatch events.)
Navigate to the Amazon ECS console and from the navigation pane select Task Definitions.
As shown in Figure 6, select the check box for the task definition you deployed via CloudFormation, then select the Actions dropdown menu and choose Run Task.
Figure 6: ECS Run Task
Configure the following settings (shown in Figure 7), then select Run Task:
Launch Type: FARGATE
Platform Version: Latest
Cluster: Select the cluster deployed by CloudFormation
Number of tasks: 1
Cluster VPC: Enter the VPC of the subnets you provided as CloudFormation parameters
Subnets: Select 1 or more subnets in the VPC
Security groups: Enter the same security group you provided as a CloudFormation parameter
Auto-assign public IP: ENABLED
Figure 7: ECS Task Settings
Depending on the size of your account and the resources within it, your task can take up to an hour to complete. To follow the progress, select your task and view the Logs tab within the Task view (Figure 8). The stdout from Prowler will appear in the logs.
Note: Once the task has completed it will automatically delete itself. You do not need to take additional actions for this to happen during this or subsequent runs.
Figure 8: ECS Task Logs
Under the Details tab, monitor the status. When the status reads Stopped, navigate to the DynamoDB console.
Select your table, then select the Items tab. Your findings will be indexed under the primary key NOTES, as shown in Figure 9. From here, the Lambda function will trigger each time new items are written into the table from Fargate and will load them into Security Hub.
Figure 9: DynamoDB Items
Finally, navigate to the Security Hub console, select the Findings menu, and wait for findings from Prowler to arrive in the dashboard as shown in figure 10.
Figure 10: Prowler Findings in Security Hub
If you run into errors when running your Fargate task, refer to the Amazon ECS Troubleshooting guide. Log errors commonly come from missing permissions or disabled Regions; refer back to the Prowler GitHub repository for troubleshooting information.
Conclusion
In this post, I showed you how to containerize Prowler, run it manually, create a schedule with CloudWatch Events, and use custom Python scripts along with DynamoDB streams and Lambda functions to load Prowler findings into Security Hub. By using Security Hub, you can centralize and aggregate security configuration information from Prowler alongside findings from AWS and partner services.
From Security Hub, you can use custom actions to send one or a group of findings from Prowler to downstream services such as ticketing systems or to take custom remediation actions. You can also use Security Hub custom insights to create saved searches from your Prowler findings. Lastly, you can use Security Hub in a master-member format to aggregate findings across multiple accounts for centralized reporting.
If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the AWS Security Hub forum.
Want more AWS Security news? Follow us on Twitter.
This post is contributed by Mahmoud ElZayet | Specialist SA – Dev Tech, AWS
Modern application development processes enable organizations to improve speed and quality continually. In this innovative culture, small, autonomous teams own the entire application life cycle. While such nimble, autonomous teams speed product delivery, they can also impose costs on compliance, quality assurance, and code deployment infrastructures.
Standardized tooling and application release code help share best practices across teams, reduce duplicated code, speed onboarding, create consistent governance, and prevent resource over-provisioning.
Overview
In this post, I show you how to use AWS Service Catalog to provide standardized and automated deployment blueprints. This helps accelerate and improve your product teams’ application release workflows on Amazon ECS. Follow my instructions to create a sample blueprint that your product teams can use to release containerized applications on ECS. You can also apply the blueprint concept to other technologies, such as serverless or Amazon EC2–based deployments.
The sample templates and scripts provided here are for demonstration purposes and should not be used “as-is” in your production environment. After you become familiar with these resources, create customized versions for your production environment, taking account of in-house tools and team skills, as well as all applicable standards and restrictions.
Prerequisites
To use this solution, you need the following resources:
Example Corp. has various product teams that develop applications and services on AWS. Example Corp. teams have expressed interest in deploying their containerized applications managed by AWS Fargate on ECS. As part of Example Corp’s central tooling team, you want to enable teams to quickly release their applications on Fargate. However, you also make sure that they comply with all best practices and governance requirements.
For convenience, I also assume that you have supplied product teams working on the same domain, application, or project with a shared AWS account for service deployment. Using this account, they all deploy to the same ECS cluster.
In this scenario, you can author and provide these teams with a shared deployment blueprint on ECS Fargate. Using AWS Service Catalog, you can share the blueprint with teams as follows:
Every time that a product team wants to release a new containerized application on ECS, they retrieve a new AWS Service Catalog ECS blueprint product. This enables them to obtain the required infrastructure, permissions, and tools. As a prerequisite, the ECS blueprint requires building blocks such as a git repository or an AWS CodeBuild project. Again, you can acquire those blocks through another AWS Service Catalog product.
The product team completes the ECS blueprint’s required parameters, such as the desired number of ECS tasks and application name. As an administrator, you can constrain the value of some parameters such as the VPC and the cluster name. For more information, see AWS Service Catalog Template Constraints.
The ECS blueprint product deploys all the required ECS resources, configured according to best practices. You can also use the AWS Cloud Development Kit (CDK) to maintain and provision pre-defined constructs for your infrastructure.
A standardized CI/CD pipeline is also generated, enabling your product teams to publish their application to ECS automatically. Ideally, this pipeline should have all stages, practices, security checks, and standards required for application release. Product teams must still author application code, create a Dockerfile, build specifications, run automated tests and deployment scripts, and complete other tasks required for application release.
The ECS blueprint can be continually updated based on organization-wide feedback and to support new use cases. Your product team can always access the latest version through AWS Service Catalog. I recommend retaining multiple, customizable blueprints for various technologies.
For simplicity’s sake, my explanation envisions your environment as consisting of one AWS account. In practice, you can use IAM controls to segregate teams’ access to each other’s resources, even when they share an account. However, I recommend having at least two AWS accounts, one for testing and one for production purposes.
To see an example framework that helps deploy your AWS Service Catalog products to multiple accounts, see AWS Deployment Framework (ADF). This framework can also help you create cross-account pipelines that cater to different product teams’ needs, even when these teams deploy to the same technology stack.
To set up shared deployment blueprints for your production teams, follow the steps outlined in the following sections.
Set up the environment
In this section, I explain how to create a central ECS cluster in the appropriate VPC where teams can deploy their containers. I provide an AWS CloudFormation template to help you set up these resources. This template also creates an IAM role to be used by AWS Service Catalog later.
To run the CloudFormation template:
1. Use a git client to clone the following GitHub repository to a local directory. This will be the directory where you will run all the subsequent AWS CLI commands.
2. Using the AWS CLI, run the following commands. Replace <Application_Name> with a lowercase string with no spaces representing the application or microservice that your product team plans to release—for example, myapp.
4. In case of error, use the describe-stack-events CLI command or review error details on the console.
5. When the stack creation reads CREATE_COMPLETE, run the following command, and make a note of the output values in an editor of your choice. You need this information for a later step:
6. Run the following commands to copy those CloudFormation templates to Amazon S3. Replace <Template_Bucket_Name> with the template bucket output value you just copied into your editor of choice:
Create the AWS Service Catalog products
In this section, I show you how to create two AWS Service Catalog products for teams to use in publishing their containerized app:
Core Build Tools
ECS Fargate Deployment Blueprint
To create an AWS Service Catalog portfolio that includes these products:
1. Using the AWS CLI, run the following command, replacing <Application_Name> with the application name you defined earlier and replacing <Template_Bucket_Name> with the template bucket output value you copied into your editor of choice:
3. In case of error, use the describe-stack-events CLI command or check error details in the console.
Your AWS Service Catalog configuration should now be ready.
Test product teams experience
In this section, I show you how to use IAM roles to impersonate a product team member and simulate their first experience of containerized application deployment.
Assume team role
To assume the role that you created during the environment setup step:
1. In the Management console, follow the instructions in Switching a Role.
For Account, enter the account ID used in the sample solution. To learn more about how to find an AWS account ID, see Your AWS Account ID and Its Alias.
For Role, enter <Application_Name>-product-team-role, where <Application_Name> is the same application name you defined in the environment setup section.
(Optional) For Display name, enter a custom session value.
You are now logged in as a member of the product team.
Provision core build product
Next, provision the core build tools for your blueprint:
In the Service Catalog console, you should now see the two products created earlier listed under Products.
Select the first product, Core Build Tools.
Choose LAUNCH PRODUCT.
Name the product something such as <Application_Name>-build-tools, replacing <Application_Name> with the name previously defined for your application.
Provide the same application name you defined previously.
Leave the ContainerBuild parameter default setting as yes, as you are building a container requiring a container repository and its associated permissions.
Choose NEXT three times, then choose LAUNCH.
Under Events, watch the Status property. Keep refreshing until the status reads Succeeded. In case of failure, choose the URL value next to the key CloudformationStackARN. This choice takes you to the CloudFormation console, where you can find more information on the errors.
Now you have the following build tools created along with the required permissions:
AWS CodeCommit repository to store your code
CodeBuild project to build your container image and test your application code
Amazon ECR repository to store your container images
Amazon S3 bucket to store your build and release artifacts
Provision ECS Fargate deployment blueprint
In the Service Catalog console, follow the same steps to deploy the blueprint for ECS deployment. Here are the product provisioning details:
For the parameters Subnet1, Subnet2, VpcId, enter the output values you copied earlier into your editor of choice in the Setup Environment section.
For other parameters, enter the following:
ApplicationName: The same application name you defined previously.
ClusterName: Enter the value example-corp-ecs-cluster, which is the name chosen in the template for the central cluster.
Leave the DesiredCount and LaunchType parameters to their default values.
After the blueprint product creation completes, you should have an ECS service with a sample task definition for your product team. The build tools created earlier include the permissions required for deploying to the ECS service. Also, a CI/CD pipeline has been created to guide your product teams as they publish their application to the ECS service. Ideally, this pipeline should have all stages, practices, security checks, and standards required for application release.
Product teams still have to author application code, create a Dockerfile, build specifications, run automated tests and deployment scripts, and perform other tasks required for application release. The blueprint product can provide wiki links to reference examples for these steps, or access to pre-provisioned sample pipelines.
Test your pipeline
Now, upload a sample app to test your pipeline:
Log in with the product team role.
In the CodeCommit console, select the repository with the application name that you defined in the environment setup section.
Scroll down, choose Add file, Create file.
Paste the following in the page editor, which is a script to build the container image and push it to the ECR repository:
6. For Author name and Email address, enter your name and your preferred email address for the commit. Although optional, the addition of a commit message is a good practice.
7. Choose Commit changes.
8. Repeat the same steps for the Dockerfile. The sample Dockerfile creates a straightforward PHP application. Typically, you add your application content to that image.
File name: Dockerfile
File content:
FROM ubuntu:12.04
# Install dependencies
RUN apt-get update -y
RUN apt-get install -y git curl apache2 php5 libapache2-mod-php5 php5-mcrypt php5-mysql
# Configure apache
RUN a2enmod rewrite
RUN chown -R www-data:www-data /var/www
ENV APACHE_RUN_USER www-data
ENV APACHE_RUN_GROUP www-data
ENV APACHE_LOG_DIR /var/log/apache2
EXPOSE 80
CMD ["/usr/sbin/apache2", "-D", "FOREGROUND"]
Your pipeline should now be ready to run successfully. Although you can list all current pipelines in the Region, you can only describe and modify pipelines that have a prefix matching your application name. To confirm:
In the AWS CodePipeline console, select the pipeline <Application_Name>-ecs-fargate-pipeline.
The pipeline should now be running.
Because you performed two commits to the repository from the console, you must wait for the second run to complete before successful deployment to ECS Fargate.
Clean up
To clean up the environment, run the following commands in the AWS CLI, replacing <Application_Name> with your application name, <Account_Id> with your AWS Account ID with no hyphens and <Template_Bucket_Name> with the template bucket output value you copied into your editor of choice:
Conclusion
In this post, I showed you how to design and build ECS Fargate deployment blueprints. I explained how these accelerate and standardize the release of containerized applications on AWS. Your product teams can keep getting the latest standards and coded best practices through those automated blueprints.
As always, AWS welcomes feedback. Please submit comments or questions below.
This post is contributed by Tony Pujals | Senior Developer Advocate, AWS
AWS recently increased the number of elastic network interfaces available when you run tasks on Amazon ECS, through an account setting called awsvpcTrunking. If you use the Amazon EC2 launch type and task networking (awsvpc network mode), you can now run more tasks on an instance (5 to 17 times as many) than you did before.
As more of you embrace microservices architectures, you deploy increasing numbers of smaller tasks. AWS now offers you the option of more efficient packing per instance, potentially resulting in smaller clusters and associated savings.
Overview
To manage your own cluster of EC2 instances, use the EC2 launch type. Use task networking to run ECS tasks using the same networking properties as if tasks were distinct EC2 instances.
Task networking offers several benefits. Every task launched with awsvpc network mode has its own attached network interface, a primary private IP address, and an internal DNS hostname. This simplifies container networking and gives you more control over how tasks communicate, both with each other and with other services within their virtual private clouds (VPCs).
Task networking also lets you take advantage of other EC2 networking features like VPC Flow Logs. This feature lets you monitor traffic to and from tasks. It also provides greater security control for containers, allowing you to use security groups and network monitoring tools at a more granular level within tasks. For more information, see Introducing Cloud Native Networking for Amazon ECS Containers.
However, if you run container tasks on EC2 instances with task networking, you can face a networking limit. This might surprise you, particularly when an instance has plenty of free CPU and memory. The limit reflects the number of network interfaces available to support awsvpc network mode per container instance.
Raise network interface density limits with trunking
The good news is that AWS raised network interface density limits by implementing a networking feature on ECS called “trunking.” This is a technique for multiplexing data over a shared communication link.
If you’re migrating to microservices using AWS App Mesh, you should optimize network interface density. App Mesh requires awsvpc networking to provide routing control and visibility over an ever-expanding array of running tasks. In this context, increased network interface density might save money.
By opting for network interface trunking, you should see a significant increase in capacity—from 5 to 17 times more than the previous limit. For more information on the new task limits per container instance, see Supported Amazon EC2 Instance Types.
Applications with tasks not hitting CPU or memory limits also benefit from this feature through the more cost-effective “bin packing” of container instances.
Trunking is an opt-in feature
AWS chose to make the trunking feature opt-in due to the following factors:
Instance registration: While normal instance registration is straightforward with trunking, this feature increases the number of asynchronous instance registration steps that can potentially fail. Any such failures might add extra seconds to launch time.
Available IP addresses: The “trunk” network interface belongs to the same subnet as the instance’s primary network interface and consumes an IP address of its own. With a trunk attached, each instance has two assigned IP addresses, one for the primary interface and one for the trunk. This effectively reduces the IP addresses available in the subnet and, potentially, the ability to scale out other EC2 instances sharing that subnet.
Differing customer preferences and infrastructure: If you have high CPU or memory workloads, you might not benefit from trunking. Or, you may not want awsvpc networking.
Consequently, AWS leaves it to you to decide if you want to use this feature. AWS might revisit this decision in the future, based on customer feedback. For now, your account roles or users must opt in to the awsvpcTrunking account setting to gain the benefits of increased task density per container instance.
Enable trunking
Enable the ECS elastic network interface trunking feature to increase the number of network interfaces that can be attached to supported EC2 container instance types. You must meet the following prerequisites before you can launch a container instance with the increased network interface limits:
Your account must have the AWSServiceRoleForECS service-linked role for ECS.
You must opt into the awsvpcTrunking account setting.
Make sure that a service-linked role exists for ECS
A service-linked role is a unique type of IAM role linked to an AWS service (such as ECS). This role lets you delegate the permissions necessary to call other AWS services on your behalf. Because ECS is a service that manages resources on your behalf, you need this role to proceed.
In most cases, you won’t have to create a service-linked role. If you created or updated an ECS cluster, ECS likely created the service-linked role for you.
You can confirm that your service-linked role exists using the AWS CLI, as shown in the following code example:
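The following call returns the role details if the service-linked role already exists (and an error if it does not):

aws iam get-role --role-name AWSServiceRoleForECS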
Opt in to the awsvpcTrunking account setting
Your account, IAM user, or role must opt in to the awsvpcTrunking account setting. You can select this setting using the AWS CLI or the ECS console. Opt in for an entire account by making awsvpcTrunking its default setting, or enable the setting for the role associated with the instance profile with which the instance launches. For instructions, see Account Settings.
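For example, the following AWS CLI commands are a minimal sketch of opting in; the account ID and role name are placeholders you must replace. The first command makes awsvpcTrunking the default for the account, and the second enables it for a specific role:

aws ecs put-account-setting-default --name awsvpcTrunking --value enabled
aws ecs put-account-setting --name awsvpcTrunking --value enabled --principal-arn arn:aws:iam::<account-id>:role/<role-name>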
Other considerations
After completing the prerequisites described in the preceding sections, launch a new container instance with increased network interface limits using one of the supported EC2 instance types.
Keep the following in mind:
It’s available with the latest variant of the ECS-optimized AMI.
It only affects creation of new container instances after opting into awsvpcTrunking.
It only affects tasks created with awsvpc network mode and EC2 launch type. Tasks created with the AWS Fargate launch type always have a dedicated network interface, no matter how many you launch.
If you seek to optimize the usage of your EC2 container instances for clusters that you manage, enable the increased network interface density feature with awsvpcTrunking. By following the steps outlined in this post, you can launch tasks using significantly fewer EC2 instances. This is especially useful if you embrace a microservices architecture, with its increasing numbers of lighter tasks.
Hopefully, you found this post informative and the proposed solution intriguing. As always, AWS welcomes your feedback and comments.
This post is contributed by Tony Pujals | Senior Developer Advocate, AWS
AWS App Mesh is a service mesh, which provides a framework to control and monitor services spanning multiple AWS compute environments. My previous post provided a walkthrough to get you started. In it, I showed deploying a simple microservice application to Amazon ECS and configuring App Mesh to provide traffic control and observability.
In this post, I show more advanced techniques using AWS Fargate as an ECS launch type. I show you how to deploy a specific version of the colorteller service from the previous post. Finally, I move on and explore distributing traffic across other environments, such as Amazon EC2 and Amazon EKS.
I simplified this example for clarity, but in the real world, creating a service mesh that bridges different compute environments becomes useful. Fargate is a compute service for AWS that helps you run containerized tasks using the primitives (the tasks and services) of an ECS application. This lets you work without needing to directly configure and manage EC2 instances.
Solution overview
This post assumes that you already have a containerized application running on ECS, but want to shift your workloads to use Fargate.
You deploy a new version of the colorteller service with Fargate, and then begin shifting traffic to it. If all goes well, you continue to shift more traffic to the new version until it serves 100% of all requests. Use the label “blue” to represent the original version and “green” to represent the new version. The following diagram shows the programmer model of the Color App.
You want to begin shifting traffic from version 1 (represented by colorteller-blue in the following diagram) over to version 2 (represented by colorteller-green).
In App Mesh, every version of a service is ultimately backed by actual running code somewhere, in this case ECS/Fargate tasks. Each service has its own virtual node representation in the mesh that provides this conduit.
The following diagram shows the App Mesh configuration of the Color App.
Before you can shift any traffic, the application must be physically deployed to a compute environment. In this demo, colorteller-blue runs on ECS using the EC2 launch type and colorteller-green runs on ECS using the Fargate launch type. The goal is to test with a portion of traffic going to colorteller-green, ultimately increasing to 100% of traffic going to the new green version.
The following diagram shows the AWS compute model of the Color App.
Prerequisites
Before following along, set up the resources and deploy the Color App as described in the previous walkthrough.
Deploy the Fargate app
To get started after you complete your Color App, configure it so that your traffic goes to colorteller-blue for now. The blue color represents version 1 of your colorteller service.
Log into the App Mesh console and navigate to Virtual routers for the mesh. Configure the HTTP route to send 100% of traffic to the colorteller-blue virtual node.
The following screenshot shows routes in the App Mesh console.
Test the service and confirm in AWS X-Ray that the traffic flows through the colorteller-blue as expected with no errors.
The following screenshot shows tracing of the colorgateway virtual node.
Deploy the new colorteller to Fargate
With your original app in place, deploy the second version on Fargate and begin slowly shifting traffic to it from the original. The app colorteller-green represents version 2 of the colorteller service. Initially, send only 30% of your traffic to it.
If your monitoring indicates a healthy service, then increase it to 60%, then finally to 100%. In the real world, you might choose more granular increases with automated rollout (and rollback if issues arise), but this demonstration keeps things simple.
You pushed the gateway and colorteller images to ECR (see Deploy Images) in the previous post, and then launched ECS tasks with these images. For this post, launch an ECS task using the Fargate launch type with the same colorteller and envoy images. This sets up the running envoy container as a sidecar for the colorteller container.
You don’t have to manually configure the EC2 instances in a Fargate launch type. Fargate automatically colocates the sidecar on the same physical instance and lifecycle as the primary application container.
To begin deploying the Fargate instance and diverting traffic to it, follow these steps.
Step 1: Update the mesh configuration
You can download updated AWS CloudFormation templates located in the repo under walkthroughs/fargate.
This updated mesh configuration adds a new virtual node (colorteller-green-vn). It updates the virtual router (colorteller-vr) for the colorteller virtual service so that it distributes traffic between the blue and green virtual nodes at a 2:1 ratio. That is, the green node receives one-third of the traffic.
$ ./appmesh-colorapp.sh
...
Waiting for changeset to be created..
Waiting for stack create/update to complete
...
Successfully created/updated stack - DEMO-appmesh-colorapp
$
Step 2: Deploy the green task to Fargate
The fargate-colorteller.sh script creates parameterized template definitions before deploying the fargate-colorteller.yaml CloudFormation template. The change to launch a colorteller task as a Fargate task is in fargate-colorteller-task-def.json.
$ ./fargate-colorteller.sh
...
Waiting for changeset to be created..
Waiting for stack create/update to complete
Successfully created/updated stack - DEMO-fargate-colorteller
$
Verify the Fargate deployment
The ColorApp endpoint is one of the CloudFormation template’s outputs. You can view it in the stack output in the AWS CloudFormation console, or fetch it with the AWS CLI:
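Here is a minimal sketch of doing that with the CLI. It assumes the stack is named DEMO-ecs-colorapp and exposes an output named ColorAppEndpoint (adjust both names to match the walkthrough stack you deployed), then polls the gateway’s /color route a few times to observe the ratio:

$ endpoint=$(aws cloudformation describe-stacks --stack-name DEMO-ecs-colorapp \
    --query "Stacks[0].Outputs[?OutputKey=='ColorAppEndpoint'].OutputValue" --output text)
$ for i in $(seq 1 9); do curl -s "$endpoint/color"; echo; done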
The responses should reflect the expected 2:1 ratio. Check everything on your AWS X-Ray console.
The following screenshot shows the X-Ray console map after the initial testing.
The results look good: 100% success, no errors.
You can now increase the rollout of the new (green) version of your service running on Fargate.
Using AWS CloudFormation to manage your stacks lets you keep your configuration under version control and simplifies the process of deploying resources. AWS CloudFormation also gives you the option to update the virtual route in appmesh-colorapp.yaml and deploy the updated mesh configuration by running appmesh-colorapp.sh.
For this post, use the App Mesh console to make the change. Choose Virtual routers for appmesh-mesh, and edit the colorteller-route. Update the HTTP route so colorteller-blue-vn handles 33.3% of the traffic and colorteller-green-vn now handles 66.7%.
If your results look good, double-check your result in the X-Ray console.
Finally, shift 100% of your traffic over to the new colorteller version. This time, instead of using the App Mesh console, modify the mesh configuration template and redeploy it:
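As a sketch, assuming the route weights live in appmesh-colorapp.yaml as in the walkthrough repo, set the colorteller-green-vn weighted target to 1 and the colorteller-blue-vn target to 0 in the colorteller route, then redeploy:

$ ./appmesh-colorapp.sh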
Again, repeat your verification process in both the CLI and X-Ray to confirm that the new version of your service is running successfully.
Conclusion
In this walkthrough, I showed you how to roll out an update from version 1 (blue) of the colorteller service to version 2 (green). I demonstrated that App Mesh supports a mesh spanning ECS services that you ran as EC2 tasks and as Fargate tasks.
In my next walkthrough, I will demonstrate that App Mesh handles even uncontainerized services launched directly on EC2 instances. It provides a uniform and powerful way to control and monitor your distributed microservice applications on AWS.
If you have any questions or feedback, feel free to comment below.
Contributed by Manuel Manzano Hoss, Cloud Support Engineer
I remember playing around with GPU workload examples in 2017 when the Deep Learning on AWS Batch post was published by my colleague Kiuk Chung. He provided an example of how to train a convolutional neural network (CNN), the LeNet architecture, to recognize handwritten digits from the MNIST dataset using Apache MXNet as the framework. Back then, to run such jobs with GPU capabilities, I had to do the following:
Identify the type of P2 EC2 instance that had the required number of GPUs for my job.
Check the number of vCPUs that it offered (even if I was not interested in using them).
Specify that number of vCPUs for my job.
All that, without any certainty that the required GPU would be available to my job once it was running. Back then, there was no GPU pinning. Other jobs running on the same EC2 instance were able to use that GPU, making the orchestration of my jobs a tricky task.
Fast forward two years. Today, AWS Batch announced integrated support for Amazon EC2 Accelerated Instances. It is now possible to specify the number of GPUs as a resource that AWS Batch considers when choosing the EC2 instance to run your job, along with vCPU and memory. That allows me to take advantage of the main benefits of using AWS Batch: the compute resource selection algorithm and the job scheduler. It also frees me from having to check which types of EC2 instances have enough GPUs.
Also, I can take advantage of the Amazon ECS GPU-optimized AMI maintained by AWS. It comes with the NVIDIA drivers and all the necessary software to run GPU-enabled jobs. When I allow the P2 or P3 instance types on my compute environment, AWS Batch launches my compute resources using the Amazon ECS GPU-optimized AMI automatically.
In other words, now I don’t worry about the GPU task list mentioned earlier. I can focus on deciding which framework and command to run on my GPU-accelerated workload. At the same time, I’m now sure that my jobs have access to the required performance, as physical GPUs are pinned to each job and not shared among them.
A GPU race against the past
As a kind of GPU-race exercise, I checked a similar example to the one from Kiuk’s post, to see how fast it could be to run a GPU-enabled job now. I used the AWS Management Console to demonstrate how simple the steps are.
In this case, I decided to use the deep neural network architecture called multilayer perceptron (MLP), not the LeNet CNN, to compare the validation accuracy between them.
To make the test even simpler and faster to implement, I thought I would use one of the recently announced AWS Deep Learning containers, which come pre-packed with different frameworks and ready-to-process data. I chose the container that comes with MXNet and Python 2.7, customized for Training and GPU. For more information about the Docker images available, see the AWS Deep Learning Containers documentation.
In the AWS Batch console, I created a managed compute environment with the default settings, allowing AWS Batch to create the required IAM roles on my behalf.
On the configuration of the compute resources, I selected the P2 and P3 families of instances, as those are the type of instance with GPU capabilities. You can select On-Demand Instances, but in this case I decided to use Spot Instances to take advantage of the discounts that this pricing model offers. I left the defaults for all other settings, selecting the AmazonEC2SpotFleetRole role that I created the first time that I used Spot Instances:
Finally, I also left the network settings as default. My compute environment selected the default VPC, three subnets, and a security group. They are enough to run my jobs and at the same time keep my environment safe by limiting connections from outside the VPC:
I created a job queue, GPU_JobQueue, attaching it to the compute environment that I just created:
Next, I registered the same job definition that I would have created following Kiuk’s post. I specified enough memory to run this test, one vCPU, and the AWS Deep Learning Docker image that I chose, in this case mxnet-training:1.4.0-gpu-py27-cu90-ubuntu16.04. The number of GPUs required was, in this case, one. To have access to run the script, the container must run as privileged, or using the root user.
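The same job definition can also be registered from the AWS CLI. The following is a hedged sketch; the job definition name, memory value, and registry placeholder are illustrative, and the GPU requirement is expressed through resourceRequirements:

aws batch register-job-definition \
    --job-definition-name mnist-mlp-gpu \
    --type container \
    --container-properties '{
        "image": "<registry>/mxnet-training:1.4.0-gpu-py27-cu90-ubuntu16.04",
        "vcpus": 1,
        "memory": 4096,
        "privileged": true,
        "resourceRequirements": [{"type": "GPU", "value": "1"}]
    }'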
Finally, I submitted the job. I first cloned the MXNet repository for the train_mnist.py Python script. Then I ran the script itself, with the parameter --gpus 0 to indicate that the assigned GPU should be used. The job inherits all the other parameters from the job definition:
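A sketch of the equivalent submit-job call follows; the job name is illustrative, and the clone URL and script path assume the standard layout of the MXNet repository:

aws batch submit-job \
    --job-name mnist-mlp-gpu-run \
    --job-queue GPU_JobQueue \
    --job-definition mnist-mlp-gpu \
    --container-overrides '{"command": ["sh", "-c",
        "git clone https://github.com/apache/incubator-mxnet.git && python incubator-mxnet/example/image-classification/train_mnist.py --network mlp --gpus 0"]}'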
That’s all, and my GPU-enabled job was running. It took me less than two minutes to go from zero to having the job submitted. This is the log of my job, from which I removed the iterations from epoch 1 to 18 to make it shorter:
As you can see, after AWS Batch launched the instance, the job took slightly more than two minutes to run. I spent roughly five minutes from start to finish. That was much faster than the time that I was previously spending just to configure the AMI. Using the AWS CLI, one of the AWS SDKs, or AWS CloudFormation, the same environment could be created even faster.
From a training point of view, I lost on the validation accuracy, as the results obtained using the LeNet CNN are higher than those from the MLP network. On the other hand, my job was faster, with a time cost of 1.6 seconds on average for each epoch. As the software stack evolves and increased hardware capabilities come along, these numbers keep improving, but that shouldn’t mean extra complexity. Using managed primitives like the one presented in this post enables a simpler implementation.
I encourage you to test this example and see for yourself how just a few clicks or commands lets you start running GPU jobs with AWS Batch. Then, it is just a matter of replacing the Docker image that I used for one with the framework of your choice, TensorFlow, Caffe, PyTorch, Keras, etc. Start to run your GPU-enabled machine learning, deep learning, computational fluid dynamics (CFD), seismic analysis, molecular modeling, genomics, or computational finance workloads. It’s faster and easier than ever.
If you decide to give it a try, have any doubt or just want to let me know what you think about this post, please write in the comments section!
Amazon Elastic Container Service (Amazon ECS) is a highly scalable, high-performance container orchestration service that allows you to easily run and scale containerized applications on AWS. This post covers how ECS runs containers in a cluster. Topics include why AWS built the task placement engine, the different strategies and constraints available to decide where and how containers are run, and things to consider when picking placement strategies.
If you are not familiar with the relationship between ECS and Amazon EC2 or its components, see the Building Blocks of Amazon ECS post.
When a task is launched in a cluster, a decision has to be made to choose which container instance should run that task. Conversely, when scaling down a service, a decision has to be made to choose the specific task to be terminated.
Task placement
By default, ECS uses the following placement strategies:
When you run tasks with the RunTask API action, tasks are placed randomly in a cluster.
When you launch and terminate tasks with the CreateService API action, the service scheduler spreads the tasks across the Availability Zones (and the instances within the zones) in a cluster.
Before December 2016, tasks could only be placed by their default placement strategies. To achieve custom task placement, you had to make the decision yourself, for example by writing your own scheduler and calling the StartTask API action. When you manually constrained the placement of your grouping of containers, you could only place based on CPU, memory, and ports. Additionally, while creating your own scheduler can be powerful, there’s a tradeoff with complexity.
AWS built the task placement engine, which removes the need for you to build, run, and manage your own scheduling and placement services. There are several new features that provide you with more control over how applications run across clusters through custom attributes.
You can think of this flow as a funnel with filters for your instances. Constraints must be obeyed. If an instance doesn’t fit, it isn’t used. Strategies are then used to sort the rest of the instances by preference to determine which are the “best.”
For every instantiation of your task, it runs through every step. Calling run-task with a count of n is effectively calling run-task n times (create-service also works the same way).
Example
Here’s how to use these placement features. In this example, you use the AWS CLI run-task command. For the last couple of filters, I show how to use them with placement flags, but you can just as easily include them in your task definition file instead. This can all be done in the console as well. Start with the cluster shown earlier:
In the first step, eliminate all the instances that don’t have the required resources based on what you defined either in the JSON task definition or what you provided overrides for to RunTask.
Not enough CPU? Not enough memory? A port is needed, but it is already in use on that instance? Then the instance is eliminated from the set of valid candidates.
aws ecs run-task --task-definition nouvelleApp
Placement constraints
In the second step, keep only the instances that satisfy the attribute or task group constraints. Yes, this means that you can indicate what instance to use for a task (for example, to make sure that CPU-intensive jobs are scheduled on the right type of instance, or in which Availability Zone).
You can also create any custom tags of your choosing. The green tasks on the green instances, the blue tasks on the blue instances! You can also use the Cluster Query Language to write expressions to check for multiple attributes. In the next section, I cover how to write and use the attributes and expressions.
In the third step, filter on the following supported task placement strategies:
random
binpack
spread
By default, tasks are randomly placed with RunTask or spread across Availability Zones with CreateService. Spread is typically used to achieve high availability by making sure that multiple copies of a task are scheduled across multiple instances based on attributes such as Availability Zones.
Conversely, binpack places tasks together to be as cost-efficient as possible. Later in this post, you’ll see how these placement strategies work, as well as how to chain them together and why you may want to do so.
This isn’t part of the filter, but instead, the count flag is used to indicate how many copies (n) of a given task to run. Effectively, it tells ECS to re-run this workflow n times. By default, the count is set to 1, so run-task is executed one time. For services, the desired-count flag is used.
--count 8
Attributes, task groups, and expressions
For task placement, you can use instance fields, such as attributes, as well as task groups. These can be used in expressions for task placement constraints, or instance fields can be used standalone for task placement strategies. Here’s a quick overview of attributes, task groups, and expressions before you go any further.
Instance: Fields
Because you are using these fields with respect to instances in task placement, the instance: prefix is optional and can be used in either of the following ways with a field name or an attribute.
instance:<field>
<field>
Field names
The currently supported field names are as follows:
ec2InstanceId
agentConnected
Attributes
There are also instance attributes, which are prefaced with attribute. Again, instance: is optional:
attribute:<attribute-name>
Built-in attributes
The following are some of the provided attributes:
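For example:

attribute:ecs.instance-type
attribute:ecs.availability-zone
attribute:ecs.ami-id
attribute:ecs.os-type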
Well, what if you don’t see an attribute that you want? This is where custom attributes come in handy! Want to differentiate between test and prod? What about blue versus green?
In addition to placing tasks based on attributes, you can use task groups. Every task is assigned a group ID that you can reference in placement. For both tasks and services, a default ID is given, or you can choose your own. Perhaps you want to run version 2 of a service but only on instances with version 1.
task:group
Expressions
Alright, so you have some attributes and task groups… now what? Well, AWS created the Cluster Query Language to make it easy to create expressions for task placement constraints. These attributes and task groups are used with the available comparison operators, which may look familiar if you’ve used Boolean operators before. Some of these operators can be written in multiple ways, such as “!” or “not”.
For instance, to create an expression using a single attribute to select only t2.micro instances, use the ecs.instance-type attribute and the string equality comparator as follows:
attribute:ecs.instance-type == t2.micro
For t2.micro and t2.nano instances, you have a few options. You could use the same syntax as earlier with the or comparator:
attribute:ecs.instance-type == t2.micro or attribute:ecs.instance-type == t2.nano
Another way is to use the in comparator with an argument list:
attribute:ecs.instance-type in [t2.micro, t2.nano]
To include all t2 instances, use a wildcard and the pattern match operator instead of listing out each one:
attribute:ecs.instance-type =~ t2.*
Task group comparisons work the same way. The following snippet selects any instance upon which the task group “database” is running:
task:group == database
To select only task groups that are not “database,” combine expressions:
not(task:group == database)
You can use these expressions to filter your instances:
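For example, the following sketch (assuming a cluster named default) lists only the container instances matching an expression:

aws ecs list-container-instances --cluster default --filter "attribute:ecs.instance-type == t2.micro"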
These expressions and attributes, respectively, are also used for task placement constraints and strategies, which I cover in the next few sections.
Constraints
Now look at placement constraints. When determining task placement, there may be certain EC2 instances to include or exclude from running containers. For example, you may want to place tasks only on GPU-enabled instance types.
Task placement constraints let you define where your containers should run across your cluster. ECS currently supports two types of placement constraints: distinctInstance and memberOf. By default, ECS spreads tasks across Availability Zones and instances.
The distinctInstance constraint makes it possible to ensure that every container is started on a unique instance in your cluster. The distinctInstance constraint never places multiple copies of a task on a single instance, even if you request more running tasks than available instances.
For example, if you decide to place five copies of a task, each placement filters out the instances that are already running a copy of that task.
The memberOf constraint describes a set of instances on which your tasks should run. You can use it with anything you can define as an attribute or a task group, and it takes an expression written in the Cluster Query Language.
For example, if you have a small application and just want it to run on t2.micro instances:
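A minimal sketch of that constraint, reusing the nouvelleApp task definition from earlier:

aws ecs run-task --task-definition nouvelleApp \
    --placement-constraints type="memberOf",expression="attribute:ecs.instance-type == t2.micro"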
You can create expressions using the Cluster Query Language to check for multiple attributes. Here’s how you can weed out all instances in the us-west-2c Availability Zone as well as instances that aren’t of type t2.nano or t2.micro:
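One way to express that:

attribute:ecs.availability-zone != us-west-2c and attribute:ecs.instance-type in [t2.nano, t2.micro]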
Strategies
Now look at placement strategies. Placement strategies are used to identify an instance that meets a specific strategy. ECS supports three task placement strategies:
random
binpack
spread
Random is how RunTask places tasks by default and is fairly straightforward (it doesn’t require further parameters). The two other strategies, binpack and spread, take opposite actions. Binpack places tasks on as few instances as possible, helping to optimize resource utilization, while spread places tasks evenly across your cluster to help maximize availability. By default, ECS uses spread with the ecs.availability-zone attribute to place tasks.
Random places tasks on instances at random. This still honors the other constraints that you specified, implicitly or explicitly. Specifically, it still makes sure that tasks are scheduled on instances with enough resources to run them.
The binpack strategy tries to fit your workloads in as few instances as possible. It gets its name from the bin packing problem where the goal is to fit objects of various sizes in the smallest number of bins. It is well suited to scenarios for minimizing the number of instances in your cluster, perhaps for cost savings, and lends itself well to automatic scaling for elastic workloads, to shut down instances that are not in use.
When you use the binpack strategy, you must also indicate if you are trying to make optimal use of your instances’ CPU or memory. This is done by passing an extra field parameter, which tells the task placement engine which parameter to use to evaluate how “full” your “bins” are. It then chooses the instance with the least available CPU or memory (depending on which you pick). If there are multiple instances with this CPU or memory remaining, it chooses randomly.
The spread strategy, contrary to the binpack strategy, tries to put your tasks on as many different instances as possible. It is typically used to achieve high availability and mitigate risks, by making sure that you don’t put all your task-eggs in the same instance-baskets. Spread across Availability Zones, therefore, is the default placement strategy used for services.
When using the spread strategy, you must also indicate a field parameter. It is used to indicate the “bins” that you are considering. The accepted values are instanceId (or its alias host) to balance tasks across all instances, or attribute key:value pairs such as attribute:ecs.availability-zone to balance tasks across zones. There are several AWS attributes that start with the “ecs” prefix, but you can be creative and create your own attributes.
Now that you’ve seen how to use task placement strategies, you can also chain multiple task placement strategies with their respective attributes together. You can have up to five strategy rules per service. Perhaps you want to spread tasks across Availability Zones and binpack:
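For example, the following sketch spreads eight copies of the task across Availability Zones and then binpacks on memory within each zone:

aws ecs run-task --task-definition nouvelleApp --count 8 \
    --placement-strategy type="spread",field="attribute:ecs.availability-zone" type="binpack",field="memory"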
Here are some use cases for task placement so you can see how they can be solved by combining attributes, expressions, constraints, and strategies.
Task creation
Mariya is fairly new to using containers and especially container orchestrators. She wants to try ECS and has a simple application that she first wants to get running on a single node. (Solution: Use the RunTask API.)
aws ecs run-task --task-definition nouvelleApp
Scaling
After trying this, Mariya wants to scale her application to run 10 containers across any available nodes in her cluster. (Solution: This means she needs to run a task using either random or spread placement strategies.)
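A sketch of the run-task form of that, spreading ten copies across hosts:

aws ecs run-task --task-definition nouvelleApp --count 10 --placement-strategy type="spread",field="host"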
Mariya then realizes that if she wants her tasks to automatically restart themselves if they fail, or if she wants more than 10 instantiations of her task running, she needs to create a service. (Solution: Create a service.)
Christopher wants to achieve high availability by distributing his tasks amongst all the instances in his cluster so he minimizes impact if any one host goes down. (Solution: To do this he uses spread placement over host name.)
Ming-ya wants to run a monitoring container on each instance in her cluster. To help her do this, she creates a service with a high desired count and a distinctInstance placement constraint. The ECS service scheduler ensures that each instance in the cluster runs this task (up to the desired count).
Alex wants to run a fleet of webservers. For performance reasons, they want each webserver to have local access to a caching process that was written by another team. They define the webserver as one task and the caching server as a second task. When they launch the webserver task, they use a placement constraint so that the tasks are only placed on instances that are already hosting the cache task. (Solution: Use placement constraints with a task group.)
Jake wants to achieve high availability, but he has a limited budget and needs to optimize all the resources he uses. (Solution: Take a balanced approach of spreading over Availability Zones and binpacking on memory within a zone.)
Aditya has a GPU workload that they want to run in containers on ECS. They need to ensure that only GPU-enabled instances are used for this workload. (Solution: Create a service with a memberOf placement constraint on the instance type, for example g2.*, or whatever other GPU-enabled instance types are in the cluster.)
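A hedged sketch of such a service; the cluster, service, and task definition names are placeholders:

aws ecs create-service --cluster default --service-name gpu-service \
    --task-definition gpuTask --desired-count 2 \
    --placement-constraints type="memberOf",expression="attribute:ecs.instance-type =~ g2.*"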
This post was contributed by Jason Umiker, AWS Solutions Architect.
Whether it’s helping facilitate a journey to microservices or deploying existing tools more easily and repeatably, many customers are moving toward containerized infrastructure and workflows. AWS provides many of the services and mechanisms to help you with that.
Amazon Elastic Container Service (ECS) helps schedule and orchestrate containers across a fleet of servers. It involves installing an agent on each container host that takes instructions from the ECS control plane and relays them to the local Docker daemon on each one. ECS makes this easy by providing an ECS-optimized Amazon Machine Image (AMI) that you can launch automatically through the ECS console or CLI, or use to launch container hosts yourself.
It is up to you to choose the appropriate instance types, sizes, and quantity for your cluster fleet. You should have the capacity to deploy and scale workloads as well as to spread them across enough failure domains for high availability. Features like Auto Scaling groups help with that.
Also, while AWS provides Amazon Linux and Windows AMIs pre-configured for ECS, you are responsible for ongoing maintenance of the OS, which includes patching and security. Items that require regular patching or updating in this model are the OS, Docker, the ECS agent, and of course the contents of the container images.
Two of the key ECS concepts are Tasks and Services. A task is one or more containers that are to be scheduled together by ECS. A service is like an Auto Scaling group for tasks. It defines the quantity of tasks to run across the cluster, where they should be running (for example, across multiple Availability Zones), automatically associates them with a load balancer, and horizontally scales based on metrics that you define, like CPU or memory utilization.
What is Fargate?
AWS Fargate is a new compute engine for Amazon ECS that runs containers without requiring you to deploy or manage the underlying Amazon EC2 instances. With Fargate, you specify an image to deploy and the amount of CPU and memory it requires. Fargate handles the updating and securing of the underlying Linux OS, Docker daemon, and ECS agent as well as all the infrastructure capacity management and scaling.
How to use Fargate?
Fargate is exposed as a launch type for ECS. It uses an ECS task and service definition that is similar to the traditional EC2 launch mode, with a few minor differences. It is easy to move tasks and services back and forth between launch types. The differences include:
Using the awsvpc network mode
Specifying the CPU and memory requirements for the task in the definition
The best way to learn how to use Fargate is to walk through the process and see it in action.
Walkthrough: Deploying a service with Fargate in the console
At the time of publication, Fargate for ECS is available in the N. Virginia, Ohio, Oregon, and Ireland AWS regions. This walkthrough works in any AWS region where Fargate is available.
If you’d prefer to use a CloudFormation template, this one covers Steps 1-4. After launching this template you can skip ahead to Explore Running Service after Step 4.
Step 1 – Create an ECS cluster
An ECS cluster is a logical construct for running groups of containers known as tasks. Clusters can also be used to segregate different environments or teams from each other. In the traditional EC2 launch mode, there are specific EC2 instances associated with and managed by each ECS cluster, but this is transparent to the customer with Fargate.
Open the ECS console and ensure that Fargate is available in the selected Region (for example, N. Virginia).
Choose Clusters, Create Cluster.
Choose Networking only, Next step.
For Cluster name, enter “Fargate”. If you don’t already have a VPC to use, select the Create VPC check box and accept the defaults as well. Choose Create.
Step 2 – Create a task definition, CloudWatch log group, and task execution role
A task is a collection of one or more containers that is the smallest deployable unit of your application. A task definition is a JSON document that serves as the blueprint for ECS to know how to deploy and run your tasks.
The console makes it easier to create this definition by exposing all the parameters graphically. In addition, the console creates two dependencies:
The Amazon CloudWatch log group to store the aggregated logs from the task
The task execution IAM role that gives Fargate the permissions to run the task
In the left navigation pane, choose Task Definitions, Create new task definition.
Under Select launch type compatibility, choose FARGATE, Next step.
For Task Definition Name, enter NGINX.
If you had an IAM role for your task, you would enter it in Task Role but you don’t need one for this example.
The Network Mode is automatically set to awsvpc for Fargate
Under Task size, for Task memory, choose 0.5 GB. For Task CPU, enter 0.25.
Choose Add container.
For Container name, enter NGINX.
For Image, put nginx:1.13.9-alpine.
For Port mappings, type 80 into Container port.
Choose Add, Create.
Step 3 – Create an Application Load Balancer
Sending incoming traffic through a load balancer is often a key piece of making an application both scalable and highly available. It can balance the traffic between multiple tasks, as well as ensure that traffic is only sent to healthy tasks. You can have the service manage the addition or removal of tasks from an Application Load Balancer as they come and go but that must be specified when the service is created. It’s a dependency that you create first.
Open the EC2 console.
In the left navigation pane, choose Load Balancers, Create Load Balancer.
Under Application Load Balancer, choose Create.
For Name, put NGINX.
Choose the appropriate VPC (10.0.0.0/16 if you let ECS create it for you).
For Availability Zones, select both and choose Next: Configure Security Settings.
Choose Next: Configure Security Groups.
For Assign a security group, choose Create a new security group. Choose Next: Configure Routing.
For Name, enter NGINX. For Target type, choose ip.
After the load balancer is created, select it and note its DNS name (this is the public address for the service).
Step 4 – Create an ECS service using Fargate
A service in ECS using Fargate serves a similar purpose to an Auto Scaling group in EC2. It ensures that the needed number of tasks are running both for scaling as well as spreading the tasks over multiple Availability Zones for high availability. A service creates and destroys tasks as part of its role and can optionally add or remove them from an Application Load Balancer as targets as it does so.
Open the ECS console and ensure that Fargate is available in the selected Region (for example, N. Virginia).
In the left navigation pane, choose Task Definitions.
Select the NGINX task definition that you created and choose Actions, Create Service.
For Launch Type, select Fargate.
For Service name, enter NGINX.
For Number of tasks, enter 1.
Choose Next step.
Under Subnets, choose both of the options.
For Load balancer type, choose Application Load Balancer. It should then default to the NGINX version that you created earlier.
Choose Add to load balancer.
For Target group name, choose NGINX.
Under DNS records for service discovery, for TTL, enter 60.
Choose Next step, Next step, and then Create Service.
Explore the running service
At this point, you have a running NGINX service using Fargate. You can now explore what you have running and how it works. You can also ask it to scale up to two tasks across two Availability Zones in the console.
Go into the service and see details about the associated load balancer, tasks, events, metrics, and logs:
Scale the service from one task to multiple tasks:
Choose Update.
For Number of tasks, enter 2.
Choose Next step, Next step, Next step then Update Service.
Watch the event that is logged and the new additional task both appear.
On the service Details tab, open the NGINX Target Group Name link and see the registered IP targets spread across the two Availability Zones.
Get the DNS name for the Application Load Balancer from the Load Balancers dashboard in the EC2 console, go to it in your browser, and see the default NGINX welcome page.
Walkthrough: Adding a CI/CD pipeline to your service
Now, I’m going to show you how to set up a CI/CD pipeline around this service. It watches a GitHub repo for changes and rebuilds the container with CodeBuild based on the buildspec.yml file and Dockerfile in the repo. If that build is successful, it then updates your Fargate service to deploy the new image.
If you’d prefer to use a CloudFormation Template, this one covers the creation of the dependencies so that the console will pre-fill these (CodeBuild Project and IAM Roles) during the creation of the CodePipeline in the steps below.
Step 1 – Create an ECR repository for the rebuilt container image
An ECR repository is a place to store your container images in a secure and reliable manner. Scaling and self-healing of Fargate tasks requires these images to be always available to be pulled when required. This is an important part of a container platform.
Open the ECS console and ensure that Fargate is available in the selected Region (for example, N. Virginia).
In the left navigation pane, under Amazon ECR, choose Repositories, Get started.
For Repository name, enter nginx (ECR repository names must be lowercase) and choose Next step.
Step 2 – Fork the nginx-codebuild example into your own GitHub account
I have created an example project that takes the Dockerfile and config files for the official NGINX Docker Hub image and adds a buildspec.yml file to tell CodeBuild how to build the container and push it to your new ECR registry on completion. You can fork it into your own GitHub account for this CI/CD demo.
Go to https://github.com/jasonumiker/nginx-codebuild.
In the upper right corner, choose Fork.
Step 3 – Create the pipeline and associated IAM roles
You have two complementary AWS services for building a CI/CD pipeline for your containers. CodeBuild executes the build jobs and CodePipeline kicks off those builds when it notices that the source GitHub or CodeCommit repo changes. If successful, CodePipeline then deploys the new container image to Fargate.
The CodePipeline console can create the associated CodeBuild project, in addition to other dependencies such as the required IAM roles.
Open the CodePipeline console and ensure that Fargate is available in the selected Region (for example, N. Virginia).
Choose Get started.
For Pipeline name, enter NGINX and choose Next step.
For Source provider, choose GitHub.
Choose Connect to GitHub and log in.
For Repository, choose your forked nginx-codebuild repo. For Branch, enter master. Choose Next step.
For Build provider, choose AWS CodeBuild.
Select Create a new build project.
For Project name, enter NGINX.
For Operating system, choose Ubuntu. For Runtime, choose Docker. For Version, select the latest version.
Expand Advanced and set the following environment variables:
AWS_ACCOUNT_ID with a value of the account number
IMAGE_REPO_NAME with a value of nginx (or whatever ECR repository name you used)
Choose Save build project, Next step.
For Deployment provider, choose Amazon ECS.
For Cluster name, enter Fargate.
For Service name, choose NGINX.
For Image filename, enter images.json.
Choose Next step.
Choose Create role, Allow, Next step, and then choose Create pipeline.
Open the IAM console and ensure that Fargate is available in the selected Region (for example, N. Virginia).
In the left navigation pane, choose Roles.
Choose the code-build-nginx-service-role that was just created and choose Attach policy.
For Policy type, choose AmazonEC2ContainerRegistryPowerUser and choose Attach policy.
Step 4 – Start the pipeline
You now have CodePipeline watching the GitHub repo for changes. It kicks off a CodeBuild build job on a change and, if the build is successful, creates a new deployment of the Fargate service with the new image.
Make a change to the source repo (even just adding a new dummy file) and then commit it and push it to master on your GitHub fork. This automatically kicks off the pipeline to build and deploy the change.
Conclusion
As you’ve seen, Fargate is fast and easy to set up, integrates well with the rest of the AWS platform, and saves you from much of the heavy lifting of running containers reliably at scale.
While it is useful to go through creating things in the console to understand them better, we suggest automating them with infrastructure-as-code patterns, such as AWS CloudFormation, to ensure that they are repeatable and that any changes can be managed. There are some example templates to help you get started in this post.
In addition, adding things like unit and integration testing, blue/green and/or manual approval gates into CodePipeline are often a good idea before deploying patterns like this to production in many organizations. Some additional examples to look at next include:
Earlier this month we launched the C5 Instances with Local NVMe Storage and I told you that we would be doing the same for additional instance types in the near future!
Today we are introducing M5 instances equipped with local NVMe storage. Available for immediate use in 5 regions, these instances are a great fit for workloads that require a balance of compute and memory resources. Here are the specs:
Instance Name | vCPUs | RAM | Local Storage | EBS-Optimized Bandwidth | Network Bandwidth
m5d.large | 2 | 8 GiB | 1 x 75 GB NVMe SSD | Up to 2.120 Gbps | Up to 10 Gbps
m5d.xlarge | 4 | 16 GiB | 1 x 150 GB NVMe SSD | Up to 2.120 Gbps | Up to 10 Gbps
m5d.2xlarge | 8 | 32 GiB | 1 x 300 GB NVMe SSD | Up to 2.120 Gbps | Up to 10 Gbps
m5d.4xlarge | 16 | 64 GiB | 1 x 600 GB NVMe SSD | 2.210 Gbps | Up to 10 Gbps
m5d.12xlarge | 48 | 192 GiB | 2 x 900 GB NVMe SSD | 5.0 Gbps | 10 Gbps
m5d.24xlarge | 96 | 384 GiB | 4 x 900 GB NVMe SSD | 10.0 Gbps | 25 Gbps
The M5d instances are powered by Custom Intel® Xeon® Platinum 8175M series processors running at 2.5 GHz, including support for AVX-512.
You can use any AMI that includes drivers for the Elastic Network Adapter (ENA) and NVMe; this includes the latest Amazon Linux, Microsoft Windows (Server 2008 R2, Server 2012, Server 2012 R2 and Server 2016), Ubuntu, RHEL, SUSE, and CentOS AMIs.
Here are a couple of things to keep in mind about the local NVMe storage on the M5d instances:
Naming – You don’t have to specify a block device mapping in your AMI or during the instance launch; the local storage will show up as one or more devices (/dev/nvme*1 on Linux) after the guest operating system has booted.
Encryption – Each local NVMe device is hardware encrypted using the XTS-AES-256 block cipher and a unique key. Each key is destroyed when the instance is stopped or terminated.
Lifetime – Local NVMe devices have the same lifetime as the instance they are attached to, and do not stick around after the instance has been stopped or terminated.
Available Now M5d instances are available in On-Demand, Reserved Instance, and Spot form in the US East (N. Virginia), US West (Oregon), EU (Ireland), US East (Ohio), and Canada (Central) Regions. Prices vary by Region, and are just a bit higher than for the equivalent M5 instances.
Today I’m excited to announce built-in authentication support in Application Load Balancers (ALB). ALB can now securely authenticate users as they access applications, letting developers eliminate the code they have to write to support authentication and offload the responsibility of authentication from the backend. The team built a great live example where you can try out the authentication functionality.
Identity-based security is a crucial component of modern applications, and as customers continue to move mission-critical applications into the cloud, developers are asked to write the same authentication code again and again. Enterprises want to use their on-premises identities with their cloud applications. Web developers want to use federated identities from social networks to allow their users to sign in. ALB’s new authentication action provides authentication through social Identity Providers (IdPs) like Google, Facebook, and Amazon via Amazon Cognito. It also natively integrates with any OpenID Connect-compliant IdP, providing secure authentication and a single sign-on experience across your applications.
How Does ALB Authentication Work?
Authentication is a complicated topic and our readers may have differing levels of expertise with it. I want to cover a few key concepts to make sure we’re all on the same page. If you’re already an authentication expert and you just want to see how ALB authentication works feel free to skip to the next section!
Authentication verifies identity.
Authorization verifies permissions, the things an identity is allowed to do.
OpenID Connect (OIDC) is a simple identity, or authentication, layer built on top of the OAuth 2.0 protocol. The OIDC specification document is pretty well written and worth a casual read.
Identity Providers (IdPs) manage identity information and provide authentication services. ALB supports any OIDC compliant IdP and you can use a service like Amazon Cognito or Auth0 to aggregate different identities from various IdPs like Active Directory, LDAP, Google, Facebook, Amazon, or others deployed in AWS or on premises.
When we get away from the terminology for a bit, all of this boils down to figuring out who a user is and what they’re allowed to do. Doing this securely and efficiently is hard. Traditionally, enterprises have used a protocol called SAML with their IdPs to provide a single sign-on (SSO) experience for their internal users. SAML is XML-heavy, while modern applications have started using OIDC, which shares claims as JSON. Developers can use SAML in ALB with Amazon Cognito’s SAML support. Web app or mobile developers typically use federated identities via social IdPs like Facebook, Amazon, or Google which, conveniently, are also supported by Amazon Cognito.
ALB Authentication works by defining an authentication action in a listener rule. The ALB’s authentication action will check if a session cookie exists on incoming requests, then check that it’s valid. If the session cookie is set and valid then the ALB will route the request to the target group with X-AMZN-OIDC-* headers set. The headers contain identity information in JSON Web Token (JWT) format, that a backend can use to identify a user. If the session cookie is not set or invalid then ALB will follow the OIDC protocol and issue an HTTP 302 redirect to the identity provider. The protocol is a lot to unpack and is covered more thoroughly in the documentation for those curious.
ALB Authentication Walkthrough
I have a simple Python flask app in an Amazon ECS cluster running in some AWS Fargate containers. The containers are in a target group routed to by an ALB. I want to make sure users of my application are logged in before accessing the authenticated portions of my application. First, I’ll navigate to the ALB in the console and edit the rules.
I want to make sure all access to /account* endpoints is authenticated so I’ll add new rule with a condition to match those endpoints.
Now, I’ll add a new rule and create an Authenticate action in that rule.
I’ll have ALB create a new Amazon Cognito user pool for me by providing some configuration details.
After creating the Amazon Cognito pool, I can make some additional configuration changes in the advanced settings.
I can change the default cookie name, adjust the timeout, adjust the scope, and choose the action for unauthenticated requests.
I can pick Deny to serve a 401 for all unauthenticated requests or I can pick Allow which will pass through to the application if unauthenticated. This is useful for Single Page Apps (SPAs). For now, I’ll choose Authenticate, which will prompt the IdP, in this case Amazon Cognito, to authenticate the user and reload the existing page.
Now I’ll add a forwarding action for my target group and save the rule.
Over on the Facebook side I just need to add my Amazon Cognito User Pool Domain to the whitelisted OAuth redirect URLs.
I would follow similar steps for other authentication providers.
Now, when I navigate to an authenticated page my Fargate containers receive the originating request with the X-Amzn-Oidc-* headers set by ALB. Using the information in those headers (claims-data, identity, access-token) my application can implement authorization.
All of this was possible without having to write a single line of code to deal with each of the IdPs. However, it’s still important for the implementing applications to verify the signature on the JWT header to ensure the request hasn’t been tampered with.
Additional Resources
Of course, everything we’ve seen today is also available in the API and AWS Command Line Interface (CLI). You can find additional information on the feature in the documentation. This feature is provided at no additional charge.
With authentication built in to ALB, developers can focus on building their applications instead of rebuilding authentication for every application, all while maintaining the scale, availability, and reliability of ALB. I think this feature is a pretty big deal and I can’t wait to see what customers build with it. Let us know what you think of this feature in the comments or on Twitter!
Thanks to Greg Eppel, Sr. Solutions Architect, Microsoft Platform for this great blog that describes how to create a custom CodeBuild build environment for the .NET Framework. — AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy. CodeBuild provides curated build environments for programming languages and runtimes such as Android, Go, Java, Node.js, PHP, Python, Ruby, and Docker. CodeBuild now supports builds for the Microsoft Windows Server platform, including a prepackaged build environment for .NET Core on Windows. If your application uses the .NET Framework, you will need to use a custom Docker image to create a custom build environment that includes the Microsoft proprietary Framework Class Libraries. For information about why this step is required, see our FAQs. In this post, I’ll show you how to create a custom build environment for .NET Framework applications and walk you through the steps to configure CodeBuild to use this environment.
Build environments are Docker images that include a complete file system with everything required to build and test your project. To use a custom build environment in a CodeBuild project, you build a container image for your platform that contains your build tools, push it to a Docker container registry such as Amazon Elastic Container Registry (Amazon ECR), and reference it in the project configuration. When it builds your application, CodeBuild retrieves the Docker image from the container registry specified in the project configuration and uses the environment to compile your source code, run your tests, and package your application.
Step 1: Launch EC2 Windows Server 2016 with Containers
In the Amazon EC2 console, in your region, launch an Amazon EC2 instance from a Microsoft Windows Server 2016 Base with Containers AMI.
Increase disk space on the boot volume to at least 50 GB to account for the larger size of containers required to install and run Visual Studio Build Tools.
Run the following command in the directory that contains your Dockerfile. This process can take a while; it depends on the size of the EC2 instance you launched. In my tests, a t2.2xlarge takes less than 30 minutes to build the image and produces an image of approximately 15 GB.
docker build -t buildtools2017:latest -m 2GB .
Run the following command to test the container and start a command shell with all the developer environment variables:
docker run -it buildtools2017
Create a repository in the Amazon ECR console. For the repository name, type buildtools2017. Choose Next step and then complete the remaining steps.
Execute the following command to generate a docker login command that authenticates your local Docker engine to the Amazon ECR registry. Make sure you have permissions to the Amazon ECR registry before you execute the command.
aws ecr get-login
In the same command prompt window, copy and paste the following commands:
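For example, the following commands (with the account ID and Region placeholders replaced by your own values) tag the local image with the registry URI and push it:

docker tag buildtools2017:latest <aws-account-id>.dkr.ecr.<region>.amazonaws.com/buildtools2017:latest
docker push <aws-account-id>.dkr.ecr.<region>.amazonaws.com/buildtools2017:latest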
In the CodeCommit console, create a repository named DotNetFrameworkSampleApp. On the Configure email notifications page, choose Skip.
Clone a .NET Framework Docker sample application from GitHub. The repository includes a sample ASP.NET Framework application that we’ll use to demonstrate our custom build environment. On the EC2 instance, open a command prompt and execute the following commands:
Navigate to the CodeCommit repository and confirm that the files you just pushed are there.
Step 4: Configure build spec
To build your .NET Framework application with CodeBuild you use a build spec, which is a collection of build commands and related settings, in YAML format, that AWS CodeBuild can use to run a build. You can include a build spec as part of the source code or you can define a build spec when you create a build project. In this example, I include a build spec as part of the source code.
In the root of your source directory, create a YAML file named buildspec.yml.
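The build spec itself is not reproduced here. A minimal sketch, assuming a solution file named MySolution.sln at the repository root and that nuget and msbuild are on the PATH of your custom image, might look like this:
version: 0.2
phases:
  build:
    commands:
      - nuget restore
      - msbuild MySolution.sln
artifacts:
  files:
    - '**/*'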
At this point, we have a Docker image with Visual Studio Build Tools installed and stored in the Amazon ECR registry. We also have a sample ASP.NET Framework application in a CodeCommit repository. Now we are going to set up CodeBuild to build the ASP.NET Framework application.
In the Amazon ECR console, choose the repository to which you pushed the image earlier with the docker push command. On the Permissions tab, choose Add.
For Source Provider, choose AWS CodeCommit and then choose the DotNetFrameworkSampleApp repository that you created earlier.
For Environment Image, choose Specify a Docker image.
For Environment type, choose Windows.
For Custom image type, choose Amazon ECR.
For Amazon ECR repository, choose the Docker image with the Visual Studio Build Tools installed, buildtools2017. Your configuration should look like the image below:
Choose Continue and then Save and Build to create your CodeBuild project and start your first build. You can monitor the status of the build in the console. You can also configure notifications that will notify subscribers whenever builds succeed, fail, go from one phase to another, or any combination of these events.
Summary
CodeBuild supports a number of platforms and languages out of the box. By using custom build environments, it can be extended to other runtimes. In this post, I showed you how to build a .NET Framework environment on a Windows container and demonstrated how to use it to build .NET Framework applications in CodeBuild.
We’re excited to see how customers extend and use CodeBuild to enable continuous integration and continuous delivery for their Windows applications. Feel free to share what you’ve learned extending CodeBuild for your own projects. Just leave questions or suggestions in the comments.
As you can see from my EC2 Instance History post, we add new instance types on a regular and frequent basis. Driven by increasingly powerful processors and designed to address an ever-widening set of use cases, the size and diversity of this list reflects the equally diverse group of EC2 customers!
Near the bottom of that list you will find the new compute-intensive C5 instances. With a 25% to 50% improvement in price-performance over the C4 instances, the C5 instances are designed for applications like batch and log processing, distributed and/or real-time analytics, high-performance computing (HPC), ad serving, highly scalable multiplayer gaming, and video encoding. Some of these applications can benefit from access to high-speed, ultra-low latency local storage. For example, video encoding, image manipulation, and other forms of media processing often necessitate large amounts of I/O to temporary storage. While the input and output files are valuable assets and are typically stored as Amazon Simple Storage Service (S3) objects, the intermediate files are expendable. Similarly, batch and log processing runs in a race-to-idle model, flushing volatile data to disk as fast as possible in order to make full use of compute resources.
New C5d Instances with Local Storage
In order to meet this need, we are introducing C5 instances equipped with local NVMe storage. Available for immediate use in 5 regions, these instances are a great fit for the applications that I described above, as well as others that you will undoubtedly dream up! Here are the specs:
| Instance Name | vCPUs | RAM | Local Storage | EBS Bandwidth | Network Bandwidth |
| --- | --- | --- | --- | --- | --- |
| c5d.large | 2 | 4 GiB | 1 x 50 GB NVMe SSD | Up to 2.25 Gbps | Up to 10 Gbps |
| c5d.xlarge | 4 | 8 GiB | 1 x 100 GB NVMe SSD | Up to 2.25 Gbps | Up to 10 Gbps |
| c5d.2xlarge | 8 | 16 GiB | 1 x 225 GB NVMe SSD | Up to 2.25 Gbps | Up to 10 Gbps |
| c5d.4xlarge | 16 | 32 GiB | 1 x 450 GB NVMe SSD | 2.25 Gbps | Up to 10 Gbps |
| c5d.9xlarge | 36 | 72 GiB | 1 x 900 GB NVMe SSD | 4.5 Gbps | 10 Gbps |
| c5d.18xlarge | 72 | 144 GiB | 2 x 900 GB NVMe SSD | 9 Gbps | 25 Gbps |
Other than the addition of local storage, the C5 and C5d share the same specs. Both are powered by 3.0 GHz Intel Xeon Platinum 8000-series processors, optimized for EC2 and with full control over C-states on the two largest sizes, giving you the ability to run two cores at up to 3.5 GHz using Intel Turbo Boost Technology.
You can use any AMI that includes drivers for the Elastic Network Adapter (ENA) and NVMe; this includes the latest Amazon Linux, Microsoft Windows (Server 2008 R2, Server 2012, Server 2012 R2 and Server 2016), Ubuntu, RHEL, SUSE, and CentOS AMIs.
Here are a couple of things to keep in mind about the local NVMe storage:
Naming – You don’t have to specify a block device mapping in your AMI or during the instance launch; the local storage will show up as one or more devices (/dev/nvme*1 on Linux) after the guest operating system has booted (see the example after this list).
Encryption – Each local NVMe device is hardware encrypted using the XTS-AES-256 block cipher and a unique key. Each key is destroyed when the instance is stopped or terminated.
Lifetime – Local NVMe devices have the same lifetime as the instance they are attached to, and do not stick around after the instance has been stopped or terminated.
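To illustrate the naming note above, here is a minimal sketch of preparing a local NVMe volume on a Linux instance; the device name /dev/nvme1n1 and the /scratch mount point are assumptions and may differ on your instance:
lsblk                             # list block devices; the local NVMe volume shows up as an nvme device
sudo mkfs -t ext4 /dev/nvme1n1    # create a file system on the local NVMe device
sudo mkdir /scratch               # create a mount point for temporary data
sudo mount /dev/nvme1n1 /scratch  # mount it; remember that the data does not survive a stop or terminate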
Available Now
C5d instances are available in On-Demand, Reserved Instance, and Spot form in the US East (N. Virginia), US West (Oregon), EU (Ireland), US East (Ohio), and Canada (Central) Regions. Prices vary by Region, and are just a bit higher than for the equivalent C5 instances.
Today I’m excited to announce a new Machine Learning Competency for Consulting Partners in the Amazon Partner Network (APN). This AWS Competency program allows APN Consulting Partners to demonstrate a deep expertise in machine learning on AWS by providing solutions that enable machine learning and data science workflows for their customers. This new AWS Competency is in addition to the Machine Learning Competency for our APN Technology Partners that we launched at the re:Invent 2017 partner summit.
These APN Consulting Partners help organizations solve their machine learning and data challenges through:
Data services that help data scientists and machine learning practitioners prepare their enterprise data for training.
Platform solutions that provide data scientists and machine learning practitioners with tools to take their data, train models, and make predictions on new data.
SaaS and API solutions that enable predictive capabilities within customer applications.
Why work with an AWS Machine Learning Competency Partner?
The AWS Competency Program helps customers find the most qualified partners with deep expertise. AWS Machine Learning Competency Partners undergo a strict validation of their capabilities to demonstrate technical proficiency and proven customer success with AWS machine learning tools.
If you’re an AWS customer interested in machine learning workloads on AWS, check out our AWS Machine Learning launch partners below:
Interested in becoming an AWS Machine Learning Competency Partner?
APN Partners with experience in Machine Learning can learn more about becoming an AWS Machine Learning Competency Partner here. To learn more about the benefits of joining the AWS Partner Network, see our APN Partner website.
Thanks to the AWS Partner Team for their help with this post! – Randall
EC2 Spot Fleets are really cool. You can launch a fleet of Spot Instances that spans EC2 instance types and Availability Zones without having to write custom code to discover capacity or monitor prices. You can set the target capacity (the size of the fleet) in units that are meaningful to your application and have Spot Fleet create and then maintain the fleet on your behalf. Our customers are creating Spot Fleets of all sizes. For example, one financial service customer runs Monte Carlo simulations across 10 different EC2 instance types. They routinely make requests for hundreds of thousands of vCPUs and count on Spot Fleet to give them access to massive amounts of capacity at the best possible price.
EC2 Fleet
Today we are extending and generalizing the set-it-and-forget-it model that we pioneered in Spot Fleet with EC2 Fleet, a new building block that gives you the ability to create fleets that are composed of a combination of EC2 On-Demand, Reserved, and Spot Instances with a single API call. You tell us what you need, capacity and instance-wise, and we’ll handle all the heavy lifting. We will launch, manage, monitor and scale instances as needed, without the need for scaffolding code.
You can specify the capacity of your fleet in terms of instances, vCPUs, or application-oriented units, and also indicate how much of the capacity should be fulfilled by Spot Instances. The application-oriented units allow you to specify the relative power of each EC2 instance type in a way that directly maps to the needs of your application. All three capacity specification options (instances, vCPUs, and application-oriented units) are known as weights.
I think you’ll find a number of ways this feature makes managing a fleet of instances easier, and believe that you will also find the team’s near-term feature roadmap of interest (more on that in a bit).
Using EC2 Fleet
There are a number of ways that you can use this feature, whether you’re running a stateless web service, a big data cluster, or a continuous integration pipeline. Today I’m going to describe how you can use EC2 Fleet for genomic processing, but this is similar to workloads like risk analysis, log processing, or image rendering. Modern DNA sequencers can produce multiple terabytes of raw data each day; to process that data into meaningful information in a timely fashion, you need lots of processing power. I’ll be showing you how to deploy a “grid” of worker nodes that can quickly crunch through secondary analysis tasks in parallel.
Projects in genomics can use the elasticity EC2 provides to experiment and try out new pipelines on hundreds or even thousands of servers. With EC2 you can access as many cores as you need and only pay for what you use. Prior to today, you would need to use the RunInstances API or an Auto Scaling group for the On-Demand & Reserved Instance portion of your grid. To get the best price performance you’d also create and manage a Spot Fleet or multiple Spot Auto Scaling groups with different instance types if you wanted to add Spot Instances to turbo-boost your secondary analysis. Finally, to automate scaling decisions across multiple APIs and Auto Scaling groups you would need to write Lambda functions that periodically assess your grid’s progress & backlog, as well as current Spot prices – modifying your Auto Scaling Groups and Spot Fleets accordingly.
You can now replace all of this with a single EC2 Fleet, analyzing genomes at scale for as little as $1 per analysis. In my grid, each step in the pipeline requires 1 vCPU and 4 GiB of memory, a perfect match for M4 and M5 instances with 4 GiB of memory per vCPU. I will create a fleet using M4 and M5 instances with weights that correspond to the number of vCPUs on each instance:
m4.16xlarge – 64 vCPUs, weight = 64
m5.24xlarge – 96 vCPUs, weight = 96
This is expressed in a template that looks like this:
By default, EC2 Fleet will select the most cost-effective combination of instance types and Availability Zones (both specified in the template) using the current prices for the Spot Instances and public prices for the On-Demand Instances (if you specify instances for which you have matching RIs, your discounts will apply). The default mode takes weights into account to get the instances that have the lowest price per unit. So for my grid, EC2 Fleet will find the instances that offer the lowest price per vCPU.
Now I can request capacity in terms of vCPUs, knowing EC2 Fleet will select the lowest cost option using only the instance types I’ve defined as acceptable. Also, I can specify how many vCPUs I want to launch using On-Demand or Reserved Instance capacity and how many vCPUs should be launched using Spot Instance capacity:
The above means that I want a total of 2880 vCPUs, with 960 vCPUs fulfilled using On-Demand and 1920 using Spot. The On-Demand price per vCPU is lower for m5.24xlarge than the On-Demand price per vCPU for m4.16xlarge, so EC2 Fleet will launch 10 m5.24xlarge instances to fulfill 960 vCPUs. Based on current Spot pricing (again, on a per-vCPU basis), EC2 Fleet will choose to launch 30 m4.16xlarge instances or 20 m5.24xlarges, delivering 1920 vCPUs either way.
Putting it all together, I have a single file (fl1.json) that describes my fleet:
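The actual file is not reproduced here. A sketch of what fl1.json could contain, using the EC2 CreateFleet request fields and a placeholder launch template ID, is shown below:
{
  "LaunchTemplateConfigs": [
    {
      "LaunchTemplateSpecification": {
        "LaunchTemplateId": "lt-0123456789abcdef0",
        "Version": "1"
      },
      "Overrides": [
        { "InstanceType": "m4.16xlarge", "WeightedCapacity": 64 },
        { "InstanceType": "m5.24xlarge", "WeightedCapacity": 96 }
      ]
    }
  ],
  "TargetCapacitySpecification": {
    "TotalTargetCapacity": 2880,
    "OnDemandTargetCapacity": 960,
    "SpotTargetCapacity": 1920,
    "DefaultTargetCapacityType": "spot"
  }
}
The fleet itself can then be created with aws ec2 create-fleet --cli-input-json file://fl1.json.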
My entire fleet is created within seconds and was built using 10 m5.24xlarge On-Demand Instances and 30 m4.16xlarge Spot Instances, since the current Spot price was 1.5¢ per vCPU for m4.16xlarge and 1.6¢ per vCPU for m5.24xlarge.
Now let’s imagine my grid has crunched through its backlog and no longer needs the additional Spot Instances. I can then modify the size of my fleet by changing the target capacity in my fleet specification, like this:
{
  "TotalTargetCapacity": 960
}
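One way to apply this change from the command line is sketched below; the fleet ID is a placeholder:
aws ec2 modify-fleet --fleet-id fleet-12345678-1234-1234-1234-123456789012 --target-capacity-specification TotalTargetCapacity=960
aws ec2 describe-fleets --fleet-ids fleet-12345678-1234-1234-1234-123456789012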
Since 960 was equal to the amount of On-Demand vCPUs I had requested, when I describe my fleet I will see all of my capacity being delivered using On-Demand capacity:
Earlier I described how RI discounts apply when EC2 Fleet launches instances for which you have matching RIs, so you might be wondering how else RI customers benefit from EC2 Fleet. Let’s say that I own regional RIs for M4 instances. In my EC2 Fleet I would remove m5.24xlarge and specify m4.10xlarge and m4.16xlarge. Then when EC2 Fleet creates the grid, it will quickly find M4 capacity across the sizes and AZs I’ve specified, and my RI discounts apply automatically to this usage.
In the Works
We plan to connect EC2 Fleet and EC2 Auto Scaling groups. This will let you create a single fleet that mixes instance types and Spot, Reserved, and On-Demand purchase options, while also taking advantage of EC2 Auto Scaling features such as health checks and lifecycle hooks. This integration will also bring EC2 Fleet functionality to services such as Amazon ECS, Amazon EKS, and AWS Batch that build on and make use of EC2 Auto Scaling for fleet management.
Available Now
You can create and make use of EC2 Fleets today in all public AWS Regions!
Enterprises adopt containers because they recognize the benefits: speed, agility, portability, and high compute density. They understand how accelerating application delivery and deployment pipelines makes it possible to rapidly slipstream new features to customers. Although the benefits are indisputable, this acceleration raises concerns about security and corporate compliance with software governance. In this blog post, I provide a solution that shows how Layered Insight, the pioneer and global leader in container-native application protection, can be used with seamless application build and delivery pipelines like those available in AWS CodeBuild to address these concerns.
Layered Insight solutions
Layered Insight enables organizations to unify DevOps and SecOps by providing complete visibility and control of containerized applications. Using the industry’s first embedded security approach, Layered Insight solves the challenges of container performance and protection by providing accurate insight into container images, adaptive analysis of running containers, and automated enforcement of container behavior.
AWS CodeBuild
AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy. With CodeBuild, you don’t need to provision, manage, and scale your own build servers. CodeBuild scales continuously and processes multiple builds concurrently, so your builds are not left waiting in a queue. You can get started quickly by using prepackaged build environments, or you can create custom build environments that use your own build tools.
Problem Definition
Security and compliance concerns span the lifecycle of application containers. Common concerns include:
Visibility into the container images. You need to verify the software composition information of the container image to determine whether known vulnerabilities associated with any of the software packages and libraries are included in the container image.
Governance of container images is critical because only certain open source packages/libraries, of specific versions, should be included in the container images. You need support for mechanisms for blacklisting all container images that include a certain version of a software package/library, or only allowing open source software that comes with a specific type of license (such as Apache, MIT, GPL, and so on). You need to be able to address challenges such as:
· Defining the process for image compliance policies at the enterprise, department, and group levels.
· Preventing the images that fail the compliance checks from being deployed in critical environments, such as staging, pre-prod, and production.
Visibility into running container instances is critical, including:
· CPU and memory utilization.
· Security of the build environment.
· All activities (system, network, storage, and application layer) of the application code running in each container instance.
Protection of running container instances that is:
· Zero-touch to the developers (not an SDK-based approach).
· Zero touch to the DevOps team and doesn’t limit the portability of the containerized application.
· This protection must retain the option to switch to a different container stack or orchestration layer, or even to a different Container as a Service (CaaS).
· And it must be a fully automated solution to SecOps, so that the SecOps team doesn’t have to manually analyze and define detailed blacklist and whitelist policies.
Solution Details
In AWS CodeCommit, we have three projects:
● “Democode” is a simple Java application, with one buildspec to build the app into a Docker container (run by the build-demo-image CodeBuild project), and another to instrument said container (the instrument-image CodeBuild project). The resulting container is stored in the ECR repo javatest as javatest:20180415-layered. This instrumented container is running in the AWS Fargate cluster demo-java-app and can be seen in the Layered Insight runtime console as the javatest application in us-east-1.
● aws-codebuild-docker-images is a clone of the official aws-codebuild-docker-images repo on GitHub. This CodeCommit project is used by the build-python-builder CodeBuild project to build the Python 3.3.6 CodeBuild image, which is stored in the codebuild-python ECR repo. We then manually instructed the Layered Insight console to instrument the image.
● scan-java-image contains just a buildspec.yml file. This file is used by the scan-java-image CodeBuild project to instruct Layered Assessment to perform a vulnerability scan of the javatest container image built previously, and then run the scan results through a compliance policy that states there should be no medium vulnerabilities. This build fails — but in this case that is a success: the scan completes successfully, but compliance fails as there are medium-level issues found in the scan.
This build is performed using the instrumented version of the Python 3.3.6 CodeBuild image, so the activity of the processes running within the build is recorded each time in the LI console.
Build container image
Create or use a CodeCommit project with your application. To build the image and store it in Amazon Elastic Container Registry (Amazon ECR), add a buildspec file to the project, create a CodeBuild project, and build the container image.
Scan container image
Once the image is built, create a new buildspec in the same project or a new one that looks similar to below (update ECR URL as necessary):
version: 0.2
phases:
  pre_build:
    commands:
      - echo Pulling down LI Scan API client scripts
      - git clone https://github.com/LayeredInsight/scan-api-example-python.git
      - echo Setting up LI Scan API client
      - cd scan-api-example-python
      - pip install layint_scan_api
      - pip install -r requirements.txt
  build:
    commands:
      - echo Scanning container started on `date`
      - IMAGEID=$(./li_add_image --name <aws-region>.amazonaws.com/javatest:20180415)
      - ./li_wait_for_scan -v --imageid $IMAGEID
      - ./li_run_image_compliance -v --imageid $IMAGEID --policyid PB15260f1acb6b2aa5b597e9d22feffb538256a01fbb4e5a95
Add the buildspec file to the git repo, push it, and then build a CodeBuild project using the instrumented Python 3.3.6 CodeBuild image at <aws-region>.amazonaws.com/codebuild-python:3.3.6-layered. Set the following environment variables in the CodeBuild project:
● LI_APPLICATIONNAME – name of the build to display
● LI_LOCATION – location of the build project to display
● LI_API_KEY – ApiKey:<key-name>:<api-key>
● LI_API_HOST – location of the Layered Insight API service
Instrument container image
Next, to instrument the new container image:
In the Layered Insight runtime console, ensure that the ECR registry and credentials are defined (click the Setup icon and the ‘+’ sign on the top right of the screen to add a new container registry). Note the name given to the registry in the console, as this needs to be referenced in the li_add_image command in the script, below.
Next, add a new buildspec (with a new name) to the CodeCommit project, such as the one shown below. This code will download the Layered Insight runtime client, and use it to instruct the Layered Insight service to instrument the image that was just built:
version: 0.2
phases:
  pre_build:
    commands:
      - echo Pulling down LI API Runtime client scripts
      - git clone https://github.com/LayeredInsight/runtime-api-example-python
      - echo Setting up LI API client
      - cd runtime-api-example-python
      - pip install layint-runtime-api
      - pip install -r requirements.txt
  build:
    commands:
      - echo Instrumentation started on `date`
      - ./li_add_image --registry "Javatest ECR" --name IMAGE_NAME:TAG --description "IMAGE DESCRIPTION" --policy "Default Policy" --instrument --wait --verbose
Commit and push the new buildspec file.
Going back to CodeBuild, create a new project with the same CodeCommit repo, but this time select the new buildspec file. Use a Python 3.3.6 builder – either the AWS or LI instrumented version.
Click Continue
Click Save
Run the build, again on the master branch.
If everything runs successfully, a new image should appear in the ECR registry with a -layered suffix. This is the instrumented image.
Run instrumented container image
When the instrumented container is now run — in ECS, Fargate, or elsewhere — it will log data back to the Layered Insight runtime console. Its appearance in the console can be modified by setting the LI_APPLICATIONNAME and LI_LOCATION environment variables when running the container.
Conclusion
In this post, we showed you the steps needed to embed governance and runtime security into your build pipelines running on AWS CodeBuild using Layered Insight.
AWS CloudFormation allows developers and systems administrators to easily create and manage a collection of related AWS resources (called a CloudFormation stack) by provisioning and updating them in an orderly and predictable way. CloudFormation users can now deploy and manage AWS Batch resources in exactly the same way that they are managing the rest of their AWS infrastructure.
This post highlights the native resources supported in CloudFormation and demonstrates how to create AWS Batch compute environments using CloudFormation. All sample CloudFormation, per-region templates related to this post can be found on the CloudFormation sample template site. The Ohio (us-east-2) Region is used as the example region for the remainder of this post.
AWS Batch Resources
AWS Batch is a managed service that helps you efficiently run batch computing workloads on the AWS Cloud. Users submit jobs to job queues, specifying the application to be run and their jobs’ CPU and memory requirements. AWS Batch is responsible for launching the appropriate quantity and types of instances needed to run your jobs.
AWS Batch removes the undifferentiated heavy lifting of configuring and managing compute infrastructure, allowing you to instead focus on your applications and users. This is demonstrated in the How AWS Batch Works video.
AWS Batch manages the following resources:
Job definitions
Job queues
Compute environments
A job definition specifies how jobs are to be run—for example, which Docker image to use for your job, how many vCPUs and how much memory is required, the IAM role to be used, and more.
Jobs are submitted to job queues where they reside until they can be scheduled to run on Amazon EC2 instances within a compute environment. An AWS account can have multiple job queues, each with varying priority. This gives you the ability to closely align the consumption of compute resources with your organizational requirements.
Compute environments provision and manage your EC2 instances and other compute resources that are used to run your AWS Batch jobs. Job queues are mapped to one or more compute environments, and a given environment can also be mapped to one or more job queues. This many-to-many relationship is defined by the compute environment order and job queue priority properties.
The following diagram shows a general overview of how the AWS Batch resources interact.
CloudFormation stack creation and updates
Upon the creation of your stack, an AWS Batch job definition is registered using your CloudFormation template. If a job definition with the same name has already been registered, a new revision is created. On stack updates, any changes to your job definition specifications in the CloudFormation template result in a new revision of that job definition and a deregistration of the previous job definition revision. Stack deletions only result in the deregistration of your job definition, as AWS Batch does not delete job definitions.
At stack creation, a job queue is created using the template. Any changes to your job queue properties within the stack result in a call to the UpdateJobQueue API action. Similarly, stack deletions result in the deletion of your job queues.
CloudFormation creates an AWS Batch compute environment using the properties specified in your template. Stack updates result in updates to your compute environment where possible. If you need to change a parameter that is not supported by the UpdateComputeEnvironment API action, stack updates result in the deletion and re-creation of your compute environment. Upon stack deletion, your compute environment is disabled, and then deleted.
All naming conventions specified by CloudFormation should be followed—especially in the case of resource replacement—or you run the risk of failed stack changes. For example, all AWS Batch resource property names must be capitalized, and resource names must be changed in the case of resource replacement, as is the case in any CloudFormation stack.
If you do not provide values for ComputeEnvironmentName, JobQueueName, or JobDefinitionName in your template, a pseudo-random name is generated for you using the logical ID that you gave the resource in CloudFormation.
Launching a “Hello World” example stack
Here’s a familiar “Hello World” example of a CloudFormation stack with AWS Batch resources.
This example registers a simple job definition, a job queue that can accept job submissions, and a compute environment that contains the compute resources used to execute your job. The stack template also creates additional AWS resources that are required by AWS Batch:
An IAM service role that gives AWS Batch permissions to take the required actions on your behalf
An IAM ECS instance role
A VPC
A VPC subnet (though I’ve provided a general template, I suggest that this be a private subnet)
A security group
This stack can easily be deployed in the CloudFormation console, but I provide CLI commands that complete the stack creation for you. Use the Launch stack button or run the following command:
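The CLI command is not reproduced here. A sketch of it follows; the stack name is an example, <template-url> is a placeholder for the sample template for your Region, and CAPABILITY_IAM is required because the stack creates IAM roles:
aws cloudformation create-stack --region us-east-2 --stack-name hello-world-batch --template-url <template-url> --capabilities CAPABILITY_IAM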
You can monitor the creation of the resources in your CloudFormation stack in the CloudFormation console, on the Events tab:
Confirm the successful creation of your stack by observing a CREATE_COMPLETE status. At this point, you should also be able to view the new resource ARNs on the Outputs tab:
After your stack is successfully created, everything that you need to submit a “hello-world” job is complete.
Make sure to use the correct job definition name and revision number. You can find the exact Amazon Resource Name (ARN) on the CloudFormation stack Outputs tab. A pseudo-random resource name is generated for your AWS Batch resources. If you do have an existing hello-world job definition, make sure that you run the command with the job definition revision created by your new CloudFormation stack from the stack outputs.
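A sketch of the submit command, using placeholders for the names that your stack generated:
aws batch submit-job --region us-east-2 --job-name hello-world --job-queue <job-queue-name-or-arn> --job-definition <job-definition-name>:<revision>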
You can monitor the successful execution of the job in the AWS Batch console under Jobs:
When you are done using this stack and want to delete the resources, run the following command. CloudFormation deregisters the job definition, and deletes the job queue, compute environment, and the rest of the resources in the stack template.
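If you created the stack from the CLI, the cleanup call is sketched below (use whatever stack name you chose at creation):
aws cloudformation delete-stack --region us-east-2 --stack-name hello-world-batch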
Now that you know the basics of AWS Batch resources, here’s a more complex example.
High- and low-priority job queues with On-Demand and Spot compute environments
This CloudFormation stack creates two job queues with varying priority and two compute environments. You have one On-Demand compute environment and one Spot compute environment with a Spot price at 40% of On-Demand.
The first job queue is higher priority and feeds jobs to both compute environments, while the lower priority job queue only submits jobs for execution to the Spot compute environment.
There are two job definitions, one high-priority job queue, and one low-priority job queue. Each job submitted using a given job definition is submitted to a job queue. For example, jobs submitted with an important-production-application job definition are submitted to the high-priority job queue, while jobs submitted with a test-application job definition are submitted to the low-priority job queue.
This example registers both job definitions and creates your compute environments and job queues. It also creates the VPC, subnet, security group, IAM service role for AWS Batch, ECS instance role, and an IAM Spot Fleet role. Use the Launch stack button or run the following command:
As with any CloudFormation stack, you can update resources for your application’s specific needs. AWS CloudFormation Designer is a graphic tool for creating, viewing, and modifying CloudFormation templates.
Any change to a resource property that requires replacement results in the creation of a new resource to reflect this change, and the deletion of the obsolete resource. Changes to immutable compute environment or job queue properties result in replacement. Changes to updateable properties update the existing resource. Any changes to job definitions (beyond the name) result in the registration of a new revision of the existing job definition, followed by the deregistration of the previous revision.
Finally, run the following command to delete the CloudFormation stack containing your AWS Batch resources:
In this post, I detailed the steps to create, update with and without replacement, and delete your AWS Batch resources using CloudFormation templates as part of CloudFormation stacks with other AWS service resources. For more information, see the following topics:
Many of today’s discussions around blockchain technology remind me of the classic Shimmer Floor Wax skit. According to Dan Aykroyd, Shimmer is a dessert topping. Gilda Radner claims that it is a floor wax, and Chevy Chase settles the debate and reveals that it actually is both! Some of the people that I talk to see blockchains as the foundation of a new monetary system and a way to facilitate international payments. Others see blockchains as a distributed ledger and immutable data source that can be applied to logistics, supply chain, land registration, crowdfunding, and other use cases. Either way, it is clear that there are a lot of intriguing possibilities and we are working to help our customers use this technology more effectively.
We are launching AWS Blockchain Templates today. These templates will let you launch an Ethereum (either public or private) or Hyperledger Fabric (private) network in a matter of minutes and with just a few clicks. The templates create and configure all of the AWS resources needed to get you going in a robust and scalable fashion.
Launching a Private Ethereum Network
The Ethereum template offers two launch options. The ecs option creates an Amazon ECS cluster within a Virtual Private Cloud (VPC) and launches a set of Docker images in the cluster. The docker-local option also runs within a VPC, and launches the Docker images on EC2 instances. The template supports Ethereum mining, the EthStats and EthExplorer status pages, and a set of nodes that implement and respond to the Ethereum RPC protocol. Both options create and make use of a DynamoDB table for service discovery, along with Application Load Balancers for the status pages.
Here are the AWS Blockchain Templates for Ethereum:
I start by opening the CloudFormation Console in the desired region and clicking Create Stack:
I select Specify an Amazon S3 template URL, enter the URL of the template for the region, and click Next:
I give my stack a name:
Next, I enter the first set of parameters, including the network ID for the genesis block. I’ll stick with the default values for now:
I will also use the default values for the remaining network parameters:
Moving right along, I choose the container orchestration platform (ecs or docker-local, as I explained earlier) and the EC2 instance type for the container nodes:
Next, I choose my VPC and the subnets for the Ethereum network and the Application Load Balancer:
I configure my keypair, EC2 security group, IAM role, and instance profile ARN (full information on the required permissions can be found in the documentation):
The Instance Profile ARN can be found on the summary page for the role:
I confirm that I want to deploy EthStats and EthExplorer, choose the tag and version for the nested CloudFormation templates that are used by this one, and click Next to proceed:
On the next page I specify a tag for the resources that the stack will create, leave the other options as-is, and click Next:
I review all of the parameters and options, acknowledge that the stack might create IAM resources, and click Create to build my network:
The template makes use of three nested templates:
After all of the stacks have been created (mine took about 5 minutes), I can select JeffNet and click the Outputs tab to discover the links to EthStats and EthExplorer:
Here’s my EthStats:
And my EthExplorer:
If I am writing apps that make use of my private network to store and process smart contracts, I would use the EthJsonRpcUrl.
Stay Tuned
My colleagues are eager to get your feedback on these new templates and plan to add new versions of the frameworks as they become available.
Thanks to Raja Mani, AWS Solutions Architect, for this great blog.
—
In this blog post, I’ll walk you through the steps for setting up continuous replication of an AWS CodeCommit repository from one AWS region to another AWS region using a serverless architecture. CodeCommit is a fully-managed, highly scalable source control service that stores anything from source code to binaries. It works seamlessly with your existing Git tools and eliminates the need to operate your own source control system. Replicating an AWS CodeCommit repository from one AWS region to another AWS region enables you to achieve lower latency pulls for global developers. This same approach can also be used to automatically back up repositories currently hosted on other services (for example, GitHub or BitBucket) to AWS CodeCommit.
This solution uses AWS Lambda and AWS Fargate for continuous replication. Benefits of this approach include:
The replication process can be easily set up to trigger based on events, such as commits made to the repository.
Setting up a serverless architecture means you don’t need to provision, maintain, or administer servers.
Note: AWS Fargate has a limitation of 10 GB for storage and is available in US East (N. Virginia) region. A similar solution that uses Amazon EC2 instances to replicate the repositories on a schedule was published in a previous blog and can be used if your repository does not meet these conditions.
Replication using Fargate
As you follow this blog post, you’ll set up an architecture that looks like this:
Any change in the AWS CodeCommit repository will trigger a Lambda function. The Lambda function will call the Fargate task that replicates the repository using a Git command line tool.
Let us assume a user wants to replicate a repository (Source) from US East (N. Virginia/us-east-1) region to a repository (Destination) in US West (Oregon/us-west-2) region. I’ll walk you through the steps for it:
Prerequisites
Create an AWS service IAM role for Amazon EC2 that has permissions for both the source and destination repositories, the IAM CreateRole and AttachRolePolicy actions, and Amazon ECR. Here is the EC2 role policy I used:
You need a Docker environment to build this solution. You can launch an EC2 instance and install Docker, or you can use AWS Cloud9, which comes with Docker and Git preinstalled. I used an EC2 instance and installed Docker on it. Use the IAM role created in the previous step when creating the EC2 instance. I am going to refer to this environment as the “Docker Environment” in the following steps.
You need to install the AWS CLI on the Docker environment. For AWS CLI installation, refer to this page.
You need to install Git, including a Git command line on the Docker environment.
Step 1: Create the Docker image
To create the Docker image, you first need a Dockerfile. A Dockerfile is a manifest that describes the base image to use for your Docker image and what you want installed and running on it. For more information about Dockerfiles, go to the Dockerfile Reference.
1. Choose a directory in the Docker environment and perform the following steps in that directory. I used /home/ec2-user directory to perform the following steps.
2. Clone the AWS CodeCommit repository in the Docker environment. Open the terminal to the Docker environment and run the following commands to clone your source AWS CodeCommit repository (I ran the commands from /home/ec2-user directory):
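The commands are not shown above. A sketch, assuming the source repository is named Source in us-east-1 and is cloned into a directory named LocalRepository (the directory name the Dockerfile below expects), might be:
$ cd /home/ec2-user
$ git config --global credential.helper '!aws codecommit credential-helper $@'
$ git config --global credential.UseHttpPath true
$ git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/Source LocalRepository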
Note: Change the URLs in the commands to your own source and destination repository URLs.
3. Create a file called Dockerfile (case sensitive) with the following content (I created it in /home/ec2-user directory):
# Pull the Amazon Linux latest base image
FROM amazonlinux:latest
#Install aws-cli and git command line tools
RUN yum -y install unzip aws-cli
RUN yum -y install git
WORKDIR /home/ec2-user
RUN mkdir LocalRepository
WORKDIR /home/ec2-user/LocalRepository
#Copy Cloned CodeCommit repository to Docker container
COPY ./LocalRepository /home/ec2-user/LocalRepository
#Copy shell script that does the replication
COPY ./repl_repository.bash /home/ec2-user/LocalRepository
RUN chmod ugo+rwx /home/ec2-user/LocalRepository/repl_repository.bash
WORKDIR /home/ec2-user/LocalRepository
#Call this script when Docker starts the container
ENTRYPOINT ["/home/ec2-user/LocalRepository/repl_repository.bash"]
4. Copy the following shell script into a file called repl_repository.bash in the same directory as the Dockerfile in the Docker environment (I created it in the /home/ec2-user directory).
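The original script is not reproduced here. A minimal sketch of such a script (authenticate git to CodeCommit with the AWS CLI credential helper, pull the latest changes from the source repository, then push them to the destination repository) might look like this:
#!/bin/bash
# Sketch only: configure git to authenticate to CodeCommit with the AWS CLI credential helper
git config --global credential.helper '!aws codecommit credential-helper $@'
git config --global credential.UseHttpPath true
# Pull the latest changes from the source repository (the clone's origin in us-east-1)
git pull origin master
# Mirror all branches to the destination repository in us-west-2
git push https://git-codecommit.us-west-2.amazonaws.com/v1/repos/Destination --all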
6. Verify whether the replication is working by running the repl_repository.bash script from the LocalRepository directory. Go to the LocalRepository directory and run . ../repl_repository.bash (as shown below). If it is successful, you will see “Everything up-to-date” as the last line of the output:
$ . ../repl_repository.bash
Everything up-to-date
Step 2: Build the Docker Image
1. Build the Docker image by running this command from the directory where you created the Dockerfile in the Docker environment in the previous step (I ran it from the /home/ec2-user directory):
$ docker build . -t ccrepl
Output: It installs various packages and sets environment variables as part of steps 1 to 3 of the Dockerfile. Steps 4 to 11 of the Dockerfile should produce output similar to the following:
2. Run the following command to verify that the image was created successfully. It will display “Everything up-to-date” at the end if it is successful.
$ docker run ccrepl
Everything up-to-date
Step 3: Push the Docker Image to Amazon Elastic Container Registry (ECR)
Perform the following steps in the Docker Environment.
1. Run the AWS CLI configure command and set default region as your source repository region (I used us-east-1).
$ aws configure set default.region <Source Repository Region>
2. Create an Amazon ECR repository using this command to store your ccrepl image (Note the repositoryUri in the output):
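The command is not shown above; it is presumably along the lines of:
$ aws ecr create-repository --repository-name ccrepl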
2. Create a role called AccessRoleForCCfromFG using the following command in the DockerEnvironment:
$ aws iam create-role --role-name AccessRoleForCCfromFG --assume-role-policy-document file://trustpolicyforecs.json
3. Assign CodeCommit service full access to the above role using the following command in the DockerEnvironment:
$ aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/AWSCodeCommitFullAccess --role-name AccessRoleForCCfromFG
4. In the Amazon ECS Console, choose Repositories and select the ccrepl repository that was created in the previous step. Copy the Repository URI.
5. In the Amazon ECS Console, choose Task Definitions and click Create New Task Definition.
6. Select launch type compatibility as FARGATE and click Next Step.
7. In the create task definition screen, do the following:
In Task Definition Name, type ccrepl
In Task Role, choose AccessRoleForCCfromFG
In Task Memory, choose 2GB
In Task CPU, choose 1 vCPU
Click Add Container under Container Definitions in the same screen. In the Add Container screen, do the following:
Enter Container name as ccreplcont
Enter Image URL copied from step 4
Enter Memory Limits as 128 and click Add.
Note: Select TaskExecutionRole as “ecsTaskExecutionRole” if it already exists. If not, select create new role and it will create “ecsTaskExecutionRole” for you.
8. Click the Create button in the task definition screen to create the task. It will successfully create the task, execution role and AWS CloudWatch Log groups.
9. In the Amazon ECS Console, click Clusters and create cluster. Select template as “Networking only, Powered by AWS Fargate” and click next step.
10. Enter cluster name as ccreplcluster and click create.
Step 5: Create the Lambda Function
In this section, I used the Amazon Elastic Container Service (ECS) RunTask API from Lambda to invoke the Fargate task.
1. In the IAM Console, create a new role called ECSLambdaRole with the permissions to AWS CodeCommit, Amazon ECS as well as pass roles privileges needed to run the ECS task. Your statement should look similar to the following (replace <your account id>):
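The policy document is not reproduced above. A sketch of the kind of statements involved, assuming the resource names used in this post (replace <your account id>), might be:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ecs:RunTask",
      "Resource": "arn:aws:ecs:us-east-1:<your account id>:task-definition/ccrepl:*"
    },
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::<your account id>:role/*"
    }
  ]
}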
2. In the AWS Management Console, select the VPC service and click Subnets in the left navigation pane. Note down the subnet IDs that you want to run the Fargate task in.
3. Create a new Lambda Node.js function called FargateTaskExecutionFunc and assign the role ECSLambdaRole with the following content:
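The function body is not reproduced above. A sketch, assuming the cluster and task definition names used earlier and placeholder subnet IDs, might look like this:
// Sketch only: start the ccrepl Fargate task whenever the CodeCommit trigger fires
var AWS = require('aws-sdk');
var ecs = new AWS.ECS();

exports.handler = (event, context, callback) => {
    var params = {
        cluster: 'ccreplcluster',
        taskDefinition: 'ccrepl',
        launchType: 'FARGATE',
        count: 1,
        networkConfiguration: {
            awsvpcConfiguration: {
                subnets: ['subnet-11111111', 'subnet-22222222'], // replace with your subnet IDs
                assignPublicIp: 'ENABLED'
            }
        }
    };
    // Run the replication task and report the result back to Lambda
    ecs.runTask(params, function (err, data) {
        if (err) {
            callback(err);
        } else {
            callback(null, data);
        }
    });
};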
Note: Replace the subnets values with the subnet IDs that you identified in step 2 of this section as the subnets you want to run the Fargate task in.
1. In the Lambda Console, click FargateTaskExecutionFunc under functions.
2. Under Add triggers in the Designer, select CodeCommit
3. In the Configure triggers screen, do the following:
Enter Repository name as Source (your source repository name)
Enter trigger name as LambdaTrigger
Leave the Events as “All repository events”
Leave the Branch names as “All branches”
Click Add button
Click Save button to save the changes
Step 6: Verification
To test the application, make a commit and push the changes to the source repository in AWS CodeCommit. That should automatically trigger the Lambda function and replicate the changes in the destination repository. You can verify this by checking CloudWatch Logs for Lambda and ECS, or simply going to the destination repository and verifying the change appears.
Conclusion
Congratulations! You have successfully configured repository replication of an AWS CodeCommit repository using AWS Lambda and AWS Fargate. You can use this technique in a deployment pipeline. You can also tweak the trigger configuration in AWS CodeCommit to call the Lambda function in response to any supported trigger event in AWS CodeCommit.
Amazon ECS now includes integrated service discovery. This makes it possible for an ECS service to automatically register itself with a predictable and friendly DNS name in Amazon Route 53. As your services scale up or down in response to load or container health, the Route 53 hosted zone is kept up to date, allowing other services to look up where they need to make connections based on the state of each service. You can see a demo of service discovery in an imaginary social networking app over at: https://servicediscovery.ranman.com/.
Service Discovery
Part of the transition to microservices and modern architectures involves having dynamic, autoscaling, and robust services that can respond quickly to failures and changing loads. Your services probably have complex dependency graphs of services they rely on and services they provide. A modern architectural best practice is to loosely couple these services by allowing them to specify their own dependencies, but this can be complicated in dynamic environments as your individual services are forced to find their own connection points.
Traditional approaches to service discovery like consul, etcd, or zookeeper all solve this problem well, but they require provisioning and maintaining additional infrastructure or installation of agents in your containers or on your instances. Previously, to ensure that services were able to discover and connect with each other, you had to configure and run your own service discovery system or connect every service to a load balancer. Now, you can enable service discovery for your containerized services in the ECS console, AWS CLI, or using the ECS API.
Introducing Amazon Route 53 Service Registry and Auto Naming APIs
Amazon ECS Service Discovery works by communicating with the Amazon Route 53 Service Registry and Auto Naming APIs. Since we haven’t talked about it before on this blog, I want to briefly outline how these Route 53 APIs work. First, some vocabulary:
Namespaces – A namespace specifies a domain name you want to route traffic to (e.g. internal, local, corp). You can think of it as a logical boundary between which services should be able to discover each other. You can create a namespace with a call to the aws servicediscovery create-private-dns-namespace command or in the ECS console (see the example after this list). Namespaces are roughly equivalent to hosted zones in Route 53. A namespace contains services, our next vocabulary word.
Service – A service is a specific application or set of applications in your namespace like “auth”, “timeline”, or “worker”. A service contains service instances.
Service Instance – A service instance contains information about how Route 53 should respond to DNS queries for a resource.
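As an example of the namespace call mentioned above (the VPC ID is a placeholder):
aws servicediscovery create-private-dns-namespace --name corp --vpc vpc-0123456789abcdef0 --region us-east-1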
Route 53 provides APIs to create: namespaces, A records per task IP, and SRV records per task IP + port.
When we ask Route 53 for something like: worker.corp we should get back a set of possible IPs that could fulfill that request. If the application we’re connecting to exposes dynamic ports then the calling application can easily query the SRV record to get more information.
ECS service discovery is built on top of the Route 53 APIs and manages all of the underlying API calls for you. Now that we understand how the service registry works, let’s take a look at the ECS side to see service discovery in action.
Amazon ECS Service Discovery
Let’s launch an application with service discovery! First, I’ll create two task definitions: “flask-backend” and “flask-worker”. Both are simple AWS Fargate tasks with a single container serving HTTP requests. I’ll have flask-backend ask worker.corp to do some work and I’ll return the response as well as the address Route 53 returned for worker. Something like the code below:
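The original sample code isn’t reproduced here. A sketch of what flask-backend might do, assuming the worker service is registered as worker.corp and listens for HTTP on port 8080, looks like this:
# Sketch only: resolve the worker service through service discovery and proxy a request to it
import socket
import requests
from flask import Flask

app = Flask(__name__)

@app.route('/')
def frontend():
    # Ask Route 53 (via normal DNS resolution) where the worker service lives
    worker_address = socket.gethostbyname('worker.corp')
    # Have the worker do some work and return its response along with the resolved address
    response = requests.get('http://worker.corp:8080/')
    return 'worker at {} said: {}'.format(worker_address, response.text)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=80)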
Now, with my containers and task definitions in place, I’ll create a service in the console.
As I move to step two in the service wizard I’ll fill out the service discovery section and have ECS create a new namespace for me.
I’ll also tell ECS to monitor the health of the tasks in my service and add or remove them from Route 53 as needed. Then I’ll set a TTL of 10 seconds on the A records we’ll use.
I’ll repeat those same steps for my “worker” service and after a minute or so most of my tasks should be up and running.
Over in the Route 53 console I can see all the records for my tasks!
We can use the Route 53 service discovery APIs to list all of our available services and tasks and programmatically reach out to each one. We could easily extend to any number of services past just backend and worker. I’ve created a simple demo of an imaginary social network with services like “auth”, “feed”, “timeline”, “worker”, “user” and more here: https://servicediscovery.ranman.com/. You can see the code used to run that page on github.
Available Now
Amazon ECS service discovery is available now in US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland). AWS Fargate is currently only available in US East (N. Virginia). When you use ECS service discovery, you pay for the Route 53 resources that you consume, including each namespace that you create, and for the lookup queries your services make. Container level health checks are provided at no cost. For more information on pricing check out the documentation.
Please let us know what you’ll be building or refactoring with service discovery either in the comments or on Twitter!
P.S. Every blog post I write is made with a tremendous amount of help from numerous AWS colleagues. To everyone that helped build service discovery across all of our teams – thank you :)!