Tag Archives: Uncategorized

How to use service control policies to set permission guardrails across accounts in your AWS Organization

Post Syndicated from Michael Switzer original https://aws.amazon.com/blogs/security/how-to-use-service-control-policies-to-set-permission-guardrails-across-accounts-in-your-aws-organization/

AWS Organizations provides central governance and management for multiple accounts. Central security administrators use service control policies (SCPs) with AWS Organizations to establish controls that all IAM principals (users and roles) adhere to. Now, you can use SCPs to set permission guardrails with the fine-grained control supported in the AWS Identity and Access Management (IAM) policy language. This makes it easier for you to fine-tune policies to meet the precise requirements of your organization’s governance rules.

Now, using SCPs, you can specify Conditions, Resources, and NotAction to deny access across accounts in your organization or organizational unit. For example, you can use SCPs to restrict access to specific AWS Regions, or prevent your IAM principals from deleting common resources, such as an IAM role used for your central administrators. You can also define exceptions to your governance controls, restricting service actions for all IAM entities (users, roles, and root) in the account except a specific administrator role.

To implement permission guardrails using SCPs, you can use the new policy editor in the AWS Organizations console. This editor makes it easier to author SCPs by guiding you to add actions, resources, and conditions. In this post, I review SCPs, walk through the new capabilities, and show how to construct an example SCP you can use in your organization today.

Overview of Service Control Policy concepts

Before I walk through some examples, I’ll review a few features of SCPs and AWS Organizations.

SCPs offer central access controls for all IAM entities in your accounts. You can use them to enforce the permissions you want everyone in your business to follow. Using SCPs, you can give your developers more freedom to manage their own permissions because you know they can only operate within the boundaries you define.

You create and apply SCPs through AWS Organizations. When you create an organization, AWS Organizations automatically creates a root, which forms the parent container for all the accounts in your organization. Inside the root, you can group accounts in your organization into organizational units (OUs) to simplify management of these accounts. You can create multiple OUs within a single organization, and you can create OUs within other OUs to form a hierarchical structure. You can attach SCPs to the organization root, OUs, and individual accounts. SCPs attached to the root and OUs apply to all OUs and accounts inside of them.
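
If you prefer to script this structure, the following is a minimal sketch using the AWS SDK for Python (boto3); the OU name and account ID are placeholders.

import boto3

org = boto3.client("organizations")

# The root is created automatically when you create the organization
root_id = org.list_roots()["Roots"][0]["Id"]

# Create an OU under the root (the OU name is just an example)
ou = org.create_organizational_unit(ParentId=root_id, Name="Workloads")
ou_id = ou["OrganizationalUnit"]["Id"]

# Move an existing member account (placeholder ID) into the new OU;
# any SCPs attached to the root or this OU now apply to that account
org.move_account(
    AccountId="111122223333",
    SourceParentId=root_id,
    DestinationParentId=ou_id,
)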

SCPs use the AWS Identity and Access Management (IAM) policy language; however, they do not grant permissions. SCPs enable you to set permission guardrails by defining the maximum available permissions for IAM entities in an account. If an SCP denies an action for an account, none of the entities in the account can take that action, even if their IAM permissions allow them to do so. The guardrails set in SCPs apply to all IAM entities in the account, which include all users, roles, and the account root user.

Policy Elements Available in SCPs

The table below summarizes the IAM policy language elements available in SCPs. You can read more about the different IAM policy elements in the IAM JSON Policy Reference.

The Supported Statement Effect column describes the effect type you can use with each policy element in SCPs.

Policy Element | Definition | Supported Statement Effect
Statement | Main element for a policy. Each policy can have multiple statements. | Allow, Deny
Sid | (Optional) Friendly name for the statement. | Allow, Deny
Effect | Define whether an SCP statement allows or denies actions in an account. | Allow, Deny
Action | List the AWS actions the SCP applies to. | Allow, Deny
NotAction (New) | (Optional) List the AWS actions exempt from the SCP. Used in place of the Action element. | Deny
Resource (New) | List the AWS resources the SCP applies to. | Deny
Condition (New) | (Optional) Specify conditions for when the statement is in effect. | Deny

Note: Some policy elements are only available in SCPs that deny actions.

You can use the new policy elements in new or existing SCPs in your organization. In the next section, I use the new elements to create an SCP using the AWS Organizations console.

Create an SCP in the AWS Organizations console

In this section, you’ll create an SCP that restricts IAM principals in accounts from making changes to a common administrative IAM role created in all accounts in your organization. Imagine your central security team uses these roles to audit and make changes to AWS settings. For the purposes of this example, you have a role in all your accounts named AdminRole that has the AdministratorAccess managed policy attached to it. Using an SCP, you can restrict all IAM entities in the account from modifying AdminRole or its associated permissions. This helps you ensure this role is always available to your central security team. Here are the steps to create and attach this SCP.

  1. Ensure you’ve enabled all features in AWS Organizations and SCPs through the AWS Organizations console.
  2. In the AWS Organizations console, select the Policies tab, and then select Create policy.

    Figure 1: Select “Create policy” on the “Policies” tab

  3. Give your policy a name and description that will help you quickly identify it. For this example, I use the following name and description.
    • Name: DenyChangesToAdminRole
    • Description: Prevents all IAM principals from making changes to AdminRole.

     

    Figure 2: Give the policy a name and description

  4. The policy editor provides you with an empty statement in the text editor to get started. Position your cursor inside the policy statement. The editor detects the content of the policy statement you selected, and allows you to add relevant Actions, Resources, and Conditions to it using the left panel.

    Figure 3: SCP editor tool

  5. Change the Statement ID to describe what the statement does. For this example, I reused the name of the policy, DenyChangesToAdminRole, because this policy has only one statement.

    Figure 4: Change the Statement ID

  6. Next, add the actions you want to restrict. Using the left panel, select the IAM service. You’ll see a list of actions. To learn about the details of each action, you can hover over the action with your mouse. For this example, we want to allow principals in the account to view the role, but restrict any actions that could modify or delete it. We use the new NotAction policy element to deny all actions except the view actions for the role. Select the following view actions from the list:
    • GetContextKeysForPrincipalPolicy
    • GetRole
    • GetRolePolicy
    • ListAttachedRolePolicies
    • ListInstanceProfilesForRole
    • ListRolePolicies
    • ListRoleTags
    • SimulatePrincipalPolicy
  7. Now position your cursor at the Action element and change it to NotAction. After you perform the steps above, your policy should look like the one below.

    Figure 5: An example policy

  8. Next, apply these controls to only the AdminRole role in your accounts. To do this, use the Resource policy element, which now allows you to provide specific resources.
      1. On the left, near the bottom, select the Add Resources button.
      2. In the prompt, select the IAM service from the dropdown menu.
      3. Select the role as the resource type, and then type “arn:aws:iam::*:role/AdminRole” in the resource ARN prompt.
      4. Select Add resource.

    Note: The AdminRole has a common name in all accounts, but the account IDs will be different for each individual role. To simplify the policy statement, use the * wildcard in place of the account ID to account for all roles with this name regardless of the account.

  9. Your policy should look like this:
    
    {    
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "DenyChangesToAdminRole",
          "Effect": "Deny",
          "NotAction": [
            "iam:GetContextKeysForPrincipalPolicy",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListAttachedRolePolicies",
            "iam:ListInstanceProfilesForRole",
            "iam:ListRolePolicies",
            "iam:ListRoleTags",
            "iam:SimulatePrincipalPolicy"
          ],
          "Resource": [
            "arn:aws:iam::*:role/AdminRole"
          ]
        }
      ]
    }
    

  10. Select the Save changes button to create your policy. You can see the new policy in the Policies tab.

    Figure 6: The new policy on the “Policies” tab

  11. Finally, attach the policy to the AWS account where you want to apply the permissions.

When you attach the SCP, it prevents changes to the role’s configuration. The central security team that uses the role might want to make changes later on, so you may want to allow the role itself to modify the role’s configuration. I’ll demonstrate how to do this in the next section.

Grant an exception to your SCP for an administrator role

In the previous section, you created an SCP that prevented all principals from modifying or deleting the AdminRole IAM role. Administrators from your central security team may need to make changes to this role in your organization, without lifting the protection of the SCP. In this next example, I build on the previous policy to show how to exclude the AdminRole from the SCP guardrail.

  1. In the AWS Organizations console, select the Policies tab, select the DenyChangesToAdminRole policy, and then select Policy editor.
  2. Select Add Condition. You’ll use the new Condition element of the policy, using the aws:PrincipalARN global condition key, to specify the role you want to exclude from the policy restrictions.
  3. The aws:PrincipalARN condition key returns the ARN of the principal making the request. You want to ignore the policy statement if the requesting principal is the AdminRole. Using the StringNotLike operator, assert that this SCP is in effect if the principal ARN is not the AdminRole. To do this, fill in the following values for your condition.
    1. Condition key: aws:PrincipalARN
    2. Qualifier: Default
    3. Operator: StringNotLike
    4. Value: arn:aws:iam::*:role/AdminRole
  4. Select Add condition. The following policy will appear in the edit window.
    
    {    
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "DenyChangesToAdminRole",
          "Effect": "Deny",
          "NotAction": [
            "iam:GetContextKeysForPrincipalPolicy",
            "iam:GetRole",
            "iam:GetRolePolicy",
            "iam:ListAttachedRolePolicies",
            "iam:ListInstanceProfilesForRole",
            "iam:ListRolePolicies",
            "iam:ListRoleTags"
          ],
          "Resource": [
            "arn:aws:iam::*:role/AdminRole"
          ],
          "Condition": {
            "StringNotLike": {
              "aws:PrincipalARN":"arn:aws:iam::*:role/AdminRole"
            }
          }
        }
      ]
    }
    
    

  5. After you validate the policy, select Save. If you already attached the policy in your organization, the changes will immediately take effect.

Now, the SCP denies all principals in the account from updating or deleting the AdminRole, except the AdminRole itself.

Next steps

You can now use SCPs to restrict access to specific resources, or define conditions for when SCPs are in effect. You can use the new functionality in your existing SCPs today, or create new permission guardrails for your organization. I walked through one example in this blog post, and there are additional use cases for SCPs that you can explore in the documentation. Below are a few that we have heard from customers that you may want to look at.

  • Account may only operate in certain AWS Regions (example)
  • Account may only deploy certain EC2 instance types (example)
  • Account requires MFA to be enabled before taking an action (example)

You can start applying SCPs using the AWS Organizations console, CLI, or API. See the Service Control Policies Documentation or the AWS Organizations Forums for more information about SCPs, how to use them in your organization, and additional examples.
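
For example, here is a minimal sketch with the AWS SDK for Python (boto3) that creates and attaches the DenyChangesToAdminRole policy from the walkthrough; the target account ID is a placeholder and the NotAction list is abbreviated.

import json
import boto3

org = boto3.client("organizations")

scp_document = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyChangesToAdminRole",
        "Effect": "Deny",
        # Abbreviated list of read-only actions; see the full policy earlier in this post
        "NotAction": ["iam:GetRole", "iam:GetRolePolicy", "iam:ListRolePolicies"],
        "Resource": ["arn:aws:iam::*:role/AdminRole"],
    }],
}

# Create the SCP in the organization
policy = org.create_policy(
    Name="DenyChangesToAdminRole",
    Description="Prevents all IAM principals from making changes to AdminRole.",
    Type="SERVICE_CONTROL_POLICY",
    Content=json.dumps(scp_document),
)

# Attach it to a member account (placeholder ID); the target can also be a root or an OU
org.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="111122223333",
)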

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Michael Switzer

Mike is the product manager for the Identity and Access Management service at AWS. He enjoys working directly with customers to identify solutions to their challenges, and using data-driven decision making to drive his work. Outside of work, Mike is an avid cyclist and outdoorsperson. He holds a master’s degree in computational mathematics from the University of Washington.

КЗК clears the acquisition of Нова Броудкастинг Груп

Post Syndicated from nellyo original https://nellyo.wordpress.com/2019/03/22/cpc-nova/

Адванс Медиа Груп, a company connected with К.Домусчиев and Г.Домусчиев, intends to acquire sole control over „Нова Броудкастинг Груп“ АД.

КЗК (the Bulgarian Commission on Protection of Competition) received a request to assess the transaction and to rule either that it does not constitute a concentration; or that the concentration does not fall within the scope of Art. 24 of the Protection of Competition Act (ЗЗК); or to clear the concentration, since it does not lead to the establishment or strengthening of a dominant position that would significantly impede effective competition on the relevant market.

КЗК's ruling is particularly interesting in light of its recent decision refusing to allow Келнер to acquire Нова Броудкастинг Груп –

Given the characteristics of each of the relevant markets in the media sector, it was established that the group being acquired has significant financial and organisational resources, the ability to realise economies of scale and scope, and an established image. The significant number of mass media outlets at the disposal of the combined group would give it a substantial advantage over the other participants providing media services.

In its analysis of the notified transaction, the Commission took into account the leading positions of the target undertaking in the field of media services, which in turn raises reasonable concerns about the effect of the transaction on the competitive environment in the above-mentioned markets, as well as the horizontal overlap between the activities of the parties to the concentration on the online retail market.

In this way, the parties to the concentration would have both the incentive and the real ability to change their commercial policy in various forms, such as restricting access, raising prices, or changing the terms of concluded contracts. In view of the above, and given the significant experience of the acquiring company and its investment intentions, the transaction creates the preconditions for the establishment or strengthening of a dominant position that would significantly impede competition on the relevant markets. Such conduct would restrict and distort not only competition on the market but also the interests of end consumers, given the public importance of the media.

On 21 March 2019, КЗК announced its decision, this time clearing the transaction:

The activities actually carried out by „Нова“ include:

  • creating television content for its own use, as well as acquiring rights to distribute television content;
  • distributing television content – „Нова“ creates and distributes 7 (seven) television programmes with national coverage: „Нова телевизия“, „Диема“, „Кино Нова“, „Диема Фемили“, „Нова спорт“, „Диема спорт“, „Диема спорт 2“;
  • television advertising – „Нова“ sells access to its audience by broadcasting advertising material from advertisers and advertising agencies on the listed television programmes;
  • maintaining websites that mainly provide information about the programmes of the TV channels it operates, as well as the ability to watch some of the broadcast shows online – https://nova.bg/, https://play.nova.bg/,
    https://diemaxtra.nova.bg/, https://play.diemaxtra.bg/, http://www.diema.bg, https://kino.nova.bg/, https://diemafamily.nova.bg/. In addition, Нова has developed the Play DiemaXtra service, consisting of a website and a mobile application that allow
    linear viewing of the package of television channels Диема Спорт and Диема Спорт 2, as well as of individual sporting events chosen by the user (PPV);
  • provision of services (…..)*.
    The company „Атика Ева“ АД, controlled by „Нова Броудкастинг Груп“ АД, specialises in publishing the monthly magazines Еva, Playboy, Esquire, Joy, Grazia, and OK, through which it carries out publishing activities, trade in
    printed works, and advertising in print publications. The company also administers the websites of some of these magazines, on which it sells online advertising itself.
    The other undertakings controlled by „Нова Броудкастинг Груп“ АД – through „Нет Инфо“ АД – provide the following key products and services:
  • web-based email (www.abv.bg), which allows end users to open email accounts and exchange messages; http://www.abv.bg also lets end users build an address book and communicate with other users in virtual chat rooms and online forums;
  • electronic directories – a platform for organising, storing, and sharing files online – http://www.dox.bg;
  • search – in cooperation with the international provider of this service, Google, the company offers web search on its main pages – http://www.abv.bg and http://www.gbg.bg;
  • news and information – Нет Инфо provides digital news and information through the news site http://www.vesti.bg, the specialised sports news site http://www.gong.bg, the weather site http://www.sinoptik.bg, the financial
    information site http://www.pariteni.bg, and the site for the modern woman http://www.edna.bg;
  • car listings – http://www.carmarket.bg allows users to post and browse listings for the sale of cars;
  • price checking and product comparison – through http://www.sravni.bg, internet users can compare the prices of products sold in various online shops.

Markets on which the transaction will have an impact.
The acquiring group carries out a variety of activities on the territory of the country and participates in numerous markets in different areas. The target undertaking and the companies under its control operate on markets in the media field (television, print, and internet). A certain link between their activities exists with respect to the market for television content, and more specifically the segment of acquiring distribution rights, on which the target undertaking „Нова Броудкастинг Груп“ АД and the company „Футбол Про Медия“ ЕООД
from the group of the acquirer „Адванс Медиа Груп” ЕАД both operate.
„Футбол Про Медия“ ЕООД has concluded contracts as follows:
(…..)* On the basis of these contracts, (…..). From the above it can be concluded that „Футбол Про Медия“ ЕООД carries out activity related to (…..) television distribution rights. The „Нова“ group also creates and buys television content.
Consequently, the transaction will have an impact only on the market for television content, and in particular on the acquisition of distribution rights ((…..)*), where there is a certain overlap between the activities of the parties to the concentration.

КЗК takes into account the fact that the parties to the transaction are not direct
competitors on the relevant market and that their relationship is vertical, determined by the capacity in which each of them operates: „Футбол Про Медия“ ЕООД acts as a (…..)* of rights, while the target undertaking is a buyer of television rights.
By their nature, the rights to broadcast sporting events are exclusive, and it is normal commercial practice for them to be held by a single undertaking for a certain period of time and for a certain territory. In the case under review (…..). Based on the data analysed, the Commission accepts that a significant number of content traders operate on the relevant market, from which television operators in Bulgaria buy the rights to distribute sporting events. The presence of a large number of competitors, including established international names, leads to the conclusion that they will be able to exert effective competitive pressure on the new economic group and that it will not be independent of them in its commercial conduct. In addition, in view of the requirements of the Radio and Television Act (ЗРТ) and the characteristics of the product "television content", КЗК finds that the market for television content has surmountable barriers to entry and is open to new participants. In its analysis, the Commission also takes into account the fact that (…..).
In view of the above, and insofar as the undertakings participating in the concentration operate at different levels of the television content market, the Commission finds that the notified transaction will not significantly change the market position of НБГ
on the relevant market and, accordingly, has no potential to harm the competitive environment on it.
On the basis of the assessment carried out, it can be concluded that the planned concentration does not lead to the creation or strengthening of a dominant position that would significantly restrict or impede effective competition on the
relevant market analysed. Consequently, the notified transaction could not give rise to anticompetitive effects and should be unconditionally cleared under Art. 26, para. 1 of the ЗЗК.

The concentration is cleared.

The Raspberry Pi shop, one month in

Post Syndicated from Gordon Hollingworth original https://www.raspberrypi.org/blog/the-raspberry-pi-shop-one-month-in/

Five years ago, I spent my first day working at the original Pi Towers (Starbucks in Cambridge). Since then, we’ve developed a whole host of different products and services which our customers love, but there was always one that we never got around to until now: a physical shop. (Here are opening times, directions and all that good stuff.)

Years ago, my first idea was rather simple: rent a small space for the Christmas month and then open a pop-up shop just selling Raspberry Pis. We didn’t really know why we wanted to do it, but suspected it would be fun! We didn’t expect it to take five years to organise, but last month we opened the first Raspberry Pi store in Cambridge’s Grand Arcade – and it’s a much more complete and complicated affair than that original pop-up idea.

Given that we had access to a bunch of Raspberry Pis, we thought that we should use some of them to get some timelapse footage of the shop being set up.

Raspberry Pi Shop Timelapse

Uploaded by Raspberry Pi on 2019-03-22.

The idea behind the shop is to reach an audience that wouldn’t necessarily know about Raspberry Pi, so its job is to promote and display the capabilities of the Raspberry Pi computer and ecosystem. But there’s also plenty in there for the seasoned Pi hacker: we aim to make sure there’s something for you whatever your level of experience with computing is.

Inside the shop you’ll find a set of project centres. Each one contains a Raspberry Pi project tutorial or example that will help you understand one advantage of the Raspberry Pi computer, and walk you through getting started with the device. We start with a Pi running Scratch to control a GPIO, turning an LED on and off. Another demos a similar project in Python, reading a push button and lighting three LEDs (can you guess what colour the three LEDs are?) – you can also see project centres based around Kodi and RetroPie demonstrating our hardware (the TV HAT and the Pimoroni Picade console), and an area demonstrating the various Raspberry Pi computer options.

store front

There is a soft seating area, where you can come along, sit and read through the Raspberry Pi books and magazines, and have a chat with the shop staff. We’ve also got shelves of stock with which you can fill yer boots: not just official Raspberry Pi products, but merchandise from across the ecosystem, totalling nearly 300 different lines (with more to come). Finally, we’ve got the Raspberry Pi engineering desk, where we’ll try to answer even the most complex of your questions.

Come along, check out the shop, and give us your feedback. Who knows – maybe you’ll find some official merchandise you can’t buy anywhere else!

The post The Raspberry Pi shop, one month in appeared first on Raspberry Pi.

Win a Raspberry Pi 3B+ and signed case this Pi Day 2019

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/win-raspberry-pi-3b-pi-day/

Happy Pi Day everyone

What is Pi Day, we hear you ask? It’s the day when countries that display their dates as month/day/year celebrate the date spelling out the first three digits of pi – 3.14, or 3.14.2019 to be exact.

In celebration of Pi Day, we’re running a Raspberry Pi 3B+ live stream on YouTube. Hours upon hours of our favourite 3B+ in all its glorious wonderment.

PI DAY 2019

Celebrate Pi Day with us by watching this Pi

At some point today, we’re going to add a unique hashtag to that live stream, and anyone who uses said hashtag across Instagram and/or Twitter* before midnight tonight (GMT) will be entered into a draw to win a Raspberry Pi Model 3 B+ and an official case, the latter of which will be signed by Eben Upton himself.

Raspberry Pi - PI Day 2019

So sit back, relax, and enjoy the most pointless, yet wonderful, live stream to ever reach the shores of YouTube!

*For those of you who don’t have a Twitter or Instagram account, you can also comment below with the hashtag when you see it.

The post Win a Raspberry Pi 3B+ and signed case this Pi Day 2019 appeared first on Raspberry Pi.

Play Heverlee’s Sjoelen and win beer

Post Syndicated from Rob Zwetsloot original https://www.raspberrypi.org/blog/play-heverlees-sjoelen-win-beer/

Chances are you’ve never heard of the Dutch table shuffleboard variant Sjoelen. But if you have, then you’ll know it has a basic premise – to slide wooden pucks into a set of four scoring boxes – but some rather complex rules.

Sjoelen machine

Uploaded by Grant Gibson on 2018-07-10.

Sjoelen

It may seem odd that a game which relies so much on hand-eye coordination and keeping score could be deemed a perfect match for a project commissioned by a beer brand. Yet Grant Gibson is toasting success with his refreshing interpretation of Sjoelen, having simplified the rules and incorporated a Raspberry Pi to serve special prizes to the winners.

“Sjoelen’s traditional scoring requires lots of addition and multiplication, but our version simply gives players ten pucks and gets them to slide three through any one of the four gates within 30 seconds,” Grant explains.

As they do this, the Pi (a Model 3B) keeps track of how many pucks are sliding through each gate, figures out how much time the player has left, and displays a winning message on a screen. A Logitech HD webcam films the player in action, so bystanders can watch their reactions as they veer between frustration and success.

Taking the plunge

Grant started the project with a few aims in mind: “I wanted something that could be transported in a small van and assembled by a two-person team, and I wanted it to have a vintage look.” Inspired by pinball tables, he came up with a three-piece unit that could be flat-packed for transport, then quickly assembled on site. The Pi 3B proved a perfect component.

Grant has tended to use full-size PCs in his previous builds, but he says the Pi allowed him to use less complex software, and less hardware to control input and output. He used Python for the input and output tasks and to get the Pi to communicate with a full-screen Chromium browser, via JSON, in order to handle the scoring and display tasks in JavaScript.

“We used infrared (IR) sensors to detect when a puck passed through the gate bar to score a point,” Grant adds. “Because of the speed of the pucks, we had to poll each of the four IR sensors over 100 times per second to ensure that the pucks were always detected. Optimising the Python code to run fast enough, whilst also leaving enough processing power to run a full-screen web browser and HD webcam, was definitely the biggest software challenge on this project.”
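
Grant’s actual code isn’t included in the article, but a minimal sketch of that polling approach might look like the following, assuming four IR break-beam sensors wired to example BCM pins.

import time
import RPi.GPIO as GPIO

GATE_PINS = [17, 27, 22, 23]   # example BCM pin numbers, one per scoring gate
scores = [0, 0, 0, 0]

GPIO.setmode(GPIO.BCM)
for pin in GATE_PINS:
    GPIO.setup(pin, GPIO.IN, pull_up_down=GPIO.PUD_UP)

last_state = [GPIO.input(pin) for pin in GATE_PINS]

try:
    while True:
        for i, pin in enumerate(GATE_PINS):
            state = GPIO.input(pin)
            # A falling edge means a puck has just broken the IR beam on this gate
            if last_state[i] and not state:
                scores[i] += 1
            last_state[i] = state
        time.sleep(0.005)   # roughly 200 polls per second, comfortably above 100 Hz
finally:
    GPIO.cleanup()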

Bottoms up

The Raspberry Pi’s GPIO pins are used to trigger the dispensing of a can of Heverlee beer to the winner. These are stocked inside the machine, but building the vending mechanism was a major headache, since it needed to be lightweight and compact, and to keep the cans cool.

No off-the-shelf vending unit offered a solution, and Grant’s initial attempts with stepper motors and clear laser-cut acrylic gears proved disastrous. “After a dozen successful vends, the prototype went out of alignment and started slicing through cans, creating a huge frothy fountain of beer. Impressive to watch, but not a great mix with electronics,” Grant laughs.

Instead, he drew up a final design that was laser-cut from poplar plywood. “It uses automotive central locking motors to operate a see-saw mechanism that serves the cans. A custom Peltier-effect heat exchanger, and a couple of salvaged PC fans, keep the cans cool inside the machine,” reveals Grant.

“I’d now love to make a lightweight version sometime, perhaps with a folding Sjoelen table and pop-up scoreboard screen, that could be carried by one person,” he adds. We’d certainly drink to that.

More from The MagPi magazine

Get your copy now from the Raspberry Pi Press store, major newsagents in the UK, or Barnes & Noble, Fry’s, or Micro Center in the US. Or, download your free PDF copy from The MagPi magazine website.

MagPi 79 cover

Subscribe now

Subscribe to The MagPi on a monthly, quarterly, or twelve-monthly basis to save money against newsstand prices!

Twelve-month print subscribers get a free Raspberry Pi 3A+, the perfect Raspberry Pi to try your hand at some of the latest projects covered in The MagPi magazine.

The post Play Heverlee’s Sjoelen and win beer appeared first on Raspberry Pi.

Running your game servers at scale for up to 90% lower compute cost

Post Syndicated from Roshni Pary original https://aws.amazon.com/blogs/compute/running-your-game-servers-at-scale-for-up-to-90-lower-compute-cost/

This post is contributed by Yahav Biran, Chad Schmutzer, and Jeremy Cowan, Solutions Architects at AWS

Many successful video games such as Fortnite: Battle Royale, Warframe, and Apex Legends use a free-to-play model, which offers players access to a portion of the game without paying. Free-to-play no longer means low quality: players expect a premium-like experience. The business model is constrained on cost, and Amazon EC2 Spot Instances offer a viable low-cost compute option. Casual multiplayer games naturally fit the Spot offering, and with container orchestration on Amazon EKS and mechanisms to minimize player impact and optimize cost when running multiplayer game-server workloads, both casual and hardcore multiplayer games fit the Spot Instance offering.

Spot Instances offer spare compute capacity available in the AWS Cloud at steep discounts compared to On-Demand Instances. Spot Instances enable you to optimize your costs and scale your application’s throughput up to 10 times for the same budget. Spot Instances are best suited for fault-tolerant workloads. Multiplayer game-servers are no exception: a game-server state is updated using real-time player inputs, which makes the server state transient. Game-server workloads can be disposable and take advantage of Spot Instances to save up to 90% on compute cost. In this blog, we share how to architect your game-server workloads to handle interruptions and effectively use Spot Instances.

Characteristics of game-server workloads

Simply put, multiplayer game servers spend most of their life updating current character position and state (mostly animation). The rest of the time is spent on image updates that result from combat actions, moves, and other game-related events. More specifically, game servers’ CPUs are busy doing network I/O operations by accepting client positions, calculating the new game state, and multi-casting the game state back to the clients. That makes a game server workload a good fit for general-purpose instance types for casual multiplayer games and, preferably, compute-optimized instance types for the hardcore multiplayer games.

AWS provides a wide variety of both compute-optimized (C5 and C4) and general-purpose (M5) instance types with Amazon EC2 Spot Instances. Because capacities fluctuate independently for each instance type in an Availability Zone, you can often get more compute capacity for the same price when using a wide range of instance types. For more information on Spot Instance best practices, see Getting Started with Amazon EC2 Spot Instances.

One solution that customers use for running dedicated game-servers is Amazon GameLift. This solution deploys a fleet of Amazon GameLift FleetIQ and Spot Instances in an AWS Region. FleetIQ places new sessions on game servers based on player latencies, instance prices, and Spot Instance interruption rates so that you don’t need to worry about Spot Instance interruptions. For more information, see Reduce Cost by up to 90% with Amazon GameLift FleetIQ and Spot Instances on the AWS Game Tech Blog.

In other cases, you can use game-server deployment patterns like containers-based orchestration (such as Kubernetes, Swarm, and Amazon ECS) for deploying multiplayer game servers. Those systems manage a large number of game-servers deployed as Docker containers across several Regions. The rest of this blog focuses on this containerized game-server solution. Containers fit the game-server workload because they’re lightweight, start quickly, and allow for greater utilization of the underlying instance.

Why use Amazon EC2 Spot Instances?

A Spot Instance is a good choice for running a disposable game-server workload because its built-in two-minute interruption notice allows for graceful handling. The two-minute termination notification enables the game server to act upon interruption. We demonstrate two examples for notification handling through Instance Metadata and Amazon CloudWatch. For more information, see the “Interruption handling” and “What if I want my game-server to be redundant?” segments later in this blog.

Spot Instances also offer a variety of EC2 instances types that fit game-server workloads, such as general-purpose and compute-optimized (C4 and C5). Finally, Spot Instances provide low termination rates. The Spot Instance Advisor can help you choose a good starting point for determining which instance types have lower historical interruption rates.

Interruption handling

Avoiding player impact is key when using Spot Instances. Here is a strategy to avoid player impact that we apply in the proposed reference architecture and code examples available at Spotable Game Server on GitHub. Specifically, for Amazon EKS, node drainage requires draining the node via the kubectl drain command. This makes the node unschedulable and evicts the pods currently running on the node with a graceful termination period (terminationGracePeriodSeconds) that might impact the player experience. As a result, pods continue to run while a signal is sent to the game to end it gracefully.

Node drainage

Node drainage requires an agent pod that runs as a DaemonSet on every Spot Instance host to pull potential spot interruption from Amazon CloudWatch or instance metadata. We’re going to use the Instance Metadata notification. The following describes how termination events are handled with node drainage:

  1. Launch the game-server pod with a default of 120 seconds (terminationGracePeriodSeconds). As an example, see this deploy YAML file on GitHub.
  2. Provision a worker node pool with a mixed instances policy of On-Demand and Spot Instances. It uses the Spot Instance allocation strategy with the lowest price. For example, see this AWS CloudFormation template on GitHub.
  3. Use the Amazon EKS bootstrap tool (/etc/eks/bootstrap.sh in the recommended AMI) to label each node with its instance lifecycle, either OnDemand or Spot. For example:
    • OnDemand: "--kubelet-extra-args --node-labels=lifecycle=ondemand,title=minecraft,region=uswest2"
    • Spot: "--kubelet-extra-args --node-labels=lifecycle=spot,title=minecraft,region=uswest2"
  4. A daemon set deployed on every node pulls the termination status from the instance metadata endpoint. When a termination notification arrives, the `kubectl drain node` command is executed, and a SIGTERM signal is sent to the game-server pod. You can see these commands in the batch file on GitHub; a minimal sketch of this agent also follows the list below.
  5. The game server keeps running for the next 120 seconds to allow the game to notify the players about the incoming termination.
  6. No new game-server is scheduled on the node to be terminated because it’s marked as unschedulable.
  7. A notification to an external system such as a matchmaking system is sent to update the current inventory of available game servers.
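
To make step 4 concrete, here is a minimal Python sketch of such an agent (not the actual script from the GitHub repository); it assumes the node name is injected into the pod as an environment variable, for example via the Kubernetes Downward API.

import os
import subprocess
import time

import requests

METADATA_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"
NODE_NAME = os.environ["NODE_NAME"]  # assumed to be injected by the DaemonSet spec

while True:
    # The endpoint returns 404 until an interruption is scheduled,
    # then a JSON document with the action and the termination time.
    resp = requests.get(METADATA_URL, timeout=2)
    if resp.status_code == 200:
        # Mark the node unschedulable and evict the game-server pods, which sends
        # SIGTERM and starts the terminationGracePeriodSeconds countdown.
        subprocess.run(
            ["kubectl", "drain", NODE_NAME,
             "--ignore-daemonsets", "--delete-local-data", "--force"],
            check=True,
        )
        break
    time.sleep(5)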

Optimization strategies for Kubernetes specifications

This section describes a few recommended strategies for Kubernetes specifications that enable optimal game server placements on the provisioned worker nodes.

  • Use single Spot Instance Auto Scaling groups as worker nodes. To accommodate the use of multiple Auto Scaling groups, we use Kubernetes nodeSelector to control the game-server scheduling on any of the nodes in any of the Spot Instance–based Auto Scaling groups.
    nodeSelector:
      lifecycle: spot
      title: your game title

  • The lifecycle label is populated upon node creation through the AWS CloudFormation template in the following section:
    BootstrapArgumentsForSpotFleet:
      Description: Sets Node Labels to set lifecycle as Ec2Spot
      Default: "--kubelet-extra-args --node-labels=lifecycle=spot,title=minecraft,region=uswest2"
      Type: String

  • You might have a case where the incoming player actions are served by UDP and masking the interruption from the player is required. Here, the game-server allocator (a Kubernetes scheduler for us) schedules more than one game server as target upstream servers behind a UDP load balancer that multicasts any packet received to the set of game servers. After the scheduler terminates the game server upon node termination, the failover occurs seamlessly. For more information, see “What if I want my game-server to be redundant?” later in this blog.

Reference architecture

The following architecture describes an instance mix of On-Demand and Spot Instances in an Amazon EKS cluster of multiplayer game servers. Within a single VPC, control plane node pools (Master Host and Storage Host) are highly available and thus run On-Demand Instances. The game-server hosts/nodes use a mix of Spot and On-Demand Instances. The control plane's API server is accessible via an Amazon Elastic Load Balancing Application Load Balancer with a preconfigured allow list.

What if I want my game server to be redundant?

A game server is a sessionful workload, but it traditionally runs as a single dedicated game server instance with no redundancy. For game servers that use TCP as the transport network layer, AWS offers Network Load Balancers as an option for distributing player traffic across multiple game servers’ targets. Currently, game servers that use UDP don’t have similar load balancer solutions that add redundancy to maintain a highly available game server.

This section proposes a solution for the case where game servers deployed as containerized Amazon EKS pods use UDP as the network transport layer and are required to be highly available. We’re using the UDP load balancer because of the Spot Instances, but the choice isn’t limited to when you’re using Spot Instances.

The following diagram shows a reference architecture for the implementation of a UDP load balancer based on Amazon EKS. It requires an Amazon EKS cluster set up as suggested above and a set of components that simulate an architecture supporting multiplayer game services. For example, this includes a game-server inventory that captures the running game servers, their status, and allocation placement. The Amazon EKS cluster is on the left, and the proposed UDP load-balancer system is on the right. A new game server is reported to an Amazon SQS queue and persisted in an Amazon DynamoDB table. When a player assignment is required, the matchmaking service queries an API endpoint for an optimal available game server through the game-server inventory that uses the DynamoDB tables.

The solution includes the following main components:

  • The game server (see mockup-udp-server at GitHub). This is a simple UDP socket server that accepts a delta of a game state from connected players and multicasts the updated state based on pseudo computation back to the players. It’s a single threaded server whose goal is to prove the viability of UDP-based load balancing in dedicated game servers. The model presented here isn’t limited to this implementation. It’s deployed as a single-container Kubernetes pod that uses hostNetwork: true for network optimization.
  • The load balancer (udp-lb). This is a containerized NGINX server loaded with the stream module. The load balancer's upstream set is configured upon initialization based on the dedicated game-server state that is stored in the DynamoDB table game-server-status-by-endpoint. Available load balancer instances are also stored in a DynamoDB table, lb-status-by-endpoint, to be used by core game services such as a matchmaking service.
  • An Amazon SQS queue that captures the initialization and termination of game servers and load balancers instances deployed in the Kubernetes cluster.
  • DynamoDB tables that persist the state of the cluster with regards to the game servers and load balancer inventory.
  • An API operation based on AWS Lambda (game-server-inventory-api-lambda) that serves the game servers and load balancers for an updated list of resources available. The operation supports /get-available-gs needed for the load balancer to set its upstream target game servers. It also supports /set-gs-busy/{endpoint} for labeling already claimed game servers from the available game servers inventory.
  • A Lambda function (game-server-status-poller-lambda) that the Amazon SQS queue triggers and that populates the DynamoDB tables; a minimal sketch of this poller follows the list.
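
To make the flow concrete, here is a minimal sketch of what the poller function could look like; the queue message format and item field names are illustrative assumptions, not the actual implementation.

import json
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("game-server-status-by-endpoint")

def handler(event, context):
    # Triggered by the Amazon SQS queue; each record announces a game server
    # or load balancer starting up or shutting down.
    for record in event["Records"]:
        message = json.loads(record["body"])
        table.put_item(Item={
            "endpoint": message["endpoint"],      # e.g. "10.0.1.12:5005" (assumed field)
            "kind": message["kind"],              # "game-server" or "udp-lb" (assumed field)
            "status": message["status"],          # "available", "busy", "terminated" (assumed)
            "updated_at": message["timestamp"],   # assumed field
        })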

Scheduling mechanism

Our goal in this example is to reduce the chance that two game servers that serve the same load-balancer game endpoint are interrupted at the same time. Therefore, we need to prevent the scheduling of the same game servers (mockup-UDP-server) on the same host. This example uses advanced scheduling in Kubernetes where the pod affinity/anti-affinity policy is being applied.

We define two labels, mockup-grp1 and mockup-grp2, and apply a podAntiAffinity rule as follows:

      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - mockup-grp1
              topologyKey: "kubernetes.io/hostname"

The requiredDuringSchedulingIgnoredDuringExecution tells the scheduler that the subsequent rule must be met upon pod scheduling. The rule says that pods that carry the value of key: “app” mockup-grp1 will not be scheduled on the same node as pods with key: “app” mockup-grp2 due to topologyKey: “kubernetes.io/hostname”.

When a load balancer pod (udp-lb) is scheduled, it queries the game-server-inventory-api endpoint for two game-server pods that run on different nodes. If this request isn’t fulfilled, the load balancer pod enters a crash loop until two available game servers are ready.
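
The following minimal sketch shows the general idea of that startup step; the inventory API endpoint, the response shape, and the NGINX configuration details are illustrative assumptions.

import sys

import requests

# Placeholder endpoint for the game-server inventory API described above
INVENTORY_API = "https://abc123.execute-api.us-west-2.amazonaws.com/prod"

# Ask for available game servers; the response shape (a list of objects with an
# "endpoint" field) is an assumption for illustration
servers = requests.get(INVENTORY_API + "/get-available-gs", timeout=5).json()

if len(servers) < 2:
    # Not enough game servers on distinct nodes yet: exit so Kubernetes restarts the pod
    sys.exit(1)

# Write an NGINX stream upstream block listing both game servers; the file path and the
# surrounding NGINX configuration are placeholders (the real config lives in the repo)
upstream_servers = "\n".join("    server %s;" % s["endpoint"] for s in servers[:2])
with open("/etc/nginx/game_upstream.conf", "w") as conf:
    conf.write("upstream game_servers {\n%s\n}\n" % upstream_servers)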

Try it out

We published two examples that guide you on how to build an Amazon EKS cluster that uses Spot Instances. The first example, Spotable Game Server, creates the cluster, deploys Spot Instances, Dockerizes the game server, and deploys it. The second example, Game Server Glutamate, enhances the game-server workload and enables redundancy as a mechanism for handling Spot Instance interruptions.

Conclusion

Multiplayer game servers have relatively short-lived processes that last between a few minutes and a few hours. The current average observed life span of Spot Instances in US and EU Regions ranges from a few hours to a few days, which makes Spot Instances a good fit for game servers. Amazon GameLift FleetIQ offers native and seamless support for Spot Instances, and Amazon EKS offers mechanisms to significantly minimize the probability of interrupting the player experience. This makes Spot Instances an attractive option for not only casual multiplayer game servers but also hardcore game servers. Game studios that use Spot Instances for multiplayer game servers can save up to 90% of the compute cost, thus benefiting them as well as delighting their players.

Building a scalable log solution aggregator with AWS Fargate, Fluentd, and Amazon Kinesis Data Firehose

Post Syndicated from Anuneet Kumar original https://aws.amazon.com/blogs/compute/building-a-scalable-log-solution-aggregator-with-aws-fargate-fluentd-and-amazon-kinesis-data-firehose/

This post is contributed by Wesley Pettit, Software Dev Engineer, and a maintainer of the Amazon ECS CLI.

Modern distributed applications can produce gigabytes of log data every day. Analysis and storage are the easy part. From Amazon S3 to Elasticsearch, many solutions are available. The hard piece is reliably aggregating and shipping logs to their final destinations.

In this post, I show you how to build a log aggregator using AWS Fargate, Amazon Kinesis Data Firehose, and Fluentd. This post is unrelated to the AWS effort to support Fluentd to stream container logs from Fargate tasks. Follow the progress of that effort on the AWS container product roadmap.

Solution overview

Fluentd forms the core of my log aggregation solution. It is an open source project that aims to provide a unified logging layer by handling log collection, filtering, buffering, and routing. Fluentd is widely used across cloud platforms and was adopted by the Cloud Native Computing Foundation (CNCF) in 2016.

AWS Fargate provides a straightforward compute environment for the Fluentd aggregator. Kinesis Data Firehose streams the logs to their destinations. It batches, compresses, transforms, and encrypts the data before loading it. This minimizes the amount of storage used at the destination and increases security.

The log aggregator that I detail in this post is generic and can be used with any type of application. However, for simplicity, I focus on how to use the aggregator with Amazon Elastic Container Service (ECS) tasks and services.

Building a Fluentd log aggregator on Fargate that streams to Kinesis Data Firehose

 

The diagram describes the architecture that you are going to implement. A Fluentd aggregator runs as a service on Fargate behind a Network Load Balancer. The service uses Application Auto Scaling to dynamically adjust to changes in load.

Because the load balancer DNS can only be resolved in the VPC, the aggregator is a private log collector that can accept logs from any application in the VPC. Fluentd streams the logs to Kinesis Data Firehose, which dumps them in S3 and Amazon Elasticsearch Service (Amazon ES).

Not all logs are of equal importance. Some require real time analytics, others simply need long-term storage so that they can be analyzed if needed. In this post, applications that log to Fluentd are split up into frontend and backend.

Frontend applications are user-facing and need rich functionality to query and analyze the data present in their logs to obtain insights about users. Therefore, frontend application logs are sent to Amazon ES.

In contrast, backend services do not need the same level of analytics, so their logs are sent to S3. These logs can be queried using Amazon Athena, or they can be downloaded and ingested into other analytics tools as needed.

Each application tags its logs, and Fluentd sends the logs to different destinations based on the tag. Thus, the aggregator can determine whether a log message is from a backend or frontend application. Each log message gets sent to one of two Kinesis Data Firehose streams:

  • One streams to S3
  • One streams to an Amazon ES cluster

Running the aggregator on Fargate makes maintaining this service easy, and you don’t have to worry about provisioning or managing instances. This makes scaling the aggregator simple. You do not have to manage an additional Auto Scaling group for instances.

Aggregator performance and throughput

Before I walk you through how to deploy the Fluentd aggregator in your own VPC, you should know more about its performance.

I performed extensive real-world testing with this aggregator setup to test its limits. Each task in the aggregator service can handle at least 9 MB/s of log traffic, and at least 10,000 log messages/second. These are comfortable lower bounds for the aggregator’s performance. I recommend using these numbers to provision your aggregator service based upon your expected throughput for log traffic.

While this aggregator setup includes dynamic scaling, you must carefully choose the minimum size of the service. This is because dynamic scaling with Fluentd is complicated.

The Fluentd aggregator accepts logs via TCP connections, which are balanced across the instances of the service by the Network Load Balancer. However, these TCP connections are long-lived, and the load balancer only distributes new TCP connections. Thus, when the aggregator scales up in response to increased load, the new Fargate tasks cannot help with any of the existing load. The new tasks can only take new TCP connections. This also means that older tasks in the aggregator tend to accumulate connections with time. This is an important limitation to keep in mind.

For the Docker logging driver for Fluentd (which can be used by ECS tasks to send logs to the aggregator), a single TCP connection is made when each container starts. This connection is held open as long as possible. A TCP connection can remain open as long as data is still being sent over it, and there are no network connectivity issues. The only way to guarantee that there are new TCP connections is to launch new containers.

Dynamic scaling can only help in cases where there are spikes in log traffic and new TCP connections. If you are using the aggregator with ECS tasks, dynamic scaling is only useful if spikes in log traffic come from launching new containers. On the other hand, if spikes in log traffic come from existing containers that periodically increase their log output, then dynamic scaling can’t help.

Therefore, configure the minimum number of tasks for the aggregator based upon the maximum throughput that you expect from a stable population of containers. For example, if you expect 45 MB/s of log traffic, then I recommend setting the minimum size of the aggregator service to five tasks, so that each one gets 9 MB/s of traffic.

For reference, here is the resource utilization that I saw for a single aggregator task under a variety of loads. The aggregator is configured with four vCPU and 8 GB of memory as its task size. As you can see, CPU usage scales linearly with load, so dynamic scaling is configured based on CPU usage.

Performance of a single aggregator task

Keep in mind that this data does not represent a guarantee, as your performance may differ. I recommend performing real-world testing using logs from your applications so that you can tune Fluentd to your specific needs.
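
If you want to wire up that CPU-based scaling yourself, here is a minimal sketch using the AWS SDK for Python (boto3); the cluster and service names, capacities, and target value are placeholders that you should adapt to your own throughput testing.

import boto3

autoscaling = boto3.client("application-autoscaling")
resource_id = "service/log-cluster/fluentd-aggregator"  # placeholder cluster/service names

# Let Application Auto Scaling manage the Fargate service's desired count
autoscaling.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId=resource_id,
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=5,    # sized for the stable load, as discussed above
    MaxCapacity=15,
)

# Track average CPU utilization, since CPU usage scales roughly linearly with log traffic
autoscaling.put_scaling_policy(
    PolicyName="fluentd-cpu-target-tracking",
    ServiceNamespace="ecs",
    ResourceId=resource_id,
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 60.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "ScaleOutCooldown": 60,
        "ScaleInCooldown": 300,
    },
)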

As a warning, one thing to watch out for is messages in the Fluentd logs that mention retry counts:

2018-10-24 19:26:54 +0000 [warn]: #0 [output_kinesis_frontend] Retrying to request batch. Retry count: 1, Retry records: 250, Wait seconds 0.35
2018-10-24 19:26:54 +0000 [warn]: #0 [output_kinesis_frontend] Retrying to request batch. Retry count: 2, Retry records: 125, Wait seconds 0.27
2018-10-24 19:26:57 +0000 [warn]: #0 [output_kinesis_frontend] Retrying to request batch. Retry count: 1, Retry records: 250, Wait seconds 0.30

In my experience, these warnings always came up whenever I was hitting Kinesis Data Firehose API limits. Fluentd can accept high volumes of log traffic, but if it runs into Kinesis Data Firehose limits, then the data is buffered in memory.

If this state persists for a long time, data is eventually lost when the buffer reaches its max size. To prevent this problem, either increase the number of Kinesis Data Firehose delivery streams in use or request a Kinesis Data Firehose limit increase.

Aggregator reliability

In normal use, I didn’t see any dropped or duplicated log messages. A small amount of log data loss occurred when tasks in the service were stopped. This happened when the service was scaling down, and during deployments to update the Fluentd configuration.

When a task is stopped, it is sent SIGTERM, and then after a 30-second timeout, SIGKILL. When Fluentd receives the SIGTERM, it makes a single attempt to send all logs held in its in-memory buffer to their destinations. If this single attempt fails, the logs are lost. Therefore, log loss can be minimized by over-provisioning the aggregator, which reduces the amount of data buffered by each aggregator task.

Also, it is important to stay well within your Kinesis Data Firehose API limits. That way, Fluentd has the best chance of sending all the data to Kinesis Data Firehose during that single attempt.

To test the reliability of the aggregator, I used applications hosted on ECS that created several megabytes per second of log traffic. These applications inserted special ‘tracer’ messages into their normal log output. By querying for these tracer messages at the log storage destination, I was able to determine how many messages were lost or duplicated.

These tracer logs were produced at a rate of 18 messages per second. During a deployment to the aggregator service (which stops all the existing tasks after starting new ones), 2.67 tracer messages were lost on average, and 11.7 messages were duplicated.

There are multiple ways to think about this data. If I ran one deployment during an hour, then 0.004% of my log data would be lost during that period, making the aggregator 99.996% reliable. In my experience, stopping a task only causes log loss during a short time slice of about 10 seconds.

Here’s another way to look at this. Every time that a task in my service was stopped (either due to a deployment or the service scaling in), only 1.5% of the logs received by that task in the 10-second period were lost on average.

As you can see, the aggregator is not perfect, but it is fairly reliable. Remember that logs were only dropped when aggregator tasks were stopped. In all other cases, I never saw any log loss. Thus, the aggregator provides a sufficient reliability guarantee that it can be trusted to handle the logs of many production workloads.

Deploying the aggregator

Here’s how to deploy the log aggregator in your own VPC.

1.     Create the Kinesis Data Firehose delivery streams.

2.     Create a VPC and network resources.

3.     Configure Fluentd.

4.     Build the Fluentd Docker image.

5.     Deploy the Fluentd aggregator on Fargate.

6.     Configure ECS tasks to send logs to the aggregator.

Create the Kinesis Data Firehose delivery streams

For the purposes of this post, assume that you have already created an Elasticsearch domain and S3 bucket that can be used as destinations.

Create a delivery stream that sends to Amazon ES, with the following options:

  • For Delivery stream name, type “elasticsearch-delivery-stream.”
  • For Source, choose Direct Put or other sources.
  • For Record transformation and Record format conversion, enable them to change the format of your log data before it is sent to Amazon ES.
  • For Destination, choose Amazon Elasticsearch Service.
  • If needed, enable S3 backup of records.
  • For IAM Role, choose Create new, which opens the Kinesis Data Firehose IAM role creation wizard.
  • For the IAM policy, remove any statements that do not apply to your delivery stream.

All records sent through this stream are indexed under the same Elasticsearch type, so it’s important that all of the log records be in the same format. Fortunately, Fluentd makes this easy. For more information, see the Configure Fluentd section in this post.

Follow the same steps to create the delivery stream that sends to S3. Call this stream “s3-delivery-stream,” and select your S3 bucket as the destination.
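
If you prefer to script this step, a delivery stream that writes to S3 can also be created from the AWS CLI. The following is a minimal sketch; the role ARN and bucket ARN are placeholders for resources in your own account, and the Amazon ES stream would need additional destination configuration beyond what is shown here.

# Create the S3 delivery stream (placeholder role and bucket ARNs)
aws firehose create-delivery-stream \
  --delivery-stream-name s3-delivery-stream \
  --delivery-stream-type DirectPut \
  --s3-destination-configuration RoleARN=arn:aws:iam::123456789012:role/firehose-delivery-role,BucketARN=arn:aws:s3:::my-log-archive-bucket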

Create a VPC and network resources

Download the ecs-refarch-cloudformation/infrastructure/vpc.yaml AWS CloudFormation template from GitHub. This template specifies a VPC with two public and two private subnets spread across two Availability Zones. The Fluentd aggregator runs in the private subnets, along with any other services that should not be accessible outside the VPC. Your backend services would likely run here as well.

The template configures a NAT gateway that allows services in the private subnets to make calls to endpoints on the internet. It allows one-way communication out of the VPC, but blocks incoming traffic. This is important. While the aggregator service should only be accessible in your VPC, it does need to make calls to the Kinesis Data Firehose API endpoint, which lives outside of your VPC.

Deploy the template with the following command:

aws cloudformation deploy --template-file vpc.yaml \
--stack-name vpc-with-nat \
--parameter-overrides EnvironmentName=aggregator-service-infrastructure

Configure Fluentd

The Fluentd aggregator collects logs from other services in your VPC. Assuming that all these services are running in Docker containers that use the Fluentd Docker logging driver, each log event collected by the aggregator has the format of the following example:

{
"source": "stdout",
"log": "122.116.50.70 - Corwin8644 264 [2018-10-31T21:31:59Z] \"POST /activate\" 200 19886",
"container_id": "6d33ade920a96179205e01c3a17d6e7f3eb98f0d5bb2b494383250220e7f443c",
"container_name": "/ecs-service-2-apache-d492a08f9480c2fcca01"
}

This log event is from an Apache server running in a container on ECS. The line that the server logged is captured in the log field, while source, container_id, and container_name are metadata added by the Fluentd Docker logging driver.

As I mentioned earlier, all log events sent to Amazon ES from the delivery stream must be in the same format. Furthermore, the log events must be JSON-formatted so that they can be converted into an Elasticsearch type. The Fluentd Docker logging driver satisfies both of these requirements.

If you have applications that emit Fluentd logs in different formats, then you could use a Lambda function in the delivery stream to transform all of the log records into a common format.

Alternatively, you could have a different delivery stream for each application type and log format and each log format could correspond to a different type in the Amazon ES domain. For simplicity, this post assumes that all of the frontend and backend services run on ECS and use the Fluentd Docker logging driver.

Now create the Fluentd configuration file, fluent.conf:

<system>
workers 4
</system>

<source>
@type  forward
@id    input1
@label @mainstream
port  24224
</source>

# Used for docker health check
<source>
@type http
port 8888
bind 0.0.0.0
</source>

# records sent for health checking won't be forwarded anywhere
<match health*>
@type null
</match>

<label @mainstream>
<match frontend*>
@type kinesis_firehose
@id   output_kinesis_frontend
region us-west-2
delivery_stream_name elasticsearch-delivery-stream
<buffer>
flush_interval 1
chunk_limit_size 1m
flush_thread_interval 0.1
flush_thread_burst_interval 0.01
flush_thread_count 15
total_limit_size 2GB
</buffer>
</match>
<match backend*>
@type kinesis_firehose
@id   output_kinesis_backend
region us-west-2
delivery_stream_name s3-delivery-stream
<buffer>
flush_interval 1
chunk_limit_size 1m
flush_thread_interval 0.1
flush_thread_burst_interval 0.01
flush_thread_count 15
total_limit_size 2GB
</buffer>
</match>
</label>

This file can also be found here, along with all of the code for this post. The first three lines tell Fluentd to use four workers, which means it can use up to four CPU cores. You later configure each Fargate task to have four vCPUs. The rest of the configuration defines sources and destinations for logs processed by the aggregator.

The first source listed is the main source. All the applications in the VPC forward logs to Fluentd using this source definition. The source tells Fluentd to listen for logs on port 24224. Logs are streamed over TCP connections on this port.

The second source is the http Fluentd plugin, listening on port 8888. This plugin accepts logs over http; however, this is only used for container health checks. Because Fluentd lacks a built-in health check, I’ve created a container health check that sends log messages via curl to the http plugin. The rationale is that if Fluentd can accept log messages, it must be healthy. Here is the command used for the container health check:

curl http://localhost:8888/healthcheck?json=%7B%22log%22%3A+%22health+check%22%7D || exit 1

The query parameter in the URL defines a URL-encoded JSON object that looks like this:

{"log": "health check"}

The container health check sends a log message of “health check”. While the query parameter in the URL defines the log message, the path, which is /healthcheck, sets the tag for the log message. In Fluentd, log messages are tagged, which allows them to be routed to different destinations.

In this case, the tag is healthcheck. As you can see in the configuration file, the first <match> definition handles logs that have a tag that matches the pattern health*. Each <match> element defines a tag pattern and defines a destination for logs with tags that match that pattern. For the health check logs, the destination is null, because you do not want to store these dummy logs anywhere.

The Configure ECS tasks to send logs to the aggregator section of this post explains how log tags are defined with the Fluentd docker logging driver. The other <match> elements process logs for the applications in the VPC and send them to Kinesis Data Firehose.

One of them matches any log tag that begins with “frontend”; the other matches any tag that starts with “backend”. Each sends to a different delivery stream, so your frontend and backend services can tag their logs and have them sent to different destinations.

Fluentd lacks built-in support for Kinesis Data Firehose, so use an open source plugin maintained by AWS: awslabs/aws-fluent-plugin-kinesis.

Finally, each of the Kinesis Data Firehose <match> elements defines buffer settings with the <buffer> element. The Kinesis output plugin buffers data in memory if needed. These settings have been tuned to increase throughput and minimize the chance of data loss, though you should modify them as needed based upon your own testing. For a discussion of the throughput and performance of this setup, see the Aggregator performance and throughput section in this post.

Build the Fluentd Docker image

Download the Dockerfile, and place it in the same directory as the fluent.conf file discussed earlier. The Dockerfile starts with the latest Fluentd image based on Alpine, and installs awslabs/aws-fluent-plugin-kinesis and curl (for the container health check discussed earlier).

In the same directory, create an empty directory named plugins. This directory is left empty, but it is needed when building the Fluentd Docker image. For more information about building custom Fluentd Docker images, see the Fluentd page on Docker Hub.

Build and tag the image:

docker build -t custom-fluentd:latest .

Push the image to an ECR repository so that it can be used in the Fargate service.
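
If you haven't pushed to ECR before, the sequence looks roughly like the following sketch. The account ID, Region, and repository name are placeholders, and aws ecr get-login-password requires a recent version of the AWS CLI.

# Create the repository (one time) and authenticate Docker to ECR
aws ecr create-repository --repository-name custom-fluentd
aws ecr get-login-password --region us-west-2 | \
  docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-west-2.amazonaws.com

# Tag and push the image built earlier
docker tag custom-fluentd:latest 123456789012.dkr.ecr.us-west-2.amazonaws.com/custom-fluentd:latest
docker push 123456789012.dkr.ecr.us-west-2.amazonaws.com/custom-fluentd:latest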

Deploy the Fluentd aggregator on Fargate

Download the CloudFormation template, which defines all the resources needed for the aggregator service. This includes the Network Load Balancer, target group, security group, and task definition.

First, create a cluster:

aws ecs create-cluster --cluster-name fargate-fluentd-tutorial

Second, create an Amazon ECS task execution IAM role. This allows the Fargate task to pull the Fluentd container image from ECR. It also allows the Fluentd Aggregator Tasks to log to Amazon CloudWatch. This is important: Fluentd can’t manage its own logs because that would be a circular dependency.
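
If you don't already have a task execution role, a minimal sketch of creating one with the AWS CLI follows. The role name is a placeholder; the AmazonECSTaskExecutionRolePolicy managed policy grants the ECR pull and CloudWatch Logs permissions mentioned above.

# Trust policy that lets ECS tasks assume the role
cat > ecs-tasks-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ecs-tasks.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Create the role (placeholder name) and attach the standard managed policy
aws iam create-role --role-name fluentdTaskExecutionRole \
  --assume-role-policy-document file://ecs-tasks-trust.json

aws iam attach-role-policy --role-name fluentdTaskExecutionRole \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy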

The template can then be launched into the VPC and private subnets created earlier. Add the required information as parameters:

aws cloudformation deploy --template-file template.yml --stack-name aggregator-service \
--parameter-overrides EnvironmentName=fluentd-aggregator-service \
DockerImage=<Repository URI used for the image built in Step 4> \
VPC=<your VPC ID> \
Subnets=<private subnet 1, private subnet 2> \
Cluster=fargate-fluentd-tutorial \
ExecutionRoleArn=<your task execution IAM role ARN> \
MinTasks=2 \
MaxTasks=4 \
--capabilities CAPABILITY_NAMED_IAM

MinTasks is set to 2, and MaxTasks is set to 4, which means that the aggregator always has at least two tasks. When load increases, it can dynamically scale up to four tasks. Recall the discussion in the Aggregator performance and throughput section and set these values based upon your own expected log throughput.

Configure ECS tasks to send logs to the aggregator

First, get the DNS name of the load balancer created by the CloudFormation template. In the EC2 console, choose Load Balancers. The load balancer's name matches the EnvironmentName parameter; in this case, it is fluentd-aggregator-service.

Create a container definition for a container that logs to the Fluentd aggregator by adding the appropriate values for logConfiguration. In the following example, replace the fluentd-address value with the DNS name for your own load balancer. Ensure that you add :24224 after the DNS name; the aggregator listens on TCP port 24224.

"logConfiguration": {
    "logDriver": "fluentd",
    "options": {
        "fluentd-address": "fluentd-aggregator-service-cfe858972373a176.elb.us-west-2.amazonaws.com:24224",
        "tag": "frontend-apache"
    }
}

Notice the tag value, frontend-apache. This is how the tag discussed earlier is set. This tag matches the pattern frontend*, so the Fluentd aggregator sends it to the delivery stream for “frontend” logs.

Finally, your container instances need the following user data to enable the Fluentd log driver in the ECS agent:

#!/bin/bash
echo "ECS_AVAILABLE_LOGGING_DRIVERS=[\"awslogs\",\"fluentd\"]" >> /etc/ecs/ecs.config

Conclusion

In this post, I showed you how to build a log aggregator using AWS Fargate, Amazon Kinesis Data Firehose, and Fluentd.

To learn how to use the aggregator with applications that do not run on ECS, I recommend reading All Data Are Belong to AWS: Streaming upload via Fluentd from Kiyoto Tamura, a maintainer of Fluentd.

Learn about AWS Services & Solutions – March AWS Online Tech Talks

Post Syndicated from Robin Park original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-march-aws-online-tech-talks/

AWS Tech Talks

Join us this March to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

Compute

March 26, 2019 | 11:00 AM – 12:00 PM PT Technical Deep Dive: Running Amazon EC2 Workloads at Scale – Learn how you can optimize your workloads running on Amazon EC2 for cost and performance, all while handling peak demand.

March 27, 2019 | 9:00 AM – 10:00 AM PT Introduction to AWS Outposts – Learn how you can run AWS infrastructure on-premises with AWS Outposts for a truly consistent hybrid experience.

March 28, 2019 | 1:00 PM – 2:00 PM PT Deep Dive on OpenMPI and Elastic Fabric Adapter (EFA) – Learn how you can use Open MPI and Elastic Fabric Adapter (EFA) to scale tightly coupled HPC workloads on Amazon EC2.

Containers

March 21, 2019 | 11:00 AM – 12:00 PM PT Running Kubernetes with Amazon EKS – Learn how to run Kubernetes on AWS with Amazon EKS.

March 22, 2019 | 9:00 AM – 10:00 AM PT Deep Dive Into Container Networking – Dive deep into microservices networking and how you can build, secure, and manage the communications into, out of, and between the various microservices that make up your application.

Data Lakes & Analytics

March 19, 2019 | 9:00 AM – 10:00 AM PT Fuzzy Matching and Deduplicating Data with ML Transforms for AWS Lake Formation – Learn how to use ML Transforms for AWS Glue to link and de-duplicate matching records.

March 20, 2019 | 9:00 AM – 10:00 AM PT Customer Showcase: Perform Real-time ETL from IoT Devices into your Data Lake with Amazon Kinesis – Learn best practices for how to perform real-time extract-transform-load into your data lake with Amazon Kinesis.

March 20, 2019 | 11:00 AM – 12:00 PM PT Machine Learning Powered Business Intelligence with Amazon QuickSight – Learn how Amazon QuickSight leverages powerful ML and natural language capabilities to generate insights that help you discover the story behind the numbers.

Databases

March 18, 2019 | 9:00 AM – 10:00 AM PT What’s New in PostgreSQL 11 – Find out what’s new in PostgreSQL 11, the latest major version of the popular open source database, and learn about AWS services for running highly available PostgreSQL databases in the cloud.

March 19, 2019 | 1:00 PM – 2:00 PM PT Introduction on Migrating your Oracle/SQL Server Databases over to the Cloud using AWS’s New Workload Qualification Framework – Get an introduction on how AWS’s Workload Qualification Framework can help you with your application and database migrations.

March 20, 2019 | 1:00 PM – 2:00 PM PT What’s New in MySQL 8 – Find out what’s new in MySQL 8, the latest major version of the world’s most popular open source database, and learn about AWS services for running highly available MySQL databases in the cloud.

March 21, 2019 | 9:00 AM – 10:00 AM PT Building Scalable & Reliable Enterprise Apps with AWS Relational Databases – Learn how AWS Relational Databases can help you build scalable & reliable enterprise apps.

DevOps

March 19, 2019 | 11:00 AM – 12:00 PM PT Introduction to Amazon Corretto: A No-Cost Distribution of OpenJDK – Learn about Amazon Corretto, a no-cost, multiplatform, production-ready distribution of OpenJDK from Amazon.

End-User Computing

March 28, 2019 | 9:00 AM – 10:00 AM PT Fireside Chat: Enabling Today’s Workforce with Cloud Desktops – Learn about the tools and best practices organizations can use to enable today’s workforce with cloud desktops on AWS.

Enterprise

March 26, 2019 | 1:00 PM – 2:00 PM PT Speed Your Cloud Computing Journey With the Customer Enablement Services of AWS: ProServe, AMS, and Support – Learn how to accelerate your cloud journey with AWS’s Customer Enablement Services.

IoT

March 26, 2019 | 9:00 AM – 10:00 AM PT How to Deploy AWS IoT Greengrass Using Docker Containers and Ubuntu-snap – Learn how to bring cloud services to the edge using containerized microservices by deploying AWS IoT Greengrass to your device using Docker containers and Ubuntu snaps.

Machine Learning

March 18, 2019 | 1:00 PM – 2:00 PM PT Orchestrate Machine Learning Workflows with Amazon SageMaker and AWS Step Functions – Learn about how ML workflows can be orchestrated with the rich features of Amazon SageMaker and AWS Step Functions.

March 21, 2019 | 1:00 PM – 2:00 PM PT Extract Text and Data from Any Document with No Prior ML Experience – Learn how to extract text and data from any document with no prior machine learning experience.

March 22, 2019 | 11:00 AM – 12:00 PM PT Build Forecasts and Individualized Recommendations with AI – Learn how you can build accurate forecasts and individualized recommendation systems using our new AI services, Amazon Forecast and Amazon Personalize.

Management Tools

March 29, 2019 | 9:00 AM – 10:00 AM PT Deep Dive on Inventory Management and Configuration Compliance in AWS – Learn how AWS helps with effective inventory management and configuration compliance management of your cloud resources.

Networking & Content Delivery

March 25, 2019 | 1:00 PM – 2:00 PM PT Application Acceleration and Protection with Amazon CloudFront, AWS WAF, and AWS Shield – Learn how to secure and accelerate your applications using AWS’s Edge services in this demo-driven tech talk.

Robotics

March 28, 2019 | 11:00 AM – 12:00 PM PT Build a Robot Application with AWS RoboMaker – Learn how to improve your robotics application development lifecycle with AWS RoboMaker.

Security, Identity, & Compliance

March 27, 2019 | 11:00 AM – 12:00 PM PT Remediating Amazon GuardDuty and AWS Security Hub Findings – Learn how to build and implement remediation automations for Amazon GuardDuty and AWS Security Hub.

March 27, 2019 | 1:00 PM – 2:00 PM PT Scaling Accounts and Permissions Management – Learn how to scale your accounts and permissions management efficiently as you continue to move your workloads to AWS Cloud.

Serverless

March 18, 2019 | 11:00 AM – 12:00 PM PT Testing and Deployment Best Practices for AWS Lambda-Based Applications – Learn best practices for testing and deploying AWS Lambda based applications.

Storage

March 25, 2019 | 11:00 AM – 12:00 PM PT Introducing a New Cost-Optimized Storage Class for Amazon EFS – Come learn how the new Amazon EFS storage class and Lifecycle Management automatically reduces cost by up to 85% for infrequently accessed files.

Best and worst ECtHR judgments of 2018

Post Syndicated from nellyo original https://nellyo.wordpress.com/2019/03/01/echr-30/

This year again, the best and worst judgments of the European Court of Human Rights (ECtHR) for 2018 have been announced.

What is interesting this year is that the winners in both categories are judgments under Article 10, freedom of expression.

  • The positive examples:

Magyar Jeti Zrt v. Hungary: 40.4 %

Big Brother Watch and Others v. the United Kingdom: 17.7 %

Al Nashiri v. Romania / Abu Zubaydah v. Lithuania: 16.2 %

Reasoning: Protection of freedom of expression with rich argumentation; the judgment concerns the liability regime for hyperlinks (see here).

  • The negative examples:

Sinkova v. Ukraine: 26.4 %

Beuze v. Belgium: 24.3 %

Mohamed Hasan v. Norway: 13.9 %

Reasoning: The ECtHR did not protect freedom of expression in relation to a protest performance and artistic expression (see here).

Implementing GitFlow Using AWS CodePipeline, AWS CodeCommit, AWS CodeBuild, and AWS CodeDeploy

Post Syndicated from Ashish Gore original https://aws.amazon.com/blogs/devops/implementing-gitflow-using-aws-codepipeline-aws-codecommit-aws-codebuild-and-aws-codedeploy/

This blog post shows how AWS customers who use a GitFlow branching model can model their merge and release process by using AWS CodePipeline, AWS CodeCommit, AWS CodeBuild, and AWS CodeDeploy. This post provides a framework, AWS CloudFormation templates, and AWS CLI commands.

Before we begin, we want to point out that GitFlow isn’t something that we practice at Amazon because it is incompatible with the way we think about CI/CD. Continuous integration means that every developer is regularly merging changes back to master (at least once per day). As we’ll explain later, GitFlow involves creating multiple levels of branching off of master where changes to feature branches are only periodically merged all the way back to master to trigger a release. Continuous delivery requires the capability to get every change into production quickly, safely, and sustainably. Research by groups such as DORA has shown that teams that practice CI/CD get features to customers more quickly, are able to recover from issues more quickly, experience fewer failed deployments, and have higher employee satisfaction.

Despite our differing view, we recognize that our customers have requirements that might make branching models like GitFlow attractive (or even mandatory). For this reason, we want to provide information that helps them use our tools to automate merge and release tasks and get as close to CI/CD as possible. With that disclaimer out of the way, let’s dive in!

When Linus Torvalds introduced Git version control in 2005, it really changed the way developers thought about branching and merging. Before Git, these tasks were scary and mostly avoided. As the tools became more mature, branching and merging became both cheap and simple. They are now part of the daily development workflow. In 2010, Vincent Driessen introduced GitFlow, which became an extremely popular branch and release management model. It introduced the concept of a develop branch as the mainline integration and the well-known master branch, which is always kept in a production-ready state. Both master and develop are permanent branches, but GitFlow also recommends short-lived feature, hotfix, and release branches, like so:

GitFlow guidelines:

  • Use develop as a continuous integration branch.
  • Use feature branches to work on multiple features.
  • Use release branches to work on a particular release (multiple features).
  • Use hotfix branches off of master to push a hotfix.
  • Merge to master after every release.
  • Master contains production-ready code.

Now that you have some background, let’s take a look at how we can implement this model using services that are part of AWS Developer Tools: AWS CodePipeline, AWS CodeCommit, AWS CodeBuild, and AWS CodeDeploy. In this post, we assume you are familiar with these AWS services. If you aren’t, see the links in the Reference section before you begin. We also assume that you have installed and configured the AWS CLI.

Throughout the post, we use the popular GitFlow tool. It’s written on top of Git and automates the process of branch creation and merging. The tool follows the GitFlow branching model guidelines. You don’t have to use this tool. You can use Git commands instead.
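
For reference, the plain Git equivalent of the GitFlow feature workflow used later in this post looks roughly like this (the branch name is a placeholder):

# Start a feature branch off of develop (equivalent of: git flow feature start my-feature)
$ git checkout develop
$ git checkout -b feature/my-feature

# ... commit work on the feature branch ...

# Finish the feature (equivalent of: git flow feature finish my-feature)
$ git checkout develop
$ git merge --no-ff feature/my-feature
$ git branch -d feature/my-feature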

For simplicity, production-like pipelines that have approval or testing stages have been omitted, but they can easily fit into this model. Also, in an ideal production scenario, you would keep Dev and Prod accounts separate.

AWS Developer Tools and GitFlow

Let’s take a look at how can we model AWS CodePipeline with GitFlow. The idea is to create a pipeline per branch. Each pipeline has a lifecycle that is tied to the branch. When a new, short-lived branch is created, we create the pipeline and required resources. After the short-lived branch is merged into develop, we clean up the pipeline and resources to avoid recurring costs.

The following would be permanent and would have same lifetime as the master and develop branches:

  • AWS CodeCommit master/develop branch
  • AWS CodeBuild project across all branches
  • AWS CodeDeploy application across all branches
  • AWS CloudFormation stack (EC2 instance) for master (prod) and develop (stage)

The following would be temporary and would have the same lifetime as the short-lived branches:

  • AWS CodeCommit feature/hotfix/release branch
  • AWS CodePipeline per branch
  • AWS CodeDeploy deployment group per branch
  • AWS CloudFormation stack (EC2 instance) per branch

Here’s how it would look:

Basic guidelines (assuming EC2/on-premises):

  • Each branch has an AWS CodePipeline.
  • AWS CodePipeline is configured with AWS CodeCommit as the source provider, AWS CodeBuild as the build provider, and AWS CodeDeploy as the deployment provider.
  • AWS CodeBuild is configured with AWS CodePipeline as the source.
  • Each AWS CodePipeline has an AWS CodeDeploy deployment group that uses the Name tag to deploy.
  • A single Amazon S3 bucket is used as the artifact store, but you can choose to keep separate buckets based on repo.

 

Step 1: Use the following AWS CloudFormation templates to set up the required roles and environment for master and develop, including the commit repo, VPC, EC2 instance, CodeBuild, CodeDeploy, and CodePipeline.

$ aws cloudformation create-stack --stack-name GitFlowEnv \
--template-body https://s3.amazonaws.com/devops-workshop-0526-2051/git-flow/aws-devops-workshop-environment-setup.template \
--capabilities CAPABILITY_IAM 

$ aws cloudformation create-stack --stack-name GitFlowCiCd \
--template-body https://s3.amazonaws.com/devops-workshop-0526-2051/git-flow/aws-pipeline-commit-build-deploy.template \
--capabilities CAPABILITY_IAM \
--parameters ParameterKey=MainBranchName,ParameterValue=master ParameterKey=DevBranchName,ParameterValue=develop 

Here is how the pipelines should appear in the CodePipeline console:

Step 2: Push the contents to the AWS CodeCommit repo.

Download https://s3.amazonaws.com/gitflowawsdevopsblogpost/WebAppRepo.zip. Unzip the file, clone the repo, and then commit and push the contents to CodeCommit – WebAppRepo.

Step 3: Run git flow init in the repo to initialize the branches.

$ git flow init

Assume you need to start working on a new feature and create a branch.

$ git flow feature start <branch>

Step 4: Update the stack to create another pipeline for feature-x branch.

$ aws cloudformation update-stack --stack-name GitFlowCiCd \
--template-body https://s3.amazonaws.com/devops-workshop-0526-2051/git-flow/aws-pipeline-commit-build-deploy-update.template \
--capabilities CAPABILITY_IAM \
--parameters ParameterKey=MainBranchName,ParameterValue=master ParameterKey=DevBranchName,ParameterValue=develop ParameterKey=FeatureBranchName,ParameterValue=feature-x

When you’re done, you should see the feature-x branch in the CodePipeline console. It’s ready to build and deploy. To test, make a change to the branch and view the pipeline in action.

After you have confirmed the branch works as expected, use the finish command to merge changes into the develop branch.

$ git flow feature finish <feature>

After the changes are merged, update the AWS CloudFormation stack to remove the branch. This will help you avoid charges for resources you no longer need.

$ aws cloudformation update-stack --stack-name GitFlowCiCd \
--template-body https://s3.amazonaws.com/devops-workshop-0526-2051/git-flow/aws-pipeline-commit-build-deploy.template \
--capabilities CAPABILITY_IAM \
--parameters ParameterKey=MainBranchName,ParameterValue=master ParameterKey=DevBranchName,ParameterValue=develop

The steps for the release and hotfix branches are the same.

End result: Pipelines and deployment groups

You should end up with pipelines that look like this.

Next steps

If you take the CLI commands and wrap them in your own custom bash script, you can use GitFlow and the script to quickly set up and tear down pipelines and resources for short-lived branches. This helps you avoid being charged for resources you no longer need. Alternatively, you can write a scheduled Lambda function that, based on creation date, deletes the short-lived pipelines on a regular basis.
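
As a rough sketch of that idea, a wrapper script around the commands used earlier in this post might look like the following. The script name, argument handling, and error handling are only illustrative; it mirrors the update-stack calls shown above, so adapt the template URLs and parameters to your own setup.

#!/bin/bash
# Usage: ./feature-pipeline.sh start|finish <feature-branch-name>
set -euo pipefail

ACTION=$1
BRANCH=$2
TEMPLATE_WITH_FEATURE=https://s3.amazonaws.com/devops-workshop-0526-2051/git-flow/aws-pipeline-commit-build-deploy-update.template
TEMPLATE_BASE=https://s3.amazonaws.com/devops-workshop-0526-2051/git-flow/aws-pipeline-commit-build-deploy.template

if [ "$ACTION" = "start" ]; then
  # Create the branch and the matching pipeline resources
  git flow feature start "$BRANCH"
  aws cloudformation update-stack --stack-name GitFlowCiCd \
    --template-body "$TEMPLATE_WITH_FEATURE" \
    --capabilities CAPABILITY_IAM \
    --parameters ParameterKey=MainBranchName,ParameterValue=master \
                 ParameterKey=DevBranchName,ParameterValue=develop \
                 ParameterKey=FeatureBranchName,ParameterValue="$BRANCH"
else
  # Merge the branch and tear the pipeline resources back down
  git flow feature finish "$BRANCH"
  aws cloudformation update-stack --stack-name GitFlowCiCd \
    --template-body "$TEMPLATE_BASE" \
    --capabilities CAPABILITY_IAM \
    --parameters ParameterKey=MainBranchName,ParameterValue=master \
                 ParameterKey=DevBranchName,ParameterValue=develop
fi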

Summary

In this blog post, we showed how AWS CodePipeline, AWS CodeCommit, AWS CodeBuild, and AWS CodeDeploy can be used to model GitFlow. We hope you can use the information in this post to improve your CI/CD strategy, specifically to get your developers working in feature/release/hotfix branches and to provide them with an environment where they can collaborate, test, and deploy changes quickly.

References

19 years without Prof. Toncho Zhechev

Post Syndicated from nellyo original https://nellyo.wordpress.com/2019/02/22/19/

Professor Toncho Zhechev passed away on 23 February 2000.

This is how I remember Professor Zhechev: a little pensive, a little concerned, a little anxious about the world around him, at peace with what lies behind us and curious about what lies ahead of us.

 

Scalable deep learning training using multi-node parallel jobs with AWS Batch and Amazon FSx for Lustre

Post Syndicated from Geoff Murase original https://aws.amazon.com/blogs/compute/scalable-deep-learning-training-using-multi-node-parallel-jobs-with-aws-batch-and-amazon-fsx-for-lustre/

Contributed by Amr Ragab, HPC Application Consultant, AWS Professional Services

How easy is it to take an AWS reference architecture and implement a production solution? At re:Invent 2018, Toyota Research Institute presented their production DL HPC architecture. This was based on a reference architecture for a scalable, deep learning, high performance computing solution, released earlier in the year.  The architecture was designed to run ImageNet and ResNet-50 benchmarks on Apache MXNet and TensorFlow machine learning (ML) frameworks. It used cloud best practices to take advantage of the scale and elasticity that AWS offers.

With the pace of innovation at AWS, I can now show an evolution of that deep learning solution with new services.

A three-component HPC cluster is common in tightly coupled, multi-node distributed training solutions. The base layer is a high-performance file system optimized for reading the images packed as TFRecords or RecordIO, as well as in their original form. The reference architecture originally used BeeGFS. In this post, I use the high-performance Amazon FSx for Lustre file system, announced at re:Invent 2018. The second layer is the scalable compute, which originally used p3.16xlarge instances containing eight NVIDIA Tesla V100 GPUs per node. Finally, a job scheduler is the third layer for managing multiuser access to plan and distribute the workload across the available nodes.

In this post, I demonstrate how to create a fully managed HPC infrastructure, execute the distributed training job, and collapse it using native AWS services. In the three-component HPC design, the scheduler and compute layers are achieved by using AWS Batch as a managed service built to run thousands of batch computing jobs. AWS Batch dynamically provisions compute resources based on the specific job requirements of the distributed training job.

AWS Batch recently started supporting multi-node parallel jobs, allowing tightly coupled jobs to be executed. This compute layer can be coupled with the FSx for Lustre file system.

FSx for Lustre is a fully managed, parallel file system based on Lustre that can scale to millions of IOPS, and hundreds of gigabytes per second throughput. FSx for Lustre is seamlessly integrated with Amazon S3 to parallelize the ingestion of data from the object store.

 

Coupled together, this provides a core compute solution for running workloads requiring high performance layers. One additional benefit is that AWS Batch and FSx for Lustre are API-driven services and can be programmatically orchestrated.

The goal of this post is to showcase an innovative architecture, replacing a self-managed, roll-your-own file system and compute layer with managed services, using FSx for Lustre and AWS Batch running containerized applications, hence reducing complexity and maintenance. This can also serve as a template for other HPC applications requiring similar compute, networking, and storage topologies. With that in mind, benchmarks related to distributed deep learning are out of scope. As you see at the end of this post, I achieved linear scalability over a broad range (8–160) of GPUs spanning 1–20 p3.16xlarge nodes.

Deployment

Much of the deployment was covered in a previous post, Building a tightly coupled molecular dynamics workflow with multi-node parallel jobs in AWS Batch. However, some feature updates since then have simplified the initial deployment.

In brief, you provision the following resources:

  • A FSx for Lustre file system hydrated from a S3 bucket that provides the source ImageNet 2012 images
  • A new Ubuntu 16.04 ECS instance:
    • Lustre kernel driver and FS mount
    • CUDA 10 with NVIDIA Tesla 410 driver
    • Docker 18.09-ce including nvidia-docker2
    • A multi-node parallel batch–compatible TensorFlow container with the following stack:
      • Ubuntu 18.04 container image
      • TENSORFLOW_VERSION=1.12.0
      • HOROVOD_VERSION=0.15.2
      • CUDNN_VERSION=7.4.2.24-1+cuda10.0
      • NCCL_VERSION=2.3.7-1+cuda10.0
      • OPENMPI 4.0.0

FSx for Lustre setup

First, create a file system in the FSx for Lustre console. The default minimum file system size of 3600 GiB is sufficient.

  • File system name: ImageNet2012 dataset
  • Storage capacity: 3600 (GiB)

In the console, ensure that you have specified the appropriate network access and security groups so that clients can access the FSx for Lustre file system. For this post, find the scripts to prepare the dataset in the deep-learning-models GitHub repo.

  • Data repository type: Amazon S3
  • Import path: Point to an S3 bucket holding the ImageNet 2012 dataset.
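
As an alternative to the console steps above, the same file system can be sketched out with the AWS CLI. The subnet ID, security group ID, and bucket name below are placeholders for your own resources.

# Create a 3600 GiB Lustre file system hydrated from the ImageNet bucket (placeholder IDs)
aws fsx create-file-system \
  --file-system-type LUSTRE \
  --storage-capacity 3600 \
  --subnet-ids subnet-0123456789abcdef0 \
  --security-group-ids sg-0123456789abcdef0 \
  --lustre-configuration ImportPath=s3://my-imagenet-2012-bucket \
  --tags Key=Name,Value=ImageNet2012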

While the FSx for Lustre layer is being provisioned, spin up an instance in the Amazon EC2 console with the Ubuntu 16.04 ECS AMI, using a p3.2xlarge instance type. One modification is required when preparing the ecs-agent systemd file: replace the ExecStart= stanza with the following:

ExecStart=docker run --name ecs-agent \
  --init \
  --restart=on-failure:10 \
  --volume=/var/run:/var/run \
  --volume=/var/log/ecs/:/log \
  --volume=/var/lib/ecs/data:/data \
  --volume=/etc/ecs:/etc/ecs \
  --volume=/sbin:/sbin \
  --volume=/lib:/lib \
  --volume=/lib64:/lib64 \
  --volume=/usr/lib:/usr/lib \
  --volume=/proc:/host/proc \
  --volume=/sys/fs/cgroup:/sys/fs/cgroup \
  --volume=/var/lib/ecs/dhclient:/var/lib/dhclient \
  --net=host \
  --env ECS_LOGFILE=/log/ecs-agent.log \
  --env ECS_DATADIR=/data \
  --env ECS_UPDATES_ENABLED=false \
  --env ECS_AVAILABLE_LOGGING_DRIVERS='["json-file","syslog","awslogs"]' \
  --env ECS_ENABLE_TASK_IAM_ROLE=true \
  --env ECS_ENABLE_TASK_IAM_ROLE_NETWORK_HOST=true \
  --env ECS_UPDATES_ENABLED=true \
  --env ECS_ENABLE_TASK_ENI=true \
  --env-file=/etc/ecs/ecs.config \
  --cap-add=sys_admin \
  --cap-add=net_admin \
  -d \
  amazon/amazon-ecs-agent:latest

During the provisioning workflow, add a 500-GB SSD (gp2) Amazon EBS volume. For ease of installation, install the Lustre kernel driver first, and modify the kernel for compatibility. Start by installing the dkms package:

sudo apt install -y dkms git

Follow the instructions for Ubuntu 16.04.

Install CUDA 10 and the NVIDIA 410 driver branch according to the instructions provided by NVIDIA. It’s important that dkms is installed so that the kernel modules are built against the kernel installed earlier.

When complete, install the latest Docker release, as well as nvidia-docker2, according to the instructions in the nvidia-docker GitHub repo, setting the default runtime to “nvidia.”

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}

At this stage, you can create this AMI and keep it for future deployments. This saves time in bootstrapping, as the generic AMI can be used for a diverse set of applications.

When the FSx for Lustre file system is complete, add the file system information into /etc/fstab:

<file_system_dns_name>@tcp:/fsx /fsx lustre defaults,_netdev 0 0

Confirm that the mounting is successful by using the following command:

sudo mkdir /fsx && sudo mount -a

Building the multi-node parallel batch TensorFlow Docker image

Now, set up the multi-node TensorFlow container image. Keep in mind that this process takes approximately two hours to build on a p3.2xlarge. Use the Dockerfile build scripts for setting up multinode parallel batch jobs.

git clone https://github.com/aws-samples/aws-mnpbatch-template.git
cd aws-mnpbatch-template
docker build -t nvidia/mnp-batch-tensorflow .

As part of the Docker container’s ENTRYPOINT, use the mpi-run.sh script from the Building a tightly coupled molecular dynamics workflow with multi-node parallel jobs in AWS Batch post. Optimize it for running the TensorFlow distributed training as follows:

cd $SCRATCH_DIR
 export INTERFACE=eth0
 export MODEL_HOME=/root/deep-learning-models/models/resnet/tensorflow
 /opt/openmpi/bin/mpirun --allow-run-as-root -np $MPI_GPUS --machinefile ${HOST_FILE_PATH}-deduped -mca plm_rsh_no_tree_spawn 1 \
                        -bind-to socket -map-by slot \
                        $EXTRA_MPI_PARAMS -x LD_LIBRARY_PATH -x PATH -mca pml ob1 -mca btl ^openib \
                        -x NCCL_SOCKET_IFNAME=$INTERFACE -mca btl_tcp_if_include $INTERFACE \
                        -x TF_CPP_MIN_LOG_LEVEL=0 \
                        python3 -W ignore $MODEL_HOME/train_imagenet_resnet_hvd.py \
                        --data_dir $JOB_DIR --num_epochs 90 -b $BATCH_SIZE \
                        --lr_decay_mode poly --warmup_epochs 10 --clear_log

There are some undefined environment variables in the startup command. Those are filled in when you create the multi-node batch job definition file in later stages of this post.

Upon successfully building the Docker image, commit this image to the Amazon ECR registry, to be pulled later. You can consult the ECR push commands by selecting the repository in the ECR console and choosing View push commands.

One additional tip: the Docker image is approximately 12 GB. To ensure that your container instances start up quickly, cache this image on the instance so that only incremental layer updates are pulled from ECR, instead of pulling the entire image, which takes more time.

Finally, you should be ready to create this AMI for the AWS Batch compute environment phase of the workflow. In the AWS Batch console, choose Compute environment and create an environment with the following parameters.

Compute environment

  • Compute environment type:  Managed
  • Compute environment name:  tensorflow-gpu-fsx-ce
  • Service role:  AWSBatchServiceRole
  • EC2 instance role:  ecsInstanceRole

Compute resources

Set the minimum and desired vCPUs at 0. When a job is submitted, the underlying AWS Batch service recruits the nodes, taking advantage of the elasticity and scale offered on AWS.

  • Provisioning model: On-Demand
  • Allowed instance types: p3 family, p3dn.24xlarge
  • Minimum vCPUs: 0
  • Desired vCPUs: 0
  • Maximum vCPUs: 4096
  • User-specified AMI: Use the custom AMI created earlier.

Networking

AWS Batch makes it easy to specify the placement groups. If you do this, the internode communication between instances has the lowest latencies possible, which is a requirement when running tightly coupled workloads.

  • VPC Id: Choose a VPC that allows access to the FSx cluster created earlier.
  • Security groups: FSx security group, Cluster security group
  • Placement group: tf-group (Create the placement group.)

EC2 tags

  • Key: Name
  • Value: tensorflow-gpu-fsx-processor

Associate this compute environment with a queue called tf-queue. Finally, create a job definition that ties the process together and executes the container.

The following parameters in JSON format sets up the mnp-tensorflow job definition.

{
    "jobDefinitionName": "mnptensorflow-gpu-mnp1",
    "jobDefinitionArn": "arn:aws:batch:us-east-2:<accountid>:job-definition/mnptensorflow-gpu-mnp1:1",
    "revision": 2,
    "status": "ACTIVE",
    "type": "multinode",
    "parameters": {},
    "retryStrategy": {
        "attempts": 1
    },
    "nodeProperties": {
        "numNodes": 20,
        "mainNode": 0,
        "nodeRangeProperties": [
            {
                "targetNodes": "0:19",
                "container": {
                    "image": "<accountid>.dkr.ecr.us-east-2.amazonaws.com/mnp-tensorflow",
                    "vcpus": 62,
                    "memory": 424000,
                    "command": [],
                    "jobRoleArn": "arn:aws:iam::<accountid>:role/ecsTaskExecutionRole",
                    "volumes": [
                        {
                            "host": {
                                "sourcePath": "/scratch"
                            },
                            "name": "scratch"
                        },
                        {
                            "host": {
                                "sourcePath": "/fsx"
                            },
                            "name": "fsx"
                        }
                    ],
                    "environment": [
                        {
                            "name": "SCRATCH_DIR",
                            "value": "/scratch"
                        },
                        {
                            "name": "JOB_DIR",
                            "value": "/fsx/resized"
                        },
                        {
                            "name": "BATCH_SIZE",
                            "value": "256"
                        },
                        {
                            "name": "EXTRA_MPI_PARAMS",
                            "value": "-x HOROVOD_HIERARCHICAL_ALLREDUCE=1 -x HOROVOD_FUSION_THRESHOLD=16777216 -x NCCL_MIN_NRINGS=8 -x NCCL_LAUNCH_MODE=PARALLEL"
                        },
                        {
                            "name": "MPI_GPUS",
                            "value": "160"
                        }
                    ],
                    "mountPoints": [
                        {
                            "containerPath": "/fsx",
                            "sourceVolume": "fsx"
                        },
                        {
                            "containerPath": "/scratch",
                            "sourceVolume": "scratch"
                        }
                    ],
                    "ulimits": [],
                    "instanceType": "p3.16xlarge"
                }
            }
        ]
    }
}

  • MPI_GPUS: Total number of GPUs in the cluster. In this case, it’s 20 x p3.16xlarge = 160.
  • BATCH_SIZE: Number of images per GPU to load at a time for training. With 16 GB of memory per GPU, this is 256.
  • JOB_DIR: Location of the TFRecords prepared earlier, optimized for the number of shards: /fsx/resized.
  • SCRATCH_DIR: Path for the model outputs: /scratch.

One additional tip:  You have the freedom to expose additional parameters in the job definition. This means that you can also expose model training hyperparameters, which opens the door to multi-parameter optimization (MPO) studies on the AWS Batch layer.

With the job definition created, submit a new job sourcing this job definition, executing on the tf-queue created earlier. This spawns the compute environment.
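
Submitting the job from the AWS CLI is a one-liner along these lines; the job name is arbitrary, and AWS Batch uses the latest active revision of the job definition if you don't specify one.

# Submit the multi-node training job to the queue created earlier
aws batch submit-job \
  --job-name resnet50-imagenet-training \
  --job-queue tf-queue \
  --job-definition mnptensorflow-gpu-mnp1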

The AWS Batch service only launches the requested number of nodes. You don’t pay for the running EC2 instances until all requested nodes are launched in your compute environment.

After the job enters the RUNNING state, you can monitor the main node (container 0) for activity with the CloudWatch log stream created for this job. Some of the key entries are as follows, with the 20 nodes joining the cluster. One additional tip: it is possible to use this infrastructure to push the model parameters and training performance to TensorBoard for additional monitoring.

The next log screenshot shows the main TensorFlow and Horovod workflow starting up. 

Performance monitoring

On 20 p3.16xl nodes, I achieved a comparable speed of approximately 100k images/sec, with close to 90-100% GPU utilization across all 160 GPUs with the containerized Horovod TensorFlow Docker image.

When you have this implemented, try out the cluster using the recently announced p3dn.24xlarge, a 32-GB NVIDIA Tesla V100 memory variant of the p3.16xlarge with 100-Gbps networking. To take advantage of the full GPU memory of the p3dn in the job definition, increase the BATCH_SIZE environment variable.

Conclusion

With the evolution of a scalable, deep learning–focused, high performance computing environment, you can now use a cloud-native approach. Focus on your code and training while AWS handles the undifferentiated heavy lifting.

As mentioned earlier, this reference architecture has an API interface, thus an event-driven workflow can further extend this work. For example, you can integrate this core compute in an AWS Step Functions workflow to stand up the FSx for Lustre layer. Submit the batch job and collapse the FSx for Lustre layer.

Or through an API Gateway, create a web application for the job submission. Integrate with on-premises resources to transfer data to the S3 bucket and hydrate the FSx for Lustre file system.

If you have any questions about this deployment or how to integrate with a longer AWS posture, please comment below. Now go power up your deep learning workloads with a fully managed, high performance compute framework!

Podcast #299: February 2019 Updates

Post Syndicated from Simon Elisha original https://aws.amazon.com/blogs/aws/podcast-299-february-2019-updates/

Simon guides you through lots of new features, services and capabilities that you can take advantage of. Including the new AWS Backup service, more powerful GPU capabilities, new SLAs and much, much more!

Chapters:

Service Level Agreements 0:17
Storage 0:57
Media Services 5:08
Developer Tools 6:17
Analytics 9:54
AI/ML 12:07
Database 14:47
Networking & Content Delivery 17:32
Compute 19:02
Solutions 21:57
Business Applications 23:38
AWS Cost Management 25:07
Migration & Transfer 25:39
Application Integration 26:07
Management & Governance 26:32
End User Computing 29:22

Additional Resources

Topic || Service Level Agreements 0:17

Topic || Storage 0:57

Topic || Media Services 5:08

Topic || Developer Tools 6:17

Topic || Analytics 9:54

Topic || AI/ML 12:07

Topic || Database 14:47

Topic || Networking and Content Delivery 17:32

Topic || Compute 19:02

Topic || Solutions 21:57

Topic || Business Applications 23:38

Topic || AWS Cost Management 25:07

Topic || Migration and Transfer 25:39

Topic || Application Integration 26:07

Topic || Management and Governance 26:32

Topic || End User Computing 29:22

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives and interviews. Whether you’re building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you. Subscribe with one of the following:

Like the Podcast?

Rate us on iTunes and send your suggestions, show ideas, and comments to [email protected]. We want to hear from you!

Podcast 298: [Public Sector Special Series #6] – Bringing the White House to the World

Post Syndicated from Simon Elisha original https://aws.amazon.com/blogs/aws/podcast-298-public-sector-special-series-6-bringing-the-white-house-to-the-world/

Dr. Stephanie Tuszynski (Director of the Digital Library – White House Historical Association) speaks about how they used AWS to bring the experience of the White House to the world.

Additional Resources

About the AWS Podcast

The AWS Podcast is a cloud platform podcast for developers, dev ops, and cloud professionals seeking the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Jeff Barr for regular updates, deep dives and interviews. Whether you’re building machine learning and AI models, open source projects, or hybrid cloud solutions, the AWS Podcast has something for you. Subscribe with one of the following:

Like the Podcast?

Rate us on iTunes and send your suggestions, show ideas, and comments to [email protected]. We want to hear from you!

Validating AWS CodeCommit Pull Requests with AWS CodeBuild and AWS Lambda

Post Syndicated from Chris Barclay original https://aws.amazon.com/blogs/devops/validating-aws-codecommit-pull-requests-with-aws-codebuild-and-aws-lambda/

Thanks to Jose Ferraris and Flynn Bundy for this great post about how to validate AWS CodeCommit pull requests with AWS CodeBuild and AWS Lambda. Both are DevOps Consultants from the AWS Professional Services’ EMEA team.

You can help ensure a high level of code quality and avoid merging code that does not integrate with previous changes by testing proposed code changes in pull requests before they are allowed to be merged. In this blog post, we’ll show you how to set up this kind of validation using AWS CodeCommit, AWS CodeBuild, and AWS Lambda. In addition, we’ll show you how to set up a pipeline to automatically build your tested, approved, and merged code changes using AWS CodePipeline.

When we talk with customers and partners, we find that they are in different stages in the adoption of DevOps methodologies such as Continuous Integration and Continuous Deployment (CI/CD). However, one of the main requirements we see is a strong emphasis on automation of delivering resources in a safe, secure, and repeatable manner. One of the fundamental principles of CI/CD is aimed at keeping everyone on the team in sync about changes happening in the codebase. With this in mind, it’s important to fail fast and fail early within a CI/CD workflow to ensure that potential issues are caught before making their way into production.

To do this, we can use services such as AWS CodeBuild for running our tests, along with AWS CodeCommit to store our source code. One of the ways we can “fail fast” is to validate pull requests with tests to see how they will integrate with the current master branch of a repository when first opened in AWS CodeCommit. By running our tests against the proposed changes prior to merging them into the master branch, we can ensure a high level of quality early on, catch any potential issues, and boost the confidence of the developer in relation to their changes. In this way, you can start validating your pull requests in AWS CodeCommit by utilizing AWS Lambda and AWS CodeBuild to automatically trigger builds and tests of your development branches.

We can also use services such as AWS CodePipeline for visualizing and creating our pipeline, and automatically building and deploying merged code that has met the validation bar for pull requests.

The following diagram shows the workflow of a pull request. The AWS CodeCommit repository contains two branches, the master branch that contains approved code, and the development branch, where changes to the code are developed. In this workflow, a pull request is created with the new code in the development branch, which the developer wants to merge into the master branch. The creation of the pull request is an event detected by AWS CloudWatch. This event will start two separate actions:
• It triggers an AWS Lambda function that will post an automated comment to the pull request that indicates a build to test the changes is about to begin.
• It also triggers an AWS CodeBuild project that will build and validate those changes.

When the build completes, AWS CloudWatch detects that event. Another AWS Lambda function posts an automated comment to the pull request with the results of the build and a link to the build logs. Based on this automated testing, the developer who opened the pull request can update the code to address any build failures, and then update the pull request with those changes. Those updates will be built, and the build results are then posted to the pull request as a comment.
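
The CloudWatch Events rule that detects these pull request events is created for you by the CloudFormation templates described later, but a sketch of the equivalent AWS CLI call shows the event pattern involved. The rule name is a placeholder, and the targets (the Lambda function and the CodeBuild project) would still need to be attached separately with put-targets.

# Match pull request creation and source branch updates in CodeCommit
aws events put-rule \
  --name codecommit-pull-request-events \
  --event-pattern '{
    "source": ["aws.codecommit"],
    "detail-type": ["CodeCommit Pull Request State Change"],
    "detail": { "event": ["pullRequestCreated", "pullRequestSourceBranchUpdated"] }
  }'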

Let’s show how this works in a specific example project. This project has its own set of tasks defined in the build specification file that will execute and validate this specific pull request. The buildspec.yml for our example AWS CloudFormation template contains the following code:

version: 0.2

phases:
  install:
    commands:
      - pip install cfn-lint
  build:
    commands:
      - cfn-lint --template ./template.yaml --regions $AWS_REGION
      - aws cloudformation validate-template --template-body file://$(pwd)/template.yaml
artifacts:
  files:
    - '*'

In this example, we install cfn-lint, which performs various checks against our template. We also run the AWS CloudFormation validate-template command via the AWS CLI.

Once the code included in the pull request has been built, AWS CloudWatch detects the build complete event and passes along the outcome to a Lambda function that will update the specific commit with a comment that notifies the users of the results. It also includes a link to build logs in AWS CodeBuild. This process repeats any time the pull request is updated. For example, if an initial pull request was opened but failed the set of tests associated with the project, the developer might fix the code and make an update to the currently opened pull request. This will in turn trigger the function to run again and update the comments section with the test results.
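
Inside the Lambda function, posting that comment comes down to a single CodeCommit API call. A sketch of the equivalent AWS CLI invocation follows; the shell variables are placeholders for values the function reads from the CloudWatch event and the CodeBuild build result.

# Post a comment on the pull request (placeholder variables)
aws codecommit post-comment-for-pull-request \
  --pull-request-id "$PULL_REQUEST_ID" \
  --repository-name "$REPOSITORY_NAME" \
  --before-commit-id "$DESTINATION_COMMIT_ID" \
  --after-commit-id "$SOURCE_COMMIT_ID" \
  --content "Build FAILED. See the CodeBuild logs for details."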

Testing and validating pull requests before they can be merged into production code is a common approach and a best practice when working with CI/CD. Once the pull request is approved and merged into the production branch, it is also a good CI/CD practice to automatically build, test, and deploy that code. This is why we’ve structured this into two different AWS CloudFormation stacks (both can be found in our GitHub repository). One contains a base layer template that contains the resources you would only need to create once, in this case the AWS Lambda functions that test and update pull requests. The second stack includes an example of a CI/CD pipeline defined in AWS CloudFormation that imports the resources from the base layer stack.

We start by creating our base layer, which creates the Lambda functions and sets up AWS IAM roles that the functions will use to interact with the various AWS services. Once this stack is in place, we can add one or more pipeline stacks which import some of the values from the base layer. The pipeline will automatically build any changes merged into the master branch of the repository. Once any pipeline stack is complete, we have an AWS CodeCommit repository, AWS CodeBuild project, and an AWS CodePipeline pipeline set up and ready for deployment.

We can now push some code into our repository on the master branch to trigger a run-through of our pipeline.

In this example we will use the following AWS CloudFormation template. This template creates a single Amazon S3 bucket. This template will be the artifact that we push through our CI/CD pipeline and deploy to our stages.

AWSTemplateFormatVersion: '2010-09-09'
Description: 'A sample CloudFormation template that we can use to validate in our pipeline'
Resources:
  S3Bucket:
    Type: 'AWS::S3::Bucket'

Once this code is tested and approved in a pull request, it will be merged into the production branch as part of the pull request approve and merge process. This will automatically start our pipeline in AWS CodePipeline, and will run through to the stages defined for it. For example:

Now we can make some changes to our code base in the development branch and open a pull request. First, edit the file to make a typo in our CloudFormation template so we can test the validation.

AWSTemplateFormatVersion: '2010-09-09'
Metadata: 
  License: Apache-2.0
Description: 'A sample CloudFormation template that we can use to validate in our pipeline'
Resources:
  S3Bucket:
    Type: 'AWS::S3::Bucket1'

Notice that we changed the S3 bucket to be AWS::S3::Bucket1. This doesn’t exist, so cfn-lint will return a failure when it attempts to validate the template.

Now push this change into our development branch in the AWS CodeCommit repository and open the pull request against the production (master) branch.

From there, navigate to the comments section of the pull request. You should see a status update that the pull request is currently building.

Once the build is complete, you should see feedback on the outcome of the build and its results given to us as a comment.

Choose the Logs link to view details about the failure. We can see that we were able to catch an error related to linting rules failing.

We can remedy this and update our pull request with the updated code. Upon doing so, we can see another build has been kicked off by looking at the comments of the pull request. Once this has been completed we can confirm that our pull request has been validated as desired and our tests have passed.

Once this pull request is approved and merged to master, this will start our pipeline in AWS CodePipeline, which will take this code change through the specified stages.

 

Working with AWS Lambda and Lambda Layers in AWS SAM

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/working-with-aws-lambda-and-lambda-layers-in-aws-sam/

The introduction of serverless technology has enabled developers to shed the burden of managing infrastructure and concentrate on their application code. AWS Lambda has taken on that management by providing isolated, event-driven compute environments for the execution of application code. To use a Lambda function, a developer just needs to package their code and any dependencies into a zip file and upload that file to AWS. However, as serverless applications get larger and more functions are required for those applications, there is a need for the ability to share code across multiple functions within the application.

To meet this need, AWS released Lambda layers, providing a mechanism to externally package dependencies that can be shared across multiple Lambda functions. Lambda layers reduce lines of code, shrink the size of application artifacts, and simplify dependency management. Along with the release of Lambda layers, AWS also released support for layers in the AWS Serverless Application Model (SAM) and the AWS SAM command line interface (CLI). SAM is a template specification that enables developers to define a serverless application in clean and simple syntax. The SAM CLI is a command line tool that operates on SAM templates and application code. SAM can now define Lambda layers with the AWS::Serverless::LayerVersion type. The SAM CLI can build and test your layers locally as well as package, deploy, and publish your layers for public consumption.

How layers work

To understand how SAM CLI supports layers, you need to understand how layers work on AWS. When a Lambda function configured with a Lambda layer is executed, AWS downloads any specified layers and extracts them to the /opt directory on the function execution environment. Each runtime then looks for a language-specific folder under the /opt directory.
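
For a Node.js function, for example, a layer that ships npm packages is expected to unpack to /opt/nodejs/node_modules, which the runtime includes in its module resolution path. A minimal sketch of the extracted layout (using the package from later in this post) looks like this:

/opt
└── nodejs
    └── node_modules
        └── temp-units-conv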

Lambda layers can come from multiple sources. You can create and upload your own layers for sharing, you can implement an AWS managed layer such as SciPy, or you can grab a third-party layer from an APN Partner or another trusted developer. The following diagram shows how layers work with multiple sources.

AWS Lambda Layers diagram

How layers work in the AWS SAM CLI

To support Lambda layers, SAM CLI replicates the AWS layer process locally by downloading all associated layers and caching them on your development machine. This happens the first time you run sam local invoke or the first time you execute your Lambda functions using sam local start-lambda or sam local start-api.

Two specific flags in SAM CLI are helpful when you’re working with Lambda layers locally. To specify where the layer cache should be located, pass the --layer-cache-basedir flag, followed by your desired cache directory. To force SAM CLI to rebuild the layer cache, pass the --force-image-build flag.
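
As a quick illustration, the flags might be used as follows; the cache directory path here is only an example.

$ sam local start-api --layer-cache-basedir ~/.sam-layer-cache
$ sam local invoke --force-image-build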

Time for some code

Now you’re going to create a simple application that does some temperature conversions using a simple library named temp-units-conv. After the app is running, you move the dependencies to Lambda layers using SAM. Finally, you add a layer managed by an AWS Partner Network Partner, Epsagon, to enhance the monitoring of the Lambda function.

Creating a serverless application

To create a serverless application, use the SAM CLI. If you don’t have SAM CLI installed, see Installing the AWS SAM CLI in the AWS Serverless Application Model Developer Guide.

  1. To initialize a new application, run the following command.
    $ sam init -r nodejs8.10

    This creates a simple node application under the directory sam-app that looks like this.

    $ tree sam-app
    sam-app
    ├── README.md
    ├── hello-world
    │   ├── app.js
    │   ├── package.json
    │   └── tests
    └── template.yaml

    The template.yaml file is a SAM template describing the infrastructure for the application, and the app.js file contains the application code.

  2. To install the dependencies for the application, run the following command from within the sam-app/hello-world directory.
    $ npm install temp-units-conv
  3. The application is going to perform temperature scale conversions for Celsius, Fahrenheit, and Kelvin using the following code. In a code editor, open the file sam-app/hello-world/app.js and replace its contents with the following.
    const tuc = require('temp-units-conv');
    let response;
    
    const scales = {
        c: "celsius",
        f: "fahrenheit",
        k: "kelvin"
    }
    
    exports.lambdaHandler = async (event) => {
        let conversion = event.pathParameters.conversion
        let originalValue = event.pathParameters.value
        let answer = tuc[conversion](originalValue)
        try {
            response = {
                'statusCode': 200,
                'body': JSON.stringify({
                    source: scales[conversion[0]],
                    target: scales[conversion[2]],
                    original: originalValue,
                    answer: answer
                })
            }
        } catch (err) {
            console.log(err);
            return err;
        }
    
        return response
    };
  4. Update the SAM template. Open the sam-app/template.yaml file. Replace the contents with the following. This is a YAML file, so spacing and indentation are important.
    AWSTemplateFormatVersion: '2010-09-09'
    Transform: AWS::Serverless-2016-10-31
    Description: sam app
    Globals:
        Function:
            Timeout: 3
            Runtime: nodejs8.10
    
    Resources:
        TempConversionFunction:
            Type: AWS::Serverless::Function 
            Properties:
                CodeUri: hello-world/
                Handler: app.lambdaHandler
                Events:
                    HelloWorld:
                        Type: Api
                        Properties:
                            Path: /{conversion}/{value}
                            Method: get

    This change dropped out some comments and output parameters and updated the function resource to TempConversionFunction. The primary change is the Path: /{conversion}/{value} line. This enables you to use path mapping for our conversion type and value.

  5. Okay, now you have a simple app that does temperature conversions. Time to spin it up and make sure that it works. In the sam-app directory, run the following command.
    $ sam local start-api

  6. Using curl or your browser, navigate to the address output by the previous command with a conversion and value attached. For reference, c = Celsius, f = Fahrenheit, and k = Kelvin. Use the pattern of source unit, the digit 2, and target unit (for example, f2c), followed by the temperature that you want to convert.
    $ curl http://127.0.0.1:3000/f2c/45
    {"source":"fahrenheit","target":"celsius","original":"45","answer":7.222222222222222}

Deploying the application

Now that you have a working application that you have tested locally, deploy it to AWS. This enables the application to run on Lambda, providing a public endpoint that you can share with others.

  1. Create a resource bucket. The resource bucket gives you a place to upload the application so that AWS CloudFormation can access it when you run the deploy process. Run the following command to create a resource bucket.
    $ aws s3api create-bucket --bucket <your unique bucket name>
  2. Use SAM to package the application. From the sam-app directory, run the following command.
    $ sam package --template-file template.yaml --s3-bucket <your bucket> --output-template-file out.yaml
  3. Now you can use SAM to deploy the application. Run the following command from the sam-app folder.
    $ sam deploy --template-file ./out.yaml --stack-name <your stack name> --capabilities CAPABILITY_IAM

Sign in to the AWS Management Console. Navigate to the Lambda console to find your function.

Lambda Console

Now that the application is deployed, you can access it via the API endpoint. In the Lambda console, click on the API Gateway option and scroll down. You will find a link to your API Gateway endpoint.

API Gateway Endpoint

Using that value, you can test the live application. Your endpoint will be different from the one in the following image.

Live Demo

Let’s take a moment to talk through the structure of our new application. Because you installed temp-units-conv, there is a dependency folder named sam-app/hello-world/node_modules that you need to include when you upload the application.

$ tree sam-app
sam-app
├── README.md
├── hello-world
│   ├── app.js
│   ├── node_modules
│   ├── package-lock.json
│   ├── package.json
│   └── tests
└── template.yaml

If you’re a Node.js developer, you could use something like webpack to minimize your uploads. However, this requires a processing step to pack your code, and it still forces you to upload unchanging, static code on every update. To simplify this, create a layer to separate the dependencies from the application code.

Creating a layer

To create and manage the dependency layer, you need to update your directory structure a bit.

$ tree sam-app
sam-app
├── README.md
├── dependencies
│   └── nodejs
│       └── package.json
├── hello-world
│   ├── app.js
│   └── tests
│       └── unit
└── template.yaml

In the root, create a new directory named dependencies. Under that directory, create a second directory named nodejs. This is the structure required for layers to be injected into a Lambda function. Next, move the package.json file from the hello-world directory to the dependencies/nodejs directory. Finally, clean up the hello-world directory by deleting the node_modules folder and the package-lock.json file.
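
One way to make these changes from the command line is sketched below; run the commands from the sam-app directory and adjust the paths if your layout differs.

$ mkdir -p dependencies/nodejs
$ mv hello-world/package.json dependencies/nodejs/
$ rm -rf hello-world/node_modules hello-world/package-lock.json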

Before you gather your dependencies, edit the sam-app/dependencies/nodejs/package.json file. Replace the entire contents with the following.

{
  "dependencies": {
    "temp-units-conv": "^1.0.2"
  }
}

Now that you have the package file cleaned up, install the required packages into the dependencies directory. From the sam-app/dependencies/nodejs directory, run the following command.

$ npm install

You now have a node_modules directory under the nodejs directory. With this in place, you have everything you need to create your first layer using SAM.

The next step is to update the AWS SAM template. Replace the contents of your sam-app/template.yaml file with the following.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: sam app
Globals:
    Function:
        Timeout: 3
        Runtime: nodejs8.10

Resources:
    TempConversionFunction:
        Type: AWS::Serverless::Function 
        Properties:
            CodeUri: hello-world/
            Handler: app.lambdaHandler
            Layers:
              - !Ref TempConversionDepLayer
            Events:
                HelloWorld:
                    Type: Api
                    Properties:
                        Path: /{conversion}/{value}
                        Method: get

    TempConversionDepLayer:
        Type: AWS::Serverless::LayerVersion
        Properties:
            LayerName: sam-app-dependencies
            Description: Dependencies for sam app [temp-units-conv]
            ContentUri: dependencies/
            CompatibleRuntimes:
              - nodejs6.10
              - nodejs8.10
            LicenseInfo: 'MIT'
            RetentionPolicy: Retain

There are two changes to the template. The first is a new resource named TempConversionDepLayer, which defines the new layer and points to the dependencies folder as the code source for the layer. The second is the addition of the Layers parameter in the TempConversionFunction resource. The single layer entry references the layer that is being created in the template file.

With that final change, you have separated the application code from the dependencies. Try out the application and see if it still works. From the sam-app directory, run the following command.

$ sam local start-api

If all went well, you can open your browser back up and try another conversion.

One other thing to note here is that you didn’t have to change your application code at all. As far as the application is concerned, nothing has changed. However, under the hood, SAM CLI is doing a bit of magic. SAM CLI creates an image of the layer and caches it locally. It then makes that layer available in the /opt directory on the container used to execute the Lambda function locally.

Using layers from APN Partners and third parties

So far, you have used a layer of your own creation. Now you’re going to branch out and add a layer managed by an APN Partner, Epsagon, which provides a tool to help with monitoring and troubleshooting your serverless applications. If you want to try this demo, you can sign up for a free trial on their website. After you create an account, you need to get the Epsagon token from the Settings page of your dashboard.

Epsagon Settings

  1. Add Epsagon layer reference. Edit the sam-app/template.yaml file. Update the Layers section of the TempConversionFunction resource to the following.
    Layers:
      - !Ref TempConversionDepLayer
      - arn:aws:lambda:us-east-1:066549572091:layer:epsagon-node-layer:1
    

    Note: This demo uses us-east-1 for the AWS Region. If you plan to deploy your Lambda function to a different Region, update the Epsagon LayerVersion Amazon Resource Name (ARN) accordingly. For more information, see the Epsagon blog post on layers.

  2. To use the Epsagon library in our code, you need to add or modify nine lines of code. You reference and initialize the library, wrap the handler with the Epsagon library, and modify the output. Open the sam-app/hello-world/app.js file and replace the entire contents with the following. The changes are highlighted. Be sure to update 1122334455 with your token from Epsagon.
    const tuc = require('temp-units-conv');
    const epsagon = require('epsagon');
    epsagon.init({
        token: '1122334455',
        appName: 'layer-demo-app',
        metadataOnly: false, // Optional, send more trace data
    });
    
    let response;
    
    const scales = {
        c: "celsius",
        f: "fahrenheit",
        k: "kelvin"
    }
    
    exports.lambdaHandler = epsagon.lambdaWrapper((event, context, callback) => {
        let conversion = event.pathParameters.conversion
        let originalValue = event.pathParameters.value
        let answer = tuc[conversion](originalValue)
        try {
            response = {
                'statusCode': 200,
                'body': JSON.stringify({
                    source: scales[conversion[0]],
                    target: scales[conversion[2]],
                    original: originalValue,
                    answer: answer
                })
            }
        } catch (err) {
            console.log(err);
            return err;
        }
    
        callback(null, response)
    });

Test the change to make sure that everything still works. From the sam-app directory, run the following command.

$ sam local start-api

Use curl to test your code.

$ curl http://127.0.0.1:3000/k2c/100

Your answer should be the following.

{"source":"kelvin","target":"celsius","original":"100","answer":-173.14999999999998}

Your Epsagon dashboard should display traces from your Lambda function, as shown in the following image.

Epsagon Dashboard

Deploying the application with layers

Now that you have a functioning application that uses Lambda layers, you can package and deploy it.

  1. To package the application, run the following command.
    $ sam package --template-file template.yaml --s3-bucket <your bucket> --output-template-file out.yaml
  2. To deploy the application, run the following command.
    $ sam deploy --template-file ./out.yaml --stack-name <your stack name> --capabilities CAPABILITY_IAM

The Lambda console for your function has updated, as shown in the following image.

Lambda Console

Also, the dependency code isn’t in your code environment in the Lambda function.

Lambda Console Code

There you have it! You just deployed your Lambda function and your dependencies layer for that function. It’s important to note that you did not publish the Epsagon layer. You just told AWS to grab their layer and extract it in to your function’s execution environment. The following image shows the flow of this process.

Epsagon Layer

Options for managing layers

You have several options for managing your layers through AWS SAM.

First, following the pattern you just walked through releases a new version of your layer each time you deploy your application. If you remember, one of the advantages of using layers is not having to upload the dependencies each time. One option to avoid this is to keep your dependencies in a separate template and deploy them only when the dependencies have changed.

Second, this pattern always uses the latest build of dependencies. If for some reason you want to get a specific version of your dependencies, you can. After you deploy the application for the first time, you can run the following command.

$ aws lambda list-layer-versions --layer-name sam-app-dependencies

You should see a response like the following.

{
    "LayerVersions": [
        {
            "LayerVersionArn": "arn:aws:lambda:us-east-1:5555555:layer:sam-app-dependencies:1",
            "Version": 1,
            "Description": "Dependencies for sam app",
            "CreatedDate": "2019-01-08T18:04:51.833+0000",
            "CompatibleRuntimes": [
                "nodejs6.10",
                "nodejs8.10"
            ],
            "LicenseInfo": "MIT"
        }
    ]
}

The critical information here is the LayerVersionArn. Returning to the sam-app/template.yaml file, you can change the Layers section of the TempConversionFunction resource to use this version.

Layers:
  - arn:aws:lambda:us-east-1:5555555:layer:sam-app-dependencies:1
  - arn:aws:lambda:us-east-1:066549572091:layer:epsagon-node-layer:1

Conclusion

This blog post demonstrates how AWS SAM can manage Lambda layers via AWS SAM templates. It also demonstrates how the AWS SAM CLI creates a local development environment that provides layer support without any changes in the application.

Developers are often taught to think of code in an object-oriented manner and to code in a DRY way (don’t repeat yourself). For developers of serverless applications, these practices remain true. As serverless applications grow in size and require more Lambda functions, using Lambda layers provides an efficient mechanism to reuse code and libraries. Layers also reduce the size of your upload packages, making iterations faster.

We’re excited to see what you do with layers and to hear how AWS SAM is helping you. As always, we welcome your feedback.

Now go code!

Samsung Builds a Secure Developer Portal with Fargate and ECR

Post Syndicated from AWS Admin original https://aws.amazon.com/blogs/architecture/samsung-builds-a-secure-developer-portal-with-fargate-and-ecr/

This post was provided by Samsung.

The Samsung developer portal (Samsung Developers) is Samsung’s online portal built to serve technical documents, the Developer blog, and API guides to developers, IT managers, and students interested in building applications with Samsung products. Samsung Developers consists of three different portals:

  • SmartThings portal, which serves IoT developers, is our oldest portal. We developed it on Amazon Elastic Container Service (ECS) but have now migrated it to AWS Fargate
  • Bixby portal, which serves Bixby capsule developers, was developed using AWS Fargate
  • Rich Communication Services (RCS), which serves the new standard of mobile messaging, was also developed using AWS Fargate

Samsung Electronics Cloud Operation Group (SECOG) unveiled these three portals at Samsung Developer Conference 2017 and 2018.

Samsung developed the SmartThings portal on ECS and had an overall good experience using it. We found that ECS provided the appropriate level of abstraction while also offering control of the underlying instances. However, when we learned about AWS Fargate at re:Invent 2017, we wanted to try it out. As an Amazon ECS customer, we found a lot to like about Fargate. It provided significant operational efficiency while also eliminating the need to manage servers and clusters, meaning we could just focus on running containers to release new features.

In 2018, our engineering team began migrating all of our systems to Fargate. Because Fargate exposed the same APIs and endpoints that ECS did, the migration experience was extremely smooth and we immediately experienced improvements in operational efficiency. Before Fargate, Samsung typically had administrators and operators dedicated to managing the web services for the portal. However, as we migrated to Fargate, we were able to easily eliminate the need for an administrator, saving operational cost while improving development efficiency. Now, our operations and administration teams are focused more on elaborate logging and monitoring activities, further improving overall service reliability, security, and performance.

The Samsung developer portal is built using a microservice-based architecture, and provides technical documents, API docs, and support channels to our customers. To serve these features, the portal requires frequent updates to a number of different Fargate services. Technical writers who publish new content every day initiate these updates. To meet these business requirements, Samsung Electronics Cloud Operation Group (SECOG) and Technology Partner (TecAce) researched services that were agile and efficient and could be run with minimal operational overhead. When they learned about Fargate, they were interested in doing a proof of concept and, based on its result, were convinced that Fargate could meet their needs.

Service Key Requirements

As we began our migration to Fargate, we realized that the portal had to comply with several key requirements standardized by SECOG and InfoSec. These requirements are:

  • Security: the service operations team should have the ability to control every security factor.
  • Scalability: the service is aimed at Samsung developers using Samsung products in public, so it must be capable of handling traffic surges.
  • Easy to deploy: technical documents can be pushed to the live environment easily, giving technical writers the ability to make edits quickly.
  • Controllability: the service should be able to control container options such as port mapping and memory size.

As we dove deeper into AWS Fargate, the SECOG and InfoSec teams were satisfied that Fargate could deliver on all these requirements.

Build and Deploy Process

SECOG and TecAce decided to use AWS Fargate and Amazon Elastic Container Registry (ECR) service to meet the key requirements of the developer portal.

Figure 1: Architecture drawing

Figure 1: Architecture drawing

The System Architecture is very simple. When we release new features or update documents, we upload new container images to ECR then we publish our code to production. Each business application is designed with the combination of Application Load Balancer (ALB), Fargate, and Route 53.

Easy Fargate

After using Fargate, Samsung’s business owners were extremely satisfied with the choice. Samsung Developers is operated and configured by multiple globally distributed teams with development, operations, and QA roles and responsibilities. Each team needs to deploy an individual environment for test. Before Fargate, we needed considerable engineering and developer bandwidth to operate the web services infrastructure. However, Fargate simplified this process. Each team only needs to create a new container image and deploy it to ECR. The image is then deployed to the test environment on Fargate. With this process, we were able to greatly reduce the time our developers and operators spent managing and configuring this infrastructure.

With Fargate, we are able to deploy more often to production and teams are able to handle additional Samsung products within Samsung Developers. Additionally, we don’t have to worry about deploying and creating new images. We simply create a new revision, setting the container’s memory and port. Then, we select our Fargate cluster after determining the compute capacity needed.

The compute capacity of the Fargate services can be easily scaled out using Auto Scaling, so all deployment tasks take only a few minutes. Additionally, there is no cluster managed by a system administrator or operator, and there are no EC2 instances and no Docker Swarm to maintain for the services. This ensures that we can focus on the features of Samsung Developers and improve the end-customer experience.

Currently, when an environment is deployed and served at Samsung Developers, Samsung monitors its health with alarms based on Amazon CloudWatch metrics. In addition, we have easily achieved the required availability and reliability for our portal while reducing monthly costs by approximately 44.5% (compute cost only).

Because of Samsung’s experience with Fargate, we have decided to migrate additional services from ECS to Fargate. Overall, our teams have had a great experience working with Fargate. The level of automation Fargate provides helps us move faster while also helping us become more economical with our development and operations resources. We felt that getting started with Fargate can take some time; however, once the environment is set up, we were able to achieve high levels of agility and scalability with Fargate.

About Samsung

Samsung is a South Korean multinational conglomerate headquartered in Samsung Town, Seoul. It comprises numerous affiliated businesses, most of them united under the Samsung brand, and is the largest South Korean business conglomerate.

How to enable secure access to Kibana using AWS Single Sign-On

Post Syndicated from Remek Hetman original https://aws.amazon.com/blogs/security/how-to-enable-secure-access-to-kibana-using-aws-single-sign-on/

Amazon Elasticsearch Service (Amazon ES) is a fully managed service to search, analyze, and visualize data in real-time. The service offers integration with Kibana, an open-source data visualization and exploration tool that lets you perform log and time-series analytics and application monitoring.

Many enterprise customers who want to use these capabilities find it challenging to secure access to Kibana. Kibana users have direct access to data stored in Amazon ES—so it’s important that only authorized users have access to Kibana. Data stored in Amazon ES can also have different classifications. For example, you might have one domain that stores confidential data and another that stores public data. In this case, securing access requires you not only to prevent unauthorized users from accessing the data but also to grant different groups of users access to different data classifications.

In this post, I’ll show you how to secure access to Kibana through AWS Single Sign-On (AWS SSO) so that only users authenticated to Microsoft Active Directory can access and visualize data stored in Amazon ES. AWS SSO uses standard identity federation via SAML, similar to Microsoft ADFS or PingFederate. AWS SSO integrates with AWS Managed Microsoft AD, or with Active Directory hosted on-premises or on an EC2 instance through AD Connector, which means that your employees can sign in to the AWS SSO user portal using their existing corporate Active Directory credentials. In addition, I’ll show you how to map users between an Amazon ES domain and a specific Active Directory security group so that you can limit who has access to a given Amazon ES domain.

Prerequisites and assumptions

You need the following for this walkthrough:

Solution overview

The architecture diagram below illustrates how the solution will authenticate users into Kibana:
 

Figure 1: Architectural diagram

Figure 1: Architectural diagram

  1. The user requests access to Kibana
  2. Kibana sends an HTML form back to the browser with a SAML request for authentication from Cognito. The HTML form is automatically posted to Cognito. The user is then prompted to select AWS SSO, and the authentication request is passed to AWS SSO.
  3. AWS SSO sends a challenge to the browser for credentials
  4. The user logs in to AWS SSO. AWS SSO authenticates the user against AWS Directory Service. AWS Directory Service may in turn authenticate the user against an on-premises Active Directory.
  5. AWS SSO sends a SAML response to the browser
  6. The browser POSTs the response to Cognito. Amazon Cognito validates the SAML response to verify that the user has been successfully authenticated and then passes the information back to Kibana.
  7. Access to Kibana and Elasticsearch is granted

Deployment and configuration

In this section, I’ll show you how to deploy and configure the security aspects described in the solution overview.

Amazon Cognito authentication for Kibana

First, I’m going to highlight some initial configuration settings for Amazon Cognito and Amazon ES. I’ll show you how to create a Cognito user pool, a user pool domain, and an identity pool, and then how to configure Kibana authentication under Elasticsearch. For each of the commands, remember to replace the placeholders with your own values.

If you need more details on how to set up Amazon Cognito authentication for Kibana, please refer to the service documentation.

  1. Create an Amazon Cognito user pool with the following command:

    aws cognito-idp create-user-pool --pool-name <pool name, for example “Kibana”>

    From the output, copy down the user pool id. You’ll need to provide it in a couple of places later in the process.

    
            ...
            "CreationDate": 1541690691.411,
            "EstimatedNumberOfUsers": 0,
            "Id": "us-east-1_0azgJMX31",
            "LambdaConfig": {}
            ...
            

  2. Create a user pool domain:

    aws cognito-idp create-user-pool-domain --domain <domain name> --user-pool-id <pool id created in step 1>

    The user pool domain name MUST be the same as your Amazon Elasticsearch domain name. If you receive an error that “domain already exists,” it means the name is already in use and you must choose a different name.

  3. Create your Amazon Cognito federated identities:

    aws cognito-identity create-identity-pool --identity-pool-name <identity pool name, for example Kibana> --allow-unauthenticated-identities

    To make this command work, you have to temporarily allow unauthenticated access by adding --allow-unauthenticated-identities. Unauthenticated access will be removed by Amazon Elasticsearch when you enable Kibana authentication in the next step.

  4. Create an Amazon Elasticsearch domain. To do so, from the AWS Management Console, navigate to Amazon Elasticsearch and select Create a new domain.
    1. Make sure that the value you enter under Elasticsearch domain name matches the Cognito user pool domain you created in step two.
    2. Under Kibana authentication, complete the form with the following values, as shown in the screenshot:
      • For Cognito User Pool, enter the name of the pool you created in step one.
      • For Cognito Identity Pool, enter the identity pool you created in step three.
         
        Figure 2: Enter the identity you created in step three

        Figure 2: Enter the identity you created in step three

  5. Now you’re ready to assign IAM roles to your identity pool. These roles are saved with your identity pool, and whenever Cognito receives a request to authorize a user, it automatically uses these roles.
    1. From the AWS Management Console, go to Amazon Cognito and select Manage Identity Pools.
    2. Select the identity pool you created in step three.
    3. You should receive the following message: You have not specified roles for this identity pool. Click here to fix it. Follow the link.
       
      Figure 3: Follow the "Click here to fix it" link

      Figure 3: Follow the “Click here to fix it” link

    4. Under Edit identity pool, next to Unauthenticated role, select Create new role.
    5. Select Allow and save your changes.
    6. Next to Authenticated role, select Create new role.
    7. Select Allow and save your changes.
  6. Finally, modify the Amazon Elasticsearch access policy:
    1. From the AWS Management Console, go to AWS Identity and Access Management (IAM).
    2. Search for the authenticated role you created in step five and copy the role ARN.
    3. From the management console, go to Amazon Elasticsearch Service, and then select the domain you created in step four.
    4. Select Modify access policy and add the following policy (replace the ARN of the authenticated role and the domain ARN with your own values):
      
                      {
                          "Effect": "Allow",
                          "Principal": {
                              "AWS": "<ARN of Authenticated role>"
                          },
                          "Action": "es:ESHttp*",
                          "Resource": "<Domain ARN/*>"
                      }
                      

      Note: For more information about the Amazon Elasticsearch Service access policy visit: https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/es-ac.html
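
For reference, a complete resource-based access policy with this statement filled in might look like the following sketch; the account ID, role name, Region, and domain name are placeholders you would replace with your own values.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/Cognito_Kibana_Auth_Role"
      },
      "Action": "es:ESHttp*",
      "Resource": "arn:aws:es:us-east-1:111122223333:domain/my-kibana-domain/*"
    }
  ]
}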

Configuring AWS Single Sign-On

In this section, I’ll show you how to configure AWS Single Sign-On. In this solution, AWS SSO is used not only to integrate with Microsoft AD but also as a SAML 2.0 identity federation provider. SAML 2.0 is an industry standard used for securely exchanging SAML assertions that pass information about a user between a SAML authority (in this case, Microsoft AD), and a SAML consumer (in this case, Amazon Cognito).

Add Active Directory

  1. From the AWS Management Console, go to AWS Single Sign-On.
  2. If this is the first time you’re configuring AWS SSO, you’ll be asked to enable AWS SSO. Follow the prompt to do so.
  3. From the AWS SSO Dashboard, select Manage your directory.
     
    Figure 4: Select Manage your directory

    Figure 4: Select Manage your directory

  4. Under Directory, select Change directory.
     
    Figure 5: Select "Change directory"

    Figure 5: Select “Change directory”

  5. On the next screen, select Microsoft AD Directory, select the directory you created under AWS Directory Service as a part of prerequisites, and then select Next: Review.
     
    Figure 6: Select "Microsoft AD Directory" and then select the directory you created as a part of the prerequisites

    Figure 6: Select “Microsoft AD Directory” and then select the directory you created as a part of the prerequisites

  6. On the Review page, confirm that you want to switch from an AWS SSO directory to an AWS Directory Service directory, and then select Finish.
    1. Once setup is complete, select Proceed to the directory.

Add application

  1. From AWS SSO Dashboard, select Applications and then Add a new application. Select Add a custom SAML 2.0 application.
     
    Figure 7: Select "Application" and then "Add a new application"

    Figure 7: Select “Application” and then “Add a new application”

  2. Enter a display name for your application (for example, “Kibana”) and scroll down to Application metadata. Select the link that reads If you don’t have a metadata file, you can manually type your metadata values.
  3. Enter the following values, being sure to replace the placeholders with your own values (example values are shown after this list):
    1. Application ACS URL: https://<Elasticsearch domain name>.auth.<region>.amazoncognito.com/saml2/idpresponse
    2. Application SAML audience: urn:amazon:cognito:sp:<user pool id>
  4. Select Save changes.
     
    Figure 8: Select "Save changes"

    Figure 8: Select “Save changes”
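
For illustration only, using the example user pool ID shown earlier and a hypothetical domain name in us-east-1, the completed values might look like this:

Application ACS URL:       https://my-kibana-domain.auth.us-east-1.amazoncognito.com/saml2/idpresponse
Application SAML audience: urn:amazon:cognito:sp:us-east-1_0azgJMX31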

Add attribute mappings

Switch to the Attribute mappings tab. Next to Subject, enter ${user:name} and select unspecified under Format, as shown in the following screenshot. Then select Save changes.
 

Figure 9: Enter "${user:name}" and select "Unspecified"

Figure 9: Enter “${user:name}” and select “Unspecified”

For more information about attribute mappings visit: https://docs.aws.amazon.com/singlesignon/latest/userguide/attributemappingsconcept.html

Grant access to Kibana

To manage who has access to Kibana, switch to the Assigned users tab and select Assign users. Add individual users or groups.

Download SAML metadata

Next, you’ll need to download the AWS SSO SAML metadata. The SAML metadata contains information such as the SSO entity ID, public certificate, attributes schema, and other information that’s necessary for Cognito to federate with a SAML identity provider. To download the metadata .xml file, switch to the Configuration tab and select Download metadata file.
 

Figure 10: Select "Download metadata file"

Figure 10: Select “Download metadata file”

Adding an Amazon Cognito identity provider

The last step is to add the identity provider to the user pool.

  1. From the AWS Management Console, go to Amazon Cognito.
    1. Select Manage User Pools, and then select the user pool you created in the previous section.
    2. From the left side menu, under Federation, select Identity providers, and then select SAML.
    3. Select Select file, and then select the AWS SSO metadata .xml file you downloaded in the previous step.
       
      Figure 11: Select “Select file” and then select the AWS SSO metadata .xml file you downloaded in the previous step

    4. Enter the provider name (for example, “AWS SSO”), and then select Create provider.
  2. From the left side menu, under App integration, select App client settings.
  3. Uncheck Cognito User Pool, check the name of the provider you created in step one, and select Save changes.
     
    Figure 12: Uncheck "Cognito User Pool"

    Figure 12: Uncheck “Cognito User Pool”

At this point, the configuration is finished. When you open the Kibana URL, you should be redirected to AWS SSO and asked to authenticate using your Active Directory credentials. Keep in mind that if the Amazon Elasticsearch domain was created inside a VPC, it won’t be accessible from the internet, only from within the VPC.

Managing multiple Amazon ES domains

In scenarios where different users need access to different Amazon ES domains, the solution would be as follows for each Amazon ES domain:

  1. Create one Active Directory Security Group per Amazon ES domain
  2. Create an Amazon Cognito user pool for each domain
  3. Add new applications to AWS SSO and grant permission to corresponding security groups
  4. Assign users to the appropriate security group

Deleting domains that use Amazon Cognito Authentication for Kibana

To prevent domains that use Amazon Cognito authentication for Kibana from becoming stuck in a configuration state of “Processing,” it’s important that you delete Amazon ES domains before deleting their associated Amazon Cognito user pools and identity pools.

Conclusion

I’ve outlined an approach to securing access to Kibana by integrating Amazon Cognito with AWS SSO and AWS Directory Service. This allows you to narrow the scope of users who have access to each Amazon Elasticsearch domain by configuring separate applications in AWS SSO for each of the domains.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Remek Hetman

Remek is a Senior Cloud Infrastructure Architect with Amazon Web Services Professional Services. He works with AWS financial enterprise customers providing technical guidance and assistance for Infrastructure, Security, DevOps, and Big Data to help them make the best use of AWS services. Outside of work, he enjoys spending time actively, and pursuing his passion – astronomy.

How to Use Cross-Account ECR Images in AWS CodeBuild for Your Build Environment

Post Syndicated from Kausalya Rani Krishna Samy original https://aws.amazon.com/blogs/devops/how-to-use-cross-account-ecr-images-in-aws-codebuild-for-your-build-environment/

AWS CodeBuild now makes it possible for you to access Docker images from any Amazon Elastic Container Registry repository in another account as the build environment. With this feature, AWS CodeBuild allows you to pull any image from a repository to which you have been granted resource-level permissions.

In this blog post, we will show you how to provision a build environment using an image from another AWS account.

Here is a quick overview of the services used in our example:

AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy. It provides a fully preconfigured build platform for most popular programming languages and build tools, including Apache Maven, Gradle, and more.

Amazon Elastic Container Registry (Amazon ECR) is a fully managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images.

We will use a sample Docker image in an Amazon ECR image repository in AWS account B. The CodeBuild project in AWS account A will pull the images from the Amazon ECR image repository in AWS account B.

Prerequisites:

To get started you need:

  • Two AWS accounts (AWS account A and AWS account B).

  • In AWS account B, an image repository in Amazon ECR with the images that you would like to use for your build environment. If you do not have an image repository and a sample image, see Docker Sample in the AWS CodeBuild User Guide.

  • In AWS account A, an AWS CodeCommit repository with a buildspec.yml file and sample code (a minimal buildspec.yml sketch follows this list).

  • Permissions in your Amazon ECR image repository, set up using the following steps, that allow AWS CodeBuild to pull the repository’s Docker image into the build environment.
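
A buildspec.yml for this walkthrough can be very small. The following is only a sketch; the build command is a placeholder for whatever your project actually runs.

version: 0.2
phases:
  build:
    commands:
      - echo "Running the build inside an image pulled from the account B ECR repository"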

To grant CodeBuild permissions to pull the Docker image into the build environment

1.     Open the Amazon ECS console at https://console.aws.amazon.com/ecs/.

2.     Choose the name of the repository you created.

3.     On the Permissions tab, choose Edit JSON policy.

4.     Apply the following policy and save.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CodeBuildAccess",
      "Effect": "Allow",
      "Principal": {
        "AWS": "<arn of the service role>"  
      },
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ]
    }
  ]
}
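
If you prefer the AWS CLI over the console, you can apply the same policy from AWS account B with aws ecr set-repository-policy. The repository name and policy file name below are only examples.

aws ecr set-repository-policy \
    --repository-name <your repository name> \
    --policy-text file://codebuild-access-policy.json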

To use an image from account B and set up a build project in account A

1. Open the AWS CodeBuild console at https://console.aws.amazon.com/codesuite/codebuild/home.

2. Choose Create project.

3. In Project configuration, enter a name and description for the build project.

4. In Source, for Source provider, choose the source code provider type. In this example, we use the AWS CodeCommit repository name.

 

5.  For Environment, we will pull the Docker image from AWS account B and use the image to create the build environment to build artifacts. To configure the build environment, choose Custom Image. For Image registry, choose Amazon ECR. For ECR account, choose Other ECR account.

6.  In Amazon ECR repository URI, enter the URI for the image repository from AWS account B (see the format example after these steps) and then choose Create build project.

7. Go to the build project you just created, and choose Start build. The build execution will download the source code from the AWS CodeCommit repository and provision the build environment using the image retrieved from the image registry.
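
The repository URI mentioned in step 6 follows the standard Amazon ECR format shown below; the account ID, Region, repository name, and tag are hypothetical examples.

<account-B-id>.dkr.ecr.<region>.amazonaws.com/<repository-name>:<tag>
444455556666.dkr.ecr.us-east-1.amazonaws.com/sample-build-image:latest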

Next steps

Now that you have seen how to use cross-account ECR images, you can integrate a build step in AWS CodePipeline and use the build environment to create artifacts and deploy your application. To integrate a build step in your pipeline, see Working with Deployments in AWS CodeDeploy in the AWS CodeDeploy User Guide

If you have any feedback, please leave it in the Comments section below. If you have questions, please start a thread on the AWS CodeBuild forum or contact AWS Support.