
Transforming DevOps at Broadridge on AWS

Post Syndicated from Som Chatterjee original https://aws.amazon.com/blogs/devops/transforming-devops-for-a-fintech-on-aws/

by Tom Koukourdelis (Broadridge – Vice President, Head of Global Cloud Platform Development and Engineering), Sreedhar Reddy (Broadridge – Vice President, Enterprise Cloud Architecture)

We have seen large enterprises in all industry segments meaningfully utilizing AWS to build new capabilities and deliver business value. While doing so, enterprises have to balance existing systems, processes, tools, and culture while innovating at pace with industry disruptors. Broadridge Financial Solutions, Inc. (NYSE: BR) is no exception. Broadridge is a $4 billion global FinTech leader and a leading provider of investor communications and technology-driven solutions to banks, broker-dealers, asset and wealth managers, and corporate issuers.

This blog post explores how we adopted AWS at scale while being secured and compliant, as well as delivering a high degree of productivity for our builders on AWS. It also describes the steps we took to create technical (a cloud solution as a foundation based on AWS) and procedural (organizational) capabilities by leveraging AWS cloud adoption constructs. The improvement in our builder productivity and agility directly contributes to rolling out differentiated business capabilities addressing our customer needs in a timely manner. In this post, we share real-life learnings and takeaways to adopt AWS at scale, transform business and application team experiences, and deliver customer delight.

Background

At Broadridge we have a number of distributed and mainframe systems supporting multiple financial services domains and sub-domains such as post trade, proxy communications, financial and regulatory reporting, portfolio management, and financial operations. The majority of these systems were built and deployed years ago at on-premises data centers all over the US and abroad.

Builder personas at Broadridge are diverse in terms of location, culture, and the technology stack they use to build and support applications (we use a number of front-end JS frameworks; .NET; Java; ColdFusion for web development; ORMs for data entity relational mapping; IBM MQ; Apache Camel for messaging; databases like SQL, Oracle, Sybase, and other open source stacks for transaction management; databases, and batch processing on virtualized and bare metal instances). With more than 200 on-premises distributed applications and mainframe systems across front-, mid-, and back-office ecosystems, we wanted to leverage AWS to improve efficiency and build agility, and to reduce costs. The ability to reach customers at new geographies, reduced time to market, and opportunities to build new business competencies were key parameters as well.

Broadridge’s core tenets for cloud adoption

When AWS adoption within Broadridge attained a critical mass (known as the Foundation stage of adoption), the business and technology leadership teams defined our posture of cloud adoption and shared it with teams across the organization using the following tenets. Enterprises looking to adopt AWS at scale should define similar tenets fit for their organizations in plain language understandable by everyone across the board.

  • Iterate: Because we cannot disrupt ongoing initiatives, we adopted a small, iterative approach of moving workloads to the cloud in waves (rinse and repeat). Long-drawn, capital-intensive big bangs were to be avoided.
  • Fully automate: Starting from infrastructure deployment to application build, test, and release, we decided early on that automation and no-touch deployment are the right approach both to leverage cloud capabilities and to fuel a shift toward a matured DevOps culture.
  • Trust but verify only exceptions: Security and regulatory compliance are paramount for an organization like Broadridge. Guardrails (such as service control policies, managed AWS Config rules, multi-account strategy) and controls (such as PCI, NIST control frameworks) are iteratively developed to baseline every AWS account and AWS resource deployed. Manual security verification of workloads isn’t needed unless an exception is raised. Defense in depth (distancing attack surface from sensitive data and resources using multi-layered security) strategies were to be applied.
  • Go fast; re-hosting is acceptable: Not every workload needs to go through years of rewriting and refactoring before it is deemed suitable for the cloud. Minor tweaking (light touch re-platforming) to go fast (such as on-premises Oracle to RDS for Oracle) is acceptable.
  • Timeliness and small wins are key: Organizations spend large sums of capital to completely rewrite applications and by the time they are done, the business goal and customer expectation will have changed. That leads to material dissatisfaction with customers. We wanted to avoid that by setting small, measurable targets.
  • Cloud fluency: Investment in training and upskilling builders and leaders across the organization (developers, infra-ops, sec-ops, managers, salesforce, HR, and executive leadership) was to be made to build fluency on the cloud.
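
To make the guardrail tenet concrete, the sketch below builds one service control policy document in Python. The helper name, the statement ID, and the choice of denied actions are assumptions for illustration only; they are not Broadridge's actual policies.

```python
import json


def build_deny_scp(sid, actions):
    """Return a service control policy document that denies the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Deny",
                "Action": sorted(actions),
                "Resource": "*",
            }
        ],
    }


# Hypothetical guardrail: member accounts may not stop or delete CloudTrail trails.
cloudtrail_guardrail = build_deny_scp(
    "DenyCloudTrailTampering",
    ["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"],
)

# Serialized form, ready to be attached through AWS Organizations tooling.
cloudtrail_guardrail_json = json.dumps(cloudtrail_guardrail, indent=2)
```

Keeping guardrails as generated documents rather than hand-edited JSON makes them easy to review and version alongside the rest of the automation.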

The first milestone

The first milestone in our adoption journey was synonymous with the Project stage of adoption and had the following characteristics.

A controlled sprawl of shadow IT

We first gave small teams with little to no exposure to critical business functions (such as customer data and SLA-oriented workloads) sandboxes to test out proofs of concept (PoCs) on AWS. We created the cloud sandboxes with least privilege, and added privileges upon request after verification. During this time, our key AWS usage characteristics were:

  • Manual AWS account setup with least privilege
  • Manual IAM role creation with role boundaries and authentication and authorization from the existing enterprise Active Directory
  • Integration with existing Security Information and Event Management (SIEM) tools to audit role sprawl and config changes
  • Proofs of concept only
  • Account tagging for chargeback and tracking purposes
  • No automated build, test, deploy, or integration with existing delivery pipeline
  • Small and definitive timeframes for PoCs with defined goals

A typical AWS environment at this stage will resemble that shown in the following diagram:

Representative AWS usage during first milestone

As shown above, at this time the corporate assets were connected to a highly restrictive AWS environment through VPN. Access to the AWS environment was set up based on AWS identity primitives or IAM roles mapped to and federated with the on-premises Active Directory. There was a single VPC set up for a sandbox account with no egress to the internet. No customer data was hosted on this AWS environment, and the environment was connected to our SIEM of choice.

Early adopters became first educators and mentors

Members of the first teams to carry out proofs of concept on AWS shared learnings with each other and with the leadership team within Broadridge. This helped build communities of practice (CoPs) over time. Initial CoPs established were for networking and security, and were later extended to various practices like Terraform, Chef, and Jenkins.

Tech PMO team within Broadridge as the quasi-central cloud team

Ownership is vital no matter how small the effort and insignificant the impact of risky experimentation. The ownership of account setup, role creation, integration with on-premises AD and SIEM, and oversight to ensure that the experimentation does not pose any risk to the brand led us to build a central cloud team with experienced AWS and infrastructure practitioners. This team created a process for cloud migration with first manual guardrails of allowed and disallowed actions, manual interventions, and checkpoints built in every step.

At this stage, a representative pattern of work products across teams resembles what is shown below.

Work products across teams during initial stages of AWS usage

As the diagram suggests, individual application teams built overlapping—and, in many cases, identical—technical building blocks across the teams. This was acceptable as the teams were experimenting and running PoCs on AWS. In an actual production application delivery, the blocks marked with a * would be considered technical and functional waste—that is, undifferentiated lift which increases the cost of doing business.

The second milestone

In hindsight, this is perhaps the most important milestone in our cloud adoption journey. This step was marked by the following key characteristics:

  • Every new team doing PoCs was rebuilding the same building blocks: This included networking (VPCs and security groups), identity primitives (account, roles, and policies), monitoring (Amazon CloudWatch setup and custom metrics), and compute (images with org-mandated security patches).
  • Teams usually asked the same fundamental first questions: These included questions such as: What is an ideal CIDR block range? How do we integrate with SIEM? How do we spin up web servers on Amazon EC2? How do we secure access to data? How do we set up workload monitoring?
  • Security reviews rarely found new security gaps but added time to the process: A central security group as part of the central cloud team reviewed every new account request and every new service usage request without finding new security gaps when the application team used the baseline guardrails.
  • Manual effort was spent on tagging, chargeback, and other approvals: A portion of the application PoC/minimum viable product (MVP) lifecycle was spent on housekeeping. While housekeeping was necessary, the effort spent was undifferentiated.
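
A question like "What is an ideal CIDR block range?" is the kind of repeated ask that can be answered once in code. The following is a minimal sketch using only the Python standard library; the example ranges and function names are hypothetical and do not reflect Broadridge's address plan.

```python
import ipaddress
from itertools import islice


def plan_subnets(vpc_cidr, new_prefix, count):
    """Split a VPC CIDR block into `count` equally sized subnets."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = list(islice(vpc.subnets(new_prefix=new_prefix), count))
    if len(subnets) < count:
        raise ValueError(f"{vpc_cidr} cannot hold {count} /{new_prefix} subnets")
    return subnets


def overlaps_any(candidate, existing):
    """Return True if the candidate CIDR overlaps any already allocated range."""
    net = ipaddress.ip_network(candidate)
    return any(net.overlaps(ipaddress.ip_network(e)) for e in existing)


# Example: carve four /20 subnets out of a hypothetical /16 VPC range.
example_subnets = [str(s) for s in plan_subnets("10.20.0.0/16", 20, 4)]
```

Checking a requested range against ranges already handed out (`overlaps_any`) matters most when accounts are later connected to shared networking, where overlapping blocks cannot be routed.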

The following diagram represents the efforts for every team during the first phase.

Team wise efforts showing duplicative work

As shown above, every application team spent effort on building nearly the same capabilities before they could begin developing their team-specific application functionality and assets. The common blocks of work are undifferentiated, and the effort they consume varies with the efficiency of each team.

During this step, learnings from the PoCs led us to establish the tenets shared earlier in this post. To address the learnings, Broadridge established a cloud platform team. The cloud platform team, also referred to as the cloud enablement engine (CEE), is a team of builders who create the foundational building blocks on AWS that address common infrastructure, security, monitoring, auditing, and break-glass controls. At the same time, we established a cloud business office (CBO) as a liaison between the application and business teams and the CEE. CBO exists to manage and prioritize foundational requirements from multiple application teams as they go online on AWS and helps create the product backlog for CEE.

Cloud Enablement Engine Responsibilities:

  • Build out foundational building blocks utilizing AWS multi-account strategy
  • Build security guardrails, compliance controls, infrastructure as code automation, auditing and monitoring controls
  • Implement the cloud platform backlog that funnels from CBO as common asks from app teams
  • Work with our AWS team to understand service roadmap, future releases, and provide feedback

Cloud Business Office Responsibilities:

  • Identify and prioritize repeating technical building blocks that cut across multiple teams
  • Establish acceptable architecture patterns based on application use cases
  • Manage cloud programs to ensure CEE deliverables and business expectations align
  • Identify skilling needs, budget, and track spend
  • Contribute to the cloud platform backlog
  • Work with AWS team to understand service roadmap, future releases, and provide feedback

These teams were set up to scale AWS adoption, put building blocks into the hands of the application teams, and ultimately deliver differentiated capabilities to Broadridge’s business teams and end customers. The following diagram illustrates the relationship and modus operandi among the teams:

CEE and CBO working model

Upon establishing the conceptual working model, the CBO and CEE teams looked at solutions from AWS to enable them to achieve the working model quickly. The starting point was AWS Landing Zone (ALZ). ALZ is an AWS solution based on the AWS multi-account strategy. It is a set of vetted constructs and best practices that we use as mechanisms to accelerate AWS adoption.

AWS multi-account strategy

The multi-account strategy employs best practices around separation of concerns, reduction of blast radius, account setup based on Software Development Life Cycle (SDLC) phases, and base operational roles for auditing, monitoring, security, and compliance, as shown in the above diagram. This strategy defines the need for centralized shared or core accounts, which work as the master accounts for monitoring, governance, security, and auditing. A number of AWS services, such as Amazon GuardDuty, AWS Security Hub, and AWS Config, are configured in these centralized accounts. Spoke or child accounts are vended per a team’s requirements and are spun up with these governance, monitoring, and security defaults, connected to the centralized accounts for log capturing, threat detection, configuration management, and security management.

The third milestone

The third milestone is synonymous with the Foundation stage of adoption.

Using the ALZ construct, our CEE team developed a core set of principles to be used by every application team. Based on our core tenets, the CEE team built out an entry point (a web-based UI workflow application). This web UI was the entry point for any application team requesting an environment within AWS for experimentation or to begin the application development life cycle. Simplistically, the web UI sat on top of an automation engine built using APIs from AWS, ALZ components (Account Vending Machine, Shared Services Account, Logging Account, Security Account, default security groups, default IAM roles, and AD groups), and Terraform-based code. The CBO team helped establish the common architecture patterns that were codified into this engine.

Team on-boarding workflow using foundational building blocks on AWS

An Angular-based web UI is the starting point for an application team to request AWS accounts. The web UI asks a number of questions to validate the type of account requested, its intended purpose, ingress/egress requirements, high availability and disaster recovery requirements, and the business unit for chargeback and ownership purposes. Once all information is entered, it sends out a notification based on a preset organization dispatch matrix rule. Upon receiving the request, the approver can approve it or ask further clarifying questions. Once those are satisfactorily answered, the approver approves the account vending request and Terraform code kicks in to create the default account.
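
The intake step can be pictured as a small validation routine that runs before a request is dispatched to an approver. The sketch below is illustrative only: the field names, the allowed account types, and the sandbox egress rule are assumptions, not the actual schema of the Broadridge web UI.

```python
from dataclasses import dataclass

# Hypothetical account types; the real workflow may use different ones.
ALLOWED_ACCOUNT_TYPES = {"sandbox", "development", "testing", "staging", "production"}


@dataclass
class AccountRequest:
    team: str
    account_type: str
    purpose: str
    business_unit: str  # used for chargeback and ownership
    needs_internet_egress: bool = False
    needs_disaster_recovery: bool = False


def validate(request):
    """Return a list of problems; an empty list means the request can go to an approver."""
    problems = []
    if request.account_type not in ALLOWED_ACCOUNT_TYPES:
        problems.append(f"unknown account type: {request.account_type}")
    if not request.purpose.strip():
        problems.append("purpose is required")
    if not request.business_unit.strip():
        problems.append("business unit is required for chargeback")
    # Assumed rule for illustration: sandboxes never get internet egress.
    if request.account_type == "sandbox" and request.needs_internet_egress:
        problems.append("sandbox accounts have no internet egress")
    return problems
```

Returning every problem at once, rather than failing on the first, lets the requesting team fix the form in a single round trip.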

When an account is created through this process, the following defaults are set up for a secure environment for development, testing, and staging. Similar guardrails are deployed in the production accounts as well.

  • Creates a new account under an existing AWS Organizational Unit (OU) based on the input parameters. Tags the chargeback codes, custom tags, and also integrates the resources with existing CMDB
  • Connects the new account to the master shared services and logging account as per the AWS Landing Zone constructs
  • Integrates with the CloudWatch event bus as a sender account
  • Runs AWS STS AssumeRole calls against the new account to create infosec cross-account roles
  • Defines actions, conditions, role limits, and account policies
  • Creates environment variables related to the account in the parameter store within AWS Systems Manager
  • Connects the new account to TrendMicro for AV purposes
  • Attaches the default VPC of the new account to an existing AWS Transit Gateway
  • Generates a Splunk key for the account to store in the Splunk KV store
  • Uses AWS APIs to attach Enterprise support to the new account
  • Creates or amends a new AD group based on the IAM role
  • Integrates as an Amazon Macie member account
  • Enables AWS Security Hub for the account by running an enable-security-hub call
  • Sets up Chef runner for the new account
  • Runs account setting lock procedures to set Amazon S3 public access settings and the EBS default encryption setting
  • Enables a firewall by setting AWS WAF rules for the account
  • Integrates the newly created account with CloudHealth and Dome9
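
As a rough illustration of one item in the list above, the account setting lock step could look like the following sketch. It assumes the two clients are boto3 `s3control` and `ec2` clients passed in by the caller; the function name is hypothetical, and the actual Broadridge automation is Terraform-based and is not shown in this post.

```python
def lock_account_settings(account_id, s3control_client, ec2_client):
    """Block S3 public access account-wide and turn on default EBS encryption."""
    s3control_client.put_public_access_block(
        AccountId=account_id,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    # Default EBS encryption is a per-Region setting, so this call affects
    # only the Region the EC2 client was created for.
    ec2_client.enable_ebs_encryption_by_default()
```

With boto3 installed, the clients would typically come from `boto3.client("s3control")` and `boto3.client("ec2")` using credentials for the newly vended account. Passing the clients in keeps the function easy to exercise with stand-in objects.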

Deploying all these guardrails in every new account removes the need for manual setup and intervention. This frees application developers from worrying about infrastructure and access provisioning while giving them a higher speed to value.

Using these technical and procedural cloud adoption constructs, we have been able to reduce application onboarding time. This has led to quicker delivery of business capability with the application teams focusing only on what differentiates their business rather than repeatedly building undifferentiated work products. This has also led to creation of mature building blocks over time for use of the application teams. Using these building blocks the teams are also modernizing applications by iteratively replacing old application blocks.

Conclusion

In summary, we are able to deliver better business outcomes and differentiated customer experience by:

  • Building common asks as reusable and automated enterprise assets and improving the overall enterprise-wide maturity by indexing on and growing these assets.
  • Depending on an experienced team to deliver baseline operational controls and guardrails.
  • Improving our security posture with higher-level and managed AWS security services instead of rebuilding everything from the ground up.
  • Using the Cloud Business Office to improve funneling of common asks. This helps the next team on AWS to benefit from a readily available set of approved services and application blueprints.

We will continue to build on and mature these reusable building blocks by using AWS services and new feature releases.

 

The content and opinions in this blog are those of the third-party author and AWS is not responsible for the content or accuracy of this post.

 

How to get started with security response automation on AWS

Post Syndicated from Cameron Worrell original https://aws.amazon.com/blogs/security/how-get-started-security-response-automation-aws/

At AWS, we encourage you to use automation to help quickly detect and respond to security events within your AWS environments. In addition to increasing the speed of detection and response, automation also helps you scale your security operations as you expand your workloads running on AWS. For these reasons, security automation is a key principle outlined in both the Well-Architected and Cloud Adoption frameworks as well as in the AWS Security Incident Response Guide.

In this blog post, you’ll learn to implement automated security response mechanisms within your AWS environments. This post will include common patterns, implementation considerations, and an example solution. Security response automation is a broad topic that spans many areas. The goal of this blog post is to introduce you to core concepts and help you get started.

A word from our lawyers: Please note that you are responsible for making your own independent assessment of the information in this post. This post: (a) is for informational purposes only, (b) represents current AWS product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers, or licensors.

What is security response automation?

Security response automation is a planned and programmed action taken to achieve a desired state for an application or resource based on a condition or event. When you implement security response automation, you should adopt an approach that draws from existing security frameworks. Frameworks are published materials which consist of standards, guidelines, and best practices in order to help organizations manage cybersecurity-related risk. Using frameworks helps you achieve consistency and scalability and enables you to focus more on the strategic aspects of your security program. You should work with compliance professionals within your organization to understand any specific security frameworks that may also be relevant for your AWS environment.

Our example solution is based on the NIST Cybersecurity Framework (CSF), which is designed to help organizations assess and improve their ability to prevent, detect, and respond to security events. According to the CSF, “cybersecurity incident response” supports your ability to contain the impact of potential cybersecurity incidents. Although automation is not a CSF requirement, automating responses to events enables you to create repeatable, predictable approaches to monitoring and responding to threats.

The five main steps in the CSF are identify, protect, detect, respond and recover. We’ve expanded the detect and respond steps to include automation and investigation activities.
 

Figure 1: The five steps in the CSF


The following definitions for each step in the diagram above are based on the CSF but have been adapted for our example in this blog post. Although we will focus on the detect, automate and respond steps, it’s important to understand the entire process flow.

  • Identify: Identify and understand the resources, applications, and data within your AWS environment.
  • Protect: Develop and implement appropriate controls and safeguards to ensure delivery of services.
  • Detect: Develop and implement appropriate activities to identify the occurrence of a cybersecurity event. This step includes the implementation of monitoring capabilities which will be discussed further in the next section.
  • Automate: Develop and implement planned, programmed actions that will achieve a desired state for an application or resource based on a condition or event.
  • Investigate: Perform a systematic examination of the security event to establish the root cause.
  • Respond: Develop and implement appropriate activities to take automated or manual actions regarding a detected security event.
  • Recover: Develop and implement appropriate activities to maintain plans for resilience and to restore any capabilities or services that were impaired due to a security event.

Security response automation on AWS

AWS CloudTrail, AWS Config, and Amazon EventBridge continuously record details about the resources and configuration changes in your AWS account. You can use this information to automatically detect resource changes and to react to deviations from your desired state.
 

Figure 2: Automated remediation flow


As shown in the diagram above, an automated remediation flow on AWS has three stages:

  • Monitor: Your automated monitoring tools collect information about resources and applications running in your AWS environment. For example, they might collect AWS CloudTrail information about activities performed in your AWS account, usage metrics from your Amazon EC2 instances, or flow log information about the traffic going to and from network interfaces in your Amazon Virtual Private Cloud (VPC).
  • Detect: When a monitoring tool detects a predefined condition—such as a breached threshold, anomalous activity, or configuration deviation—it raises a flag within the system. A triggering condition might be an anomalous activity detected by Amazon GuardDuty, a resource becoming out of compliance with an AWS Config Rule, or a high rate of blocked requests on an Amazon VPC security group or AWS WAF web access control list.
  • Respond: When a condition is flagged, an automated response is triggered that performs an action you’ve predefined—something intended to remediate or mitigate the flagged condition. Examples of automated response actions might include modifying a VPC security group, patching an Amazon EC2 instance, or rotating credentials.

You can use the event-driven flow described above to achieve many automated response patterns with varying degrees of complexity. Your response pattern could be as simple as invoking a single AWS Lambda function, or it could be a complex series of AWS Step Functions tasks with advanced logic. In this blog post, we’ll use two simple Lambda functions in our example solution.

How to define your response automation

Now that we’ve introduced the concept of security response automation, start thinking about security requirements within your environment that you’d like to enforce through automation. These design requirements might come from general best practices you’d like to follow, or they might be specific controls from compliance frameworks relevant for your business. Regardless, your objectives should be quantitative, not qualitative. Here are some examples of quantitative objectives:

  • Remote administrative network access to servers should be limited.
  • Server storage volumes should be encrypted.
  • AWS console logins should be protected by multi-factor authentication.

As an optional step, you can expand these objectives into user stories that define the conditions and remediation actions when there is an event. User stories are informal descriptions that briefly document a feature within a software system. User stories may be global and span across multiple applications or they may be specific to a single application. For example:

“Remote administrative network access to servers should be limited. Remote access ports include SSH TCP port 22 and RDP TCP port 3389. If open remote access ports are detected within the environment, they should be automatically closed and the owner will be notified.”

Once you’ve completed your user story, you can determine how to use automated remediation to help achieve these objectives in your AWS environment. User stories should be stored in a location that provides versioning support and can reference the associated automation code.
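
To show how a user story like the one above can map onto detection logic, here is a minimal sketch that scans one security group description for remote access ports open to the world. It assumes the dictionary shape returned by the EC2 `DescribeSecurityGroups` API, and it treats "open" as reachable from `0.0.0.0/0` or `::/0`; both the function name and that definition of "open" are assumptions you would adapt to your own story.

```python
ADMIN_PORTS = (22, 3389)  # SSH and RDP, as named in the user story
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def open_admin_rules(security_group):
    """Return (group_id, port) pairs for admin ports reachable from anywhere."""
    findings = []
    for perm in security_group.get("IpPermissions", []):
        protocol = perm.get("IpProtocol")
        if protocol not in ("tcp", "-1"):
            continue
        sources = [r["CidrIp"] for r in perm.get("IpRanges", [])]
        sources += [r["CidrIpv6"] for r in perm.get("Ipv6Ranges", [])]
        if not OPEN_CIDRS.intersection(sources):
            continue
        if protocol == "-1":
            # "-1" means all traffic, so every port is covered.
            low, high = 0, 65535
        else:
            low, high = perm.get("FromPort", 0), perm.get("ToPort", 65535)
        findings.extend(
            (security_group["GroupId"], port)
            for port in ADMIN_PORTS
            if low <= port <= high
        )
    return findings
```

A remediation step could then revoke the offending rules and notify the owner; keeping detection as a pure function makes it straightforward to test before it is allowed to change anything.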

You should carefully consider the effect of your remediation mechanisms in order to prevent unintended impact on your resources and applications. Remediation actions such as instance termination, credential revocation, and security group modification can adversely affect application availability. Depending on the level of risk that’s acceptable to your organization, your automated mechanism might only provide a notification which can then be manually investigated prior to remediation. Once you’ve identified an automated remediation mechanism, you can build out the required components and test them in a non-production environment.

Sample response automation walkthrough

In the following section, we’ll walk you through an automated remediation for a simulated event that indicates potential unauthorized activity—the unintended disabling of CloudTrail logging. Outside parties might want to disable logging to prevent detection and recording of their unauthorized activity. Our response is to re-enable the CloudTrail logging and immediately notify the security contact. Here’s the user story for this scenario:

“CloudTrail logging should be enabled for all AWS accounts and regions. If CloudTrail logging is disabled, it will automatically be enabled and the security operations team will be notified.”

Note: The sample response automation below references Amazon EventBridge which extends and builds upon CloudWatch Events. Amazon EventBridge uses the same Amazon CloudWatch Events API, so the event structure and rules configuration are the same. This blog post uses base functionality that is identical in both EventBridge and CloudWatch Events.
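
For orientation, an event pattern that matches the CloudTrail `StopLogging` API call could look like the sketch below. This is an assumption about what such a rule might contain, not a copy of the rule in the sample template. The `matches` helper is a deliberately tiny local stand-in for EventBridge matching (exact values and nested objects only) so the pattern can be tried without deploying anything.

```python
STOP_LOGGING_PATTERN = {
    "source": ["aws.cloudtrail"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["cloudtrail.amazonaws.com"],
        "eventName": ["StopLogging"],
    },
}


def matches(pattern, event):
    """Return True if the event satisfies the pattern (exact-match subset only)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern objects are matched recursively.
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            # Pattern leaves are lists of allowed values.
            return False
    return True
```

Real EventBridge patterns support more operators (prefix, numeric, anything-but, and so on) that this helper does not attempt to model.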

Prerequisites

In order to use our sample remediation, you will need to enable Amazon GuardDuty and AWS Security Hub in the AWS Region you have selected. Both of these services include a 30-day free trial. See the AWS Security Hub pricing page and the Amazon GuardDuty pricing page for additional details.

Important: You’ll use AWS CloudTrail to test the sample remediation. Running more than one CloudTrail trail in your AWS account will result in charges based on the number of events processed while the trail is running. Charges for additional copies of management events recorded in a Region are applied based on the published pricing plan. To minimize the charges, follow the clean-up steps that we provide later in this post to remove the sample automation and delete the trail.

Deploy the sample response automation

In this section, we’ll show you how to deploy and test the CloudTrail logging remediation sample. Amazon GuardDuty generates the finding Stealth:IAMUser/CloudTrailLoggingDisabled when CloudTrail logging is disabled, and AWS Security Hub collects findings from GuardDuty using a standardized finding format (the AWS Security Finding Format). We recommend that you deploy this sample into a non-production AWS account.

Select the Launch Stack button below to deploy a CloudFormation template with an automation sample in the us-east-1 Region. You can also download the template and implement it in another Region. The template consists of an Amazon EventBridge rule, an AWS Lambda function and the IAM permissions necessary for both components to execute. It takes several minutes for the CloudFormation stack build to complete.
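
The template's Lambda code is not reproduced in this section. As a hedged sketch of what a remediation function of this kind might do, the example below restarts the trail named in the event and publishes a notification. The `SNS_TOPIC_ARN` environment variable, the function names, and the message layout are assumptions; the code also assumes the `StopLogging` event carries the trail identifier in `detail.requestParameters.name`.

```python
import json
import os


def remediate(event, cloudtrail_client, sns_client, topic_arn):
    """Restart the stopped trail and notify the security contact."""
    detail = event["detail"]
    # Assumption: StopLogging events identify the trail here.
    trail = detail["requestParameters"]["name"]
    cloudtrail_client.start_logging(Name=trail)

    message = {
        "trail": trail,
        "action": "logging re-enabled",
        "stopped_by": detail.get("userIdentity", {}).get("arn", "unknown"),
    }
    sns_client.publish(
        TopicArn=topic_arn,
        Subject="CloudTrail logging re-enabled",
        Message=json.dumps(message),
    )
    return trail


def lambda_handler(event, context):
    # boto3 is provided by the AWS Lambda Python runtime.
    import boto3

    return remediate(
        event,
        boto3.client("cloudtrail"),
        boto3.client("sns"),
        os.environ["SNS_TOPIC_ARN"],  # hypothetical configuration name
    )
```

Separating `remediate` from `lambda_handler` means the remediation logic can be tested with stand-in clients in a non-production setting before it is wired to a live rule.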

Select this image to open a link that starts building the CloudFormation stack

  1. In the CloudFormation console, choose the Select Template form, and then select Next.
  2. On the Specify Details page, provide the email address for a security contact. (For the purpose of this walkthrough, it should be an email address you have access to.) Then select Next.
  3. On the Options page, accept the defaults, then select Next.
  4. On the Review page, confirm the details, then select Create.
  5. While the stack is being created, check the inbox of the email address you provided in step 2. Look for an email message with the subject AWS Notification – Subscription Confirmation. Select the link in the body of the email to confirm your subscription to the Amazon Simple Notification Service (Amazon SNS) topic. You should see a success message similar to the screenshot below:
     
    Figure 3: SNS subscription confirmation


  6. Return to the CloudFormation console. Once the Status field for the CloudFormation stack changes to CREATE_COMPLETE (as shown in figure 4), the solution is implemented and is ready for testing.
     
    Figure 4: CREATE_COMPLETE status


Test the sample automation

You’re now ready to test the automated response by creating a test trail in CloudTrail, then trying to stop it.

  1. From the AWS Management Console, choose Services > CloudTrail.
  2. Select Trails, then select Create Trail.
  3. On the Create Trail form:
    1. Enter a value for Trail name. We use test-trail in our example below.
    2. Under Management events, select Write-only (to minimize event volume).
       
      Figure 5: Create a CloudTrail trail


    3. Under Storage location, choose an existing S3 bucket or create a new one. Note that since S3 bucket names are globally unique, you must add characters (such as a random string) to the name. For example: my-test-trail-bucket-<random-string>.
  4. On the Trails page of the CloudTrail console, verify that the new trail has started. You should see a green checkmark in the Status column, as shown in figure 6.
     
    Figure 6: Verify new trail has started

  5. You’re now ready to act like an unauthorized user trying to cover their tracks! Stop the logging for the trail you just created:
    1. Select the new trail name to display its configuration page.
    2. Toggle the Logging switch in the top-right corner to OFF.
    3. When prompted with a warning dialog box, select Continue.
    4. Verify that the Logging switch is now off, as shown below.
       
      Figure 7: Verify logging switch is off

      You have now simulated a security event by disabling logging for one of the trails in the CloudTrail service. Within the next few seconds, the near real-time automated response will detect the stopped trail, restart it, and send an email notification. You can refresh the Trails page of the CloudTrail console to verify that the trail’s status is ON again.

      Within the next several minutes, the investigatory automated response will also begin. GuardDuty will detect the action that stopped the trail and enrich the data about the source of unexpected behavior. Security Hub will then ingest that information and optionally correlate with other security events.

      Following the steps below, you can monitor findings within Security Hub for the finding type TTPs/Defense Evasion/Stealth:IAMUser-CloudTrailLoggingDisabled to be generated:

  6. In the AWS Management Console, choose Services > Security Hub, then:
    1. Select Findings in the left pane.
    2. Select the Add filters field, then select Type.
    3. Select EQUALS, paste TTPs/Defense Evasion/Stealth:IAMUser-CloudTrailLoggingDisabled into the field, then select Apply.
    4. Refresh your browser periodically until the finding is generated.
      Figure 8: Monitor Security Hub for your finding
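
The same check can be scripted instead of refreshing the console. The following is a minimal sketch, assuming a boto3 Security Hub client is passed in; the helper names build_type_filter and wait_for_finding, the attempt count, and the polling interval are illustrative and are not part of the sample solution.

```python
import time

FINDING_TYPE = "TTPs/Defense Evasion/Stealth:IAMUser-CloudTrailLoggingDisabled"


def build_type_filter(finding_type):
    """Build a Security Hub filter that matches one exact finding type."""
    return {"Type": [{"Value": finding_type, "Comparison": "EQUALS"}]}


def wait_for_finding(securityhub, finding_type=FINDING_TYPE, attempts=20, delay_seconds=30):
    """Poll get_findings until a matching finding appears or attempts run out."""
    filters = build_type_filter(finding_type)
    for _ in range(attempts):
        response = securityhub.get_findings(Filters=filters)
        if response["Findings"]:
            return response["Findings"]
        time.sleep(delay_seconds)
    return []


# Usage (requires AWS credentials and Security Hub enabled in the Region):
#   import boto3
#   findings = wait_for_finding(boto3.client("securityhub"))
```

Passing the client in as an argument keeps the polling logic easy to exercise with a stub before pointing it at a real account.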

While you wait on that detection, let’s dig into the components of automation.

How the sample automation works

This example incorporates two automated responses: a near real-time workflow and an investigatory workflow. The near real-time workflow provides a rapid response to an individual event, in this case the stopping of a trail. The goal is to restore the trail to a functioning state and alert security responders as quickly as possible. The investigatory workflow still includes a response to provide defense in depth and also uses services that support a more in-depth investigation of the incident.

Figure 9: Sample automation workflow

In the near real-time workflow, Amazon EventBridge monitors for the undesired activity. When a trail is stopped, AWS CloudTrail publishes an event on the EventBridge bus. An EventBridge rule detects the trail-stopping event and invokes a Lambda function to respond to the event by restarting the trail and notifying the security contact via an Amazon Simple Notification Service (SNS) topic.
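
To make the near real-time rule more concrete, here is a hedged sketch of what such a rule looks for. The post does not show the pattern used by the template, so STOP_LOGGING_PATTERN below is an assumption based on how CloudTrail API calls appear on the EventBridge bus, and is_stop_logging_event is a hypothetical helper that applies the same test in plain Python.

```python
# Assumed shape of the near real-time rule's event pattern. The post does not
# show this pattern, so treat the field values as an illustration.
STOP_LOGGING_PATTERN = {
    "source": ["aws.cloudtrail"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["cloudtrail.amazonaws.com"],
        "eventName": ["StopLogging"],
    },
}


def is_stop_logging_event(event):
    """Return True when an EventBridge event records a CloudTrail StopLogging call."""
    detail = event.get("detail", {})
    return (
        event.get("source") in STOP_LOGGING_PATTERN["source"]
        and event.get("detail-type") in STOP_LOGGING_PATTERN["detail-type"]
        and detail.get("eventSource") in STOP_LOGGING_PATTERN["detail"]["eventSource"]
        and detail.get("eventName") in STOP_LOGGING_PATTERN["detail"]["eventName"]
    )


# Trimmed, made-up event used only to exercise the check.
sample_stop_event = {
    "source": "aws.cloudtrail",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {"eventSource": "cloudtrail.amazonaws.com", "eventName": "StopLogging"},
}
```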

In the investigatory workflow, CloudTrail logs are monitored for undesired activities. For example, if a trail is stopped, there will be a corresponding log record. GuardDuty detects this activity and retrieves additional data points about the source IP that executed the API call. Two common examples of those additional data points in GuardDuty findings are whether the API call came from an IP address on a threat list, and whether it came from a network not commonly used in your AWS account. An AWS Lambda function responds by restarting the trail and notifying the security contact. Finally, the finding is imported into AWS Security Hub for additional investigation.

AWS Security Hub imports findings from AWS security services such as GuardDuty, Amazon Macie and Amazon Inspector, plus from any third-party product integrations you’ve enabled. All findings are provided to Security Hub in AWS Security Finding Format, which eliminates the need for data conversion. Security Hub correlates these findings to help you identify related security events and determine a root cause. Security Hub also publishes its findings to Amazon EventBridge to enable further processing by other AWS services such as AWS Lambda.

Respond step deep dive

Amazon EventBridge and AWS Lambda work together to respond to a security finding. Amazon EventBridge is a service that provides real-time access to changes in data in AWS services, your own applications, and Software-as-a-Service (SaaS) applications without writing code. In this example, EventBridge identifies a Security Hub finding that requires action and invokes a Lambda function that performs remediation. As shown in figure 10, the Lambda function both notifies the security operator via SNS and restarts the stopped CloudTrail.

Figure 10: Sample "respond" workflow

To set this response up, we looked for an event to indicate that a trail had stopped or was disabled. We knew that the GuardDuty finding Stealth:IAMUser/CloudTrailLoggingDisabled is raised when CloudTrail logging is disabled. Therefore, we configured the default event bus to look for this event. You can learn more about all of the available GuardDuty findings in the user guide.

How the code works

When Security Hub publishes a finding to EventBridge, it includes full details of the incident as discovered by GuardDuty. The finding is published in JSON format. If you review the details of the sample finding, note that it has several fields helping you identify the specific events that you’re looking for. Here are some of the relevant details:


{
   …
   "source":"aws.securityhub",
   …
   "detail":{
      "findings": [{
         …
         "Types": [
            "TTPs/Defense Evasion/Stealth:IAMUser-CloudTrailLoggingDisabled"
         ],
         …
      }]
   }
}

You can build an event pattern using these fields, which an EventBridge filtering rule can then use to identify events and to invoke the remediation Lambda function. Below is a snippet from the CloudFormation template we provided earlier that defines that event pattern for the EventBridge filtering rule:


# pattern matches the nested JSON format of a specific Security Hub finding
      EventPattern:
        source:
        - aws.securityhub
        detail-type:
          - "Security Hub Findings - Imported"
        detail:
          findings:
            Types:
              - "TTPs/Defense Evasion/Stealth:IAMUser-CloudTrailLoggingDisabled"
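
To see why this pattern selects the finding shown earlier, the sketch below implements a deliberately simplified matcher in Python. It is not EventBridge's actual implementation: it only handles exact-value matching, nested objects, and arrays, and the function names matches and _value_matches are hypothetical.

```python
def matches(pattern, event):
    """Simplified EventBridge-style match: every pattern key must match the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if not _value_matches(expected, event[key]):
            return False
    return True


def _value_matches(expected, actual):
    # An array in the event matches if any one of its elements matches.
    if isinstance(actual, list):
        return any(_value_matches(expected, item) for item in actual)
    # A nested pattern object is matched recursively against a nested event object.
    if isinstance(expected, dict):
        return isinstance(actual, dict) and matches(expected, actual)
    # A pattern leaf is a list of acceptable values.
    return actual in expected


FINDING_TYPE = "TTPs/Defense Evasion/Stealth:IAMUser-CloudTrailLoggingDisabled"

# The same pattern as the CloudFormation snippet, written as a Python dict.
event_pattern = {
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Imported"],
    "detail": {"findings": {"Types": [FINDING_TYPE]}},
}

# Trimmed, made-up event used only to exercise the matcher.
sample_finding_event = {
    "source": "aws.securityhub",
    "detail-type": "Security Hub Findings - Imported",
    "detail": {"findings": [{"Types": [FINDING_TYPE], "Description": "example"}]},
}
```

Fields that appear in the event but not in the pattern (Description here) are ignored, which is why a short pattern can select a large finding document.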

Once the rule is in place, EventBridge continuously scans the event bus for this pattern. When EventBridge finds a match, it invokes the remediating Lambda function and passes the full details of the event to the function. The Lambda function then parses the JSON fields in the event so that it can act as shown in this Python code snippet:


# extract trail ARN by parsing the incoming Security Hub finding (in JSON format)
trailARN = event['detail']['findings'][0]['ProductFields']['action/awsApiCallAction/affectedResources/AWS::CloudTrail::Trail']   

# description contains useful details to be sent to security operations
description = event['detail']['findings'][0]['Description']
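
As a self-contained illustration of that parsing step, the sketch below wraps the two lookups in a function and runs it against a trimmed, made-up event. The function name extract_trail_details, the sample ARN, and the sample description are assumptions for illustration; only the field paths come from the snippet above.

```python
TRAIL_FIELD = "action/awsApiCallAction/affectedResources/AWS::CloudTrail::Trail"


def extract_trail_details(event):
    """Return (trail ARN, description) from the first finding in the event."""
    finding = event["detail"]["findings"][0]
    trail_arn = finding["ProductFields"][TRAIL_FIELD]
    description = finding["Description"]
    return trail_arn, description


# Trimmed, made-up finding used only to exercise the parser.
sample_parse_event = {
    "detail": {
        "findings": [
            {
                "Description": "CloudTrail logging was disabled (example text).",
                "ProductFields": {
                    TRAIL_FIELD: "arn:aws:cloudtrail:us-east-1:111122223333:trail/test-trail"
                },
            }
        ]
    }
}

trail_arn, description = extract_trail_details(sample_parse_event)
```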

The code also issues a notification to security operators so they can review the findings and insights in Security Hub and other services to better understand the incident and to decide whether further manual actions are warranted. Here’s the code snippet that uses SNS to send out a note to security operators:


#Sending the notification that the AWS CloudTrail has been disabled.
snspublish = snsclient.publish(
	TargetArn = snsARN,
	Message="Automatically restarting CloudTrail logging.  Event description: \"%s\" " %description
	)

While notifications to human operators are important, the Lambda function will not wait to take action. It immediately remediates the condition by restarting the stopped trail in CloudTrail. Here’s a code snippet that restarts the trail to reenable logging:


# Enabling the AWS CloudTrail logging
try:
    client = boto3.client('cloudtrail')
    enablelogging = client.start_logging(Name=trailARN)
    logger.debug("Response on enable CloudTrail logging- %s" %enablelogging)
except ClientError as e:
    logger.error("An error occurred: %s" %e)
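
The snippets above can be combined into one function. The following is a rough sketch rather than the sample's actual Lambda handler: the function name remediate and its parameters are hypothetical, the boto3 clients are passed in so the flow can be tried with stubs, and error handling is left to the caller for brevity.

```python
import logging

logger = logging.getLogger(__name__)

TRAIL_FIELD = "action/awsApiCallAction/affectedResources/AWS::CloudTrail::Trail"


def remediate(event, cloudtrail_client, sns_client, sns_arn):
    """Notify security operations, then restart the trail named in the finding."""
    finding = event["detail"]["findings"][0]
    trail_arn = finding["ProductFields"][TRAIL_FIELD]
    description = finding["Description"]

    # Tell the security contact what happened and what is being done about it.
    sns_client.publish(
        TargetArn=sns_arn,
        Message='Automatically restarting CloudTrail logging.  Event description: "%s"' % description,
    )

    # Re-enable logging on the affected trail.
    cloudtrail_client.start_logging(Name=trail_arn)
    logger.info("Restarted logging for %s", trail_arn)
    return trail_arn


# Usage inside a Lambda handler (requires AWS credentials and permissions):
#   import boto3
#   remediate(event, boto3.client("cloudtrail"), boto3.client("sns"), sns_arn)
```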

After the trail has been restarted, API activity is once again logged and can be audited. This can help provide relevant data for the remaining steps in the incident response process. The data is especially important for the post-incident phase, when your team analyzes lessons learned to prevent future incidents. You can also use this phase to identify additional steps to automate in your incident response.

Clean up

After you’ve completed the sample security response automation, we recommend that you remove the resources created in this walkthrough example from your account in order to minimize any charges associated with the trail in CloudTrail and data stored in S3.

Important: Deleting resources in your account can negatively impact the applications running in your AWS account. Verify that applications and AWS account security do not depend on the resources you’re about to delete.

Here are the clean-up steps:

  1. Delete the CloudFormation stack.
  2. Delete the trail you created in CloudTrail.
  3. If you created an S3 bucket for CloudTrail logs, you can also delete that S3 bucket.
  4. New accounts can try GuardDuty at no cost for 30 days. You can suspend or disable GuardDuty before the free trial period ends to avoid charges.
  5. Security Hub comes with a 30-day free trial. You can avoid charges by disabling the service before the trial period is over.
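
The first two clean-up steps can also be scripted. This is a minimal sketch, assuming boto3 CloudFormation and CloudTrail clients are passed in; the function name clean_up is hypothetical, the stack name is whatever you chose at launch, and the S3 bucket is deliberately left alone because it must be emptied before it can be deleted.

```python
def clean_up(cloudformation, cloudtrail, stack_name, trail_name="test-trail"):
    """Delete the walkthrough's CloudFormation stack and the test trail."""
    # Removes the automation resources created by the template.
    cloudformation.delete_stack(StackName=stack_name)
    # Removes the trail created for testing (test-trail in the example above).
    cloudtrail.delete_trail(Name=trail_name)


# Usage (requires AWS credentials and permissions):
#   import boto3
#   clean_up(boto3.client("cloudformation"), boto3.client("cloudtrail"), "my-stack-name")
```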

Summary

You’ve learned the basic concepts and considerations behind security response automation on AWS and how to use Amazon EventBridge, Amazon GuardDuty and AWS Security Hub to automatically re-enable AWS CloudTrail when it becomes disabled unexpectedly. As a next step, you may want to start building your own response automations and dive deeper into the AWS Security Incident Response Guide, NIST Cybersecurity Framework (CSF) or the AWS Cloud Adoption Framework (CAF) Security Perspective. You can explore additional automatic remediation solutions on the AWS Security Blog. You can find the code used in this example on GitHub.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about using this solution, start a thread in the EventBridge, GuardDuty, or Security Hub forums, or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Cameron Worrell

Cameron is a Solutions Architect with a passion for security and enterprise transformation. He joined AWS in 2015.

Alex Tomic

Alex is an AWS Enterprise Solutions Architect focused on security and compliance. He joined AWS in 2014.

Nathan Case

Nathan is a Senior Security Strategist, and joined AWS in 2016. He is always interested to see where our customers plan to go and how we can help them get there. He is also interested in intel, combined data lake sharing opportunities, and open source collaboration. In the end Nathan loves technology and that we can change the world to make it a better place.

Setting up a CI/CD pipeline by integrating Jenkins with AWS CodeBuild and AWS CodeDeploy

Post Syndicated from Noha Ghazal original https://aws.amazon.com/blogs/devops/setting-up-a-ci-cd-pipeline-by-integrating-jenkins-with-aws-codebuild-and-aws-codedeploy/

In this post, I explain how to use the Jenkins open-source automation server to deploy AWS CodeBuild artifacts with AWS CodeDeploy, creating a functioning CI/CD pipeline. When properly implemented, the CI/CD pipeline is triggered by code changes pushed to your GitHub repo, automatically fed into CodeBuild, then the output is deployed on CodeDeploy.

Solution overview

The functioning pipeline creates a fully managed build service that compiles your source code. It then produces code artifacts that can be used by CodeDeploy to deploy to your production environment automatically.

The deployment workflow starts by placing the application code on the GitHub repository. To automate this scenario, I added source code management to the Jenkins project under the Source Code section. I chose the GitHub option, which by design clones a copy from the GitHub repo content in the Jenkins local workspace directory.

In the second step of my automation procedure, I enabled a trigger for the Jenkins server using the “Poll SCM” option. This option makes Jenkins check the configured repository for new commits or code changes at a specified frequency. In this testing scenario, I configured the trigger to run every two minutes. The automated Jenkins deployment process works as follows:

  1. Jenkins checks for any new changes on GitHub every two minutes.
  2. Change determination:
    1. If Jenkins finds no changes, Jenkins exits the procedure.
    2. If it does find changes, Jenkins clones all the files from the GitHub repository to the Jenkins server workspace directory.
  3. The File Operation plugin deletes all the files cloned from GitHub. This keeps the Jenkins workspace directory clean.
  4. The AWS CodeBuild plugin zips the files and sends them to a predefined Amazon S3 bucket location, then initiates the CodeBuild project, which obtains the code from the S3 bucket. The project then creates the output artifact zip file and stores that file back in the S3 bucket.
  5. The HTTP Request plugin downloads the CodeBuild output artifacts from the S3 bucket.
    I edited the S3 bucket policy to allow access from the Jenkins server IP address. See the following example policy, where x.x.x.x/x stands for the IP address range of the Jenkins server:

    {
      "Version": "2012-10-17",
      "Id": "S3PolicyId1",
      "Statement": [
        {
          "Sid": "IPAllow",
          "Effect": "Allow",
          "Principal": "*",
          "Action": "s3:*",
          "Resource": "arn:aws:s3:::examplebucket/*",
          "Condition": {
            "IpAddress": {"aws:SourceIp": "x.x.x.x/x"}
          }
        }
      ]
    }

    This policy enables the HTTP request plugin to access the S3 bucket. This plugin doesn’t use the IAM instance profile or the AWS access keys (access key ID and secret access key).

  6. The output artifact is a compressed ZIP file. By design, the CodeDeploy plugin zips the workspace files itself before sending them to the S3 bucket for the CodeDeploy deployment, so the artifact must be unzipped first. For that, I used the File Operation plugin to perform the following:
    1. Unzip the CodeBuild zipped artifact output in the Jenkins root workspace directory. At this point, the workspace directory should include the original zip file downloaded from the S3 bucket from Step 5 and the files extracted from this archive.
    2. Delete the original .zip file, and leave only the source bundle contents for the deployment.
  7. The CodeDeploy plugin selects and zips all workspace directory files. This plugin uses the CodeDeploy application name, deployment group name, and deployment configurations that you configured to initiate a new CodeDeploy deployment. The CodeDeploy plugin then uploads the newly zipped file to the S3 bucket location that you provided, where CodeDeploy uses it as the source for its new deployment.
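
Step 6 above (unzip the artifact, then delete the original archive) can be pictured with a few lines of Python. This is only an illustration of what the File Operation plugin steps accomplish, not how the plugin is implemented; the function name unpack_artifact is hypothetical.

```python
import os
import zipfile


def unpack_artifact(workspace, artifact_name="codebuild-artifact.zip"):
    """Extract the artifact into the workspace root, then delete the zip file."""
    artifact_path = os.path.join(workspace, artifact_name)
    with zipfile.ZipFile(artifact_path) as archive:
        archive.extractall(workspace)
    # Leave only the source bundle contents for the CodeDeploy plugin to zip.
    os.remove(artifact_path)
    return sorted(os.listdir(workspace))
```

After this runs, the workspace holds only the extracted bundle, which is the state the CodeDeploy plugin expects in step 7.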

Walkthrough

In this post, I walk you through the following steps:

  • Creating resources to build the infrastructure, including the Jenkins server, CodeBuild project, and CodeDeploy application.
  • Accessing and unlocking the Jenkins server.
  • Creating a project and configuring the CodeDeploy Jenkins plugin.
  • Testing the whole CI/CD pipeline.

Create the resources

In this section, I show you how to launch an AWS CloudFormation template, a tool that creates the following resources:

  • Amazon S3 bucket—Stores the GitHub repository files and the CodeBuild artifact application file that CodeDeploy uses.
  • IAM S3 bucket policy—Allows the Jenkins server access to the S3 bucket.
  • JenkinsRole—An IAM role and instance profile for the Amazon EC2 instance for use as a Jenkins server. This role allows Jenkins on the EC2 instance to access the S3 bucket to write files and access to create CodeDeploy deployments.
  • CodeDeploy application and CodeDeploy deployment group.
  • CodeDeploy service role—An IAM role to enable CodeDeploy to read the tags applied to the instances or the EC2 Auto Scaling group names associated with the instances.
  • CodeDeployRole—An IAM role and instance profile for the EC2 instances of CodeDeploy. This role has permissions to write files to the S3 bucket created by this template and to create deployments in CodeDeploy.
  • CodeBuildRole—An IAM role to be used by CodeBuild to access the S3 bucket and create the build projects.
  • Jenkins server—An EC2 instance running Jenkins.
  • CodeBuild project—This is configured with the S3 bucket and S3 artifact.
  • Auto Scaling group—Contains EC2 instances running Apache and the CodeDeploy agent fronted by an Elastic Load Balancer.
  • Auto Scaling launch configurations—For use by the Auto Scaling group.
  • Security groups—For the Jenkins server, the load balancer, and the CodeDeploy EC2 instances.
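
For readers who want to see how the IAM S3 bucket policy in this list could be produced for a given Jenkins address, here is a hedged sketch that mirrors the example policy shown earlier. The function name jenkins_bucket_policy and the sample values are assumptions; the CloudFormation template creates the real policy for you.

```python
import ipaddress
import json


def jenkins_bucket_policy(bucket_name, jenkins_cidr):
    """Return a bucket policy (as JSON text) allowing access from one CIDR range."""
    # Raises ValueError if the CIDR is malformed, before any policy is produced.
    network = ipaddress.ip_network(jenkins_cidr, strict=False)
    policy = {
        "Version": "2012-10-17",
        "Id": "S3PolicyId1",
        "Statement": [
            {
                "Sid": "IPAllow",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": "arn:aws:s3:::%s/*" % bucket_name,
                "Condition": {"IpAddress": {"aws:SourceIp": str(network)}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Example with a documentation-range address standing in for the Jenkins server.
example_policy = jenkins_bucket_policy("examplebucket", "203.0.113.10/32")
```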

 

  1. To create the CloudFormation stack (for example, in the AWS Frankfurt Region), choose the following link:
  2. Choose Next and provide the following values on the Specify Details page:
    • For Stack name, name your stack as you prefer.
    • For CodedeployInstanceType, choose the default of t2.medium.
      To check the supported instance types by AWS Region, see Supported Regions.
    • For InstanceCount, keep the default of 3, to launch three EC2 instances for CodeDeploy.
    • For JenkinsInstanceType, keep the default of t2.medium.
    • For KeyName, choose an existing EC2 key pair in your AWS account. Use this to connect by using SSH to the Jenkins server and the CodeDeploy EC2 instances. Make sure that you have access to the private key of this key pair.
    • For PublicSubnet1, choose a public subnet from which the load balancer, Jenkins server, and CodeDeploy web servers launch.
    • For PublicSubnet2, choose a public subnet from which the load balancers and CodeDeploy web servers launch.
    • For VpcId, choose the VPC for the public subnets you used in PublicSubnet1 and PublicSubnet2.
    • For YourIPRange, enter the CIDR block of the network from which you connect to the Jenkins server using HTTP and SSH. If your local machine has a static public IP address, go to https://www.whatismyip.com/ to find your IP address, and then enter your IP address followed by /32. If you don’t have a static IP address (or aren’t sure if you have one), enter 0.0.0.0/0. Then, any address can reach your Jenkins server.
  3. Choose Next.
  4. On the Review page, select the I acknowledge that this template might cause AWS CloudFormation to create IAM resources check box.
  5. Choose Create and wait for the CloudFormation stack status to change to CREATE_COMPLETE. This takes approximately 6–10 minutes.
  6. Check the resulting values on the Outputs tab. You need them later.
  7. Browse to the ELBDNSName value from the Outputs tab, verifying that you can see the Sample page. You should see a congratulatory message.
  8. Your Jenkins server should be ready to deploy.

Access and unlock your Jenkins server

In this section, I discuss how to access, unlock, and customize your Jenkins server.

  1. Copy the JenkinsServerDNSName value from the Outputs tab of the CloudFormation stack, and paste it into your browser.
  2. To unlock the Jenkins server, SSH to the server using the IP address and key pair, following the instructions from Unlocking Jenkins.
  3. Use the root user to cat the log file (/var/log/jenkins/jenkins.log) and copy the automatically generated alphanumeric password (between the two sets of asterisks). Then, use the password to unlock your Jenkins server.
  4. On the Customize Jenkins page, choose Install suggested plugins.

  5. Wait until Jenkins installs all the suggested plugins. When the process completes, you should see the check marks alongside all of the installed plugins.
  6. On the Create First Admin User page, enter a user name, password, full name, and email address of the Jenkins user.
  7. Choose Save and continue, Save and finish, and Start using Jenkins.
    After you install all the needed Jenkins plugins along with their required dependencies, the Jenkins server restarts. This step should take about two minutes. After Jenkins restarts, refresh the page. Your Jenkins server should be ready to use.

Create a project and configure the CodeDeploy Jenkins plugin

Now, to create our project in Jenkins, we need to configure the required Jenkins plugins.

  1. Sign in to Jenkins with the user name and password that you created earlier, then choose Manage Jenkins and Manage Plugins.
  2. From the Available tab, search for and select the following plugins, then choose Install without restart:
    • AWS CodeDeploy
    • AWS CodeBuild
    • Http Request
    • File Operations
  3. Select the Restart Jenkins when installation is complete and no jobs are running check box.
    Jenkins takes a couple of minutes to download the plugins along with their dependencies, and then restarts.
  4. Log in, then choose New Item, Freestyle project.
  5. Enter a name for the project (for example, CodeDeployApp), and choose OK.
  6. On the project configuration page, under Source Code Management, choose Git. For Repository URL, enter the URL of your GitHub repository.
  7. For Build Triggers, select the Poll SCM check box. In the Schedule, for testing enter H/2 * * * *. This entry tells Jenkins to poll GitHub every two minutes for updates.
  8. Under Build Environment, select the Delete workspace before build starts check box. Each Jenkins project has a dedicated workspace directory. This option allows you to wipe out your workspace directory with each new Jenkins build, to keep it clean.
  9. Under Build Actions, add a build step and choose AWS CodeBuild. Under AWS Configurations, choose Manually specify access and secret keys and provide the keys.
  10. From the CloudFormation stack Outputs tab, copy the AWS CodeBuild project name (myProjectName) and paste it in the Project Name field. Also, set the Region that you are using and choose Use Jenkins source.
    It is a best practice to store AWS credentials for CodeBuild in the native Jenkins credential store. For more information, see the Jenkins AWS CodeBuild Plugin wiki.
  11. To make sure that all files cloned from the GitHub repository are deleted, choose Add build step and select File Operation plugin, then choose Add and select File Delete. Under File Delete operation, in the Include File Pattern field, type an asterisk (*).
  12. Under Build, configure the following:
    1. Choose Add a Build step.
    2. Choose HTTP Request.
    3. Copy the S3 bucket name from the CloudFormation stack Outputs tab and paste it after (http://s3-eu-central-1.amazonaws.com/) along with the name of the zip file codebuild-artifact.zip as the value for HTTP Plugin URL.
      Example: (http://s3-eu-central-1.amazonaws.com/mybucketname/codebuild-artifact.zip)
    4. For Ignore SSL errors?, choose Yes.
  13. Under HTTP Request, choose Advanced and leave the default values for Authorization, Headers, and Body. Under Response, for Output response to file, enter the codebuild-artifact.zip file name.
  14. Add two build steps for the File Operations plugin, in the following order:
    1. Unzip action: This build step unzips the codebuild-artifact.zip file and places the contents in the root workspace directory.
    2. File Delete action: This build step deletes the codebuild-artifact.zip file, leaving only the source bundle contents for deployment.
  15. Under Post-build Actions, choose Add post-build action and select the Deploy an application to AWS CodeDeploy check box.
  16. Enter the following values from the Outputs tab of your CloudFormation stack and leave the other settings at their default (blank):
    • For AWS CodeDeploy Application Name, enter the value of CodeDeployApplicationName.
    • For AWS CodeDeploy Deployment Group, enter the value of CodeDeployDeploymentGroup.
    • For AWS CodeDeploy Deployment Config, enter CodeDeployDefault.OneAtATime.
    • For AWS Region, choose the Region where you created the CodeDeploy environment.
    • For S3 Bucket, enter the value of S3BucketName.
      The CodeDeploy plugin uses the Include Files option to filter the files based on specific file names existing in your current Jenkins deployment workspace directory. The plugin zips specified files into one file. It then sends them to the location specified in the S3 Bucket parameter for CodeDeploy to download and use in the new deployment.
      In the optional Include Files field, I used (**) so that all files in the workspace directory get zipped.
  17. Choose Deploy Revision. This option registers the newly created revision to your CodeDeploy application and gets it ready for deployment.
  18. Select the Wait for deployment to finish? check box. This option allows you to view the CodeDeploy deployments logs and events on your Jenkins server console output.
    Now that you have created a project, you are ready to test deployment.
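
The URL entered for the HTTP Request build step in step 12 follows a fixed shape: Region endpoint, bucket name, then artifact file name. As a small sketch, the hypothetical helper artifact_url below assembles it; the default Region and file name are the ones used in this walkthrough.

```python
def artifact_url(bucket_name, region="eu-central-1", key="codebuild-artifact.zip"):
    """Build the path-style S3 URL that the HTTP Request build step downloads."""
    return "http://s3-{region}.amazonaws.com/{bucket}/{key}".format(
        region=region, bucket=bucket_name, key=key
    )


print(artifact_url("mybucketname"))
# prints http://s3-eu-central-1.amazonaws.com/mybucketname/codebuild-artifact.zip
```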

Testing the whole CI/CD pipeline

To test the whole solution, put an application on your GitHub repository. You can download the sample from here.

The following screenshot shows an application tree containing the application source files, including text and binary files, executables, and packages:

In this example, the application files are the templates directory, test_app.py file, and web.py file.

The appspec.yml file is the main application specification file telling CodeDeploy how to deploy your application. CodeDeploy uses the AppSpec file to manage each deployment as a series of lifecycle event “hooks”, as defined in the file. For information about how to create a well-formed AppSpec file, see AWS CodeDeploy AppSpec File Reference.

The buildspec.yml file is a collection of build commands and related settings, in YAML format, that CodeBuild uses to run a build. You can include a build spec as part of the source code, or you can define a build spec when you create a build project. For more information, see How AWS CodeBuild Works.

The scripts folder contains the scripts that you would like to run during the CodeDeploy LifecycleHooks execution with respect to your application requirements. For more information, see Plan a Revision for AWS CodeDeploy.

To test this solution, perform the following steps:

  1. Unzip the application files. To send them to your GitHub repository, run the following git commands from the path where you placed your sample application:
    $ git add -A
    
    $ git commit -m 'Your first application'
    
    $ git push
  2. On the Jenkins server dashboard, wait for two minutes until the previously set project trigger starts working. After the trigger starts working, you should see a new build taking place.
  3. In the Jenkins server Console Output page, check the build events and review the steps performed by each Jenkins plugin. You can also review the CodeDeploy deployment in detail, as shown in the following screenshot:

On completion, Jenkins should report that you have successfully deployed a web application. You can also use your ELBDNSName value to confirm that the deployed application is running successfully.

Conclusion

In this post, I outlined how you can use a Jenkins open-source automation server to deploy CodeBuild artifacts with CodeDeploy. I showed you how to construct a functioning CI/CD pipeline with these tools. I walked you through how to build the deployment infrastructure and automatically deploy application version changes from GitHub to your production environment.

Hopefully, you have found this post informative and the proposed solution useful. As always, AWS welcomes all feedback and comments.

About the Author


Noha Ghazal is a Cloud Support Engineer at Amazon Web Services. She is a subject matter expert for AWS CodeDeploy. In her role, she enjoys supporting customers with their CodeDeploy and other DevOps configurations. Outside of work she enjoys drawing portraits, fishing and playing video games.

 

 

Nine AWS Security Hub best practices

Post Syndicated from Ketan Srivastava original https://aws.amazon.com/blogs/security/nine-aws-security-hub-best-practices/

AWS Security Hub is a security and compliance service that became generally available on June 25, 2019. It provides you with extensive visibility into your security and compliance status across multiple AWS accounts, in a single dashboard per region. The service helps you monitor critical settings to ensure that your AWS accounts remain secure, allowing you to notice and react quickly to any changes in your environment.

AWS Security Hub aggregates, organizes, and prioritizes security findings from supported AWS services—that’s Amazon GuardDuty, Amazon Inspector, and Amazon Macie at the time this post was published—and from various AWS partner security solutions. AWS Security Hub also generates its own findings, based on automated, resource-level and account-level configuration and compliance checks using service-linked AWS Config rules plus other analytic techniques. These checks help you keep your AWS accounts compliant with industry standards and best practices, such as the Center for Internet Security (CIS) AWS Foundations standard.

In this post, I’ll provide nine best practices to help you use AWS Security Hub as effectively as possible.

1. Use the AWS Labs script to turn on Security Hub in all your AWS accounts in all regions and to establish your existing Amazon GuardDuty master/member hierarchy

As a best practice, you should continuously monitor all regions across all of your AWS accounts for unauthorized behavior or misconfigurations, even in regions that you don’t use heavily. AWS already recommends that you do this when using monitoring services like AWS Config and AWS CloudTrail. I recommend that you enable Security Hub in every region available in your AWS accounts.

You can also invite other AWS accounts to enable Security Hub and share findings with your AWS account. If you send an invitation and it is accepted by the other account owner, your Security Hub account is designated as the master account, and any associated Security Hub accounts become your member accounts. Users from the master account will then be able to view Security Hub findings from member accounts.

To simplify these configurations, you can utilize the AWS Labs script available on GitHub, which provides a step-by-step guide to automate this process. This script allows you to enable (and disable) AWS Security Hub simultaneously across a list of associated AWS accounts and bulk-add them to become your Security Hub members; it sends invitations from the master account and automatically accepts invitations in all member accounts. To run the script, you must have the AWS account IDs and root email addresses of the AWS accounts that you want as your Security Hub members. (Note that you should only share your root email address and account ID with AWS accounts that you trust. Visit the IAM best practices page to learn more about how to keep access to your AWS accounts secured.)

By default, the Security Hub master/member association is independent of the relationships that you’ve established between your Amazon GuardDuty or Amazon Macie accounts and other associated accounts. If you have an existing master/member hierarchy in GuardDuty or Macie, you can export that list of accounts into a CSV file and then use it with the script. For example in GuardDuty, use the ListMembers API to export the AWS Account ID and email of all member accounts, as follows:

aws guardduty list-members --detector-id <Detector ID> --query "Members[].[AccountId, Email]" --output text | awk '{print $1 "," $2}'

The output of the above command will be your GuardDuty member account IDs and their corresponding root email addresses, one per line and separated with a comma as shown below:

12345678910,[email protected]
98765432101,[email protected]
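If you want to post-process that CSV output yourself rather than hand it to the script, a minimal sketch could look like the following. The function name and the sample account IDs and addresses are placeholders of mine; the `AccountId`/`Email` dictionary shape is, to my understanding, what the Security Hub CreateMembers operation (`create_members(AccountDetails=...)` in Boto3) expects.

```python
import csv
import io

def parse_member_csv(text):
    """Turn 'account_id,email' lines into a list of AccountId/Email dicts."""
    members = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        account_id, email = (field.strip() for field in row)
        if not account_id.isdigit():
            continue  # skip header rows or other non-numeric IDs
        members.append({"AccountId": account_id, "Email": email})
    return members

# Placeholder data, not real accounts.
sample = "111111111111,security@example.com\n222222222222,ops@example.com\n"
print(parse_member_csv(sample))
```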

2. Enable AWS Config in all AWS accounts and regions and leave the AWS CIS Foundations standard check enabled

When you enable Security Hub in any region, the AWS CIS standard checks are enabled by default. I recommend leaving them enabled; they are important security measures that are applicable to all AWS accounts.

To run most of these checks, Security Hub uses service-linked AWS Config rules. Because of this, you should make sure that AWS Config is turned on and recording all supported resources, including global resources, in all accounts and regions where Security Hub is deployed. You are not charged by AWS Config for these service-linked rules. You are only charged via Security Hub’s pricing model.
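To make "recording all supported resources, including global resources" concrete, here is a small sketch that builds the arguments for AWS Config's PutConfigurationRecorder call. The helper name, the recorder name `default`, and the role ARN you pass in are assumptions for illustration.

```python
def recorder_settings(role_arn, name="default"):
    """Keyword arguments for AWS Config's PutConfigurationRecorder call.

    role_arn is the IAM role AWS Config assumes; the value you pass is
    specific to your account (any ARN shown in examples is a placeholder).
    """
    return {
        "ConfigurationRecorder": {
            "name": name,
            "roleARN": role_arn,
            "recordingGroup": {
                "allSupported": True,                 # record every supported resource type
                "includeGlobalResourceTypes": True,   # include global resources such as IAM
            },
        }
    }
```

With Boto3 this could be passed as `boto3.client("config", region_name=region).put_configuration_recorder(**recorder_settings(role_arn))`. The recorder also needs a delivery channel and has to be started before it records anything.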

3. Use specific managed IAM policies for different types of Security Hub users

You can choose to allow a large group of users to access List and Read Security Hub actions, which will permit them to view your security findings. However, you should allow only a small group of users to access the Security Hub Write actions. This will permit only authorized users to archive, resolve, or remediate the findings.

You can use AWS managed policies to give your employees the permissions they need to get started. These policies are already available in your account and are maintained and updated by AWS. To grant more granular permission to your Security Hub users, I recommend that you create your own customer managed policies. A great place to start with this is to import an existing AWS managed policy. That way, you know that the policy is initially correct, and all you need to do is customize it for your environment.
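A customer managed policy for the larger "view only" group might start from something like the sketch below. The wildcard action patterns and the Sid are my assumptions about what read access should cover; compare them with the current AWSSecurityHubReadOnlyAccess managed policy before relying on them.

```python
import json

# Assumed prefixes for List/Read style actions; verify against the managed policy.
READ_ONLY_PREFIXES = ("securityhub:Get", "securityhub:List", "securityhub:Describe")

def read_only_policy():
    """Build an IAM policy document that allows viewing Security Hub data only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "SecurityHubViewFindings",
                "Effect": "Allow",
                "Action": [prefix + "*" for prefix in READ_ONLY_PREFIXES],
                "Resource": "*",
            }
        ],
    }

print(json.dumps(read_only_policy(), indent=2))
```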

AWS categorizes each service action into one of five access levels based on what each action does: List, Read, Write, Permissions management, or Tagging. To determine which access level to include in the IAM policies that you assign to your users, you can view the policy summary by navigating from the IAM Console to Policies, then selecting any AWS managed or customer managed policy. Next, on the Summary page, under the Permissions tab, select Policy summary (see Figure 1). For more details and examples of access level classification, see Understanding Access Level Summaries Within Policy Summaries.
 

Figure 1: Policy summary of AWSSecurityHubReadOnlyAccess AWS managed policy


4. Use tags for access controls and cost allocation

A SecurityHub::Hub resource represents the implementation of the AWS Security Hub service per region in your AWS account. Security Hub allows you to assign metadata to your SecurityHub::Hub resource in the form of tags. Each tag is a label consisting of a user-defined key and an optional value that makes it easier for you to identify and manage the AWS resources in your environment.

You can control access permissions by using tags on your SecurityHub::Hub resource. For example, you can allow a group of developer IAM entities to manage and update only the SecurityHub::Hub resources that have the tag key developer associated with them. This can help you restrict access to your production SecurityHub::Hub resources, while allowing your developers to continue testing in their developer environment.

For more information on the supported tag-based conditions which you can use with the Security Hub APIs, refer to Condition Keys for AWS Security Hub. Please note that when you use tag-based conditions for access control, you must define who can modify those tags.
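The developer example above could be expressed as a policy statement along these lines. This is only a sketch: the helper name is mine, the action list in the usage line is illustrative, and you should confirm on the Condition Keys page which Security Hub actions actually honor resource tags. The `Null` condition with `"false"` is the standard IAM way of requiring that a tag key is present.

```python
def tag_scoped_statement(actions, tag_key):
    """Allow the given actions only on resources that carry the tag key."""
    return {
        "Effect": "Allow",
        "Action": list(actions),
        "Resource": "*",
        # "Null": "false" means the key must exist on the resource.
        "Condition": {"Null": {"aws:ResourceTag/%s" % tag_key: "false"}},
    }

# Illustrative action pattern, not a vetted list.
statement = tag_scoped_statement(["securityhub:Update*"], "developer")
print(statement)
```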

To make it easier to categorize and track your AWS costs, you can also activate cost allocation tags. This helps you organize your SecurityHub::Hub resource costs. AWS generates a cost allocation report as a CSV file, with your usage and costs grouped according to your active tags. You can apply tags that represent business categories (such as cost centers, application names, or project environments) to organize your costs.

For more information on commonly used tagging categories and effective tagging strategies, read about AWS Tagging Strategies.

5. Integrate and enable your existing security products (with 34 integrations today and more to come)

Numerous tools can help you understand the security and compliance posture of your AWS accounts, but these tools generate their own set of findings, often in different formats. Security Hub normalizes the findings.

With Security Hub, findings generated from integrated providers (both third-party services and AWS services) are ingested using a standard findings format, which eliminates the need for security teams to convert the data. You can currently integrate 34 findings providers to import and/or export findings with Security Hub. Some partner products, like PagerDuty, Splunk, and Slack, can receive findings from Security Hub, although they don’t generate findings.

If you want to add a third-party partner product to your AWS environment, you can choose the Purchase link from the Security Hub console’s Integrations page and navigate to AWS Marketplace. Once purchased, choose the Configure link to navigate to step-by-step instructions to install the product and configure its integration with Security Hub. Then choose Enable integration to create a product subscription in your account for that third-party provider (see Figure 2).

After you enable a subscription, a resource policy is automatically attached to it. The resource policy defines the permissions that Security Hub needs to accept and process the product’s findings. You can also enable the subscription via the API and CloudFormation.
 

Figure 2: Integrating partner findings provider with Security Hub


6. Build out customized remediation playbooks using Amazon CloudWatch Events, AWS Systems Manager Automation documents, and AWS Step Functions to automatically resolve findings that don’t require human intervention

Security Hub automatically sends all findings to Amazon CloudWatch Events. This integration helps you automate your response to threat incidents by allowing you to take specific actions using AWS Systems Manager Automation documents, OpsItems, and AWS Step Functions. Using these tools, you can create your own incident handling plan. This will allow your security team to focus on strengthening the security of your AWS environments rather than on remediating the current findings.
 

Figure 3: Creating a CloudWatch Events Rule for sending matched Security Hub findings to specific Targets


7. Create custom actions to send a copy of a Security Hub finding to a resource that is internal or external to your AWS account, enabling additional visibility and remediation options for the finding

Because of its integration with CloudWatch Events, you can use Security Hub to create custom actions that will send specific findings to ticketing, chat, email, or automated remediation systems. Custom actions can also be sent to your own AWS resources, such as AWS Systems Manager OpsCenter, AWS Lambda or Amazon Kinesis, allowing you to do your own remediation or data capture related to the finding.

For an in-depth look at this architecture, plus specific examples of how to implement custom actions, see How to Integrate AWS Security Hub Custom Actions with PagerDuty and How to Enable Custom Actions in AWS Security Hub.
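For orientation, the CloudWatch Events rule behind either flow is driven by an event pattern. The sketch below builds one for all imported findings or, when given a custom action ARN, for findings sent through that action. The function name is mine and the ARN in the usage line is a placeholder; the `source` and `detail-type` strings are the values Security Hub uses for these events as far as I know.

```python
import json

def findings_event_pattern(custom_action_arn=None):
    """Return a CloudWatch Events pattern (as JSON) for Security Hub findings."""
    pattern = {"source": ["aws.securityhub"]}
    if custom_action_arn is None:
        # Every finding that Security Hub imports.
        pattern["detail-type"] = ["Security Hub Findings - Imported"]
    else:
        # Only findings a user sent through the given custom action.
        pattern["detail-type"] = ["Security Hub Findings - Custom Action"]
        pattern["resources"] = [custom_action_arn]
    return json.dumps(pattern)

# Placeholder ARN for illustration only.
print(findings_event_pattern("arn:aws:securityhub:us-east-1:111111111111:action/custom/example"))
```

The resulting string could be supplied as the `EventPattern` argument when creating the rule, for example with the Boto3 `events` client's `put_rule` call.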

In addition, Security Hub gives you the option to choose a language-specific AWS SDK so that you can use custom actions to resolve findings programmatically. Below, I’ll demonstrate how you can implement this using AWS Lambda and AWS SDK for Python (Boto3). In my example, I’ll remediate the finding generated by Security Hub for CIS check 2.4, “Ensure CloudTrail trails are integrated with Amazon CloudWatch Logs.” For this walk-through, I assume that you have the necessary AWS IAM permissions to work with Security Hub, CloudWatch Events, Lambda and AWS CloudTrail.
 

Figure 4: Data flow supporting remediation of Security Hub findings using custom actions


As shown in figure 4:

  1. When findings against CIS check 2.4 are generated in Security Hub, Security Hub will send them to CloudWatch Events using custom actions that I’ll describe below.
  2. CloudWatch Events will send the findings to a Lambda function that has been configured as the target.
  3. The Lambda function will utilize a Python script to check whether the finding has been generated against CIS check 2.4. If it has, the Lambda function will identify the affected CloudTrail trail and configure it with CloudWatch Logs to monitor the trail logs.
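The filtering in step 3 can be sketched in isolation, without any AWS calls. The helper below takes the event that CloudWatch Events delivers and returns the names of the trails that need remediation; the field paths mirror the ones used by the Lambda function later in this post, while the helper name and the sample event are made up for illustration.

```python
CIS_2_4_TITLE = "Ensure CloudTrail trails are integrated with CloudWatch Logs"

def trails_to_remediate(event):
    """Return trail names from non-compliant CIS 2.4 findings in the event."""
    trails = []
    for finding in event.get("detail", {}).get("findings", []):
        if CIS_2_4_TITLE not in finding.get("Title", ""):
            continue  # some other check; not handled here
        status = finding.get("Compliance", {}).get("Status")
        if status is None or status == "PASSED":
            continue  # nothing to fix, or no compliance status to act on
        for resource in finding.get("Resources", []):
            name = resource.get("Details", {}).get("Other", {}).get("name")
            if name:
                trails.append(name)
    return trails

# Hypothetical, heavily trimmed event for local experimentation.
sample_event = {
    "detail": {
        "findings": [
            {
                "Title": "2.4 " + CIS_2_4_TITLE,
                "Compliance": {"Status": "FAILED"},
                "Resources": [{"Id": "example-id", "Details": {"Other": {"name": "my-trail"}}}],
            }
        ]
    }
}
print(trails_to_remediate(sample_event))
```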

Prerequisites

  1. You must configure an IAM Role for AWS CloudTrail to assume so that it can deliver events to your CloudWatch Logs log group. For more information about how to do this, refer to the AWS CloudTrail documentation. I’ll refer to this role as the CloudTrail role.
  2. To deploy the Lambda function, you must configure an IAM Role for the Lambda function to assume. I’ll refer to this role as the Lambda execution role. The following sample policy includes the permissions that you’ll assign to it for this example. Please replace <CloudTrail_CloudWatchLogs_Role> with the CloudTrail role that you created in the previous step. Depending on your use case, you can restrict this IAM policy further to grant least privilege, which is a recommended IAM Best Practice.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:DescribeLogGroups",
                "cloudtrail:UpdateTrail",
                "iam:GetRole"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::012345678910:role/<CloudTrail_CloudWatchLogs_Role>"
        }
    ]
}     

Solution deployment

  1. Create a custom action in AWS Security Hub and associate it with a CloudWatch Events rule that you configure for your Security Hub findings. Follow the instructions laid out in the Security Hub user guide for the exact steps to do this.
  2. Create a Lambda Function, which will complete the auto-remediation of the CIS 2.4 findings:
    1. Open the Lambda Console and select Create function.
    2. On the next page, choose Author from scratch.
    3. Under Basic information, enter a name for your function. For Runtime, select Python 3.7.
       
      Figure 5: Updating basic information to create the Lambda function


    4. Under Permissions, expand Choose or create an execution role.
    5. Under Execution role, select the drop down menu and change the setting to Use an existing role.
    6. Under Existing role, select the Lambda execution role that you created earlier, then select Create function.
       
      Figure 6: Updating basic information and permissions to create the Lambda function


    7. Delete the default function code and paste the code I’ve provided below:
      
              import json, boto3
              cloudtrail_client = boto3.client('cloudtrail')
              cloudwatchlogs_client = boto3.client('logs')
              iam_client = boto3.client('iam')
              
              role_details = iam_client.get_role(RoleName='<CloudTrail_CloudWatchLogs_Role>')
              
              def lambda_handler(event, context):
                  # First of all, let us see if the JSON sent by CWE has any Security Hub findings.
                  if 'detail' in event.keys() and 'findings' in event['detail'].keys() and len(event['detail']['findings']) > 0:
                      print("There are some findings. Let's check them!")
                      print("Number of findings: %i" % len(event['detail']['findings']))
              
                      # Then we need to filter out the findings. In this code snippet, we'll handle only findings related to CloudTrail trails for integration with CloudWatch Logs.
                      for finding in event['detail']['findings']:
                          if 'Title' in finding.keys():
                              if 'Ensure CloudTrail trails are integrated with CloudWatch Logs' in finding['Title']:
                                  print("There's a CloudTrail-related finding. I can handle it!")
              
                                  if 'Compliance' in finding.keys() and 'Status' in finding['Compliance'].keys():
                                      print("Compliance Status: %s" % finding['Compliance']['Status'])
              
                                      # We can skip compliant findings, and evaluate only the non-compliant ones.                        
                                      if finding['Compliance']['Status'] == 'PASSED':
                                          continue
              
                                      # For each non-compliant finding, we need to get specific pieces of information so as to create the correct log group and update the CloudTrail trail.                        
                                      for resource in finding['Resources']:
                                          resource_id = resource['Id']
                                          cloudtrail_name = resource['Details']['Other']['name']
                                          loggroup_name = 'CloudTrail/' + cloudtrail_name
                                          print("ResourceId for the finding is %s" % resource_id)
                                          print("LogGroup name: %s" % loggroup_name)
              
                                          # At this point, we can create the log group using the name extracted from the finding.
                                          try:
                                              response_logs = cloudwatchlogs_client.create_log_group(logGroupName=loggroup_name)
                                          except Exception as e:
                                              print("Exception: %s" % str(e))
              
                                          # For updating the CloudTrail trail, we need to have the ARN of the log group. Let's retrieve it now.                            
                                          response_logsARN = cloudwatchlogs_client.describe_log_groups(logGroupNamePrefix = loggroup_name)
                                          print("LogGroup ARN: %s" % response_logsARN['logGroups'][0]['arn'])
                                          print("The role used by CloudTrail is: %s" % role_details['Role']['Arn'])
              
                                          # Finally, let's update the CloudTrail trail so that it sends logs to the new log group created.
                                          try:
                                              response_cloudtrail = cloudtrail_client.update_trail(
                                                  Name=cloudtrail_name,
                                                  CloudWatchLogsLogGroupArn = response_logsARN['logGroups'][0]['arn'],
                                                  CloudWatchLogsRoleArn = role_details['Role']['Arn']
                                              )
                                          except Exception as e:
                                              print("Exception: %s" % str(e))
                              else:
                                  print("Title: %s" % finding['Title'])
                                  print("This type of finding cannot be handled by this function. Skipping it…")
                          else:
                              print("This finding doesn't have a title and so cannot be handled by this function. Skipping it…")
                  else:
                      print("There are no findings to remediate.")            
              

    8. After pasting the code, replace <CloudTrail_CloudWatchLogs_Role> with your CloudTrail role and select Save to save your Lambda function.
       
      Figure 7: Editing Lambda code to replace the correct CloudTrail role


  3. Go to your CloudWatch console and select Rules in the navigation pane on the left.
    1. From the list of CloudWatch rules that you see, select the rule which you created in Step 1 of this solution deployment.
    2. Then, select Actions on the top right of the page and choose Edit.
    3. On the Step 1: Create rule page, under Targets, choose Lambda function and select the Lambda function you created in Step 2.
    4. Select Configure details.
    5. On the Step 2: Configure rule details page, select Update rule.
       
      Figure 8: Adding your created Lambda function as Target for the CloudWatch rule


  4. Configuration is now complete, and you can test your rule. Go to your AWS Security Hub console and select Compliance standards in the navigation pane.
    1. Next, select CIS AWS Foundations.
       
      Figure 9: Compliance standards page in the Security Hub console


    2. Search for the rule 2.4 Ensure CloudTrail trails are integrated with CloudWatch Logs and select it.
       
      Figure 10: Locating CIS check 2.4 in the Security Hub console


    3. If you’ve left the default AWS Security Hub CIS checks enabled (along with AWS Config service in the same region), and if you have CloudTrail trails in that region which are not yet configured to deliver events to CloudWatch Logs, you should see a low severity finding with a Failed Compliance status.
    4. Select the failed finding by selecting the checkbox and choosing the Actions button.
    5. Finally, from the dropdown menu, select the custom action that you created in Step 1 to send the finding to CloudWatch Events. CloudWatch Events will send the finding to your Lambda function, which you configured as the target for the rule in Step 3. The Lambda function will automatically identify the affected CloudTrail trail and configure a CloudWatch Logs log group for it. The log group is named CloudTrail/ followed by the name of your trail, so that you can match it to the trail. You can modify the code further to suit your needs.

    Note: There may be a delay before the compliance status of the remediated resource changes. Once the CIS AWS Foundations Standard is enabled, Security Hub will run the checks within 2 hours. After that, the checks are automatically run once every 24 hours.

     

    Figure 11: Findings generated against CIS check 2.4 in the Security Hub Console


    8. Customize your insights using the default “managed insights” as templates and use them to prioritize resources and findings to act upon

    A Security Hub “insight” is a collection of related findings to which one or more Security Hub filters have been applied. Insights can help you organize your findings and identify security risks that need immediate attention.

    Security Hub offers several managed (default) insights. You can use these as templates for new insights, and modify them depending on your use case. You can save these modified queries as new custom insights to ensure an even greater visibility of your AWS accounts. Please refer to the documentation for step-by-step instructions on how to create custom insights.
     

    Figure 12: Creating a Security Hub custom insight


    9. Use the free trial to evaluate what your costs could be

    Security Hub provides a 30-day free trial for all AWS accounts and regions. The trial is a good way to evaluate how much Security Hub will cost, on average, to monitor threats and compliance in your environments. You can view an estimate by navigating from the Security Hub console to Settings, then Usage (see Figure 13).
     

    Figure 13: Estimating your Security Hub costs


    Conclusion

    AWS Security Hub allows you to have more visibility into the security and compliance status of your AWS environments. Using the Security Hub best practices discussed here, security teams can spend more time on incident remediation and recovery rather than incident detection and organization. Security Hub has undergone HIPAA, ISO, PCI, and SOC certification. To learn more about Security Hub, refer to the AWS Security Hub documentation.

    If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the AWS Security Hub forum or contact AWS Support.

    Want more AWS Security news? Follow us on Twitter.

    Author

    Ketan Srivastava

    Ketan is a Cloud Support Engineer at AWS. He enjoys the fact that, at AWS, there are always so many opportunities to build things better for our customers and learn from these opportunities. Outside of work, he plays MOBAs and travels to new places with his wife. He holds a Master of Science degree from Rochester Institute of Technology.

Let’s Annotate Our Methods With The Features They Implement

Post Syndicated from Bozho original https://techblog.bozho.net/lets-annotate-our-methods-with-the-features-they-implement/

Writing software consists of very little actual “writing”, and much more thinking, designing, reading, “digging”, analyzing, debugging, refactoring, aligning and meeting others.

The reading and digging part is where you try to understand what has been implemented before, why it has been implemented, and how it works. In larger projects it becomes increasingly hard to find what is happening and why – there are so many classes that interact, and so many methods that participate in implementing a particular feature.

That’s probably because there is a mismatch between the programming units (classes, methods) and the business logic units (features). Product owners want a “password reset” feature, and they don’t care if it’s done using framework configuration, custom code split in three classes, or one monolithic controller method that does that job.

This mismatch is partially addressed by so-called BDD (behaviour-driven development), as business people can define scenarios in a formalized language (although they rarely do; it’s still up to the QAs or developers to write the tests). But having your tests organized around features and behaviours doesn’t mean the code is, and BDD doesn’t help in making your way through the codebase in search of why and how something is implemented.

Another issue is linking a piece of code to the issue tracking system. Source control conventions and hooks allow for setting the issue tracker number as part of the commit, and then when browsing the code, you can annotate the file and see the issue number. However, due to the many changes, even a very strict team will end up with methods that are related to multiple issues, and you can’t easily tell which is the proper one.

Yet another issue with the lack of a “feature” unit in programming languages is that you can’t trivially reuse existing projects to start a new one. We’ve all been there – you have a similar project and you want to get a skeleton to get things running faster. And while there are many tools to help with that (Spring Boot, Spring Roo, and other scaffolding utilities), they can rarely deliver what you need – you always have to tweak something, delete something, customize some configuration, as defaults are almost never practical.

And I have a simple proposal that will help with the issues above. As with any complex problem, simple ideas don’t solve everything, but are at least a step forward.

The proposal is in the title – let’s annotate our methods with the features they implement. Let’s have @Feature(name = "Forgotten password", issueTrackerCode="PROJ-123"). A method can implement multiple features, but that is generally discouraged by best practices (e.g. the single responsibility principle). The granularity of “feature” is something that has to be determined by each team and is the tricky part – sometimes an epic describes a feature, sometimes individual stories or even subtasks do. A definition of a feature should be agreed upon and every new team member should be told what to do and how to interpret it.

There is of course a lot of complexity, e.g. for generic methods like DAO methods, utility methods, or methods that are reused in too many places. But they also represent features, it’s just that these features are horizontal. “Data access layer” is a feature – a more technical one indeed, but it counts, and maybe deserves a story in the issue tracker.

Your features can actually be listed in one or several enums, grouped by type – business, horizontal, performance, etc. That way you can even compose features – e.g. account creation contains the logic itself, database access, a security layer.

How does such a proposal help?

  • Raises awareness of the single responsibility of methods and of the need for readable code
  • Provides a rationale for the existence of each method. Even if a proper comment is missing, the annotation will put a method (or a class) in context
  • Helps with navigating code and fixing issues (if you can see all places where a feature is implemented, you are more likely to spot an issue)
  • Allows tools to analyze your features – amount, complexity, how chaotically a feature is spread across the code base, test coverage per feature, etc.
  • Allows tools to use existing projects for scaffolding for new ones – you specify the features you want to have, and they are automatically copied
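The proposal targets Java annotations, but the idea is language-neutral. As a minimal sketch in Python, a decorator can play the role of @Feature and feed a registry that a tool could query. All names here (`feature`, `implementations_of`, the two sample functions) are hypothetical and not part of any existing library.

```python
from collections import defaultdict

# Registry: feature name -> names of the functions that implement it.
_FEATURES = defaultdict(list)

def feature(name, issue_tracker_code=None):
    """Mark a function as implementing a feature and record it in the registry."""
    def decorate(func):
        func.feature = {"name": name, "issue_tracker_code": issue_tracker_code}
        _FEATURES[name].append(func.__name__)
        return func
    return decorate

def implementations_of(name):
    """List the functions annotated with the given feature name."""
    return list(_FEATURES.get(name, []))

@feature("Forgotten password", issue_tracker_code="PROJ-123")
def send_reset_email(user_email):
    return "reset link sent to %s" % user_email

@feature("Forgotten password", issue_tracker_code="PROJ-123")
def validate_reset_token(token):
    return bool(token)

print(implementations_of("Forgotten password"))
```

A registry like this is the starting point for the tooling mentioned above: once every method is tagged, listing, counting, or copying the code behind a feature becomes a lookup.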

At this point I’m supposed to give a link to a GitHub project for a feature annotation library. But it doesn’t make sense to have a single-annotation project. It can easily be part of guava or something similar, or it can be created manually in each project. The complex part – the tools that will do the scanning and analysis – deserves separate projects, but unfortunately I don’t have time to write one.

But even without the tools, the concept of annotating methods with their high-level features is, I think, a useful one. Instead of trying to deduce why a method is here and what requirements it has to implement (and whether all the necessary tests were written at the time), you can rely on such an annotation.

The post Let’s Annotate Our Methods With The Features They Implement appeared first on Bozho's tech blog.

New – How to better monitor your custom application metrics using Amazon CloudWatch Agent

Post Syndicated from Helen Lin original https://aws.amazon.com/blogs/devops/new-how-to-better-monitor-your-custom-application-metrics-using-amazon-cloudwatch-agent/

This blog was contributed by Zhou Fang, Sr. Software Development Engineer for Amazon CloudWatch and Helen Lin, Sr. Product Manager for Amazon CloudWatch

Amazon CloudWatch collects monitoring and operational data from both your AWS resources and on-premises servers, providing you with a unified view of your infrastructure and application health. By default, CloudWatch automatically collects and stores many of your AWS services’ metrics and enables you to monitor and alert on metrics such as high CPU utilization of your Amazon EC2 instances. With the CloudWatch Agent that launched last year, you can also deploy the agent to collect system metrics and application logs from both your Windows and Linux environments. Using this data collected by CloudWatch, you can build operational dashboards to monitor your service and application health, set high-resolution alarms to alert and take automated actions, and troubleshoot issues using Amazon CloudWatch Logs.

We recently introduced CloudWatch Agent support for collecting custom metrics using StatsD and collectd. It’s important to collect system metrics like available memory, and you might also want to monitor custom application metrics. For example, you can use request count to understand the traffic going through your application, or latency to be alerted when requests take too long to process. StatsD and collectd are popular, open-source solutions that gather system statistics for a wide variety of applications. By combining the system metrics the agent already collects with the StatsD protocol for instrumenting your own metrics and collectd’s numerous plugins, you can better monitor, analyze, alert, and troubleshoot the performance of your systems and applications.

Let’s dive into an example that demonstrates how to monitor your applications using the CloudWatch Agent.  I am operating a RESTful service that performs simple text encoding. I want to use CloudWatch to help monitor a few key metrics:

  • How many requests are coming into my service?
  • How many of these requests are unique?
  • What is the typical size of a request?
  • How long does it take to process a job?

These metrics help me understand my application performance and throughput, in addition to setting alarms on critical metrics that could indicate service degradation, such as request latency.

Step 1. Collecting StatsD metrics

My service is running on an EC2 instance, using Amazon Linux AMI 2018.03.0. Make sure to attach the CloudWatchAgentServerPolicy AWS managed policy to the instance's IAM role so that the CloudWatch agent can collect and publish metrics from this instance.

Here is the service structure:

 

The “/encode” handler simply returns the base64 encoded string of an input text.  To monitor key metrics, such as total and unique request count as well as request size and method response time, I used StatsD to define these custom metrics.

@RestController
public class EncodeController {

    @RequestMapping("/encode")
    public String encode(@RequestParam(value = "text") String text) {
        long startTime = System.currentTimeMillis();
        statsd.incrementCounter("totalRequest.count", new String[]{"path:/encode"});
        statsd.recordSetValue("uniqueRequest.count", text, new String[]{"path:/encode"});
        statsd.recordHistogramValue("request.size", text.length(), new String[]{"path:/encode"});
        String encodedString = Base64.getEncoder().encodeToString(text.getBytes());
        statsd.recordExecutionTime("latency", System.currentTimeMillis() - startTime, new String[]{"path:/encode"});
        return encodedString;
    }
}

Note that I need to first choose a StatsD client from here.
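Whichever client you pick, what travels to the agent is a short text datagram. The sketch below shows, under my assumptions, how the metrics above could look on the wire: the type codes (`c`, `g`, `ms`, `h`, `s`) are the usual StatsD ones, the `|#tag` suffix follows the DogStatsD-style tag extension, and port 8125 is the conventional StatsD port. The function names are hypothetical.

```python
import socket

# Usual StatsD type codes.
STATSD_TYPES = {"counter": "c", "gauge": "g", "timer": "ms", "histogram": "h", "set": "s"}

def format_metric(name, value, kind, tags=None):
    """Format one metric as a StatsD line, with optional DogStatsD-style tags."""
    line = "%s:%s|%s" % (name, value, STATSD_TYPES[kind])
    if tags:
        line += "|#" + ",".join(tags)
    return line

def send_metric(line, host="127.0.0.1", port=8125):
    """Send one metric line over UDP (fire and forget); host and port are assumed defaults."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))

print(format_metric("totalRequest.count", 1, "counter", ["path:/encode"]))
```

Because StatsD uses UDP, a missing or stopped listener does not block the application; the datagram is simply dropped.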

The “/status” handler responds with a health check ping.  Here I am monitoring my available JVM memory:

@RestController
public class StatusController {

    @RequestMapping("/status")
    public int status() {
        statsd.recordGaugeValue("memory.free", Runtime.getRuntime().freeMemory(), new String[]{"path:/status"});
        return 0;
    }
}

 

Step 2. Emit custom metrics using collectd (optional)

collectd is another popular, open-source daemon for collecting application metrics. If I want to use the hundreds of available collectd plugins to gather application metrics, I can also use the CloudWatch Agent to publish collectd metrics to CloudWatch with 15-month retention. In practice, I might choose to use either StatsD or collectd to collect custom metrics, or I can use both. All of these use cases are supported by the CloudWatch agent.

Using the same demo RESTful service, I’ll show you how to monitor my service health using the collectd cURL plugin, which passes the collectd metrics to CloudWatch Agent via the network plugin.

For my RESTful service, the “/status” handler returns HTTP code 200 to signify that it’s up and running. It is important to monitor the health of my service and trigger an alert when the application does not respond with an HTTP 200 success code. Additionally, I want to monitor the elapsed time for each health check request.

To collect these metrics using collectd, I have a collectd daemon installed on the EC2 instance, running version 5.8.0. Here is my collectd config:

LoadPlugin logfile
LoadPlugin curl
LoadPlugin network

<Plugin logfile>
  LogLevel "debug"
  File "/var/log/collectd.log"
  Timestamp true
</Plugin>

<Plugin curl>
    <Page "status">
        URL "http://localhost:8080/status"
        MeasureResponseTime true
        MeasureResponseCode true
    </Page>
</Plugin>

<Plugin network>
    <Server "127.0.0.1" "25826">
        SecurityLevel Encrypt
        Username "user"
        Password "secret"
    </Server>
</Plugin>

 

For the cURL plugin, I configured it to measure response time (latency) and response code (HTTP status code) from the RESTful service.

Note that for the network plugin, I used Encrypt mode, which requires an authentication file so that the CloudWatch Agent can authenticate incoming collectd requests. Click here for full details on the collectd installation script.
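For readers who want to see what this health check amounts to, the following is a rough Java sketch of the two measurements: the HTTP status code and the elapsed time of a single GET request. It is not how the collectd cURL plugin is implemented. The class name `HealthProbeSketch`, the two-second timeouts, and the `{statusCode, elapsedMillis}` return shape are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class HealthProbeSketch {

    // Healthy means the endpoint answered with HTTP 200.
    public static boolean isHealthy(int statusCode) {
        return statusCode == HttpURLConnection.HTTP_OK;
    }

    // Performs one GET and returns {statusCode, elapsedMillis}.
    public static long[] probe(String url) throws IOException {
        long start = System.nanoTime();
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(2000); // hypothetical timeout values
        conn.setReadTimeout(2000);
        try {
            int code = conn.getResponseCode();
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
            return new long[] {code, elapsedMillis};
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        try {
            long[] result = probe("http://localhost:8080/status");
            System.out.println("code=" + result[0] + " ms=" + result[1]
                    + " healthy=" + isHealthy((int) result[0]));
        } catch (IOException e) {
            // No response at all is also a failed health check.
            System.out.println("unreachable: " + e.getMessage());
        }
    }
}
```

Treating a connection failure the same way as a non-200 response matches the alerting goal described earlier: anything other than a timely HTTP 200 should count against the service.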

 

Step 3. Configure the CloudWatch agent

So far, I have shown you how to:

A.  Use StatsD to emit custom metrics to monitor my service health
B.  Optionally use collectd to collect metrics using plugins

Next, I will install and configure the CloudWatch agent to accept metrics from both the StatsD client and collectd plugins.

I installed the CloudWatch Agent following the instructions in the user guide, but here are the detailed steps:

Install CloudWatch Agent:

wget https://s3.amazonaws.com/amazoncloudwatch-agent/linux/amd64/latest/AmazonCloudWatchAgent.zip -O AmazonCloudWatchAgent.zip && unzip -o AmazonCloudWatchAgent.zip && sudo ./install.sh

Configure CloudWatch Agent to receive metrics from StatsD and collectd:

{
  "metrics": {
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "collectd": {},
      "statsd": {}
    }
  }
}

Pass the above config (config.json) to the CloudWatch Agent:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:config.json -s

In case you want to skip these steps and just execute my sample agent install script, you can find it here.

 

Step 4. Generate and monitor application traffic in CloudWatch

Now that I have the CloudWatch agent installed and configured to receive StatsD and collectd metrics, I’m going to generate traffic through the service:

echo "send 100 requests"
for i in {1..100}
do
   curl "localhost:8080/encode?text=TextToEncode_${i}[email protected]#%"
   echo ""
   sleep 1
done
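One detail worth keeping in mind when generating traffic like this: characters such as "#" and "%" have a special meaning in URLs, so text that contains them should be percent-encoded before it is placed in the query string. The sketch below shows one way to build such a request URL in Java (10 or later). The class name `QueryEncodeSketch` and the sample text are hypothetical and only serve the illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class QueryEncodeSketch {

    // Builds the /encode URL with the text percent-encoded as a query value.
    public static String encodeUrl(String baseUrl, String text) {
        return baseUrl + "/encode?text=" + URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample text containing reserved characters.
        System.out.println(encodeUrl("http://localhost:8080", "TextToEncode_1#%"));
        // prints http://localhost:8080/encode?text=TextToEncode_1%23%25
    }
}
```

Note that `URLEncoder` applies form encoding, so a space becomes "+", which is fine for query parameters like this one.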

 

Next, I log in to the CloudWatch console and check that the service is up and running. Here’s a graph of the StatsD metrics:

 

Here is a graph of the collectd metrics:

 

Conclusion

With StatsD and collectd support, you can now use the CloudWatch Agent to collect and monitor your custom applications in addition to the system metrics and application logs it already collects. Furthermore, you can create operational dashboards with these metrics, set alarms to take automated actions when free memory is low, and troubleshoot issues by diving into the application logs.  Note that StatsD supports both Windows and Linux operating systems while collectd is Linux only.  For Windows, you can also continue to use Windows Performance Counters to collect custom metrics instead.

The CloudWatch Agent with custom metrics support (version 1.203420.0 or later) is available in all public AWS Regions and AWS GovCloud (US), with AWS China (Beijing) and AWS China (Ningxia) coming soon.

The agent is free to use; you pay the usual CloudWatch prices for logs and custom metrics.

For more details, head over to the CloudWatch user guide for StatsD and collectd.

AWS Online Tech Talks – June 2018

Post Syndicated from Devin Watson original https://aws.amazon.com/blogs/aws/aws-online-tech-talks-june-2018/


Join us this month to learn about AWS services and solutions. New this month, we have a fireside chat with the GM of Amazon WorkSpaces and our 2nd episode of the “How to re:Invent” series. We’ll also cover best practices, deep dives, use cases and more! Join us and register today!

Note – All sessions are free and in Pacific Time.

Tech talks featured this month:

 

Analytics & Big Data

June 18, 2018 | 11:00 AM – 11:45 AM PT – Get Started with Real-Time Streaming Data in Under 5 Minutes – Learn how to use Amazon Kinesis to capture, store, and analyze streaming data in real-time including IoT device data, VPC flow logs, and clickstream data.
June 20, 2018 | 11:00 AM – 11:45 AM PT – Insights For Everyone – Deploying Data across your Organization – Learn how to deploy data at scale using AWS Analytics and QuickSight’s new reader role and usage based pricing.

 

AWS re:Invent

June 13, 2018 | 05:00 PM – 05:30 PM PT – Episode 2: AWS re:Invent Breakout Content Secret Sauce – Hear from one of our own AWS content experts as we dive deep into the re:Invent content strategy and how we maintain a high bar.

Compute

June 25, 2018 | 01:00 PM – 01:45 PM PT – Accelerating Containerized Workloads with Amazon EC2 Spot Instances – Learn how to efficiently deploy containerized workloads and easily manage clusters at any scale at a fraction of the cost with Spot Instances.

June 26, 2018 | 01:00 PM – 01:45 PM PT – Ensuring Your Windows Server Workloads Are Well-Architected – Get the benefits, best practices and tools on running your Microsoft Workloads on AWS leveraging a well-architected approach.

Containers

June 25, 2018 | 09:00 AM – 09:45 AM PT – Running Kubernetes on AWS – Learn about the basics of running Kubernetes on AWS, including how to set up masters, networking, and security, and how to add auto scaling to your cluster.

Databases

June 18, 2018 | 01:00 PM – 01:45 PM PT – Oracle to Amazon Aurora Migration, Step by Step – Learn how to migrate your Oracle database to Amazon Aurora.

DevOps

June 20, 2018 | 09:00 AM – 09:45 AM PT – Set Up a CI/CD Pipeline for Deploying Containers Using the AWS Developer Tools – Learn how to set up a CI/CD pipeline for deploying containers using the AWS Developer Tools.

Enterprise & Hybrid

June 18, 2018 | 09:00 AM – 09:45 AM PT – De-risking Enterprise Migration with AWS Managed Services – Learn how enterprise customers are de-risking cloud adoption with AWS Managed Services.

June 19, 2018 | 11:00 AM – 11:45 AM PT – Launch AWS Faster using Automated Landing Zones – Learn how the AWS Landing Zone can automate the set up of best practice baselines when setting up new AWS Environments.

June 21, 2018 | 11:00 AM – 11:45 AM PT – Leading Your Team Through a Cloud Transformation – Learn how you can help lead your organization through a cloud transformation.

June 21, 2018 | 01:00 PM – 01:45 PM PT – Enabling New Retail Customer Experiences with Big Data – Learn how AWS can help retailers realize actual value from their big data and deliver on differentiated retail customer experiences.

June 28, 2018 | 01:00 PM – 01:45 PM PT – Fireside Chat: End User Collaboration on AWS – Learn how End User Compute services can help you deliver access to desktops and applications anywhere, anytime, using any device.

IoT

June 27, 2018 | 11:00 AM – 11:45 AM PT – AWS IoT in the Connected Home – Learn how to use AWS IoT to build innovative Connected Home products.

 

Machine Learning

June 19, 2018 | 09:00 AM – 09:45 AM PT – Integrating Amazon SageMaker into your Enterprise – Learn how to integrate Amazon SageMaker and other AWS Services within an Enterprise environment.

June 21, 2018 | 09:00 AM – 09:45 AM PT – Building Text Analytics Applications on AWS using Amazon Comprehend – Learn how you can unlock the value of your unstructured data with NLP-based text analytics.

 

Management Tools

June 20, 2018 | 01:00 PM – 01:45 PM PT – Optimizing Application Performance and Costs with Auto Scaling – Learn how selecting the right scaling option can help optimize application performance and costs.

 

Mobile

June 25, 2018 | 11:00 AM – 11:45 AM PT – Drive User Engagement with Amazon Pinpoint – Learn how Amazon Pinpoint simplifies and streamlines effective user engagement.

 

Security, Identity & Compliance

June 26, 2018 | 09:00 AM – 09:45 AM PT – Understanding AWS Secrets Manager – Learn how AWS Secrets Manager helps you rotate and manage access to secrets centrally.
June 28, 2018 | 09:00 AM – 09:45 AM PT – Using Amazon Inspector to Discover Potential Security Issues – See how Amazon Inspector can be used to discover security issues of your instances.

 

Serverless

June 19, 2018 | 01:00 PM – 01:45 PM PT – Productionize Serverless Application Building and Deployments with AWS SAM – Learn expert tips and techniques for building and deploying serverless applications at scale with AWS SAM.

 

Storage

June 26, 2018 | 11:00 AM – 11:45 AM PT – Deep Dive: Hybrid Cloud Storage with AWS Storage Gateway – Learn how you can reduce your on-premises infrastructure by using the AWS Storage Gateway to connect your applications to the scalable and reliable AWS storage services.
June 27, 2018 | 01:00 PM – 01:45 PM PT – Changing the Game: Extending Compute Capabilities to the Edge – Discover how to change the game for IIoT and edge analytics applications with AWS Snowball Edge plus enhanced Compute instances.
June 28, 2018 | 11:00 AM – 11:45 AM PT – Big Data and Analytics Workloads on Amazon EFS – Get best practices and deployment advice for running big data and analytics workloads on Amazon EFS.

Protecting your API using Amazon API Gateway and AWS WAF — Part I

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/protecting-your-api-using-amazon-api-gateway-and-aws-waf-part-i/

This post courtesy of Thiago Morais, AWS Solutions Architect

When you build web applications or expose any data externally, you probably look for a platform where you can build highly scalable, secure, and robust REST APIs. As APIs are publicly exposed, there are a number of best practices for providing a secure mechanism to consumers using your API.

Amazon API Gateway handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, authorization and access control, monitoring, and API version management.

In this post, I show you how to take advantage of the regional API endpoint feature in API Gateway, so that you can create your own Amazon CloudFront distribution and secure your API using AWS WAF.

AWS WAF is a web application firewall that helps protect your web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources.

As you make your APIs publicly available, you are exposed to attackers trying to exploit your services in several ways. The AWS security team published a whitepaper solution using AWS WAF, How to Mitigate OWASP’s Top 10 Web Application Vulnerabilities.

Regional API endpoints

Edge-optimized APIs are endpoints that are accessed through a CloudFront distribution created and managed by API Gateway. Before the launch of regional API endpoints, this was the default option when creating APIs using API Gateway. It primarily helped to reduce latency for API consumers that were located in different geographical locations than your API.

When API requests predominantly originate from an Amazon EC2 instance or other services within the same AWS Region in which the API is deployed, a regional API endpoint typically lowers the latency of connections and is recommended for such scenarios.

For better control around caching strategies, customers can use their own CloudFront distribution for regional APIs. They also have the ability to use AWS WAF protection, as I describe in this post.

Edge-optimized API endpoint

The following diagram is an illustrated example of the edge-optimized API endpoint where your API clients access your API through a CloudFront distribution created and managed by API Gateway.

Regional API endpoint

For the regional API endpoint, your customers access your API from the same Region in which your REST API is deployed. This helps you to reduce request latency and particularly allows you to add your own content delivery network, as needed.

Walkthrough

In this section, you implement the following steps:

  • Create a regional API using the PetStore sample API.
  • Create a CloudFront distribution for the API.
  • Test the CloudFront distribution.
  • Set up AWS WAF and create a web ACL.
  • Attach the web ACL to the CloudFront distribution.
  • Test AWS WAF protection.

Create the regional API

For this walkthrough, use an existing PetStore API. All new APIs launch by default as the regional endpoint type. To change the endpoint type for your existing API, choose the cog icon on the top right corner:

After you have created the PetStore API on your account, deploy a stage called “prod” for the PetStore API.

On the API Gateway console, select the PetStore API and choose Actions, Deploy API.

For Stage name, type prod and add a stage description.

Choose Deploy and the new API stage is created.

Use the following AWS CLI command to update your API from edge-optimized to regional:

aws apigateway update-rest-api \
--rest-api-id {rest-api-id} \
--patch-operations op=replace,path=/endpointConfiguration/types/EDGE,value=REGIONAL

A successful response looks like the following:

{
    "description": "Your first API with Amazon API Gateway. This is a sample API that integrates via HTTP with your demo Pet Store endpoints", 
    "createdDate": 1511525626, 
    "endpointConfiguration": {
        "types": [
            "REGIONAL"
        ]
    }, 
    "id": "{api-id}", 
    "name": "PetStore"
}

After you change your API endpoint to regional, you can now assign your own CloudFront distribution to this API.

Create a CloudFront distribution

To make things easier, I have provided an AWS CloudFormation template to deploy a CloudFront distribution pointing to the API that you just created. Click the button to deploy the template in the us-east-1 Region.

For Stack name, enter RegionalAPI. For APIGWEndpoint, enter your API FQDN in the following format:

{api-id}.execute-api.us-east-1.amazonaws.com

After you fill out the parameters, choose Next to continue the stack deployment. It takes a couple of minutes to finish the deployment. After it finishes, the Output tab lists the following items:

  • A CloudFront domain URL
  • An S3 bucket for CloudFront access logs
Output from CloudFormation


Test the CloudFront distribution

To see if the CloudFront distribution was configured correctly, use a web browser and enter the URL from your distribution, with the following parameters:

https://{your-distribution-url}.cloudfront.net/{api-stage}/pets

You should get the following output:

[
  {
    "id": 1,
    "type": "dog",
    "price": 249.99
  },
  {
    "id": 2,
    "type": "cat",
    "price": 124.99
  },
  {
    "id": 3,
    "type": "fish",
    "price": 0.99
  }
]

Set up AWS WAF and create a web ACL

With the new CloudFront distribution in place, you can now start setting up AWS WAF to protect your API.

For this demo, you deploy the AWS WAF Security Automations solution, which provides fine-grained control over the requests attempting to access your API.

For more information about deployment, see Automated Deployment. If you prefer, you can launch the solution directly into your account using the following button.

For CloudFront Access Log Bucket Name, add the name of the bucket created during the deployment of the CloudFormation stack for your CloudFront distribution.

The solution allows you to adjust thresholds and also choose which automations to enable to protect your API. After you finish configuring these settings, choose Next.

To start the deployment process in your account, follow the creation wizard and choose Create. It takes a few minutes to finish the deployment. You can follow the creation process through the CloudFormation console.

After the deployment finishes, you can see the new web ACL deployed on the AWS WAF console, AWSWAFSecurityAutomations.

Attach the AWS WAF web ACL to the CloudFront distribution

With the solution deployed, you can now attach the AWS WAF web ACL to the CloudFront distribution that you created earlier.

To assign the newly created AWS WAF web ACL, go back to your CloudFront distribution. After you open your distribution for editing, choose General, Edit.

Select the new AWS WAF web ACL that you created earlier, AWSWAFSecurityAutomations.

Save the changes to your CloudFront distribution and wait for the deployment to finish.

Test AWS WAF protection

To validate the AWS WAF Web ACL setup, use Artillery to load test your API and see AWS WAF in action.

To install Artillery on your machine, run the following command:

$ npm install -g artillery

After the installation completes, you can check if Artillery installed successfully by running the following command:

$ artillery -V
1.6.0-12

As of the time of publication, Artillery is on version 1.6.0-12.

One of the WAF web ACL rules that you have set up is a rate-based rule. By default, it is set up to block any requester that exceeds 2000 requests in a 5-minute period. Try this out.

First, use cURL to query your distribution and see the API output:

$ curl -s https://{distribution-name}.cloudfront.net/prod/pets
[
  {
    "id": 1,
    "type": "dog",
    "price": 249.99
  },
  {
    "id": 2,
    "type": "cat",
    "price": 124.99
  },
  {
    "id": 3,
    "type": "fish",
    "price": 0.99
  }
]

Based on the test above, the result looks good. But what if you max out the 2000 requests in under 5 minutes?

Run the following Artillery command:

artillery quick -n 2000 --count 10  https://{distribution-name}.cloudfront.net/prod/pets

This command creates 10 virtual users, each of which sends 2000 requests to your API, which is more than enough to exceed the rate limit. For brevity, I am not posting the Artillery output here.

After Artillery finishes its execution, try to run the cURL request again and see what happens:

 

$ curl -s https://{distribution-name}.cloudfront.net/prod/pets

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>ERROR: The request could not be satisfied</TITLE>
</HEAD><BODY>
<H1>ERROR</H1>
<H2>The request could not be satisfied.</H2>
<HR noshade size="1px">
Request blocked.
<BR clear="all">
<HR noshade size="1px">
<PRE>
Generated by cloudfront (CloudFront)
Request ID: [removed]
</PRE>
<ADDRESS>
</ADDRESS>
</BODY></HTML>

As you can see from the output above, the request was blocked by AWS WAF. Your IP address is removed from the blocked list after its request rate falls below the limit.
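To build some intuition for the behavior you just observed, here is a minimal sliding-window sketch of a rate-based rule in Java. It is an illustration only and not how AWS WAF is implemented. The class name `RateRuleSketch`, the per-IP timestamp queue, and the sample IP address are assumptions. The limit of 2000 requests and the 5-minute window come from the default described earlier.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class RateRuleSketch {
    private final int limit;
    private final long windowMillis;
    private final Map<String, Deque<Long>> requestTimes = new HashMap<>();

    public RateRuleSketch(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Records one request and reports whether the client now exceeds the limit.
    public boolean isBlocked(String clientIp, long nowMillis) {
        Deque<Long> times = requestTimes.computeIfAbsent(clientIp, ip -> new ArrayDeque<>());
        // Forget requests that have aged out of the trailing window.
        while (!times.isEmpty() && nowMillis - times.peekFirst() >= windowMillis) {
            times.pollFirst();
        }
        times.addLast(nowMillis);
        return times.size() > limit;
    }

    public static void main(String[] args) {
        RateRuleSketch rule = new RateRuleSketch(2000, 5 * 60 * 1000L);
        int firstBlocked = -1;
        for (int i = 1; i <= 2500 && firstBlocked < 0; i++) {
            if (rule.isBlocked("203.0.113.10", i)) { // one request per millisecond
                firstBlocked = i;
            }
        }
        System.out.println("first blocked request: " + firstBlocked); // 2001
        // Well after the burst, the old requests have aged out of the window.
        System.out.println(rule.isBlocked("203.0.113.10", 10 * 60 * 1000L)); // false
    }
}
```

The second call in `main` mirrors the unblocking described above: once the burst has aged out of the window, the same client is allowed through again.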

Conclusion

In this first part, you saw how to use the new API Gateway regional API endpoint together with Amazon CloudFront and AWS WAF to secure your API from a series of attacks.

In the second part, I will demonstrate some other techniques to protect your API using API keys and Amazon CloudFront custom headers.

Connect, collaborate, and learn at AWS Global Summits in 2018

Post Syndicated from Tina Kelleher original https://aws.amazon.com/blogs/big-data/connect-collaborate-and-learn-at-aws-global-summits-in-2018/

Regardless of your career path, there’s no denying that attending industry events can provide helpful career development opportunities — not only for improving and expanding your skill sets, but for networking as well. According to this article from PayScale.com, experts estimate that somewhere between 70-85% of new positions are landed through networking.

Narrowing our focus to networking opportunities with cloud computing professionals who’re working on tackling some of today’s most innovative and exciting big data solutions, attending big data-focused sessions at an AWS Global Summit is a great place to start.

AWS Global Summits are free events that bring the cloud computing community together to connect, collaborate, and learn about AWS. As the name suggests, these summits are held in major cities around the world, and attract technologists from all industries and skill levels who’re interested in hearing from AWS leaders, experts, partners, and customers.

In addition to networking opportunities with top cloud technology providers, consultants and your peers in our Partner and Solutions Expo, you’ll also hone your AWS skills by attending and participating in a multitude of education and training opportunities.

Here’s a brief sampling of some of the upcoming sessions relevant to big data professionals:

May 31st: Big Data Architectural Patterns and Best Practices on AWS | AWS Summit – Mexico City

June 6th-7th: Various (click on the “Big Data & Analytics” header) | AWS Summit – Berlin

June 20-21st : [email protected] | Public Sector Summit – Washington DC

June 21st: Enabling Self Service for Data Scientists with AWS Service Catalog | AWS Summit – Sao Paulo

Be sure to check out the main page for AWS Global Summits, where you can see which cities have AWS Summits planned for 2018, register to attend an upcoming event, or provide your information to be notified when registration opens for a future event.

Practice Makes Perfect: Testing Campaigns Before You Send Them

Post Syndicated from Zach Barbitta original https://aws.amazon.com/blogs/messaging-and-targeting/practice-makes-perfect-testing-campaigns-before-you-send-them/

In an article we posted to Medium in February, we talked about how to determine the best time to engage your customers by using Amazon Pinpoint’s built-in session heat map. The session heat map allows you to find the times that your customers are most likely to use your app. In this post, we continue on the topic of best practices, specifically how to appropriately test a campaign before going live.

In this post, we’ll talk about the old adage “practice makes perfect,” and how it applies to the campaigns you send using Amazon Pinpoint. Let’s take a scenario many of our customers encounter daily: creating a campaign to engage users by sending a push notification.

As you can see from the preceding screenshot, the segment we plan to target has nearly 1.7M recipients, which is a lot! Of course, before we got to this step, we already put several best practices into practice. For example, we determined the best time to engage our audience, scheduled the message based on recipients’ local time zones, performed A/B/N testing, measured lift using a hold-out group, and personalized the content for maximum effectiveness. Now that we’re ready to send the notification, we should test the message before we send it to all of the recipients in our segment. The reason for testing the message is pretty straightforward: we want to make sure every detail of the message is accurate before we send it to all 1,687,575 customers.

Fortunately, Amazon Pinpoint makes it easy to test your messages—in fact, you don’t even have to leave the campaign wizard in order to do so. In step 3 of the campaign wizard, below the message editor, there’s a button labelled Test campaign.

When you choose the Test campaign button, you have three options: you can send the test message to a segment of 100 endpoints or fewer, or to a set of specific endpoint IDs (up to 10), or to a set of specific device tokens (up to 10), as shown in the following image.

In our case, we’ve already created a segment of internal recipients who will test our message. On the Test Campaign window, under Send a test message to, we choose A segment. Then, in the drop-down menu, we select our test segment, and then choose Send test message.

Because we’re sending the test message to a segment, Amazon Pinpoint automatically creates a new campaign dedicated to this test. This process executes a test campaign, complete with message analytics, which allows you to perform end-to-end testing as if you sent the message to your production audience. To see the analytics for your test campaign, go to the Campaigns tab, and then choose the campaign (the name of the campaign contains the word “test”, followed by four random characters, followed by the name of the campaign).

After you complete a successful test, you’re ready to launch your campaign. As a final check, the Review & Launch screen includes a reminder that indicates whether or not you’ve tested the campaign, as shown in the following image.

There are several other ways you can use this feature. For example, you could use it for troubleshooting a campaign, or for iterating on existing campaigns. To learn more about testing campaigns, see the Amazon Pinpoint User Guide.

AWS Online Tech Talks – May and Early June 2018

Post Syndicated from Devin Watson original https://aws.amazon.com/blogs/aws/aws-online-tech-talks-may-and-early-june-2018/


Join us this month to learn about some of the exciting new services and solution best practices at AWS. We also have our first re:Invent 2018 webinar series, “How to re:Invent”. Sign up now to learn more. We look forward to seeing you.

Note – All sessions are free and in Pacific Time.

Tech talks featured this month:

Analytics & Big Data

May 21, 2018 | 11:00 AM – 11:45 AM PT – Integrating Amazon Elasticsearch with your DevOps Tooling – Learn how you can easily integrate Amazon Elasticsearch Service into your DevOps tooling and gain valuable insight from your log data.

May 23, 2018 | 11:00 AM – 11:45 AM PT – Data Warehousing and Data Lake Analytics, Together – Learn how to query data across your data warehouse and data lake without moving data.

May 24, 2018 | 11:00 AM – 11:45 AM PT – Data Transformation Patterns in AWS – Discover how to perform common data transformations on the AWS Data Lake.

Compute

May 29, 2018 | 01:00 PM – 01:45 PM PT – Creating and Managing a WordPress Website with Amazon Lightsail – Learn about Amazon Lightsail and how you can create, run and manage your WordPress websites with Amazon’s simple compute platform.

May 30, 2018 | 01:00 PM – 01:45 PM PT – Accelerating Life Sciences with HPC on AWS – Learn how you can accelerate your Life Sciences research workloads by harnessing the power of high performance computing on AWS.

Containers

May 24, 2018 | 01:00 PM – 01:45 PM PT – Building Microservices with the 12 Factor App Pattern on AWS – Learn best practices for building containerized microservices on AWS, and how traditional software design patterns evolve in the context of containers.

Databases

May 21, 2018 | 01:00 PM – 01:45 PM PT – How to Migrate from Cassandra to Amazon DynamoDB – Get the benefits, best practices and guides on how to migrate your Cassandra databases to Amazon DynamoDB.

May 23, 2018 | 01:00 PM – 01:45 PM PT – 5 Hacks for Optimizing MySQL in the Cloud – Learn how to optimize your MySQL databases for high availability, performance, and disaster resilience using RDS.

DevOps

May 23, 2018 | 09:00 AM – 09:45 AM PT – .NET Serverless Development on AWS – Learn how to build a modern serverless application in .NET Core 2.0.

Enterprise & Hybrid

May 22, 2018 | 11:00 AM – 11:45 AM PT – Hybrid Cloud Customer Use Cases on AWS – Learn how customers are leveraging AWS hybrid cloud capabilities to easily extend their datacenter capacity, deliver new services and applications, and ensure business continuity and disaster recovery.

IoT

May 31, 2018 | 11:00 AM – 11:45 AM PT – Using AWS IoT for Industrial Applications – Discover how you can quickly onboard your fleet of connected devices, keep them secure, and build predictive analytics with AWS IoT.

Machine Learning

May 22, 2018 | 09:00 AM – 09:45 AM PT – Using Apache Spark with Amazon SageMaker – Discover how to use Apache Spark with Amazon SageMaker for training jobs and application integration.

May 24, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS DeepLens – Learn how AWS DeepLens provides a new way for developers to learn machine learning by pairing the physical device with a broad set of tutorials, examples, source code, and integration with familiar AWS services.

Management Tools

May 21, 2018 | 09:00 AM – 09:45 AM PT – Gaining Better Observability of Your VMs with Amazon CloudWatch – Learn how CloudWatch Agent makes it easy for customers like Rackspace to monitor their VMs.

Mobile

May 29, 2018 | 11:00 AM – 11:45 AM PT – Deep Dive on Amazon Pinpoint Segmentation and Endpoint Management – See how segmentation and endpoint management with Amazon Pinpoint can help you target the right audience.

Networking

May 31, 2018 | 09:00 AM – 09:45 AM PT – Making Private Connectivity the New Norm via AWS PrivateLink – See how PrivateLink enables service owners to offer private endpoints to customers outside their company.

Security, Identity, & Compliance

May 30, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS Certificate Manager Private Certificate Authority (CA) – Learn how AWS Certificate Manager (ACM) Private Certificate Authority (CA), a managed private CA service, helps you easily and securely manage the lifecycle of your private certificates.

June 1, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS Firewall Manager – Centrally configure and manage AWS WAF rules across your accounts and applications.

Serverless

May 22, 2018 | 01:00 PM – 01:45 PM PT – Building API-Driven Microservices with Amazon API Gateway – Learn how to build a secure, scalable API for your application in our tech talk about API-driven microservices.

Storage

May 30, 2018 | 11:00 AM – 11:45 AM PT – Accelerate Productivity by Computing at the Edge – Learn how AWS Snowball Edge support for compute instances helps accelerate data transfers, execute custom applications, and reduce overall storage costs.

June 1, 2018 | 11:00 AM – 11:45 AM PT – Learn to Build a Cloud-Scale Website Powered by Amazon EFS – Technical deep dive where you’ll learn tips and tricks for integrating WordPress, Drupal and Magento with Amazon EFS.

How AWS Meets a Physical Separation Requirement with a Logical Separation Approach

Post Syndicated from Min Hyun original https://aws.amazon.com/blogs/security/how-aws-meets-a-physical-separation-requirement-with-a-logical-separation-approach/

We have a new resource available to help you meet a requirement for physically separated infrastructure using logical separation in the AWS cloud. Our latest guide, Logical Separation: An Evaluation of the U.S. Department of Defense Cloud Security Requirements for Sensitive Workloads, outlines how AWS meets the U.S. Department of Defense’s (DoD) stringent physical separation requirement by pioneering a three-pronged logical separation approach that leverages virtualization, encryption, and deploying compute to dedicated hardware.

This guide helps you understand logical separation in the cloud and demonstrates its advantages over a traditional physical separation model. Embracing this approach can help organizations confidently meet or exceed security requirements found in traditional on-premises environments, while also providing increased security control and flexibility.

Logical Separation is the second guide in the AWS Government Handbook Series, which examines cybersecurity policy initiatives and identifies best practices.

If you have questions or want to learn more, contact your account executive or AWS Support.

Bad Software Is Our Fault

Post Syndicated from Bozho original https://techblog.bozho.net/bad-software-is-our-fault/

Bad software is everywhere. One can even claim that every software is bad. Cool companies, tech giants, established companies, all produce bad software. And no, yours is not an exception.

Who’s to blame for bad software? It’s all complicated and many factors are intertwined – there’s business requirements, there’s organizational context, there’s a lack of sufficiently skilled developers, there’s the inherent complexity of software development, there’s leaky abstractions, reliance on 3rd party software, consequences of wrong business and purchase decisions, time limitations, flawed business analysis, etc. So yes, despite the catchy title, I’m aware it’s actually complicated.

But in every “it’s complicated” scenario, there’s always one or two factors that are decisive. All of them contribute somehow, but the major drivers are usually a handful of things. And in the case of bad software, I think it’s the fault of technical people. Developers, architects, ops.

We don’t seem to care about best practices. And I’ll do some nasty generalizations here, but bear with me. We can spend hours arguing about tabs vs spaces, curly bracket on new line, git merge vs rebase, which IDE is better, which framework is better and other largely irrelevant stuff. But we tend to ignore the important aspects that span beyond the code itself. The context in which the code lives, the non-functional requirements – robustness, security, resilience, etc.

We don’t seem to get security. Even trivial stuff such as user authentication is almost always implemented wrong. Twitter and GitHub recently realized they had been logging plain-text passwords, for example, but that’s just the tip of the iceberg. Too often we ignore the security implications.

“But the business didn’t request the security features”, one may say. The business never requested 2-factor authentication, encryption at rest, PKI, secure (or any) audit trail, log masking, crypto shredding, etc., etc. Because the business doesn’t know these things – we do and we have to put them on the backlog and fight for them to be implemented. Each organization has its specifics and tech people can influence the backlog in different ways, but almost everywhere we can put things there and prioritize them.
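
As a rough illustration of what “log masking” can look like in practice, here is a minimal Python sketch built on the standard logging module. The list of sensitive keys and the key=value message format are assumptions made for the example, not a complete or recommended policy.

```python
import logging
import re

# Assumption: secrets appear in log text as key=value pairs with these key names.
_SENSITIVE = re.compile(r"(?i)\b(password|passwd|secret|token)=(\S+)")


def mask(text: str) -> str:
    """Replace the value of every sensitive key=value pair with ***."""
    return _SENSITIVE.sub(r"\1=***", text)


class MaskingFilter(logging.Filter):
    """Mask sensitive values in a log record before any handler emits it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Merge the format string with its arguments first, then mask the result.
        record.msg = mask(record.getMessage())
        record.args = None  # arguments are already merged into msg
        return True  # keep the record; only its text changes


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("auth")
    logger.addFilter(MaskingFilter())
    logger.info("login attempt user=%s password=%s", "bob", "hunter2")
```

A filter like this is a safety net rather than a fix: the better habit is not to pass credentials to the logger at all.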

The other aspect is testing. We should all be well aware by now that automated testing is mandatory. We have all the tools in the world for unit, functional, integration, performance and whatnot testing, and yet many software projects lack the necessary test coverage to be able to change stuff without accidentally breaking things. “But testing takes time, we don’t have it”. We are perfectly aware that testing saves time, as we’ve all had those “not again!” recurring bugs. And yet we think of all sorts of excuses – “let the QAs test it”, “we have to ship that now, we’ll test it later”, “this is too trivial to be tested”, etc.

And you may say it’s not our job. We don’t define what has to be done, we just do it. We don’t define the budget, the scope, the features. We just write whatever has been decided. And that’s plain wrong. It’s not our job to make money out of our code, and it’s not our job to define what customers need, but apart from that everything is our job. The way the software is structured, the security aspects and security features, the stability of the code base, the way the software behaves in different environments. The non-functional requirements are our job, and putting them on the backlog is our job.

You’ve probably heard that every software becomes “legacy” after 6 months. And that’s because of us, our sloppiness, our inability to mitigate external factors and constraints. Too often we create a mess through “just doing our job”.

And of course that’s a generalization. I happen to know a lot of great professionals who don’t make these mistakes, who strive for excellence and implement things the right way. But our industry as a whole doesn’t. Our industry as a whole produces bad software. And it’s our fault, as developers – as the only people who know why a certain piece of software is bad.

In a talk of his, Bob Martin warns us of the risks of our sloppiness. We have been building websites so far, but we are more and more building stuff that interacts with the real world, directly and indirectly. Ultimately, lives may depend on our software (like the recent unfortunate death caused by a self-driving car). And I’ll agree with Uncle Bob that it’s high time we self-regulate as an industry, before some technically incompetent politician decides to do that.

How, I don’t know. We’ll have to think more about it. But I’m pretty sure it’s our fault that software is bad, and no amount of blaming the management, the budget, the timing, the tools or the process can eliminate our responsibility.

Why do I insist on bashing my fellow software engineers? Because if we start looking at software development with more responsibility, accepting that if it fails, it’s our fault, then we’re more likely to get out of our current bug-ridden, security-flawed, fragile software hole and really become the experts of the future.

The post Bad Software Is Our Fault appeared first on Bozho's tech blog.

CI/CD with Data: Enabling Data Portability in a Software Delivery Pipeline with AWS Developer Tools, Kubernetes, and Portworx

Post Syndicated from Kausalya Rani Krishna Samy original https://aws.amazon.com/blogs/devops/cicd-with-data-enabling-data-portability-in-a-software-delivery-pipeline-with-aws-developer-tools-kubernetes-and-portworx/

This post is written by Eric Han – Vice President of Product Management, Portworx and Asif Khan – Solutions Architect

Data is the soul of an application. As containers make it easier to package and deploy applications faster, testing plays an even more important role in the reliable delivery of software. Given that all applications have data, development teams want a way to reliably control, move, and test using real application data or, at times, obfuscated data.

For many teams, moving application data through a CI/CD pipeline, while honoring compliance and maintaining separation of concerns, has been a manual task that doesn’t scale. At best, it is limited to a few applications, and is not portable across environments. The goal should be to make running and testing stateful containers (think databases and message buses where operations are tracked) as easy as with stateless (such as with web front ends where they are often not).

Why is state important in testing scenarios? One reason is that many bugs manifest only when code is tested against real data. For example, we might simply want to test a database schema upgrade but a small synthetic dataset does not exercise the critical, finer corner cases in complex business logic. If we want true end-to-end testing, we need to be able to easily manage our data or state.

In this blog post, we define a CI/CD pipeline reference architecture that can automate data movement between applications. We also provide the steps to follow to configure the CI/CD pipeline.

 

Stateful Pipelines: Need for Portable Volumes

As part of continuous integration, testing, and deployment, a team may need to reproduce a bug found in production against a staging setup. Here, the hosting environment consists of a cluster with Kubernetes as the scheduler and Portworx for persistent volumes. The testing workflow is then automated by AWS CodeCommit, AWS CodePipeline, and AWS CodeBuild.

Portworx offers Kubernetes storage that can be used to make persistent volumes portable between AWS environments and pipelines. The addition of Portworx to the AWS Developer Tools continuous deployment for Kubernetes reference architecture adds persistent storage and storage orchestration to a Kubernetes cluster. The example uses MongoDB as the demonstration of a stateful application. In practice, the workflow applies to any containerized application such as Cassandra, MySQL, Kafka, and Elasticsearch.

Using the reference architecture, a developer calls CodePipeline to trigger a snapshot of the running production MongoDB database. Portworx then creates a block-based, writable snapshot of the MongoDB volume. Meanwhile, the production MongoDB database continues serving end users and is uninterrupted.

Without the Portworx integrations, a manual process would require an application-level backup of the database instance that is outside of the CI/CD process. For larger databases, this could take hours and impact production. The use of block-based snapshots follows best practices for resilient and non-disruptive backups.

As part of the workflow, CodePipeline deploys a new MongoDB instance for staging onto the Kubernetes cluster and mounts the second Portworx volume that has the data from production. CodePipeline triggers the snapshot of a Portworx volume through an AWS Lambda function, as shown here.

 

 

 

AWS Developer Tools with Kubernetes: Integrated Workflow with Portworx

In the following workflow, a developer is testing changes to a containerized application that calls on MongoDB. The tests are performed against a staging instance of MongoDB. The same workflow applies if changes were on the server side. The original production deployment is scheduled as a Kubernetes deployment object and uses Portworx as the storage for the persistent volume.

The continuous deployment pipeline runs as follows:

  • Developers integrate bug fix changes into a main development branch that gets merged into a CodeCommit master branch.
  • Amazon CloudWatch triggers the pipeline when code is merged into a master branch of an AWS CodeCommit repository.
  • AWS CodePipeline sends the new revision to AWS CodeBuild, which builds a Docker container image with the build ID.
  • AWS CodeBuild pushes the new Docker container image tagged with the build ID to an Amazon ECR registry.
  • Kubernetes downloads the new container (for the database client) from Amazon ECR and deploys the application (as a pod) and staging MongoDB instance (as a deployment object).
  • AWS CodePipeline, through a Lambda function, calls Portworx to snapshot the production MongoDB and deploy a staging instance of MongoDB.
    • Portworx provides a snapshot of the production instance as the persistent storage of the staging MongoDB.
    • The MongoDB instance mounts the snapshot.
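
The Lambda step in the list above can be sketched roughly as follows. This is not the code from the reference repository: take_snapshot is a hypothetical placeholder for the Portworx call, and the JSON layout of UserParameters (source_volume, snapshot_name) is an assumption. The event shape and the put_job_success_result and put_job_failure_result calls are the standard way a Lambda function reports back to CodePipeline.

```python
import json


def take_snapshot(source_volume: str, snapshot_name: str) -> str:
    """Hypothetical placeholder for the Portworx snapshot request.

    The actual call is in the reference repository's kube-lambda.py; this stub
    only echoes the snapshot name so the control flow can be exercised.
    """
    return snapshot_name


def handler(event, context, codepipeline=None, snapshot=take_snapshot):
    """Entry point for a Lambda function invoked by a CodePipeline action."""
    if codepipeline is None:
        import boto3  # imported lazily so the sketch can run without AWS access
        codepipeline = boto3.client("codepipeline")

    job = event["CodePipeline.job"]
    job_id = job["id"]
    try:
        # UserParameters is a free-form string set on the pipeline action.
        # Encoding it as JSON with these two keys is an assumption.
        raw = job["data"]["actionConfiguration"]["configuration"]["UserParameters"]
        params = json.loads(raw)
        name = snapshot(params["source_volume"], params["snapshot_name"])
        # Tell CodePipeline the action succeeded so the next stage can start.
        codepipeline.put_job_success_result(jobId=job_id)
        return {"snapshot": name}
    except Exception as exc:
        # Report the failure so the pipeline stops instead of timing out.
        codepipeline.put_job_failure_result(
            jobId=job_id,
            failureDetails={"type": "JobFailed", "message": str(exc)},
        )
        return {"error": str(exc)}
```

Passing the client and the snapshot function as parameters keeps the handler testable without a live pipeline or cluster.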

At this point, the staging setup mimics a production environment. Teams can run integration and full end-to-end tests, using partner tooling, without impacting production workloads. The full pipeline is shown here.

 

Summary

This reference architecture showcases how development teams can easily move data between production and staging for the purposes of testing. Instead of taking application-specific manual steps, all operations in this CodePipeline architecture are automated and tracked as part of the CI/CD process.

This integrated experience is part of making stateful containers as easy as stateless. With AWS CodePipeline for the CI/CD process, developers can easily deploy stateful containers onto a Kubernetes cluster with Portworx storage and automate data movement within their process.

The reference architecture and code are available on GitHub:

● Reference architecture: https://github.com/portworx/aws-kube-codesuite
● Lambda function source code for Portworx additions: https://github.com/portworx/aws-kube-codesuite/blob/master/src/kube-lambda.py

For more information about persistent storage for containers, visit the Portworx website. For more information about CodePipeline, see the AWS CodePipeline User Guide.

Ransomware Update: Viruses Targeting Business IT Servers

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/ransomware-update-viruses-targeting-business-it-servers/

Ransomware warning message on computer

As ransomware attacks have grown in number in recent months, the tactics and attack vectors also have evolved. While the primary method of attack used to be to target individual computer users within organizations with phishing emails and infected attachments, we’re increasingly seeing attacks that target weaknesses in businesses’ IT infrastructure.

How Ransomware Attacks Typically Work

In our previous posts on ransomware, we described the common vehicles used by hackers to infect organizations with ransomware viruses. Most often, downloaders distribute trojan horses through malicious downloads and spam emails. The emails contain a variety of file attachments, which if opened, will download and run one of the many ransomware variants. Once a user’s computer is infected with a malicious downloader, it will retrieve additional malware, which frequently includes crypto-ransomware. After the files have been encrypted, a ransom payment is demanded of the victim in order to decrypt the files.

What’s Changed With the Latest Ransomware Attacks?

In 2016, a customized ransomware strain called SamSam began attacking servers, primarily in healthcare institutions. SamSam, unlike more conventional ransomware, is not delivered through downloads or phishing emails. Instead, the attackers behind SamSam use tools to identify unpatched servers running Red Hat’s JBoss enterprise products. Once the attackers have successfully gained entry into one of these servers by exploiting vulnerabilities in JBoss, they use other freely available tools and scripts to collect credentials and gather information on networked computers. Then they deploy their ransomware to encrypt files on these systems before demanding a ransom. Gaining entry to an organization through its IT center rather than its endpoints makes this approach scalable and especially unsettling.

SamSam’s methodology is to scour the Internet searching for accessible and vulnerable JBoss application servers, especially ones used by hospitals. It’s not unlike a burglar rattling doorknobs in a neighborhood to find unlocked homes. When SamSam finds an unlocked home (unpatched server), the software infiltrates the system. It is then free to spread across the company’s network by stealing passwords. As it traverses the network and systems, it encrypts files, preventing access until the victims pay the hackers a ransom, typically between $10,000 and $15,000. The low ransom amount has encouraged some victimized organizations to pay the ransom rather than incur the downtime required to wipe and reinitialize their IT systems.

The success of SamSam is due to its effectiveness rather than its sophistication. SamSam can enter and traverse a network without human intervention. Some organizations are learning too late that securing internet-facing services in their data center from attack is just as important as securing endpoints.

The typical steps in a SamSam ransomware attack are:

  1. Attackers gain access to vulnerable server – Attackers exploit vulnerable software or weak/stolen credentials.
  2. Attack spreads via remote access tools – Attackers harvest credentials, create SOCKS proxies to tunnel traffic, and abuse RDP to install SamSam on more computers in the network.
  3. Ransomware payload deployed – Attackers run batch scripts to execute ransomware on compromised machines.
  4. Ransomware demand delivered requiring payment to decrypt files – Demand amounts vary from victim to victim. Relatively low ransom amounts appear to be designed to encourage quick payment decisions.

What all the organizations successfully exploited by SamSam have in common is that they were running unpatched servers that made them vulnerable to SamSam. Some organizations had their endpoints and servers backed up, while others did not. Some of those without usable backups chose to pay the ransom.

Timeline of SamSam History and Exploits

Since its appearance in 2016, SamSam has been in the news with many successful incursions into healthcare, business, and government institutions.

March 2016
SamSam appears

SamSam campaign targets vulnerable JBoss servers
Attackers home in on healthcare organizations specifically, as they’re more likely to have unpatched JBoss machines.

April 2016
SamSam finds new targets

SamSam begins targeting schools and government.
After initial success targeting healthcare, attackers branch out to other sectors.

April 2017
New tactics include RDP

Attackers shift to targeting organizations with exposed RDP connections, and maintain focus on healthcare.
An attack on Erie County Medical Center costs the hospital $10 million over three months of recovery.
Erie County Medical Center attacked by SamSam ransomware virus

January 2018
Municipalities attacked

• Attack on Municipality of Farmington, NM.
• Attack on Hancock Health.
Hancock Regional Hospital notice following SamSam attack
• Attack on Adams Memorial Hospital
• Attack on Allscripts (Electronic Health Records), which includes 180,000 physicians, 2,500 hospitals, and 7.2 million patients’ health records.

February 2018
Attack volume increases

• Attack on Davidson County, NC.
• Attack on Colorado Department of Transportation.
SamSam virus notification

March 2018
SamSam shuts down Atlanta

• Second attack on Colorado Department of Transportation.
• City of Atlanta suffers a devastating attack by SamSam.
The attack has far-reaching impacts — crippling the court system, keeping residents from paying their water bills, limiting vital communications like sewer infrastructure requests, and pushing the Atlanta Police Department to file paper reports.
Atlanta Ransomware outage alert
• SamSam campaign nets $325,000 in 4 weeks.
Infections spike as attackers launch new campaigns. Healthcare and government organizations are once again the primary targets.

How to Defend Against SamSam and Other Ransomware Attacks

The best way to respond to a ransomware attack is to avoid having one in the first place. If you are attacked, having your valuable data backed up and out of reach of the ransomware will keep your downtime and data loss minimal or nonexistent.

In our previous post, How to Recover From Ransomware, we listed the ten ways to protect your organization from ransomware.

  1. Use anti-virus and anti-malware software or other security policies to block known payloads from launching.
  2. Make frequent, comprehensive backups of all important files and isolate them from local and open networks. Cybersecurity professionals (74% in a recent survey) view data backup and recovery as by far the most effective response to a successful ransomware attack.
  3. Keep offline backups of data stored in locations inaccessible from any potentially infected computer, such as disconnected external storage drives or the cloud, which prevents them from being accessed by the ransomware.
  4. Install the latest security updates issued by software vendors of your OS and applications. Remember to patch early and patch often to close known vulnerabilities in operating systems, server software, browsers, and web plugins.
  5. Consider deploying security software to protect endpoints, email servers, and network systems from infection.
  6. Exercise cyber hygiene, such as using caution when opening email attachments and links.
  7. Segment your networks to keep critical computers isolated and to prevent the spread of malware in case of attack. Turn off unneeded network shares.
  8. Turn off admin rights for users who don’t require them. Give users the lowest system permissions they need to do their work.
  9. Restrict write permissions on file servers as much as possible.
  10. Educate yourself, your employees, and your family in best practices to keep malware out of your systems. Update everyone on the latest email phishing scams and human engineering aimed at turning victims into abettors.
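
As one small, concrete way to act on item 9, the following Python sketch lists files and directories that any user can write to. It assumes a POSIX file system and only looks at permission bits; Windows and SMB file servers rely on ACLs, which this does not inspect. The function name and the example path are illustrative.

```python
import os
import stat
from pathlib import Path
from typing import Iterator


def world_writable(root: str) -> Iterator[Path]:
    """Yield files and directories under root whose mode grants write to 'other'."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            try:
                mode = path.lstat().st_mode  # lstat: do not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if stat.S_ISLNK(mode):
                continue  # permission bits on symlinks are not meaningful
            if mode & stat.S_IWOTH:
                yield path


if __name__ == "__main__":
    # "/srv/share" is a hypothetical export path; substitute your own.
    for entry in world_writable("/srv/share"):
        print(entry)
```

Anything the script reports is a candidate for tighter permissions, since ransomware running under any local account could encrypt or overwrite it.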

Please Tell Us About Your Experiences with Ransomware

Have you endured a ransomware attack or have a strategy to avoid becoming a victim? Please tell us of your experiences in the comments.

The post Ransomware Update: Viruses Targeting Business IT Servers appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Serverless Architectures with AWS Lambda: Overview and Best Practices

Post Syndicated from Andrew Baird original https://aws.amazon.com/blogs/architecture/serverless-architectures-with-aws-lambda-overview-and-best-practices/

For some organizations, the idea of “going serverless” can be daunting. But with an understanding of best practices and the right tools, many serverless applications can be fully functional with only a few lines of code and little else.

Examples of fully-serverless-application use cases include:

  • Web or mobile backends – Create fully-serverless, mobile applications or websites by creating user-facing content in a native mobile application or static web content in an S3 bucket. Then have your front-end content integrate with Amazon API Gateway as a backend service API. Lambda functions will then execute the business logic you’ve written for each of the API Gateway methods in your backend API.
  • Chatbots and virtual assistants – Build new serverless ways to interact with your customers, like customer support assistants and bots ready to engage customers on your company-run social media pages. The Amazon Alexa Skills Kit (ASK) and Amazon Lex have the ability to apply natural-language understanding to user-voice and freeform-text input so that a Lambda function you write can intelligently respond and engage with them.
  • Internet of Things (IoT) backends – AWS IoT has direct-integration for device messages to be routed to and processed by Lambda functions. That means you can implement serverless backends for highly secure, scalable IoT applications for uses like connected consumer appliances and intelligent manufacturing facilities.
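
To give a sense of scale for the web or mobile backend case, here is a minimal sketch of a Python Lambda function behind an API Gateway method. It assumes the Lambda proxy integration, where the function receives the HTTP request as the event and returns a dictionary with statusCode, headers, and a string body. The name query parameter and the greeting are made up for the example.

```python
import json


def handler(event, context):
    """Respond to a request forwarded by an API Gateway Lambda proxy integration."""
    # queryStringParameters is None when the request carries no query string.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # "name" is a hypothetical parameter

    # The proxy integration expects a status code, headers, and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Calling handler({"queryStringParameters": {"name": "Ada"}}, None) locally returns the same structure API Gateway would turn into an HTTP response, which makes the business logic easy to unit test before deployment.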

Using AWS Lambda as the logic layer of a serverless application can enable faster development speed and greater experimentation and innovation than in a traditional, server-based environment.

We recently published the “Serverless Architectures with AWS Lambda: Overview and Best Practices” whitepaper to provide the guidance and best practices you need to write better Lambda functions and build better serverless architectures.

Once you’ve finished reading the whitepaper, here are a few additional resources I recommend as next steps:

  1. If you would like to better understand some of the architecture pattern possibilities for serverless applications: Thirty Serverless Architectures in 30 Minutes (re:Invent 2017 video)
  2. If you’re ready to get hands-on and build a sample serverless application: AWS Serverless Workshops (GitHub Repository)
  3. If you’ve already built a serverless application and you’d like to ensure your application has been Well Architected: The Serverless Application Lens: AWS Well Architected Framework (Whitepaper)

About the Author

 

Andrew Baird is a Sr. Solutions Architect for AWS. Prior to becoming a Solutions Architect, Andrew was a developer, including time as an SDE with Amazon.com. He has worked on large-scale distributed systems, public-facing APIs, and operations automation.