All posts by David Hessler

Best practices for scaling AWS CDK adoption within your organization

Originally published at https://aws.amazon.com/blogs/devops/best-practices-for-scaling-aws-cdk-adoption-within-your-organization/

Enterprises are constantly seeking ways to accelerate their journey to the cloud. Infrastructure as code (IaC) is crucial for automating and managing cloud resources efficiently. The AWS Cloud Development Kit (AWS CDK) lets you define your cloud infrastructure as code in your favorite programming language and deploy it using AWS CloudFormation. In this post, we discuss strategies and best practices for accelerating CDK adoption within your organization, picking up after your organization has successfully completed a pilot. You will learn how to scale the lessons learned from the pilot project across your organization through platform engineering: reducing complexity with reusable components, deploying with speed and safety via builder tooling, and accelerating project startup with an internal developer platform (IDP). We conclude by discussing ways to participate in and benefit from the broader CDK community.

Before we dive in, let’s briefly discuss a new trend in technology: Platform Engineering. DevOps practices have helped IT organizations deliver software to customers more frequently and with higher quality. A recent evolution in DevOps is the introduction of platform engineering teams to build services, toolchains, and documentation to support workload teams. An important responsibility of the platform engineering team is governance of the software delivery process.

At Amazon, we have a long and storied history of leveraging platform engineering to accelerate deployments. This is why we are able to maintain 143 different compliance certifications and attestations while deploying 150 million times per year. Platform engineering increases productivity, reduces friction between ideas and implementation, and improves agility by accelerating the delivery of workloads via a secure, scalable, and reusable set of resources and components through self-service portals and developer tools. Platform engineering comprises seven capabilities: Platform Architecture, Data Architecture, Platform Product Engineering, Data Engineering, Provisioning & Orchestration, Modern App Development, and CI/CD. For more information on platform engineering, visit the AWS Cloud Adoption Framework.

Establishing these capabilities takes several platform and workload teams working together. From an operating model standpoint, a workload team interacts with Platform Engineering in one of three ways (for more information, see Building a Cloud Operating Model):

Figure: Three cloud operating models. In the first, transitional model, separate Application Engineering and Application Operations teams are both supported by Cloud Platform Engineering. In the second, strategic model, Application Engineering and Cloud Platform Engineering own the responsibility equally. In the third, also strategic, model, Application Engineering and Cloud Platform Engineering jointly own responsibility, with Application Engineering owning most of it.

Reduce Builder Complexity and Cognitive Load with Reusable Components

So, how can the platform team incorporate CDK to accomplish their goals? One of the common objectives of the Platform Engineering team is to publish and curate reusable patterns called Constructs. Constructs provide a mechanism to create reusable, extensible, and common components that can be shared across multiple teams and projects.

Many customers write their own construct implementations to enforce security best practices such as encryption and specific AWS Identity and Access Management (IAM) policies. For example, you might create a MyCompanyBucket construct that implements your organization's security requirements in place of the default Amazon S3 Bucket construct. This bucket configuration can be used and extended by multiple teams to ensure they are building on components that are validated by your security and compliance teams.
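
To make this concrete, here is a minimal sketch of what such a construct might look like, assuming aws-cdk-lib v2 in Python; MyCompanyBucket and the specific defaults chosen are illustrative, not a published library.

    from aws_cdk import aws_s3 as s3
    from constructs import Construct

    class MyCompanyBucket(s3.Bucket):
        """An S3 bucket that bakes in the organization's security defaults."""

        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(
                scope,
                construct_id,
                encryption=s3.BucketEncryption.S3_MANAGED,  # encryption at rest
                enforce_ssl=True,  # deny non-TLS access via bucket policy
                block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
                versioned=True,
                **kwargs,
            )

Workload teams then write MyCompanyBucket(self, "Data") instead of s3.Bucket(self, "Data") and inherit the vetted defaults, while still being able to pass additional properties.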

For customers focused on data governance, CDK constructs can automatically add best practices for recovery time objectives and recovery point objectives by ensuring backups and architecture meet an organization's resilience policies. For advanced customers looking to enforce data lifecycle policies, create uniform access controls, or emit required KPIs, CDK constructs provide avenues to create configurations that are safe and secure by default. Applying CDK constructs to DataOps, customers can benefit from templated ETL pipelines that maintain data lineage metadata and perform data cleansing.

Customers also build constructs for non-AWS resources. Teams can build constructs for third-party builder tooling, observability systems, testing apparatuses, and more. In this way, workload teams can codify AWS and non-AWS resources in one code base. When writing your own constructs, there is a balance to strike between enforcing standardization and preserving the freedom and flexibility to take advantage of the growing ecosystem of CDK packages. AWS Solutions Constructs illustrate this balance, as they are typically built upon standard constructs. If you don't extend standard constructs, the constructs you build will be harder for consumers to integrate with the larger CDK ecosystem, which relies on standardized interfaces.

Construct Hub is a central destination for discovering and sharing cloud application design patterns and reference architectures defined for CDK that are built and published by the AWS community. While AWS provides a public Construct Hub, enterprises can maintain a private Construct Hub inside their own AWS accounts (see construct-hub, the GitHub repository, or the CDK Workshop for more details). The primary objective in either case remains consistent: to provide shared libraries that can be readily utilized by different workload teams. This approach ensures enhanced consistency and reusability, and ultimately leads to cost reduction and faster development timelines.

A common pitfall with this approach is that Platform Engineering cannot keep up with building reusable components for the latest technology enhancements. This is where the lessons learned from a pilot really help. A pilot team works with platform engineering to research and implement security best practices. Some customers have the platform engineering team act as approvers of new constructs in addition to authors of new constructs. In this model, a pilot team builds constructs for a new technology, and the platform engineers approve them, ensuring the pilot team meets required standards such as encryption at rest, encryption in transit, and least privilege. Once approved, the pilot team can publish the new constructs to Construct Hub. In this way, platform engineering enables experimentation and innovation rather than becoming a gatekeeper. Additionally, platform engineering teams can encourage and curate an inner-sourcing model for construct creation rather than being the sole creators of constructs.

Deploy Applications Using DevSecOps Best Practices

Application builders are most productive when their expertise is channeled towards writing code that directly addresses business challenges. While creating applications is a skill well within the grasp of many software developers, the complex task of deploying and operating these applications in line with organizational standards can be overwhelming, especially for those new to a team. This complexity often acts as a bottleneck, slowing down the experimentation process and delaying the realization of value from new application initiatives.

A solution to this challenge lies in automating the deployment pipeline and operational model. By employing thoroughly tested CDK (Cloud Development Kit) components that are shared across teams and validated through a robust CI/CD (Continuous Integration/Continuous Deployment) process, the burden on developers is significantly reduced. They no longer need to delve into the complexities of the organization’s deployment strategies, allowing them to concentrate on writing unique, innovative code. This approach not only streamlines the development process but also bridges the gap between development and operations, leading to more cohesive teams and faster, more efficient releases.

One key to high-quality software delivery is to have a proper Continuous Integration and Continuous Delivery (CI/CD) process in place. See CDK Pipelines: Continuous delivery for AWS CDK applications for practical examples. This high-level construct, powered by AWS CodePipeline, comes in handy when you need to go beyond test deployments with the cdk deploy command and build automated pipelines for production deployments to multiple environments in different Regions and/or accounts.

Whenever you commit your AWS CDK app's source code to AWS CodeCommit, GitHub, GitLab, BitBucket, or an Amazon CodeCatalyst source repository, CDK Pipelines automatically builds, tests, and deploys a new version of the application. The pipeline also reconfigures itself to deploy as the resources in stacks change or the target environments change. For GitHub Actions users, see CDK Pipelines for GitHub Workflows.
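
A minimal sketch of such a self-mutating pipeline, assuming aws-cdk-lib v2 in Python, is shown below; the repository name, branch, and connection ARN are hypothetical placeholders.

    import aws_cdk as cdk
    from aws_cdk import pipelines
    from constructs import Construct

    class AppStage(cdk.Stage):
        """Stand-in for the application's deployable stacks."""

        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(scope, construct_id, **kwargs)
            cdk.Stack(self, "Workload")

    class PipelineStack(cdk.Stack):
        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(scope, construct_id, **kwargs)
            pipeline = pipelines.CodePipeline(
                self, "Pipeline",
                synth=pipelines.ShellStep(
                    "Synth",
                    input=pipelines.CodePipelineSource.connection(
                        "my-org/my-app", "main",
                        connection_arn="arn:aws:codestar-connections:us-east-1:111122223333:connection/example",
                    ),
                    commands=[
                        "npm install -g aws-cdk",
                        "pip install -r requirements.txt",
                        "cdk synth",
                    ],
                ),
            )
            # The pipeline updates itself first when this definition changes,
            # then deploys the application stage(s).
            pipeline.add_stage(AppStage(self, "Staging"))

    app = cdk.App()
    PipelineStack(app, "PipelineStack")
    app.synth()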

A number of teams extend these pipelines, adding their own stages to ensure deployed code meets the organization's quality, security, risk, compliance, and cloud financial management criteria. For best practices on what automation to put inside the pipeline, see the AWS Deployment Pipeline Reference Architecture. By creating fully functional pipelines, platform engineering teams can reduce the cognitive load placed on development teams and improve the developer experience. This strategy has two implementations: QuickStart pipelines and golden pipelines.

QuickStart pipelines are created as constructs in your Construct Hub and treated like the reusable components discussed above. While these pipelines offer simplified interfaces and reduce cognitive load, workload teams remain in control of the pipeline and are free to modify it. As a result, quality gates such as security or compliance tooling can be disabled by workload teams, and controls inside the pipeline aren't provable. This is suboptimal for organizations looking to reduce the costs of compliance and audit. Moreover, as the number of versions of the construct grows, teams can have difficulty governing which versions are in use.

In golden pipelines, the pipelines are created as constructs, but deployed via a centralized team. Workload teams cannot control or modify these pipelines, so quality gates such as security and compliance tooling cannot be disabled. These controls become provable to stakeholders in security, risk and compliance such as auditors. Removing permissions from workload teams comes with costs. With golden pipelines, platform engineering teams often spend a majority of their time troubleshooting workload teams’ deployments. With so much time spent on troubleshooting, teams have little time to introduce new tooling to raise the security and quality standard, improve environment setup and organizational consistency, or improve audit evidence and enforcement.

Two mechanisms can augment these strategies. Traditional change control boards (CCBs) can provide provability in situations where gathering evidence and enforcement are difficult. CCBs can benefit from CDK constructs that integrate IT Service Management (ITSM) approvals and fleet management processes into the pipeline and account creation processes. Alternatively, there is an emerging story with Supply-chain Levels for Software Artifacts (SLSA). These artifacts can be used as digital proof. In the Kubernetes space, we see this pattern with tools like Tekton Chains, where attestations are associated with OCI images and Kyverno is used to enforce the presence of attestations (see Protect the pipe! Secure CI/CD pipelines with a policy-based approach using Tekton and Kyverno for details).

Multi-account and cross-region deployment with CDK

DevOps best practices suggest multiple stages of deployment and testing before deploying to production. On top of that, AWS recommends a dedicated account for each stage to simplify resource isolation and access control. This multi-account strategy helps organizations make the best use of AWS resources and provides fine-grained controls (see Recommended OUs and accounts).

Often, you will have a designated AWS account, where all CI/CD pipelines reside. A deployment is executed by these pipelines to publish to other AWS accounts, which may correspond to development, staging, or production stages. For more information about a cross-account strategy in reference to CI/CD pipelines on AWS, see Building a Secure Cross-Account Continuous Delivery Pipeline.
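
For illustration, here is a minimal sketch of how one CDK app targets multiple accounts and Regions by assigning each stack an environment; the account IDs are placeholders.

    import aws_cdk as cdk

    class WorkloadStack(cdk.Stack):
        """Stand-in for the application's stack definition."""

    app = cdk.App()
    # The same stack definition deploys to a development and a production account.
    WorkloadStack(app, "Dev",
                  env=cdk.Environment(account="111111111111", region="us-east-1"))
    WorkloadStack(app, "Prod",
                  env=cdk.Environment(account="222222222222", region="eu-west-1"))
    app.synth()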

Automated Governance

Many enterprise customers leverage CDK to enforce security controls and policies, preventing security issues before deployment by analyzing code as part of the deployment pipeline. Using the industry-standard tooling of cdk-nag, many teams check applications for best practices using a combination of available rule packs. We are also seeing enterprises build their own Aspects to enforce additional requirements, such as tagging requirements to manage and organize their deployed resources.
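
As a sketch, applying a cdk-nag rule pack and a tagging requirement across an entire app can be as small as the following, assuming the cdk-nag Python package is installed; the tag key and value are illustrative.

    import aws_cdk as cdk
    from cdk_nag import AwsSolutionsChecks

    app = cdk.App()
    stack = cdk.Stack(app, "Workload")  # stand-in for real application stacks

    # Evaluate the AWS Solutions rule pack against every resource at synth time.
    cdk.Aspects.of(app).add(AwsSolutionsChecks(verbose=True))
    # Aspects also power organization-wide requirements such as mandatory tags.
    cdk.Tags.of(app).add("cost-center", "1234")
    app.synth()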

Customers can take the CloudFormation templates that CDK synthesizes and add checkpoints with CloudFormation Guard to verify the output using policy-as-code domain-specific language (DSL) rules. Platform engineering teams can build the rules, while workload teams consume them and run CloudFormation Guard inside the pipeline. There is an official construct that makes it easy to add CloudFormation Guard checks to your application.
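
One way to wire those checks in, sketched under the assumption that the cdklabs cdk-validator-cfnguard package exposes a CfnGuardValidator plugin with these Python names, is to register it as a policy validation plugin on the app; the rules directory is a placeholder for rules your platform team publishes.

    import aws_cdk as cdk
    from cdklabs.cdk_validator_cfnguard import CfnGuardValidator  # assumed module path

    app = cdk.App(
        policy_validation_beta1=[
            # Validate every synthesized template against the team's Guard rules.
            CfnGuardValidator(rules=["guard-rules/"])  # hypothetical rules directory
        ]
    )
    cdk.Stack(app, "Workload")  # stand-in for real application stacks
    app.synth()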

With AWS CDK, infrastructure is code. So, the standard tooling you already use to ensure quality and improve the builder experience should be used with CDK. If your organization has a code quality program, treat CDK applications no differently than web applications or microservices. Similarly, with Amazon CodeGuru Security and Amazon CodeWhisperer, builders can get actionable recommendations on how to improve both the security and quality of their CDK code, just as they would with any other type of application.

With Aspects, cdk-nag, and code quality tools, organizations can prevent security issues before they are deployed. However, it is also important to create controls that work after a deployment occurs. AWS CloudFormation Hooks allow customers to inspect resources before CloudFormation stacks or CDK applications are created, updated, or deleted. With CloudFormation Hooks, platform engineering teams can issue warnings or block the provisioning of non-compliant resources. These hooks can themselves be created via CDK (see Build and Deploy CloudFormation Hooks using a CI/CD Pipeline for details).

Finally, you can deploy AWS Config conformance packs via CDK. These collections of rules help your organization insist on security standards at scale. If your organization wishes to build custom rules, teams can build reactive controls using higher-level constructs for AWS Config rules. While many of these patterns existed prior to CDK, CDK helps accelerate building and deploying cloud applications and controls by leveraging reusable components that are shared within the enterprise or by the community at large.
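
As a small example of such a reactive control, the following sketch (aws-cdk-lib v2, Python) deploys an AWS Config managed rule that checks whether S3 bucket versioning is enabled; the stack and construct names are illustrative.

    import aws_cdk as cdk
    from aws_cdk import aws_config as config
    from constructs import Construct

    class GovernanceStack(cdk.Stack):
        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(scope, construct_id, **kwargs)
            # Unlike synth-time checks, this rule evaluates resources after deployment.
            config.ManagedRule(
                self, "S3VersioningEnabled",
                identifier=config.ManagedRuleIdentifiers.S3_BUCKET_VERSIONING_ENABLED,
            )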

Operate the Application using Observability

The open-source community provides high-level construct libraries that expand the basic monitoring capabilities for CDK applications. The cdk-monitoring-constructs project makes it easy to monitor CDK apps. Similarly, cdk-watchful takes that a step further, adding many additional services and providing easily configurable interfaces so you can automatically be notified via AWS Systems Manager Incident Manager, AWS Chatbot, or Amazon Simple Notification Service. By leveraging prebuilt solutions from the open-source community, you can focus on creating custom metrics and thresholds around your business logic. Platform engineering teams can modify and extend open-source projects to help workload teams simplify their operations and emit health and status to centralized systems.
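
For instance, a hedged sketch of monitoring a Lambda function with cdk-monitoring-constructs might look like the following; the Python module and method names follow the project's documented pattern but should be verified against the version you install.

    import aws_cdk as cdk
    from aws_cdk import aws_lambda as lambda_
    from cdk_monitoring_constructs import MonitoringFacade
    from constructs import Construct

    class MonitoredStack(cdk.Stack):
        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(scope, construct_id, **kwargs)
            fn = lambda_.Function(
                self, "Handler",
                runtime=lambda_.Runtime.PYTHON_3_12,
                handler="index.handler",
                code=lambda_.Code.from_inline(
                    "def handler(event, context):\n    return 'ok'"),
            )
            # Adds a dashboard and sensible default alarms for the function.
            MonitoringFacade(self, "Monitoring").monitor_lambda_function(
                lambda_function=fn)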

Accelerate New Project Startup with an Internal Developer Platform

An Internal Developer Platform (IDP) is built by platform engineering teams to provide golden paths and enable developer self-service. These golden paths are expressed as a series of templates that define the structure of a source control repository and the files stored inside it. When the IDP uses these templates to create source code repositories, the resultant repository contains the following:

  • A getting-started tutorial (usually in a README.md)
  • Reference documentation
  • Skeleton source code
  • Dependency Management
  • CI/CD pipeline template
  • IaC template
  • Observability configuration

With CDK, the CI/CD pipeline, IaC template, and observability configuration can all be a part of a single CDK application.
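
A sketch of that idea is shown below, with illustrative stack names: a single Stage bundles the workload infrastructure and its observability so a pipeline, such as the one sketched earlier, can deploy them together.

    import aws_cdk as cdk
    from constructs import Construct

    class InfraStack(cdk.Stack):
        """Stand-in for the workload's resources (queues, tables, functions)."""

    class ObservabilityStack(cdk.Stack):
        """Stand-in for dashboards, alarms, and notification wiring."""

    class GoldenPathStage(cdk.Stage):
        """One deployable unit: IaC and observability side by side."""

        def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
            super().__init__(scope, construct_id, **kwargs)
            InfraStack(self, "Infra")
            ObservabilityStack(self, "Observability")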

Platform engineering teams build golden paths and expose them using tools like Backstage, Humanitec, or Port. When building golden paths, there are two common approaches to the underlying project structure: some organizations keep their IaC code repository separate from the application code, while others include everything in one repository. There is a healthy tension between how much to place inside a golden path versus inside a reusable component. In both strategies, platform engineering teams can avoid code duplication by leveraging CDK. The approach your organization chooses will dictate how you organize your reusable components. Below, we walk through both options and their implications for reusable constructs.

Option 1: Everything in one repository

In this approach, all the code is contained in one repository: infrastructure, application, configuration, and deployment. This approach enables builders to collaborate, build features, and innovate together quickly, which is why it is the recommended approach. For more details, refer to the Best practices documentation. For examples, see AWS Deployment Reference Architecture for Applications.

This approach works best in teams that are “value-stream aligned.” Value-stream aligned teams have development and operations capabilities within the same team and are organized around solving problems for customers rather than around technical capabilities. Within the project, teams can organize around logical units such as application tier (API, database, and so on) or business capability (order management, product catalog, delivery services, and so on). In organizations that are value-stream aligned, larger, highly conventionalized reusable components work better. An extreme example of this type of construct is a single construct that contains all the code for an entire microservice. In these teams, the cognitive load focuses on the customer problem, so reducing the complexity of developing applications is critical to success.

Option 2: Separated application code pipeline

In this alternative approach, you can decouple your application code from your infrastructure by storing them in separate repositories and having separate pipelines. Separating the pipelines often leads to siloes and less collaboration between workload builders, who shift focus to developing features, and infrastructure engineers, who limit their efforts to building the infrastructure on which those applications run.

This approach works best in teams that are “matrixed.” A matrix organization is structured around technical capabilities (development, operations, security, business, etc.). In these cases, more modular constructs work better than constructs that are highly conventionalized. Experts from each organization can use CDK constructs as mechanisms to share their expertise across the entire organization. Examples of these types of constructs are monitoring, alerting, or security constructs prebuilt with hooks to plug in to centralized monitoring.

Building a Community of Practice with Platform Engineering

Scaling any new technology within a large organization requires the creation and enablement of a community that fosters collaboration, establishes best practices, and stays up to date with the changes in the ecosystem. In order to enable the creation of these communities of practice within your organization, AWS supports multiple public communities centered around the creation of content to educate and enable CDK users. Members of your organization’s community of practice can connect with other CDK development teams around the world through these public AWS supported communities.

Communities of Practice

A Community of Practice (CoP) is a group of people with a shared interest who come together to learn, collaborate, and develop expertise in a specific domain through informal interactions and knowledge sharing. Within your organization, establishing communities of practice around CDK has proven to enable mentorship, problem solving, and reusable assets. To get started, your platform engineering team (the creators of reusable constructs and builder tooling with CDK) becomes the early content creator for the community of practice. This establishes a feedback loop: CDK creators publicize their achievements via the CoP, and consumers can ask questions and provide direct feedback to the creators. Once the CoP has grown sustainably beyond the initial group that established it, it can start to host hack-a-thons or game days within your organization, which can bring innovation and solve organization-wide challenges. Fully mature communities of practice own curated wikis or knowledge bases, and they use mechanisms such as townhalls, office hours, newsletters, and chat channels to keep the community up to date. In this way, CDK expertise is diffused across the organization. At AWS, this diffusion of expertise has led to teams other than platform engineering becoming creators of reusable constructs. By expanding who can create reusable constructs, we are able to accelerate our own innovation.

Communities

There is a growing community that supports CDK, with many different platforms providing content, code, examples, and meetups. CDK is maintained by AWS with support from the community on the AWS CDK GitHub page, where you can contribute to the platform, raise issues, see the backlog, and join discussions with active community members.

CDK.dev is the community driven hub around the CDK ecosystem. This site brings together all the latest blogs, videos, and educational content. It also provides links to join the community Slack platform.

CDK Patterns houses an open-source collection of AWS serverless architecture patterns built with CDK for developers to use. These patterns are sourced via AWS Community Builders and AWS Heroes.

Finally, AWS re:Post provides a question-and-answer portal where the community can ask and resolve questions.

The AWS Community Builders program offers technical resources, education, and networking opportunities to AWS technical enthusiasts and emerging thought leaders who are passionate about sharing knowledge and connecting with the technical community.

Communities of practice can leverage AWS public communities like cdk.dev to fill gaps in knowledge. Townhalls can benefit from speakers from AWS Heroes or community builders, frequent contributors to GitHub or re:Post, or speakers from CDK Day. Newsletters can aggregate and summarize the latest news from across all AWS channels. Once your community of practice establishes CDK competencies, this collaboration can also be bidirectional. For example, experts in your organization’s community can become AWS Heroes. Success stories can be shared via CDK Day, guest blog posts, and you might even speak at one of our major events such as AWS Summits, AWS re:Invent, AWS re:Inforce, or AWS re:Mars.

Final Thoughts

As we've said throughout this post, with CDK, infrastructure is code. This has enabled a paradigm shift in the infrastructure management space. Today, we see many customers, such as Liberty Mutual, Scenario, Checkmarx, and Registers of Scotland, establishing mature ecosystems using CDK. With an active open-source community, an AWS development team for long-term support, and multiple platforms for knowledge sharing, your builders can quickly learn, build, and innovate. Following successful pilots, many organizations adopt CDK, become more agile, and innovate faster. This is exactly what happened at Amazon, where CDK is the first choice for building new services.

Organizations often scale and reduce complexity through platform engineering. These teams build higher-level constructs by applying best practices, and provide CI/CD pipelines to accelerate deployments. Deployments become safer when you unit test your infrastructure as code and apply robust security controls that guide builders at every stage, from authoring to operating.

Finally, establishing a community enables your organization to build its own mature ecosystem. Through both internal and open-source communities your builders can connect, discover, and grow.

David Hessler

Prior to joining AWS, David spent a decade serving as a principal technologist and establishing Platform Engineering and SRE teams for the United States government. Since joining AWS in 2020, David has spent his time helping customers accelerate deployment speed and safety for some of AWS’s largest commercial and public sector customers. Today, as a part of the DevSecOps team within Global Services Security, he is building the next generation of DevSecOps tooling for AWS customers.

Amritha Shetty

Amritha is a Solutions Architect at AWS. She works with public sector customers to help them migrate and modernize in the cloud. She loves helping citizens get more from public sector institutions through rapid innovation in the cloud. She brings over twelve years of software design and development experience and is passionate about helping customers implement the next-generation development experience.

Chris Scudder

Chris is a Senior Solutions Architect with the UK Public Sector team. His primary focus is helping Public Sector customers adopt cloud technologies for their workloads, helping them streamline their development and operational processes. He has a background in application development and has created multiple Industry Solutions for UK Local Government. He has an interest in Machine Learning and delivers AWS DeepRacer events alongside his day-to-day role.

Kumar Karra

Kumar Karra is a Senior Field Solutions Architect for AWS Small and Medium Business Customers. He has a strong background in designing and developing applications, from small consumer-facing apps to large, mission-critical applications for enterprises. He specializes in NextGen Developer Experience tools and enjoys helping customers shorten their time to value by guiding them on strategies to implement fast, repeatable, testable, and scalable tools and architectures.

How to build a multi-Region AWS Security Hub analytic pipeline and visualize Security Hub data

Originally published at https://aws.amazon.com/blogs/security/how-to-build-a-multi-region-aws-security-hub-analytic-pipeline/

AWS Security Hub is a service that gives you aggregated visibility into your security and compliance posture across multiple Amazon Web Services (AWS) accounts. By joining Security Hub with Amazon QuickSight—a scalable, serverless, embeddable, machine learning-powered business intelligence (BI) service built for the cloud—your senior leaders and decision-makers can use dashboards to empower data-driven decisions and facilitate a secure configuration of AWS resources.

In organizations that operate at cloud scale, being able to summarize and perform trend analysis is key to identifying and remediating problems early, which leads to the overall success of the organization. Additionally, QuickSight dashboards can be embedded in dashboard and reporting platforms that leaders are already familiar with, making the dashboards even more user friendly.

With the solution in this blog post, you can provide leaders with cross-Region views of data that let decision-makers assess the health and status of an organization's IT infrastructure at a glance. You can also enrich the dashboard with data sources not available to Security Hub. Finally, this solution gives you the flexibility to have multiple administrator accounts across several AWS organizations and combine them into a single view.

In this blog post, you will learn how to build an analytics pipeline of your Security Hub findings, summarize the data with Amazon Athena, and visualize the data via QuickSight using the following steps:

  • Deploy an AWS Cloud Development Kit (AWS CDK) stack that builds the infrastructure you need to get started.
  • Create an Athena view that summarizes the raw findings.
  • Visualize the summary of findings in QuickSight.
  • Secure QuickSight using best practices.

For a high-level discussion without code examples, see Visualize AWS Security Hub Findings using Analytics and Business Intelligence Tools.

Prerequisites

This blog post assumes that you:

  • Have a basic understanding of how to authenticate and access your AWS account.
  • Are able to run commands via a command line prompt on your local machine.
  • Have a basic understanding of Structured Query Language (SQL).

Solution overview

Figure 1 shows the flow of events and a high-level architecture diagram of the solution.

Figure 1. High level architecture diagram

The steps shown in Figure 1 include:

  • Detect
  • Collect
  • Aggregate
  • Ingest
  • Transform
  • Analyze
  • Visualize

Detect

AWS offers a number of tools to help detect security findings continuously.

In this blog, you will use two built-in security standards of Security Hub—CIS AWS Foundations Benchmark controls and AWS Foundational Security Best Practices Standard—and a serverless Prowler scanner that acts as a third-party partner product. In cases where AWS Organizations is used, member accounts send these findings to each member account's own Security Hub.

Collect

Within a Region, security findings are centralized into a single administrator account using Security Hub.

Aggregate

Using the cross-Region aggregation feature within Security Hub, findings within each administrator account can be aggregated and continuously synchronized across multiple Regions.

Ingest

Security Hub not only provides a comprehensive view of security alerts and security posture across your AWS accounts, it also acts as a data sink for your security tools. Any tool that can expose data via AWS Security Finding Format (ASFF) can use the BatchImportFindings API action to push data to Security Hub. For more details, see Using custom product integration to send findings to AWS Security Hub and Available AWS service integrations in the Security Hub User Guide.
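
For example, a custom tool might push a single ASFF finding with boto3, as sketched below; the account ID and product ARN are placeholders, and real findings typically carry many more fields.

    import boto3
    from datetime import datetime, timezone

    securityhub = boto3.client("securityhub", region_name="us-east-1")
    now = datetime.now(timezone.utc).isoformat()

    response = securityhub.batch_import_findings(
        Findings=[{
            "SchemaVersion": "2018-10-08",
            "Id": "example-finding-001",
            "ProductArn": "arn:aws:securityhub:us-east-1:111122223333:"
                          "product/111122223333/default",
            "GeneratorId": "example-scanner",
            "AwsAccountId": "111122223333",
            "Types": ["Software and Configuration Checks"],
            "CreatedAt": now,
            "UpdatedAt": now,
            "Severity": {"Label": "MEDIUM"},
            "Title": "Example finding",
            "Description": "Illustrative finding pushed by a custom tool.",
            "Resources": [{"Type": "AwsAccount",
                           "Id": "AWS::::Account:111122223333"}],
        }]
    )
    print(response["SuccessCount"], "finding(s) imported")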

Transform

Data coming out of Security Hub is exposed via Amazon EventBridge. Unfortunately, it's not quite in a form that Athena can consume. EventBridge streams the data through Amazon Kinesis Data Firehose directly to Amazon Simple Storage Service (Amazon S3). From there, an AWS Lambda function flattens the data and fixes some of the column names, such as by removing special characters that Athena cannot recognize, and saves the results back to S3. Finally, an AWS Glue crawler dynamically discovers the schema of the data and creates or updates an Athena table.
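
The Lambda function in this solution is deployed for you by the CDK stack, but the kind of transformation involved looks roughly like the sketch below; the exact renaming rules in the deployed function may differ.

    import re

    def flatten(obj: dict, prefix: str = "") -> dict:
        """Recursively flatten nested keys and strip characters Athena rejects."""
        flat = {}
        for key, value in obj.items():
            clean = re.sub(r"[^0-9a-zA-Z_]", "", key.lower())
            name = f"{prefix}_{clean}" if prefix else clean
            if isinstance(value, dict):
                flat.update(flatten(value, name))
            else:
                flat[name] = value
        return flat

    def handler(event, context):
        # In the deployed pipeline the records arrive via Kinesis Data Firehose;
        # flattening a single finding is shown here for illustration.
        return flatten(event)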

Analyze

You will aggregate the raw findings data and create metrics along various grains or pivots by creating a simple yet meaningful Athena view. With Athena, you also can use views to join the data with other data sources, such as your organization’s configuration management database (CMDB) or IT service management (ITSM) system.

Visualize

Using QuickSight, you will register the data sources and build visualizations that can be used to identify areas where security can be improved and risk reduced. The Build QuickSight visualizations section below shares detailed steps for doing this.

Use AWS CDK to deploy the infrastructure

In order to analyze and visualize security related findings, you will need to deploy the infrastructure required to detect, ingest, and transform those findings. You will use an AWS CDK stack to deploy the infrastructure to your account. To begin, review the prerequisites to make sure you have everything you need to deploy the CDK stack. Once the CDK stack is deployed, you can deploy the actual infrastructure. After the infrastructure has been deployed, you will build an Athena view and a QuickSight visualization.

Install the software to deploy the solution

For the solution in this blog post, you must have the following tools installed:

  • The solution in this blog post is written in Python, so you must install Python in addition to CDK. Instructions on how to install Python 3.x can be found on the Python downloads page.
  • AWS CDK requires Node.js. Directions on how to install Node.js can be found on the Node.js downloads page.
  • This CDK application uses Docker for local bundling. Directions for using Docker can be found at Get Docker.
  • AWS CDK—a software-development framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation. To install CDK, visit the AWS CDK Toolkit page.

To confirm you have everything you need

  1. Confirm you are running version 1.108.0 or later of CDK.

    $ cdk --version

  2. Download the code from GitHub by cloning the repository, then cd into the cloned directory.

    $ git clone git@github.com:aws-samples/aws-security-hub-analytic-pipeline.git

    $ cd aws-security-hub-analytic-pipeline

  3. Manually create a virtualenv.

    $ python3 -m venv .venv

  4. After the initialization process completes and the virtualenv is created, you can use the following step to activate your virtualenv.

    $ source .venv/bin/activate

  5. If you’re using a Windows platform, use the following command to activate virtualenv:

    % .venv\Scripts\activate.bat

  6. Once the virtualenv is activated, you can install the required dependencies.

    $ pip install -r requirements.txt

Use AWS CDK to deploy the infrastructure into your account

The following steps use AWS CDK to deploy the infrastructure. This infrastructure includes the various scanners, Security Hub, EventBridge, and Kinesis Firehose streams. When complete, the raw Security Hub data will already be stored in an S3 bucket.

To deploy the infrastructure using AWS CDK

  1. If you’ve never used AWS CDK in the account you’re using or if you’ve never used CDK in the us-east-1, us-east-2, or us-west-1 Regions, you must bootstrap the regions via the command prompt.

    $ cdk bootstrap

  2. At this point, you can deploy the stack to your default AWS account via the command prompt.

    $ cdk deploy --all

  3. While cdk deploy is running, you will see the output shown in Figure 2. This is a prompt to ensure you're aware that you're making a security-relevant change and creating AWS Identity and Access Management (IAM) roles. Enter y when prompted to continue the deployment process:

    Figure 2. CDK approval prompt to create IAM roles

  4. Confirm cdk deploy is finished. When the deployment is complete, you should see three stack ARNs, similar to Figure 3.

    Figure 3. Final output of CDK deploy

As a result of the deployed CDK code, Security Hub and the Prowler scanner will automatically scan your account, process the data, and send it to S3. While it takes less than an hour for some data to be processed and become searchable in Athena, we recommend waiting 24 hours before proceeding to the next steps to ensure enough data is processed to generate useful visualizations, because the remaining steps roll up findings by the hour. Also, it takes several minutes to get initial results from the Security Hub standards and up to an hour to get initial results from Prowler.

Build an Athena view

Now that you've deployed the infrastructure to detect, ingest, and transform security-related findings, it's time to use an Athena view to accomplish the analyze portion of the solution. The following view aggregates the number of findings for each hour. Athena views can be used to summarize data or enrich it with data from other sources. Use the following steps to build a simple example view. For more information on creating Athena views, see Working with Views.

To build an Athena view

  1. Open the AWS Management Console and ensure that the Region is set to us-east-1 (Northern Virginia).
  2. Navigate to the Athena service. If you’ve never used this service, choose Get Started to navigate to the Query Editor screen. Otherwise, the Query Editor screen is the default view.
  3. If you’re new to Athena, you also need to set up a query result location.
    1. Choose Settings in the top right of the Query Editor screen to open the settings panel.
    2. Choose Select to select a query result location.

      Figure 4. Athena settings

    3. Locate an S3 bucket in the list that starts with analyticsink-queryresults and choose the right-arrow icon.
    4. Choose Select to select a query results bucket.

      Figure 5. Select S3 location confirmation

  4. Select AwsDataCatalog as the Data source and security_hub_database as the Database. The Query Editor screen should look like Figure 6.

    Figure 6. Empty query editor

  5. Copy and paste the following SQL in the query window:

    CREATE OR REPLACE VIEW "security-hub-rolled-up-finding" AS
    SELECT
      "date_format"("from_iso8601_timestamp"(updatedat), '%Y-%m-%d %H:00') year_month_day
    , region
    , compliance_status
    , workflowstate
    , severity_label
    , COUNT(DISTINCT title) AS cnt
    FROM
      security_hub_database."security-hub-crawled-findings"
    GROUP BY "date_format"("from_iso8601_timestamp"(updatedat), '%Y-%m-%d %H:00'), compliance_status, workflowstate, severity_label, region

  6. Choose the Run query button.

If everything is correct, you should see Query successful in the Results, as shown in Figure 7.

Figure 7. Creating an Athena view

Build QuickSight visualizations

Now that you've deployed the infrastructure to detect, ingest, and transform security-related findings, and have created an Athena view to analyze those findings, it's time to use QuickSight to visualize them. To use QuickSight, you must first grant QuickSight permissions to access S3 and Athena. Next, you create a QuickSight dataset. Then you create a QuickSight analysis. Optionally, when complete, you can publish the analysis as a dashboard.

You will build a simple visualization that shows counts of findings over time separated by severity, though it’s also possible to use QuickSight to tell rich and compelling visual stories.

In order to use QuickSight, you need to sign up for a QuickSight subscription. Steps to do so can be found in Signing Up for an Amazon QuickSight Subscription.

The first thing you need to do once logged in to QuickSight is create the data source. If this is your first time logging in to the service, you will be greeted with an initial QuickSight page as shown in Figure 8.

Figure 8. Initial QuickSight page

Grant QuickSight access to S3 and Athena

While creating the Athena data source will enable QuickSight to query data from Athena, you also need to enable QuickSight to read from S3.

To grant QuickSight access to S3 and Athena

  1. Inside QuickSight, select your profile name (upper right). Choose Manage QuickSight, and then choose Security & permissions.
  2. Choose Add or remove.
  3. Ensure the checkbox next to Athena is selected.
  4. Ensure the checkbox next to Amazon S3 is selected.
  5. Choose Details and then choose Select S3 Buckets.
  6. Locate an S3 bucket in the list that starts with analyticsink-bucket and ensure the checkbox is selected.
    Figure 9. Example permissions

  7. Choose Finish to save changes.

Create a QuickSight dataset

Once you’ve given QuickSight the necessary permissions, you can create a new dataset.

To create a QuickSight dataset

  1. Choose Datasets from the navigation pane at left. Then choose New Dataset.

    Figure 10. Dataset page

  2. To create a new Athena connection profile, use the following steps:
    1. In the FROM NEW DATA SOURCES section, choose the Athena data source card.
    2. For Data source name, enter a descriptive name. For example: security-hub-rolled-up-finding.
    3. For Athena workgroup choose [ primary ].
    4. Choose Validate connection to test the connection. This also confirms encryption at rest.
    5. Choose Create data source.
  3. On the Choose your table screen, select:
    Catalog: AwsDataCatalog
    Database: security_hub_database
    Table: security-hub-rolled-up-finding
  4. Finally, select the Import to SPICE for quicker analytics option and choose Visualize.

Once you’re finished, the page to create your first analysis will automatically open. Figure 11 shows an example of the page.

Figure 11. Create an analysis page

Create a QuickSight analysis

A QuickSight analysis is more than just a visualization—it helps you uncover hidden insights and trends in your data, identify key drivers, and forecast business metrics. You can create rich analytic experiences with QuickSight. For more information, visit Working with Visuals in the QuickSight User Guide.

For simplicity, you'll build a visualization that summarizes finding counts by severity, aggregated by hour.

To create a QuickSight analysis

  1. Choose Line Chart from the Visual Types.

    Figure 12. Visual types

  2. Select Fields. Figure 13 shows what your field wells should look like at the end of this step.
    1. Locate the year_month_day_hour field in the field list and drag it over to the X axis field well.
    2. Locate the cnt field in the field list and drag it over to the Value field well.
    3. Locate the severity_label field in the field list and drag it over to the Color field well.

      Figure 13. Field wells

  3. Add Filters.
    1. Select Filter in the left navigation panel.

      Figure 14. Filters panel

    2. Choose Create one… and select the compliance_status field.
    3. Expand the filter and clear NOT_AVAILABLE and PASSED (Note: depending on your data, you might not have all of these statuses).
    4. Choose Apply to apply the filter.

      Figure 15. Filtering out findings that are not failing

You should now see a visualization that looks like Figure 16, which shows a summary count of events and their severity.

Figure 16. Example visualization (note: this visualization has five days’ worth of data.)

Publish a QuickSight analysis dashboard (optional)

Publishing a dashboard is a great way to share reports with leaders. This two-step process allows you to share visualizations as a dashboard.

To publish a QuickSight analysis

  1. Choose Share on the application bar, then choose Publish dashboard.
  2. Select Publish new dashboard as, and then enter a dashboard name, such as Security Hub Findings by Severity.

You can also embed dashboards into web applications. This requires using the AWS SDK or through the AWS Command Line Interface (AWS CLI). For more information, see Embedding QuickSight Data Dashboards for Everyone.
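
As a sketch of the SDK route, the boto3 call below generates an embed URL for a registered QuickSight user; the account ID, user ARN, and dashboard ID are placeholders.

    import boto3

    quicksight = boto3.client("quicksight", region_name="us-east-1")
    response = quicksight.generate_embed_url_for_registered_user(
        AwsAccountId="111122223333",
        UserArn=("arn:aws:quicksight:us-east-1:111122223333:"
                 "user/default/example-user"),
        ExperienceConfiguration={
            "Dashboard": {"InitialDashboardId": "hypothetical-dashboard-id"}
        },
    )
    print(response["EmbedUrl"])  # short-lived URL to place in an iframe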

Recommended security posture in QuickSight

QuickSight has a number of security features. For a detailed discussion of the standards that apply to this specific scenario, see AWS security in Amazon QuickSight within the QuickSight User Guide.

Clean up (optional)

When done, you can clean up the QuickSight resources, the Athena view, and the CDK stack. Follow the detailed steps below to clean up everything.

To clean up QuickSight

  1. Open the console and choose Datasets in the left navigation pane.
  2. Select security-hub-rolled-up-finding then choose Delete dataset.
  3. Confirm dataset deletion by choosing Delete.
  4. Choose Analyses from the left navigation pane.
  5. Choose the menu in the lower right corner of the security-hub-rolled-up-finding card.

    Figure 17. Example analysis card

  6. Select Delete and confirm Delete.

To remove the Athena view

  1. Paste the following SQL in the query window:

    DROP VIEW "security-hub-rolled-up-finding"

  2. Choose the Run query button.

To remove the CDK stack

  1. Run the following command in your terminal:

    cdk destroy

    Note: If you experience errors, you might need to reactivate your Python virtual environment by completing steps 3–5 of Use AWS CDK to deploy the infrastructure.

Conclusion

In this blog, you used Security Hub and QuickSight to deploy a scalable analytic pipeline for your security tools. Security Hub allowed you to join and collect security findings from multiple sources. With QuickSight, you summarized data for your senior leaders and decision-makers to give them the right data in near real time.

You ensured that your sensitive data remained protected by explicitly granting QuickSight the ability to read from a specific S3 bucket. By authorizing access only to the data sources needed to visualize your data, you ensure least privilege access. QuickSight supports many other AWS data sources, including Amazon RDS, Amazon Redshift, Lake Formation, and Amazon OpenSearch Service (successor to Amazon Elasticsearch Service). Because the data doesn’t live inside an Amazon Virtual Private Cloud (Amazon VPC), you didn’t need to grant access to any specific VPCs. Limiting access to VPCs is another great way to improve the security of your environment.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Security Hub forum. To start your 30-day free trial of Security Hub, visit AWS Security Hub.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

David Hessler

David is a senior cloud consultant with AWS Professional Services. He has over a decade of technical experience helping customers tackle their most challenging technical problems and providing tailor-made solutions using AWS services. He is passionate about DevOps, security automation, and how the two work together to allow customers to focus on what matters: their mission.