AWS Specialist Insights Team uses Amazon QuickSight to provide operational insights across the AWS Worldwide Specialist Organization

Post Syndicated from David Adamson original https://aws.amazon.com/blogs/big-data/aws-specialist-insights-team-uses-amazon-quicksight-to-provide-operational-insights-across-the-aws-worldwide-specialist-organization/

The AWS Worldwide Specialist Organization (WWSO) is a team of go-to-market experts that supports strategic services, customer segments, and verticals at AWS. Working together, the Specialist Insights Team (SIT) and the Finance, Analytics, Science, and Technology team (FAST) support WWSO in acquiring, securing, and delivering information and business insights at scale by working with the broader AWS community (Sales, Business Units, Finance), enabling data-driven decisions to be made.

SIT is made up of analysts who bring deep knowledge of the business intelligence (BI) stack to support the WWSO business. Some analysts work across multiple areas, whereas others are deeply knowledgeable in their specific verticals, but all are technically proficient in BI tools and methodologies. The team’s ability to combine technical and operational knowledge, in partnership with domain experts within WWSO, helps us build a common, standard data platform that can be used throughout AWS.

Untapped potential in data availability

One of the ongoing challenges for the team was how to turn the 2.1 PB of data inside the data lake into actionable business intelligence that could drive actions and verifiable outcomes. The resources needed to translate the data, analyze it, and succinctly articulate what it showed had been a blocker to our ability to be agile and responsive to our customers.

After reviewing several vendor products and evaluating the pros and cons of each, we chose Amazon QuickSight to replace our existing legacy BI solution. It not only satisfied all of the criteria necessary to provide actionable insights across the WWSO business, but also allowed us to scale securely to tens of thousands of users at AWS.

In this post, we discuss what influenced the decision to implement QuickSight and detail some of the benefits our team has seen since implementation.

Legacy tool deprecation

The legacy BI solution presented a number of challenges, starting with scaling, complex governance, and siloed reporting. This resulted in poor performance, cumbersome development processes, multiple versions of truth, and high costs. Ultimately, the legacy BI solution had significant barriers to widespread adoption, including long time to insights, lack of trust, low innovation, and return-on-investment (ROI) justification.

After the decision was made to deprecate the previous BI tool our team had been using to provide reports and insights to the organization, the team began to make preparations for the impending switch. We met with analysts across the specialist organization to gather feedback on what they’d like to see in the next iteration of reporting capabilities. Based on that feedback, and with guidance from our leadership teams, the following criteria needed to be met in our next BI tool:

  • Accessible insights – To ensure users with varying levels of technical aptitude could understand the information, the insights format needed to be easy to understand.
  • Speed – With millions of records, processing speed needed to be lightning fast, and we also didn’t want to invest a lot of time in technical implementation or user training.
  • Cost – Being a frugal company, we needed to ensure that our BI solution would not only do what we needed it to do but that it wouldn’t blow up our budget.
  • Security – Built-in row-level security, combined with a custom solution developed internally, needed to be able to give access to thousands of users across AWS.

Among other considerations that ultimately influenced the decision to use QuickSight was that it’s a fully managed service, which meant no need to maintain a separate server or manage any upgrades. Because our team handles sensitive data, security was also top of mind. QuickSight passed that test as well; we were able to implement fine-grained security measures and saw no trade-off in performance.

A simple, speedy, single source of truth

With such a wide variety of teams needing access to the data and insights our team provides, our BI solution needed to be user-friendly and intuitive without the need for extensive training or convoluted instructions. With millions of records used to generate insights on sales pipelines, revenue, headcount, etc., queries could become quite complex. To meet our first top priority for accessible insights, we were looking for a combination of easy-to-operate and easy-to-understand visualizations.

Once our QuickSight implementation was complete, near-real-time, actionable insights with informative visuals were just a few clicks away. We were impressed by how simple it was to get at-a-glance insights that told data-driven stories about the key performance indicators that were most important to our stakeholder community. For business-critical metrics, we’re able to set up alerts that trigger emails to owners when certain thresholds are met, providing peace of mind that nothing important will slip through the cracks.

With the goal of migrating 400+ dashboards from the legacy BI solution over to QuickSight successfully, there were three critical components that we had to get right. Not only did we need to have the right technology, we also needed to set up the right processes while also keeping change management—from a people perspective—top of mind.

This migration project provided us with an opportunity to standardize our data, ensuring that we have a uniform source of truth that enables efficiency, as well as governed access and self-service across the company. In the spirit of working smarter (not harder), we kickstarted the migration in parallel with the data standardization project.

We started by establishing clear organizational goals for alignment and a solid plan from start to finish. Next, we focused on row-level security design and evolution to ensure that we could provide governance and security at scale. To ensure success, we first piloted migrating 35+ dashboards and 500+ users. We then established a core technical team whose focus was to become experts in QuickSight and migrate another 400+ dashboards, 4,000+ users, and 60,000+ impressions. The technical team also trained other members of the team to bring everyone along on the change management journey. We were able to complete the migration in 18 months across thousands of users.

With the base in place, we shifted focus to move from foundational business metrics to machine learning (ML) based insights and outcomes to help drive data-driven actions.

The following screenshot shows an example of what one of our QuickSight dashboards looks like, though the numbers do not reflect real values; this is test data.

With speed being next on our list of key criteria, we were delighted to learn more about how QuickSight works. SPICE, an acronym for Super-fast, Parallel, In-memory Calculation Engine, is the robust engine that QuickSight uses to rapidly perform advanced calculations and serve data. Query runtimes and dashboard development speed were both appreciably faster compared with other data visualization tools we had used, where we’d need to wait for processing every time we added a new calculation or a new field to a visual. Dashboard load times were also noticeably faster than those of our previous BI tool; most dashboards load in under 5 seconds, compared to several minutes previously.

Another benefit of having chosen QuickSight is that there has been a significant reduction in the number of disagreements over data definitions or questions about discrepancies between reports. With standardized SPICE datasets built in QuickSight, we can now offer data as a service to the organization, creating a single source of truth for our insights shared across the organization. This saved the organization hours of time investigating unanswered questions, enabling us to be more agile and responsive, which makes us better able to serve our customers.

Dashboards are just the beginning

We’re very happy with QuickSight’s performance and scalability, and we are very excited to improve and expand on the solid reporting foundation we’ve begun to build. Having driven adoption from 50 percent to 83 percent, as well as seeing a 215 percent growth in views and a 157 percent growth in users since migrating to QuickSight, it’s clear we made the right choice.

We were intrigued by a recent post by Amy Laresch and Kellie Burton, AWS Analytics sales team uses QuickSight Q to save hours creating monthly business reviews. Based on what we learned from that post, we also plan to test out and eventually implement Amazon QuickSight Q, an ML-powered natural language capability that gives anyone in the organization the ability to ask business questions in natural language and receive accurate answers with relevant visualizations. We’re also considering integrations with Amazon SageMaker and Amazon Honeycode-built apps for write-back.

To learn more, visit Amazon QuickSight.


About the Authors

David Adamson is the head of WWSO Insights. He is leading the team on the journey to a data-driven organization that delivers insightful, actionable data products to WWSO, shared in partnership with the broader AWS organization. In his spare time, he likes to travel the world and, weather permitting, can be found in his back garden exploring and photographing the night sky.

Yash Agrawal is an Analytics Lead at Amazon Web Services. Yash’s role is to define the analytics roadmap, develop standardized global dashboards, and deliver insightful analytics solutions for stakeholders across AWS.

Addis Crooks-Jones is a Sr. Manager of Business Intelligence Engineering at the Amazon Web Services Finance, Analytics, and Science Team (FAST). She is responsible for partnering with business leaders in the Worldwide Specialist Organization to build a culture of data to support strategic initiatives. The technology solutions developed are used globally to drive decision-making across AWS. When not thinking about new plans involving data, she likes to be on adventures big and small involving food, art, and all the fun people in her life.

Graham Gilvar is an Analytics Lead at Amazon Web Services. He builds and maintains centralized QuickSight dashboards, which enable stakeholders across all WWSO services to interlock and make data-driven decisions. In his free time, he enjoys walking his dog, golfing, bowling, and playing hockey.

Shilpa Koogegowda is a Sales Ops Analyst at Amazon Web Services and has been part of the WWSO Insights team for the last two years. Her role involves building standardized metrics, dashboards, and data products to provide data and insights to customers.

Enabling load-balancing of non-HTTP(s) traffic on AWS Wavelength

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/enabling-load-balancing-of-non-https-traffic-on-aws-wavelength/

This blog post is written by Jack Chen, Telco Solutions Architect, and Robert Belson, Developer Advocate.

AWS Wavelength embeds AWS compute and storage services within 5G networks, providing mobile edge computing infrastructure for developing, deploying, and scaling ultra-low-latency applications. AWS recently introduced support for Application Load Balancer (ALB) in AWS Wavelength zones. Although ALB addresses Layer-7 load balancing use cases, some low latency applications that get deployed in AWS Wavelength Zones rely on UDP-based protocols, such as QUIC, WebRTC, and SRT, which can’t be load-balanced by Layer-7 Load Balancers. In this post, we’ll review popular load-balancing patterns on AWS Wavelength, including a proposed architecture demonstrating how DNS-based load balancing can address customer requirements for load-balancing non-HTTP(s) traffic across multiple Amazon Elastic Compute Cloud (Amazon EC2) instances. This solution also builds a foundation for automatic scale-up and scale-down capabilities for workloads running in an AWS Wavelength Zone.

Load balancing use cases in AWS Wavelength

In the AWS Regions, customers looking to deploy highly available edge applications often consider Elastic Load Balancing (ELB) as an approach to automatically distribute incoming application traffic across multiple targets in one or more Availability Zones (AZs). However, at the time of this publication, the AWS-managed Network Load Balancer (NLB) isn’t supported in AWS Wavelength Zones, and ALB is still being rolled out to all AWS Wavelength Zones globally. As a result, this post documents general architectural guidance for load balancing solutions on AWS Wavelength.

As one of the most prominent AWS Wavelength use cases, highly immersive video streaming over UDP at scale, using protocols such as WebRTC, often requires a load balancing solution to accommodate surges in traffic, whether due to live events or general customer access patterns. These use cases rely on Layer-4 traffic and can’t be load-balanced by a Layer-7 ALB; instead, Layer-4 load balancing is needed.

To date, two infrastructure deployments involving Layer-4 load balancers are most often seen:

  • Amazon EC2-based deployments: Often the environment of choice for earlier-stage enterprises and ISVs, a fleet of EC2 instances will leverage a load balancer for high-throughput use cases, such as video streaming, data analytics, or Industrial IoT (IIoT) applications
  • Amazon EKS deployments: Customers looking to optimize performance and cost efficiency of their infrastructure can leverage containerized deployments at the edge to manage their AWS Wavelength Zone applications. In turn, external load balancers could be configured to point to exposed services via NodePort objects. Furthermore, a more popular choice might be to leverage the AWS Load Balancer Controller to provision an ALB when you create a Kubernetes Ingress.

Regardless of deployment type, the following design constraints must be considered:

  • Target registration: For load balancing solutions not managed by AWS, seamless solutions to load balancer target registration must be managed by the customer. As one potential solution, visit a recent HAProxyConf presentation, Practical Advice for Load Balancing at the Network Edge.
  • Edge Discovery: Although DNS records can be populated into Amazon Route 53 for each carrier-facing endpoint, DNS won’t deterministically route mobile clients to the most optimal mobile endpoint. When available, edge discovery services are required to most effectively route mobile clients to the lowest latency endpoint.
  • Cross-zone load balancing: Given the hub-and-spoke design of AWS Wavelength, customer-managed load balancers should proxy traffic only to that AWS Wavelength Zone.

Solution overview – Amazon EC2

In this post, we present a highly available load balancing solution in a single AWS Wavelength Zone for an Amazon EC2-based deployment. In a separate post, we’ll cover the needed configurations for the AWS Load Balancer Controller in AWS Wavelength for Amazon Elastic Kubernetes Service (Amazon EKS) clusters.

The proposed solution introduces DNS-based load balancing, a technique to abstract away the complexity of intelligent load-balancing software and allow your Domain Name System (DNS) resolvers to distribute traffic (equally, or in a weighted distribution) to your set of endpoints.

Our solution leverages the weighted routing policy in Route 53 to resolve inbound DNS queries to multiple EC2 instances running within an AWS Wavelength zone. As EC2 instances for a given workload get deployed in an AWS Wavelength zone, Carrier IP addresses can be assigned to the network interfaces at launch.

Through this solution, Carrier IP addresses attached to AWS Wavelength instances are automatically added as DNS records for the customer-provided public hosted zone.

To determine how Route 53 responds to queries, given an arbitrary number of records in a public hosted zone, Route 53 offers numerous routing policies:

Simple routing policy – In the event that you must route traffic to a single resource in an AWS Wavelength Zone, simple routing can be used. A single record can contain multiple IP addresses, but Route 53 returns the values in a random order to the client.

Weighted routing policy – To route traffic more deterministically using a set of proportions that you specify, this policy can be selected. For example, if you would like Carrier IP A to receive 50% of the traffic and Carrier IP B to receive 50% of the traffic, we’ll create two individual A records (one for each Carrier IP) with a weight of 50 and 50, respectively. Learn more about Route 53 routing policies by visiting the Route 53 Developer Guide.

The proposed solution leverages weighted routing policy in Route 53 DNS to route traffic to multiple EC2 instances running within an AWS Wavelength zone.
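As an illustration, the two 50/50 records from the example above could be constructed as follows. This is a sketch: the domain, set identifiers, and Carrier IP addresses are hypothetical placeholders.

```python
# Sketch: build the Route 53 change batch for weighted A records, one per
# Carrier IP. SetIdentifier distinguishes records that share the same name;
# Weight controls the proportion of DNS responses each record receives.
def weighted_a_record_change(record_name, set_id, carrier_ip, weight, ttl=30):
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": record_name,
            "Type": "A",
            "SetIdentifier": set_id,
            "Weight": weight,
            "TTL": ttl,
            "ResourceRecords": [{"Value": carrier_ip}],
        },
    }

# Two Carrier IPs, 50/50 split, as in the weighted routing example above.
changes = [
    weighted_a_record_change("www.example.com", "carrier-ip-a", "203.0.113.10", 50),
    weighted_a_record_change("www.example.com", "carrier-ip-b", "203.0.113.11", 50),
]

# In a real deployment, this batch would be passed to
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="<your hosted zone ID>",
#     ChangeBatch={"Changes": changes})
```

Because both weights are equal, Route 53 would return each Carrier IP for roughly half of the DNS queries.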

Reference architecture

The following diagram illustrates the load-balancing component of the solution, where EC2 instances in an AWS Wavelength zone are assigned Carrier IP addresses. A weighted DNS record for a host (e.g., www.example.com) is updated with Carrier IP addresses.

DNS-based load balancing

When a device makes a DNS query, it receives one of the Carrier IP addresses associated with the given domain name. With a large number of devices, we expect a fair distribution of load across all EC2 instances in the resource pool. Given the highly ephemeral nature of mobile edge environments, it’s likely that Carrier IPs could frequently be allocated to accommodate a workload and released shortly thereafter. However, this unpredictable behavior could yield stale DNS records, resulting in a “blackhole” – routes to endpoints that no longer exist.

Time-To-Live (TTL) is a DNS attribute that specifies the amount of time, in seconds, that you want DNS recursive resolvers to cache information about this record.

In our example, we should set the TTL to 30 seconds to force DNS resolvers to retrieve the latest records from the authoritative nameservers and minimize stale DNS responses. However, a lower TTL has a direct impact on cost, as a result of the increased number of calls from recursive resolvers to Route 53 to constantly retrieve the latest records.
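The trade-off can be sketched with rough arithmetic, assuming (pessimistically) that every recursive resolver re-queries the authoritative nameservers as soon as the TTL expires; the resolver count is a hypothetical example.

```python
# Sketch: worst-case authoritative query volume as a function of TTL.
# Each recursive resolver re-fetches the record once per TTL window.
def worst_case_queries(resolvers, period_seconds, ttl_seconds):
    return resolvers * (period_seconds // ttl_seconds)

DAY = 24 * 60 * 60  # 86,400 seconds

# 1,000 resolvers over one day: a 30s TTL generates 10x the queries
# (and therefore roughly 10x the Route 53 query cost) of a 300s TTL.
q_ttl_30 = worst_case_queries(1_000, DAY, 30)    # 2,880,000 queries
q_ttl_300 = worst_case_queries(1_000, DAY, 300)  #   288,000 queries
```

Lowering the TTL buys faster convergence after a Carrier IP is released, at the price of this proportionally higher query volume.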

The core components of the solution are as follows:

Alongside the services above in the AWS Wavelength Zone, the following services are also leveraged in the AWS Region:

  • AWS Lambda – a serverless, event-driven function that makes API calls to the Route 53 service to update DNS records.
  • Amazon EventBridge – a serverless event bus that reacts to EC2 instance lifecycle events and invokes the Lambda function to make DNS updates.
  • Route 53 – a cloud DNS service with a domain record pointing to AWS Wavelength-hosted resources.

In this post, we intentionally leave the specific load balancing software solution up to the customer. Customers can leverage various popular load balancers available on AWS Marketplace, such as HAProxy and NGINX. To focus our solution on the auto-registration of DNS records to create functional load balancing, this solution is designed to support stateless workloads only. To support stateful workloads, sticky sessions – a process in which the load balancer routes requests to the same target in a target group – must be configured by the underlying load balancer solution and are outside the scope of what DNS can provide natively.

Automation overview

Using the aforementioned components, we can implement the following workflow automation:

Event-driven Auto Scaling Workflow

An Amazon CloudWatch alarm can trigger an Auto Scaling group scale-out or scale-in event, adding or removing EC2 instances. EventBridge detects the EC2 instance state change event and invokes the Lambda function. The function then updates the DNS record in Route 53 by either adding (scale out) or deleting (scale in) a weighted A record associated with the EC2 instance changing state.
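The core of that Lambda function can be sketched as follows. This is a hypothetical illustration: only the event-to-action mapping is shown as working code, while the Route 53 and EC2 calls are indicated in comments.

```python
# Sketch: map an EC2 instance state-change event (as delivered by
# EventBridge) to the Route 53 action the Lambda function should take.
def dns_action_for_event(event):
    # EventBridge EC2 state-change events carry "state" and "instance-id"
    # in the "detail" field.
    state = event["detail"]["state"]
    instance_id = event["detail"]["instance-id"]
    if state == "running":        # scale out: register the new instance
        action = "UPSERT"
    elif state == "terminated":   # scale in: remove the stale record
        action = "DELETE"
    else:                         # ignore transitional states
        return None
    return {"Action": action, "InstanceId": instance_id}

# A real handler would then look up the instance's Carrier IP with
# ec2.describe_instances(InstanceIds=[instance_id]) and apply the change
# with route53.change_resource_record_sets(...) as a weighted A record.

scale_out_event = {
    "detail": {"state": "running", "instance-id": "i-0123456789abcdef0"}
}
```

This keeps the DNS records in lockstep with the Auto Scaling group: records exist only for instances that are actually running.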

Configuration of the automatic scaling policy is out of scope for this post. There are many scaling triggers that you can consider using, based on predefined and custom metrics such as memory utilization. For demo purposes, we will be leveraging manual scaling.

In addition to the core components already described, our solution also utilizes AWS Identity and Access Management (IAM) policies and CloudWatch. Both services are key components for building Well-Architected solutions on AWS. We also use AWS Systems Manager Parameter Store to keep track of user input parameters. The deployment of the solution is automated via AWS CloudFormation templates. The Lambda function provided should be uploaded to an Amazon Simple Storage Service (Amazon S3) bucket.

Amazon Virtual Private Cloud (Amazon VPC), subnets, Carrier Gateway, and Route Tables are foundational building blocks for AWS-based networking infrastructure. In our deployment, we are creating a new VPC, one subnet in an AWS Wavelength zone of your choice, a Carrier Gateway, and updating the route table for this subnet to point the default route to the Carrier Gateway.

Wavelength VPC architecture.

Deployment prerequisites

The following are prerequisites to deploy the described solution in your account:

  • Access to an AWS Wavelength Zone. If your account is not allow-listed to use AWS Wavelength Zones, then opt in to AWS Wavelength Zones here.
  • Public DNS Hosted Zone hosted in Route 53. You must have access to a registered public domain to deploy this solution. The zone for this domain should be hosted in the same account where you plan to deploy AWS Wavelength workloads.
    If you don’t have a public domain, then you can register a new one. Note that there will be a service charge for the domain registration.
  • Amazon S3 bucket. For the Lambda function that updates DNS records in Route 53, store the source code as a .zip file in an Amazon S3 bucket.
  • Amazon EC2 key pair. You can use an existing key pair for the deployment. If you don’t have a key pair in the Region where you plan to deploy this solution, then create one by following these instructions.
  • 4G or 5G-connected device. Although the infrastructure can be deployed independent of the underlying connected devices, testing the connectivity will require a mobile device on one of the Wavelength partner’s networks. View the complete list of Telecommunications providers and Wavelength Zone locations to learn more.

Conclusion

In this post, we demonstrated how to implement DNS-based load balancing for workloads running in an AWS Wavelength Zone. We deployed a solution that uses an EventBridge rule and a Lambda function to update DNS records hosted in Route 53. If you want to learn more about AWS Wavelength, subscribe to the AWS Compute Blog channel here.

Amazon Sunsets Cloud Drive

Post Syndicated from Stephanie Doyle original https://www.backblaze.com/blog/amazon-sunsets-cloud-drive/

Another one bites the dust. Amazon announced they’re putting Amazon Cloud Drive in the rearview to focus on Amazon Photos in a phased deprecation through December 2023. Today, we’ll dig into what this means for folks with data on Amazon Cloud Drive, especially those with files other than photos and videos.

Dear Amazon Drive User

When Amazon dropped the news, they explained the phased approach they would take to deprecating Amazon Drive. They’re not totally eliminating Drive—yet. Here’s what they’ve done so far, and what they plan to do moving forward:

  • October 31, 2022: Amazon removed the Drive app from iOS and Android app stores. The app doesn’t get bug fixes and security updates anymore.
  • January 31, 2023: Uploading to the Amazon Drive website will be cut off. You will have read-only access to your files.
  • December 31, 2023: Amazon Drive will no longer be supported and access to files will be cut off. Every file stored on Amazon Drive, except photo or video files, needs a new home. Users can access photo and video files on Amazon Photos.

Now, users face two options for what to do with files stored on Amazon Drive:

  1. Follow instructions to download Amazon Photos for iOS and Android devices. And, use the Amazon Drive website to download and store all other files locally or with another service.
  2. Transfer your entire library of photos, videos, and other data to another service.

Looking for an Amazon Cloud Drive Alternative?

Shameless plug: If you used Amazon Cloud Drive to store anything other than photos and you need a new place to keep your data, give Backblaze B2 Cloud Storage a try. The first 10GB are free, and our storage is priced at a flat rate of $5/TB/month ($0.005/GB/month) after that. And if you’re a business customer, we also offer the choice of capacity-based pricing with Backblaze B2 Reserve.

A Quick History of Amazon Cloud Drive

In 2014, Amazon offered free, unlimited photo storage on Amazon Cloud Drive as a loyalty perk for Prime members. The following year, they rolled out a subscription-based offering to store other types of files in addition to photos—video, documents, etc.—on Cloud Drive.

Then, in 2017, they capped the free tier at 5GB. This was just one instance in a string of cloud storage providers ending free offerings and forcing users to pay or move.

All Amazon account holders—regardless of whether they paid for Prime or not—got 5GB for photos and other file types free of charge. If you wanted or needed more storage than that, you had to sign up for the subscription-based offering starting at $11.99 per year for 100GB of storage, and prices went up from there.

You might consider this the beginning of the end for Amazon Cloud Drive.

Why Say Goodbye?

When tech companies deprecate a feature—as Amazon has done with Drive—it’s for any number of reasons:

  1. To combine one feature with another.
  2. To rectify naming inconsistencies.
  3. When a newer version makes supporting the older one impossible or impractical.
  4. To avoid flaws in a necessary feature.
  5. When a better alternative replaced the feature.
  6. To simplify the system as a whole.

Amazon’s reason for deprecating Drive? To provide a dedicated solution for photos and videos. The company stated, “We are taking the opportunity to more fully focus our efforts on Amazon Photos to provide customers a dedicated solution for photos and video storage.” Unfortunately, that leaves folks who store anything else high and dry.

Where Do We Go From Here?

The bottom line: Amazon Drive customers must park emails, documents, spreadsheets, PDFs, and text files somewhere else. If you’re an Amazon Drive customer looking to move your files out before you lose access, we invite you to try Backblaze B2. The first 10GB is on us.

How to Get Started with Backblaze B2

  1. If you’re not a customer, first sign up for B2 Cloud Storage.
  2. If you’re already a customer, enable B2 Cloud Storage in your “My Settings” tab. You can follow our Quick Start Guide for more detailed instructions.
  3. Download your data from Amazon Drive.
  4. Upload your data to Backblaze B2. Many customers choose to do so directly through the web interface, while others prefer to use integrated transfer solutions like Cyberduck, which is free and open-source, or Panic’s Transmit for Macs.
  5. Sit back and relax knowing your data is safely stored in the Backblaze B2 Storage Cloud.

The post Amazon Sunsets Cloud Drive appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Impact of Infrastructure failures on shard in Amazon OpenSearch Service

Post Syndicated from Bukhtawar Khan original https://aws.amazon.com/blogs/big-data/impact-of-infrastructure-failures-on-shard-in-amazon-opensearch-service/

Amazon OpenSearch Service is a managed service that makes it easy to secure, deploy, and operate OpenSearch and legacy Elasticsearch clusters at scale in the AWS Cloud. Amazon OpenSearch Service provisions all the resources for your cluster, launches it, and automatically detects and replaces failed nodes, reducing the overhead of self-managed infrastructure. The service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more by offering the latest versions of OpenSearch, support for 19 versions of Elasticsearch (versions 1.5 to 7.10), and visualization capabilities powered by OpenSearch Dashboards and Kibana (versions 1.5 to 7.10).

In the latest service software release, we’ve updated the shard allocation logic to be load-aware so that when shards are redistributed in case of any node failures, the service disallows surviving nodes from getting overloaded by shards previously hosted on the failed node. This is especially important for Multi-AZ domains to provide consistent and predictable cluster performance.

If you would like more background on shard allocation logic in general, please see Demystifying Elasticsearch shard allocation.

The challenge

An Amazon OpenSearch Service domain is said to be “balanced” when the number of nodes is equally distributed across configured Availability Zones, and the total number of shards is distributed equally across all the available nodes without a concentration of shards of any one index on any one node. OpenSearch also has a property called “Zone Awareness” that, when enabled, ensures that a primary shard and its corresponding replicas are allocated in different Availability Zones. If you have more than one copy of data, having multiple Availability Zones provides better fault tolerance and availability. In the event the domain is scaled out or in, or when nodes fail, OpenSearch automatically redistributes the shards between the available nodes while obeying the allocation rules based on zone awareness.

While the shard-balancing process ensures that shards are evenly distributed across Availability Zones, in some cases, if there is an unexpected failure in a single zone, shards will get reallocated to the surviving nodes. This might result in the surviving nodes getting overwhelmed, impacting cluster stability.

For instance, if one node in a three-node cluster goes down, OpenSearch redistributes the unassigned shards, as shown in the following diagram. Here “P“ represents a primary shard copy, whereas “R“ represents a replica shard copy.

This behavior of the domain can be explained in two parts – during failure and during recovery.

During failure

A domain deployed across multiple Availability Zones can encounter multiple types of failures during its lifecycle.

Complete zone failure

A cluster may lose a single Availability Zone, and with it all of the nodes in that zone, for a variety of reasons. Today, the service tries to place the lost nodes in the remaining healthy Availability Zones. The service also tries to re-create the lost shards on the remaining nodes while still following the allocation rules. This can result in some unintended consequences.

  • When the shards of the impacted zone are reallocated to healthy zones, they trigger shard recoveries that can increase latencies, because recovery consumes additional CPU cycles and network bandwidth.
  • For an n-AZ, n-copy setup (n>1), the nth shard copy is allocated across the surviving n-1 Availability Zones. This skews the shard distribution, which can result in unbalanced traffic across nodes; these nodes can get overloaded, leading to further failures.
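The resulting skew can be illustrated with simple arithmetic. The numbers below are hypothetical, and the sketch assumes zone awareness keeps the total copy count constant, with one node per zone.

```python
# Sketch: after a full zone failure, the lost zone's shards are re-created
# across the surviving zones, so each survivor's shard count grows.
def shards_per_surviving_zone(num_zones, shards_per_zone):
    total_shards = num_zones * shards_per_zone
    return total_shards / (num_zones - 1)

# 3 zones with 10 shards each: after one zone fails, each surviving zone
# hosts 15 shards, a 50% increase in load per zone.
load_after_failure = shards_per_surviving_zone(3, 10)
```

The fewer the zones, the sharper the jump: with only 2 zones, the survivor absorbs 100% more load, which is exactly the overload scenario described above.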

Partial zone failure

In the event of a partial zone failure or when the domain loses only some of the nodes in an Availability Zone, Amazon OpenSearch Service tries to replace the failed nodes as quickly as possible. However, in case it takes too long to replace the nodes, OpenSearch tries to allocate the unassigned shards of that zone into the surviving nodes in the Availability Zone. If the service cannot replace the nodes in the impacted Availability Zone, it may allocate them in the other configured Availability Zone, which may further skew the distribution of shards both across and within the zone. This again has unintended consequences.

  • If the nodes on the domain do not have enough storage space to accommodate the additional shards, the domain can become write-blocked, impacting indexing operations.
  • Due to the skewed distribution of shards, the domain may also experience skewed traffic across the nodes, which can further increase the latencies or timeouts for read and write operations.

Recovery

Today, in order to maintain the desired node count of the domain, Amazon OpenSearch Service launches data nodes in the remaining healthy Availability Zones, as in the failure scenarios described above. Restoring the proper node distribution across all the Availability Zones after such an incident requires manual intervention by AWS.

What’s changing

To improve overall failure handling and minimize the impact of failures on domain health and performance, Amazon OpenSearch Service is making the following changes:

  • Forced Zone Awareness: OpenSearch has a preexisting shard balancing configuration called forced awareness that is used to set the Availability Zones to which shards need to be allocated. For example, if you have an awareness attribute called zone and configure nodes in zone1 and zone2, you can use forced awareness to prevent OpenSearch from allocating replicas if only one zone is available:
cluster.routing.allocation.awareness.attributes: zone
cluster.routing.allocation.awareness.force.zone.values: zone1,zone2

With this example configuration, if you start two nodes with node.attr.zone set to zone1 and create an index with five shards and one replica, OpenSearch creates the index and allocates the five primary shards but no replicas. Replicas are only allocated once nodes with node.attr.zone set to zone2 are available.
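The forced-awareness rule in this example can be sketched as a small decision function. This is our own illustration of the behavior described above, not OpenSearch's actual allocator:

```python
# Illustration of forced awareness: a replica can only be allocated if a
# node exists in a forced-awareness zone other than the primary's zone.
# (A sketch of the rule described above, not OpenSearch's real allocator.)
def can_allocate_replica(node_zones, forced_zones, primary_zone):
    usable = (set(node_zones) & set(forced_zones)) - {primary_zone}
    return len(usable) > 0

# Two nodes, both in zone1: primaries allocate, but replicas stay unassigned.
print(can_allocate_replica(["zone1", "zone1"], ["zone1", "zone2"], "zone1"))  # False

# Once a node with node.attr.zone set to zone2 joins, replicas can be placed.
print(can_allocate_replica(["zone1", "zone1", "zone2"], ["zone1", "zone2"], "zone1"))  # True
```

The key property is that losing every node in zone2 leaves replicas unassigned rather than piling them onto zone1.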

Amazon OpenSearch Service will use the forced awareness configuration on Multi-AZ domains to ensure that shards are only allocated according to the rules of zone awareness. This would prevent the sudden increase in load on the nodes of the healthy Availability Zones.

  • Load-Aware Shard Allocation: Amazon OpenSearch Service will take factors like provisioned capacity, actual capacity, and total shard copies into account to compute an expected average number of shards per node, and will prevent shard assignment to any node whose allocated shard count already exceeds this limit.

Note that any unassigned primary copy is still allowed on an overloaded node, to protect the cluster from imminent data loss.
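The limit check might look something like the following sketch. The exact formula is internal to the service; the expected average here (total shard copies divided by node count, rounded up) is our assumption for illustration:

```python
import math

def allocation_allowed(shards_on_node, total_shard_copies, node_count, is_primary):
    # Expected average shards per node; our assumed stand-in for the
    # service's internal calculation.
    limit = math.ceil(total_shard_copies / node_count)
    if is_primary:
        # Unassigned primaries are always admitted to avoid data loss.
        return True
    return shards_on_node < limit

# 36 shard copies across 6 nodes -> expected average of 6 shards per node.
print(allocation_allowed(6, 36, 6, is_primary=False))  # False: node is at the limit
print(allocation_allowed(6, 36, 6, is_primary=True))   # True: primary always admitted
```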

Similarly, to address the manual recovery issue described in the Recovery section above, Amazon OpenSearch Service is also making changes to its internal scaling component. With these changes in place, Amazon OpenSearch Service will no longer launch nodes in the remaining Availability Zones when it goes through the failure scenarios described previously.

Visualizing the current and new behavior

For example, consider an Amazon OpenSearch Service domain configured with 3-AZ, 6 data nodes, 12 primary shards, and 24 replica shards. The domain is provisioned across AZ-1, AZ-2, and AZ-3, with two nodes in each zone.

Current shard allocation:
Total number of shards: 12 Primary + 24 Replica = 36 shards
Number of Availability Zones: 3
Number of shards per zone (zone awareness is true): 36/3 = 12
Number of nodes per Availability Zone: 2
Number of shards per node: 12/2 = 6
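The arithmetic above can be wrapped in a small helper, assuming zone awareness is enabled and shards divide evenly, as they do in this example:

```python
def shards_per_node(primaries, replicas, zones, nodes_per_zone):
    total = primaries + replicas          # 12 + 24 = 36 shard copies
    per_zone = total // zones             # 36 / 3 = 12 shards per zone
    return per_zone // nodes_per_zone     # 12 / 2 = 6 shards per node

print(shards_per_node(primaries=12, replicas=24, zones=3, nodes_per_zone=2))  # 6
```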

The following diagram provides a visual representation of the domain setup. The circles denote the count of shards allocated to the node. Amazon OpenSearch Service will allocate six shards per node.

During a partial zone failure, where one node in AZ-3 fails, today the replacement node may be launched in a remaining zone and the shards of the impacted zone are redistributed across the available nodes. After the changes described above, the cluster will not create a replacement node in a different zone or redistribute shards after the failure of the node.


In the diagram above, with the loss of one node in AZ-3, Amazon OpenSearch Service would try to launch the replacement capacity in the same zone. However, during an outage, the zone might be impaired and fail to launch the replacement. In such an event, the service today tries to launch the deficit capacity in another healthy zone, which can lead to imbalance across Availability Zones, and shards from the impacted zone get packed onto the surviving node in that zone. With the new behavior, the service still attempts to launch capacity in the same zone but avoids launching deficit capacity in other zones, preventing imbalance. The shard allocator also ensures that the surviving nodes don't get overloaded.


Similarly, if all the nodes in AZ-3 are lost, or AZ-3 becomes impaired, Amazon OpenSearch Service today brings up the lost nodes in the remaining Availability Zones and redistributes the shards onto those nodes. After the new changes, Amazon OpenSearch Service will neither allocate nodes to the remaining zones nor try to re-allocate the lost shards to them. Instead, it will wait for recovery to happen and for the domain to return to its original configuration.

If your domain is not provisioned with enough capacity to withstand the loss of an Availability Zone, you may experience a drop in throughput for your domain. It is therefore strongly recommended that you follow sizing best practices for your domain, which means provisioning enough resources to withstand the loss of a single Availability Zone.


Currently, once the domain recovers, the service requires manual intervention to balance capacity across Availability Zones, which also involves shard movements. With the new behavior, no intervention is needed during the recovery process: capacity returns in the impacted zone, and the shards are automatically allocated to the recovered nodes. This ensures that there are no competing priorities on the remaining resources.

What you can expect

After you update your Amazon OpenSearch Service domain to the latest service software release, domains that have been configured according to best practices will see more predictable performance, even after losing one or more data nodes in an Availability Zone, and fewer cases of shard over-allocation on a node. As a good practice, provision sufficient capacity to tolerate a single zone failure.

You might at times see a domain turn yellow during such unexpected failures, because we won't assign replica shards to overloaded nodes. However, this does not mean that there will be data loss in a well-configured domain; we will still make sure that all primaries are assigned during the outage. Automated recovery is in place to rebalance the nodes in the domain and ensure that the replicas are assigned once the failure is resolved.

Update the service software of your Amazon OpenSearch Service domain to get these new changes applied to your domain. More details on the service software update process are in the Amazon OpenSearch Service documentation.

Conclusion

In this post, we saw how Amazon OpenSearch Service recently improved the logic to distribute nodes and shards across Availability Zones during zonal outages.

This change will help the service ensure more consistent and predictable performance during node or zonal failures. Domains should no longer see the increased latencies or write blocks while processing reads and writes that could previously surface due to over-allocation of shards on nodes.


About the authors

Bukhtawar Khan is a Senior Software Engineer working on Amazon OpenSearch Service. He is interested in distributed and autonomous systems. He is an active contributor to OpenSearch.

Anshu Agarwal is a Senior Software Engineer working on AWS OpenSearch at Amazon Web Services. She is passionate about solving problems related to building scalable and highly reliable systems.

Shourya Dutta Biswas is a Software Engineer working on AWS OpenSearch at Amazon Web Services. He is passionate about building highly resilient distributed systems.

Rishab Nahata is a Software Engineer working on OpenSearch at Amazon Web Services. He is fascinated by solving problems in distributed systems. He is an active contributor to OpenSearch.

Ranjith Ramachandra is an Engineering Manager working on Amazon OpenSearch Service at Amazon Web Services.

Jon Handler is a Senior Principal Solutions Architect, specializing in AWS search technologies: Amazon CloudSearch and Amazon OpenSearch Service. Based in Palo Alto, he helps a broad range of customers get their search and log analytics workloads deployed right and functioning well.

Cloud Security and Compliance Best Practices: Highlights From The CSA Cloud Controls Matrix

Post Syndicated from Rapid7 original https://blog.rapid7.com/2022/12/22/cloud-security-and-compliance-best-practices-highlights-from-the-csa-cloud-controls-matrix/

In a recent blog post, we highlighted the release of an InsightCloudSec compliance pack that helps organizations establish and adhere to AWS Foundational Security Best Practices. While that's a great pack for those who have standardized on AWS and are looking for a trusted set of controls to harden their environment, we know that's not always the case.

In fact, depending on what report you read, the percentage of organizations that have adopted multiple cloud platforms has soared and continues to rise exponentially. According to Gartner, by 2026 more than 90% of enterprises will extend their capabilities to multi-cloud environments, up from 76% in 2020.

It can be a time- and labor-intensive process to establish and enforce compliance standards across a single cloud environment, and it becomes especially challenging in multi-cloud environments. First, the number of required checks and guardrails is multiplied; second, because each platform is unique, proper hygiene and security measures aren't consistent across the various clouds. The general approaches and philosophies are fairly similar, but the way controls are implemented and the way policies are written can differ significantly.

For this post, we’ll dive into one of the most commonly-used cloud security standards for large, multi-cloud environments: the CSA Cloud Controls Matrix (CCM).

What is the CSA Cloud Controls Matrix?

In the unlikely event you’re unfamiliar, Cloud Security Alliance (CSA) is a non-profit organization dedicated to defining and raising awareness of best practices to help ensure a secure cloud computing environment. CSA brings together a community of cloud security experts, industry practitioners, associations, governments, and its corporate and individual members to offer cloud security-specific research, education, certification, events and products.

The Cloud Controls Matrix is a comprehensive cybersecurity control framework for cloud computing developed and maintained by CSA. It is widely-used as a systematic assessment of a cloud implementation and provides guidance on which security controls should be implemented within the cloud supply chain. The controls framework is aligned to the CSA Security Guidance for Cloud Computing and is considered a de-facto standard for cloud security assurance and compliance.

Five CSA CCM Principles and Why They’re Important

The CCM consists of many controls and best practices, too many to cover in a single blog post. That said, we've outlined five major principles that logically group the various controls, and why they're important to implement in your cloud environment. Of course, the CCM provides a comprehensive set of specific and actionable directions that, when adopted, simplify the process of adhering to these principles and many others.

Ensure consistent and proper management of audit logs
Audit logs record the occurrence of an event along with supporting metadata about the event, including the time at which it occurred, the responsible user or service, and the impacted entity or entities. By reviewing audit logs, security teams can investigate breaches and ensure compliance with regulatory requirements. Within CCM, there are a variety of controls focused on ensuring that you’ve got a process in place to collect, retain and analyze logs as well as limiting access and the ability to edit or delete such logs to only those who need it.

Ensure consistent data encryption and proper key management
Ensuring that data is properly encrypted, both at rest and in transit, is a critical step to protect your organization and customer data from unauthorized access. There are a variety of controls within the CCM that are centered around ensuring that data encryption is used consistently and that encryption keys are maintained properly—including regular rotation of keys as applicable.

Effectively manage IAM permissions and abide by Least Privilege Access (LPA)
In modern cloud environments, every user and resource is assigned a unique identity and a set of access permissions and privileges. This can be a challenge to keep track of, especially at scale, which can result in improper access, either from internal users or external malicious actors. To combat this, the CCM provides guidance around establishing processes and mechanisms to manage, track and enforce permissions across the organization. Further, the framework suggests employing the Least Privilege Access (LPA) principle to ensure users only have access to the systems and data that they absolutely need.

Establish and follow a process for managing vulnerabilities
There are a number of controls focused on establishing, implementing and evaluating processes, procedures and technical measures for detecting and remediating vulnerabilities. The CCM has dedicated controls for application vulnerabilities, external library vulnerabilities and host-level vulnerabilities. It is important to regularly scan your cloud environments for known vulnerabilities, and evaluate the processes and methodologies you use to do so, as well.

Define a process to proactively roll back changes to a previous state of good
In traditional, on-premises environments, patching and fixing existing resources is the proper course of action when an error or security concern is discovered. Conversely, when things go awry in cloud environments, remediation steps typically involve reverting to a previous state of good. To this end, the CCM guides organizations to proactively establish and implement a process that allows them to easily roll back changes to a previously known good state, whether manually or via automation.

How InsightCloudSec Helps Implement and Enforce CCM

InsightCloudSec allows security teams to establish and continuously measure compliance against organizational policies, whether they’re based on common industry frameworks or customized to specific business needs. This is accomplished through the use of compliance packs.

A compliance pack within InsightCloudSec is a set of checks that can be used to continuously assess your cloud environments for compliance with a given regulatory framework or industry best practices. The platform comes out-of-the-box with 30+ compliance packs, and also offers the ability to build custom compliance packs that are completely tailored to your business’ specific needs.

Whenever a non-compliant resource is created, or when a change is made to an existing resource’s configuration or permissions, InsightCloudSec will detect it within minutes. If you so choose, you can make use of the platform’s native, no-code automation to remediate the issue—either via deletion or by adjusting the configuration and/or permissions—without any human intervention.

If you’re interested in learning more about how InsightCloudSec can help implement and enforce security and compliance standards across your organization, be sure to check out a free demo!

James Alaniz and Ryan Blanchard contributed to this article.

Ryabitsev: Sending a kernel patch with b4 (part 1)

Post Syndicated from original https://lwn.net/Articles/918381/

Konstantin Ryabitsev has put up a
blog entry
showing how to use b4 to submit kernel patches
without (directly) using email.

While b4 started out as a way for maintainers to retrieve patches
from mailing lists, it also has contributor-oriented
features. Starting with version 0.10, b4 can:

  • create and manage patch series and cover letters
  • track and auto-reroll series revisions
  • display range-diffs between revisions
  • apply trailers received from reviewers and maintainers
  • submit patches without needing a valid SMTP gateway

Second Prototype Advances ALP (openSUSE News)

Post Syndicated from original https://lwn.net/Articles/918380/

The openSUSE News site covers
some highlights
from the second prototype
release
of the upcoming SUSE “ALP” distribution.

The mountainous prototype has the big addition of Full Disk
Encryption. ALP extended this Full Disk Encryption to bare metal
servers and the use of a Trusted Platform Module will open the
doors to leverage unattended booting while keeping systems
encrypted and secured. ALP is intended to run on both private and
public clouds that require encryption features.

Cloudflare Radar 2022 Year in Review

Post Syndicated from David Belson original https://blog.cloudflare.com/radar-2022-year-in-review/


In 2022, with nearly five billion people around the world (as well as an untold number of “bots”) using the Internet, analyzing aggregate data about this usage can uncover some very interesting trends. To that end, we’re excited to present the Cloudflare Radar 2022 Year In Review, featuring interactive charts, graphs, and maps you can use to explore notable Internet trends observed throughout this past year. The Year In Review website is part of Cloudflare Radar, which celebrated its second birthday in September with the launch of Radar 2.0.

We have organized the trends we observed around three different topic areas: Traffic, Adoption, and Security. The content covered within each of these areas is described in more detail in their respective sections below. Building on the 2021 Year In Review, we have incorporated several additional metrics this year, and have also improved the underlying methodology. (As such, the charts are not directly comparable to develop insights into year-over-year changes.)

Website visualizations shown at a weekly granularity cover the period from January 2 through November 26, 2022 (the start of the first full week of the year through the end of the last full week of November). We plan to update the underlying data sets through the end of the year in early 2023. Trends for nearly 200 locations are available on the website, with some smaller or less populated locations excluded due to insufficient data.

Before we jump in, we urge anyone who prefers to see the headline stats up front and to explore the data themselves to go ahead and visit the website. Anyone who wants a more lengthy, but curated set of observations should continue reading below. Regardless, we encourage you to consider how the trends presented within this post and the website’s various sections impact your business or organization, and to think about how these insights can inform actions that you can take to improve user experience or enhance your security posture.

Traffic


Anyone following recent technology headlines might assume that the Internet’s decades-long trend of incredible growth would have finally begun to falter. In times like these, data is key. Our data indicates that global Internet traffic, which grew at 23% this year, is as robust as ever.

To determine the traffic trends over time, we first established a baseline, calculated as the average daily traffic volume (excluding bot traffic) over the second full calendar week (January 9-15) of 2022. We chose the second calendar week to allow time for people to get back into their “normal” routines (school, work, etc.) after the winter holidays and New Year’s Day. The percent change shown on the trend lines in our charts are calculated relative to the baseline value, and represents a seven-day trailing average — it does not represent absolute traffic volume for a location. The seven-day averaging is done to smooth the sharp changes seen with a daily granularity.
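The methodology above can be sketched in a few lines. This is our own illustration of the described calculation, not Cloudflare's code, and the sample numbers are made up:

```python
def pct_change_vs_baseline(daily_traffic, baseline_week):
    # Baseline: average daily traffic over the chosen baseline week.
    baseline = sum(baseline_week) / len(baseline_week)
    changes = [(day / baseline - 1.0) * 100 for day in daily_traffic]
    # Seven-day trailing average smooths out sharp daily swings.
    smoothed = []
    for i in range(len(changes)):
        window = changes[max(0, i - 6): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

baseline_week = [100, 102, 98, 101, 99, 100, 100]   # averages to 100
traffic = [100, 105, 110, 115, 120, 125, 130]       # a week of steady growth
print(round(pct_change_vs_baseline(traffic, baseline_week)[-1], 1))  # 15.0
```

Note that the output is relative growth against the baseline week, not absolute traffic volume, which matches how the trend lines in the charts should be read.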

In addition to calculating traffic growth, our 1.1.1.1 public DNS resolver and broad global customer base enables us to have a unique view into online activity. This includes insights into the most popular types of Internet content and the most popular Internet services in general and across specific categories, as well as the impact of bots. Of course, none of this matters if connectivity is unavailable, so we also drill down into major Internet disruptions observed in 2022.

After an initial dip, worldwide Internet traffic saw nominal growth coinciding with the 2022 Olympic Winter Games in Beijing, but slipped again in the weeks after their conclusion. After a couple of months of slight growth, traffic again dipped below baseline heading into July. However, after reaching that nadir, Internet traffic experienced a fairly consistent rate of growth through the back part of the year. An upwards inflection at the end of November is visible in the worldwide traffic graph as well as the traffic graphs of a number of locations. Traffic analysis showed that this increase resulted from the convergence of early holiday shopping traffic (to e-commerce sites) with the run-up to and early days of FIFA World Cup Qatar 2022.


The “An Update on Cloudflare’s assistance to Ukraine” blog post published during Impact Week looked at the conflict from an attack perspective. Viewing Ukraine through an Internet traffic lens provides unique insights into the impacts of the war’s damage and destruction to Internet connectivity within the country. After starting the year with some nominal traffic growth, that trend was quickly reversed once the Russian invasion began on February 24, with traffic quickly falling as infrastructure was damaged and the populace focused on finding safety and shelter. Although traffic started to grow again after that initial steep decline, drops in May and June appear to be correlated with significant outages observed by Cloudflare. After returning to growth during August, several additional disruptions were visible in September, October, and November coincident with widespread power outages across the country resulting from Russian attacks.


Reliable electric power is critical for reliable Internet connectivity: for the core network infrastructure in data centers, for last-mile infrastructure like cell towers and Wi-Fi routers, and for the laptops, cellphones, and other devices used to access the Internet. For several years, the residents of Puerto Rico have struggled with an unreliable electric grid, resulting in frequent power outages and slow restoration times. In 2022, the island suffered two multi-day power outages that clearly impacted otherwise strong traffic growth. In April, a fire at a power plant caused an outage that lasted three days, disrupting Internet connectivity during that period. In September, widespread power outages resulting from Hurricane Fiona damage caused a rapid drop in Internet traffic, with the disruption lasting over a week until power restoration and infrastructure repair were completed.


Top categories

Cloudflare’s global customer base spans a range of industry categories, including technology, e-commerce, and entertainment, among others. Analysis of the traffic to our customers’ websites and applications reveals which categories of content were most popular throughout the year, and can be broken out by user location. The domains associated with each customer zone have one or more associated categories — these can be viewed on Cloudflare Radar. To calculate the distribution of traffic across the set of categories for each location, we divided the number of requests for domains associated with a given category seen over the course of a week by the total number of requests mapped to a category seen over that week, filtering out bot traffic. If a domain is associated with multiple categories, then the associated request was included in the aggregate count for each category. The chart shows how the distribution of requests across the selected categories changes over the course of the year.
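The share calculation described above can be sketched as follows. The category names come from the post; the domains and request counts are hypothetical:

```python
from collections import Counter

def category_shares(requests, domain_categories):
    # A request for a multi-category domain counts toward every category.
    counts = Counter()
    for domain in requests:
        for cat in domain_categories.get(domain, []):
            counts[cat] += 1
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()}

# Hypothetical domains and category mappings for illustration only.
domain_categories = {
    "example-shop.com": ["Shopping & Auctions", "Business & Economy"],
    "example-tech.com": ["Technology"],
}
shares = category_shares(
    ["example-shop.com", "example-tech.com", "example-tech.com"],
    domain_categories,
)
print(round(shares["Technology"], 2))  # 0.5 (2 of 4 category-mapped requests)
```

Because multi-category domains count once per category, the shares are normalized by total category-mapped requests rather than raw request volume.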

Globally, sites in the Technology category were the most popular, accounting for approximately one-third of traffic throughout the year. The next most popular category was Business & Economy, which drove approximately 15% of traffic. Shopping & Auctions also saw a bump in traffic in November, as consumers began their holiday shopping.


In sharp contrast to other Asian countries, in South Korea, Internet Communication was consistently the second most popular category during the year. Elsewhere, Internet Communication was occasionally among the top five, but usually within the top 10. Internet Communication was followed closely by Entertainment and Business & Economy. The former saw multiple periods of increased traffic through the year, in contrast to other categories, which saw traffic share remain fairly consistent over time.

Traffic distribution in Turkey represented a rare departure from most other locations around the world. Although Technology started the year as the most popular category, its popularity waned during the back half of the year, ending below Shopping & Auctions and Society & Lifestyle. These latter two saw gradual growth starting in September, and posted larger increases in November. Business & Economy and Entertainment sites were comparatively less popular here, in contrast to many other locations.

Armenia’s traffic distribution also ran counter to that seen in most other locations. Entertainment was the most popular category for nearly the entire year, except for the final week of November. Technology was generally the second most popular category, although it was surpassed by Gambling several times throughout the year. However, Gambling saw its popularity fall significantly in November, as it was surpassed by the Shopping & Auctions and Business & Economy categories.

The luxury of being a popular Internet service is that the service’s brand becomes very recognizable, so it will be no surprise that Google was #1 in our General ranking.

Top 10 — General, late 2022 ranking
1. Google
2. Facebook
3. Apple, TikTok (tie)
5. YouTube
6. Microsoft
7. Amazon Web Services
8. Instagram
9. Amazon
10. iCloud, Netflix, Twitter, Yahoo (tie)

Last year TikTok was at the top of our ranking. However, the results between the two years aren’t comparable. As part of our launch of Radar 2.0, we introduced improvements to our domain ranking algorithms, and this year’s rankings are based on those new algorithms. In addition, this year we have grouped domains that all belong to a single Internet service. For example, Google operates google.com, google.pt, and mail.google.com among others, so we aggregated the popularity of each domain under a single “Google” Internet service for simplicity. However, while Meta operates both Facebook and Instagram, consumers typically perceive those brands as distinct, so we decided to group domains associated with those services separately.

Zooming out from our General top 10, the anonymized DNS query data from our 1.1.1.1 public DNS resolver reflects traffic from millions of users around the world, enabling us to offer category specific rankings as well. While you can view them all in the “Most popular Internet services” section of our Year in Review website, we’ve decided to highlight a few of our favorite observations below.

Cryptocurrencies always seem to have as much promise as they have controversy. We couldn’t help but be curious about which cryptocurrency services were the most popular. But before jumping into the Top 10, let’s double-click on one that fell out of the running: FTX. Known as the third largest cryptocurrency exchange in the world, our popularity ranking shows it hovered around 9th place for most of the year. That is, until it filed for bankruptcy in November. At that point, there is a precipitous drop, which also appears to coincide with reports that FTX disabled its users’ ability to make cryptocurrency withdrawals. Moving back to the Top 10, the two other major cryptocurrency exchanges, Binance and Coinbase, ranked #1 and #3 respectively and don’t appear to have been adversely impacted by FTX in our rankings.


The universe has been the hottest place to be since the beginning of time, but some suggest that we’ll all soon be in the metaverse. If that’s true, then the question becomes “Whose metaverse?”. Last year, Facebook changed its name to Meta as it poured billions of dollars into the space, so we were curious about the impact of their efforts on the metaverse landscape one year later. With Meta’s Oculus offering their initial foray into the metaverse, our data indicates that while its popularity saw tangible improvements, rising from 10th to 5th in the back half of the year, Roblox is clearly the champion of the metaverse arena. It is fascinating to see this smaller challenger dominating Oculus, which is operated by Meta, a company ~18x larger in market capitalization. We are excited to check back at the end of 2023 to see whether Oculus’ ascent of the rankings topples Roblox, or if the smaller player retains the crown.


Facebook’s transition to Meta, however, does not appear to have impacted its popularity as a social media platform. Within our ranking of the top social media platforms, Facebook held the top position throughout the year. TikTok and Snapchat also held steady in their places among the top five. Instagram and Twitter traded places several times mid-year, but the photo and video sharing app ultimately knocked Twitter from 3rd place in August. More active volatility was seen in the bottom half of the top 10, as LinkedIn, Discord, and Reddit frequently shifted between sixth, seventh, and eighth position in the rankings.


While those are the most popular sites today, over the last 20+ years, the landscape of social media platforms has been quite dynamic, with new players regularly emerging. Some gained a foothold and became successful, while others became a footnote of Internet history. Although it has actually been around since 2016, Mastodon emerged as the latest potential disruptor in the space. In a landscape where the top social media platforms operate closed-source, centralized platforms, Mastodon offers free, open source software to allow anyone to start their own social networking platform, built around a decentralized architecture, and easily federated with others.

Aggregating the domain names used by 400 top Mastodon instances, this cohort started the year hovering around the #200 rank of most popular services overall. Its position in the overall rankings steadily improved throughout the year, hitting an inflection point in November, moving up about 60 positions. This trend appears to be driven by a spike in interest and usage of Mastodon, which we elaborate on in the Adoption section below.


Bot traffic

Bot traffic describes any non-human traffic to a website or an app. Some bots are useful, such as those that monitor site and application availability or search engine bots that index content for search, and Cloudflare maintains a list of verified bots known to perform such services. However, visibility into other non-verified bot activity is just as, if not more, important as they may be used to perform malicious activities, such as breaking into user accounts or scanning the web for exposed vulnerabilities to exploit. To calculate bot traffic percentages, we used the bot score assigned to each request to identify those made by bots, and then divided the total number of daily requests from these bots by the total number of daily requests. These calculations were done both globally and on a per-location basis. The line shown in the trends graph represents a seven-day trailing average. For the top 10 chart, we calculated the average bot percentage on a monthly basis per location, and then ranked the locations by percentage. The chart illustrates the ranking by month, and how those rankings change across the year.
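The bot-share calculation can be sketched as below. The score threshold of 30 is our assumption for illustration; the exact cutoff used to classify a request as bot-driven is internal to Cloudflare:

```python
def bot_traffic_pct(bot_scores, threshold=30):
    # Requests whose bot score is at or below the threshold are treated
    # as bot requests; divide by total daily requests for the daily share.
    bots = sum(1 for s in bot_scores if s <= threshold)
    return 100.0 * bots / len(bot_scores)

# Ten requests for the day, three with low (bot-like) scores.
print(bot_traffic_pct([5, 10, 25, 80, 90, 95, 70, 85, 99, 60]))  # 30.0
```

The same per-day figure, computed globally or per location and then smoothed with a seven-day trailing average, yields the trend lines described above.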

Globally, bots generally accounted for between 30-35% of traffic over the course of the year. Starting January at around 35%, the percentage of bot traffic dropped by nearly a quarter through the end of February, but then reclaimed some of that loss, staying just above 30% through October. A slight downward trend is evident at the start of November, due to human traffic increasing while bot traffic remained fairly consistent. Despite a couple of nominal spikes/drops, the global trend exhibited fairly low volatility overall throughout the year.

While around one-third of global traffic was from bots, two locations stood out with bot traffic percentages double the global level. Except for two brief mid-year spikes, just under 70% of traffic from Ireland was classified as bot-driven. Similarly, in Singapore, bot traffic consistently ranged between 60-70% across the year. Bots account for the majority share of traffic from these locations due to the presence of local “regions” from multiple cloud platform providers in each. Because doing so is easily automated and free/inexpensive, attackers will frequently spin up ephemeral instances in these clouds in order to launch high volume attacks, such as we saw with the “Mantis” attack in June. (Internal traffic analysis indicates that a significant portion of traffic for these two geographies is from cloud provider networks and that the vast majority of traffic we see from these networks is classified as bot traffic.)

The top 10 list of locations with the highest percentage of bot traffic saw a fair amount of movement throughout the year, with four different locations holding the top slot at some point during the year, although Turkmenistan spent the most time at the top of the list. Overall, 17 locations held a spot among the top 10 at some point during 2022, with greater concentrations in Europe and Asia.

Internet outages

Although the metrics included in the 2022 Year In Review were ultimately driven by Internet traffic to Cloudflare from networks and locations around the world, there are, unfortunately, times when traffic is disrupted. These disruptions can have a number of potential causes, including natural disasters and extreme weather, fiber optic cable cuts, or power outages. However, they can also happen when authoritarian governments order Internet connectivity to be shutdown at a network, regional, or national level.

We saw examples of all of these types of Internet disruptions, and more, during 2022, and aggregated coverage of them in quarterly overview blog posts. With the launch of Radar 2.0 in September, we also began to catalog them on the Cloudflare Radar Outage Center. These disruptions are most often visible as drops in Cloudflare traffic from a given network, region, or country. The 2022 Year In Review website illustrates where these disruptions occurred throughout the year. Some notable outages observed during 2022 are highlighted below.

One of the most significant Internet disruptions of the year took place on AS812 (Rogers), one of Canada’s largest Internet service providers. During the morning of July 8, a near complete loss of traffic was observed, and it took nearly 24 hours for traffic volumes to return to normal levels. A Cloudflare blog post covered the Rogers outage in real-time as the provider attempted to restore connectivity. Data from APNIC estimates that as many as five million users were directly affected, while press coverage noted that the outage also impacted phone systems, retail point of sale systems, automatic teller machines, and online banking services. According to a notice posted by the Rogers CEO, the outage was attributed to “a network system failure following a maintenance update in our core network, which caused some of our routers to malfunction”.

In late September, protests and demonstrations erupted across Iran in response to the death of Mahsa Amini. Amini was a 22-year-old woman from the Kurdistan Province of Iran, and was arrested on September 13 in Tehran by Iran’s “morality police”, a unit that enforces strict dress codes for women. She died on September 16 while in police custody. Iran’s government is no stranger to using Internet shutdowns as a means of limiting communication with the outside world, and in response to these protests and demonstrations, Internet connectivity across the country experienced multiple waves of disruptions.

Three of the major mobile network providers — AS44244 (Irancell), AS57218 (RighTel), and AS197207 (MCCI) — started implementing daily Internet “curfews” on September 21, generally taking place between 1600 and midnight local time (1230-2030 UTC), although the start times varied on several days. These regular shutdowns lasted into early October, with several more ad-hoc disruptions taking place through the middle of the month, as well as other more localized shutdowns of Internet connectivity. Over 75 million users were impacted by these shutdowns, based on subscriber figures for MCCI alone.

Cable cuts are also a frequent cause of Internet outages; an old joke among network engineers holds that the backhoe is the Internet's natural enemy. While backhoes may be a threat to terrestrial fiber-optic cable, natural disasters can wreak havoc on submarine cables.

A prime example occurred earlier this year, when the Hunga Tonga–Hunga Ha’apai volcanic eruption damaged the submarine cable connecting Tonga to Fiji and took Tonga offline, resulting in a 38-day Internet outage. After the January 14 eruption, only minimal Internet traffic (via limited satellite services) was seen from Tonga. On February 22, Digicel announced that the main island was back online after initial submarine cable repairs were completed, but it was estimated that repairs to the domestic cable, connecting outlying islands, could take an additional six to nine months. We saw rapid growth in traffic from Tonga once the initial cable repairs were completed.

The war in Ukraine is now ten months old, and throughout that time, multiple networks across the country have experienced outages. In March, we observed outages in Mariupol and other cities where fighting was taking place. In late May, an extended Internet disruption began in Kherson, coincident with AS47598 (Khersontelecom) starting to route traffic through Russian network provider AS201776 (MIranda), rather than a Ukrainian upstream. And in October, widespread power outages disrupted Internet connectivity in Kharkiv, Lviv, Kyiv, Poltava Oblast, and Zhytomyr. These outages and others were covered in more detail in the quarterly Internet disruption overview blog posts, as well as several other Ukraine-specific blog posts.

Adoption

Working with millions of websites and applications accessed by billions of people as well as providing an industry-leading DNS resolver service gives Cloudflare a unique perspective on the adoption of key technologies and platforms. SpaceX Starlink was frequently in the news this year, and we observed a 15x increase in traffic from the satellite Internet service provider. Social networking platform Mastodon was also in the news this year, and saw significant growth in interest as well.

IPv6 grows ever more important as connected-device growth over the last decade has exhausted the available IPv4 address space, yet global adoption remained around 35% across the year. And as the Internet-connected population continues to grow, many of those people are using mobile devices as their primary means of access. To that end, we also explore mobile device usage trends across the year.

Starlink adoption

Internet connectivity through satellites in geostationary orbit (GEO) has been around for a number of years, but services have historically been hampered by high latency and slower speeds. However, the launch of SpaceX Starlink’s Low Earth Orbit (LEO) satellite Internet service in 2019 and subsequent expansion of the satellite constellation has made high performance Internet connections available in many locations that were previously unserved or underserved by traditional wired or wireless broadband. To track the growth in usage and availability of Starlink’s service, we analyzed aggregate Cloudflare traffic volumes associated with the service’s autonomous system (AS14593) throughout 2022. Although Starlink is not yet available globally, we did see traffic growth across a number of locations. The request volume shown on the trend line in the chart represents a seven-day trailing average.
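
The seven-day trailing average used for the trend lines throughout this post can be sketched as a simple reference implementation (windows at the start of the series are necessarily shorter):

```python
def trailing_average(values, window=7):
    """Trailing moving average: each point is the mean of that day's
    value and the preceding window-1 values (fewer at the series start)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

Trailing (rather than centered) windows keep the most recent day's data on the plot at the cost of slightly lagging the underlying trend.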

Damage from the war in Ukraine has disrupted traditional wired and wireless Internet connectivity since the invasion started in late February. Starlink made headlines that month after the company activated service within the country, and the necessary satellite Internet terminals became more widely available. Within days, Cloudflare began to see Starlink traffic, with volume growing consistently throughout the year.

Latent interest in the service was also apparent in a number of locations where traffic grew quickly after Starlink announced availability. One such example is Romania, which was included in Starlink’s May announcement of an expanded service footprint, and which saw rapid traffic growth after the announcement.

And in the United States, where Starlink has provided service since launch, traffic grew more than 10x through the end of November. Service enhancements announced during the year, like the ability to get Internet connectivity from moving vehicles, boats, and planes, will likely drive additional traffic growth in the future.

Mastodon interest

Above, we showed that Mastodon hit an inflection point in its popularity during the last few months of 2022. To better understand how interest in Mastodon evolved during 2022, we analyzed aggregate 1.1.1.1 request volume data for the domain names associated with 400 top Mastodon instances, looking at aggregate request volume by location. The request volume shown on the trend line in the chart represents a seven-day trailing average.

Although interest in Mastodon clearly accelerated over the last few months of the year, this interest was unevenly distributed throughout the world as we saw little to no traffic across many locations. Graphs for those locations are not included within the Year In Review website. However, because Mastodon has been around since 2016, it built a base of early adopters over the last six years before being thrust into the spotlight in 2022.

Those early adopters are visible at a global level, as we see a steady volume of resolver traffic for the analyzed Mastodon instance domain names through the first nine months of the year; an increase visible in late April aligns with the announcement that Elon Musk had reached a deal to acquire Twitter for $44 billion. The slope of the graph clearly shifted in October as it became increasingly clear that the acquisition would close shortly, with additional growth into November after the deal was completed. This growth is likely due to a combination of existing but dormant Mastodon accounts once again becoming active, and an influx of new users.

The traffic pattern observed for the United States appears fairly similar to the global pattern, with traffic from an existing set of users seeing massive growth starting in late October as well.

Although the core Mastodon software was developed by a programmer living in Germany, and the associated organization is incorporated as a German not-for-profit, it didn’t appear to have any significant home field advantage. Query volume for Germany was relatively low throughout most of the year, and only started to rapidly increase at the end of October, similar to behavior observed in a number of other countries.

IPv6 adoption

Although IPv6 has been around for nearly a quarter-century, adoption has been relatively slow over that time. However, with the exhaustion of available IPv4 address space and the growth in connected and mobile devices, IPv6 plays a critical role in the future of the Internet. Cloudflare has enabled customers to deliver content over IPv6 since our first birthday, back in 2011, and we have evolved support in several ways since that time. Analysis of traffic to the Cloudflare network provides us with insights into IPv6 adoption across the Internet.

On a global basis, IPv6 adoption hovered around the 35% mark throughout the year, with nominal growth evident in the trend line shown in the graph. While it is encouraging to see one of every three requests for dual stacked content being made over IPv6, this adoption rate demonstrates a clear opportunity for improvement.

To calculate IPv6 adoption for each location, we identified the set of customer zones that had IPv6 enabled (were “dual stacked”) during 2022, and then divided the daily request count for the zones over IPv6 by the daily sum of IPv4 and IPv6 requests for the zones, filtering out bot traffic in both cases. The line shown in the trends graph represents a seven-day trailing average. For the top 10 chart, we calculated the average IPv6 adoption level on a monthly basis per location, and then ranked the locations by percentage. The chart illustrates the ranking by month, and how those rankings change across the year.
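
A minimal sketch of this per-location calculation, assuming daily IPv4 and IPv6 request counts for dual-stacked zones with bot traffic already filtered out (the data shape here is hypothetical):

```python
def ipv6_adoption(daily_counts):
    """daily_counts maps a day to {'ipv4': n, 'ipv6': n} request counts
    for dual-stacked customer zones. Returns each day's IPv6 adoption
    percentage; days with no traffic are omitted."""
    return {
        day: 100.0 * c["ipv6"] / (c["ipv4"] + c["ipv6"])
        for day, c in daily_counts.items()
        if c["ipv4"] + c["ipv6"] > 0
    }
```

Restricting the denominator to dual-stacked zones matters: it measures whether clients *choose* IPv6 when both protocols are available, rather than penalizing locations for content that is IPv4-only.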

One location that has seized that opportunity is India, which recorded the highest IPv6 adoption rate throughout the year. After seeing more than 70% adoption through July, it began to drop slightly in late summer, losing a couple of percentage points over the subsequent months.

One key driver behind India’s leadership in this area is IPv6 support from Jio, India’s largest mobile network operator and also a provider of fiber-to-the-home broadband connectivity. Jio aggressively began its IPv6 journey in late 2015, and now much of its core network infrastructure is IPv6-only, while customer-facing mobile and fiber connections are dual-stacked.

Also heading in the right direction are the more than 60 locations around the world that saw IPv6 adoption rates more than double this year. One of the largest increases was seen in the European country of Georgia, which grew more than 3,500% to close out the year at 10% adoption, thanks to rapid growth across February and March at Magticom, a leading Georgian telecommunications provider.

Many of the other locations in this set also experienced large gains over a short period of time, likely due to a local network provider enabling subscriber support for IPv6. While the significant gains seen in over a quarter of the surveyed locations are certainly a positive sign, it must be noted that over 50 locations remain under 10% adoption, with more than half of those still well under 1%, even after seeing adoption more than double. Internet service providers around the world continue to add or improve IPv6 support for their subscribers, but many still have low to non-existent adoption rates, presenting significant opportunity for improvement in the future.

As noted above, India had the highest level of IPv6 adoption through 2022. In looking at the remainder of the top 10 list, Saudi Arabia and Malaysia traded places several times during the year as the locations with the second and third-highest adoption rates, at just under 60% and around 55% respectively. The United States appeared towards the bottom of the top 10 list during the first quarter, but ranked lower for the remainder of the year. Belgium proved to be the most consistent, holding the fourth-place spot from March through November, with around 55% IPv6 adoption. Overall, a total of 14 locations appeared among the top 10 at some point during the year.

Mobile device usage

Each year, mobile devices become more and more powerful, and they are increasingly being used as the primary onramp to the Internet in many places. In fact, in some parts of the world, so-called “desktop” devices (a category that includes laptop form factors) are the exception for Internet access, not the rule.

Analysis of the information included with each content request enables us to classify the type of device (mobile or desktop) used to make the request. To calculate the percentage of mobile device usage by location, we divided the number of requests made by mobile devices over the course of a week by the total number of requests seen that week, filtering out bot traffic in both cases. For the top 10 chart, we ranked the locations by the calculated percentage. The chart illustrates the ranking by month, and how those rankings change across the year.

In looking at the top 10 chart, we note that Iran and Sudan held the top two slots for much of the year, bookended by Yemen in January and Mauritania in November. Below the top two spots, however, significant volatility is clear throughout the year within the rest of the top 10. However, this movement was actually concentrated across a relatively small percentage range, with just five to ten percentage points separating the top and bottom ranked locations, depending on the week. The top ranked locations generally saw 80-85% of traffic from mobile devices, while the bottom ranked locations saw 75-80% of traffic from mobile devices.

This analysis reinforces the importance of mobile connectivity in Iran, and underscores why mobile network providers were targeted for Internet shutdowns in September and October, as discussed above. (And the shutdowns subsequently explain why Iran disappears from the top 10 list after September.)

Security

Improving Internet security is a key part of Cloudflare’s drive to help build a better Internet. One way we do that is by protecting customer websites, applications, and network infrastructure from malicious traffic and attacks. Because malicious actors regularly use a variety of techniques and approaches in launching their attacks, we have a number of products within our security solution portfolio that provide customers with flexibility around how they handle these attacks. Below, we explore insights derived from the attack mitigation we do on behalf of customers, including how we are mitigating attacks, what kinds of websites and applications attacks are targeting, and where these attacks appear to be coming from. In addition, with the acquisition of Area 1 earlier in 2022, we are presenting insight into where malicious email originates from. Analysis of this data highlights that there is very much no “one size fits all” security solution, as attackers use a wide variety of techniques, frequently shifting between them. As such, having a broad but flexible portfolio of security solutions at the ready is critical for CISOs and CIOs.

Mitigation sources

Depending on the approach taken by an attacker, and the type of content being targeted, one attack mitigation technique may be preferable over another. Cloudflare refers to these techniques as “mitigation sources”, and they include popular tools and techniques like Web Application Firewall (WAF) and DDoS Mitigation (DDoS), but also lesser known ones like IP Reputation (IPR), Access Rules (AR), Bot Management (BM), and API Shield (APIS). Examining the distribution of mitigation sources applied by location can help us better understand the types of attacks originating from those locations. To calculate the percentage of mitigated traffic associated with each mitigation source by location, we divided the total number of daily mitigated requests for each source by the total number of mitigated requests seen that day. Bot traffic is included in these calculations, given that many attacks originate from bots. A single request can be mitigated by multiple techniques, and here we consider the last technique that mitigated the request.
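
The "last technique wins" attribution described above can be sketched as follows, assuming each mitigated request carries the ordered chain of techniques that fired for it (names and data shapes are illustrative, not Cloudflare's internal schema):

```python
from collections import Counter

def mitigation_source_shares(mitigated_requests):
    """Each element of `mitigated_requests` is the ordered list of
    mitigation techniques that fired for one request; the request is
    attributed to the last technique in its chain. Returns the
    percentage share of mitigated traffic per technique."""
    counts = Counter(chain[-1] for chain in mitigated_requests if chain)
    total = sum(counts.values())
    return {source: 100.0 * n / total for source, n in counts.items()}
```
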

Across many locations, IP Reputation, Bot Management, and Access Rules accounted for small amounts of mitigated traffic throughout the year, with the volumes varying by country. However, in other locations, IP Reputation and Access Rules were responsible for larger amounts of mitigated traffic, possibly indicating that more of the traffic from those places was being blocked outright. A number of countries saw a rapid and significant increase in DDoS-mitigated traffic during January, into the 80-90% range, followed by a rapid drop to the 10-20% range. In that vein, DDoS Mitigation and WAF percentage shifts were frequently very spiky, with only occasional sustained periods of relatively consistent percentages.

Overall, DDoS Mitigation and WAF were the two most frequently used techniques to address attacks. The former’s share on a global basis was highest in mid-January, growing to nearly 80%, while the latter’s peak was during February, when it accounted for almost 60% of mitigated traffic. A spike in the usage of Access Rules is clearly visible in August, related to similar spikes observed for the United States, United Arab Emirates, and Malaysia.

Although Access Rules accounted for as much as 20% of mitigated traffic from the United States in August, it saw much lower usage throughout the balance of the year. DDoS Mitigation was the primary technique used to mitigate attack traffic coming from the United States, responsible for over 80% of such traffic during the first quarter, though it steadily declined through August. In a complementary fashion, WAF drove only ~20% of mitigated traffic early in the year, but that share steadily grew and had tripled by August. Interestingly, the growth in Access Rules usage followed rapid growth and then similarly rapid decline in WAF, possibly suggesting that more targeted rules were implemented to augment the managed rules applied by the Web Application Firewall against US-originated attacks.

Access Rules and IP Reputation were applied more frequently to mitigate attack traffic coming from Germany, with Bot Management also seeing increased usage in February, March, and June. However, except for periods in February and July, DDoS Mitigation drove the bulk of mitigated traffic, generally ranging between 60-80%. WAF mitigation was clearly most significant during February, with 70-80% of mitigated traffic, and July, at around 60%.

In mitigating attacks coming from Japan, it is interesting to see a couple of notable spikes in Bot Management. In March, it was briefly responsible for upwards of 40% of mitigated traffic, with another spike half as large in June. Access Rules also maintained a consistent presence in the graph, accounting for around 5% of mitigated traffic through August, but slightly less in the following months. In dealing with Japanese attack traffic, WAF and DDoS Mitigation frequently traded positions as the largest source of mitigated traffic, although there was no clear pattern or apparent cycle. Both reached as much as 90% of mitigated traffic at times throughout the year – WAF in February and DDoS Mitigation in March. DDoS Mitigation’s periods of “dominance” tended to be more sustained, lasting for several weeks, but were punctuated by brief WAF spikes.

WAF rules

As noted above, Cloudflare’s WAF is frequently used to mitigate application layer attacks. There are hundreds of individually managed rules that can be applied by the WAF depending on the characteristics of the mitigated request, but these rules can be grouped into over a dozen types. Examining the distribution of WAF rules by location can help us better understand the techniques that attacks coming from that location are using. (For example, are attackers trying to inject SQL code into a form field, or exploit a published CVE?) To calculate the distribution of WAF mitigated traffic across the set of rule types for each location, we divided the number of requests mitigated by a particular type of WAF rule seen over the course of a week by the total number of WAF mitigated requests seen over that week. A single request can be mitigated by multiple rules and here we consider the last rule in a sequence that mitigated the request. The chart shows how the distribution of mitigated requests across the selected rule types changes over the course of the year. Bot traffic is included in these calculations.
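
The weekly distribution described above can be sketched as follows, assuming the WAF-mitigated requests have been reduced to (week, rule type) pairs, where the rule type is the last rule that mitigated the request (an illustrative data shape, not Cloudflare's internal schema):

```python
from collections import Counter, defaultdict

def weekly_rule_distribution(mitigations):
    """`mitigations` is an iterable of (iso_week, rule_type) pairs.
    Returns week -> {rule_type: percentage of that week's WAF
    mitigations attributed to that rule type}."""
    per_week = defaultdict(Counter)
    for week, rule in mitigations:
        per_week[week][rule] += 1
    return {
        week: {rule: 100.0 * n / sum(counts.values())
               for rule, n in counts.items()}
        for week, counts in per_week.items()
    }
```
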

At a worldwide level, during the first few months of the year, approximately half of HTTP requests blocked by our Managed WAF Rules contained HTTP anomalies, such as malformed method names, null byte characters in headers, non-standard ports, or content length of zero with a POST request. During that period, Directory Traversal and SQL Injection (SQLi) rules both accounted for just over 10% of mitigated requests as well. Attackers began to further vary their approach starting in May, as Cross Site Scripting (XSS) and File Inclusion both grew to over 10% of mitigations, while HTTP anomalies dropped to below 30%. Use of Software Specific rules grew above 10% in July, as attackers apparently ramped their efforts to exploit vendor-specific vulnerabilities. Broken Authentication and Command Injection rulesets also saw some growth in activity during the last several months, suggesting that attackers increased their efforts to find vulnerabilities in login/authentication systems or to execute commands on vulnerable systems in an attempt to gain access.

Although HTTP Anomaly was the most frequently applied rule when mitigations are aggregated at a global level, there were a number of locations where it held the top spot only briefly, if at all, as discussed below.

Attacks originating in Australia were WAF-mitigated using a number of rulesets, with the most applied ruleset changing frequently during the first half of the year. In contrast to the global overview, HTTP Anomaly was the top ruleset for only a single week in February, when it accounted for just over 30% of mitigations. Otherwise, attacks were most frequently mitigated with Software Specific, Directory Traversal, File Inclusion, and SQLi rules, generally accounting for 25-35% of mitigations. This pattern shifted starting in July, though, as Directory Traversal attacks became the most common, staying that way through the balance of the year. After peaking in June, SQLi attacks became significantly less common, rapidly falling and staying below 10% of mitigations.

WAF mitigations of attacks originating in Canada also demonstrated a pattern that differed from the global one. Although the HTTP Anomaly ruleset started the year accounting for approximately two-thirds of mitigated requests, it was at half that level by the end of January, and saw significant volatility throughout the balance of the year. SQLi mitigations of Canadian traffic effectively saw the opposite pattern, starting the year below 10% of mitigations but growing rapidly, accounting for 60% or more of mitigated traffic at multiple times throughout the year. Interestingly, SQLi attacks from Canada appeared to come in multi-week waves, becoming the most applied ruleset during those waves, and then receding for a brief period.

For attacks originating in Switzerland, the HTTP Anomaly ruleset was never the most frequently invoked, although it remained among the top five throughout the year. Instead, Directory Traversal and XSS rules were most frequently used, accounting for as much as 40% of mitigations. Directory Traversal most consistently held the top spot, though XSS attacks were the most prevalent during August. SQLi attacks saw peaks in April, July/August, and then again at the end of November. The Software Specific ruleset also saw breakout growth in September, reaching as much as 20% of mitigated requests.

Target categories

Above, we discussed how traffic distribution across a set of categories provides insights into the types of content that users are most interested in. By performing similar analysis through a mitigation lens, we can gain insights into the types of websites and applications that are being most frequently targeted by attackers. To calculate the distribution of mitigated traffic across the set of categories for each location, we divided the number of mitigated requests for domains associated with a given category seen over the course of a week by the total number of requests mapped to that category during that week. The chart shows how the distribution of mitigated requests across each category changes over the course of the year. (As such, percentages will not sum to 100%.) Bot traffic is included in these calculations.

The percentage of traffic that was mitigated as an attack varied widely across industries and originating locations. In some places, a nominal percentage of traffic across all categories was mitigated, while in others, multiple categories experienced spikes in mitigated traffic at multiple times during 2022.
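
Because each category's share is computed against that category's own traffic, the percentages are independent of one another. A minimal sketch, using hypothetical weekly counts:

```python
def mitigated_share_by_category(weekly_counts):
    """weekly_counts maps a category to (mitigated, total) request
    counts for one week. Each share is computed against the category's
    *own* traffic, which is why shares need not sum to 100% across
    categories. Categories with no traffic are omitted."""
    return {
        category: 100.0 * mitigated / total
        for category, (mitigated, total) in weekly_counts.items()
        if total
    }
```
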

When aggregated at a global level, there was significant variance over the course of the year in the industry categories that attracted the most attacks as a fraction of their overall traffic. Through January and February, Technology sites had the largest percentage of mitigated requests, ranging between 20-30%. After that, a variety of categories moved in and out of the top slot, with none holding it for more than a few weeks. The biggest spike in attacks was targeted at Travel sites in mid-April, when more than half of the category’s traffic was mitigated. Coincident with the start of the 2022 World Cup in the last week of November, Gambling and Entertainment sites saw the largest percentages of mitigated traffic.

For attacks coming from the United Kingdom, Technology sites consistently saw around 20% of mitigated traffic through the year. During those times that it was not the most mitigated category, half a dozen other categories topped the list. Travel sites experienced two significant bursts of attacks, with nearly 60% of traffic mitigated in April, and nearly 50% in October. Other categories, including Government & Politics, Real Estate, Religion, and Education had the largest shares of mitigated traffic at various times throughout the year. UK-originated attacks on Entertainment sites jumped significantly in late November, with 40% of traffic mitigated at the end of the month.

Similar to the trends seen at the global level, Technology sites accounted for the largest percentage of mitigated attacks from the United States in January and February, clocking in between 30-40%. After that, attackers shifted their focus to target other industry categories. In mid-April, Travel sites had over 60% of requests mitigated as attacks. However, starting in May, Gambling sites most frequently had the highest percentage of traffic being mitigated, generally ranging between 20-40%, but spiking up to 70% in late October/early November.

In contrast, significantly smaller percentages of traffic across the surveyed categories from Japan were mitigated as attacks throughout 2022. Most categories saw mitigation shares of less than 10%, although a number of brief spikes were observed at times. In late March, traffic to sites in the Government & Politics category briefly jumped to a nearly 80% mitigation share, while Travel sites spiked to nearly 70% of requests mitigated as attacks, similar to the behavior seen in other locations. In late June, Religion sites had a mitigation share of over 60%, and a couple of months later, Gambling sites experienced a rapid increase in mitigated traffic, reaching just over 40%. These attacks targeting Gambling sites then receded for a few months before starting to aggressively increase again in October.

Phishing email sources

Phishing emails are ultimately intended to trick users into providing attackers with login credentials for important websites and applications. At a consumer level, this could include an e-commerce site or banking application, while for businesses, this could include code repositories or employee information systems. For customers protected by Cloudflare Area 1 Email Security, we can identify the location that these phishing emails are being sent from. IP address geolocation is used to identify origination location, and the aggregate email counts apply to emails processed by Area 1 only. For the top 10 chart, we aggregated the number of phishing emails seen on a weekly basis per location, and then ranked the locations by phishing email volume. The chart illustrates the ranking by week, and how those rankings change across the year.
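The weekly aggregate-and-rank process described above can be sketched in a few lines of Python. The sample records here are purely hypothetical, standing in for the per-email location data derived from Area 1 processing:

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical (week_start, source_location) records; in practice these
# would be derived from IP geolocation of emails processed by Area 1.
emails = [
    (date(2022, 1, 3), "US"), (date(2022, 1, 3), "US"),
    (date(2022, 1, 3), "DE"), (date(2022, 1, 10), "US"),
    (date(2022, 1, 10), "DE"), (date(2022, 1, 10), "DE"),
]

# Aggregate phishing email counts per week, per source location.
weekly = defaultdict(Counter)
for week, loc in emails:
    weekly[week][loc] += 1

# Rank locations by phishing email volume within each week
# (the real chart keeps the top 10 per week).
rankings = {
    week: [loc for loc, _ in counts.most_common(10)]
    for week, counts in weekly.items()
}
print(rankings[date(2022, 1, 3)])  # ['US', 'DE']
```

Tracking how these per-week rankings shift across 52 weeks is what produces the "bump chart" style visualization on the Year In Review site.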

Reviewing the top 10 list, we find that the United States was the top source of phishing emails observed by Area 1 during 2022. It held the top spot for nearly the entire year, ceding it only once to Germany in November. The balance of the top 10 saw a significant amount of volatility over time, with a total of 23 locations holding a spot in the rankings for at least one month during the year. These locations were well-distributed geographically across the Americas, Europe, and Asia, highlighting that no one region of the world is a greater threat than others. Obviously, distrusting or rejecting all email originating from these locations is not a particularly practical response, but applying additional scrutiny can help keep your organization, and the Internet, safer.

Conclusion

Attempting to concisely summarize our “year in review” observations is challenging, especially as we only looked at trends in this blog post across a small fraction of the nearly 200 locations included in the website’s visualizations. Having said that, we will leave you with the following brief thoughts:

  • Attack traffic comes from everywhere, with constantly shifting targets, using widely varied techniques. Ensure that your security solutions provider offers a comprehensive portfolio of services to help keep your sites, applications, and infrastructure safe.
  • Internet service providers around the world need to improve support for IPv6 — it is no longer a “new” technology, and available IPv4 address space will become both increasingly scarce and increasingly expensive. Support for IPv6 needs to become the default going forward.
  • Internet shutdowns are being increasingly used by governments to limit communications within a country, as well as limiting communications with the rest of the world. As the United Nations stated in a May 2022 report, “Blanket shutdowns in particular inherently impose unacceptable consequences for human rights and should never be imposed.”

As we said in the introduction, we encourage you to visit the full Cloudflare Radar 2022 Year In Review website and explore the trends relevant to locations and industries of interest, and to consider how they impact your organization so that you are appropriately prepared for 2023.

If you have any questions, you can contact the Cloudflare Radar team at [email protected] or on Twitter at @CloudflareRadar.

Acknowledgements

It truly took a village to produce the Cloudflare Radar 2022 Year In Review, and we would be remiss if we didn’t acknowledge the contributions of colleagues that were instrumental in making this project possible. Thank you to: Sabina Zejnilovic, Carlos Azevedo, Jorge Pacheco (Data Science); Ricardo Baeta, Syeef Karim (Design); Nuno Pereira, Tiago Dias, Junior Dias de Oliveira (Front End Development); João Tomé (Most popular Internet services); and Davide Marques, Paula Tavares, Celso Martinho (Project/Engineering Management).

Critical Microsoft Code-Execution Vulnerability

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/12/critical-microsoft-code-execution-vulnerability.html

A critical code-execution vulnerability in Microsoft Windows was patched in September. It seems that researchers just realized how serious it was (and is):

Like EternalBlue, CVE-2022-37958, as the latest vulnerability is tracked, allows attackers to execute malicious code with no authentication required. Also, like EternalBlue, it’s wormable, meaning that a single exploit can trigger a chain reaction of self-replicating follow-on exploits on other vulnerable systems. The wormability of EternalBlue allowed WannaCry and several other attacks to spread across the world in a matter of minutes with no user interaction required.

But unlike EternalBlue, which could be exploited when using only the SMB, or server message block, a protocol for file and printer sharing and similar network activities, this latest vulnerability is present in a much broader range of network protocols, giving attackers more flexibility than they had when exploiting the older vulnerability.

[…]

Microsoft fixed CVE-2022-37958 in September during its monthly Patch Tuesday rollout of security fixes. At the time, however, Microsoft researchers believed the vulnerability allowed only the disclosure of potentially sensitive information. As such, Microsoft gave the vulnerability a designation of “important.” In the routine course of analyzing vulnerabilities after they’re patched, Palmiotti discovered it allowed for remote code execution in much the way EternalBlue did. Last week, Microsoft revised the designation to critical and gave it a severity rating of 8.1, the same given to EternalBlue.

Huang: Towards a More Open Secure Element Chip

Post Syndicated from original https://lwn.net/Articles/918337/

Andrew ‘bunnie’ Huang writes about his work with Cramium to bring more openness to secure element chips:

In my view it’s better to compromise and have a seat at the table now, than to walk away from negotiations and simply cede green fields to proprietary technologies, hoping to retake lost ground only after the community has achieved consensus around a robust full-stack open source SE solution. So, instead of investing time arguing over politics before any work is done, I’m choosing to invest time building validation test suites. Once I have a solid suite of tests in hand, I’ll have a much stronger position to argue for the removal of any proprietary CPU cores.

(Thanks to Paul Wise)

CVE-2022-41080, CVE-2022-41082: Rapid7 Observed Exploitation of `OWASSRF` in Exchange for RCE

Post Syndicated from Glenn Thorpe original https://blog.rapid7.com/2022/12/21/cve-2022-41080-cve-2022-41082-rapid7-observed-exploitation-of-owassrf-in-exchange-for-rce/


Beginning December 20, 2022, Rapid7 has responded to an increase in the number of Microsoft Exchange server compromises. Further investigation aligned these attacks with what CrowdStrike is reporting as “OWASSRF”: a chaining of CVE-2022-41080 and CVE-2022-41082 that bypasses the URL rewrite mitigations Microsoft provided for ProxyNotShell, allowing for remote code execution (RCE) via privilege escalation through Outlook Web Access (OWA).

Patched servers do not appear vulnerable; servers relying only on Microsoft’s mitigations do.

Threat actors are using this to deploy ransomware.

Rapid7 recommends that organizations that have yet to install the November 2022 Exchange update (KB5019758) do so immediately and investigate their systems for indicators of compromise. Do not rely on the URL rewrite mitigations for protection.

Affected Products

The following on-prem versions of Exchange that have not applied the November 8, 2022 KB5019758 update are vulnerable:

  • Microsoft Exchange Server 2013
  • Microsoft Exchange Server 2016
  • Microsoft Exchange Server 2019

IOCs

In addition to the detection rules included in InsightIDR for Rapid7 customers, other IOCs include:

  • PowerShell spawned by IIS (‘w3wp.exe’) creating outbound network connections
  • 45.76.141[.]84
  • 45.76.143[.]143

Example command being spawned by IIS (w3wp.exe):

[Screenshot: encoded PowerShell command spawned by w3wp.exe]

Decoded command, where the highlighted string (0x2d4c8f8f) is the hex representation of the IP address 45.76.143[.]143:

[Screenshot: decoded PowerShell command]
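The hex-to-IP relationship noted in that caption is easy to verify with Python's standard library (an illustrative check, not part of the advisory):

```python
import ipaddress

# 0x2d4c8f8f -> 0x2d = 45, 0x4c = 76, 0x8f = 143, 0x8f = 143
decoded = ipaddress.ip_address(0x2D4C8F8F)
print(decoded)  # 45.76.143.143
```

Attackers often encode addresses as integers like this to evade naive string matching on dotted-quad IPs, which is one reason the process-level detections below matter.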

Rapid7 Customers

Customers already have coverage to assist in assessing exposure to and detecting exploitation of this threat.

InsightVM and Nexpose

InsightVM and Nexpose added checks for CVE-2022-41080 and CVE-2022-41082 on November 8, 2022.

InsightIDR

InsightIDR customers can look for the alerting of the following rules, typically seeing several (or all) triggered on a single executed command:

  • Attacker Technique – PowerShell Registry Cradle
  • Suspicious Process – PowerShell System.Net.Sockets.TcpClient
  • Suspicious Process – Exchange Server Spawns Process
  • PowerShell – Obfuscated Script
  • Webshell – IIS Spawns PowerShell

Additional detections currently being observed with follow-on activity in these compromises include:

  • Attacker Technique – Plink Redirecting RDP
  • Attacker Technique – Renamed Plink
  • Suspicious Process – Started From Users Music Directory

Managed Detection & Response customers

Your customer advisor will reach out to you right away if any suspicious activity is observed in your organization.

Eoin Miller contributed to this article.
