AWS Security Profile: Matt Luttrell, Principal Solutions Architect for AWS Identity

Post Syndicated from Maddie Bacon original https://aws.amazon.com/blogs/security/aws-security-profile-matt-luttrell-principal-solutions-architect-for-aws-identity/

AWS Security Profile: Matt Luttrell, Principal Solutions Architect for AWS Identity

In the AWS Security Profile series, I interview some of the humans who work in Amazon Web Services Security and help keep our customers safe and secure. In this profile, I interviewed Matt Luttrell, Principal Solutions Architect for AWS Identity.


How long have you been at AWS and what do you do in your current role?

I’ve been at AWS around five years and have worked in a variety of roles from Professional Services consulting as an application architect to a solutions architect. In my current role, I work on the Identity Solutions team, which is a group of solutions architects who are embedded directly in the Identity and Control Services team. We have both internal-facing and external-facing functions. Internally, we work with product managers, drive concepts like data perimeters, and generally act as the voice of the customer to our product teams. Externally, we have conversations with customers, present at events, and so on.

How did you get started in security?

My background is in software development. I’ve always had a side interest in security and have always worked for very security-conscious companies. Early in my career, I became CISSP certified and that’s what got me kickstarted in security-specific domains and conversations. At AWS, being involved in security isn’t an optional thing. So, even before I joined the Identity Solutions team, I spent a lot of time working on identity and AWS Identity and Access Management (IAM) in particular, as well as AWS IAM Access Analyzer, while working with security-conscious customers in the financial services industry. As I got involved in that, I was able to dive deep in the security elements of AWS, but I’ve always had a background in security.

How do you explain your job to non-technical friends and family?

I typically tell them that I work in the cloud computing division at Amazon and that my job title is Solutions Architect. Naturally, the next question is, “what does a solutions architect do? I’ve never heard of that.” I explain that I work with customers to figure out how to put the building blocks together that we offer them. We offer a bunch of different services and features, and my job is to teach customers how they all work and interact with each other.

What are you currently working on that you’re excited about?

One of the things our team is working on is data perimeters. Our customers will see continued guidance on data perimeters. We’ve done a lot of work in this space—workshops and presentations at some of our big conferences, as well as blog posts and example repositories.

I’m also putting together some videos that go in depth on IAM policy evaluation and offer prescriptive guidance on writing IAM policies.

In your opinion, what’s one of the coolest things happening in identity right now?

I might be biased here, but I think there’s been a shift in the security industry at large from network-based perimeters in the traditional on-premises world to identity-based perimeters in the cloud. This is where the concept of data perimeters comes into play. Because your resources and identities are distributed, you can no longer look at your server and touch your server that’s sitting right next to you. This really puts an extra emphasis on your authentication and authorization controls, as well as the need for visibility into those controls. I think there’s a lot of innovation happening in the identity world because of this increased focus on identity perimeters. You’re hearing about concepts in this area like zero trust, data perimeters, and general identity awareness in all levels of the application and infrastructure stacks. You have services like IAM Access Analyzer to help give you that visibility into your AWS environment and what your identities are doing in terms of who can access what. I think we’ll continue to see growth in these areas because workloads are not becoming less distributed over time.

Tell me about something fun that you’ve done recently at AWS.

Roberto Migli and I presented a 400-level workshop at re:Invent 2022 on IAM policy evaluation, AWS Identity and Access Management (IAM) policy evaluation in action. This workshop introduced a new mental model for thinking about policy evaluation and walked attendees through a number of different policy evaluation scenarios. The idea behind the workshop is that we introduce a scenario and have the attendee try to figure out what the result of the evaluation would be. It spends some extra time comparing how the evaluation of resource-based policies differs from that of identity-based policies. I hope attendees walked away with a better understanding of how policy evaluations work at a deeper level and how they can write better, more secure IAM policies. We presented practical advice on how to structure different types of IAM policies and the different tradeoffs when writing a policy one way compared to another. I hope the mental model we introduced helps customers better reason about how policies will evaluate when they write them in their environment.

What is your favorite Amazon Leadership Principle and why?

This is an easy one. For me, it’s definitely Learn and Be Curious. Something I try to do is put myself in uncomfortable situations because I feel that when I’m uncomfortable, I’m learning and growing because it means I don’t know something. I find comfortable situations boring at times, so I’m always trying to dig in and learn how things work. This can sometimes be distracting, too, because there’s so much to learn and understand in the identity world.

What’s the thing you’re most proud of in your career?

There’s no particular project that I can point to and say, “this is what I’m most proud of.” I’m proud to be a part of the team I’m on now. For my team, Customer Obsession is more than just a slogan. We really advocate on behalf of the customer, listen to the voice of the customer, and push back on features that might not be the best thing for the customer. I think it’s awesome that I get to work for a company that really does advocate on behalf of the customer, and that my voice is heard when I’m trying to be that advocate. That aspect of working at AWS and with my team is what I’m most proud of.

I’m also proud of the mentoring and teaching that I get to do within AWS and within my role specifically. It’s really fulfilling to watch somebody grow and realize that career growth is not a zero-sum game—just because someone else succeeds does not mean that I have to fail.

If you had to pick an industry outside of security, what would you want to do?

I’d probably choose to be a ski instructor. I’m a big fan of skiing, but I don’t get to ski very often because of where I live. I love being out on the mountains, skiing, and teaching. I’m looking for any excuse to spend my days in the mountains.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Maddie Bacon

Maddie (she/her) is a technical writer for Amazon Security with a passion for creating meaningful content that focuses on the human side of security and encourages a security-first mindset. She previously worked as a reporter and editor, and has a BA in Mathematics. In her spare time, she enjoys reading, traveling, and staunchly defending the Oxford comma.

Author

Matt Luttrell

Matt is a Principal Solutions Architect on the AWS Identity Solutions team. When he’s not spending time chasing his kids around, he enjoys skiing, cycling, and the occasional video game.

[$] The early days of Linux

Post Syndicated from original https://lwn.net/Articles/928581/

My name is Lars Wirzenius, and I was there when Linux started. Linux
is now a global success, but its beginnings were rather more humble.
These are my memories of the earliest days of Linux, its creation, and the
start of its path to where it is today.

SANS Institute uses Amazon QuickSight to drive transformational security awareness maturity within organizations

Post Syndicated from Carl R. Marrelli original https://aws.amazon.com/blogs/big-data/sans-institute-uses-amazon-quicksight-to-drive-transformational-security-awareness-maturity-within-organizations/

This is a guest post by Carl Marrelli from SANS Institute.

The SANS Institute is a world leader in cybersecurity training and certification. For over 30 years, SANS has worked with leading organizations to help ensure security across their organization, as well as with individual IT professionals who want to build and grow their security careers. We partner with over 500 organizations and support over 200,000 IT professionals with more than 90 technical training courses and over 40 professional (GIAC) certifications.

Our Security Awareness products include more than 70 instructional modules and have been deployed to over 6.5 million end-users to bring cybersecurity training to each employee within an organization.

As the Security Awareness department in particular began developing product strategies to deliver data-driven insights to customers, we were clear on using existing analytics services to rapidly build customer-facing analytics solutions. Building on a proven cloud provider would allow us to focus on our core expertise of helping organizations train, learn, and mature their programs instead of spending extra time and resources building and maintaining analytics from scratch.

We identified Amazon QuickSight, a fully managed, cloud-native business intelligence (BI) service, as the product that fit all our criteria. With it, we found an intuitive product with rich visualizations that we could build and grow with rapidly, allowing us to innovate without monetary risks or being locked in to cumbersome contracts. We considered other options, but they couldn’t support the licensing model that fit our needs.

In this post, we go over how we use QuickSight to serve our security customers.

Helping manage human risk with data-driven insights

SANS Security Awareness helps organizations use best-in-class security awareness and training solutions to transform their ability to measure and manage human risk. Security awareness programs are initiatives aimed at educating individuals about the importance of information security and the best practices for maintaining the confidentiality, integrity, and availability of information. We deliver expertly authored training materials to organizations, including computer-based video training sessions, interactive learning modules, supplemental materials, and reinforcement curriculum to keep security top-of-mind for all employees.

As organizations rapidly adopt and expand their use of digital technologies in their day-to-day work, the number of touchpoints with humans increases. As threat landscapes become increasingly more severe, managing human risk is critical to the success of the security program in any organization. Not only do organizations have to conduct security awareness training programs, but they also need insights into data and metrics that identify points of weakness to take data-driven corrective courses of action. As a leader in the space, we wanted to innovate by bringing relevant data-driven insights to our Security Awareness partners and customers in the journey to ensuring human-centered security across their organizations.

New data products to enhance and gamify risk assessment

We built one of our first insights products to support our Behavioral Risk Assessment. This service allows senior security and risk leaders to assess human risk with data handling, digital behavior, and compliance in an organization by individual, team, geography, business unit, and more. Leaders use the assessment to mature their security awareness capability with risk-informed interventions, identify process and procedure gaps, surface shadow IT, and reduce overall awareness training costs by focusing attention on the most important areas of risk.

Behavioral Risk Assessment dashboard with various charts

Delivered via a survey customized to the data types and risk profile of an organization, this assessment allows risk management leaders to more easily understand the data handling practices across roles and departments. Dashboards built in QuickSight empower stakeholders to quickly visualize what areas may need added attention by way of training intervention or updated policy.

Another product area where we invested in analytics to help organizations identify human risk is in gamified awareness training. The SANS Scavenger Hunt utilizes QuickSight in a unique way as a real-time game scoreboard. Players compete in the hunt while solving cybersecurity-related challenges, giving security teams a fun way to engage the workforce and promote good cyber behaviors.

Security Awareness challenge dashboard

The Scavenger Hunt was widely deployed during global Cybersecurity Awareness Month—a time for security awareness practitioners to shine a light on the purpose and mission of security awareness and also have a little fun. Typically, programs run during this time take place outside any regulated training cycle and are typically not delivered as mandatory training. This being the case, we identified dashboards as a way to gamify the experience to increase engagement among participants. These dashboards, built using QuickSight, provided users access to a leaderboard to not only track their own progress, but to also see how they compared to their fellow participants.

Building on the success of their experience with QuickSight and the Scavenger Hunt, we wanted to push the gamification and dashboards concept further so Chief Information and Security Officers (CISOs) and security teams could identify and mitigate the human side of ransomware risk. We developed Snack Attack!, a gamified learning experience that shows an organization how employees are performing in six key defensive areas where ransomware can be prevented. In 2021, over 80% of cyber breaches involved human error of some kind. Employees must have a fundamental awareness of cybersecurity and the ability to apply cyber knowledge within the scope of their jobs. Snack Attack! and QuickSight proved to be a great product to visualize and action on areas of human risk and sentiment for senior leadership.

A screenshot of a Snack Attack! dashboard

With Snack Attack!, we looked at Cybersecurity Awareness Month from the viewpoint of the awareness practitioner. The program itself focuses on driving engagement through an entertaining storyline with creative visuals. We chose to use the data from the training to help our customers build their awareness programs going forward. The dashboards included in Snack Attack! give the security awareness practitioner insights into the learned behavior of their users. Quick visualizations of learners’ scoring in Snack Attack! can act as an audit of the effectiveness of their existing program and provide a roadmap for future trainings.

Paving the way in using analytics for customer security

The SANS Institute brings together security awareness training programs with a metrics-based approach through out-of-the-box analytics dashboards so our customers can assess and manage human risk successfully. With QuickSight, we were able to rapidly innovate, developing valuable data products at a speed we could not have otherwise. Without up-front investments to get started and with the low cost to try with usage-based pricing, we were able to quickly ideate, build, and deploy customer-facing analytic products to drive security awareness within our customer organizations. Our analytics solutions differentiate us from existing enterprise products. With QuickSight, we are able to show organizations where they have cyber risk.

With the delivery of analytics solutions to customers, the SANS Institute is not only a top cybersecurity training, learning, and certification platform, but also a technology provider that helps customers use data and insights to make meaningful change in their organization. Moving forward, we have identified an expansion of QuickSight dashboards into our larger suite of assessments as the next logical step. Along with the Behavioral Risk Assessment, we offer Knowledge and Culture assessments to help security awareness practitioners better understand where and how to apply training and gauge the effectiveness of their programs. Because of the success we have had with QuickSight on our existing projects, we feel that similar dashboards can provide even more value to our customers.

To learn more about how QuickSight can help your business with dashboards, reports, and more, visit Amazon QuickSight.


About the Author

Carl R. Marrelli is the Director of Business Development and Digital Programs at SANS Institute. Based in Charlotte, NC, he has extensive experience in cross-functional team leadership, product management, and product marketing. Previously as Head of Product at SANS, Carl led the product management team for the Online Training and Security Awareness divisions through a significant growth period. Carl’s unique perspective and innovative ideas, support SANS as the company continues its mission to empower cybersecurity practitioners around the world.

Implementing up-to-date images with automated EC2 Image Builder pipelines

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/implementing-up-to-date-images-with-automated-ec2-image-builder-pipelines/

This blog post is written by Devin Gordon, Senior Solutions Architect, WWPS, and Brad Watson, Senior Solutions Architect, WWPS.

Amazon EC2 Image Builder is a service designed to simplify the creation and deployment of customized Virtual Machine (VM) and container images on AWS or on-premises. The posts Automate OS Image Build Pipelines with EC2 Image Builder and Quickly build STIG-compliant Amazon Machine Images using Amazon EC2 Image Builder show how you can create secure images using EC2 Image Builder pipelines.

In this post, we demonstrate how to automatically keep your base or standard images current, incorporating patches and any other changes using EC2 Image Builder pipelines. We also demonstrate how to keep workload-specific images current using Cascading Pipelines, a feature of EC2 Image Builder.

Dependency updates

You can use the Dependency update feature of EC2 Image Builder pipelines to automatically update your standard image based on changes to your build components.

When you create an EC2 Image Builder pipeline, you can choose to run the pipeline on a schedule, either using a schedule builder or a CRON expression (a method of defining minute, hour, day and month for scheduling). Furthermore, you can choose to only run the pipeline if a component in the pipeline or the source image has changed. This is referred to as a dependency update as shown in the following image.

Figure 1 An example EC2 Image Builder pipeline schedule with dependency update settings

Figure 1: An example EC2 Image Builder pipeline schedule with dependency update settings

When you select “Run pipeline at the scheduled time if there are dependency updates,” your pipeline only executes if the Base AMI or any Build or Test components have changed. The version of your components must be updated for this capability to work. Amazon-provided components include versioning out of the box. Here is an example of three versions of an Amazon-provided Build component that apply Security Technical Implementation Guide (STIG) baselines to Linux images.

Figure 2 Different versions of one Amazon-managed Build component

Figure 2: Different versions of one Amazon-managed Build component

When a new STIG baseline build component is released, the component’s version is incremented. If a pipeline includes this type of versioned Build component and utilizes the dependency updates capability, then the pipeline automatically runs at the next scheduled interval after the component is updated. Pipelines utilizing this capability will run when the base AMI changes or when a Build or Test component changes.

Notifications

To receive notifications about the pipeline execution, you can enable an Amazon Simple Notification Service (Amazon SNS) topic from within EC2 Image Builder. Under the Infrastructure Configuration section of the EC2 Image Builder pipeline, identify an SNS topic as shown in the following image.

Figure 3 An example SNS topic for sending pipeline execution notifications

Figure 3: An example SNS topic for sending pipeline execution notifications

The SNS topic receives a notification if a pipeline runs and completes with a status of AVAILABLE or FAILED. This occurs even when a pipeline execution is triggered by a component change that you didn’t directly initiate, such as when a new version of an Amazon-managed build component is released.

Even if no other aspects of the infrastructure configuration are used in the pipeline (instance type, security group, subnet, etc.), the SNS topic capability can be used to send a notification when the pipeline executes. With this in mind, you can leverage Amazon SNS to make sure that you’re always notified of any pipeline executions as well as trigger AWS Lambda functions for automation.

Cascading pipelines

Cascading Pipelines are a feature of EC2 Image Builder that you can use to create workload-specific images from a standard secured image (aka “gold image”) of an organization. The following image shows how you can use Cascading Pipelines to keep workload specific images updated.

Figure 4 An example workflow for a EC2 Image Builder Cascading Pipelines

Figure 4: An example workflow for a EC2 Image Builder Cascading Pipelines

You create a gold image pipeline for a hardened base operating system (OS) using the steps outlined in Automate OS Image Build Pipelines with EC2 Image Builder. This pipeline could include a base OS, OS patches, Build components to harden the OS (such as STIG or CIS baselines), as well as any additional software required by the organization (agents, etc.). Do not include application- or workload-specific software in the pipeline. Infrastructure or distribution components may not be included in the pipeline to maintain flexibility for using the gold image. For example, you typically wouldn’t want to include VPC configurations in your golden AMI build because that would constrain the AMI to a particular VPC.

To create a Cascading Pipeline that uses the gold image for applications or workloads, in the Base Image section of the EC2 Image Builder console, choose Select Managed Images.

Figure 5 Selecting the base image of a pipeline

Figure 5: Selecting the base image of a pipeline

Then, select “Images Owned by Me” and under Image Name, select the EC2 Image Builder pipeline used to create the gold image. Moreover, select “Use Latest Available OS Version” under Auto-versioning options to make sure that the Cascading Pipeline is executed any time there is a change to the base image.

Figure 6 Choosing the base golden image from a previous pipeline execution

Figure 6: Choosing the base golden image from a previous pipeline execution

Use this configuration to maintain images for each application or workload which utilizes the gold image. Any time that an update is made to the gold image, application pipelines execute, thus providing updated images. To send notifications, SNS topics are enabled on each workload-specific pipeline.

In this post, we demonstrated how to automatically update images for any changes using EC2 Image Builder pipelines. We also demonstrated how to keep workload specific images using Cascading Pipelines. Using these features, you can make sure that your organization stays up-to-date on the latest OS patches and dependency changes, without requiring human intervention. For more information on EC2 Image Builder, see the official documentation.

Introducing Cloudflare’s new Network Analytics dashboard

Post Syndicated from Omer Yoachimik original https://blog.cloudflare.com/network-analytics-v2-announcement/

Introducing Cloudflare’s new Network Analytics dashboard

Introducing Cloudflare’s new Network Analytics dashboard

We’re pleased to introduce Cloudflare’s new and improved Network Analytics dashboard. It’s now available to Magic Transit and Spectrum customers on the Enterprise plan.

The dashboard provides network operators better visibility into traffic behavior, firewall events, and DDoS attacks as observed across Cloudflare’s global network. Some of the dashboard’s data points include:

  1. Top traffic and attack attributes
  2. Visibility into DDoS mitigations and Magic Firewall events
  3. Detailed packet samples including full packets headers and metadata
Introducing Cloudflare’s new Network Analytics dashboard
Network Analytics – Drill down by various dimensions
Introducing Cloudflare’s new Network Analytics dashboard
Network Analytics – View traffic by mitigation system

This dashboard was the outcome of a full refactoring of our network-layer data logging pipeline. The new data pipeline is decentralized and much more flexible than the previous one — making it more resilient, performant, and scalable for when we add new mitigation systems, introduce new sampling points, and roll out new services. A technical deep-dive blog is coming soon, so stay tuned.

In this blog post, we will demonstrate how the dashboard helps network operators:

  1. Understand their network better
  2. Respond to DDoS attacks faster
  3. Easily generate security reports for peers and managers

Understand your network better

One of the main responsibilities network operators bare is ensuring the operational stability and reliability of their network. Cloudflare’s Network Analytics dashboard shows network operators where their traffic is coming from, where it’s heading, and what type of traffic is being delivered or mitigated. These insights, along with user-friendly drill-down capabilities, help network operators identify changes in traffic, surface abnormal behavior, and can help alert on critical events that require their attention — to help them ensure their network’s stability and reliability.

Starting at the top, the Network Analytics dashboard shows network operators their traffic rates over time along with the total throughput. The entire dashboard is filterable, you can drill down using select-to-zoom, change the time-range, and toggle between a packet or bit/byte view. This can help gain a quick understanding of traffic behavior and identify sudden dips or surges in traffic.

Cloudflare customers advertising their own IP prefixes from the Cloudflare network can also see annotations for BGP advertisement and withdrawal events. This provides additional context atop of the traffic rates and behavior.

Introducing Cloudflare’s new Network Analytics dashboard
The Network Analytics dashboard time series and annotations

Geographical accuracy

One of the many benefits of Cloudflare’s Network Analytics dashboard is its geographical accuracy. Identification of the traffic source usually involves correlating the source IP addresses to a city and country. However, network-layer traffic is subject to IP spoofing. Malicious actors can spoof (alter) their source IP address to obfuscate their origin (or their botnet’s nodes) while attacking your network. Correlating the location (e.g., the source country) based on spoofed IPs would therefore result in spoofed countries. Using spoofed countries would skew the global picture network operators rely on.

To overcome this challenge and provide our users accurate geoinformation, we rely on the location of the Cloudflare data center wherein the traffic was ingested. We’re able to achieve geographical accuracy with high granularity, because we operate data centers in over 285 locations around the world. We use BGP Anycast which ensures traffic is routed to the nearest data center within BGP catchment.

Introducing Cloudflare’s new Network Analytics dashboard
Traffic by Cloudflare data center country from the Network Analytics dashboard

Detailed mitigation analytics

The dashboard lets network operators understand exactly what is happening to their traffic while it’s traversing the Cloudflare network. The All traffic tab provides a summary of attack traffic that was dropped by the three mitigation systems, and the clean traffic that was passed to the origin.

Introducing Cloudflare’s new Network Analytics dashboard
The All traffic tab in Network Analytics

Each additional tab focuses on one mitigation system, showing traffic dropped by the corresponding mitigation system and traffic that was passed through it. This provides network operators almost the same level of visibility as our internal support teams have. It allows them to understand exactly what Cloudflare systems are doing to their traffic and where in the Cloudflare stack an action is being taken.

Introducing Cloudflare’s new Network Analytics dashboard
Introducing Cloudflare’s new Network Analytics dashboard
Data path for Magic Transit customers

Using the detailed tabs, users can better understand the systems’ decisions and which rules are being applied to mitigate attacks. For example, in the Advanced TCP Protection tab, you can view how the system is classifying TCP connection states. In the screenshot below, you can see the distribution of packets according to connection state. For example, a sudden spike in Out of sequence packets may result in the system dropping them.

Introducing Cloudflare’s new Network Analytics dashboard
The Advanced TCP Protection tab in Network Analytics

Note that the presence of tabs differ slightly for Spectrum customers because they do not have access to the Advanced TCP Protection and Magic Firewall tabs. Spectrum customers only have access to the first two tabs.

Respond to DDoS attacks faster

Cloudflare detects and mitigates the majority of DDoS attacks automatically. However, when a network operator responds to a sudden increase in traffic or a CPU spike in their data centers, they need to understand the nature of the traffic. Is this a legitimate surge due to a new game release for example, or an unmitigated DDoS attack? In either case, they need to act quickly to ensure there are no disruptions to critical services.

The Network Analytics dashboard can help network operators quickly pattern traffic by switching the time-series’ grouping dimensions. They can then use that pattern to drop packets using the Magic Firewall. The default dimension is the outcome indicating whether traffic was dropped or passed. But by changing the time series dimension to another field such as the TCP flag, Packet size, or Destination port a pattern can emerge.

In the example below, we have zoomed in on a surge of traffic. By setting the Protocol field as the grouping dimension, we can see that there is a 5 Gbps surge of UDP packets (totalling at 840 GB throughput out of 991 GB in this time period). This is clearly not the traffic we want, so we can hover and click the UDP indicator to filter by it.

Introducing Cloudflare’s new Network Analytics dashboard
Distribution of a DDoS attack by IP protocols

We can then continue to pattern the traffic, and so we set the Source port to be the grouping dimension. We can immediately see that, in this case, the majority of traffic (838 GB) is coming from source port 123. That’s no bueno, so let’s filter by that too.

Introducing Cloudflare’s new Network Analytics dashboard
The UDP flood grouped by source port

We can continue iterating to identify the main pattern of the surge. An example of a field that is not necessarily helpful in this case is the Destination port. The time series is only showing us the top five ports but we can already see that it is quite distributed.

Introducing Cloudflare’s new Network Analytics dashboard
The attack targets multiple destination ports

We move on to see what other fields can contribute to our investigation. Using the Packet size dimension yields good results. Over 771 GB of the traffic are delivered over 286 byte packets.

Introducing Cloudflare’s new Network Analytics dashboard
Zooming in on an UDP flood originating from source port 123 

Assuming that our attack is now sufficiently patterned, we can create a Magic Firewall rule to block the attack by combining those fields. You can combine additional fields to ensure you do not impact your legitimate traffic. For example, if the attack is only targeting a single prefix (e.g., 192.0.2.0/24), you can limit the scope of the rule to that prefix.

Introducing Cloudflare’s new Network Analytics dashboard
Creating a Magic Firewall rule directly from within the analytics dashboard
Introducing Cloudflare’s new Network Analytics dashboard
Creating a Magic Firewall rule to block a UDP flood

If needed for attack mitigation or network troubleshooting, you can also view and export packet samples along with the packet headers. This can help you identify the pattern and sources of the traffic.

Introducing Cloudflare’s new Network Analytics dashboard
Example of packet samples with one sample expanded
Introducing Cloudflare’s new Network Analytics dashboard
Example of a packet sample with the header sections expanded

Generate reports

Another important role of the network security team is to provide decision makers an accurate view of their threat landscape and network security posture. Understanding those will enable teams and decision makers to prepare and ensure their organization is protected and critical services are kept available and performant. This is where, again, the Network Analytics dashboard comes in to help. Network operators can use the dashboard to understand their threat landscape — which endpoints are being targeted, by which types of attacks, where are they coming from, and how does that compare to the previous period.

Introducing Cloudflare’s new Network Analytics dashboard
Dynamic, adaptive executive summary

Using the Network Analytics dashboard, users can create a custom report — filtered and tuned to provide their decision makers a clear view of the attack landscape that’s relevant to them.

Introducing Cloudflare’s new Network Analytics dashboard

In addition, Magic Transit and Spectrum users also receive an automated weekly Network DDoS Report which includes key insights and trends.

Extending visibility from Cloudflare’s vantage point

As we’ve seen in many cases, being unprepared can cost organizations substantial revenue loss, it can negatively impact their reputation, reduce users’ trust as well as burn out teams that need to constantly put out fires reactively. Furthermore, impact to organizations that operate in the healthcare industry, water, and electric and other critical infrastructure industries can cause very serious real-world problems, e.g., hospitals not being able to provide care for patients.

The Network Analytics dashboard aims to reduce the effort and time it takes network teams to investigate and resolve issues as well as to simplify and automate security reporting. The data is also available via GraphQL API and Logpush to allow teams to integrate the data into their internal systems and cross references with additional data points.

To learn more about the Network Analytics dashboard, refer to the developer documentation.

Security updates for Wednesday

Post Syndicated from original https://lwn.net/Articles/928870/

Security updates have been issued by Fedora (chromium, ghostscript, glusterfs, netatalk, php-Smarty, and skopeo), Mageia (ghostscript, imgagmagick, ipmitool, openssl, sudo, thunderbird, tigervnc/x11-server, and vim), Oracle (curl, haproxy, and postgresql), Red Hat (curl, haproxy, httpd:2.4, kernel, kernel-rt, kpatch-patch, and postgresql), Slackware (mozilla), SUSE (firefox), and Ubuntu (dotnet6, dotnet7, firefox, json-smart, linux-gcp, linux-intel-iotg, and sudo).

Internet disruptions overview for Q1 2023

Post Syndicated from David Belson original https://blog.cloudflare.com/q1-2023-internet-disruption-summary/

Internet disruptions overview for Q1 2023

Internet disruptions overview for Q1 2023

Cloudflare operates in more than 285 cities in over 100 countries, where we interconnect with over 11,500 network providers in order to provide a broad range of services to millions of customers. The breadth of both our network and our customer base provides us with a unique perspective on Internet resilience, enabling us to observe the impact of Internet disruptions.

We entered 2023 with Internet disruptions due to causes that ran the gamut, including several government-directed Internet shutdowns, cyclones, a massive earthquake, power outages, cable cuts, cyberattacks, technical problems, and military action. As we have noted in the past, this post is intended as a summary overview of observed disruptions, and is not an exhaustive or complete list of issues that have occurred during the quarter.

Government directed

Iran

Over the last six-plus months, government-directed Internet shutdowns in Iran have largely been in response to protests over the death of Mahsa Amini while in police custody. While these shutdowns are still occurring in a limited fashion, a notable shutdown observed in January was intended to prevent cheating on academic exams. Internet shutdowns with a similar purpose have been observed across a number of other countries, and have also occurred in Iran in the past. Access was restricted across AS44244 (Irancell) and AS197207 (MCCI), with lower traffic levels observed in Alborz Province, Fars, Khuzestan, and Razavi Khorasan between 08:00 to 11:30 local time (04:30 to 08:00 UTC) on January 19.

Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023

Mauritania

On March 6, Internet traffic across the three major mobile network providers in Mauritania was disrupted amid a search for four jihadist prisoners that escaped from prison. Starting around 10:00 local time (10:00 UTC), a drop in traffic was observed at AS37541 (Chinguitel), AS29544 (Mauritel), and AS37508 (Mattel), as well as at a country level. The Internet disruption lasted for multiple days, with traffic starting to recover around 13:45 local time (13:45 UTC) on March 12, after Mauritanian authorities reported that three of the escapees had been killed, with the fourth detained after a shootout.

Internet disruptions overview for Q1 2023

Punjab, India

A shutdown of mobile Internet connectivity in Punjab, India began on March 19, ordered by the local government amid concerns of protest-related violence. Although the initial shutdown was ordered to take place between March 18, 12:00 local time and March 19, 12:00 local time, it was extended several times, ultimately lasting for three days. Traffic for AS38266 (Vodafone India), AS45271 (Idea Cellular Limited), AS45609 (Bharti Mobility), and AS55836 (Reliance Jio Infocomm) began to fall around 12:30 local time (07:00 UTC) on March 18, recovering around 12:30 local time (07:00 UTC) on March 21. However, it was subsequently reported that connectivity remained shut down in some districts until March 23 or 24.

Internet disruptions overview for Q1 2023

Cable cuts

Bolivia

Bolivian ISP Cometco (AS27839) reported on January 12 that problems with international fiber links were causing degradation of Internet service. Traffic from the network dropped by approximately 80% starting around 16:00 local time (20:00 UTC) before returning to normal approximately eight hours later. It isn’t clear whether the referenced international fiber links were terrestrial connections to neighboring countries, or issues with submarine cables several network hops upstream. As a landlocked country, Bolivia is not directly connected to any submarine cables.

Internet disruptions overview for Q1 2023

Anguilla

On February 18, a Facebook post from the Government of Anguilla noted that there was a “Telecommunications Outage affecting both service providers, FLOW & DIGICEL.” The accompanying graphic noted that the outage was due to a “subsea fiber break”. Although not confirmed, the break likely occurred on the Eastern Caribbean Fiber System (ECFS), as this is the only submarine cable system that Anguilla is connected to. The figures below show a clear drop in traffic around 09:00 local time (13:00 UTC) in Anguilla and at AS2740 (Caribbean Cable Communications, acquired by Digicel) and to a lesser extent at AS11139 (Cable & Wireless, parent company of Flow Anguilla). The disruption lasted for over two days, with traffic returning to normal levels around 15:00 local time (19:00 UTC) on February 20, corroborated by a follow-on Facebook post from the Government of Anguilla.

Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023

Bangladesh

A brief connectivity disruption was observed on Bangladeshi provider Grameenphone on February 23, between 11:45-14:00 local time (05:45-08:00 UTC). According to a Facebook post from Grameenphone, the outage was caused by fiber cuts due to road maintenance.

Internet disruptions overview for Q1 2023

Venezuela

Venezuela, and more specifically, AS8048 (CANTV), are no stranger to Internet disruptions, seeing several (Q1, Q2) during 2022, as well as others in previous years. During the last couple of days in February, a few small outages were observed on CANTV’s network in several Venezuelan states. However, a more significant near-complete outage occurred on February 28, starting around midnight local time (04:00 UTC), and lasting for the better part of the day, with traffic recovering at 17:30 local time (21:30 UTC). A Tweet posted the morning of February 28 by CANTV referenced an outage in their fiber optic network, which was presumably the cause.

Internet disruptions overview for Q1 2023

Power outages

Pakistan

A country-wide power outage in Pakistan on January 23 impacted more than 220 million people, and resulted in a significant drop in Internet traffic being observed in the country. The power outage began at 07:34 local time (02:34 UTC), with Internet traffic starting to drop almost immediately. The figure below shows that traffic volumes dropped as much as 50% from normal levels before recovering around 04:15 local time on January 24 (23:15 UTC on January 23). This power outage was reportedly due to a “sudden drop in the frequency of the power transmission system”, which led to a “widespread breakdown”. Nationwide power outages have also occurred in Pakistan in January 2021, May 2018, and January 2015.

Internet disruptions overview for Q1 2023

Bermuda

BELCO, the power company servicing the island of Bermuda, tweeted about a mass power outage affecting the island on February 3, and linked to their outage map so that customers could track restoration efforts. BELCO’s tweet was posted at 16:10 local time (20:10 UTC), approximately one hour after a significant drop was observed in Bermuda’s Internet traffic. The power outage, and subsequent Internet disruption, lasted over five hours, as BELCO later tweeted that “As of 9.45 pm [00:45 UTC, February 4], all circuits have been restored.

Internet disruptions overview for Q1 2023

Argentina

Soaring temperatures in Argentina triggered a large-scale power outage across the country that resulted in multi-hour Internet disruption on March 1. Internet traffic dropped by approximately one-third during the disruption, which lasted from 16:30 to 19:30 local time (19:30 to 22:30 UTC). Cities that experienced visible impacts to Internet traffic during the power outage included Buenos Aires, Cordoba, Mendoza, and Tucuman.

Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023

Kenya

Just a few days later on March 4, Kenya Power issued a Customer Alert at 18:25 local time (15:25 UTC) regarding a nationwide power outage, noting that it had “lost bulk power supply to various parts of the country due to a system disturbance.” The alert came approximately an hour after the country’s Internet traffic dropped significantly. A subsequent tweet dated midnight local time (21:00 UTC) claimed that “electricity supply has been restored to all areas countrywide” and the figure below shows that traffic levels returned to normal levels shortly thereafter.

Internet disruptions overview for Q1 2023

Earthquake

Turkey

On February 5, a magnitude 7.8 earthquake occurred 23 km east of Nurdağı, Turkey, leaving many thousands dead and injured. The quake, which occurred at 04:17 local time (01:17 UTC), was believed to be the strongest to hit Turkey since 1939. The widespread damage and destruction resulted in significant disruptions to Internet connectivity in multiple areas of the country, as shown in the figures below. Although Internet traffic volumes were relatively low because it was so early in the morning, the graphs show it dropping even further at or around the time of the earthquake. Nearly half a day later, traffic volumes in selected locations were between 63-94% lower than at the same time the previous week. A month later, after several aftershocks, traffic volumes had mostly recovered, although some regions were still struggling to recover.

Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023

Weather

New Zealand

Called the “country’s biggest weather event in a century”, Cyclone Gabrielle wreaked havoc on northern New Zealand, including infrastructure damage and power outages impacting tens of thousands of homes. As a result, regions including Gisborne and Hawkes Bay experienced Internet disruptions that lasted several days, starting at 00:00 local time on February 14 (11:00 UTC on February 13). The figures below show that in both regions, peak traffic volume returned to pre-cyclone levels around February 19.

Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023

Vanuatu

Later in February, Cyclone Judy hit the South Pacific Ocean nation of Vanuatu, the South Pacific Ocean nation made up of roughly 80 islands that stretch 1,300 kilometers. The Category 4 cyclone damaged homes and caused power outages, resulting in a significant drop in the country’s Internet traffic. On February 28, Vanuatu’s traffic dropped by nearly 80% as the cyclone struck, and as seen in the figure below, it took nearly two weeks for traffic to recover to the levels seen earlier in February.

Internet disruptions overview for Q1 2023

Malawi

Cyclone Freddy, said to be the longest-lasting, most powerful cyclone on record, hit Malawi during the weekend of March 11-12, and into Monday, March 13. The resulting damage disrupted Internet connectivity in the east African country, with traffic dropping around 11:00 local time (09:00 UTC) on March 13. The disruption lasted for over two days, with traffic levels recovering around 21:00 local time (19:00 UTC) on March 15.

Internet disruptions overview for Q1 2023

Technical problems

South Africa

Just before 07:00 local time (05:00 UTC) on February 1, South African service provider RSAWEB initially tweeted about a problem that they said was impacting their cloud and VOIP platforms. However, in several subsequent tweets, they noted that the problem was also impacting internal systems, as well as fiber and mobile connectivity. The figure below shows traffic for RSAWEB dropping at 06:30 local time (04:30 UTC), a point at which it would normally be starting to increase for the day. Just before 16:00 local time (14:00 UTC), RSAWEB tweeted “…engineers are actively working on restoring services post the major incident. Customers who experienced no connectivity may see some services restoring.” The figure below shows a sharp increase in traffic around that time with gradual growth through the evening. However, full restoration of services across all of RSAWEB’s impacted platforms took a full week, according to a February 8 tweet.

Internet disruptions overview for Q1 2023

Italy

An unspecified “international interconnectivity problem” impacting Telecom Italia caused a multi-hour Internet disruption in Italy on February 5. At a country level, a nominal drop in traffic is visible in the figure below starting around 11:45 local time (10:45 UTC) with some volatility visible in the lower traffic through 17:15 local time (16:15 UTC). However, the impact of the problem is more obvious in the traffic graphs for AS3269 and AS16232, both owned by Telecom Italia. Both graphs show a more significant loss of traffic, as well as greater volatility through the five-plus hour disruption.

Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023

Myanmar

A fire at an exchange office of MPT (Myanma Posts and Telecommunications) on February 7 disrupted Internet connectivity for customers of the Myanmar service provider. A Facebook post from MPT informed customers that “We are currently experiencing disruptions to our MPT’s services including MPT’s call centre, fiber internet, mobile internet and mobile and telephone communications.” The figure below shows the impact of this disruption on MPT-owned AS9988 and AS45558, with traffic dropping significantly at 10:00 local time (03:30 UTC). Significant recovery was seen by 22:00 local time (15:30 local time).

Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023

Republic of the Congo (Brazzaville)

Congo Telecom tweeted a “COMMUNIQUÉ” on March 15, alerting users to a service disruption due to a “network incident”. The impact of this disruption is clearly visible at a country level, with traffic dropping sharply at 00:45 local time (23:45 on March 14 UTC), driven by complete outages at MTN Congo and Congo Telecom, as seen in the graphs below. While traffic at MTN Congo began to recover around 08:00 local time (07:00 UTC), Congo Telecom’s recovery took longer, with traffic beginning to increase around 17:00 local time (16:00 UTC). Congo Telecom tweeted on March 16 that the nationwide Internet outage had been resolved. MTN Congo did not acknowledge the disruption on social media, and neither company provided more specific information about the reported “network incident”.

Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023

Lebanon

Closing out March, disruptions observed at AS39010 (Terranet) and AS42334 (Mobi) in Lebanon may have been related to a strike at upstream provider Ogero Telecom, common to both networks. A published report quoted the Chairman of Ogero commenting on the strike, “We are heading to a catastrophe if a deal is not found with the government: the network will completely stop working as our generators will gradually run out of fuel. Lebanon completely relies on Ogero for its bandwidth, leaving no one exempt from a blackout.” Traffic at both Terranet and Mobi dropped around 05:00 local time (03:00 UTC) on March 29, with the disruption lasting approximately 4.5 hours, as traffic recovered at 09:30 local time (07:30 UTC).

Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023

Cyberattacks

South Korea

On January 29, South Korean Internet provider LG Uplus suffered two brief Internet disruptions which were reportedly caused by possible DDoS attacks. The first disruption occurred at 03:00 local time (18:00 UTC on January 28), and the second occurred at 18:15 local time (09:15 UTC). The disruptions impacted traffic on AS17858 and AS3786, both owned by LG. The company was reportedly hit by a second pair of DDoS attacks on February 4.

Internet disruptions overview for Q1 2023

Guam

In a March 17 tweet posted at 11:30 local time (01:30 UTC), Docomo Pacific reported an outage affecting multiple services, with a subsequent tweet noting that “Early this morning, a cyber security incident occurred and some of our servers were attacked”. This outage is visible at a country level in Guam, seen as a significant drop in traffic starting around 10:00 local time (00:00 UTC) in the figure below. However, in the graph below for AS3605 (ERX-KUENTOS/Guam Cablevision/Docomo Pacific), the cited outage results in a near-complete loss of traffic starting around 05:00 local time (19:00 on March 16 UTC). Traffic returned to normal levels by 18:00 local time on March 18 (08:00 UTC).

Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023

Ukraine/Military Action

In February, the conflict in Ukraine entered its second year, and over this past year, we have tracked its impact on the Internet, highlighting traffic shifts, attacks, routing changes, and connectivity disruptions. In the fourth quarter of 2022, a number of disruptions were related to attacks on electrical infrastructure, and this pattern continued into the first quarter of 2023.

One such disruption occurred in Odessa on January 27, amid news of Russian airstrikes on local energy infrastructure. As seen in the figure below, Internet traffic in Odessa usually begins to climb just before 08:00 local time (06:00 UTC), but failed to do so that morning after several energy infrastructure facilities near Odessa were hit and damaged. Traffic remained lower than levels seen the previous week for approximately 18 hours.

Internet disruptions overview for Q1 2023

Power outages resulting from Russian attacks on energy generation and distribution facilities on March 9 resulted in disruptions to Internet connectivity in multiple locations around Ukraine. As seen in the figures below, traffic dropped below normal levels after 02:00 local time (00:00 UTC) on March 9. Traffic in Kharkiv fell over 50% as compared to previous week, while in Odessa, traffic fell as much as 60%. In Odessa, Mykolaiv, and Kirovohrad Oblast, traffic recovered by around 08:00 local time (06:00 UTC), while in Kharkiv, the disruption lasted nearly two days, returning to normal levels around 23:45 local time (21:45 UTC) on Friday, March 10.

Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023
Internet disruptions overview for Q1 2023

Conclusion

The first quarter of 2023 seemed to be particularly active from an Internet disruption perspective, but hopefully it is not a harbinger of things to come through the rest of the year. This is especially true of government-directed shutdowns, which occurred fairly regularly through 2022. To that end, civil society organization Access Now recently published their Internet shutdowns in 2022 report, finding that In 2022, governments and other actors disrupted the internet at least 187 times across 35 countries. Cloudflare Radar is proud to support Access Now’s #KeepItOn initiative, using our data to help illustrate the impact of Internet shutdowns and other disruptions.

To follow Internet disruptions as they occur, check the Cloudflare Radar Outage Center (CROC) or the Radar API. On social media, follow @CloudflareRadar on Twitter or cloudflare.social/@radar on Mastodon.

Let’s Architect! Monitoring production systems at scale

Post Syndicated from Vittorio Denti original https://aws.amazon.com/blogs/architecture/lets-architect-monitoring-production-systems-at-scale/

“Everything fails, all the time” is a famous quote from Amazon’s Chief Technology Officer Werner Vogels. This means that software and distributed systems may eventually fail because something can always go wrong. We have to accept this and design our systems accordingly, test our software and services, and think about all the possible edge cases.

With this in mind, we should also set our teams up for success by providing visibility in every environment for a quick turnaround when incidents happen. When a system serves traffic in production, we need to monitor it to make sure it behaves as expected and that all components are healthy. But questions arise such as:

  • How do we monitor a system?
  • What is monitoring?
  • What are some architectural and engineering approaches to implement in order to design a successful monitoring strategy?

All of these questions require complex answers. It’s not possible to cover everything in a blog post, but let’s start exploring the topic and sharing resources to guide you through this domain.

In this edition of Let’s Architect! we share some practices for monitoring used at Amazon and AWS, as well as more resources to discover how to build monitoring solutions for the workloads running on AWS.

Observability best practices at Amazon

Observability and monitoring are engineering tasks that also require putting a suitable cultural mindset in place. At Amazon, if a service doesn’t run as expected, the team writes a CoE (Correction of Errors) document to analyze the issue and answer critical questions to learn from it. There are also weekly operations meetings to analyze operational and performance dashboards for each service.

The session introduced here covers the full range of monitoring at Amazon, from how teams assess system health at a high level to how they understand the details of a single request. Use this resource to learn some best practices for metrics, logs, and tracing, and using these signals to achieve operational excellence.

Take me to this re:Invent video!

Observability is an iterative process which requires us to establish a feedback loop and improve based on the signals coming from the system.

Build an observability solution using managed AWS services and the OpenTelemetry standard

Visibility of what’s happening in a distributed system is key to operationalize workloads at scale. OpenTelemetry is the standard for observability and AWS services are fully integrated with that. The blog post introduced in this section shows you how AWS Distro for OpenTelemetry (ADOT) works under the hood and how to use it with a Kubernetes cluster. But keep in mind, this is just one of the many implementations available for AWS compute services and OpenTelemetry—so even if you’re not using Kubernetes right now, we’ve still got you covered!

Want more? Watch this re:Invent video for an understanding of how to think about logging, tracing, metrics, and monitoring with AWS services, and the possibilities to provide the observability your distributed systems need. This is a great learning resource with many demos and examples.

Take me to this blog post!

Flow of metrics and traces from Application services to the Observability Platform.

Optimizing your AWS Batch architecture for scale with observability dashboards

We’ve explored the mental models and strategies for monitoring in previous resources. Now let’s see how these principles can be applied in a scenario where we run batch and ML computing jobs at scale. In the blog post introduced in this section, you can learn how to use runtime metrics to understand an architecture designed on AWS Batch for running batch computing jobs. AWS Batch is a fully managed service enabling you to run jobs at any scale without needing to manage underlying compute resources. This blog explains how AWS Batch works and guides you through the process used to design a monitoring framework.

Since the solution is open-source, you are free to add other custom metrics you find useful. To get started with the AWS Batch open-source observability solution, visit the project page on GitHub. Several customers have used this monitoring tool to optimize their workload for scale by reshaping their jobs, refining their instance selection, and tuning their AWS Batch architecture.

Take me to this blog!

High-level structure of AWS Batch resources and interactions. This diagram depicts a user submitting jobs based on a job definition template to a job queue, which then communicates to a compute environment that resources are needed.

Observability workshop

This resource provides a hands-on experience for you on the variety of toolsets AWS offers to set up monitoring and observability on your applications. Whether your workload is on-premises or on AWS—or your application is a giant monolith or based on modern microservices-based architecture—the observability tools can provide deeper insights into application performance and health.

The monitoring tools covered in this workshop provide powerful capabilities that enable you to identify bottlenecks, issues, and defects without having to manually sift through various logs, metrics, and trace data.

Take me to this workshop!

The diagram illustrates the various components of the PetAdoptions architecture. In the workshop you will learn how to monitor this application.

See you next time!

Thanks for exploring architecture tools and resources with us!

Next time we’ll talk about containers on AWS.

To find all the posts from this series, check out the Let’s Architect! page of the AWS Architecture Blog.

FBI Advising People to Avoid Public Charging Stations

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/04/fbi-advising-people-to-avoid-public-charging-stations.html

The FBI is warning people against using public phone-charging stations, worrying that the combination power-data port can be used to inject malware onto the devices:

Avoid using free charging stations in airports, hotels, or shopping centers. Bad actors have figured out ways to use public USB ports to introduce malware and monitoring software onto devices that access these ports. Carry your own charger and USB cord and use an electrical outlet instead.

How much of a risk is this, really? I am unconvinced, although I do carry a USB condom for charging stations I find suspicious.

News article.

Patch Tuesday – April 2023

Post Syndicated from Adam Barnett original https://blog.rapid7.com/2023/04/11/patch-tuesday-april-2023/

Patch Tuesday - April 2023

Microsoft is offering fixes for 114 vulnerabilities for in April 2023. This month’s haul includes a single zero-day vulnerability, as well as seven critical Remote Code Execution (RCE) vulnerabilities. There is a strong focus on fixes for Windows OS this month.

Over the last 18 months or so, Rapid7 has written several times about the prevalence of driver-based attacks. This month’s sole zero-day vulnerability – a driver-based elevation of privilege – will only reinforce the popularity of the vector among threat actors. Successful exploitation of CVE-2023-28252 allows an attacker to obtain SYSTEM privileges via a vulnerability in the Windows Common Log File System (CLFS) driver. Microsoft has patched more than one similar CLFS driver vulnerability over the past year, including CVE-2023-23376 in February 2023 and CVE-2022-37969 in September 2022.

Microsoft has released patches for the zero-day vulnerability CVE-2023-28252 for all current versions of Windows. Microsoft is not aware of public disclosure, but has detected in-the-wild exploitation and is aware of functional exploit code. The assigned base CVSSv3 score of 7.8 lands this vulnerability near the top of the High severity range, which is expected since it gives complete control of an asset, but a remote attacker must first find some other method to access the target.

April 2023 also sees 45 separate Remote Code Execution (RCE) vulnerabilities patched, which is a significant uptick from the average of 33 per month over the past three months. Microsoft rates seven of this month’s RCE vulnerabilities as Critical, including two related vulnerabilities with a CVSSv3 base score of 9.8. CVE-2023-28250 describes a vulnerability in Windows Pragmatic General Multicast (PGM) which allows an attacker to achieve RCE by sending a specially crafted file over the network. CVE-2023-21554 allows an attacker to achieve RCE by sending a specially crafted Microsoft Messaging Queue packet. In both cases, the Microsoft Message Queueing Service must be enabled and listening on port 1801 for an asset to be vulnerable. The Message Queueing Service is not installed by default. Even so, Microsoft considers exploitation of CVE-2023-21554 more likely.

The other five Critical RCE this month are spread across various Windows components: Windows Raw Image Extension, Windows DHCP Protocol, and two frequent fliers: Windows Point-to-Point Tunneling Protocol and the Windows Layer 2 Tunneling Protocol.

The RAW Image Extension vulnerability CVE-2023-28921 is another example of what Microsoft refers to as an Arbitrary Code Execution (ACE), explaining “The attack itself is carried out locally. This means an attacker or victim needs to execute code from the local machine to exploit the vulnerability.” For some defenders, this may stretch the definition of the word Remote in Remote Code Execution, but there are many ways to deliver a file to a user, and an unpatched system remains vulnerable regardless.

DHCP server vulnerability CVE-2023-28231 requires an attacker to be on the same network as the target, but offers RCE via a specially crafted RPC call. Microsoft considers that exploitation is more likely.

The hunter becomes the hunted as Microsoft patches a Denial of Service vulnerability in Defender. The advisory for CVE-2023-24860 includes some unusual guidance: “Systems that have disabled Microsoft Defender are not in an exploitable state.” In practice this vulnerability is less likely to be exploited, and the default update cadence for Defender should mean that most assets are automatically patched in a short timeframe.

Windows Server administrators should take note of CVE-2023-28247. Successful exploitation allows an attacker to view contents of kernel memory remotely from the context of a user process. Microsoft lists Windows Server 2012, 2016, 2019, and 2022 as vulnerable. Although Microsoft assesses that exploitation is less likely, Windows stores many secrets in kernel memory, including cryptographic keys.

Machine learning is everywhere these days, and this month’s Patch Tuesday is no exception: CVE-2023-28312 describes a vulnerability in Azure Machine Learning which allows an attacker to access system logs, although any attack would need to be launched from within the same secure network. The advisory contains links to Microsoft detection and remediation guidance.

The other Azure vulnerability this month is a Azure Service Connector Security Feature Bypass. Microsoft rates Attack Complexity for CVE-2023-28300 as High, since this vulnerability is only useful when chained with other exploits to defeat other security measures. However, the Azure Service Connector only updates when the Azure Command-Line Interface is updated, and automatic updates are not enabled by default.

Final curtain call tonight for a raft of familiar names, since April 2023 Patch Tuesday includes the very last round of Extended Security Updates (ESU) for a number of Microsoft products. These include:

As always, the end of ESU means that Microsoft does not expect to patch or even disclose any future vulnerabilities which might emerge in these venerable software products, so it is no longer possible to secure them; these dates have been well-publicized far in advance under the fixed lifecycle policy. No vendor can feasibly support ancient software indefinitely, and some administrators may be glad that they will never have to install another Exchange Server 2013 patch.

Summary Charts

Patch Tuesday - April 2023
Printer Drivers, DNS, and the Windows Kernel.
Patch Tuesday - April 2023
Remote Code Execution and Elevation of Privilege account for the majority as usual. A rare appearance for Tampering.
Patch Tuesday - April 2023
CVSSv3 scoring tends to cluster around certain values.
Patch Tuesday - April 2023
As usual, the distribution of severity skews towards Very Important.
Patch Tuesday - April 2023
Printer drivers and CVEs go hand in hand.

Summary tables

Azure vulnerabilities

CVE Title Exploited? Publicly disclosed? CVSSv3 base score
CVE-2023-28300 Azure Service Connector Security Feature Bypass Vulnerability No No 7.5
CVE-2023-28312 Azure Machine Learning Information Disclosure Vulnerability No No 6.5

Browser vulnerabilities

CVE Title Exploited? Publicly disclosed? CVSSv3 base score
CVE-2023-28284 Microsoft Edge (Chromium-based) Security Feature Bypass Vulnerability No No 4.3
CVE-2023-28301 Microsoft Edge (Chromium-based) Tampering Vulnerability No No 4.2
CVE-2023-24935 Microsoft Edge (Chromium-based) Spoofing Vulnerability No No N/A
CVE-2023-1823 Chromium: CVE-2023-1823 Inappropriate implementation in FedCM No No N/A
CVE-2023-1822 Chromium: CVE-2023-1822 Incorrect security UI in Navigation No No N/A
CVE-2023-1821 Chromium: CVE-2023-1821 Inappropriate implementation in WebShare No No N/A
CVE-2023-1820 Chromium: CVE-2023-1820 Heap buffer overflow in Browser History No No N/A
CVE-2023-1819 Chromium: CVE-2023-1819 Out of bounds read in Accessibility No No N/A
CVE-2023-1818 Chromium: CVE-2023-1818 Use after free in Vulkan No No N/A
CVE-2023-1817 Chromium: CVE-2023-1817 Insufficient policy enforcement in Intents No No N/A
CVE-2023-1816 Chromium: CVE-2023-1816 Incorrect security UI in Picture In Picture No No N/A
CVE-2023-1815 Chromium: CVE-2023-1815 Use after free in Networking APIs No No N/A
CVE-2023-1814 Chromium: CVE-2023-1814 Insufficient validation of untrusted input in Safe Browsing No No N/A
CVE-2023-1813 Chromium: CVE-2023-1813 Inappropriate implementation in Extensions No No N/A
CVE-2023-1812 Chromium: CVE-2023-1812 Out of bounds memory access in DOM Bindings No No N/A
CVE-2023-1811 Chromium: CVE-2023-1811 Use after free in Frames No No N/A
CVE-2023-1810 Chromium: CVE-2023-1810 Heap buffer overflow in Visuals No No N/A

Developer Tools vulnerabilities

CVE Title Exploited? Publicly disclosed? CVSSv3 base score
CVE-2023-28296 Visual Studio Remote Code Execution Vulnerability No No 8.4
CVE-2023-28262 Visual Studio Elevation of Privilege Vulnerability No No 7.8
CVE-2023-24893 Visual Studio Code Remote Code Execution Vulnerability No No 7.8
CVE-2023-28260 .NET DLL Hijacking Remote Code Execution Vulnerability No No 7.8
CVE-2023-28299 Visual Studio Spoofing Vulnerability No No 5.5
CVE-2023-28263 Visual Studio Information Disclosure Vulnerability No No 5.5

ESU SQL Server vulnerabilities

CVE Title Exploited? Publicly disclosed? CVSSv3 base score
CVE-2023-23384 Microsoft SQL Server Remote Code Execution Vulnerability No No 7.3

ESU Windows vulnerabilities

CVE Title Exploited? Publicly disclosed? CVSSv3 base score
CVE-2023-28250 Windows Pragmatic General Multicast (PGM) Remote Code Execution Vulnerability No No 9.8
CVE-2023-21554 Microsoft Message Queuing Remote Code Execution Vulnerability No No 9.8
CVE-2023-28240 Windows Network Load Balancing Remote Code Execution Vulnerability No No 8.8
CVE-2023-21727 Remote Procedure Call Runtime Remote Code Execution Vulnerability No No 8.8
CVE-2023-28275 Microsoft WDAC OLE DB provider for SQL Server Remote Code Execution Vulnerability No No 8.8
CVE-2023-28231 DHCP Server Service Remote Code Execution Vulnerability No No 8.8
CVE-2023-28244 Windows Kerberos Elevation of Privilege Vulnerability No No 8.1
CVE-2023-28268 Netlogon RPC Elevation of Privilege Vulnerability No No 8.1
CVE-2023-28219 Layer 2 Tunneling Protocol Remote Code Execution Vulnerability No No 8.1
CVE-2023-28220 Layer 2 Tunneling Protocol Remote Code Execution Vulnerability No No 8.1
CVE-2023-28272 Windows Kernel Elevation of Privilege Vulnerability No No 7.8
CVE-2023-28293 Windows Kernel Elevation of Privilege Vulnerability No No 7.8
CVE-2023-24912 Windows Graphics Component Elevation of Privilege Vulnerability No No 7.8
CVE-2023-28252 Windows Common Log File System Driver Elevation of Privilege Vulnerability Yes No 7.8
CVE-2023-28241 Windows Secure Socket Tunneling Protocol (SSTP) Denial of Service Vulnerability No No 7.5
CVE-2023-24931 Windows Secure Channel Denial of Service Vulnerability No No 7.5
CVE-2023-28232 Windows Point-to-Point Tunneling Protocol Remote Code Execution Vulnerability No No 7.5
CVE-2023-28217 Windows Network Address Translation (NAT) Denial of Service Vulnerability No No 7.5
CVE-2023-28238 Windows Internet Key Exchange (IKE) Protocol Extensions Remote Code Execution Vulnerability No No 7.5
CVE-2023-28227 Windows Bluetooth Driver Remote Code Execution Vulnerability No No 7.5
CVE-2023-21769 Microsoft Message Queuing Denial of Service Vulnerability No No 7.5
CVE-2023-28302 Microsoft Message Queuing Denial of Service Vulnerability No No 7.5
CVE-2023-28254 Windows DNS Server Remote Code Execution Vulnerability No No 7.2
CVE-2023-28222 Windows Kernel Elevation of Privilege Vulnerability No No 7.1
CVE-2023-28229 Windows CNG Key Isolation Service Elevation of Privilege Vulnerability No No 7
CVE-2023-28218 Windows Ancillary Function Driver for WinSock Elevation of Privilege Vulnerability No No 7
CVE-2023-28216 Windows Advanced Local Procedure Call (ALPC) Elevation of Privilege Vulnerability No No 7
CVE-2023-28305 Windows DNS Server Remote Code Execution Vulnerability No No 6.6
CVE-2023-28255 Windows DNS Server Remote Code Execution Vulnerability No No 6.6
CVE-2023-28278 Windows DNS Server Remote Code Execution Vulnerability No No 6.6
CVE-2023-28256 Windows DNS Server Remote Code Execution Vulnerability No No 6.6
CVE-2023-28306 Windows DNS Server Remote Code Execution Vulnerability No No 6.6
CVE-2023-28307 Windows DNS Server Remote Code Execution Vulnerability No No 6.6
CVE-2023-28308 Windows DNS Server Remote Code Execution Vulnerability No No 6.6
CVE-2023-28223 Windows Domain Name Service Remote Code Execution Vulnerability No No 6.6
CVE-2023-28267 Remote Desktop Protocol Client Information Disclosure Vulnerability No No 6.5
CVE-2023-28228 Windows Spoofing Vulnerability No No 5.5
CVE-2023-28271 Windows Kernel Memory Information Disclosure Vulnerability No No 5.5
CVE-2023-28253 Windows Kernel Information Disclosure Vulnerability No No 5.5
CVE-2023-28298 Windows Kernel Denial of Service Vulnerability No No 5.5
CVE-2023-28266 Windows Common Log File System Driver Information Disclosure Vulnerability No No 5.5
CVE-2023-28276 Windows Group Policy Security Feature Bypass Vulnerability No No 4.4
CVE-2023-21729 Remote Procedure Call Runtime Information Disclosure Vulnerability No No 4.3

Microsoft Dynamics vulnerabilities

CVE Title Exploited? Publicly disclosed? CVSSv3 base score
CVE-2023-28309 Microsoft Dynamics 365 (on-premises) Cross-site Scripting Vulnerability No No 7.6
CVE-2023-28313 Microsoft Dynamics 365 Customer Voice Cross-Site Scripting Vulnerability No No 6.1
CVE-2023-28314 Microsoft Dynamics 365 (on-premises) Cross-site Scripting Vulnerability No No 6.1

Microsoft Office vulnerabilities

CVE Title Exploited? Publicly disclosed? CVSSv3 base score
CVE-2023-28311 Microsoft Word Remote Code Execution Vulnerability No No 7.8
CVE-2023-28287 Microsoft Publisher Remote Code Execution Vulnerability No No 7.8
CVE-2023-28295 Microsoft Publisher Remote Code Execution Vulnerability No No 7.8
CVE-2023-28285 Microsoft Office Graphics Remote Code Execution Vulnerability No No 7.8
CVE-2023-28288 Microsoft SharePoint Server Spoofing Vulnerability No No 6.5

SQL Server vulnerabilities

CVE Title Exploited? Publicly disclosed? CVSSv3 base score
CVE-2023-23375 Microsoft ODBC and OLE DB Remote Code Execution Vulnerability No No 7.8
CVE-2023-28304 Microsoft ODBC and OLE DB Remote Code Execution Vulnerability No No 7.8

System Center vulnerabilities

CVE Title Exploited? Publicly disclosed? CVSSv3 base score
CVE-2023-24860 Microsoft Defender Denial of Service Vulnerability No No 7.5

Windows vulnerabilities

CVE Title Exploited? Publicly disclosed? CVSSv3 base score
CVE-2023-28297 Windows Remote Procedure Call Service (RPCSS) Elevation of Privilege Vulnerability No No 8.8
CVE-2023-24924 Microsoft PostScript and PCL6 Class Printer Driver Remote Code Execution Vulnerability No No 8.8
CVE-2023-24925 Microsoft PostScript and PCL6 Class Printer Driver Remote Code Execution Vulnerability No No 8.8
CVE-2023-24884 Microsoft PostScript and PCL6 Class Printer Driver Remote Code Execution Vulnerability No No 8.8
CVE-2023-24926 Microsoft PostScript and PCL6 Class Printer Driver Remote Code Execution Vulnerability No No 8.8
CVE-2023-24885 Microsoft PostScript and PCL6 Class Printer Driver Remote Code Execution Vulnerability No No 8.8
CVE-2023-24927 Microsoft PostScript and PCL6 Class Printer Driver Remote Code Execution Vulnerability No No 8.8
CVE-2023-24886 Microsoft PostScript and PCL6 Class Printer Driver Remote Code Execution Vulnerability No No 8.8
CVE-2023-24928 Microsoft PostScript and PCL6 Class Printer Driver Remote Code Execution Vulnerability No No 8.8
CVE-2023-24887 Microsoft PostScript and PCL6 Class Printer Driver Remote Code Execution Vulnerability No No 8.8
CVE-2023-24929 Microsoft PostScript and PCL6 Class Printer Driver Remote Code Execution Vulnerability No No 8.8
CVE-2023-28243 Microsoft PostScript and PCL6 Class Printer Driver Remote Code Execution Vulnerability No No 8.8
CVE-2023-28291 Raw Image Extension Remote Code Execution Vulnerability No No 8.4
CVE-2023-28274 Windows Win32k Elevation of Privilege Vulnerability No No 7.8
CVE-2023-28246 Windows Registry Elevation of Privilege Vulnerability No No 7.8
CVE-2023-28225 Windows NTLM Elevation of Privilege Vulnerability No No 7.8
CVE-2023-28237 Windows Kernel Remote Code Execution Vulnerability No No 7.8
CVE-2023-28236 Windows Kernel Elevation of Privilege Vulnerability No No 7.8
CVE-2023-28248 Windows Kernel Elevation of Privilege Vulnerability No No 7.8
CVE-2023-28292 Raw Image Extension Remote Code Execution Vulnerability No No 7.8
CVE-2023-28233 Windows Secure Channel Denial of Service Vulnerability No No 7.5
CVE-2023-28234 Windows Secure Channel Denial of Service Vulnerability No No 7.5
CVE-2023-28247 Windows Network File System Information Disclosure Vulnerability No No 7.5
CVE-2023-28224 Windows Point-to-Point Protocol over Ethernet (PPPoE) Remote Code Execution Vulnerability No No 7.1
CVE-2023-28221 Windows Error Reporting Service Elevation of Privilege Vulnerability No No 7
CVE-2023-28273 Windows Clip Service Elevation of Privilege Vulnerability No No 7
CVE-2023-24914 Win32k Elevation of Privilege Vulnerability No No 7
CVE-2023-28235 Windows Lock Screen Security Feature Bypass Vulnerability No No 6.8
CVE-2023-28270 Windows Lock Screen Security Feature Bypass Vulnerability No No 6.8
CVE-2023-24883 Microsoft PostScript and PCL6 Class Printer Driver Information Disclosure Vulnerability No No 6.5
CVE-2023-28269 Windows Boot Manager Security Feature Bypass Vulnerability No No 6.2
CVE-2023-28249 Windows Boot Manager Security Feature Bypass Vulnerability No No 6.2
CVE-2023-28226 Windows Enroll Engine Security Feature Bypass Vulnerability No No 5.3
CVE-2023-28277 Windows DNS Server Information Disclosure Vulnerability No No 4.9

New Global AWS Data Processing Addendum

Post Syndicated from Chad Woolf original https://aws.amazon.com/blogs/security/new-global-aws-data-processing-addendum/

Navigating data protection laws around the world is no simple task. Today, I’m pleased to announce that AWS is expanding the scope of the AWS Data Processing Addendum (Global AWS DPA) so that it applies globally whenever customers use AWS services to process personal data, regardless of which data protection laws apply to that processing. AWS is proud to be the first major cloud provider to adopt this global approach to help you meet your compliance needs for data protection.

The Global AWS DPA is designed to help you satisfy requirements under data protection laws worldwide, without having to create separate country-specific data processing agreements for every location where you use AWS services. By introducing this global, one-stop addendum, we are simplifying contracting procedures and helping to reduce the time that you spend assessing contractual data privacy requirements on a country-by-country basis.

If you have signed a copy of the previous AWS General Data Protection Regulation (GDPR) DPA, then you do not need to take any action and can continue to rely on that addendum to satisfy data processing requirements. AWS is always innovating to help you meet your compliance obligations wherever you operate. We’re confident that this expanded Global AWS DPA will help you on your journey. If you have questions or need more information, see Data Protection & Privacy at AWS and GDPR Center.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Chad Woolf

Chad joined Amazon in 2010 and built the AWS compliance functions from the ground up, including audit and certifications, privacy, contract compliance, control automation engineering and security process monitoring. Chad’s work also includes enabling public sector and regulated industry adoption of the AWS cloud and leads the AWS trade and product compliance team.

[$] Python 3.12: error messages, perf support, and more

Post Syndicated from original https://lwn.net/Articles/928437/

Python 3.12 approaches. While the full feature set of the final
release—slated for October 2023—is still not completely known, by now
we have a good sense for what it will offer. It picks up where
Python 3.11 left off, improving error messages
and performance. These changes are accompanied by a smattering of smaller
changes, though Linux users will likely make use of one in particular:
support for the perf profiler.

7 Rapid Questions: Lindsey Searle

Post Syndicated from Rapid7 original https://blog.rapid7.com/2023/04/11/7-rapid-questions-lindsey-searle/

7 Rapid Questions: Lindsey Searle

Welcome back to 7 Rapid Questions, our blog series where we ask passionate leaders at Rapid7 to give us an inside look at what it’s like to work on their team, and how they’re creating an impact every day.

In this installment, we talk to Lindsey Searle, Senior Manager, Customer Advisors on how her team helps solve customer challenges, and how candidates can stand out in the interview process.

What kind of challenges are you/your team responsible for solving for customers?

The security space is evolving every day as hackers continue to advance. Many customer teams find themselves overwhelmed and in need of more customized services to stay ahead, and that’s where we can help support.

Our team is the face of Rapid7 for Managed Services customers. We provide advisory services for clients of all shapes and sizes and at all levels of their security maturity journeys. The Customer Advisor team works closely with the Security Operations Center (SOC) to monitor customers’ in-scope environments and provide custom tailored guidance to enhance their security posture. Many of our customers consider us an extension of their in-house security team, and we strive to build close knit working relationships and trust with each and every one of them.

In addition to day to day monitoring, our Customer Advisors work with our clients to understand their security goals, make recommendations to achieve those goals, and are personally invested in seeing those initiatives through to completion.

What does your team look like (team size, types of teams etc.), what growth has there been?

The Customer Advisor organization at Rapid7 grew by 30% last year—we added 35 new people to the team in 2022, including Advisors at all levels, four new managers, and a Vice President.

Our team is composed of all levels of security professionals, from associate CAs at the start of their career to tenured Lead and Principal Advisors. We have CAs supporting all three branches of Managed Services, and our teams are blended across Managed Detection and Response (MDR), Managed Vulnerability Management (MVM), and Managed Application Security (MAS) to allow for cross functional collaboration and learning.

We are fortunately in a position where as our Managed Services business grows, our Customer Advisor team has continued to expand, as it is a standard part of the service offering.

What makes the culture at R7 different from other tech / cyber security companies?

I always find it difficult to describe the Rapid7 culture when I interview candidates because it’s something that you really have to see to understand and believe. The underlying fact is that all Rapid7 employees are passionate about security—we are here because we want to help our customers succeed, and we truly enjoy working together for that common goal.

At the same time, every single person is unapologetically unique and does not hesitate to bring their own perspective to the table. We have a great balance of external hires bringing in fresh ideas, as well as internal hires that provide a different approach to a situation when you’ve worked on the other side of the curtain.

What 3 biggest things have you learned in your time at Rapid7?

One: Take the time to thank people for helping you out! We do ‘guitar picks’ at Rapid7—it’s an internal website where you can give fellow moose a virtual kudos and recognition, whether it be for a great presentation they gave, or for filling in for you on an assignment, or for just being awesome. Everyone in the company can see it, and the recipient gets a notification that they’ve received one. Sending a pick takes minutes but can make someone’s day! Our Chief People Officer selects a guitar pick submission and sends it out to the whole company every morning. It’s a quick and meaningful way to thank those around us.

Two: Don’t be afraid to ask questions—we are all constantly learning and there is definitely someone out there who can help.

Three: There is a Slack emoji for just about every situation, and if it doesn’t exist—make one! In fact, our recent Slack migration took longer than expected due to the 10,000+ custom emojis that Rapid7 employees have created. One of our core values is ‘Bring You’ so this is just one example of how people are getting creative to express themselves in different ways and build camaraderie in a globally distributed organization.

How does Rapid7 set you up for success in your role?

I was incredibly impressed with the corporate onboarding provided by Rapid7 when I went through it myself in late 2021. You attend your onboarding sessions with all new hires starting at the company that week and already start to build a network within your first few hours here.

Rapid7 is big on encouraging Insight Coffees—an informal 30 minute meeting with another Rapid7 employee to get to know them on a personal and professional level. Those connections stick with you throughout your time here and only strengthen your ability to work together down the road.

Our company culture is built around helping each other and working together as a team, which puts you in a great spot to be successful in your role.

What can a candidate do to stand out in the interview process?

Honestly, just be yourself—Bring You is a core value at Rapid7 and something that truly sets us apart from other companies. Finding people who embrace our collaborative culture and partner well to share ideas is a major piece of the Rapid7 interview process. These soft skills weigh as heavily as prior work experience and technical competency. Your individuality will set you apart from other candidates—so let your true self shine!

What advice would you give someone thinking about coming to work here?

Bring energy and enthusiasm, and take the time to build meaningful relationships with the people you work with. It is much easier to wake up and log on for the day when you are looking forward to interacting with your team members and your customers. At Rapid7 we live by the core value of ‘Impact Together’—teamwork makes the dream work! We have a far greater chance at success when working together than we do when trying to climb the ladder individually.

To learn more about Rapid7 Managed Services:

CLICK HERE

Reference guide to build inventory management and forecasting solutions on AWS

Post Syndicated from Jason Dalba original https://aws.amazon.com/blogs/big-data/reference-guide-to-build-inventory-management-and-forecasting-solutions-on-aws/

Inventory management is a critical function for any business that deals with physical products. The primary challenge businesses face with inventory management is balancing the cost of holding inventory with the need to ensure that products are available when customers demand them.

The consequences of poor inventory management can be severe. Overstocking can lead to increased holding costs and waste, while understocking can result in lost sales, reduced customer satisfaction, and damage to the business’s reputation. Inefficient inventory management can also tie up valuable resources, including capital and warehouse space, and can impact profitability.

Forecasting is another critical component of effective inventory management. Accurately predicting demand for products allows businesses to optimize inventory levels, minimize stockouts, and reduce holding costs. However, forecasting can be a complex process, and inaccurate predictions can lead to missed opportunities and lost revenue.

To address these challenges, businesses need an inventory management and forecasting solution that can provide real-time insights into inventory levels, demand trends, and customer behavior. Such a solution should use the latest technologies, including Internet of Things (IoT) sensors, cloud computing, and machine learning (ML), to provide accurate, timely, and actionable data. By implementing such a solution, businesses can improve their inventory management processes, reduce holding costs, increase revenue, and enhance customer satisfaction.

In this post, we discuss how to streamline inventory management forecasting systems with AWS managed analytics, AI/ML, and database services.

Solution overview

In today’s highly competitive business landscape, it’s essential for retailers to optimize their inventory management processes to maximize profitability and improve customer satisfaction. With the proliferation of IoT devices and the abundance of data generated by them, it has become possible to collect real-time data on inventory levels, customer behavior, and other key metrics.

To take advantage of this data and build an effective inventory management and forecasting solution, retailers can use a range of AWS services. By collecting data from store sensors using AWS IoT Core, ingesting it using AWS Lambda to Amazon Aurora Serverless, and transforming it using AWS Glue from a database to an Amazon Simple Storage Service (Amazon S3) data lake, retailers can gain deep insights into their inventory and customer behavior.

With Amazon Athena, retailers can analyze this data to identify trends, patterns, and anomalies, and use Amazon ElastiCache for customer-facing applications with reduced latency. Additionally, by building a point of sales application on Amazon QuickSight, retailers can embed customer 360 views into the application to provide personalized shopping experiences and drive customer loyalty.

Finally, we can use Amazon SageMaker to build forecasting models that can predict inventory demand and optimize stock levels.

With these AWS services, retailers can build an end-to-end inventory management and forecasting solution that provides real-time insights into inventory levels and customer behavior, enabling them to make informed decisions that drive business growth and customer satisfaction.

The following diagram illustrates a sample architecture.

With the appropriate AWS services, your inventory management and forecasting system can have optimized collection, storage, processing, and analysis of data from multiple sources. The solution includes the following components.

Data ingestion and storage

Retail businesses have event-driven data that requires action from downstream processes. It’s critical for an inventory management application to handle the data ingestion and storage for changing demands.

The data ingestion process is typically triggered by an event such as an order being placed, kicking off the inventory management workflow, which requires actions from backend services. Developers are responsible for the operational overhead of trying to maintain the data ingestion load from an event driven-application.

The volume and velocity of data can change in the retail industry each day. Events like Black Friday or a new campaign can create volatile demand in what is required to process and store the inventory data. Serverless services designed to scale to businesses’ needs help reduce the architectural and operational challenges that are driven from high-demand retail applications.

Understanding the scaling challenges that occur when inventory demand spikes, we can deploy Lambda, a serverless, event-driven compute service, to trigger the data ingestion process. As inventory events occur like purchases or returns, Lambda automatically scales compute resources to meet the volume of incoming data.

After Lambda responds to the inventory action request, the updated data is stored in Aurora Serverless. Aurora Serverless is a serverless relational database that is designed to scale to the application’s needs. When peak loads hit during events like Black Friday, Aurora Serverless deploys only the database capacity necessary to meet the workload.

Inventory management applications have ever-changing demands. Deploying serverless services to handle the ingestion and storage of data will not only optimize cost but also reduce the operational overhead for developers, freeing up bandwidth for other critical business needs.

Data performance

Customer-facing applications require low latency to maintain positive user experiences with microsecond response times. ElastiCache, a fully managed, in-memory database, delivers high-performance data retrieval to users.

In-memory caching provided by ElastiCache is used to improve latency and throughput for read-heavy applications that online retailers experience. By storing critical pieces of data in-memory like commonly accessed product information, the application performance improves. Product information is an ideal candidate for a cached store due to data staying relatively the same.

Functionality is often added to retail applications to retrieve trending products. Trending products can be cycled through the cache dependent on customer access patterns. ElastiCache manages the real-time application data caching, allowing your customers to experience microsecond response times while supporting high-throughput handling of hundreds of millions of operations per second.

Data transformation

Data transformation is essential in inventory management and forecasting solutions for both data analysis around sales and inventory, as well as ML for forecasting. This is because raw data from various sources can contain inconsistencies, errors, and missing values that may distort the analysis and forecast results.

In the inventory management and forecasting solution, AWS Glue is recommended for data transformation. The tool addresses issues such as cleaning, restructuring, and consolidating data into a standard format that can be easily analyzed. As a result of the transformation, businesses can obtain a more precise understanding of inventory, sales trends, and customer behavior, influencing data-driven decisions to optimize inventory management and sales strategies. Furthermore, high-quality data is crucial for ML algorithms to make accurate forecasts.

By transforming data, organizations can enhance the accuracy and dependability of their forecasting models, ultimately leading to improved inventory management and cost savings.

Data analysis

Data analysis has become increasingly important for businesses because it allows leaders to make informed operational decisions. However, analyzing large volumes of data can be a time-consuming and resource-intensive task. This is where Athena come in. With Athena, businesses can easily query historical sales and inventory data stored in S3 data lakes and combine it with real-time transactional data from Aurora Serverless databases.

The federated capabilities of Athena allow businesses to generate insights by combining datasets without the need to build ETL (extract, transform, and load) pipelines, saving time and resources. This enables businesses to quickly gain a comprehensive understanding of their inventory and sales trends, which can be used to optimize inventory management and forecasting, ultimately improving operations and increasing profitability.

With Athena’s ease of use and powerful capabilities, businesses can quickly analyze their data and gain valuable insights, driving growth and success without the need for complex ETL pipelines.

Forecasting

Inventory forecasting is an important aspect of inventory management for businesses that deal with physical products. Accurately predicting demand for products can help optimize inventory levels, reduce costs, and improve customer satisfaction. ML can help simplify and improve inventory forecasting by making more accurate predictions based on historical data.

SageMaker is a powerful ML platform that you can use to build, train, and deploy ML models for a wide range of applications, including inventory forecasting. In this solution, we use SageMaker to build and train an ML model for inventory forecasting, covering the basic concepts of ML, the data preparation process, model training and evaluation, and deploying the model for use in a production environment.

The solution also introduces the concept of hierarchical forecasting, which involves generating coherent forecasts that maintain the relationships within the hierarchy or reconciling incoherent forecasts. The workshop provides a step-by-step process for using the training capabilities of SageMaker to carry out hierarchical forecasting using synthetic retail data and the scikit-hts package. The FBProphet model was used along with bottom-up and top-down hierarchical aggregation and disaggregation methods. We used Amazon SageMaker Experiments to train multiple models, and the best model was picked out of the four trained models.

Although the approach was demonstrated on a synthetic retail dataset, you can use the provided code with any time series dataset that exhibits a similar hierarchical structure.

Security and authentication

The solution takes advantage of the scalability, reliability, and security of AWS services to provide a comprehensive inventory management and forecasting solution that can help businesses optimize their inventory levels, reduce holding costs, increase revenue, and enhance customer satisfaction. By incorporating user authentication with Amazon Cognito and Amazon API Gateway, the solution ensures that the system is secure and accessible only by authorized users.

Next steps

The next step to build an inventory management and forecasting solution on AWS would be to go through the Inventory Management workshop. In the workshop, you will get hands-on with AWS managed analytics, AI/ML, and database services to dive deep into an end-to-end inventory management solution. By the end of the workshop, you will have gone through the configuration and deployment of the critical pieces that make up an inventory management system.

Conclusion

In conclusion, building an inventory management and forecasting solution on AWS can help businesses optimize their inventory levels, reduce holding costs, increase revenue, and enhance customer satisfaction. With AWS services like IoT Core, Lambda, Aurora Serverless, AWS Glue, Athena, ElastiCache, QuickSight, SageMaker, and Amazon Cognito, businesses can use scalable, reliable, and secure technologies to collect, store, process, and analyze data from various sources.

The end-to-end solution is designed for individuals in various roles, such as business users, data engineers, data scientists, and data analysts, who are responsible for comprehending, creating, and overseeing processes related to retail inventory forecasting. Overall, an inventory management and forecasting solution on AWS can provide businesses with the insights and tools they need to make data-driven decisions and stay competitive in a constantly evolving retail landscape.


About the Authors

Jason D’Alba is an AWS Solutions Architect leader focused on databases and enterprise applications, helping customers architect highly available and scalable solutions.

Navnit Shukla is an AWS Specialist Solution Architect, Analytics, and is passionate about helping customers uncover insights from their data. He has been building solutions to help organizations make data-driven decisions.

Vetri Natarajan is a Specialist Solutions Architect for Amazon QuickSight. Vetri has 15 years of experience implementing enterprise business intelligence (BI) solutions and greenfield data products. Vetri specializes in integration of BI solutions with business applications and enable data-driven decisions.

Sindhura Palakodety is a Solutions Architect at AWS. She is passionate about helping customers build enterprise-scale Well-Architected solutions on the AWS platform and specializes in Data Analytics domain.

The collective thoughts of the interwebz