[$] 5.16 Merge window, part 1

Post Syndicated from original https://lwn.net/Articles/874683/rss

As of this writing, Linus Torvalds has pulled exactly 6,800 non-merge
changesets into the mainline repository for the 5.16 kernel release. That
is probably a little over half of what will arrive during this merge
window, so this is a good time to catch up on what has been pulled so far.
There are many significant changes and some large-scale restructuring of
internal kernel code, but relatively few ground-breaking new
features.

Security updates for Thursday

Post Syndicated from original https://lwn.net/Articles/875106/rss

Security updates have been issued by Fedora (ansible, chromium, kernel, mupdf, python-PyMuPDF, rust, and zathura-pdf-mupdf), openSUSE (qemu and webkit2gtk3), Red Hat (firefox and kpatch-patch), Scientific Linux (firefox), SUSE (qemu, tomcat, and webkit2gtk3), and Ubuntu (firefox and thunderbird).

[Security Nation] Pete Cooper and Irene Pontisso of the UK Cabinet Office on Their Cybersecurity Culture Competition

Post Syndicated from Rapid7 original https://blog.rapid7.com/2021/11/04/pete-cooper-and-irene-pontisso-of-the-uk-cabinet-office-on-their-cybersecurity-culture-competition/

[Security Nation] Pete Cooper and Irene Pontisso of the UK Cabinet Office on Their Cybersecurity Culture Competition

In this special bonus episode of Security Nation, Jen and Tod chat with Pete Cooper and Irene Pontisso from the UK Cabinet Office about their current competition aiming to promote cybersecurity culture among small businesses. They highlight their 9 hypotheses, which touch on the role of human factors, the distinction between cyber culture and security culture, and the importance of leadership. They chat about why they decided to get help validating these ideas through a competition format the “Bakeoff Approach,” as Irene calls to promote collaborative thinking and get a sense of what organizations are doing on these issues today.

The deadline to apply for the competition is fast approaching on Monday, November 8, and winners will be awarded contracts to carry out the competition over 12 weeks, beginning in late November. Check out the Invitation to Tender to submit your entry!

Pete Cooper

[Security Nation] Pete Cooper and Irene Pontisso of the UK Cabinet Office on Their Cybersecurity Culture Competition

Pete Cooper is Deputy Director Cyber Defence within the Government Security Group in the UK Cabinet Office, where he looks over the whole of the Government sector and is responsible for the Government Cyber Security Strategy, standards, and policies as well as responding to serious or cross-government cyber incidents. With a diverse military, private-sector, and government background, he has worked on everything ranging from cyber operations, global cyber security strategies, advising on the nature of state-vs.-state cyber conflict to leading cybersecurity change across industry, public sector, and the global hacker community, including founding and leading the Aerospace Village at DEF CON. A fast jet pilot turned cyber operations advisor, who on leaving the military in 2016, founded the UK’s first multi-disciplinary cyber strategy competition, he is passionate about tackling national and international cybersecurity challenges through better collaboration, diversity, and innovative partnerships. He has a Post Grad in Cyberspace Operations from Cranfield University, is a Non-Resident Senior Fellow at the Cyber Statecraft Initiative of the Scowcroft Centre for Strategy and Security at the Atlantic Council, and is a Visiting Senior Research Fellow in the Department of War Studies, King’s College London.

Irene Pontisso

[Security Nation] Pete Cooper and Irene Pontisso of the UK Cabinet Office on Their Cybersecurity Culture Competition

Irene is Assistant Head of Engagement and Information within the Government Security Group in the UK Cabinet Office. Irene is responsible for the design and strategic oversight of cross-government security education, awareness, and culture-related initiatives. She is also responsible for leading cross-government engagement and press activities for Government Security and the Government Chief Security Officer. Irene started her career in policy and international relations through her roles at the United Nations Platform for Space-Based Information for Disaster Management and Emergency Response (UN-SPIDER). Irene also has significant industry and third-sector experience, and she partnered with the world’s leading law firms to provide free access to legal advice for NGOs on international development projects. She also has experience in leading large-scale exhibitions and policy research in corporate environments. She holds a MSc in International Relations from the University of Bristol and a BSc from the University of Turin.

Show notes

Like the show? Want to keep Jen and Tod in the podcasting business? Feel free to rate and review with your favorite podcast purveyor, like Apple Podcasts.

Want More Inspiring Stories From the Security Community?

Subscribe to Security Nation Today

Building Threat-Informed Defenses: Rapid7 Experts Share Their Thoughts on MITRE ATT&CK

Post Syndicated from Margaret Wei original https://blog.rapid7.com/2021/11/04/building-threat-informed-defenses-rapid7-experts-share-their-thoughts-on-mitre-att-ck/

Building Threat-Informed Defenses: Rapid7 Experts Share Their Thoughts on MITRE ATT&CK

MITRE ATT&CK is considered by practitioners and the analyst community to be the most comprehensive framework of cybersecurity attacks and mitigation techniques available today. MITRE helps the security industry speak the same language and stick to a well-known, common framework.

To get more details on MITRE’s ATT&CK Matrix for Enterprise and its impact, I spoke with 3 members of Rapid7’s Managed Detection and Response team who have firsthand experience working with this framework every day — read our conversation below!

Laying some groundwork here, what are your thoughts on the MITRE ATT&CK framework?

John Fenninger, Manager of Rapid7’s Detection and Response Services, kicked us off by sharing his perspective:

“MITRE ATT&CK is an incredibly valuable framework for both vendors and customers. From things like compliance to more immediate needs like investigating an ongoing attack, MITRE makes it easy to see specific techniques that customers may not have heard of and helps think of tactical moves customers can protect against. With InsightIDR specifically, we align our detections to MITRE to give both our MDR SOC analysts and customers visibility into how far along a threat is on the ATT&CK chain.”

Rapid7 is not only a consumer of the MITRE ATT&CK Framework but an active contributor as well — in 2020, Rapid7 Incident Response Consultant Ted Samuels made a contribution to MITRE around a discovery for group policy objects that is now in the latest version of the ATT&CK framework.

Can you share your perspective on how the MITRE framework is used, and by who?

When it comes to leveraging the MITRE ATT&CK framework, there are 2 key audiences to consider, says Rapid7’s Senior Detection & Response Analyst, Vidya Tambe:

“There are 2 main categories of users — people who write detections and people who do the analysis of the detections, and the MITRE framework is important for both. From the analyst side, we want to know what stage of attack each alert is at, and based on where the alert falls, we know how critical an incident is. With MITRE, we can track how an attacker got to where they are and what kind of escalations they did — overall, it helps us back-track to see what they were able to compromise.

“From the detection writing standpoint, we want to stop attacks before they get too far into someone’s environment. Attacker techniques are always evolving, and while we aim to write detections for all the phases, a primary focus is to try and write detections early on to stop attackers as early in the ATT&CK chain as possible.”

What advice do you have for security teams when it comes to leveraging the MITRE framework to drive successful detection and response?

Rapid7 Detection and Response Analyst Carlo Anez Mazurco shared some advice for teams when it comes to using the MITRE framework at their organization:

“The MITRE Framework allows us to build a threat-informed defense. It shows us the 3 main areas that we need to focus on for data collection, data analysis, and expansion of detections. For teams to successfully utilize the MITRE framework, they need visibility into the following data sources at a minimum:

  • Process and process command line monitoring can be collected via Sysmon, Windows Event Logs, and many EDR platforms
  • File and registry monitoring is also often collected by Sysmon, Windows Event Logs, and many EDR platforms
  • Authentication logs collected from the domain controller
  • Packet capture, especially east/west capture, such as those collected between hosts and enclaves in your network

“Teams need a platform like InsightIDR, Rapid7’s extended detection and response solution, where the data from all of these sources can be ingested. Whatever platform or tool teams choose to use for this data ingestion should include MITRE mappings to attacker behaviors to understand what attackers are trying to do inside our environment at each stage, the TTPs (Tactics, Techniques, Procedures) of each threat actor should be documented in each alert — InsightIDR maps its detections to the MITRE framework to do just this for users.”

You mentioned InsightIDR has MITRE mapping — can you dig a little more into how this impacts customers?

“Our InsightIDR platform helps our customers collect all the necessary data sources,” Carlo continued. “That includes process and process command line monitoring via our endpoint Insight Agent, as well as file monitoring. Plus, authentication logs are collected from domain controllers and also via the Insight Agent, and network flow inside the environment can be gathered through our Insight Network Sensor.

“Our ABA and UBA detections are mapped to the MITRE framework to show our customers which TTPs are the most commonly used by threat actors in their environment, and it gives an insight into the attack patterns in real time. You can see an example of this in one of our past Rapid7 Threat Reports here.

“Additionally, our Rapid7 Threat Intelligence team is always developing new threat detections based on the threat intelligence feeds and public repositories of attacker behaviors. These new detections are mapped to the TTPs inside the MITRE framework and pushed out to all Rapid7 customers.”

We also recently released a new view of Detection Rules in InsightIDR where all detections are mapped to the MITRE ATT&CK Framework, and users can see associated MITRE tactics, techniques, and sub-techniques for detections while performing an investigation.

Interested in learning more?

As you can see, we really value the MITRE ATT&CK framework here at Rapid7. With InsightIDR your detections are vetted by a team of professional SOC analysts and mapped to MITRE to take the guessing game of what an attacker might do next.

If you’re looking to hear more from us on MITRE, watch a quick 3-minute rundown on the framework here.

DDoS Attack Trends for Q3 2021

Post Syndicated from Vivek Ganti original https://blog.cloudflare.com/ddos-attack-trends-for-2021-q3/

DDoS Attack Trends for Q3 2021

DDoS Attack Trends for Q3 2021

The third quarter of 2021 was a busy quarter for DDoS attackers. Cloudflare observed and mitigated record-setting HTTP DDoS attacks, terabit-strong network-layer attacks, one of the largest botnets ever deployed (Meris), and more recently, ransom DDoS attacks on voice over IP (VoIP) service providers and their network infrastructure around the world.

Here’s a summary of the trends observed in Q3 ‘21:

Application-layer (L7) DDoS attack trends:

  • For the second consecutive quarter in 2021, US-based companies were the most targeted in the world.
  • For the first time in 2021, attacks on UK-based and Canada-based companies skyrocketed, making them the second and third most targeted countries, respectively.
  • Attacks on Computer Software, Gaming/ Gambling, IT, and Internet companies increased by an average of 573% compared to the previous quarter.
  • Meris, one of the most powerful botnets in history, aided in launching DDoS campaigns across various industries and countries. You can read more on that here.

Network-layer (L3/4) DDoS attack trends:

  • DDoS attacks increased by 44% worldwide compared to the previous quarter.
  • The Middle East and Africa recorded the largest average attack increase of approximately 80%.
  • Morocco recorded the highest DDoS activity in the third quarter globally — three out of every 100 packets were part of a DDoS attack.
  • While SYN and RST attacks remain the dominant attack method used by attackers, Cloudflare observed a surge in DTLS amplification attacks — recording a 3,549% increase QoQ.
  • Attackers targeted (and continue to target going into the fourth quarter this year) VoIP service providers with massive DDoS attack campaigns in attempts to bring SIP infrastructure down.

Note on avoiding data biases: When we analyze attack trends, we calculate the “DDoS activity” rate, which is the percentage of attack traffic of the total traffic (attack + clean). When reporting application- and network-layer DDoS attack trends, we use this metric, which allows us to normalize the data points and avoid biases toward, for example, a larger Cloudflare data center that naturally handles more traffic and therefore also, possibly, more attacks compared to a smaller Cloudflare data center located elsewhere.

Application-layer DDoS attacks

Application-layer DDoS attacks, specifically HTTP DDoS attacks, are attacks that usually aim to disrupt a web server by making it unable to process legitimate user requests. If a server is bombarded with more requests than it can process, the server will drop legitimate requests and — in some cases — crash, resulting in degraded performance or an outage for legitimate users.

Q3 ‘21 was the quarter of Meris — one of the most powerful botnets deployed to launch some of the largest HTTP DDoS attacks in history.

This past quarter, we observed one of the largest recorded HTTP attacks — 17.2M rps (requests per second) — targeting a customer in the financial services industry. One of the most powerful botnets ever observed, called Meris, is known to be deployed in launching these attacks.

Meris (Latvian for plague) is a botnet behind recent DDoS attacks that have targeted networks or organizations around the world. The Meris botnet infected routers and other networking equipment manufactured by the Latvian company MikroTik. According to MikroTik’s blog, a vulnerability in the MikroTik RouterOS (that was patched after its detection back in 2018) was exploited in still unpatched devices to build a botnet and launch coordinated DDoS attacks by bad actors.

Similar to the Mirai botnet of 2016, Meris is one of the most powerful botnets recorded. While Mirai infected IoT devices with low computational power such as smart cameras, Meris is a growing swarm of networking infrastructure (such as routers and switches) with significantly higher processing power and data transfer capabilities than IoT devices — making them much more potent in causing harm at a larger scale. Be that as it may, Meris is an example of how the attack volume doesn’t necessarily guarantee damage to the target. As far as we know, Meris, despite its strength, was not able to cause significant impact or Internet outages. On the other hand, by tactically targeting the DYN DNS service in 2016, Mirai succeeded in causing significant Internet disruptions.

Application-layer DDoS attacks by industry

The tech and gaming industries were the most targeted industries in Q3 ‘21.

When we break down the application-layer attacks targeted by industry, Computer Software companies topped the charts. The Gaming/Gambling industry, also known to be regular targets of online attacks, was a close second, followed by the Internet and IT industries.

DDoS Attack Trends for Q3 2021

Application-layer DDoS attacks by source country

To understand the origin of the HTTP attacks, we look at the geolocation of the source IP address belonging to the client that generated the attack HTTP requests. Unlike network-layer attacks, source IPs cannot be spoofed in HTTP attacks. A high DDoS activity rate in a given country usually indicates the presence of botnets operating from within.

In the third quarter of 2021, most attacks originated from devices/servers in China, the United States, and India. While China remains in first place, the number of attacks originating from Chinese IPs actually decreased by 30% compared to the previous quarter. Almost one out of every 200 HTTP requests that originated from China was part of an HTTP DDoS attack.

Additionally, attacks from Brazil and Germany shrank by 38% compared to the previous quarter. Attacks originating from the US and Malaysia reduced by 40% and 45%, respectively.

DDoS Attack Trends for Q3 2021

Application-layer DDoS attacks by target country

In order to identify which countries are targeted the most by L7 attacks, we break down the DDoS activity by our customers’ billing countries.

For the second consecutive time this year, organizations in the United States were targeted the most by L7 DDoS attacks in the world, followed by those in the UK and Canada.

DDoS Attack Trends for Q3 2021

Network-layer DDoS attacks

While application-layer attacks target the application (Layer 7 of the OSI model) running the service that end users are trying to access, network-layer attacks aim to overwhelm network infrastructure (such as in-line routers and servers) and the Internet link itself.

Mirai-variant botnet strikes with a force of 1.2 Tbps.

Q3 ‘21 was also the quarter when the infamous Mirai made a resurgence. A Mirai-variant botnet launched over a dozen UDP- and TCP-based DDoS attacks that peaked multiple times above 1 Tbps, with a max peak of approximately 1.2 Tbps. These network-layer attacks targeted Cloudflare customers on the Magic Transit and Spectrum services. One of these targets was a major APAC-based Internet services, telecommunications, and hosting provider and the other was a gaming company. In all cases, the attacks were automatically detected and mitigated without human intervention.

Network-layer DDoS attacks by month

September was, by far, the busiest month for attackers this year.

Q3 ‘21 accounted for more than 38% of all attacks this year. September was the busiest month for attackers so far in 2021 — accounting for over 16% of all attacks this year.

DDoS Attack Trends for Q3 2021

Network-layer DDoS attacks by attack rate

Most attacks are ‘small’ in size, but the number of larger attacks continues to rise.

There are different ways of measuring the size of a L3/4 DDoS attack. One is the volume of traffic it delivers, measured as the bit rate (specifically, terabits per second or gigabits per second). Another is the number of packets it delivers, measured as the packet rate (specifically, millions of packets per second).

Attacks with high bit rates attempt to cause a denial-of-service event by clogging the Internet link, while attacks with high packet rates attempt to overwhelm the servers, routers, or other in-line hardware appliances. Appliances dedicate a certain amount of memory and computation power to process each packet. Therefore, by bombarding it with many packets, the appliance can be left with no further processing resources. In such a case, packets are “dropped,” i.e., the appliance is unable to process them. For users, this results in service disruptions and denial of service.

The distribution of attacks by their size (in bit rate) and month is shown below. Interestingly enough, all attacks over 400 Gbps took place in August, including some of the largest attacks we have seen; multiple attacks peaked above 1 Tbps and reached as high as 1.2 Tbps.

DDoS Attack Trends for Q3 2021

Packet rate
As seen in previous quarters, the majority of attacks observed in Q3 ‘21 were relatively small in size — nearly 89% of all attacks peaked below 50K packets per second (pps). While a majority of attacks are smaller in size, we observed that the number of larger attacks is increasing QoQ — attacks that peaked above 10M pps increased by 142% QoQ.

DDoS Attack Trends for Q3 2021

Attacks of packet rates ranging from 1-10 million packets per second increased by 196% compared to the previous quarter. This trend is similar to what we observed the last quarter as well, suggesting that larger attacks are increasing.

DDoS Attack Trends for Q3 2021

Bit rate
From the bit rate perspective, a similar trend was observed — a total of 95.4% of all attacks peaked below 500 Mbps.

DDoS Attack Trends for Q3 2021

QoQ data shows that the number of attacks of sizes ranging from 500 Mbps to 10 Gbps saw massive increases of 126% to 289% compared to the previous quarter. Attacks over 100 Gbps decreased by nearly 14%.

The number of larger bitrate attacks increased QoQ (with the one exception being attacks over 100 Gbps, which decreased by nearly 14% QoQ). In particular, attacks ranging from 500 Mbps to 1 Gbps saw a surge of 289% QoQ and those ranging from 1 Gbps to 100 Gbps surged by 126%.

This trend once again illustrates that, while (in general) a majority of the attacks are indeed smaller, the number of “larger” attacks is increasing. This suggests that more attackers are garnering more resources to launch larger attacks.

DDoS Attack Trends for Q3 2021

Network-layer DDoS attacks by duration

Most attacks remain under one hour in duration, reiterating the need for automated always-on DDoS mitigation solutions.

We measure the duration of an attack by recording the difference between when it is first detected by our systems as an attack and the last packet we see with that attack signature. As in previous quarters, most of the attacks are short-lived. To be specific, 94.4% of all DDoS attacks lasted less than an hour. On the other end of the axis, attacks over 6 hours accounted for less than 0.4% in Q3 ‘21, and we did see a QoQ increase of 165% in attacks ranging 1-2 hours. Be that as it may, a longer attack does not necessarily mean a more dangerous one.

DDoS Attack Trends for Q3 2021

Short attacks can easily go undetected, especially burst attacks that, within seconds, bombard a target with a significant number of packets, bytes, or requests. In this case, DDoS protection services that rely on manual mitigation by security analysis have no chance in mitigating the attack in time. They can only learn from it in their post-attack analysis, then deploy a new rule that filters the attack fingerprint and hope to catch it next time. Similarly, using an “on-demand” service, where the security team will redirect traffic to a DDoS provider during the attack, is also inefficient because the attack will already be over before the traffic routes to the on-demand DDoS provider.

Cloudflare recommends that companies use automated, always-on DDoS protection services that analyze traffic and apply real-time fingerprinting fast enough to block the short-lived attacks. Cloudflare analyzes traffic out-of-path, ensuring that DDoS mitigation does not add any latency to legitimate traffic, even in always-on deployments. Once an attack is identified, our autonomous edge DDoS protection system (dosd) generates and applies a dynamically crafted rule with a real-time signature. Pre-configured firewall rules comprising allow/deny lists for known traffic patterns take effect immediately.

Attack vectors

SYN floods remain attackers’ favorite method of attack, while attacks over DTLS saw a massive surge — 3,549% QoQ.

An attack vector is the term used to describe the method that the attacker utilizes in their attempt to cause a denial-of-service event.

As observed in previous quarters, attacks utilizing SYN floods remain the most popular method used by attackers.

A SYN flood attack is a DDoS attack that works by exploiting the very foundation of the TCP protocol — the stateful TCP connection between a client and a server as a part of the 3-way TCP handshake. As a part of the TCP handshake, the client sends an initial connection request packet with a synchronize flag (SYN). The server responds with a packet that contains a synchronized acknowledgment flag (SYN-ACK). Finally, the client responds with an acknowledgment (ACK) packet. At this point, a connection is established and data can be exchanged until the connection is closed. This stateful process can be abused by attackers to cause denial-of-service events.

By repeatedly sending SYN packets, the attacker attempts to overwhelm a server or the router’s connection table that tracks the state of TCP connections. The server replies with a SYN-ACK packet, allocates a certain amount of memory for each given connection, and falsely waits for the client to respond with the final ACK. Given a sufficient number of connections occupying the server’s memory, the server is unable to allocate further memory for legitimate clients, causing the server to crash or preventing it from handling legitimate client connections, i.e., a denial-of-service event.

More than half of all attacks observed over our network were SYN floods. This was followed by RST, ACK, and UDP floods.

DDoS Attack Trends for Q3 2021

Emerging threats

While SYN and RST floods remain popular overall, when we look at emerging attack vectors — which helps us understand what new vectors attackers are deploying to launch attacks — we observed a massive spike in DTLS amplification attacks. DTLS floods increased by 3,549% QoQ.

Datagram Transport Layer Security (DTLS) is a protocol similar to Transport Layer Security (TLS) designed to provide similar security guarantees to connectionless datagram-based applications to prevent message forgery, eavesdropping, or tampering. DTLS, being connectionless, is specifically useful for establishing VPN connections, without the TCP meltdown problem. The application is responsible for reordering and other connection properties.

Just as with most UDP-based protocols, DTLS is spoofable and being used by attackers to generate reflection amplification attacks to overwhelm network gateways.

DDoS Attack Trends for Q3 2021

Network-layer DDoS attacks by country

While Morocco topped the charts in terms of the highest network attack rate observed, Asian countries closely followed.

When analyzing network-layer DDoS attacks, we bucket the traffic by the Cloudflare edge data center locations where the traffic was ingested, and not by the source IP. The reason for this is that, when attackers launch network-layer attacks, they can spoof the source IP address in order to obfuscate the attack source and introduce randomness into the attack properties, which may make it harder for simple DDoS protection systems to block the attack. Hence, if we were to derive the source country based on a spoofed source IP, we would get a spoofed country.

Cloudflare is able to overcome the challenges of spoofed IPs by displaying the attack data by the location of the Cloudflare data center in which the attack was observed. We are able to achieve geographical accuracy in our report because we have data centers in over 250 cities around the world.

Worldwide

DDoS Attack Trends for Q3 2021

To view all regions and countries, check out the Radar DDoS Report dashboard’s interactive map.

A note on recent attacks on voice over-IP service providers — and ransom DDoS attacks

DDoS Attack Trends for Q3 2021

We recently reported and provided an update on the surge in DDoS attacks on VoIP service providers — some of who have also received ransom threats. As of early Q4 ‘21, this attack campaign is still ongoing and current. At Cloudflare, we continue to onboard VoIP service providers and shield their applications and networks against attacks.

HTTP attacks against API gateways and the corporate websites of the providers have been combined with network-layer and transport-layer attacks against VoIP infrastructures.

Examples include:

  1. TCP floods targeting stateful firewalls: These are being used in “trial-and-error” type attacks. They are not very effective against telephony infrastructure specifically (because it is mostly UDP) but very effective at overwhelming stateful firewalls.
  2. UDP floods targeting SIP infrastructure: Floods of UDP traffic that have no well-known fingerprint, aimed at critical VoIP services. Generic floods like this may look like legitimate traffic to unsophisticated filtering systems.
  3. UDP reflection targeting SIP infrastructure: These methods, when targeted at SIP or RTP services, can easily overwhelm Session Border Controllers (SBCs) and other telephony infrastructure. The attacker seems to learn enough about the target’s infrastructure to target such services with high precision.
  4. SIP protocol-specific attacks: Attacks at the application layer are of particular concern because of the higher resource cost of generating application errors versus filtering on network devices.

Organizations also continue to receive ransom notes that threaten attacks in exchange for bitcoin. Ransomware and ransom DDoS attacks, for the fourth consecutive quarter, continue to be a germane threat to organizations all over the world.

Cloudflare products close off several threat vectors that can lead to a ransomware infection and ransom DDoS attacks:

  • Cloudflare DNS filtering blocks unsafe websites.
  • Cloudflare Browser Isolation prevents drive-by downloads and other browser-based attacks.
  • A Zero Trust architecture can help prevent ransomware from spreading within a network.
  • Magic Transit protects organizations’ networks against DDoS attacks using BGP route redistribution — without impacting latency.

Helping build a better Internet

Cloudflare was founded on the mission to help build a better Internet. And part of that mission is to build an Internet where the impact of DDoS attacks is a thing of the past. Over the last 10 years, we have been unwavering in our efforts to protect our customers’ Internet properties from DDoS attacks of any size or kind. In 2017, we announced unmetered DDoS protection for free — as part of every Cloudflare service and plan, including the Free plan — to make sure every organization can stay protected and available. Organizations big and small have joined Cloudflare over the past several years to ensure their websites, applications, and networks are secure from DDoS attacks, and remain fast and reliable.

But cyberattacks come in various forms, not just DDoS attacks. Malicious bots, ransomware attacks, email phishing, and VPN / remote access hacks are some many attacks that continue to plague organizations of all sizes globally. These attacks target websites, APIs, applications, and entire networks — which form the lifeblood of any online business. That is why the Cloudflare security portfolio accounts for everything and everyone connected to the Internet.

To learn more about Cloudflare DDoS or our network services, create an account or reach out to us.

US Blacklists NSO Group

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/11/us-blacklists-nso-group.html

The Israeli cyberweapons arms manufacturer — and human rights violator, and probably war criminal — NSO Group has been added to the US Department of Commerce’s trade blacklist. US companies and individuals cannot sell to them. Aside from the obvious difficulties this causes, it’ll make it harder for them to buy zero-day vulnerabilities on the open market.

This is another step in the ongoing US actions against the company.

The Raspberry Pi Build HAT and LEGO® components at our CoderDojo

Post Syndicated from Mark Calleja original https://www.raspberrypi.org/blog/raspberry-pi-build-hat-lego-education-robotics-coderdojo/

As so many CoderDojos around the world, our office-based CoderDojo hadn’t been able to bring learners together in person since the start of the coronavirus pandemic. So we decided that our first time back in the Raspberry Pi Foundation headquarters should be something special. Having literally just launched the new Raspberry Pi Build HAT for programming LEGO® projects with Raspberry Pi computers, we wanted to celebrate our Dojo’s triumphant return to in-person session by offering a ‘LEGO bricks and Raspberry Pi’ activity!

A robot buggy built by young people with LEGO bricks and the Raspberry Pi Build HAT.

Back in person, with new ways to create with code

The Raspberry Pi Build HAT allows learners to build and program projects with Raspberry Pi computers and LEGO® Technic™ motors and sensors from the LEGO® Education SPIKE™ Portfolio.

A close-up of the Raspberry Pi Build HAT on a Maker Plate and connected to electronic components.

What better way could there be to get the more experienced coders among our Dojo’s young people (Ninjas) properly excited to be back? We knew they were fond of building things with LEGO bricks, as so many young people are, so we were sure they would have great fun with this activity!

Two girls work together on a coding project.

For our beginners, we set up Raspberry Pi workstations and got them coding the projects on the Home island on our brand-new Code Club World platform, which they absolutely loved, so their jealousy was mitigated somewhat. 

Being able to rely on your learners’ existing skills in making the physical build leaves you a lot more time to support them with what they’re actually here to learn: the coding and digital making skills.

We wanted to keep our first Dojo back small, so for the ‘LEGO bricks and Raspberry Pi’ activity, we set up just four workstations, each with a Raspberry Pi 4, with 4GB RAM and a Raspberry Pi Build HAT on top, and a LEGO Education SPIKE Prime set. We put eight participants into teams of two, and made sure that all of them brought a little experience with text-based coding, because we wanted them to be able to focus on making projects in their own style, rather than first learning the basics of coding in Python. Then we offered our Ninjas the choice of the first two projects in the Introduction to the Raspberry Pi Build HAT and LEGO path: make Pong game controllers, or make a remote-controlled robot buggy. As I had predicted, all the teams chose to make a robot buggy!

""

Teamwork and design

The teams of Ninjas were immediately off and making — in fact, they couldn’t wait to get the lids off the boxes of brightly coloured bricks and beams!

Two young people work as a team at a CoderDojo coding club.

Our project instructions focus primarily on supporting learners through coding and testing the mechanics of their creations, leaving the design and build totally up to them. This was evidenced by the variety of buggy designs we saw at the project showcase at the end of the two-hour Dojo session!

One of the amazing things Raspberry Pi makes possible when you use it with the Raspberry Pi Build HAT and SPIKE™ Prime set: it’s simple to make the Raspberry Pi at the heart of the creation talk to a mobile device via Bluetooth, and off you go controlling what you’ve created via a phone or tablet.

While beginner-friendly, the projects in the Introduction path involve a mix of coding, testing, designing, and building. So it required focus and solid teamwork for the Ninjas to finish their buggies in time for the project showcase. And this is where building with LEGO pieces was really helpful.

Coding front and centre, thanks to the Raspberry Pi Build HAT

Having LEGO bricks and the Build HAT available to create their Raspberry Pi–powered robot buggies made it easy for our Ninjas to focus on writing the code to get their buggies to work. They weren’t relying on crafting skills or duct tape and glue guns to make a chassis in the relatively short time they had, and the coding could be front and centre for them.

The most exciting part for the Ninjas was that they were building remote-controlled robot buggies. This is one of the amazing things Raspberry Pi makes possible when you use it with the Build HAT and SPIKE™ Prime set: it’s simple to make the Raspberry Pi at the heart of the creation talk to a mobile device via Bluetooth, and off you go controlling what you’ve created via a phone or tablet.

The LEGO Technic motors that are part of the LEGO Education SPIKE Prime set are of really high quality, and they’re super easy to program with the Build HAT and its Python library! You can change the motors’ speed by setting a single parameter in your code. You can also easily write code to set or read the motors’ exact angle (their absolute position). That allows you to finely control the motors’ movements, or to use them as sensors.

Some of our teams, inspired by everything the SPIKE Prime set has to offer, tried out programming the set’s sensors, to switch their robot buggy on or help it avoid obstacles. Because we only had about 90 minutes of digital making, not all teams managed to finish adding the extra features they wanted — but next time for sure!

A young person programs a robot buggy built with LEGO bricks and the Raspberry Pi Build HAT.

With a little more time (or another Dojo session), it would have been possible for the Ninjas to make some very advanced remote-controlled buggies indeed, complete with headlights, brake lights, sensors, and sound.

Learning with LEGO® elements and Raspberry Pi computers

If you have access to LEGO Education SPIKE Prime sets for your learners, then the Raspberry Pi Build HAT is a great addition that allows them to build complex robotics projects with very simple code — but I think that’s not its main benefit.

A robot buggy built by young people with LEGO bricks and the Raspberry Pi Build HAT.

Because the Build HAT allows your learners to work with LEGO elements, you know that many of them already understand one aspect of the creation process: they’ve got experience of using LEGO bricks to solve a problem. In a coding or STEM club session, or in a classroom lesson, you can only give your learners limited amount of time to complete a project, or get their project prototype to a stable point. So being able to rely on your learners’ existing skills in making the physical build leaves you a lot more time to support them with what they’re actually here to learn: the coding and digital making skills.

You and your young people next!

The projects using the Raspberry Pi Build HATs were such a hit, we’ll be getting them and the LEGO Education SPIKE Prime sets out at every Dojo session from now on! We’re excited to see what young people around the world will be creating thanks to our new collaboration with LEGO Education.

Have you used the Raspberry Pi Build HAT with your learners or young people at home yet? Share their stories and creations in the comments here, or on social media using #BuildHAT.

The post The Raspberry Pi Build HAT and LEGO® components at our CoderDojo appeared first on Raspberry Pi.

[$] Trojan Source: tricks (no treats) with Unicode

Post Syndicated from original https://lwn.net/Articles/874951/rss

A new security vulnerability that was disclosed
on November 1
has some interesting properties. “Trojan Source“, as it has been
dubbed, is effectively an attack on human perceptions, especially as they
are filtered through the tools used for source-code review. While the
specifics of the flaw are new, this kind of trickery is not completely
novel, but Trojan Source finds another way to confuse the humans who are
in the loop.

Introducing Amazon S3 shuffle in AWS Glue

Post Syndicated from Anubhav Awasthi original https://aws.amazon.com/blogs/big-data/introducing-amazon-s3-shuffle-in-aws-glue/

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning (ML), and application development. In AWS Glue, you can use Apache Spark, which is an open-source, distributed processing system for your data integration tasks and big data workloads. Apache Spark utilizes in-memory caching and optimized query execution for fast analytic queries against your datasets, which are split into multiple partitions so that you can execute different transformations in parallel.

Shuffling is an important step in a Spark job whenever data is rearranged between partitions. The groupByKey(), reduceByKey(), join(), and distinct() are some examples of wide transformations that can cause a shuffle. During a shuffle, data is written to disk and transferred across the network. As a result, the shuffle operation is often constrained by the available local disk capacity, or data skew, which can cause straggling executors. Spark often throws a No space left on device or MetadataFetchFailedException error when there is not enough disk space left on the executor and there is no recovery.

This post introduces a new Spark shuffle manager available in AWS Glue that disaggregates Spark compute and shuffle storage by utilizing Amazon Simple Storage Service (Amazon S3) to store Spark shuffle and spill files. Using Amazon S3 for Spark shuffle storage lets you run data-intensive workloads much more reliably.

Understanding the shuffle operation in AWS Glue

Spark creates physical plans for running your workflow, called Directed Acyclic Graphs (DAGs). The DAG represents a series of transformations on your dataset, each resulting in a new immutable RDD. All of the transformations in Spark are lazy, in that they are not computed until an action is called to generate results. There are two types of transformations:

  • Narrow transformation – Such as map, filter, union, and mapPartition, where each input partition contributes to only one output partition.
  • Wide transformation – Such as join, groupBykey, reduceByKey, and repartition, where each input partition contributes to many output partitions.

In Spark, a shuffle occurs whenever data is rearranged between partitions. This is required because the wide transformation needs information from other partitions in order to complete its processing. Spark gathers the required data from each partition and combines it into a new partition. During a shuffle phase, all Spark map tasks write shuffle data to a local disk that is then transferred across the network and fetched by Spark reduce tasks. With AWS Glue, workers write shuffle data on local disk volumes attached to the AWS Glue workers.

In addition to shuffle writes, Spark uses local disk to spill data from memory that exceeds the heap space defined by the spark.memory.fraction configuration parameter. Shuffle spill (memory) is the size of the de-serialized form of the data in the memory at the time when the worker spills it. Whereas shuffle spill (disk) is the size of the serialized form of the data on disk after the worker has spilled.

Challenges

Spark uses local disk for storing intermediate shuffle and shuffle spills. This introduces the following key challenges:

  • Hitting local storage limits – If you have a Spark job that computes transformations over a large amount of data, and results in either too much spill or shuffle or both, then you might get a failed job with  java.io.IOException: No space left on device exception if the underlying storage has filled up.
  • Co-location of storage with executors – If an executor is lost, then shuffle files are lost as well. This leads to several task and stage retries, as Spark tries to recompute stages in order to recover lost shuffle data. Spark natively provides an external shuffle service that lets it store shuffle data independent to the life of executors. But the shuffle service itself becomes a point of failure and must always be up in order to serve shuffle data. Additionally, shuffles are still stored on local disk, which might run out of space for a large job.

To illustrate one of the preceding scenarios, let’s use the query q80.sql from the standard TPC-DS 3 TB dataset as an example. This query attempts to calculate the total sales, returns, and eventual profit realized during a specific time frame. It involves multiple wide transformations (shuffles) caused by left outer join, group by, and union all. Let’s run the following query with 10 G1.x AWS Glue DPU (data processing unit). For the G.1X worker type, each worker maps to 1 DPU and 1 executor. 10 G1.x workers account for a total of 640GB of disk space. See the following sql query:

with ssr as
 (select  s_store_id as store_id,
          sum(ss_ext_sales_price) as sales,
          sum(coalesce(sr_return_amt, 0)) as returns,
          sum(ss_net_profit - coalesce(sr_net_loss, 0)) as profit
  from store_sales left outer join store_returns on
         (ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number),
     date_dim, store, item, promotion
 where ss_sold_date_sk = d_date_sk
       and d_date between cast('2000-08-23' as date)
                  and (cast('2000-08-23' as date) + interval '30' day)
       and ss_store_sk = s_store_sk
       and ss_item_sk = i_item_sk
       and i_current_price > 50
       and ss_promo_sk = p_promo_sk
       and p_channel_tv = 'N'
 group by s_store_id),
 csr as
 (select  cp_catalog_page_id as catalog_page_id,
          sum(cs_ext_sales_price) as sales,
          sum(coalesce(cr_return_amount, 0)) as returns,
          sum(cs_net_profit - coalesce(cr_net_loss, 0)) as profit
  from catalog_sales left outer join catalog_returns on
         (cs_item_sk = cr_item_sk and cs_order_number = cr_order_number),
     date_dim, catalog_page, item, promotion
 where cs_sold_date_sk = d_date_sk
       and d_date between cast('2000-08-23' as date)
                  and (cast('2000-08-23' as date) + interval '30' day)
        and cs_catalog_page_sk = cp_catalog_page_sk
       and cs_item_sk = i_item_sk
       and i_current_price > 50
       and cs_promo_sk = p_promo_sk
       and p_channel_tv = 'N'
 group by cp_catalog_page_id),
 wsr as
 (select  web_site_id,
          sum(ws_ext_sales_price) as sales,
          sum(coalesce(wr_return_amt, 0)) as returns,
          sum(ws_net_profit - coalesce(wr_net_loss, 0)) as profit
  from web_sales left outer join web_returns on
         (ws_item_sk = wr_item_sk and ws_order_number = wr_order_number),
     date_dim, web_site, item, promotion
 where ws_sold_date_sk = d_date_sk
       and d_date between cast('2000-08-23' as date)
                  and (cast('2000-08-23' as date) + interval '30' day)
        and ws_web_site_sk = web_site_sk
       and ws_item_sk = i_item_sk
       and i_current_price > 50
       and ws_promo_sk = p_promo_sk
       and p_channel_tv = 'N'
 group by web_site_id)
 select channel, id, sum(sales) as sales, sum(returns) as returns, sum(profit) as profit
 from (select
        'store channel' as channel, concat('store', store_id) as id, sales, returns, profit
      from ssr
      union all
      select
        'catalog channel' as channel, concat('catalog_page', catalog_page_id) as id,
        sales, returns, profit
      from csr
      union all
      select
        'web channel' as channel, concat('web_site', web_site_id) as id, sales, returns, profit
      from  wsr) x
 group by rollup (channel, id)
 order by channel, id

The following screenshot shows the AWS Glue job run details from the Apache Spark web UI:

The job runs for about 1 hour and 25 minutes, then we start observing task failures. Spark ends up stopping the stage and canceling the job when the task retries also fail.

The following screenshots show the aggregated metrics for the failed stage, as well as how much data is spilled to disk by individual executors:

As seen in the Shuffle Write metric from the above Spark UI screenshot, all 10 workers shuffle over 50 GB of data. Further writes aren’t allowed, and tasks start failing with a “No space left on device” error.

The remaining storage is occupied by data that is spilled to disk, as seen in the Shuffle Spill (Disk) metric from the above Spark UI screenshot. This failed job is a classic example of a data-intensive transformation where Spark is both shuffling and spilling to disk when executor memory is filled.

Solution overview

We have various methods for overcoming the disk space error:

  • Scale out – Increase the number of workers. This incurs an increase in cost. However, scaling out might not always work, especially if your data is heavily skewed on a few keys. Fixing skewness will require considerable modifications to your Spark application logic.
  • Increase shuffle partitions – Increasing the shuffle partitions can sometimes help overcome space errors. However, this might not always work, and therefore is unreliable.
  • Disaggregate compute and storage – This approach presents several of the advantages of not only scaling storage for large shuffles, but also adding reliability in the event of node failures because shuffle data is independently stored. Following are few implementations of this disaggregated approach:
    • Dedicated intermediate storage cluster – In this approach, you use an additional fleet of shuffle services to serve intermediate shuffle. It has several advantages, such as merging shuffle files and sequential I/O, but it introduces an overhead of fleet maintenance from both operations, as well as a cost standpoint. For examples of this approach, see Cosco: An Efficient Facebook-Scale Shuffle Service and Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service.
    • Serverless storage – AWS Glue implements a different approach in which you utilize Amazon S3, a cost-effective managed and serverless storage, to store intermediate shuffle data. This design does not depend upon a dedicated daemon, such as shuffle service, to preserve shuffle files. This lets you elastically scale your Spark job without the overhead of running, operating, and maintaining additional storage or compute nodes.

With AWS Glue 2.0, you can now use Amazon S3 to store Spark shuffle and spill data. Amazon S3 is an object storage service that offers industry-leading scalability, data availability, security, and performance. This gives complete elasticity to Spark jobs, thereby allowing you to run your most data intensive workloads reliably.

The following diagram illustrates how Spark map tasks write the shuffle and spill files to the given Amazon S3 shuffle bucket. Reducer tasks consider the shuffle blocks as remote blocks and read them from the same shuffle bucket.

Use Amazon S3 to store shuffle and spill data

The following job parameters enable and tune Spark to use S3 buckets for storing shuffle and spill data. You can also enable at-rest encryption when writing shuffle data to Amazon S3 by using security configuration settings.

Key  Value  Explanation
–write-shuffle-files-to-s3 TRUE This is the main flag, which tells Spark to use S3 buckets for writing and reading shuffle data.
–write-shuffle-spills-to-s3 TRUE This is an optional flag that lets you offload spill files to S3 buckets, which provides additional resiliency to your Spark job. This is only required for large workloads that spill a lot of data to disk. This flag is disabled by default.
–conf spark.shuffle.glue.s3ShuffleBucket=S3://<shuffle-bucket> This is also optional, and it specifies the S3 bucket where we write the shuffle files. By default, we use —TempDir/shuffle-data.

You can also use the AWS Glue Studio console to enable Amazon S3 based shuffle or spill. You can choose the preceding properties from pre-populated options in the Job parameters section.

Results

Let’s run the same q80.sql query with Amazon S3 shuffle enabled. We can view the shuffle files stored in the S3 bucket in the following format:

shuffle_<jobid>_<mapperid>_<reducerid>.data/index

Two kinds of files are created:

  • Data – Stores the shuffle output of the current task
  • Index – Stores the classification information of the data in the data file by storing partition offsets

The following screenshots shows example shuffle directories and shuffle files:

The following screenshot shows the aggregated metrics from the Spark UI:

The following are a few key highlights:

  • q80.sql, which had failed earlier after 1 hour and 25 minutes, and was able to complete only 13 out of 18 stages, finished successfully in about 2 hours and 53 minutes, completing all 18 stages.
  • We were able to shuffle 479.7 GB of data without worrying about storage limits.
  • Additional workers aren’t required to scale storage, which provides substantial cost savings.

Considerations and best practices

Keep in mind the following best practices when considering this solution:

  • This feature is recommended when you want to ensure the reliable execution of your data intensive workloads that create a large amount of shuffle or spill data. Writing and reading shuffle files from Amazon S3 is marginally slower when compared to local disk for our experiments with TPC-DS queries. S3 shuffle performance would be impacted by the number and size of shuffle files. For example, S3 could be slower for reads as compared to local storage if you have a large number of small shuffle files or partitions in your Spark application.
  • You can use this feature if your job frequently suffers from No space left on device issues.
  • You can use this feature if your job frequently suffers fetch failure issues (org.apache.spark.shuffle.MetadataFetchFailedException).
  • You can use this feature if your data is skewed.
  • We recommend setting the S3 bucket lifecycle policies on the shuffle bucket (spark.shuffle.glue.s3ShuffleBucket) in order to clean up old shuffle data.
  • At the time of writing this blog, this feature is currently available on AWS Glue 2.0 and Spark 2.4.3.

Conclusion

This post discussed how we can independently scale storage in AWS Glue without adding additional workers. With this feature, you can expect jobs that are processing terabytes of data to run much more reliably. Happy shuffling!


About the Authors

Anubhav Awasthi is a Big Data Specialist Solutions Architect at AWS. He works with customers to provide architectural guidance for running analytics solutions on Amazon EMR, Amazon Athena, AWS Glue, and AWS Lake Formation.

Rajendra Gujja is a Software Development Engineer on the AWS Glue team. He is passionate about distributed computing and everything and anything to do with the data.

Mohit Saxena is a Software Engineering Manager on the AWS Glue team. His team works on distributed systems for efficiently managing data lakes on AWS and optimizes Apache Spark for performance and reliability.

Integrate AWS Glue DataBrew and Amazon PinPoint to launch marketing campaigns

Post Syndicated from Suraj Shivananda original https://aws.amazon.com/blogs/big-data/integrate-aws-glue-databrew-and-amazon-pinpoint-to-launch-marketing-campaigns/

Marketing teams often rely on data engineers to provide a consumer dataset that they can use to launch marketing campaigns. This can sometimes cause delays in launching campaigns and consume data engineers’ bandwidth. The campaigns are often launched using complex solutions that are either code heavy or using licensed tools. The processes of both extract, transform, and load (ETL) and launching campaigns need engineers who know coding, take time to build, and require maintenance overtime.

You can now simplify this process by integrating AWS services like Amazon PinPoint, AWS Glue DataBrew, and AWS Lambda. We can use DataBrew (a visual data preparation service) to perform ETL, Amazon PinPoint (an outbound and inbound marketing communications service) to launch campaigns, and Lambda functions to achieve end-to-end automation. This solution helps reduce time to production, can be implemented by less-technical folks because it doesn’t require coding, and has no licensing costs involved.

In this post, we walk through the end-to-end workflow and how to implement this solution.

Solution overview

In this solution, the source datasets are pushed to Amazon Simple Storage Service (Amazon S3) using SFTP (batch data) or Amazon API Gateway (streaming data) services. DataBrew jobs perform data transformations and prepare the data for Amazon PinPoint to launch campaigns. We use Lambda to achieve end-to-end automation, which includes an alert system via Amazon Simple Notification Service (Amazon SNS) to alert relevant teams of anomalies or errors.

The workflow includes the following steps:

  1. We can load or push data to Amazon S3 either through AWS Command Line Interface (AWS CLI) commands, AWS access keys, an AWS transfer service (SFTP), or an API Gateway service.
  2. We use DataBrew to perform ETL jobs.
  3. These ETL jobs are either triggered (using Lambda functions) or scheduled.
  4. The processed dataset from DataBrew is imported to Amazon PinPoint as segments using trigger-based Lambda functions.
  5. Marketing teams use Amazon PinPoint to launch campaigns.
  6. We can also perform data profiling using DataBrew.
  7. Finally, we export Amazon PinPoint metrics data to Amazon S3 using Amazon Kinesis Data Firehose. We can use the data for further analysis using Amazon Athena and Amazon QuickSight.

The following diagram illustrates our solution architecture.

As a bonus step, we can create a simple web portal using AWS Amplify that makes API calls to Amazon PinPoint to launch campaigns in case we want to restrict users from launching campaigns using Amazon PinPoint from the AWS Management Console. This web portal can also provide basic metrics generated by Amazon PinPoint. You can also use it as a product or platform that anyone can use to launch campaigns.

To implement the solution, we complete the following steps:

  1. Create the source datasets using both batch ingestion and Amazon Kinesis streaming services.
  2. Build an automated data ingestion pipeline that transforms the source data and makes it campaign ready.
  3. Build a DataBrew data profile job, after which you can view profile metrics and alert teams in case of any anomalies in the source data.
  4. Launch a campaign using Amazon PinPoint.
  5. Export Amazon PinPoint project events to Amazon S3 using Kinesis Data Firehose.

Create the source datasets

In this step, we create the source datasets from both an SFTP server and Kinesis in Amazon S3.

First, we create an S3 bucket to store the source, processed, and campaign-ready data.

We can ingest data into AWS through many different methods. For this post, we consider methods for batch and streaming ingestion.

For batch ingestion, we create an SFTP-enabled server for source data ingestion. We can configure SFTP servers to store data on either Amazon S3 or Amazon Elastic File System (Amazon EFS). For this post, we configure an SFTP server to store data on Amazon S3.

We use API Gateway and Kinesis for stream ingestion. We can also push data to Kinesis directly, but using API Gateway is preferred because it’s easy to handle cross-account and authentication issues. For instructions on integrating API Gateway and a Kinesis stream, see Tutorial: Create a REST API as an Amazon Kinesis proxy in API Gateway.

Each dataset should be in its own S3 folder: s3://<bucket-name>/src-files/datasource1/, s3://<bucket-name>/src-files/datasource2/, and so on.

Build the automated data ingestion pipeline

In this step, we build the end-to-end automated data ingestion pipeline that transforms the source data and makes it campaign ready.

We use DataBrew for data preparation and data quality, and Lambda and Amazon SNS for automation and alerting. Amazon PinPoint can then use this campaign data to launch campaigns.

Our pipeline performs the following functions:

  1. Run transformations on source datasets.
  2. Merge the datasets to make the data campaign ready.
  3. Perform data quality checks.
  4. Alert relevant teams in case of anomalies in the data quality.
  5. Alert relevant teams in case of DataBrew job failures.
  6. Import the campaign-ready dataset as a segment in Amazon PinPoint.

We can either have one DataBrew project to perform necessary transformations on each dataset and merge all the datasets into one final dataset, or have one DataBrew project for each dataset and another project that merges all the transformed datasets. The advantage of having individual projects for each dataset is that we’re decoupling all data sources, so an issue in one data source doesn’t impact another. We use this latter approach in this post.

For instructions on building the DataBrew job, see Creating and working with AWS Glue DataBrew recipe jobs.

DataBrew provides more than 250 transformations. For this post, we add the following transformations to clean the source datasets:

  • Validate column values and add default values if missing
  • Convert column values to a standard format, such as standardize date values or convert lowercase to uppercase (Amazon PinPoint is case-sensitive)
  • Remove special characters if needed
  • Split values if needed
  • Check for duplicates
  • Add audit columns

You could also add a few data quality recipe steps to the recipe.

DataBrew jobs can be scheduled or trigger based (through a Lambda function). For this post, we configure jobs processing individual data sources to be trigger based, and the final job merging all datasets is scheduled. The advantage of having trigger-based DataBrew jobs is that it only triggers if you have a source file, which helps reduce costs.

We first configure an S3 event that triggers a Lambda function. The function triggers the DataBrew job based on the S3 key value. See the following code (Boto3) for our function:

import boto3
 
client = boto3.client('databrew')

def lambda_handler(event, context):
record=event['Records'][0]
bucketname = record['s3']['bucket']['name']
key = record['s3']['object']['key']
print('key: '+key)
  
if key.find('SourceFolder1') != -1:
print('processing source1 file')
response = client.start_job_run(Name='databrew-process-data-job1')
if key.find('SourceFolder2') != -1:
print('processing source2 file')
response = client.start_job_run(Name='databrew-process-data-job2')

The processed data appears in s3://<bucket-name>/src-files/process-data/data-source1/.

Next we create the job that merges the processed data files into a final, campaign-ready dataset. We can configure the job to merge only those files that have been dropped in the last 24 hours.

The campaign-ready data is located in s3://<bucket-name>/src-files/campaign-ready/. This dataset is now ready to serve as input to Amazon PinPoint.

Build a DataBrew data profile job

We can use DataBrew to run a data profile job on any of the datasets defined in the previous steps. When you profile your data, DataBrew creates a report called a data profile. This report displays statistics such as the number of rows in the sample and the distribution of unique values in each column. In this post, we use Lambda functions to read the report and detect anomalies and send alerts using Amazon SNS to respective teams for further action.

  1. On the DataBrew console, on the Datasets page, select your dataset.
  2. Choose Run data profile.
  3. For Job name, enter a name for your job.
  4. For Data sample, select either Full dataset or Custom sample (for this post, we sample 20,000 rows).
  5. In the Job output settings section, for S3 location, enter your output bucket.
  6. Optionally, select Enable encryption for job output file to encrypt your data.
  7. Configure optional settings, such as profile configurations; number of nodes, job timeouts, and retries; schedules; and tags.
  8. Choose an AWS Identity and Access Management (IAM) role for the profile job.
  9. Choose Create and run job.

After you run the job, DataBrew provides you with job metrics. For example, the following screenshot shows the Dataset preview tab.

The following screenshot shows an example of the Data profile overview tab.

This tab also includes a summary of the column details.

The following screenshot shows an example of the Column statistics tab.

Next, we set up alerts using the profile output file.

  1. Configure an S3 event to trigger a Lambda function.

The function reads the output and checks for anomalies in the data. You define the anomalies; for example, when the total missing for a column is greater than 10. The function can then raise an SNS alert if it detects the anomaly.

Launch a campaign using Amazon PinPoint

Before you create segments, create an Amazon PinPoint project. To create segments, we use S3 events to trigger a Lambda function that creates a new base segment whenever the final DataBrew job loads the campaign-ready data. We can either create or update base segments; for this post, we create a new segment. See the following code (Boto3) for the Lambda function:

import boto3
import time
from datetime import datetime

now = datetime.now() # current date and time

date_time = now.strftime("%Y_%m_%d")

segname = segment_name_'+date_time

time.sleep(60)


client = boto3.client('pinpoint')

def lambda_handler(event, context):

response = client.create_import_job(
ApplicationId='xxx-project-id-from-pinpoint',
ImportJobRequest={
'DefineSegment': True,
'Format': 'CSV',
'RoleArn': 'arn:aws:iam::xxx-accountid:role/service-role/xxx_pinpoint_role',
'S3Url': 's3://<xxx-bucketname>/campaign-ready',
'SegmentName': segname
}
)

print(response)

Use this base segment to create dynamic segments and launch a campaign. For more information, see Amazon PinPoint campaigns.

Export Amazon PinPoint project events to Amazon S3 using Kinesis Data Firehose

You can track and push events related to your project, such as sent, delivered, opened messages, and a few others, to either Amazon Kinesis Data Streams or Kinesis Data Firehose, which stream this data to AWS data stores such as Amazon S3. For this post, we use Kinesis Data Firehose. We create our delivery stream prior to enabling event streams on the Amazon PinPoint project.

  1. Create your Firehose delivery stream.

The event stream is disabled by default.

  1. Choose Edit.
  2. Select Stream to Amazon Kinesis and select Send events to an Amazon Kinesis Data Firehose delivery stream.
  3. Choose the delivery stream you created.
  4. For IAM role, you can allow DataBrew to automatically create a new role or use an existing role.
  5. Choose Save.

The events are now sent to Amazon S3. You could either create an Athena table or use Amazon Kinesis Data Analytics to analyze events and build a dashboard using QuickSight.

Security best practices

Consider the following best practices in order to mitigate security threats:

  • Be mindful while creating IAM roles to provide access to only necessary services
  • If sending emails through Amazon SNS, make sure you send email alerts only to verified or subscribed recipients to minimize the possibility of automated emails being used to target victims’ external email addresses
  • Use IAM roles rather than user keys
  • Have logs written to Amazon CloudWatch, and set CloudWatch alarms in case of failures
  • Take regular backups of DataBrew jobs and Amazon Pinpoint campaigns (if needed)
  • Restrict network access for inbound and outbound traffic to least privilege
  • Enable the lifecycle policy to retain only necessary data, and delete unnecessary data
  • Enable server-side encryption using AWS KMS (SSE-KMS) or Amazon S3 (SSE-S3)
  • Enable cross-Region replication of data in case you feel backing up the source data is necessary

Clean up

To avoid ongoing charges, clean up the resources you created as part of this post:

  • S3 bucket
  • SFTP server
  • DataBrew resources
  • PinPoint resources
  • Firehose delivery stream, if applicable
  • Athena tables and QuickSight dashboards, if applicable

Conclusion

In this post, we walked through how to implement an automated workflow using DataBrew to perform ETL, Amazon PinPoint to launch campaigns, and Lambda to automate the process. This solution helps reduce time to production, is easy to implement because it doesn’t require coding, and has no licensing costs involved. Try this solution today for your own datasets, and leave any comments or questions in the comments section.


About the Authors

Suraj Shivananda is a Solutions Architect at AWS. He has over a decade of experience in Software Engineering, Data and Analytics, DevOps specifically for data solutions, automating and optimizing cloud based solutions. He’s a trusted technical advisor and helps customers build Well Architected solutions on the AWS platform.

Surbhi Dangi is a Sr Manager, Product Management at AWS. Her work includes building user experiences for Database, Analytics & AI AWS consoles, launching new database and analytics products, working on new feature launches for existing products, and building broadly adopted internal tools for AWS teams. She enjoys traveling to new destinations to discover new cultures, trying new cuisines, and teaches product management 101 to aspiring PMs.

Top 5: Featured Architecture Content for October

Post Syndicated from Elyse Lopez original https://aws.amazon.com/blogs/architecture/top-5-featured-architecture-content-for-october/

The AWS Architecture Center provides new and notable reference architecture diagrams, vetted architecture solutions, AWS Well-Architected best practices, whitepapers, and more. This blog post features some of our best picks from the new and newly updated content we released in the past month.

 1. AWS Security at the Edge

This new whitepaper provides the foundations for implementing a defense-in-depth security strategy at the edge. It addresses three areas:

  1. AWS services at AWS edge locations
  2. How those services can be used to implement the best practices outlined in the design principles of the Security Pillar of the AWS Well-Architected Framework
  3. The security aspects of additional AWS edge services that you can use to secure your edge environments or expand operations into new, previously unsupported environments

2. Machine Learning Lens for the AWS Well-Architected Framework 

Machine learning (ML) algorithms discover and learn patterns in data and construct mathematical models to enable predictions on future data. ML solutions can revolutionize lives through better diagnosis of diseases, environment protection, products and services transformation, and more.

This newly updated whitepaper provides you with established cloud- and technology-agnostic best practices. Apply these architectural principles when designing your ML workloads or after workloads have entered production as part of continuous improvement.

3. Streaming Media Lens for the AWS Well-Architected Framework 

In this newly published Lens, learn how Well-Architected best practices can help you design, deliver, and maintain streaming media workloads. The Lens defines components, explores common workload scenarios, and outlines design principles that help you apply the Well-Architected Framework. Dive deeper into how each scenario affects the architecture of your workload and evaluate the technology architecture that best meets your needs.

4. Maintaining Personalized Experiences with Machine Learning

Boost your customer engagement by providing up-to-date product recommendations, personalized product re-rankings, and customized direct marketing. This new Solutions Implementation helps you provide real-time, curated experiences across multiple channels using Amazon Personalize. The implementation automates the entire lifecycle of a personalization workload, presenting the results in an Amazon CloudWatch dashboard.

5. AWS QnABot

Alexa, what does a QnABot do? This new Solutions Implementation dives deep on AWS QnABot, a multi-channel, multi-language conversational interface (chatbot) that responds to customer’s questions, answers, and feedback. Learn how to deploy QnABot across channels such as chat, voice, SMS, and Amazon Alexa, implementing the newest ML technologies to enhance the customer experience and build more efficient communications.

Use the web user interface to ask: “What is Q and A bot?”. The answer now displays the heading, links, and emphasis specified in your markdown text.

The tale of a single register value

Post Syndicated from Jakub Sitnicki original https://blog.cloudflare.com/the-tale-of-a-single-register-value/

The tale of a single register value

“Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth.” — Sherlock Holmes

Intro

The tale of a single register value

It’s not every day that you get to debug what may well be a packet of death. It was certainly the first time for me.

What do I mean by “a packet of death”? A software bug where the network stack crashes in reaction to a single received network packet, taking down the whole operating system with it. Like in the well known case of Windows ping of death.

Challenge accepted.

It starts with an oops

Around a year ago we started seeing kernel crashes in the Linux ipv4 stack. Servers were crashing sporadically, but we learned the hard way to never ignore cases like that — when possible we always trace crashes. We also couldn’t tie it to a particular kernel version, which could indicate a regression which hopefully could be tracked down to a single faulty change in the Linux kernel.

The crashed servers were leaving behind only a crash report, affectionately known as a “kernel oops”. Let’s take a look at it and go over what information we have there.

The tale of a single register value

Parts of the oops, like offsets into functions, need to be decoded in order to be human-readable. Fortunately Linux comes with the decode_stacktrace.sh script that did the work for us.

All we need is to install a kernel debug and source packages before running the script. We will use the latest version of the script as it has been significantly improved since Linux v5.4 came out.

$ RELEASE=`uname -r`
$ apt install linux-image-$RELEASE-dbg linux-source-$RELEASE
$ curl -sLO https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/scripts/decode_stacktrace.sh
$ curl -sLO https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/scripts/decodecode
$ chmod +x decode_stacktrace.sh decodecode
$ ./decode_stacktrace.sh -r 5.4.14-cloudflare-2020.1.11 < oops.txt > oops-decoded.txt

When decoded, the oops report is even longer than before! But that is a good thing. There is new information there that can help us.

The tale of a single register value

What has happened?

With this much input we can start sketching a picture of what could have happened. First thing to check is where exactly did we crash?

The report points at line 5160 in the skb_gso_transport_seglen() function. If we take a look at the source code, we can get a rough idea of what happens there. We are processing a Generic Segmentation Offload (GSO) packet carrying an encapsulated TCP packet. What is a GSO packet? In this context it’s a batch of consecutive TCP segments, travelling through the network stack together to amortize the processing cost. We will look more at the GSO later.

net/core/skbuff.c:
5150) static unsigned int skb_gso_transport_seglen(const struct sk_buff *skb)
5151) {
          …
5155)     if (skb->encapsulation) {
                  …
5159)             if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))
5160)                     thlen += inner_tcp_hdrlen(skb); 👈
5161)     } else if (…) {
          …
5172)     return thlen + shinfo->gso_size;
5173) }

The exact line where we crashed belongs to an if-branch that handles tunnel traffic. It calculates the length of the TCP header of the inner packet, that is the encapsulated one. We do that to compute the length of the outer L4 segment, which accounts for the inner packet length:

The tale of a single register value

To understand how the length of the inner TCP header is computed we have to peel off a few layers of inlined function calls:

inner_tcp_hdrlen(skb)
⇓
inner_tcp_hdr(skb)->doff * 4
⇓
((struct tcphdr *)skb_inner_transport_header(skb))->doff * 4
⇓
((struct tcphdr *)(skb->head + skb->inner_transport_header))->doff * 4

Now it is clear that inner_tcp_hdrlen(skb) simply reads the Data Offset field (doff) inside the inner TCP header. Because Data Offset carries the number of 32-bit words in the TCP header, we multiply it by 4 to get the TCP header length in bytes.

From the memory access point of view, to read the Data Offset value we need to:

  1. load skb->head value from address skb + offsetof(struct sk_buff, head)
  2. load skb->inner_transport_header value from address skb + offsetof(struct sk_buff, inner_transport_header),
  3. load the TCP Data Offset from skb->head + skb->inner_transport_header + offsetof(struct tcphdr, doff)

Potentially, any of these loads could trigger a page fault. But it’s unlikely that skb contains an invalid address since we accessed the skb->encapsulation field without crashing just a few lines earlier. Our main suspect is the last load.

The invalid memory address we attempt to load from should be in one of the CPU registers at the time of the exception. And we have the CPU register snapshot in the oops report. Which register holds the address? That has been decided by the compiler. We will need to take a look at the instruction stream to discover that.

Remember the disassembly in the decoded kernel oops? Now is the time to go back to it. Hint, it’s in AT&T syntax. But to give everyone a fair chance to follow along, here’s the same disassembly but in Intel syntax. (Alright, alright. You caught me. I just can’t read the AT&T syntax.)

All code
========
   0:   c0 41 83 e0             rol    BYTE PTR [rcx-0x7d],0xe0
   4:   11 f6                   adc    esi,esi
   6:   87 81 00 00 00 20       xchg   DWORD PTR [rcx+0x20000000],eax
   c:   74 30                   je     0x3e
   e:   0f b7 87 aa 00 00 00    movzx  eax,WORD PTR [rdi+0xaa]
  15:   0f b7 b7 b2 00 00 00    movzx  esi,WORD PTR [rdi+0xb2]
  1c:   48 01 c1                add    rcx,rax
  1f:   48 29 f0                sub    rax,rsi
  22:   45 85 c0                test   r8d,r8d
  25:   48 89 c6                mov    rsi,rax
  28:   74 0d                   je     0x37
  2a:*  0f b6 41 0c             movzx  eax,BYTE PTR [rcx+0xc]           <-- trapping instruction
  2e:   c0 e8 04                shr    al,0x4
  31:   0f b6 c0                movzx  eax,al
  34:   8d 04 86                lea    eax,[rsi+rax*4]
  37:   0f b7 52 04             movzx  edx,WORD PTR [rdx+0x4]
  3b:   01 d0                   add    eax,edx
  3d:   c3                      ret
  3e:   45                      rex.RB
  3f:   85                      .byte 0x85

Code starting with the faulting instruction
===========================================
   0:   0f b6 41 0c             movzx  eax,BYTE PTR [rcx+0xc]
   4:   c0 e8 04                shr    al,0x4
   7:   0f b6 c0                movzx  eax,al
   a:   8d 04 86                lea    eax,[rsi+rax*4]
   d:   0f b7 52 04             movzx  edx,WORD PTR [rdx+0x4]
  11:   01 d0                   add    eax,edx
  13:   c3                      ret
  14:   45                      rex.RB
  15:   85                      .byte 0x85

When the trapped page fault happened, we tried to load from address %rcx + 0xc, or 12 bytes from whatever memory location %rcx held. Which is hardly a coincidence since the Data Offset field is 12 bytes into the TCP header.

This means that %rcx holds the computed skb->head + skb->inner_transport_header address. Let’s take a look at it:

RSP: 0018:ffffa4740d344ba0 EFLAGS: 00010202
RAX: 000000000000feda RBX: ffff9d982becc900 RCX: ffff9d9624bbaffc
RDX: ffff9d9624babec0 RSI: 000000000000feda RDI: ffff9d982becc900
…

The RCX value doesn’t look particularly suspicious. We can say that:

  1. it’s in a kernel virtual address space because it is greater than 0xffff000000000000 – expected, and
  2. it is very close to the 4 KiB page boundary (0xffff9d9624bbb000 – 4),

… and not much more.

We must go back further in the instruction stream. Where did the value in %rcx come from? What I like to do is try to correlate the machine code leading up to the crash with pseudo source code:

<function entry>                # %rdi = skb
…
movzx  eax,WORD PTR [rdi+0xaa]  # %eax = skb->inner_transport_header
movzx  esi,WORD PTR [rdi+0xb2]  # %esi = skb->transport_header
add    rcx,rax                  # %rcx = skb->head + skb->inner_transport_header
sub    rax,rsi                  # %rax = skb->inner_transport_header - skb->transport_header
test   r8d,r8d
mov    rsi,rax                  # %rsi = skb->inner_transport_header - skb->transport_header
je     0x37
movzx  eax,BYTE PTR [rcx+0xc]   # %eax = *(skb->head + skb->inner_transport_header + offsetof(struct tcphdr, doff))

How did I decode that assembly snippet? We know that the skb address was passed to our function in the %rdi register because the System V AMD64 ABI calling convention dictates that. If the %rdi register hasn’t been clobbered by any function calls, or reused because the compiler decided so, then maybe, just maybe, it still holds the skb address.

If 0xaa and 0xb2 are offsets into an sk_buff structure, then pahole tool can tell us which fields they correspond to:

$ pahole --hex -C sk_buff /usr/lib/debug/vmlinux-5.4.14-cloudflare-2020.1.11 | grep '\(head\|inner_transport_header\|transport_header\);'
        __u16                      inner_transport_header; /*  0xaa   0x2 */
        __u16                      transport_header;     /*  0xb2   0x2 */
        unsigned char *            head;                 /*  0xc0   0x8 */

To confirm our guesswork, we can disassemble the whole function in gdb.

It would be great to find out the value of the inner_transport_header and transport_header offsets. But the registers that were holding them, %rax and %rsi, respectively, were reused after the offset values were loaded.

However, we can still examine the difference between inner_transport_header and transport_header that both %rax and %rsi hold. Let’s take a look.

The suspicious offset

Here are the register values from the oops as a reminder:

RAX: 000000000000feda RBX: ffff9d982becc900 RCX: ffff9d9624bbaffc
RDX: ffff9d9624babec0 RSI: 000000000000feda RDI: ffff9d982becc900

From the register snapshot we can tell that:

%rax = %rsi = skb->inner_transport_header - skb->transport_header = 0xfeda = 65242

That is clearly suspicious. We expect that skb->transport_header < skb->inner_transport_header, so either

  1. skb->inner_transport_header > 0xfeda, which would mean that between outer and inner L4 packets there is 65k+ bytes worth of headers – unlikely, or
  2. 0xfeda is a garbage value, perhaps an effect of an underflow if skb->inner_transport_header < skb->transport_header.

Let’s entertain the theory that an underflow has occurred.

Any other scenario, be it an out-of-bounds write or a use-after-free that corrupted the memory, is a scary prospect where we don’t stand much chance of debugging it without help from tools like KASAN report.

But if we assume for a moment that it’s an underflow, then the task is simple 😉. We “just” need to audit all places where skb->inner_transport_header or skb->transport_header offsets could have been updated while the skb buffer travelled through the network stack.

That raises the question — what path did the packet take through the network stack before it brought the machine down?

Packet path

It is time to take a look at the call trace in the oops report. If we walk through it, it is apparent that a veth device received a packet. The packet then got routed and forwarded to some other network device. The kernel crashed before the egress device transmitted the packet out.

The tale of a single register value

What immediately draws our attention is the veth_poll() function in the call trace. Polling inside a virtual device that acts as a simple pipe joining two network namespaces together? Puzzling!

The regular operation mode of a veth device is that transmission of a packet from one side of a veth-pair results in immediate, in-line, on the same CPU, reception of the packet by the other side of the pair. There shouldn’t be any polling, context switches or such.

However, in Linux v4.19 veth driver gained support for native mode eXpress Data Path (XDP). XDP relies on NAPI, an interface between the network drivers and the Linux network stack. NAPI requires that drivers register a poll() callback for fetching received packets.

The NAPI receive path in the veth driver is taken only when there is an XDP program attached. The fork occurs in veth_forward_skb, where the TX path ends and a RX path on the other side begins.

The tale of a single register value

This is an important observation because only on the NAPI/XDP path in the veth driver, received packets might get aggregated by the Generic Receive Offload.

Super-packets

Early on we’ve noted that the crash happens when processing a GSO packet. I’ve promised we will get back to it and now is the time.

Generic Segmentation Offload (GSO) is all about delaying the L4 segmentation process until the very last moment. So called super-packets, that exceed the egress route MTU in size, travel all the way through the network stack, only to be cut into MTU-sized segments just before handing the data over to the network driver for transmission. This way we process just one big packet on the transmit path, instead of a few smaller ones and save on CPU cycles in all the IP-level stack functions like routing, nftables, traffic control

Where do these super-packets come from? They can be a result of large write to a socket, or as is our case, they can be received from one network and forwarded to another network.

The latter case, that is forwarding a super-packet, happens when Generic Receive Offload (GRO) kicks in during receive. GRO is the opposite process of GSO. Smaller, MTU-sized packets get merged to form a super-packet early on the receive path. The goal is the same — process less by pushing just one packet through the network stack layers.

Not just any packets can be fused together by GRO. Loosely speaking, any two packets to be merged must form a logical sequence in the network flow, and carry the same metadata in protocol headers. It is critical that no information is lost in the aggregation process. Otherwise, GSO won’t be able to reconstruct the segment stream when serializing packets in the network card transmission code.

To this end, each network protocol that supports GRO provides a callback which signals whether the above conditions hold true. GRO implementation (dev_gro_receive()) then walks through the packet headers, the outer as well as the inner ones, and delegates the pre-merge check to the right protocol callback. If all stars align, the packets get spliced at the end of the callback chain (skb_gro_receive()).

I will be frank. The code that performs GRO is pretty complex, and I spent a significant amount of time staring into it. Hat tip to its authors. However, for our little investigation it will be enough to understand that a TCP stream encapsulated with GRE1 would trigger callback chain like so:

The tale of a single register value

Armed with basic GRO/GSO understanding we are ready to take a shot at reproducing the crash.

The reproducer

Let’s recap what we know:

  1. a super-packet was received from a veth device,
  2. the veth device had an XDP program attached,
  3. the packet was forwarded to another device,
  4. the egress device was transmitting a GSO super-packet,
  5. the packet was encapsulated,
  6. the super-packet must have been produced by GRO on ingress.

This paints a pretty clear picture on what the setup should look like:

The tale of a single register value

We can work with that. A simple shell script will be our setup machinery.

We will be sending traffic from 10.1.1.1 to 10.2.2.2. Our traffic pattern will be a TCP stream consisting of two consecutive segments so that GRO can merge something. A Scapy script will be great for that. Let’s call it send-a-pair.py and give it a run:

$ { sleep 5; sudo ip netns exec A ./send-a-pair.py; } &
[1] 1603
$ sudo ip netns exec B tcpdump -i BA -n -nn -ttt 'ip and not arp'
…
 00:00:00.020506 IP 10.1.1.1 > 10.2.2.2: GREv0, length 1480: IP 192.168.1.1.12345 > 192.168.2.2.443: Flags [.], seq 0:1436, ack 1, win 8192, length 1436
 00:00:00.000082 IP 10.1.1.1 > 10.2.2.2: GREv0, length 1480: IP 192.168.1.1.12345 > 192.168.2.2.443: Flags [.], seq 1436:2872, ack 1, win 8192, length 1436

Where is our super-packet? Look at the packet sizes, the GRO didn’t merge anything.

Turns out NAPI is just too fast at fetching the packets from the Rx ring. We need a little buffering on transmit to increase our chances of GRO batching:

# Help GRO
ip netns exec A tc qdisc add dev AB root netem delay 200us slot 5ms 10ms packets 2 bytes 64k

With the delay in place, things look better:

 00:00:00.016972 IP 10.1.1.1 > 10.2.2.2: GREv0, length 2916: IP 192.168.1.1.12345 > 192.168.2.2.443: Flags [.], seq 0:2872, ack 1, win 8192, length 2872

8192 bytes shown by tcpdump clearly indicate GRO in action. And we are even hitting the crash point:

$ sudo bpftrace -e 'kprobe:skb_gso_transport_seglen { print(kstack()); }' -c '/usr/bin/ip netns exec A ./send-a-pair.py'
Attaching 1 probe...

        skb_gso_transport_seglen+1
        skb_gso_validate_network_len+17
        __ip_finish_output+293
        ip_output+113
        ip_forward+876
        ip_rcv+188
        __netif_receive_skb_one_core+128
        netif_receive_skb_internal+47
        napi_gro_flush+151
        napi_complete_done+183
        veth_poll+1697
        net_rx_action+314
        …

^C

…but we are not crashing. We will need to dig deeper.

We know what packet metadata skb_gso_transport_seglen() looks at — the header offsets, then encapsulation flag, and GSO info. Let’s dump all of it:

$ sudo bpftrace ./why-no-crash.bt -c '/usr/bin/ip netns exec A ./send-a-pair.py'
Attaching 2 probes...
DEV  LEN  NH  TH  ENC INH ITH GSO SIZE SEGS TYPE FUNC
sink 2936 270 290 1   294 254  |  1436 2    0x41 skb_gso_transport_seglen

Since the skb->encapsulation flag (ENC) is set, both outer and inner header offsets should be valid. Are they?

The outer network / L3 header (NH) looks sane. When XDP is enabled, it reserves 256 bytes of headroom before the headers. 14 byte long Ethernet header follows the headroom. The IPv4 header should then start at 270 bytes into the packet buffer.

The outer transport / L4 header offset is as expected as well. The IPv4 header takes 20 bytes, and the GRE header follows it.

The inner network header (INH) begins at the offset of 294 bytes. This makes sense because the GRE header in its most basic form is 4 bytes long.

The surprise comes last. The inner transport header offset points somewhere near the end of headroom which XDP reserves. Instead, it should start at 314, following the inner IPv4 header.

The tale of a single register value

Is this the smoking gun we were looking for?

The bug

skb_gso_transport_seglen() calculates the length of the outer L4 segment when given a GSO packet. If the inner_transport_header offset is off, then the result of the calculation might be off as well. Worth checking.

We know that our segments are 1500 bytes long. That makes the L4 part 1480 bytes long. What does skb_gso_transport_seglen() say though?

$ sudo bpftrace -e 'kretprobe:skb_gso_transport_seglen { print(retval); }' -c …
Attaching 1 probe...
1460

Seems that we don’t agree. But if skb_gso_transport_seglen() is getting garbage on input we can’t really blame it.

If inner_transport_header is not correct, that TCP Data Offset read that we know happens inside the function cannot end well.

If we map it out, it looks like we are loading part of the source MAC address (upper 4 bits of the 5th byte, to be precise) and interpreting it as TCP Data Offset.

The tale of a single register value

Are we? There is an easy way to check.

If we ask nicely, tcpdump will tell us what the MAC addresses are:

The tale of a single register value

Plugging that into the calculations that skb_gso_transport_seglen()

thlen = inner_transport_header(skb) - transport_header(skb) = 254 - 290 = -36
thlen += inner_transport_header(skb)->doff * 4 = -36 + (0xf * 4) = -36 + 60 = 24
retval = gso_size + thlen = 1436 + 24 = 1460

…checks out!

Does this mean that I can control the return value by setting source MAC address?!

                                               👇
$ sudo ip -n A link set AB address be:d6:07:5e:05:11 # Change the MAC address 
$ sudo bpftrace -e 'kretprobe:skb_gso_transport_seglen { print(retval); }' -c …
Attaching 1 probe...
1400

Yes! 1436 + (-36) + (0 * 4) = 1400. This is it.

However, how does all this tie it to the original crash? The badly calculated L4 segment length will make GSO emit shorter segments on egress. But that’s all.

Remember the suspicious offset from the crash report?

%rax = %rsi = skb->inner_transport_header - skb->transport_header = 0xfeda = 65242

We now know that skb->transport_header should be 290. That makes skb->inner_transport_header = 65242 + 290 = 65532 = 0xfffc.

Which means that when we triggered the page fault we were trying to load memory from a location at

skb->head + skb->inner_transport_header + offsetof(tcphdr, doff) = skb->head + 0xfffc + 12 = 0xffff9d9624bbb008

Solving it for skb->head yields 0xffff9d9624bbb008 - 0xfffc - 12 = 0xffff9d9624bab000.

And this makes sense. The skb->head buffer is page-aligned, meaning it’s a multiple of 4 KiB on x86-64 platforms — the 12 least significant bits the address are 0.

However, the address we were trying to read was (0xfffc+12)/4096 ~= 16 pages (or 64 KiB) past the skb->head page boundary (0xffff9d9624babfff).

The tale of a single register value

Who knows if there was memory mapped to this address?! Looks like from time to time there wasn’t anything mapped there and the kernel page fault handling code went “oops!”.

The fix

It is finally time to understand who sets the header offsets in a super-packet.

Once GRO is done merging segments, it flushes the super-packet down the pipe by kicking off a chain of gro_complete callbacks:

napi_gro_complete → inet_gro_complete → gre_gro_complete → inet_gro_complete → tcp4_gro_complete → tcp_gro_complete

These callbacks are responsible for updating the header offsets and populating the GSO-related fields in skb_shared_info struct. Later on the transmit side will consume this data.

Let’s see how the packet metadata changes as it travels through the gro_complete callbacks2 by adding a few more tracepoint to our bpftrace script:

$ sudo bpftrace ./why-no-crash.bt -c '/usr/bin/ip netns exec A ./send-a-pair.py'
Attaching 7 probes...
DEV  LEN  NH  TH  ENC INH ITH GSO SIZE SEGS TYPE FUNC
BA   2936 294 314 0   254 254  |  1436 0    0x00 napi_gro_complete
BA   2936 294 314 0   254 254  |  1436 0    0x00 inet_gro_complete
BA   2936 294 314 0   254 254  |  1436 0    0x00 gre_gro_complete
BA   2936 294 314 1   254 254  |  1436 0    0x40 inet_gro_complete
BA   2936 294 314 1   294 254  |  1436 0    0x40 tcp4_gro_complete
BA   2936 294 314 1   294 254  |  1436 0    0x41 tcp_gro_complete
sink 2936 270 290 1   294 254  |  1436 2    0x41 skb_gso_transport_seglen

As the packet travels through the gro_complete callbacks, the inner network header (INH) offset gets updated after we have processed the inner IPv4 header.

However, the same thing did not happen to the inner transport header (ITH) that is causing trouble. We need to fix that.

--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -298,6 +298,9 @@ int tcp_gro_complete(struct sk_buff *skb)
        if (th->cwr)
                skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;

+       if (skb->encapsulation)
+               skb->inner_transport_header = skb->transport_header;
+
        return 0;
 }
 EXPORT_SYMBOL(tcp_gro_complete);

With the patch in place, the header offsets are finally all sane and skb_gso_transport_seglen() return value is as expected:

$ sudo bpftrace ./why-no-crash.bt -c '/usr/bin/ip netns exec A ./send-a-pair.py'
Attaching 2 probes...
DEV  LEN  NH  TH  ENC INH ITH GSO SIZE SEGS TYPE FUNC
sink 2936 270 290 1   294 314  |  1436 2    0x41 skb_gso_transport_seglen

$ sudo bpftrace -e 'kretprobe:skb_gso_transport_seglen { print(retval); }' -c …
Attaching 1 probe...
1480

Don’t worry, though. The fix is already likely in your kernel long time ago. Patch d51c5907e980 (“net, gro: Set inner transport header offset in tcp/udp GRO hook”) has been merged into Linux v5.14, and backported to v5.10.58 and v5.4.140 LTS kernels. The Linux kernel community has got you covered. But please, keep on updating your production kernels.

Outro

What a journey! We have learned a ton and fixed a real bug in the Linux kernel. In the end it was not a Packet of Death. Maybe next time we can find one 😉

Enjoyed the read? Why not join Cloudflare and help us fix the remaining bugs in the Linux kernel? We are hiring in Lisbon, London, and Austin.

And if you would like to see more kernel blog posts, please let us know!


1Why GRE and not some other type of encapsulation? If you follow our blog closely, you might already know that Cloudflare Magic Transit uses veth pairs to route traffic into and out of network namespaces. It also happens to use GRE encapsulation. If you are curious why we chose network namespaces linked with veth pairs, be sure to watch the How we built Magic Transit talk.
2Just turn off GRO on all other network devices in use to get a clean output (sudo ethtool -K enp0s5 gro off).

InsightVM Scan Diagnostics: Troubleshooting Credential Issues for Authenticated Scanning

Post Syndicated from Greg Wiseman original https://blog.rapid7.com/2021/11/03/insightvm-scan-diagnostics-troubleshooting-credential-issues-for-authenticated-scanning/

InsightVM Scan Diagnostics: Troubleshooting Credential Issues for Authenticated Scanning

Have you ever tried to figure out why a vulnerability or policy scan isn’t showing you the results you expect, even though you’ve provided credentials? If so, you’ll be pleased to hear that the November 3rd release of Nexpose and InsightVM (version 6.6.111) will introduce a new check category designed to help troubleshoot issues with credentialed scanning: Scan Diagnostics.

No more combing through scan logs to get answers! These checks will be disabled by default, but you can configure them to run by adjusting your scan templates. When enabled, Scan Diagnostics checks will report a “vulnerable” result against assets when the Scan Engine is supplied with credentials but unable to gather local information.

The challenge of finding the right fix

For complete and accurate coverage, the InsightVM Scan Engine requires local access to systems being scanned. Most Microsoft checks are based on values found in the Windows Registry. Checks for Linux distributions typically require access to the package manager, which is important for vulnerability correlation (cutting down on noise due to backported fixes).

This means that for many types of scans, you need to tell the engine how it should authenticate by configuring credentials. When things go smoothly, the engine will collect all the data it needs for vulnerability or policy assessments. But when results aren’t as expected, it can be challenging to understand what went wrong.

The way the Security Console currently indicates authentication status results is rather coarse-grained. Credential Success means it’s all good, but a Credential Failure (or the puzzling “Partial Credential Success”) can often leave a VM analyst scratching their head about how to fix things.

Bringing greater visibility to your scanning environment

Our new Scan Diagnostics checks provide more detailed visibility into where things fell apart. Because the results are written as vulnerability checks, you’ll be able to use a lot of familiar product functionality to work with them. They can be enabled or disabled via scan templates, and you can report on them like any other category. We’ll also be providing solution information to make it easier to resolve issues, and you can look at the proof the scanner provides to get additional context.

InsightVM Scan Diagnostics: Troubleshooting Credential Issues for Authenticated Scanning

Another advantage of using check results for Scan Diagnostics is that they are built atop the expert system at the core of the scanner. This helps Scan Diagnostics target the most precise cause of an error and provide guidance accordingly. No more going through a laundry list of checkboxes to make sure your sites and assets are configured correctly.

Here is the initial set of Scan Diagnostics we’ll be releasing:

Credential Type Check ID Summary
AS400 rapid7-diagnostics-as400-service-usable No usable AS400 service
SNMP rapid7-diagnostics-snmp-service-usable No usable SNMP service
Telnet rapid7-diagnostics-telnet-service-usable No usable Telnet service
SSH rapid7-diagnostics-privilege-elevation-failed-cisco Cisco SSH privilege elevation failed for the scan
rapid7-diagnostics-privilege-elevation-failed-unix Unix SSH privilege elevation failed for the scan
rapid7-diagnostics-ssh-algorithm-compatibility SSH algorithm mismatch between scan engine and target
rapid7-diagnostics-unix-privilege-elevation-root SSH credential is configured to elevate to a non-root user
rapid7-diagnostics-unix-variant-authenticated-with-non-root-account No SSH credentials with root privileges configured
CIFS rapid7-diagnostics-cifs-read-access-errors Access errors while attempting to read from the file system
rapid7-diagnostics-cifs-sam-access-errors Unable to access the remote Security Account Manager
rapid7-diagnostics-cifs-sam-unknown-error Unknown error while trying to access the remote Security Account Manager
rapid7-diagnostics-cifs-write-access-errors Access errors while attempting to write to the file system
rapid7-diagnostics-smb2-share-access Unable to obtain access to SMB2 shares
rapid7-diagnostics-windows-registry-access-issues Access issues interacting with the Windows Registry
rapid7-diagnostics-windows-registry-enable-services-template Windows Services not enabled in template
rapid7-diagnostics-windows-registry-failed-to-enable-services Failed to enable Windows Services
rapid7-diagnostics-windows-registry-unexpected-error Failed to connect to the Remote Registry Service
rapid7-diagnostics-wmi-connection-error Unable to connect to the WMI Service
rapid7-diagnostics-wmi-dcom-port-error Error when connecting to DCOM Ports (required for WMI)
rapid7-diagnostics-wmi-permission-error Permission error when connecting to the WMI Service
rapid7-diagnostics-wmi-read-access-errors Access errors encountered in attempts to read over WMI
rapid7-diagnostics-wmi-unknown-error Unknown error occurred trying to connect to the WMI Service
rapid7-diagnostics-winrm-authentication-error Authentication error when connecting to the WMI Service
rapid7-diagnostics-winrm-listener-error The WinRM listener on the target appears to be blocking the scan engine from connecting
rapid7-diagnostics-winrm-unknown-error Unknown error occurred trying to connect to the WinRM Service
rapid7-diagnostics-winrm-unencrypted The WinRM service is operating over an unencrypted protocol, potentially leaking valuable data

Note that these “vulnerabilities” carry the lowest possible severity and will not increase your risk score. However, they may increase overall vulnerability counts, so we’re leaving them turned off by default for now. If you do scan with them, you can adjust the scope of generated reports to exclude these results if you don’t want them to get passed through to remediation teams.

The existing status shown in the Authentication column of Scan Results and Node pages will remain the same for the time being, so that it is available regardless of whether this new Scan Diagnostics check type is enabled and won’t adjust anything on the corresponding dashboard card.

We’d love to hear any feedback on this new feature! Please reach out to your Customer Success Manager to let them know how it’s working out in your scans.

NEVER MISS A BLOG

Get the latest stories, expertise, and news about security today.

The collective thoughts of the interwebz