Tag Archives: Magic Transit

Free network flow monitoring for all enterprise customers

Post Syndicated from Chris Draper original https://blog.cloudflare.com/free-network-monitoring-for-enterprise


A key component of effective corporate network security is establishing end-to-end visibility across all traffic that flows through the network. Every network engineer needs a complete overview of their network traffic to confirm their security policies work, to identify new vulnerabilities, and to analyze any shifts in traffic behavior. Often, it’s difficult to build out effective network monitoring because teams struggle with problems like configuring and tuning data collection, managing storage costs, and analyzing traffic across multiple visibility tools.

Today, we’re excited to announce that a free version of Cloudflare’s network flow monitoring product, Magic Network Monitoring, is available to all Enterprise Customers. Every Enterprise Customer can configure Magic Network Monitoring and immediately improve their network visibility in as little as 30 minutes via our self-serve onboarding process.

Enterprise Customers can visit the Magic Network Monitoring product page, click “Talk to an expert”, and fill out the form. You’ll receive access within 24 hours of submitting the request. Over the next month, the free version of Magic Network Monitoring will be rolled out to all Enterprise Customers. The product will automatically be available by default without the need to submit a form.

How it works

Cloudflare customers can send their network flow data (either NetFlow or sFlow) from their routers to Cloudflare’s network edge.

Magic Network Monitoring will pick up this data, parse it, and instantly provide insights and analytics on your network traffic. These analytics include traffic volume over time in bytes and packets, top protocols, sources, destinations, ports, and TCP flags.
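To make the flow-summary format concrete, here is a minimal sketch of decoding a NetFlow v5 datagram (a fixed 24-byte header followed by 48-byte flow records) in Python. This is purely illustrative and is not Cloudflare’s ingestion code; the field offsets follow the public NetFlow v5 specification.

import socket
import struct

# NetFlow v5: a 24-byte header followed by `count` 48-byte flow records.
HEADER_FMT = "!HHIIIIBBH"              # version, count, sys_uptime, unix_secs,
                                       # unix_nsecs, flow_sequence, engine_type,
                                       # engine_id, sampling_interval
RECORD_FMT = "!4s4s4sHHIIIIHHBBBBHHBBH"

def parse_netflow_v5(datagram: bytes):
    """Yield (src, dst, packets, octets, src_port, dst_port, protocol) tuples."""
    version, count = struct.unpack_from("!HH", datagram, 0)
    assert version == 5, "not a NetFlow v5 datagram"
    offset = struct.calcsize(HEADER_FMT)   # 24 bytes
    for _ in range(count):
        rec = struct.unpack_from(RECORD_FMT, datagram, offset)
        yield (
            socket.inet_ntoa(rec[0]),      # source IP
            socket.inet_ntoa(rec[1]),      # destination IP
            rec[5],                        # packet count
            rec[6],                        # byte count
            rec[9],                        # source port
            rec[10],                       # destination port
            rec[13],                       # IP protocol number
        )
        offset += struct.calcsize(RECORD_FMT)  # 48 bytes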

Dogfooding Magic Network Monitoring during the remediation of the Thanksgiving 2023 security incident

Let’s review a recent example of how Magic Network Monitoring improved Cloudflare’s own network security and traffic visibility during the Thanksgiving 2023 security incident. Our security team needed a lightweight method to identify malicious packet characteristics in our core data center traffic. We monitored for any network traffic sourced from or destined to a list of ASNs associated with the bad actor. Our security team set up Magic Network Monitoring and established visibility into our first core data center within 24 hours of the project kick-off. Today, Cloudflare continues to use Magic Network Monitoring to monitor for traffic related to bad actors and to provide real-time traffic analytics on more than 1 Tbps of core data center traffic.

Magic Network Monitoring – Traffic Analytics

Monitoring local network traffic from IoT devices

Magic Network Monitoring also improves visibility on any network traffic that doesn’t go through Cloudflare. Imagine that you’re a network engineer at ACME Corporation, and it’s your job to manage and troubleshoot IoT devices in a factory that are connected to the factory’s internal network. The traffic generated by these IoT devices doesn’t go through Cloudflare because it is destined to other devices and endpoints on the internal network. Nonetheless, you still need to establish network visibility into device traffic over time to monitor and troubleshoot the system.

To solve the problem, you configure a router or other network device to securely send encrypted traffic flow summaries to Cloudflare via an IPSec tunnel. Magic Network Monitoring parses the data, and instantly provides you with insights and analytics on your network traffic. Now, when an IoT device goes down, or a connection between IoT devices is unexpectedly blocked, you can analyze historical network traffic data in Magic Network Monitoring to speed up the troubleshooting process.

Monitoring cloud network traffic

As cloud networking becomes increasingly prevalent, it is essential for enterprises to invest in visibility across their cloud environments. Let’s say you’re responsible for monitoring and troubleshooting your corporation’s cloud network operations, which are spread across multiple public cloud providers. You need to improve visibility into your cloud network traffic to analyze and troubleshoot unexpected traffic patterns, such as those caused by configuration drift that leaves a network port exposed.

To improve traffic visibility across different cloud environments, you can export cloud traffic flow logs from any virtual device that supports NetFlow or sFlow to Cloudflare. We are also building support for native cloud VPC flow logs in conjunction with Magic Cloud Networking. Cloudflare will parse this traffic flow data and provide alerts plus analytics across all your cloud environments in a single pane of glass on the Cloudflare dashboard.

Improve your security posture today in less than 30 minutes

If you’re an existing Enterprise customer, and you want to improve your corporate network security, you can get started right away. Visit the Magic Network Monitoring product page, click “Talk to an expert”, and fill out the form. You’ll receive access within 24 hours of submitting the request. You can begin the self-serve onboarding tutorial, and start monitoring your first batch of network traffic in less than 30 minutes.

Over the next month, the free version of Magic Network Monitoring will be rolled out to all Enterprise Customers. The product will be automatically available by default without the need to submit a form.

If you’re interested in becoming an Enterprise Customer, and have more questions about Magic Network Monitoring, you can talk with an expert. If you’re a free customer, and you’re interested in testing a limited beta of Magic Network Monitoring, you can fill out this form to request access.

Simplifying how enterprises connect to Cloudflare with Express Cloudflare Network Interconnect

Post Syndicated from Ben Ritter original https://blog.cloudflare.com/announcing-express-cni


We’re excited to announce the largest update to Cloudflare Network Interconnect (CNI) since its launch, and because we’re making CNIs faster and easier to deploy, we’re calling this Express CNI. At the most basic level, CNI is a cable between a customer’s network router and Cloudflare, which facilitates the direct exchange of information between networks instead of via the Internet. CNIs are fast, secure, and reliable, and have connected customer networks directly to Cloudflare for years. We’ve been listening to how we can improve the CNI experience, and today we are sharing more information about how we’re making it faster and easier to order CNIs, and connect them to Magic Transit and Magic WAN.

Interconnection services and what to consider

Interconnection services provide a private connection that allows you to connect your networks to other networks like the Internet, cloud service providers, and other businesses directly. This private connection benefits from improved connectivity versus going over the Internet and reduced exposure to common threats like Distributed Denial of Service (DDoS) attacks.

Cost is an important consideration when evaluating any vendor for interconnection services. The cost of an interconnection typically comprises a fixed port fee, based on the capacity (speed) of the port, plus a variable charge for the amount of data transferred. Some cloud providers also add complex inter-region bandwidth charges.
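As a rough back-of-the-envelope illustration of how those components add up (the figures below are hypothetical, not any vendor’s actual pricing):

# Hypothetical figures for illustration only
port_fee_per_month = 1650        # fixed fee for a 10 Gbps port
transfer_fee_per_gb = 0.02       # variable per-GB transfer charge
data_moved_gb = 100_000          # data transferred per month

annual_cost = 12 * (port_fee_per_month + data_moved_gb * transfer_fee_per_gb)
print(f"annual interconnect cost: ${annual_cost:,.0f}")   # $43,800
# The equivalent CNI port fee at Cloudflare is $0, with no inter-region charges.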

Other important considerations include the following:

  • How much capacity is needed?
  • Are there variable or fixed costs associated with the port?
  • Is the provider located in the same colocation facility as my business?
  • Are they able to scale with my network infrastructure?
  • Are you able to predict your costs without any unwanted surprises?
  • What additional products and services does the vendor offer?

Cloudflare does not charge a port fee for Cloudflare Network Interconnect, nor do we charge for inter-region bandwidth. Using CNI with products like Magic Transit and Magic WAN may even reduce bandwidth spending with Internet service providers. For example, you can deliver Magic Transit-cleaned traffic to your data center with a CNI instead of via your Internet connection, reducing the amount of bandwidth that you would pay an Internet service provider for.

To underscore the value of CNI, one vendor charges nearly $20,000 a year for a 10 Gigabit per second (Gbps) direct connect port. The same 10 Gbps CNI on Cloudflare for one year is $0. That price also excludes any charges for data transferred between different regions or geographies, or outside of their cloud. We have never charged for CNIs, and we are committed to making it even easier for customers to connect to Cloudflare, and to destinations beyond on the open Internet.

3 Minute Provisioning

Our first big announcement is a new, faster approach to CNI provisioning and deployment. Starting today, all Magic Transit and Magic WAN customers can order CNIs directly from their Cloudflare account. The entire process is about 3 clicks and takes less than 3 minutes (roughly the time to make coffee). We’re going to show you how simple it is to order a CNI.

The first step is to find out whether Cloudflare is in the same data center or colocation facility as your routers, servers, and network hardware. Let’s navigate to the new “Interconnects” section of the Cloudflare dashboard, and order a new Direct CNI.

Search for the city of your data center, and quickly find out if Cloudflare is in the same facility. I’m going to stand up a CNI to connect my example network located in Ashburn, VA.

It looks like Cloudflare is in the same facility as my network, so I’m going to select the location where I’d like to connect.

As of right now, my data center is only exchanging a few hundred Megabits per second of traffic on Magic Transit, so I’m going to select a 1 Gigabit per second interface, which is the smallest port speed available. I can also order a 10 Gbps link if I have more than 1 Gbps of traffic in a single location. Cloudflare also supports 100 Gbps CNIs, but if you have this much traffic to exchange with us, we recommend that you coordinate with your account team.

After selecting your preferred port speed, you can name your CNI, which will be referenceable later when you direct your Magic Transit or Magic WAN traffic to the interconnect. We are given the opportunity to verify that everything looks correct before confirming our CNI order.

Once we click the “Confirm Order” button, Cloudflare will provision an interface on our router for your CNI and assign IP addresses for you to configure on your router interface. Cloudflare will also issue a Letter of Authorization (LOA) so you can order a cross connect with the local facility. Cloudflare will provision a port on our router for your CNI within 3 minutes of your order, and you will be able to ping across the CNI as soon as the interface line status comes up.

After downloading the Letter of Authorization (LOA) to order a cross connect, we’ll navigate back to our Interconnects area. Here we can see the point-to-point IP addressing, and the CNI name that is used in our Magic Transit or Magic WAN configuration. We can also re-download the LOA if needed.

Simplified Magic Transit and Magic WAN onboarding

Our second major announcement is that Express CNI dramatically simplifies how Magic Transit and Magic WAN customers connect to Cloudflare. Getting packets into Magic Transit or Magic WAN in the past with a CNI required customers to configure a GRE (Generic Routing Encapsulation) tunnel on their router. These configurations are complex, and not all routers and switches support these changes. Since both Magic Transit and Magic WAN protect networks, and operate at the network layer on packets, customers rightly asked us, “If I connect directly to Cloudflare with CNI, why do I also need a GRE tunnel for Magic Transit and Magic WAN?”

Starting today, GRE tunnels are no longer required with Express CNI. This means that Cloudflare supports standard 1500-byte packets on the CNI, and there’s no need for complex GRE or MSS adjustment configurations to get traffic into Magic Transit or Magic WAN. This significantly reduces the amount of configuration required on a router for Magic Transit and Magic WAN customers who can connect over Express CNI. If you’re not familiar with Magic Transit, the key takeaway is that we’ve reduced the complexity of changes you must make on your router to protect your network with Cloudflare.
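A quick back-of-the-envelope calculation shows why GRE made MSS tuning necessary, assuming standard IPv4 header sizes and a base GRE header with no key or options:

ETHERNET_MTU = 1500   # standard MTU, now usable end-to-end with Express CNI
IPV4_HEADER = 20
GRE_HEADER = 4        # base GRE header, no key or options
TCP_HEADER = 20

# With GRE encapsulation, the inner packet must shrink to fit:
gre_tunnel_mtu = ETHERNET_MTU - IPV4_HEADER - GRE_HEADER   # 1476 bytes
gre_tcp_mss = gre_tunnel_mtu - IPV4_HEADER - TCP_HEADER    # 1436 bytes

# With Express CNI there is no encapsulation, so the default MSS applies:
express_cni_mss = ETHERNET_MTU - IPV4_HEADER - TCP_HEADER  # 1460 bytes

Without clamping the MSS to the smaller value, full-size TCP packets entering a GRE tunnel would have to be fragmented or dropped; Express CNI removes that concern entirely.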

What’s next for CNI?

We’re excited about how Express CNI simplifies connecting to Cloudflare’s network. Some customers connect to Cloudflare through our Interconnection Platform Partners, like Equinix and Megaport, and we plan to bring the Express CNI features to our partners too.

We have upgraded a number of our data centers to support Express CNI, and we are rapidly expanding to many more global locations over the next few months as we install new network hardware. If you’re interested in connecting to Cloudflare with Express CNI, but are unable to find your data center, please let your account team know.

If you’re on an existing classic CNI today, and you don’t need Express CNI features, there is no obligation to migrate to Express CNI. Magic Transit and Magic WAN customers have been asking for BGP support to control how Cloudflare routes traffic back to their networks, and we expect to extend BGP support to Express CNI first, so keep an eye out for more Express CNI announcements later this year.

Get started with Express CNI today

As we’ve demonstrated above, Express CNI makes it fast and easy to connect your network to Cloudflare. If you’re a Magic Transit or Magic WAN customer, the new “Interconnects” area is now available on your Cloudflare dashboard. To deploy your first CNI, you can follow along with the screenshots above, or refer to our updated interconnects documentation.

Introducing Cloudflare’s new Network Analytics dashboard

Post Syndicated from Omer Yoachimik original https://blog.cloudflare.com/network-analytics-v2-announcement/

We’re pleased to introduce Cloudflare’s new and improved Network Analytics dashboard. It’s now available to Magic Transit and Spectrum customers on the Enterprise plan.

The dashboard provides network operators better visibility into traffic behavior, firewall events, and DDoS attacks as observed across Cloudflare’s global network. Some of the dashboard’s data points include:

  1. Top traffic and attack attributes
  2. Visibility into DDoS mitigations and Magic Firewall events
  3. Detailed packet samples including full packet headers and metadata
Network Analytics – Drill down by various dimensions
Network Analytics – View traffic by mitigation system

This dashboard was the outcome of a full refactoring of our network-layer data logging pipeline. The new data pipeline is decentralized and much more flexible than the previous one — making it more resilient, performant, and scalable for when we add new mitigation systems, introduce new sampling points, and roll out new services. A technical deep-dive blog is coming soon, so stay tuned.

In this blog post, we will demonstrate how the dashboard helps network operators:

  1. Understand their network better
  2. Respond to DDoS attacks faster
  3. Easily generate security reports for peers and managers

Understand your network better

One of the main responsibilities network operators bear is ensuring the operational stability and reliability of their network. Cloudflare’s Network Analytics dashboard shows network operators where their traffic is coming from, where it’s heading, and what type of traffic is being delivered or mitigated. These insights, along with user-friendly drill-down capabilities, help network operators identify changes in traffic, surface abnormal behavior, and alert on critical events that require their attention — all of which helps them ensure their network’s stability and reliability.

Starting at the top, the Network Analytics dashboard shows network operators their traffic rates over time along with the total throughput. The entire dashboard is filterable: you can drill down using select-to-zoom, change the time range, and toggle between a packet and a bit/byte view. This can help you gain a quick understanding of traffic behavior and identify sudden dips or surges in traffic.

Cloudflare customers advertising their own IP prefixes from the Cloudflare network can also see annotations for BGP advertisement and withdrawal events. These provide additional context atop the traffic rates and behavior.

The Network Analytics dashboard time series and annotations

Geographical accuracy

One of the many benefits of Cloudflare’s Network Analytics dashboard is its geographical accuracy. Identification of the traffic source usually involves correlating the source IP addresses to a city and country. However, network-layer traffic is subject to IP spoofing. Malicious actors can spoof (alter) their source IP address to obfuscate their origin (or their botnet’s nodes) while attacking your network. Correlating the location (e.g., the source country) based on spoofed IPs would therefore result in spoofed countries. Using spoofed countries would skew the global picture network operators rely on.

To overcome this challenge and provide our users accurate geo-information, we rely on the location of the Cloudflare data center where the traffic was ingested. We’re able to achieve geographical accuracy with high granularity because we operate data centers in over 285 locations around the world. We use BGP anycast, which ensures traffic is routed to the nearest data center within the BGP catchment.

Traffic by Cloudflare data center country from the Network Analytics dashboard

Detailed mitigation analytics

The dashboard lets network operators understand exactly what is happening to their traffic while it’s traversing the Cloudflare network. The All traffic tab provides a summary of attack traffic that was dropped by the three mitigation systems, and the clean traffic that was passed to the origin.

The All traffic tab in Network Analytics

Each additional tab focuses on one mitigation system, showing the traffic dropped by that system and the traffic passed through it. This provides network operators with almost the same level of visibility as our internal support teams have, letting them understand exactly what Cloudflare systems are doing to their traffic and where in the Cloudflare stack an action is being taken.

Data path for Magic Transit customers

Using the detailed tabs, users can better understand the systems’ decisions and which rules are being applied to mitigate attacks. For example, in the Advanced TCP Protection tab, you can view how the system is classifying TCP connection states. In the screenshot below, you can see the distribution of packets according to connection state. For example, a sudden spike in Out of sequence packets may result in the system dropping them.

The Advanced TCP Protection tab in Network Analytics

Note that the set of tabs differs slightly for Spectrum customers: they do not have access to the Advanced TCP Protection and Magic Firewall tabs, and only see the first two tabs.

Respond to DDoS attacks faster

Cloudflare detects and mitigates the majority of DDoS attacks automatically. However, when a network operator responds to a sudden increase in traffic or a CPU spike in their data centers, they need to understand the nature of the traffic. Is this a legitimate surge due to a new game release for example, or an unmitigated DDoS attack? In either case, they need to act quickly to ensure there are no disruptions to critical services.

The Network Analytics dashboard can help network operators quickly pattern traffic by switching the time series’ grouping dimensions. They can then use that pattern to drop packets using Magic Firewall. The default dimension is the outcome, indicating whether traffic was dropped or passed, but by changing the time series dimension to another field, such as the TCP flag, Packet size, or Destination port, a pattern can emerge.

In the example below, we have zoomed in on a surge of traffic. By setting the Protocol field as the grouping dimension, we can see that there is a 5 Gbps surge of UDP packets (totaling 840 GB of the 991 GB of throughput in this time period). This is clearly not the traffic we want, so we can hover and click the UDP indicator to filter by it.

Distribution of a DDoS attack by IP protocols

We can then continue to pattern the traffic, and so we set the Source port to be the grouping dimension. We can immediately see that, in this case, the majority of traffic (838 GB) is coming from source port 123 (the NTP port, which is commonly abused in reflection and amplification attacks). That’s no bueno, so let’s filter by that too.

The UDP flood grouped by source port

We can continue iterating to identify the main pattern of the surge. An example of a field that is not necessarily helpful in this case is the Destination port. The time series only shows us the top five ports, but we can already see that the traffic is quite distributed.

The attack targets multiple destination ports

We move on to see what other fields can contribute to our investigation. Using the Packet size dimension yields good results: over 771 GB of the traffic is delivered in 286-byte packets.

Zooming in on a UDP flood originating from source port 123

Assuming that our attack is now sufficiently patterned, we can create a Magic Firewall rule to block the attack by combining those fields. You can combine additional fields to ensure you do not impact your legitimate traffic. For example, if the attack is only targeting a single prefix (e.g., 192.0.2.0/24), you can limit the scope of the rule to that prefix.

Creating a Magic Firewall rule directly from within the analytics dashboard
Creating a Magic Firewall rule to block a UDP flood
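The screenshots above create the rule from the dashboard, but the same rule can also be created through the API. The sketch below is a rough equivalent following the Rulesets API conventions that Magic Firewall uses; the account ID, token, and exact endpoint path are placeholders you should verify against the current Magic Firewall documentation.

import requests

ACCOUNT_ID = "your-account-id"   # placeholder
API_TOKEN = "your-api-token"     # placeholder

rule = {
    "action": "block",
    "description": "Drop patterned UDP flood: source port 123, 286-byte packets",
    "expression": '(ip.proto == "udp" and udp.srcport == 123 and ip.len == 286 '
                  'and ip.dst in {192.0.2.0/24})',
}

# Endpoint path assumed from the Rulesets API; confirm before use.
resp = requests.put(
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}"
    "/rulesets/phases/magic_transit/entrypoint",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"rules": [rule]},
)
resp.raise_for_status()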

If needed for attack mitigation or network troubleshooting, you can also view and export packet samples along with the packet headers. This can help you identify the pattern and sources of the traffic.

Example of packet samples with one sample expanded
Example of a packet sample with the header sections expanded

Generate reports

Another important role of the network security team is to provide decision makers with an accurate view of their threat landscape and network security posture. Understanding those enables teams and decision makers to prepare and to ensure their organization is protected and its critical services are kept available and performant. This is where, again, the Network Analytics dashboard comes in to help. Network operators can use the dashboard to understand their threat landscape — which endpoints are being targeted, by which types of attacks, where they are coming from, and how that compares to the previous period.

Dynamic, adaptive executive summary

Using the Network Analytics dashboard, users can create a custom report — filtered and tuned to provide their decision makers a clear view of the attack landscape that’s relevant to them.

In addition, Magic Transit and Spectrum users also receive an automated weekly Network DDoS Report which includes key insights and trends.

Extending visibility from Cloudflare’s vantage point

As we’ve seen in many cases, being unprepared can cost organizations substantial revenue, damage their reputation, and reduce users’ trust, as well as burn out teams that constantly need to put out fires reactively. Furthermore, impact to organizations operating in healthcare, water, electricity, and other critical infrastructure industries can cause very serious real-world problems, e.g., hospitals not being able to provide care for patients.

The Network Analytics dashboard aims to reduce the effort and time it takes network teams to investigate and resolve issues, as well as to simplify and automate security reporting. The data is also available via the GraphQL API and Logpush, allowing teams to integrate it into their internal systems and cross-reference it with additional data points.
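For example, here is a sketch of pulling network analytics data programmatically via the GraphQL API. The node and field names below are assumptions for illustration only; check the GraphQL Analytics API schema for the exact dataset and dimension names.

import requests

ACCOUNT_ID = "your-account-id"   # placeholder
API_TOKEN = "your-api-token"     # placeholder

# Node and field names below are illustrative assumptions.
query = """
query ($accountTag: String!) {
  viewer {
    accounts(filter: { accountTag: $accountTag }) {
      ipFlows1mGroups(limit: 10, filter: { datetime_gt: "2023-01-01T00:00:00Z" }) {
        sum { bits packets }
        dimensions { destinationIPAddress }
      }
    }
  }
}
"""

resp = requests.post(
    "https://api.cloudflare.com/client/v4/graphql",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"query": query, "variables": {"accountTag": ACCOUNT_ID}},
)
print(resp.json())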

To learn more about the Network Analytics dashboard, refer to the developer documentation.

Introducing Advanced DDoS Alerts

Post Syndicated from Omer Yoachimik original https://blog.cloudflare.com/advanced-ddos-alerts/

We’re pleased to introduce Advanced DDoS Alerts. Advanced DDoS Alerts are customizable and provide users the flexibility they need when managing many Internet properties. Users can easily define which alerts they want to receive — for which DDoS attack sizes, which protocols, and which Internet properties.

This release includes two types of Advanced DDoS Alerts:

  1. Advanced HTTP DDoS Attack Alerts – Available to WAF/CDN customers on the Enterprise plan, who have also subscribed to the Advanced DDoS Protection service.
  2. Advanced L3/4 DDoS Attack Alerts – Available to Magic Transit and Spectrum BYOIP customers on the Enterprise plan.

Standard DDoS Alerts are available to customers on all plans, including the Free plan. Advanced DDoS Alerts are part of Cloudflare’s Advanced DDoS service.

Why alerts?

Distributed Denial of Service (DDoS) attacks are cyber attacks that aim to take down your Internet properties and make them unavailable for your users. As early as 2017, Cloudflare pioneered Unmetered DDoS Protection to provide all customers with DDoS protection, without limits, ensuring that their Internet properties remain available. We’re able to make this level of commitment to our customers thanks to our automated DDoS protection systems. But if the systems operate automatically, why be alerted at all?

Well, to put it plainly, when our DDoS protection systems kick in, they insert ephemeral rules inline to mitigate the attack. Many of our customers operate business-critical applications and services. When our systems make a decision to insert a rule, customers might want to verify that all the malicious traffic is mitigated, and that legitimate user traffic is not. Our DDoS alerts begin firing as soon as our systems make a mitigation decision. By informing our customers about a decision to insert a rule in real time, we let them observe and verify that their Internet properties are both protected and available.

Managing many Internet properties

The standard DDoS Alerts notify you of DDoS attacks that target any and all of your Cloudflare-protected Internet properties. However, some of our customers may manage large numbers of Internet properties, ranging from hundreds to hundreds of thousands. The standard DDoS Alerts would notify users every time one of those properties came under attack — which could become very noisy.

The Advanced DDoS Alerts address this concern by allowing users to select the specific Internet properties that they want to be notified about; zones and hostnames for WAF/CDN customers, and IP prefixes for Magic Transit and Spectrum BYOIP customers.

Creating an Advanced HTTP DDoS Attack Alert: selecting zones and hostnames
Creating an Advanced L3/4 DDoS Attack Alert: selecting prefixes

One (attack) size doesn’t fit all

The standard DDoS Alerts alert you on DDoS attacks of any size. Well, almost any size. We implemented minimal alert thresholds to avoid spamming our customers’ email inboxes. Those limits are very small and not customer-configurable. As we’ve seen in the recent DDoS trends report, most of the attacks are very small — another reason why the standard DDoS Alert could become noisy for customers that only care about very large attacks. On the opposite end of the spectrum, choosing not to alert may become too quiet for customers that do want to be notified about smaller attacks.

The Advanced DDoS Alerts let customers choose their own alert threshold. WAF/CDN customers can define the minimum request-per-second rate of an HTTP DDoS attack alert. Magic Transit and Spectrum BYOIP customers can define the packet-per-second and Megabit-per-second rates of an L3/4 DDoS attack alert.

Creating an Advanced HTTP DDoS Attack Alert: defining request rate
Creating an Advanced L3/4 DDoS Attack Alert: defining packet/bit rate

Not all protocols are created equal

As part of the Advanced L3/4 DDoS Alerts, we also let our users define the protocols to be alerted on. If a Magic Transit customer manages mostly UDP applications, they may not care to be alerted about TCP-based DDoS attacks. Similarly, if a Spectrum BYOIP customer only cares about HTTP/TCP traffic, attacks over other protocols may be of no concern to them.

Creating an Advanced L3/4 DDoS Attack Alert: selecting the protocols

Creating an Advanced DDoS Alert

We’ll show here how to create an Advanced HTTP DDoS Alert; the process for creating an L3/4 alert is similar. You can view a more detailed guide on our developers website.

First, log in to your Cloudflare account, navigate to Notifications, and click Add. Then select the Advanced HTTP DDoS Attack Alert or Advanced L3/4 DDoS Attack Alert (based on your eligibility). Give your alert a name and an optional description, add your preferred delivery method (e.g., Webhook), and click Next.

Step 1: Creating an Advanced HTTP DDoS Attack Alert

Second, select the domains you’d like to be alerted on. You can also narrow it down to specific hostnames. Define the minimum request-per-second rate to be alerted on, click Save, and voilà.

Step 2: Defining the Advanced HTTP DDoS Attack Alert conditions
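If you choose a webhook as the delivery method, the receiving side can be as simple as a small HTTP handler. Here’s a minimal sketch; the payload contents are illustrative, so consult the Notifications documentation for the exact webhook schema.

from flask import Flask, request

app = Flask(__name__)

@app.route("/cloudflare-ddos-alert", methods=["POST"])
def ddos_alert():
    alert = request.get_json(force=True)
    # Payload fields are illustrative; check the Notifications docs.
    print("DDoS alert received:", alert)
    # e.g., page the on-call engineer or snapshot dashboards here
    return "", 204

if __name__ == "__main__":
    app.run(port=8080)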

Actionable alerts for making better decisions

Cloudflare Advanced DDoS Alerts aim to provide our customers with configurable controls to make better decisions for their own environments. Customers can now be alerted on attacks based on which domain/prefix is being attacked, the size of the attack, and the protocol of the attack. We recognize that the power to configure and control DDoS attack alerts should ultimately be left up to our customers, and we are excited to announce the availability of this functionality.

Want to learn more about Advanced DDoS Alerts? Visit our developer site.

Interested in upgrading to get Advanced DDoS Alerts? Contact your account team.

New to Cloudflare? Speak to a Cloudflare expert.

Introducing Cloudflare Adaptive DDoS Protection – our new traffic profiling system for mitigating DDoS attacks

Post Syndicated from Omer Yoachimik original https://blog.cloudflare.com/adaptive-ddos-protection/

Every Internet property is unique, with its own traffic behaviors and patterns. For example, a website may only expect user traffic from certain geographies, and a network might only expect to see a limited set of protocols.

Understanding that the traffic patterns of each Internet property are unique is what led us to develop the Adaptive DDoS Protection system. Adaptive DDoS Protection joins our existing suite of automated DDoS defenses and takes it to the next level. The new system learns your unique traffic patterns and adapts to protect against sophisticated DDoS attacks.

Adaptive DDoS Protection is now generally available to Enterprise customers:

  • HTTP Adaptive DDoS Protection – available to WAF/CDN customers on the Enterprise plan, who have also subscribed to the Advanced DDoS Protection service.
  • L3/4 Adaptive DDoS Protection – available to Magic Transit and Spectrum customers on an Enterprise plan.

Adaptive DDoS Protection learns your traffic patterns

The Adaptive DDoS Protection system creates a traffic profile by looking at a customer’s maximal rates of traffic every day, over the past seven days. The profiles are recalculated every day using the past seven days’ history. We then store the maximal traffic rates seen for every predefined dimension value. Every profile uses one dimension; these dimensions include the source country of the request, the country of the Cloudflare data center that received the IP packet, user agent, IP protocol, destination ports, and more.

So, for example, for the profile that uses the source country as a dimension, the system will log the maximal traffic rates seen per country, e.g., 2,000 requests per second (rps) for Germany, 3,000 rps for France, 10,000 rps for Brazil, and so on. This example is for HTTP traffic, but Adaptive DDoS Protection also profiles L3/4 traffic for our Magic Transit and Spectrum Enterprise customers.

Another note on the maximal rates: we use the 95th percentile. That is, we look at the maximal rates and discard the top 5% of the highest rates, in order to eliminate outliers from the calculations.
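As a simple illustration of the idea (synthetic data, not Cloudflare’s implementation), computing a 95th-percentile ceiling over a week of per-minute rates might look like this:

import numpy as np

# Seven days of per-minute request rates for one dimension value
# (e.g., source country "DE"); synthetic data for illustration.
rates = np.random.poisson(lam=2000, size=7 * 24 * 60)

# Discard the top 5% of rates so short outlier bursts don't inflate
# the learned profile.
profile_ceiling = np.percentile(rates, 95)
print(f"learned rate ceiling: {profile_ceiling:.0f} requests per second")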

Calculating traffic profiles is done asynchronously, meaning that it does not add any latency to our customers’ traffic. The system then distributes a compact representation of each profile across our network, where it can be consumed by our DDoS protection systems to detect and mitigate DDoS attacks in a much more cost-efficient manner.

In addition to the traffic profiles, Adaptive DDoS Protection also leverages Cloudflare’s machine learning-generated Bot Scores as an additional signal to differentiate between user and automated traffic. The purpose of using these scores is to distinguish legitimate spikes in user traffic that deviate from the traffic profile from spikes of automated and potentially malicious traffic.

Out of the box and easy to use

Adaptive DDoS Protection just works out of the box. It automatically creates the profiles, and then customers can tweak and tune the settings as they need via DDoS Managed Rules. Customers can change the sensitivity level, leverage expression fields to create overrides (e.g. exclude this type of traffic), and change the mitigation action to tailor the behavior of the system to their specific needs and traffic patterns.

Adaptive DDoS Protection complements the existing DDoS protection systems, which leverage dynamic fingerprinting to detect and mitigate DDoS attacks. The two work in tandem to protect our customers from DDoS attacks. When Cloudflare customers onboard a new Internet property to Cloudflare, dynamic fingerprinting protects them automatically and out of the box — without requiring any user action. Once Adaptive DDoS Protection learns their legitimate traffic patterns and creates a profile, users can turn it on to provide an extra layer of protection.

Rules included as part of the Adaptive DDoS Protection

As part of this release, we’re pleased to announce the following capabilities as part of Cloudflare’s Adaptive DDoS Protection:

Profiling dimensions and their availability:

  • Origin errors – WAF/CDN customers on the Enterprise plan with Advanced DDoS
  • Client IP Country & region – WAF/CDN customers on the Enterprise plan with Advanced DDoS; coming soon for Magic Transit & Spectrum Enterprise customers
  • User Agent (globally, not per customer*) – WAF/CDN customers on the Enterprise plan with Advanced DDoS
  • IP Protocol – Magic Transit & Spectrum Enterprise customers
  • Combination of IP Protocol and Destination Port – coming soon for Magic Transit & Spectrum Enterprise customers

*The User-Agent-aware feature analyzes, learns and profiles all the top user agents that we see across the Cloudflare network. This feature helps us identify DDoS attacks that leverage legacy or wrongly configured user agents.

Excluding UA-aware DDoS Protection, Adaptive DDoS Protection rules are deployed in Log mode. Customers can observe the traffic that’s flagged, tweak the sensitivity if needed, and then deploy the rules in mitigation mode. You can follow the steps outlined in this guide to do so.

Making the impact of DDoS attacks a thing of the past

Our mission at Cloudflare is to help build a better Internet. The DDoS Protection team’s vision is derived from this mission: our goal is to make the impact of DDoS attacks a thing of the past. Cloudflare’s Adaptive DDoS Protection takes us one step closer to achieving that vision: making Cloudflare’s DDoS protection even more intelligent, sophisticated, and tailored to our customers’ unique traffic patterns and individual needs.

Want to learn more about Cloudflare’s Adaptive DDoS Protection? Visit our developer site.

Interested in upgrading to get access to Adaptive DDoS Protection? Contact your account team.

New to Cloudflare? Speak to a Cloudflare expert.

Integrating Network Analytics Logs with your SIEM dashboard

Post Syndicated from Omer Yoachimik original https://blog.cloudflare.com/network-analytics-logs/

We’re excited to announce the availability of Network Analytics Logs. Magic Transit, Magic Firewall, Magic WAN, and Spectrum customers on the Enterprise plan can feed packet samples directly into storage services, network monitoring tools such as Kentik, or their Security Information and Event Management (SIEM) systems such as Splunk to gain near real-time visibility into network traffic and DDoS attacks.

What’s included in the logs

Once you create a Network Analytics Logs job, Cloudflare will continuously push logs of packet samples directly to the HTTP endpoint of your choice, including WebSockets. The logs arrive in JSON format, which makes them easy to parse, transform, and aggregate. The logs include packet samples of traffic dropped and passed by the following systems:

  1. Network-layer DDoS Protection Ruleset
  2. Advanced TCP Protection
  3. Magic Firewall

Note that not all mitigation systems are applicable to all Cloudflare services. Here is which mitigation system applies to which Cloudflare service:

  • Network-layer DDoS Protection Ruleset – Magic Transit and Spectrum
  • Advanced TCP Protection – Magic Transit
  • Magic Firewall – Magic Transit and Magic WAN

Packets are processed by the mitigation systems in the order outlined above. Therefore, a packet that passed all three systems may produce three packet samples, one from each system. This can be very insightful when troubleshooting and trying to understand where in the stack a packet was dropped. To avoid overcounting the total passed traffic, Magic Transit users should only take into consideration the passed packets from the last mitigation system, Magic Firewall.

An example of a packet sample log:

{"AttackCampaignID":"","AttackID":"","ColoName":"bkk06","Datetime":1652295571783000000,"DestinationASN":13335,"Direction":"ingress","IPDestinationAddress":"(redacted)","IPDestinationSubnet":"/24","IPProtocol":17,"IPSourceAddress":"(redacted)","IPSourceSubnet":"/24","MitigationReason":"","MitigationScope":"","MitigationSystem":"magic-firewall","Outcome":"pass","ProtocolState":"","RuleID":"(redacted)","RulesetID":"(redacted)","RulesetOverrideID":"","SampleInterval":100,"SourceASN":38794,"Verdict":"drop"}

All the available log fields are documented here: https://developers.cloudflare.com/logs/reference/log-fields/account/network_analytics_logs/
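To illustrate the overcounting note above, here is a hedged sketch that tallies passed traffic from only the final mitigation system and scales each sample by its sampling interval. The packet-size field name is hypothetical; map it to the real field in the log-fields reference.

import json

def estimate_passed_bytes(log_lines):
    """Estimate total passed bytes from Network Analytics Logs samples."""
    total = 0
    for line in log_lines:
        sample = json.loads(line)
        # Count only samples passed by the last system, Magic Firewall,
        # to avoid counting the same packet up to three times.
        if (sample["MitigationSystem"] == "magic-firewall"
                and sample["Outcome"] == "pass"):
            # "PacketSize" is a hypothetical field name for illustration.
            total += sample.get("PacketSize", 0) * sample["SampleInterval"]
    return total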

Setting up the logs

In this walkthrough, we will demonstrate how to feed the Network Analytics Logs into Splunk via Postman. At this time, it is only possible to set up Network Analytics Logs via API. Setting up the logs requires three main steps:

  1. Create a Cloudflare API token.
  2. Create a Splunk Cloud HTTP Event Collector (HEC) token.
  3. Create and enable a Cloudflare Logpush job.

Let’s get started!

1) Create a Cloudflare API token

  1. Log in to your Cloudflare account and navigate to My Profile.
  2. On the left-hand side, in the collapsing navigation menu, click API Tokens.
  3. Click Create Token and then, under Custom token, click Get started.
  4. Give your custom token a name, and select an Account-scoped permission to edit Logs. You can scope the token to a specific account, a subset of your accounts, or all of them.
  5. At the bottom, click Continue to summary, and then Create Token.
  6. Copy and save your token. You can also test your token with the provided snippet in Terminal.

When you’re using an API token, you don’t need to provide your email address as part of the API credentials.
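The Terminal snippet offered by the dashboard simply calls the token verification endpoint; an equivalent check in Python:

import requests

API_TOKEN = "your-api-token"   # placeholder

resp = requests.get(
    "https://api.cloudflare.com/client/v4/user/tokens/verify",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
)
print(resp.json())   # expect "success": true and token status "active"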

Read more about creating an API token on the Cloudflare Developers website: https://developers.cloudflare.com/api/tokens/create/

2) Create a Splunk token for an HTTP Event Collector

In this walkthrough, we’re using a Splunk Cloud free trial, but you can use almost any service that can accept logs over HTTPS. In some cases, if you’re using an on-premises SIEM solution, you may need to allowlist Cloudflare IP addresses in your firewall to be able to receive the logs.

  1. Create a Splunk Cloud account. I created a trial account for the purpose of this blog.
  2. In the Splunk Cloud dashboard, go to Settings > Data Input.
  3. Next to HTTP Event Collector, click Add new.
  4. Follow the steps to create a token.
  5. Copy your token and your allocated Splunk hostname and save both for later.

Read more about using Splunk with Cloudflare Logpush on the Cloudflare Developers website: https://developers.cloudflare.com/logs/get-started/enable-destinations/splunk/

Read more about creating an HTTP Event Collector token on Splunk’s website: https://docs.splunk.com/Documentation/Splunk/8.2.6/Data/UsetheHTTPEventCollector

3) Create a Cloudflare Logpush job

Creating and enabling a job is very straightforward: it requires only one API call to Cloudflare.

To send the API calls I used Postman, which is a user-friendly API client that was recommended to me by a colleague. It allows you to save and customize API calls. You can also use Terminal/CMD or any other API client/script of your choice.

One thing to note is that Network Analytics Logs are account-scoped. The API endpoint is therefore a tad different from what you would normally use for zone-scoped datasets such as HTTP request logs and DNS logs.

This is the endpoint for creating an account-scoped Logpush job:

https://api.cloudflare.com/client/v4/accounts/{account-id}/logpush/jobs

Your account identifier is a unique, 32-character string of numbers and letters. If you’re not sure what your account identifier is, log in to Cloudflare, select the appropriate account, and copy the string at the end of the URL.

https://dash.cloudflare.com/{account-id}

Then, set up a new request in Postman (or any other API client/CLI tool).

To successfully create a Logpush job, you’ll need the HTTP method, URL, Authorization token, and request body (data). The request body must include a destination configuration (destination_conf), the specified dataset (network_analytics_logs, in our case), and the token (your Splunk token).

Integrating Network Analytics Logs with your SIEM dashboard

Method:

POST

URL:

https://api.cloudflare.com/client/v4/accounts/{account-id}/logpush/jobs

Authorization: Define a Bearer authorization in the Authorization tab, or add it to the header, and add your Cloudflare API token.

Body: Select Raw > JSON

{
"destination_conf": "{your-unique-splunk-configuration}",
"dataset": "network_analytics_logs",
"token": "{your-splunk-hec-tag}",
"enabled": "true"
}

If you’re using Splunk Cloud, then your unique configuration has the following format:

{your-unique-splunk-configuration}=splunk://{your-splunk-hostname}.splunkcloud.com:8088/services/collector/raw?channel={channel-id}&header_Authorization=Splunk%20{your-splunk-hec-token}&insecure-skip-verify=false

Definition of the variables:

{your-splunk-hostname} = Your allocated Splunk Cloud hostname.

{channel-id} = A unique ID that you choose to assign to the channel.

{your-splunk-hec-token} = The token that you generated for your Splunk HEC.

An important note: you should have a valid SSL/TLS certificate on your Splunk instance to support an encrypted connection.
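Putting the pieces together, here is the same job-creation request as a Python sketch instead of Postman; every placeholder value is hypothetical and should be replaced with your own identifiers:

import requests

ACCOUNT_ID = "your-account-id"              # 32-character account identifier
API_TOKEN = "your-api-token"
SPLUNK_HOSTNAME = "your-splunk-hostname"    # allocated Splunk Cloud hostname
SPLUNK_HEC_TOKEN = "your-splunk-hec-token"
CHANNEL_ID = "your-channel-id"

destination_conf = (
    f"splunk://{SPLUNK_HOSTNAME}.splunkcloud.com:8088/services/collector/raw"
    f"?channel={CHANNEL_ID}"
    f"&header_Authorization=Splunk%20{SPLUNK_HEC_TOKEN}"
    "&insecure-skip-verify=false"
)

resp = requests.post(
    f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/logpush/jobs",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={
        "destination_conf": destination_conf,
        "dataset": "network_analytics_logs",
        "token": SPLUNK_HEC_TOKEN,
        "enabled": True,
    },
)
print(resp.json())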

After you’ve done that, you can create a GET request to the same URL (no request body needed) to verify that the job was created and is enabled.

The response should be similar to the following:

{
    "errors": [],
    "messages": [],
    "result": {
        "id": {job-id},
        "dataset": "network_analytics_logs",
        "frequency": "high",
        "kind": "",
        "enabled": true,
        "name": null,
        "logpull_options": null,
        "destination_conf": "{your-unique-splunk-configuration}",
        "last_complete": null,
        "last_error": null,
        "error_message": null
    },
    "success": true
}

Shortly after, you should start receiving logs to your Splunk HEC.

Read more about enabling Logpush on the Cloudflare Developers website: https://developers.cloudflare.com/logs/reference/logpush-api-configuration/examples/example-logpush-curl/

Reduce costs with R2 storage

Depending on the amount of logs that you read and write, the cost of third party cloud storage can skyrocket — forcing you to decide between managing a tight budget and being able to properly investigate networking and security issues. However, we believe that you shouldn’t have to make those trade-offs. With R2’s low costs, we’re making this decision easier for our customers. Instead of feeding logs to a third party, you can reap the cost benefits of storing them in R2.

To learn more about the R2 features and pricing, check out the full blog post. To enable R2, contact your account team.

Cloudflare logs for maximum visibility

Cloudflare Enterprise customers have access to detailed logs of the metadata generated by our products. These logs are helpful for troubleshooting, identifying network and configuration adjustments, and generating reports, especially when combined with logs from other sources, such as your servers, firewalls, routers, and other appliances.

Network Analytics Logs joins Cloudflare’s family of products on Logpush: DNS logs, Firewall events, HTTP requests, NEL reports, Spectrum events, Audit logs, Gateway DNS, Gateway HTTP, and Gateway Network.

Not using Cloudflare yet? Start now with our Free and Pro plans to protect your websites against DDoS attacks, or contact us for comprehensive DDoS protection and firewall-as-a-service for your entire network.

Cloudflare partners with Kentik to enhance on-demand DDoS protection

Post Syndicated from Matt Lewis original https://blog.cloudflare.com/kentik-and-magic-transit/

We are excited to announce that as of today, network security teams can procure and use Magic Transit, Cloudflare’s industry-leading DDoS mitigation solution, and Kentik’s network observability as an integrated solution. We look forward to helping our customers not just with technical simplicity, but with business process simplicity as well.

Why monitoring and mitigation?

Distributed Denial of Service (DDoS) attacks are highly disruptive to businesses everywhere. According to the Cloudflare DDoS Attack Trends report, in the first half of 2021 the world witnessed massive ransomware and ransom DDoS attack campaigns that interrupted critical infrastructure, including oil pipelines, healthcare, and financial services. In the second half, we saw a growing swarm of attacks, including one of the most powerful botnets deployed (Meris), with record-breaking network-layer attacks observed on the Cloudflare network.

Along with an increase in severity, there is a proliferation of automated toolkits that make it simple and cheap for anyone to launch these attacks. Detecting and stopping these attacks manually is not effective, and network security engineers are increasingly turning to automated tools to help ensure network and application availability.

DDoS protection has evolved over the years from appliances to hybrid models to fully Internet-native solutions, like Cloudflare’s Magic Transit. Cloudflare has been protecting millions of Internet properties against DDoS attacks, ensuring they are available at all times. Magic Transit extends Cloudflare’s industry-leading DDoS protection to shield entire IP subnets from DDoS attacks, while also accelerating network traffic, ensuring your data centers, cloud services, and corporate networks are always reachable from the Internet. Our powerful global network, spanning 250+ cities with 121 Tbps of capacity, ensures that customers can have always-on DDoS protection without impacting network latency and application performance. Magic Transit also supports on-demand mode, which allows customers to activate DDoS protection when they need it most.

Network observability becomes critical to understand what normal looks like for your environment so that DDoS attacks are readily detected. Flow-based monitoring helps you understand not only how much traffic is flowing over your network, but also where it came from, where it’s going, and what applications are consuming bandwidth.

Magic Transit protection for every network configuration

Magic Transit is one of the most powerful DDoS mitigation platforms available today. We have worked hard to ensure Magic Transit is flexible enough for the most demanding network architectures. We need to fit into your world, not the other way around. And that involves partnering with leading network observability vendors to give you multiple options for how you choose to protect your network.

With this new partnership, customers can now consume Cloudflare’s Magic Transit service in one of three modes:

  • Always On — Customers looking for fast mitigation and traffic acceleration can deploy Magic Transit in Always On mode.
  • On Demand — Customers can choose to turn on Magic Transit in response to a DDoS attack via Cloudflare’s UI or Cloudflare’s Magic Transit API.
  • On Demand + Flow-based Monitoring — Customers can now purchase and deploy an integrated network observability and DDoS protection solution consisting of Cloudflare Magic Transit On Demand and Kentik Protect from a single vendor.

In each configuration, Magic Transit is seamlessly paired with Magic Firewall — our cloud-native firewall-as-a-service.

Why Kentik’s flow-based monitoring?

At Cloudflare, we continuously take feedback from our customers on both our product and on what other tools they use. Customer feedback helps us build our products and how we grow Cloudflare’s Technology Partner Program.

We found that many of our customers who chose Magic Transit On Demand have adopted solutions from Kentik, the network observability company with one of the leading flow-based monitoring tools in the ecosystem. Kentik empowers network professionals to plan, run, and fix any network with observability into all their traffic.

Simplifying network security

Cloudflare strives to simplify how customers can shield their network from cybersecurity threats like DDoS attacks. Magic Transit gives network security professionals the confidence that their network resources are immune from DDoS-related outages. We have now extended that same simplicity to this joint solution, making it simple for our customers to procure, provision, and integrate Magic Transit and Kentik. Our end goal is always creating the best experience possible for our customers, with Cloudflare’s services fitting seamlessly into their existing technology stack.

Kentik’s powerful network observability cloud collects flow logs from your network components and continuously learns network behavior, detecting anomalies such as DDoS attacks. Using our native API integration, the Kentik platform can trigger Magic Transit to start attracting network traffic when there’s an attack underway. Magic Transit’s autonomous DDoS mitigation automatically analyzes incoming traffic and filters out DDoS traffic across the entire Cloudflare network, protecting your network from unwanted traffic and avoiding service availability issues and outages.
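For a sense of what that trigger looks like in practice, here is a rough sketch of activating on-demand advertisement of a protected prefix through Cloudflare’s BGP prefix API; the IDs are placeholders, and the endpoint should be verified against current documentation before use:

import requests

ACCOUNT_ID = "your-account-id"   # placeholder
PREFIX_ID = "your-prefix-id"     # placeholder
API_TOKEN = "your-api-token"     # placeholder

def activate_magic_transit():
    """Start advertising a protected prefix from Cloudflare's edge (on-demand mode)."""
    resp = requests.patch(
        f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}"
        f"/addressing/prefixes/{PREFIX_ID}/bgp/status",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={"advertised": True},
    )
    resp.raise_for_status()

# A monitoring platform like Kentik would invoke this when it detects an attack.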

Together, Kentik and Cloudflare have created a well-supported integration and a more streamlined procurement process to combine Kentik’s best-of-breed network observability and Cloudflare’s industry-leading DDoS protection in Magic Transit. Customers can now receive the best DDoS protection and network observability in a completely SaaS-based offering.

“We are excited to partner with Cloudflare to make it easier for our mutual customers to integrate our leading technology solutions and deploy industry-leading DDoS protection in a fully SaaS-based environment,” said Mike Mooney, CRO at Kentik.

Conclusion

Now, customers seeking to combine purpose-built, best-of-breed network observability and visualization from Kentik with Cloudflare’s Magic Transit On Demand can do so through a single vendor agreement and an integrated solution.

If you’d like to learn more about DDoS attack trends and how Kentik and Cloudflare combine to provide the leading SaaS-based DDoS protection solution with over 121 Tbps of capacity, review our developer documentation and join our upcoming webinar on April 28.

Optimizing Magic Firewall’s IP lists

Post Syndicated from Jordan Griege original https://blog.cloudflare.com/magic-firewall-optimizing-ip-lists/

Magic Firewall is Cloudflare’s replacement for network-level firewall hardware. It evaluates gigabits of traffic every second against user-defined rules that can include millions of IP addresses. Writing a firewall rule for each IP address is cumbersome and verbose, so we have been building out support for various IP lists in Magic Firewall—essentially named groups that make the rules easier to read and write. Some users want to reject packets based on our growing threat intelligence of bad actors, while others know the exact set of IPs they want to match, which Magic Firewall supports via the same API as Cloudflare’s WAF.

With all those IPs, the system was using more of our memory budget than we’d like. To understand why, we need to first peek behind the curtain of our magic.

Life inside a network namespace

Magic Transit and Magic WAN enable Cloudflare to route layer 3 traffic, and they are the front door for Magic Firewall. We have previously written about how Magic Transit uses network namespaces to route packets and isolate customer configuration. Magic Firewall operates inside these namespaces, using nftables as the primary implementation of packet filtering.

When a user makes an API request to configure their firewall, a daemon running on every server detects the change and makes the corresponding changes to nftables. Because nftables configuration is stored within the namespace, each user’s firewall rules are isolated from the next. Every Cloudflare server routes traffic for all of our customers, so it’s important that the firewall only applies a customer’s rules to their own traffic.

Using our existing technologies, we built IP lists using nftables sets. In practice, this means that the wirefilter expression

ip.geoip.country == "US"

is implemented as

table ip mfw {
	set geoip.US {
		typeof ip saddr
		elements = { 192.0.2.0, … }
	}
	chain input {
		ip saddr @geoip.US drop
	}
}

This design for the feature made it very easy to extend our wirefilter syntax so that users could write rules using all the different IP lists we support.

Scaling to many customers

We chose this design since sets fit easily into our existing use of nftables, and we shipped quickly, starting with just a few relatively small lists for a handful of customers. This approach worked well at first, but we eventually observed a dramatic increase in our system’s memory use. It’s common practice for our team to enable a new feature for just a few customers at first. Here’s what we observed as those customers started using IP lists:

[Plot: system memory use rising as customers adopt IP lists]

It turns out you can’t store millions of IPs in the kernel without someone noticing, but the real problem is that we pay that cost for every customer that uses the lists. Each network namespace gets a complete copy of all the lists its rules reference. To make that perfectly clear: if 10 users each have a rule that uses the anonymizer list, then a server stores 10 copies of that list in the kernel. Yikes!
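
To see the duplication concretely, you could count the elements of the same set inside two customer namespaces. This is a hypothetical illustration; the namespace names are assumptions:

# Each namespace stores its own full copy of the set's elements.
ip netns exec customer-1 nft list set ip mfw geoip.US | wc -l
ip netns exec customer-2 nft list set ip mfw geoip.US | wc -l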

And it isn’t just the storage of these IPs that caused alarm; the sheer size of the nftables configuration caused nft (the command-line program for configuring nftables) to use quite a bit of compute power.

[Plot: periodic CPU spikes coinciding with nftables configuration updates]

Can you guess when the nftables updates take place? On its own, the CPU usage wasn’t particularly problematic, as the service is already quite lean. However, those spikes created competition with other services that heavily use netlink sockets. At one point, a monitoring alert for another service started firing for high CPU, and a fellow development team went through the incident process only to realize our system was the primary contributor to the problem. They made use of the --terse flag to prevent nft from wasting precious time rendering huge lists of IPs, but degrading another team’s service was a harsh way to learn that lesson.

We put quite a bit of effort into working around this problem. We tried doing incremental updates of the sets, adding or deleting only the elements that had changed since the last update. That helped sometimes, but the complexity ended up not being worth the small savings. Through a lot of profiling, we found that splitting an update across multiple statements improved the situation somewhat. This means we changed

add element ip mfw set0 { 192.0.2.0, 192.0.2.2, …, 198.51.100.0, 198.51.100.2, … }

to

add element ip mfw set0 { 192.0.2.0, … }
add element ip mfw set0 { 198.51.100.0, … }

just to shave a second or two off the time nft took to process our configuration. We were spending a fair number of development cycles looking for optimizations, and we needed a way to turn the tide.

One more step with eBPF

As we mentioned on our blog recently, Magic Firewall leverages eBPF for some advanced packet matching. It may not be a big surprise that eBPF can match an IP in a list, but it wasn’t a tool we thought to reach for a year ago. What makes eBPF relevant to this problem is that eBPF maps exist independently of any network namespace. That’s right: eBPF gives us a way to share this data across all the network namespaces we create for our customers.

Since several of our IP lists contain ranges, we can’t use a simple BPF_MAP_TYPE_HASH or BPF_MAP_TYPE_ARRAY to perform a lookup. Fortunately, Linux already has a data structure that we can use to accommodate ranges: the BPF_MAP_TYPE_LPM_TRIE.

Given an IP address, the trie’s implementation of bpf_map_lookup_elem() will return a non-null result if the IP falls within any of the ranges already inserted into the map. And using the map is about as simple as it gets for eBPF programs. Here’s an example:

typedef struct {
	__u32 prefix_len;
	__u8 ip[4];
} trie_key_t;

SEC("maps")
struct bpf_map_def ip_list = {
    .type = BPF_MAP_TYPE_LPM_TRIE,
    .key_size = sizeof(trie_key_t),
    .value_size = 1,
    .max_entries = 1, /* example size only; a real list needs max_entries sized to its element count */
    .map_flags = BPF_F_NO_PREALLOC,
};

SEC("socket/saddr_in_list")
int saddr_in_list(struct __sk_buff *skb) {
	struct iphdr iph;
	trie_key_t trie_key;

	bpf_skb_load_bytes(skb, 0, &iph, sizeof(iph));

	trie_key.prefix_len = 32;
	trie_key.ip[0] = iph.saddr & 0xff;
	trie_key.ip[1] = (iph.saddr >> 8) & 0xff;
	trie_key.ip[2] = (iph.saddr >> 16) & 0xff;
	trie_key.ip[3] = (iph.saddr >> 24) & 0xff;

	void *val = bpf_map_lookup_elem(&ip_list, &trie_key);
	return val == NULL ? BPF_OK : BPF_DROP;
}
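
To try something like this out, the program and map can be compiled, pinned, and populated with clang and bpftool. This is a rough sketch of the workflow, not our production tooling; the file names, pin paths, and example element are assumptions:

# Compile for the BPF target.
clang -O2 -g -target bpf -c ip_list.c -o ip_list.o

# Load the program and pin it (and its map) under the BPF filesystem.
bpftool prog load ip_list.o /sys/fs/bpf/saddr_in_list
bpftool map pin name ip_list /sys/fs/bpf/ip_list

# Insert 198.51.100.0/24: the key is a little-endian u32 prefix length
# followed by the IP bytes, matching trie_key_t above.
bpftool map update pinned /sys/fs/bpf/ip_list key 24 0 0 0 198 51 100 0 value 1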

nftables befriends eBPF

One of the limitations of our prior work incorporating eBPF into the firewall involves how the program is loaded into the nftables ruleset. Because the nft command-line program does not support calling a BPF program from a rule, we formatted the Netlink messages ourselves. While this allowed us to insert a rule that executes an eBPF program, it came at the cost of losing out on atomic rule replacements.

So any native nftables rules could be applied in one transaction, but any eBPF-based rules would have to be inserted afterwards. This introduces some headaches for handling failures—if inserting an eBPF rule fails, should we simply retry, or perhaps roll back the nftables ruleset as well? And what if those follow-up operations fail?

In other words, we went from a relatively simple operation—the new firewall configuration either fully applied or didn’t—to a much more complex one. We didn’t want to keep using more and more eBPF without solving this problem. After hacking on the nftables repo for a few days, we came out with a patch that adds a BPF keyword to the nftables configuration syntax.

I won’t cover all the changes, which we hope to upstream soon, but there were two key parts. The first is implementing struct expr_ops and the methods it requires, which is how nft’s parser knows what to do with each keyword in its configuration language (like the “bpf” keyword we introduced). The second is just a different approach to what we solved previously. Instead of directly formatting struct xt_bpf_info_v1 into a byte array as iptables does, nftables uses the nftnl library to create the netlink messages.

Once we found our footing in a foreign codebase, the code turned out fairly simple.

static void netlink_gen_bpf(struct netlink_linearize_ctx *ctx,
                            const struct expr *expr,
                            enum nft_registers dreg)
{
       struct nftnl_expr *nle;

       nle = alloc_nft_expr("match");
       nftnl_expr_set_str(nle, NFTNL_EXPR_MT_NAME, "bpf");
       nftnl_expr_set_u8(nle, NFTNL_EXPR_MT_REV, 1);
       nftnl_expr_set(nle, NFTNL_EXPR_MT_INFO, expr->bpf.info, sizeof(struct xt_bpf_info_v1));
       nft_rule_add_expr(ctx, nle, &expr->location);
}

With the patch built, we can write configuration like the following, providing a path to any pinned eBPF program:

# nft -f - <<EOF
table ip mfw {
  chain input {
    bpf pinned "/sys/fs/bpf/filter" drop
  }
}
EOF

And now we can mix native nftables matches with eBPF programs without giving up the atomic nature of nft commands. This unlocks the opportunity for us to build even more advanced packet inspection and protocol detection in eBPF.
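
For example, a single atomic transaction can now carry both kinds of matches; the address and pin path below are illustrative:

# nft -f - <<EOF
table ip mfw {
  chain input {
    ip saddr 198.51.100.0/24 drop
    bpf pinned "/sys/fs/bpf/filter" drop
  }
}
EOF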

Results

This plot shows kernel memory over the course of the deployment; the change was rolled out to each namespace over a few hours.

[Plot: kernel slab memory dropping during the rollout]

At this point, though, we had merely moved memory out of this slab. Fortunately, the eBPF map memory now appears in the cgroup for our Golang process, so it’s relatively easy to report. Here’s the point at which we first started populating the maps:

[Plot: eBPF map memory appearing in the Golang process’s cgroup]

This change was pushed a couple of weeks before the maps were actually used in the firewall (i.e., the graph just above). Map memory has remained constant ever since: unlike in our original design, the number of customers is no longer a predominant factor in this feature’s resource usage. The new model makes us more confident about how the service will behave as our product continues to grow.

Beyond positioning us to use eBPF more in the future, this was a significant improvement to the efficiency of the product today. In any project that grows rapidly for a while, an engineer can often see a few pesky pitfalls looming on the horizon, and it is very satisfying to dive deep into these technologies and come out with a creative, novel solution before experiencing too much pain. We love solving hard problems, improving our systems to handle rapid growth, and pushing lots of new features.

Does that sound like an environment you want to work in? Consider applying!

Protect all network traffic with Cloudflare

Post Syndicated from Annika Garbers original https://blog.cloudflare.com/protect-all-network-traffic/

Magic Transit protects customers’ entire networks—any port/protocol—from DDoS attacks and provides built-in performance and reliability. Today, we’re excited to extend the capabilities of Magic Transit to customers with any size network, from home networks to offices to large cloud properties, by offering Cloudflare-maintained and Magic Transit-protected IP space as a service.

What is Magic Transit?

Magic Transit extends the power of Cloudflare’s global network to customers, absorbing all traffic destined for your network at the location closest to its source. Once traffic lands at the closest Cloudflare location, it flows through a stack of security protections including industry-leading DDoS mitigation and cloud firewall. Detailed Network Analytics, alerts, and reporting give you deep visibility into all your traffic and attack patterns. Clean traffic is forwarded to your network using Anycast GRE or IPsec tunnels or Cloudflare Network Interconnect. Magic Transit includes load balancing and automatic failover across tunnels to steer traffic across the healthiest path possible, from everywhere in the world.

Magic Transit architecture: Internet BGP advertisement attracts traffic to Cloudflare’s network, where attack mitigation and security policies are applied before clean traffic is forwarded back to customer networks with an Anycast GRE tunnel or Cloudflare Network Interconnect.

The “Magic” is in our Anycast architecture: every server across our network runs every Cloudflare service, so traffic can be processed wherever it lands. This means the entire capacity of our network—121+ Tbps as of this post—is available to block even the largest attacks. It also drives huge benefits for performance versus traditional “scrubbing center” solutions that route traffic to specialized locations for processing, and makes onboarding much easier for network engineers: one tunnel to Cloudflare automatically connects customer infrastructure to our entire network in over 250 cities worldwide.

What’s new?

Historically, Magic Transit has required customers to bring their own IP addresses—a minimum of a class C IP block, or /24—in order to use this service. This is because a /24 is the minimum prefix length that can be advertised via BGP on the public Internet, which is how we attract traffic for customer networks.

But not all customers have this much IP space; we’ve talked to many of you who want IP layer protection for a smaller network than we’re able to advertise to the Internet on your behalf. Today, we’re extending the availability of Magic Transit to customers with smaller networks by offering Magic Transit-protected, Cloudflare-managed IP space. Starting now, you can direct your network traffic to dedicated static IPs and receive all the benefits of Magic Transit, including industry-leading DDoS protection, visibility, performance, and resiliency.

Let’s talk through some new ways you can leverage Magic Transit to protect and accelerate any network.

Consistent cross-cloud security

Organizations adopting a hybrid or poly-cloud strategy have struggled to maintain consistent security controls across different environments. Where they used to manage a single firewall appliance in a datacenter, security teams now have a myriad of controls across different providers—physical, virtual, and cloud-based—all with different capabilities and control mechanisms.

Cloudflare is the single control plane across your hybrid cloud deployment, allowing you to manage security policies from one place, get uniform protection across your entire environment, and get deep visibility into your traffic and attack patterns.

Protecting branches of any size

As DDoS attack frequency and variety continue to grow, attackers are getting more creative with angles to target organizations. Over the past few years, we have seen a consistent rise in attacks targeted at corporate infrastructure, including internal applications. As the percentage of a corporate network dependent on the Internet continues to grow, organizations need consistent protection across their entire network.

Now, you can get any network location covered—branch offices, stores, remote sites, event venues, and more—with Magic Transit-protected IP space. Organizations can also replace legacy hardware firewalls at those locations with our built-in cloud firewall, which filters bidirectional traffic and propagates changes globally within seconds.

Keeping streams alive without worrying about leaked IPs

Generally, DDoS attacks target a specific application or network in order to impact the availability of an Internet-facing resource. But you don’t have to be hosting anything in order to get attacked, as many gamers and streamers have unfortunately discovered. The public IP associated with a home network can easily be leaked, giving attackers the ability to directly target and take down a live stream.

As a streamer, you can now route traffic from your home network through a Magic Transit-protected IP. This means no more worrying about leaking your IP: attackers targeting you will have traffic blocked at the closest Cloudflare location to them, far away from your home network. And no need to worry about impact to your game: thanks to Cloudflare’s globally distributed and interconnected network, you can get protected without sacrificing performance.

Get started today

This solution is available today; learn more or contact your account team to get started.

How to customize your layer 3/4 DDoS protection settings

Post Syndicated from Omer Yoachimik original https://blog.cloudflare.com/l34-ddos-managed-rules/

After providing our customers control over the HTTP-layer DDoS protection settings earlier this year, we’re now excited to extend that control to the packet layer. Using these new controls, Cloudflare Enterprise customers using the Magic Transit and Spectrum services can tune and tweak their L3/4 DDoS protection settings directly from the Cloudflare dashboard or via the Cloudflare API.

The new functionality provides customers control over two main DDoS rulesets:

  1. Network-layer DDoS Protection ruleset — This ruleset includes rules to detect and mitigate DDoS attacks on layers 3/4 of the OSI model, such as UDP floods, SYN-ACK reflection attacks, SYN floods, and DNS floods. This ruleset is available for Spectrum and Magic Transit customers on the Enterprise plan.
  2. Advanced TCP Protection ruleset — This ruleset includes rules to detect and mitigate sophisticated out-of-state TCP attacks, such as spoofed ACK floods, randomized SYN floods, and distributed SYN-ACK reflection attacks. This ruleset is available for Magic Transit customers only.

To learn more, review our DDoS Managed Ruleset developer documentation. We’ve put together a few guides that we hope will be helpful for you:

  1. Onboarding & getting started with Cloudflare DDoS protection
  2. Handling false negatives
  3. Handling false positives
  4. Best practices when using VPNs, VoIP, and other third-party services
  5. How to simulate a DDoS attack

Cloudflare’s DDoS Protection

A Distributed Denial of Service (DDoS) attack is a type of cyberattack that aims to disrupt the victim’s Internet services. There are many types of DDoS attacks, and they can be generated by attackers at different layers of the Internet. One example is the HTTP flood, which aims to disrupt HTTP application servers such as those that power mobile apps and websites. Another example is the UDP flood. While this type of attack can be used to disrupt HTTP servers, it can also be used to disrupt non-HTTP applications, including TCP-based and UDP-based applications, networking services such as VoIP, gaming servers, cryptocurrency services, and more.

To defend organizations against DDoS attacks, we built and operate software-defined systems that run autonomously. They automatically detect and mitigate DDoS attacks across our entire network. You can read more about our autonomous DDoS protection systems and how they work in our deep-dive technical blog post.

Unmetered and unlimited DDoS Protection

The level of protection that we offer is unmetered and unlimited: it is not bounded by the size, number, or duration of attacks. This is especially important these days because, as we’ve recently seen, attacks are getting larger and more frequent. In Q3, network-layer attacks increased by 44% compared to the previous quarter. Furthermore, just recently, our systems automatically detected and mitigated a DDoS attack that peaked just below 2 Tbps — the largest we’ve seen to date.

Mirai botnet launched an almost 2 Tbps DDoS attack

Read more about recent DDoS trends.

Managed Rulesets

You can think of our autonomous DDoS protection systems as groups (rulesets) of intelligent rules. There are rulesets of HTTP DDoS Protection rules, Network-layer DDoS Protection rules and Advanced TCP Protection rules. In this blog post, we will cover the latter two rulesets. We’ve already covered the former in the blog post How to customize your HTTP DDoS protection settings.

Cloudflare L3/4 DDoS Managed Rules

In the Network-layer DDoS Protection ruleset, each rule has a unique set of conditional fingerprints, dynamic field masking, activation thresholds, and mitigation actions. These rules are managed by Cloudflare, meaning the specifics of each rule are curated in-house by our DDoS experts. Before deploying a new rule, it is first rigorously tested and optimized for mitigation accuracy and efficiency across our entire global network.

In the Advanced TCP Protection ruleset, we use a novel TCP state classification engine to identify the state of TCP flows. The engine powering this ruleset is flowtrackd; you can read more about it in our announcement blog post. One of the unique features of this system is that it can operate using only the ingress (inbound) packet flow: it sees only the ingress traffic, yet is able to drop, challenge, or allow packets based on their legitimacy. For example, a flood of ACK packets that don’t correspond to open TCP connections will be dropped.
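
As a much-simplified sketch of the idea (flowtrackd’s real classification engine is far more sophisticated, and the types and states here are illustrative only):

package main

import "fmt"

type fourTuple struct {
	srcIP, dstIP     string
	srcPort, dstPort uint16
}

type tcpState int

const (
	stateSynSeen tcpState = iota
	stateEstablished
)

var flows = map[fourTuple]tcpState{}

// handlePacket returns true to allow a packet, false to drop or challenge it.
func handlePacket(t fourTuple, syn, ack bool) bool {
	state, known := flows[t]
	switch {
	case syn && !ack:
		// A connection attempt: start tracking the flow.
		flows[t] = stateSynSeen
		return true
	case ack && !known:
		// An ACK that matches no tracked connection is out of state.
		return false
	case ack && state == stateSynSeen:
		flows[t] = stateEstablished
		return true
	default:
		return known
	}
}

func main() {
	t := fourTuple{"198.51.100.7", "203.0.113.1", 1234, 443}
	fmt.Println(handlePacket(t, false, true)) // false: out-of-state ACK is dropped
	fmt.Println(handlePacket(t, true, false)) // true: SYN starts a tracked flow
	fmt.Println(handlePacket(t, false, true)) // true: ACK completes the handshake
}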

How attacks are detected and mitigated

Sampling

Initially, traffic is routed through the Internet via BGP Anycast to the nearest Cloudflare edge data center. Once the traffic reaches our data center, our DDoS systems sample it asynchronously, allowing for out-of-path analysis of traffic without introducing latency penalties. The Advanced TCP Protection ruleset needs to view the entire packet flow, so it sits inline for Magic Transit customers only. It, too, does not introduce any latency penalties.

Analysis & mitigation

The analysis for the Advanced TCP Protection ruleset is straightforward and efficient. The system qualifies TCP flows and tracks their state. In this way, packets that don’t correspond to a legitimate connection and its state are dropped or challenged. The mitigation is activated only above certain thresholds that customers can define.

The analysis for the Network-layer DDoS Protection ruleset is done using data streaming algorithms. Packet samples are compared to the conditional fingerprints and multiple real-time signatures are created based on the dynamic masking. Each time another packet matches one of the signatures, a counter is increased. When the activation threshold is reached for a given signature, a mitigation rule is compiled and pushed inline. The mitigation rule includes the real-time signature and the mitigation action, e.g., drop.

Example

As a simple example, one fingerprint could include the following fields: source IP, source port, destination IP, and the TCP sequence number. A packet flood attack with a fixed sequence number would match the fingerprint and the counter would increase for every packet match until the activation threshold is exceeded. Then a mitigation action would be applied.

However, in the case of a spoofed attack where the source IP addresses and ports are randomized, we would end up with multiple signatures for each combination of source IP and port. Assuming a sufficiently randomized/distributed attack, the activation thresholds would not be met and mitigation would not occur. For this reason, we use dynamic masking, i.e. ignoring fields that may not be a strong indicator of the signature. By masking (ignoring) the source IP and port, we would be able to match all the attack packets based on the unique TCP sequence number regardless of how randomized/distributed the attack is.
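
A toy version of the counting logic might look like this; the threshold value and the choice of fields are placeholders, not our production values:

package main

import "fmt"

// A signature is a fingerprint with some fields masked out.
// Here only the TCP sequence number is kept; source IP and port are masked.
type signature struct {
	tcpSeq uint32
}

func main() {
	counters := map[signature]int{}
	const activationThreshold = 3 // real thresholds are tuned per rule

	// Simulated samples from a spoofed flood with a fixed sequence number.
	for _, seq := range []uint32{42, 42, 42, 42} {
		sig := signature{tcpSeq: seq}
		counters[sig]++
		if counters[sig] == activationThreshold {
			fmt.Printf("threshold reached: compile mitigation rule for %+v\n", sig)
		}
	}
}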

Configuring the DDoS Protection Settings

For now, we’ve only exposed a handful of the Network-layer DDoS Protection rules, the ones we’ve identified as most prone to customization. We will expose more rules on a regular basis. This shouldn’t affect any of your traffic.

Overriding the sensitivity level and mitigation action

For the Network-layer DDoS Protection ruleset, for each of the available rules, you can override the sensitivity level (activation threshold), customize the mitigation action, and apply expression filters to exclude/include traffic from the DDoS protection system based on various packet fields. You can create multiple overrides to customize the protection for your network and your various applications.

Configuring expression fields for the DDoS Managed Rules to match on

In the past, you’d have to go through our support channels to customize the rules, and in some cases this took longer to resolve than desired. With today’s announcement, you can tailor and fine-tune the settings of our autonomous edge system yourself, quickly improving the accuracy of the protection for your specific network needs.

For the Advanced TCP Protection ruleset, for now, we’ve only exposed the ability to enable or disable it as a whole in the dashboard. To enable or disable the ruleset per IP prefix, you must use the API. At this time, when initially onboarding to Cloudflare, the Cloudflare team must first create a policy for you. After onboarding, if you need to change the sensitivity thresholds, use Monitor mode, or add filter expressions, you must contact Cloudflare Support. In upcoming releases, this too will be available via the dashboard and API without requiring help from our Support team.

Pre-existing customizations

If you previously contacted Cloudflare Support to apply customizations, your customizations have been preserved, and you can visit the dashboard to view the settings of the Network-layer DDoS Protection ruleset and change them if you need. If you require any changes to your Advanced TCP Protection customizations, please reach out to Cloudflare Support.

If you haven’t needed to customize this protection so far, there is no action required on your end. However, if you would like to view and customize your DDoS protection settings, follow this dashboard guide or review the API documentation to programmatically configure the DDoS protection settings.
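
As a rough sketch, an override via the Rulesets API looks something like the request below; the ddos_l4 phase is the entry point for the network-layer ruleset, and the IDs and override values shown are placeholders:

curl -X PUT \
  "https://api.cloudflare.com/client/v4/accounts/<ACCOUNT_ID>/rulesets/phases/ddos_l4/entrypoint" \
  -H "Authorization: Bearer <API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "rules": [{
      "action": "execute",
      "expression": "true",
      "action_parameters": {
        "id": "<MANAGED_RULESET_ID>",
        "overrides": {
          "rules": [{
            "id": "<RULE_ID>",
            "sensitivity_level": "low",
            "action": "log"
          }]
        }
      }
    }]
  }'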

Helping Build a Better Internet

At Cloudflare, everything we do is guided by our mission to help build a better Internet. The DDoS team’s vision is derived from this mission: our goal is to make the impact of DDoS attacks a thing of the past. Our first step was to build the autonomous systems that detect and mitigate attacks independently. Done. The second step was to expose the control plane over these systems to our customers (announced today). Done. The next step will be to fully automate the configuration with an auto-pilot feature — training the systems to learn your specific traffic patterns to automatically optimize your DDoS protection settings. You can expect many more improvements, automations, and new capabilities to keep your Internet properties safe, available, and performant.

Not using Cloudflare yet? Start now.

Magic Firewall gets Smarter

Post Syndicated from Achiel van der Mandele original https://blog.cloudflare.com/magic-firewall-gets-smarter/

Today, we’re very excited to announce a set of updates to Magic Firewall, adding security and visibility features that are key in modern cloud firewalls. To improve security, we’re adding threat intel integration and geo-blocking. For visibility, we’re adding packet captures at the edge: a way to see packets as they arrive, in near real time.

Magic Firewall is our network-level firewall which is delivered through Cloudflare to secure your enterprise. Magic Firewall covers your remote users, branch offices, data centers and cloud infrastructure. Best of all, it’s deeply integrated with Cloudflare, giving you a one-stop overview of everything that’s happening on your network.

A brief history of firewalls

We talked a lot about firewalls on Monday, including how our firewall-as-a-service solution is very different from traditional firewalls and helps security teams that want sophisticated inspections at the Application Layer. When we talk about the Application Layer, we’re referring to OSI Layer 7. This means we’re applying security features using semantics of the protocol. The most common example is HTTP, the protocol you’re using to visit this website. We have Gateway and our WAF to protect inbound and outbound HTTP requests, but what about Layer 3 and Layer 4 capabilities? Layer 3 and 4 refer to the packet and connection levels. These security features aren’t applied to HTTP requests, but instead to IP packets and (for example) TCP connections. A lot of folks in the CIO organization want to add extra layers of security and visibility without resorting to decryption at Layer 7. We’re excited to talk to you about two sets of new features that will make your lives easier: geo-blocking and threat intel integration to improve security posture, and packet captures to get you better visibility.

Threat Intel and IP Lists

Magic Firewall is great if you know exactly what you want to allow and block. You can put in rules that match exactly on IP source and destination, as well as use bit slicing to verify the contents of various packets. However, there are many situations in which you don’t know exactly who the good and bad actors are: is this IP address that’s trying to access my network a perfectly fine consumer, or is it part of a botnet that’s trying to attack my network?

The same goes the other way: whenever someone inside your network is trying to create a connection to the Internet, how do you know whether it’s an obscure blog or a malware website? Clearly, you don’t want to play whack-a-mole and try to keep track of every malicious actor on the Internet by yourself. For most security teams, it’s nothing more than a waste of time! You’d much rather rely on a company that makes it their business to focus on this.

Today, we’re announcing Magic Firewall support for our in-house Threat Intelligence feed. Cloudflare sees approximately 28 million HTTP requests each second and blocks 76 billion cyber threats each day. With almost 20% of the top 10 million Alexa websites on Cloudflare, we see a lot of novel threats pop up every day. We use that data to detect malicious actors on the Internet and turn it into a list of known malicious IPs. And we don’t stop there: we also integrate with a number of third party vendors to augment our coverage.

To match on any of the threat intel lists, just set up a rule in the UI as normal:

[Screenshot: creating a Magic Firewall rule that matches on a threat intel list]

Threat intel feed categories include Malware, Anonymizer and Botnet Command-and-Control centers. Malware and Botnet lists cover properties on the Internet distributing malware and known command and control centers. Anonymizers contain a list of known forward proxies that allow attackers to hide their IP addresses.

In addition to the managed lists, you also have the flexibility to create your own lists, either to add your own known set of malicious IPs or to make managing your known-good network endpoints easier. As an example, you may want to create a list of all your own servers. That way, you can easily block traffic to and from them in any rule, without having to replicate the list each time.
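
For example, a rule built on such a list might use an expression like this (the list name is illustrative):

(ip.src in $company_servers) or (ip.dst in $company_servers)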

Another particularly gnarly problem that many of our customers deal with is geo restrictions. Many are restricted in where they are allowed to (or want to) accept traffic from. The challenge is that nothing about an IP address tells you anything about its geolocation. Even worse, IP addresses regularly change hands, moving from one country to another.

As of today, you can easily block or allow traffic to any country, without the management hassle that comes with maintaining lists yourself. Country lists are kept up to date entirely by Cloudflare; all you need to do is set up a rule matching on the country and we’ll take care of the rest.
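
For example (country codes illustrative):

ip.geoip.country in {"US" "DE"}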

Packet captures at the edge

Finally, we’re releasing a very powerful feature: packet captures at the edge. A packet capture is a pcap file that contains all packets that were seen by a particular network box (usually a firewall or router) during a specific time frame. Packet captures are useful if you want to debug your network: why can’t my users connect to a particular website? Or you may want to get better visibility into a DDoS attack, so you can put up better firewall rules.

Traditionally, you’d log into your router or firewall and start up something like tcpdump. You’d set up a filter to only match on certain packets (packet capture files can quickly get very big) and grab the file. But what happens if you want coverage across your entire network: on-premises, offices, and all your cloud environments? You’ll likely have different vendors for each of those locations and have to figure out how to get packet captures from all of them. Even worse, some of them might not even support grabbing packet captures.
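
That per-device workflow typically looks something like this (interface and filter are illustrative):

tcpdump -i eth0 -w capture.pcap 'udp and host 192.0.2.1'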

With Magic Firewall, grabbing packet captures across your entire network becomes simple: because you run a single network-firewall-as-a-service, you can grab packets across your entire network in one go. This gets you instant visibility into exactly where that particular IP is interacting with your network, regardless of physical or virtual location. You have the option of grabbing all network traffic (warning, it might be a lot!) or set a filter to only grab a subset. Filters follow the same Wireshark syntax that Magic Firewall rules use:

(ip.src in $cf.anonymizer)

We think these are great additions to Magic Firewall, giving you powerful primitives to police traffic and tooling to gain visibility into what’s actually going on in your network. Threat intel, geo-blocking, and IP lists are all available today — reach out to your account team to have them activated. Packet captures will enter early access later in December. Similarly, if you’re interested, please reach out to your account team!

How We Used eBPF to Build Programmable Packet Filtering in Magic Firewall

Post Syndicated from Chris J Arges original https://blog.cloudflare.com/programmable-packet-filtering-with-magic-firewall/

Cloudflare actively protects services from sophisticated attacks day after day. For users of Magic Transit, DDoS protection detects and drops attacks, while Magic Firewall allows custom packet-level rules, enabling customers to deprecate hardware firewall appliances and block malicious traffic at Cloudflare’s network. The types and sophistication of attacks continue to evolve, as recent DDoS and reflection attacks against VoIP services targeting protocols such as the Session Initiation Protocol (SIP) have shown. Fighting these attacks requires pushing the limits of packet filtering beyond what traditional firewalls are capable of. We did this by taking best-in-class technologies and combining them in new ways to turn Magic Firewall into a blazing fast, fully programmable firewall that can stand up to even the most sophisticated of attacks.

Magical Walls of Fire

Magic Firewall is a distributed stateless packet firewall built on Linux nftables. It runs on every server, in every Cloudflare data center around the world. To provide isolation and flexibility, each customer’s nftables rules are configured within their own Linux network namespace.

[Diagram: the life of a packet through DDoS protection, a customer network namespace, and a GRE tunnel to the origin]

This diagram shows the life of an example packet when using Magic Transit, which has Magic Firewall built in. First, packets go into the server and DDoS protections are applied, which drop attacks as early as possible. Next, the packet is routed into a customer-specific network namespace, which applies the nftables rules to the packets. After this, packets are routed back to the origin via a GRE tunnel. Magic Firewall users can construct firewall statements from a single API, using a flexible Wirefilter syntax. In addition, rules can be configured via the Cloudflare dashboard, using friendly drag-and-drop UI elements.

Magic Firewall provides a very powerful syntax for matching on various packet parameters, but it is also limited to the matches provided by nftables. While this is more than sufficient for many use cases, it does not provide enough flexibility to implement the advanced packet parsing and content matching we want. We needed more power.

Hello eBPF, meet Nftables!

When looking to add more power to your Linux networking needs, Extended Berkeley Packet Filter (eBPF) is a natural choice. With eBPF, you can insert packet processing programs that execute in the kernel, giving you the flexibility of familiar programming paradigms with the speed of in-kernel execution. Cloudflare loves eBPF and this technology has been transformative in enabling many of our products. Naturally, we wanted to find a way to use eBPF to extend our use of nftables in Magic Firewall. This means being able to match, using an eBPF program within a table and chain as a rule. By doing this we can have our cake and eat it too, by keeping our existing infrastructure and code, and extending it further.

If nftables could leverage eBPF natively, this story would be much shorter; alas, we had to continue our quest. To get started in our search, we knew that iptables integrates with eBPF. For example, one can use iptables with a pinned eBPF program to drop packets with the following command:

iptables -A INPUT -m bpf --object-pinned /sys/fs/bpf/match -j DROP

This clue helped to put us on the right path. Iptables uses the xt_bpf extension to match on an eBPF program. This extension uses the BPF_PROG_TYPE_SOCKET_FILTER eBPF program type, which allows us to load the packet information from the socket buffer and return a value based on our code.

Since we know iptables can use eBPF, why not just use that? Magic Firewall currently leverages nftables, which is a great choice for our use case due to its flexibility in syntax and programmable interface. Thus, we need to find a way to use the xt_bpf extension with nftables.

[Diagram: the relationship between iptables, nftables, and the kernel]

This diagram helps explain the relationship between iptables, nftables and the kernel. The nftables API can be used by both the iptables and nft userspace programs, and can configure both xtables matches (including xt_bpf) and normal nftables matches.

This means that given the right API calls (netlink/netfilter messages), we can embed an xt_bpf match within an nftables rule. In order to do this, we needed to understand which netfilter messages to send. Using tools such as strace and Wireshark, and especially by reading the source, we were able to construct a message that could append an eBPF rule to a given table and chain.
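
For example, one quick way to observe the messages nft emits is to trace its netlink socket calls; the rule here is illustrative:

strace -f -e trace=sendmsg nft add rule ip mfw input counter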

NFTA_RULE_TABLE table
NFTA_RULE_CHAIN chain
NFTA_RULE_EXPRESSIONS | NFTA_MATCH_NAME
	NFTA_LIST_ELEM | NLA_F_NESTED
	NFTA_EXPR_NAME "match"
		NLA_F_NESTED | NFTA_EXPR_DATA
		NFTA_MATCH_NAME "bpf"
		NFTA_MATCH_REV 1
		NFTA_MATCH_INFO ebpf_bytes	

The structure of the netlink/netfilter message to add an eBPF match should look like the above example. Of course, this message needs to be properly embedded and include a conditional step, such as a verdict, when there is a match. The next step was decoding the format of `ebpf_bytes` as shown in the example below.

struct xt_bpf_info_v1 {
	__u16 mode;
	__u16 bpf_program_num_elem;
	__s32 fd;
	union {
		struct sock_filter bpf_program[XT_BPF_MAX_NUM_INSTR];
		char path[XT_BPF_PATH_MAX];
	};
};

The bytes format can be found in the kernel header definition of struct xt_bpf_info_v1. The code example above shows the relevant parts of the structure.

The xt_bpf module supports both raw bytecode and a path to a pinned eBPF program. The latter mode is the technique we used to combine the eBPF program with nftables.

With this information, we were able to write code that creates netlink messages and properly serializes the relevant data fields. This approach was just the first step; we are also looking into incorporating this into proper tooling instead of sending custom netfilter messages.

Just Add eBPF

Now we needed to construct an eBPF program and load it into an existing nftables table and chain. Starting to use eBPF can be a bit daunting. Which program type do we want to use? How do we compile and load our eBPF program? We started this process by doing some exploration and research.

First we constructed an example program to try it out.

SEC("socket")
int filter(struct __sk_buff *skb) {
  /* get header */
  struct iphdr iph;
  if (bpf_skb_load_bytes(skb, 0, &iph, sizeof(iph))) {
    return BPF_DROP;
  }

  /* read last 5 bytes in payload of udp */
  __u16 pkt_len = bswap_16(iph.tot_len);
  char data[5];
  if (bpf_skb_load_bytes(skb, pkt_len - sizeof(data), &data, sizeof(data))) {
    return BPF_DROP;
  }

  /* only packets with the magic word at the end of the payload are allowed */
  const char SECRET_TOKEN[5] = "xyzzy";
  for (int i = 0; i < sizeof(SECRET_TOKEN); i++) {
    if (SECRET_TOKEN[i] != data[i]) {
      return BPF_DROP;
    }
  }

  return BPF_OK;
}

The excerpt above is an example of an eBPF program that only accepts packets that have a magic string at the end of the payload. This requires checking the total length of the packet to find where to start the search. For clarity, this example omits error checking and headers.

Once we had a program, the next step was integrating it into our tooling. We tried a few technologies to load the program, such as BCC and libbpf, and we even created a custom loader. Ultimately, we ended up using Cilium’s ebpf library, since we are using Golang for our control-plane program and the library makes it easy to generate, embed, and load eBPF programs.
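
Here is a minimal sketch of that flow with the cilium/ebpf library; the object file name is an assumption, while the program name and pin path match the example below:

package main

import (
	"log"

	"github.com/cilium/ebpf"
)

func main() {
	// Parse the compiled eBPF object file (name is illustrative).
	spec, err := ebpf.LoadCollectionSpec("filter.o")
	if err != nil {
		log.Fatal(err)
	}

	// Load the programs and maps it contains into the kernel.
	coll, err := ebpf.NewCollection(spec)
	if err != nil {
		log.Fatal(err)
	}
	defer coll.Close()

	// Pin the program so an nftables rule can reference it by path.
	if err := coll.Programs["filter"].Pin("/sys/fs/bpf/mfw/match"); err != nil {
		log.Fatal(err)
	}
}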

# nft list ruleset
table ip mfw {
	chain input {
		#match bpf pinned /sys/fs/bpf/mfw/match drop
	}
}

Once the program is compiled and pinned, we can add matches into nftables using netlink commands. Listing the ruleset shows the match is present. This is incredible! We are now able to deploy custom C programs to provide advanced matching inside a Magic Firewall ruleset!

More Magic

With the addition of eBPF to our toolkit, Magic Firewall is an even more flexible and powerful way to protect your network from bad actors. We are now able to look deeper into packets and implement more complex matching logic than nftables alone could provide. Since our firewall is running as software on all Cloudflare servers, we can quickly iterate and update features.

One outcome of this project is SIP protection, which is currently in beta. That’s only the beginning. We are currently exploring using eBPF for protocol validations, advanced field matching, looking into payloads, and supporting even larger sets of IP lists.

We welcome your help here, too! If you have other use cases and ideas, please talk to your account team. If you find this technology interesting, come join our team!

Introducing Cloudflare’s Technology Partner Program

Post Syndicated from Matt Lewis original https://blog.cloudflare.com/technology-partner-program/

The Internet is built on a series of shared protocols, all working in harmony to deliver the collective experience that has changed the way we live and work. These open standards have created a platform such that a myriad of companies can build unique services and products that work together seamlessly. As a steward and supporter of an open Internet, we aspire to provide an interoperable platform that works with all the complementary technologies that our customers use across their technology stack. This has been the guiding principle for the multiple partnerships we have launched over the last few years.  

One example is our Bandwidth Alliance — launched in 2018, this alliance with 18 cloud and storage providers aims to reduce egress fees, also known as data transfer fees, for our customers. The Bandwidth Alliance has broken the norms of the cloud industry so that customers can move data more freely. Since then, we have launched several technology partner programs with more than 40 partners, including:

  • Analytics — Visualize Cloudflare logs and metrics easily, and help customers better understand events and trends from websites and applications on the Cloudflare network.
  • Network Interconnect — Partnerships with best-in-class interconnection platforms offer private, secure, software-defined links with near-instant turn-up of ports.
  • Endpoint Protection Partnerships — With these integrations, every connection to our customer’s corporate application gets an additional layer of identity assurance without the need to connect to VPN.
  • Identity Providers — Easily integrate your organization’s single sign-on provider and benefit from the ease-of-use and functionality of Cloudflare Access.

These partner programs have helped us serve our customers better alongside our partners with our complementary solutions. The integrations we have driven have made it easy for thousands of customers to use Cloudflare with other parts of their stack.

We aim to continue expanding the Cloudflare Partner Network to make it seamless for our customers to use Cloudflare. To support our growing ecosystem of partners, we are excited to launch our Technology Partner Program.

Announcing Cloudflare’s Technology Partner Program

Cloudflare’s Technology Partner Program facilitates innovative integrations that create value for our customers, our technology partners, and Cloudflare. Our partners not only benefit from technical integrations with us, but also have the opportunity to drive sales and marketing efforts to better serve mutual customers and prospects.

This program offers a guiding structure so that our partners can benefit across three key areas:

  • Build with Cloudflare: Sandbox access to Cloudflare enterprise features and APIs to build and test integrations. Opportunity to collaborate with Cloudflare’s product teams to build innovative solutions.
  • Market with Cloudflare: Develop joint solution briefs and host joint events to drive awareness and adoption of integrations. Leverage a range of our partner tools and resources to bring our joint solutions to market.
  • Sell with Cloudflare: Align with our sales teams to jointly target relevant customer segments across geographies.

Technology Partner Tiers

Depending on the maturity of the integration and fit with Cloudflare’s product portfolio, we have two types of partners:

  • Strategic partners: Strategic partners have mature integrations across the Cloudflare product suite. They are leaders in their industries and have a significant overlap with our customer base. These partners are strategically aligned with our sales and marketing efforts, and they collaborate with our product teams to bring innovative solutions to market.
  • Integration partners: Integration partners are early participants in Cloudflare’s partnership ecosystem. They already have or are on a path to build validated, functional integrations with Cloudflare. These partners have programmatic access to resources that will help them experiment with and build integrations with Cloudflare.

Work with Us

If you are interested in working with our Technology Partnerships team to develop and bring to market a joint solution, we’d love to hear from you! Partners can complete the application on our Technology Partner Program website, and we will reach out quickly to discuss how we can help build solutions for our customers together.

Magic makes your network faster

Post Syndicated from Annika Garbers original https://blog.cloudflare.com/magic-makes-your-network-faster/

We launched Magic Transit two years ago, followed more recently by its siblings Magic WAN and Magic Firewall, and have talked at length about how this suite of products helps security teams sleep better at night by protecting entire networks from malicious traffic. Today, as part of Speed Week, we’ll break down the other side of the Magic: how using Cloudflare can automatically make your entire network faster. Our scale and interconnectivity, use of data to make more intelligent routing decisions, and inherent architecture differences versus traditional networks all contribute to performance improvements across all IP traffic.

What is Magic?

Cloudflare’s “Magic” services help customers connect and secure their networks without the cost and complexity of maintaining legacy hardware. Magic Transit provides connectivity and DDoS protection for Internet-facing networks; Magic WAN enables customers to replace legacy WAN architectures by routing private traffic through Cloudflare; and Magic Firewall protects all connected traffic with a built-in firewall-as-a-service. All three share underlying architecture principles that form the basis of the performance improvements we’ll dive deeper into below.

Anycast everything

In contrast to traditional “point-to-point” architecture, Cloudflare uses Anycast GRE or IPsec (coming soon) tunnels to send and receive traffic for customer networks. This means that customers can set up a single tunnel to Cloudflare, but effectively get connected to every single Cloudflare location, dramatically simplifying the process to configure and maintain network connectivity.

Every service everywhere

In addition to being able to send and receive traffic from anywhere, Cloudflare’s edge network is also designed to run every service on every server in every location. This means that incoming traffic can be processed wherever it lands, which allows us to block DDoS attacks and other malicious traffic within seconds, apply firewall rules, and route traffic efficiently, without bouncing it between servers or locations before it’s dispatched to its destination.

Zero Trust + Magic: the next-gen network of the future

With Cloudflare One, customers can seamlessly combine Zero Trust and network connectivity to build a faster, more secure, more reliable experience for their entire corporate network. Everything we’ll talk about today applies even more to customers using the entire Cloudflare One platform – stacking these products together means the performance benefits multiply (check out our post on Zero Trust and speed from today for more on this).

More connectivity = faster traffic

So where does the Magic come in? This part isn’t intuitive, especially for customers using Magic Transit in front of their network for DDoS protection: how can adding a network hop subtract latency?

The answer lies in Cloudflare’s network architecture — our web of connectivity to the rest of the Internet. Cloudflare has invested heavily in building one of the world’s most interconnected networks (9800 interconnections and counting, including with major ISPs, cloud services, and enterprises). We’re also continuing to grow our own private backbone and giving customers the ability to directly connect with us. And our expansive connectivity to last mile providers means we’re just milliseconds away from the source of all your network traffic, regardless of where in the world your users or employees are.

This toolkit of varying connectivity options means traffic routed through the Cloudflare network is often meaningfully faster than paths across the public Internet alone, because more options available for BGP path selection mean increased ability to choose more performant routes. Imagine having only one possible path between your house and the grocery store versus ten or more – chances are, adding more options means better alternatives will be available. A cornucopia of connectivity methods also means more resiliency: if there’s an issue on one of the paths (like construction happening on what is usually the fastest street), we can easily route around it to avoid impact to your traffic.

One common comparison customers are interested in is latency for inbound traffic. From the end user perspective, does routing through Cloudflare speed up or slow down traffic to networks protected by Magic Transit? Our response: let’s test it out and see! We’ve repeatedly compared Magic Transit vs standard Internet performance for customer networks across geographies and industries and consistently seen really exciting results. Here’s an example from one recent test where we used third-party probes to measure the ping time to the same customer network location (their data center in Qatar) before and after onboarding with Magic Transit:

Probe location                                      RTT w/o Magic (ms)   RTT w/ Magic (ms)   Difference (ms)   Improvement
Dubai                                               27                   23                  4                 13%
Marseille                                           202                  188                 13                7%
Global (averaged across 800+ distributed probes)    194                  124                 70                36%

All of these results were collected without the use of Argo Smart Routing for Packets, which we announced on Tuesday. Early data indicates that networks using Smart Routing will see even more substantial gains.

Modern architecture eliminates traffic trombones

In addition to the performance boost for traffic routed across the Cloudflare network versus the public Internet, customers using Magic products benefit from an architecture model that removes up to thousands of miles’ worth of network path, and the latency that comes with it.

Traditionally, enterprises adopted a “hub and spoke” model for granting employees access to applications within and outside their network. All traffic from within a connected network location was routed through a central “hub” where a stack of network hardware (e.g. firewalls) was maintained. This model worked great in locations where the hub and spokes were geographically close, but started to strain as companies became more global and applications moved to the cloud.

Now, networks using hub and spoke architecture are often backhauling traffic thousands of miles, between continents and across oceans, just to apply security policies before packets are dispatched to their final destination, which is often physically closer to where they started! This creates a “trombone” effect, where precious seconds are wasted bouncing traffic back and forth across the globe, and performance problems are amplified by packet loss and instability along the way.

Network and security teams have tried to combat this issue by installing hardware at more locations to establish smaller, regional hubs, but this quickly becomes prohibitively expensive and hard to manage. The price of purchasing multiple hardware boxes and dedicated private links adds up quickly, both in network gear and connectivity itself as well as the effort required to maintain additional infrastructure. Ultimately, this cost usually outweighs the benefit of the seconds regained with shorter network paths.

The “hub” is everywhere

There’s a better way — with the Anycast architecture of Magic products, all traffic is automatically routed to the closest Cloudflare location to its source. There, security policies are applied with single-pass inspection before traffic is routed to its destination. This model is conceptually similar to a hub and spoke, except that the hub is everywhere: 95% of the entire Internet-connected world is within 50 ms of a Cloudflare location (check out this week’s updates on our quickly-expanding network presence for the latest). This means instead of tromboning traffic between locations, it can stop briefly at a Cloudflare hop in-path before it goes on its way: dramatically faster architecture without compromising security.

To demonstrate how this architecture shift can make a meaningful difference, we created a lab to mirror the setup we’ve heard many customers describe as they’ve explained performance issues with their existing network. This example customer network is headquartered in South Carolina and has branch office locations on the west coast, in California and Oregon. Traditionally, traffic from each branch would be backhauled through the South Carolina “hub” before being sent on to its destination, either another branch or the public Internet.

In our alternative setup, we’ve connected each customer network location to Cloudflare with an Anycast GRE tunnel, simplifying configuration and removing the South Carolina trombone. We can also enforce network and application-layer filtering on all of this traffic, ensuring that the faster network path doesn’t compromise security.

Here’s a summary of results from performance tests on this example network demonstrating the difference between the traditional hub and spoke setup and the Magic “global hub” — we saw up to 70% improvement in these tests, demonstrating the dramatic impact this architecture shift can make.

                                                    LAX <> OR (ms)
ICMP round-trip for “Regular” (hub and spoke) WAN   127
ICMP round-trip for Magic WAN                       38
Latency savings for Magic WAN vs “Regular” WAN      70%

This effect can be amplified for networks with globally distributed locations — imagine the benefits for customers who are used to delays from backhauling traffic between different regions across the world.

Getting smarter

Adding more connectivity options and removing traffic trombones provide a performance boost for all Magic traffic, but we’re not stopping there. In the same way we leverage insights from hundreds of billions of requests per day to block new types of malicious traffic, we’re also using our unique perspective on Internet traffic to make more intelligent decisions about routing customer traffic versus relying on BGP alone. Earlier this week, we announced updates to Argo Smart Routing including the brand-new Argo Smart Routing for Packets. Customers using Magic products can enable it to automatically boost performance for any IP traffic routed through Cloudflare (by 10% on average according to results so far, and potentially more depending on the individual customer’s network topology) — read more on this in the announcement blog.

What’s next?

The modern architecture, well-connected network, and intelligent optimizations we’ve talked about today are just the start. Our vision is for any customer using Magic to connect and protect their network to have the best performance possible for all of their traffic, automatically. We’ll keep investing in expanding our presence, interconnections, and backbone, as well as continuously improving Smart Routing — but we’re also already cooking up brand-new products in this space to deliver optimizations in new ways, including WAN Optimization and Quality of Service functions. Stay tuned for more Magic coming soon, and get in touch with your account team to learn more about how we can help make your network faster starting today.

Making Magic Transit health checks faster and more responsive

Post Syndicated from Meyer Zinn original https://blog.cloudflare.com/making-magic-transit-health-checks-faster-and-more-responsive/

Magic Transit advertises our customers’ IP prefixes directly from our edge network, applying DDoS mitigation and firewall policies to all traffic destined for the customers’ networks. After the traffic is scrubbed, we deliver clean traffic to the customer over GRE tunnels (over the public Internet or Cloudflare Network Interconnect). But sometimes, we experience inclement weather on the Internet: network paths between Cloudflare and the customer can become unreliable or go down. Customers often configure multiple tunnels through different network paths and rely on Cloudflare to pick the best tunnel to use if, for example, some router on the Internet is having a stormy day and starts dropping traffic.

Because we use Anycast GRE, every server across Cloudflare’s 200+ locations globally can send GRE traffic to customers. Every server needs to know the status of every tunnel, and every location has completely different network routes to customers. Where to start?

In this post, I’ll break down my work to improve the Magic Transit GRE tunnel health check system, creating a more stable experience for customers and dramatically reducing CPU and memory usage at Cloudflare’s edge.

Everybody has their own weather station

To decide where to send traffic, Cloudflare edge servers need to periodically send health checks to each customer tunnel endpoint.

When Magic Transit was first launched, every server sent a health check to every tunnel once per minute. This naive, “shared-nothing” approach was simple to implement and served customers well, but would occasionally deliver less than optimal health check behavior in two specific ways.
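
In simplified form, the original loop looked something like the Python sketch below, with probe_tunnel standing in for the real health check probe; both failure modes described next fall out of this structure:

```python
# An illustrative sketch of the original "shared-nothing" health checks:
# every server probes every tunnel once per minute and judges on its own.
import asyncio
import random

TUNNELS = ["tunnel-a", "tunnel-b", "tunnel-c"]

async def probe_tunnel(tunnel: str) -> bool:
    """Placeholder for the real probe; randomly fails ~1% of the time."""
    await asyncio.sleep(0.01)
    return random.random() > 0.01

async def health_check_loop() -> None:
    healthy = {t: True for t in TUNNELS}
    while True:
        for tunnel in TUNNELS:
            ok = await probe_tunnel(tunnel)
            if not ok:
                # A single bad probe immediately marks the tunnel degraded
                # and starts shifting traffic to a fallback.
                healthy[tunnel] = False
        await asyncio.sleep(60)  # the next verdict is a full minute away

asyncio.run(health_check_loop())
```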

Way #1: Inconsistent weather reports

Sometimes a server just runs into bad luck, and a check randomly fails. From there, the server would mark the tunnel as degraded and immediately start shifting traffic towards a fallback tunnel. Imagine you and I were standing right next to each other under a clear sky, and I felt a single drop of water and declared, “It’s raining!” whereas you felt no raindrops and declared, “It’s all clear!”

Because each server bases its decision on relatively little data, health determinations can be imprecise, and individual servers can overreact to one-off failures. From a customer’s point of view, it looks like Cloudflare detected a problem with the primary tunnel. But, in reality, the server just got a bad weather forecast and made a different judgement call.

Way #2: Slow to respond to storms

Even when tunnel states are consistent across servers, they can be slow to respond. Suppose a server runs a health check that succeeds, but a second later the tunnel goes down: the next health check won’t happen for another 59 seconds. Until that next health check fails, the server has no idea anything is wrong, so it keeps sending traffic over the unhealthy tunnel, leading to packet loss and latency for the customer.

Much like how a live, up-to-the-minute rain forecast helps you decide when to leave to avoid the rain, servers that send tunnel checks more frequently get a finer view of the Internet weather and can respond faster to localized storms. But if every server across Cloudflare’s edge sent health checks too frequently, we would very quickly start to overwhelm our customers’ networks.

All of the weather stations nearby start sharing observations

Clearly, we needed to hammer out some kinks. We wanted servers in the same location to come to the same conclusions about where to send traffic, and we wanted faster detection of issues without increasing the frequency of tunnel checks.

Health checks sent from servers in the same data center take the same route across the Internet. Why not share the results among them?

Instead of a single raindrop causing me to declare that it’s raining, I’d tell you about the raindrop I felt, and you’d tell me about the clear sky you’re looking at. Together, we come to the same conclusion: there isn’t enough rain to open an umbrella.

There is even a special networking protocol that allows us to easily share information between servers in the same private network. From the makers of Unicast and Anycast, now presenting: Multicast!

A single IP address does not necessarily represent a single machine in a network. The Internet Protocol specifies a way to send one message that gets delivered to a group of machines, like writing to an email list. Every machine has to opt into the group—we can’t just enroll people at random for our email list—but once a machine joins, it receives a copy of any message sent to the group’s address.
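
As a minimal, self-contained sketch of the mechanism (the group address and port are arbitrary demo values), joining and sending to a multicast group with Python’s standard socket API looks like this:

```python
# Joining an IP multicast group and sending one message to every member.
import socket
import struct

GROUP, PORT = "239.1.2.3", 5007  # administratively scoped demo group

# Receiver: opt in to the group, then read messages addressed to it.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
rx.bind(("", PORT))
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

# Sender: a single sendto() reaches every member of the group.
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
tx.sendto(b"tunnel-a: check ok", (GROUP, PORT))

print(rx.recvfrom(1024)[0])  # b'tunnel-a: check ok'
```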

The servers in a Cloudflare edge data center are part of the same private network, so for “version 2” of our health check system, we had each server in a data center join a multicast group and share their health check results with one another. Each server still made an independent assessment for each tunnel, but that assessment was based on data collected by all servers in the same location.

This second version of tunnel health checks resulted in more consistent tunnel health determinations by servers in the same data center. It also resulted in faster response times—especially in large data centers where servers receive updates from their peers very rapidly.

However, we started seeing scaling problems. As we added more customers, we added more tunnel endpoints where we need to check the weather. In some of our larger data centers, each server was receiving close to half a billion messages per minute.

Imagine it’s not just you and me telling each other about the weather above us. You’re in a crowd of hundreds of people, and now everyone is shouting the weather updates for thousands of cities around the world!

One weather station to rule them all

As an engineering intern on the Magic Transit team, my project this summer has been developing a third approach. Rather than having every server infrequently check the weather for every tunnel and shout the observation to everyone else, now every server can frequently check the weather for a few tunnels. With this new approach, servers only tell the others about the overall weather report—not every individual measurement.

That scenario sounds more efficient, but we need to distribute the task of sending tunnel health checks across all the servers in a location so one server doesn’t get an overwhelming amount of work. So how can we assign tunnels to servers in a way that doesn’t require a centralized orchestrator or shared database? Enter consistent hashing, the single coolest distributed computing concept I got to apply this summer.

Every server sends a multicast “heartbeat” every few seconds. Then, by listening for multicast heartbeats, each server can construct a list of the IP addresses of peers known to be alive, including its own address, sorted by taking the hash of each address. Every server in a data center has the same list of peers in the same order.

When a server needs to decide which tunnels it is responsible for sending health checks to, the server simply hashes each tunnel to an integer and searches through the list of peer addresses to find the peer with the smallest hash greater than the tunnel’s hash, wrapping around to the first peer if no peer is found. The server is responsible for sending health checks to the tunnel when the assigned peer’s address equals the server’s address.
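
Here is a short Python sketch of that assignment logic; the hash function and addresses are stand-ins, but the lookup follows the description above:

```python
# Consistent hashing: every server derives the same sorted peer list from
# heartbeats, and a tunnel is owned by the peer with the smallest hash
# greater than the tunnel's hash, wrapping around at the end of the list.
import bisect
import hashlib

def h(value: str) -> int:
    return int.from_bytes(
        hashlib.blake2b(value.encode(), digest_size=8).digest(), "big"
    )

def build_ring(peer_ips):
    """Peers sorted by the hash of their address; identical on every server."""
    return sorted((h(ip), ip) for ip in peer_ips)

def assigned_peer(ring, tunnel: str) -> str:
    hashes = [peer_hash for peer_hash, _ in ring]
    idx = bisect.bisect_right(hashes, h(tunnel)) % len(ring)  # wrap around
    return ring[idx][1]

ring = build_ring(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
my_ip = "10.0.0.2"

for tunnel in ["customer-a/tunnel-1", "customer-b/tunnel-7"]:
    if assigned_peer(ring, tunnel) == my_ip:
        print(f"{my_ip} owns health checks for {tunnel}")

# If a peer vanishes from the heartbeat list, rebuilding the ring without it
# reassigns only that peer's tunnels to the next peer on the ring.
```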

If a server stops sending messages for a long enough period of time, the server gets removed from the known peers list. As a consequence, the next time another server tries to hash a tunnel the removed peer was previously assigned, the tunnel simply gets reassigned to the next peer in the list.

And like magic, we have devised a scheme to consistently assign tunnels to servers in a way that is resilient to server failures and does not require any extra coordination between servers beyond heartbeats. Now, the assigned server can send health checks way more frequently, compose more precise weather forecasts, and share those forecasts without being drowned out by the crowd.

Results

Releasing the new health check system globally reduced Magic Transit’s CPU usage by over 70% and memory usage by nearly 85%.

Memory usage (measured in terabytes):

[Chart: memory usage]

CPU usage (measured in CPU-seconds per two minute interval, averaged over two days):

[Chart: CPU usage]

Reducing the number of multicast messages means that servers can now keep up with the Internet weather, even in the largest data centers. We’re now poised for the next stage of Magic Transit’s growth, just in time for our two-year anniversary.

If you want to help build the future of networking, join our team.

Announcing Project Pangea: Helping Underserved Communities Expand Access to the Internet For Free

Post Syndicated from Marwan Fayed original https://blog.cloudflare.com/pangea/

Half of the world’s population has no access to the Internet, with many more limited to poor, expensive, and unreliable connectivity. This problem persists despite large levels of public investment, private infrastructure, and effort by local organizers.

Today, Cloudflare is excited to announce Project Pangea: a piece of the puzzle to help solve this problem. We’re launching a program that provides secure, performant, reliable access to the Internet for community networks that support underserved communities, and we’re doing it for free¹ because we want to help build an Internet for everyone.

What is Cloudflare doing to help?

Project Pangea is Cloudflare’s project to help bring underserved communities secure connectivity to the Internet through Cloudflare’s global and interconnected network.

Cloudflare is offering our suite of network services — Cloudflare Network Interconnect, Magic Transit, and Magic Firewall — for free to nonprofit community networks, local networks, or other networks primarily focused on providing Internet access to local underserved or developing areas. This service dramatically reduces the cost for communities to connect to the Internet, with industry-leading security and performance functions built in:

  • Cloudflare Network Interconnect provides access to Cloudflare’s edge in 200+ cities across the globe through physical and virtual connectivity options.
  • Magic Transit acts as a conduit to and from the broader Internet and protects community networks by mitigating DDoS attacks within seconds at the edge.
  • Magic Firewall gives community networks access to a network-layer firewall as a service, providing further protection from malicious traffic.

We’ve learned from working with customers that pure connectivity is not enough to keep a network sustainably connected to the Internet. Malicious traffic, such as DDoS attacks, can target a network and saturate Internet service links, which can lead to providers aggressively rate limiting or even entirely shutting down incoming traffic until the attack subsides. This is why we’re including our security services in addition to connectivity as part of Project Pangea: no attacker should be able to keep communities closed off from accessing the Internet.

What is a community network?

Community networks have existed almost as long as commercial Internet subscribership, which began with dial-up service. The Internet Society, or ISOC, describes community networks as happening “when people come together to build and maintain the necessary infrastructure for Internet connection.”

Most often, community networks emerge from need, and in response to the lack or absence of available Internet connectivity. They consistently demonstrate success where public and private-sector initiatives have either failed or under-delivered. We’re not talking about stop-gap solutions here, either — community networks around the world have been providing reliable, sustainable, high-quality connections for years.

Many will operate only within their communities, but many others can grow, and have grown, to regional or national scale. The most common models of governance and operation are as not-for-profits or cooperatives, models that ensure reinvestment within the communities being served. For example, we see networks that reinvest their proceeds to replace Wi-Fi infrastructure with fibre-to-the-home.

Cloudflare celebrates these networks’ successes, and also the diversity of the communities that these networks represent. In that spirit, we’d like to dispel some myths we encountered during the launch of this program — many of which we ourselves wrongly assumed or believed to be true — because these myths turn out to be barriers that communities are so often forced to overcome. Community networks are built on knowledge sharing, so we’re sharing some of that knowledge here, so others can help accelerate community projects and policies rather than rely on the assumptions that impede progress.

Myth #1: Only very rural or remote regions are underserved and in need. It’s true that remote regions are underserved. It is also true that underserved regions exist within 10 km (about six miles) of large city centers, and even within the largest cities themselves, as evidenced by the existence of some of our launch partners.

Myth #2: Remote, rural, or underserved also means low-income. This might just be the biggest myth of all. Rural and remote populations are often thriving communities that can afford service, but have no access. In contrast, urban community networks often emerge for egalitarian reasons, because the access that is available is unaffordable to many.

Myth #3: Service is necessarily more expensive. This myth is sometimes expressed by statements such as, “if large service providers can’t offer affordable access, then no one can.”  More than a myth, this is a lie. Community networks (including our launch partners) use novel governance and cost models to ensure that subscribers pay rates similar to the wider market.

Myth #4: Technical expertise is a hard requirement and is unavailable. There is a rich body of evidence and examples showing that, with small amounts of training and support, communities can build their own local networks cheaply and reliably with commodity hardware and non-specialist equipment.

These myths aside, there is one truth: the path to sustainability is hard. The start and initial growth of community networks often rely on volunteer time or grant funding, which are difficult to sustain in the long term. Eventually these starting models need to transition to models of “willing to charge and willing to pay” — Project Pangea is designed to help fill this gap.

What is the problem?

Communities around the world can and have put up Wi-Fi antennas and laid their own fibre. Even so, and however well-connected the community is to itself, Internet services are prohibitively expensive — if they can be found at all.

Two elements are required to connect to the Internet, and each incurs its own cost:

  • Backhaul connections to an interconnection point — the connection point may be anything from a local cabinet to a large Internet exchange point (IXP).
  • Internet service is provided by a network that interfaces with the wider Internet, and agrees to route traffic to and from it on behalf of the community network.

These are distinct elements. Backhaul service carries data packets along a physical link (a fibre cable or wireless medium). Internet service is separate and may be provided over that link, or at its endpoint.

The cost of Internet service for networks is both dominant and variable (with usage), so in most cases it is cheaper to purchase both as a bundle from service providers that also own or operate their own physical network. Telecommunications and energy companies are prime examples.

However, the operating costs and complexity of long-distance backhaul are significantly lower than the costs of Internet service. If reliable, high-capacity service were affordable, then community networks could sustainably extend their knowledge and governance models to also provide their own backhaul.

For all that community networks can build, establish, and operate, the one element entirely outside their control is the cost of Internet service — a problem that Project Pangea helps to solve.

Why does the problem persist?

On this subject, I — Marwan — can only share insights drawn from prior experience as a computer science professor and a co-founder of HUBS c.i.c., launched with talented professors and a network engineer. HUBS is a not-for-profit backhaul and Internet provider in Scotland. It is a cooperative of more than a dozen community networks — some serving communities with no roads in or out — across thousands of square kilometers along Scotland’s West Coast and Borders regions. As is true of many community networks, not least some of Pangea’s launch partners, HUBS is award-winning and engages in advocacy and policy work.

During that time my co-founders and I engaged with research funders, economic development agencies, three levels of government, and so many communities that I lost track. After all that, the answer to the question is still far from clear. There are, however, noteworthy observations and experiences that stood out, and often came from surprising places:

  • Cables on the ground get chewed by animals that, small or large, might never be seen.
  • Burying power and Ethernet cables, even 15 centimeters below soil, makes no difference because (we think) animals are drawn by the electrical current.
  • Property owners sometimes need to be convinced that giving up 8 to 10 square meters for a small tower, in exchange for free Internet and community benefit, is a good thing.
  • The raising of small towers, even ones that no one will see, is sometimes blocked by legislation or regulation that assumes private non-residential structures can only be a shed, or never taller than a shed.
  • Private fibre backbone installed with public funds is often inaccessible, or is charged by distance, even though the cost to light 100 meters of fibre is identical to the cost of lighting 1 km of fibre.
  • Civil service agencies may be enthusiastic, but are also cautious, even in the face of evidence. Be patient, suffer frustration, be more patient, and repeat. Success is possible.
  • If and where possible, it’s best to avoid attempts to deliver service where national telecommunications companies have plans to do so.
  • Never underestimate tidal fading — twice a day, wireless signals over water will be amazing, and will completely disappear. We should have known!

All anecdotes aside, the best policies and practices are non-trivial — but because of so many prior community efforts, and organizations such as ISOC, the APC, the A4AI, and more, the challenges and solutions are better understood than ever before.

How does a community network reach the Internet?

First, we’d like to honor the many organisations we’ve learned from who might say that there are no technical barriers to success. Connections within the community networks may be shaped by geographical features or regional regulations. For example, wireless lines of sight between antenna towers on personal property are guided by hills or restricted by regulations. Similarly, Ethernet cables and fibre deployments are guided by property ownership, digging rights, and the presence or migration of grazing animals that dig into soil and gnaw at cables — yes, they do, even small rabbits.

Once the community establishes its own area network, the connections to reach Internet services are more conventional, more familiar. In part, the choice is influenced or determined by proximity to Internet exchanges, PoPs, or regional fibre cabinet installations. The connection options for community networks fall into three broad categories.

Colocation. A community network may be fortunate enough to have service coverage that overlaps with, or is near to, an Internet eXchange Point (IXP), as shown in the figure below. In this case a natural choice is to colocate a router within the exchange, near to the Internet service provider’s router (labeled as Cloudflare in the figure). Our launch partner NYC Mesh connects in this manner. Unfortunately, since exchanges are most often located in urban settings, colocation is unavailable to many, if not most, community networks.

[Figure: a community network colocated with the Internet service router at an IXP]

Conventional point-to-point backhaul. Community networks that are remote must establish a point-to-point backhaul connection to the Internet exchange. This connection method is shown in the figure below in which the community network in the previous figure has moved to the left, and is joined by a physical long-distance link to the Internet service router that remains in the exchange on the right.

[Figure: point-to-point backhaul connecting a remote community network to the exchange]

Point-to-point backhaul is familiar. If the infrastructure is available — and this is a big ‘if’ — then backhaul is most often available from a utility company, such as a telecommunications or energy provider, that may also bundle Internet service as a way to reduce total costs. Even bundled, the total cost is variable and unaffordable to individual community networks, and is exacerbated by distance. Some community networks have succeeded in acquiring backhaul through university, research and education, or publicly-funded networks that are compelled or convinced to offer the service in the public interest. On the west coast of Scotland, for example, Tegola launched with service from the University of Highlands and Islands and the University of Edinburgh.

Start a backhaul cooperative for point-to-point and colocation. The last connection option we see among our launch partners overcomes the prohibitive costs by forming a cooperative network in which the individual subscriber community networks are also members. The cooperative model can be seen in the figure below. The exchange remains on the right. On the left the community network in the previous figure is now replaced by a collection of community networks that may optionally connect with each other (for example, to establish reliable routing if any link fails). Either directly or indirectly via other community networks, each of these community networks has a connection to a remote router at the near-end of the point-to-point connection. Crucially, the point-to-point backhaul service — as well as the co-located end-points — are owned and operated by the cooperative. In this manner, an otherwise expensive backhaul service is made affordable by being a shared cost.

[Figure: a cooperative of community networks sharing point-to-point backhaul and colocation]

Two of our launch partners, Guifi.net and HUBS c.i.c., are organised this way and their 10+ years in operation demonstrate both success and sustainability. Since the backhaul provider is a cooperative, the community network members have a say in the ways that revenue is saved, spent, and — best of all — reinvested back into the service and infrastructure.

Why is Cloudflare doing this?

Cloudflare’s mission is to help build a better Internet, for everyone, not just those with privileged access based on their geographical location. Project Pangea aligns with this mission by extending the Internet we’re helping to build — a faster, more reliable, more secure Internet — to otherwise underserved communities.

How can my community network get involved?

Check out our landing page to learn more and apply for Project Pangea today.

The ‘community’ in Cloudflare

Lastly, in a blog post about community networks, we feel it is appropriate to acknowledge the ‘community’ at Cloudflare: Project Pangea is the culmination of multiple projects, and multiple peoples’ hours, effort, dedication, and community spirit. Many, many thanks to all.
______

¹ For eligible networks, free up to 5 Gbps at p95 levels.

Interconnect Anywhere — Reach Cloudflare’s network from 1,600+ locations

Post Syndicated from Matt Lewis original https://blog.cloudflare.com/interconnect-anywhere/

Customers choose Cloudflare for our network performance, privacy, and security. Cloudflare Network Interconnect is the best on-ramp for our customers to utilize our diverse product suite. In the past, we’ve talked about Cloudflare’s physical footprint in 200+ data centers, and how Cloudflare Network Interconnect enabled companies in those data centers to connect securely to Cloudflare’s network. Today, Cloudflare is excited to announce expanded partnerships that allow customers to connect to Cloudflare from their own Layer 2 service fabric. There are now over 1,600 locations where enterprise security and network professionals have the option to connect to Cloudflare securely and privately from their existing fabric.

Interconnect Anywhere is a journey

Since we launched Cloudflare Network Interconnect (CNI) in August 2020, we’ve been focused on extending the availability of Cloudflare’s network to as many places as possible. The initial launch opened up 150 physical locations alongside 25 global partner locations. During Security Week this year, we grew that availability by adding data center partners to our CNI Partner Program. Today, we are adding even more connectivity options by expanding Cloudflare availability to all of our partners’ locations, as well as welcoming CoreSite Open Cloud Exchange (OCX) and Infiny by Epsilon into our CNI Partner Program. This totals 1,638 locations where our customers can now connect securely to the Cloudflare network. As we continue to expand, customers are able to connect the fabric of their choice to Cloudflare from a growing list of data centers.

Fabric Partner                   Enabled Locations
PacketFabric                     180+
Megaport                         700+
Equinix Fabric                   46+
Console Connect                  440+
CoreSite Open Cloud Exchange     22+
Infiny by Epsilon                250+

“We are excited to expand our partnership with Cloudflare to ensure that our mutual customers can benefit from our carrier-class Software Defined Network (SDN) and Cloudflare’s network security in all Packetfabric locations. Now customers can easily connect from wherever they are located to access best of breed security services alongside Packetfabric’s Cloud Connectivity options.”
Alex Henthorn-Iwane, PacketFabric’s Chief Marketing Officer

“With the significant rise in DDoS attacks over the past year, it’s becoming ever more crucial for IT and Operations teams to prevent and mitigate network security threats. We’re thrilled to enable Cloudflare Interconnect everywhere on Megaport’s global Software Defined Network, which is available in over 700 enabled locations in 24 countries worldwide. Our partnership will give organizations the ability to reduce their exposure to network attacks, improve customer experiences, and simplify their connectivity across our private on-demand network in a matter of a few mouse clicks.”
Misha Cetrone, Megaport VP of Strategic Partnerships

“Expanding the connectivity options to Cloudflare ensures that customers can provision hybrid architectures faster and more easily, leveraging enterprise-class network services and automation on the Open Cloud Exchange. Simplifying the process of establishing modern networks helps achieve critical business objectives, including reducing total cost of ownership, and improving business agility as well as interoperability.”
Brian Eichman, CoreSite Senior Director of Product Development

“Partner accessibility is key in our cloud enablement and interconnection strategy. We are continuously evolving to offer our customers and partners simple and secure connectivity no matter where their network resides in the world. Infiny enables access to Cloudflare from across a global footprint while delivering high-quality cloud connectivity solutions at scale. Customers and partners gain an innovative Network-as-a-Service SDN platform that supports them with programmable and automated connectivity for their cloud and networking needs.”
Mark Daley, Epsilon Director of Digital Strategy

Uncompromising security and increased reliability from your choice of network fabric

Now, companies can connect to Cloudflare’s suite of network and security products without traversing shared public networks by taking advantage of software-defined networking providers. No matter where a customer is connected to one of our fabric partners, Cloudflare’s 200+ data centers ensure that world-class network security is close by and readily available via a secure, low latency, and cost-effective connection. An increased number of locations further allows customers to have multiple secure connections to the Cloudflare network, increasing redundancy and reliability. As we further expand our global network and increase the number of data centers where Cloudflare and our partners are connected, latency becomes shorter and customers will reap the benefits.

Let’s talk about how a customer can use Cloudflare Network Interconnect to improve their security posture through a fabric provider.

Plug and Play Fabric connectivity

Acme Corp is an example company that wants to deliver highly performant digital services to their customers and ensure employees can securely connect to their business apps from anywhere. They’ve purchased Magic Transit and Cloudflare Access and are evaluating Magic WAN to secure their network while getting the performance Cloudflare provides. They want to avoid potential network traffic congestion and latency delays, so they have designed a network architecture with their software-defined network fabric and Cloudflare using Cloudflare Network Interconnect.

With Cloudflare Network Interconnect, provisioning this connection is simple. Acme goes to their partner portal and requests a virtual layer 2 connection to Cloudflare with the bandwidth of their choice. Cloudflare accepts the connection, provides the BGP session establishment information, and organizes a turn-up call if required. Easy!

Let’s talk about how Cloudflare and our partners have worked together to simplify the interconnectivity experience for the customer.

With Cloudflare Network Interconnect, availability only gets better

[Figure: connection established with Cloudflare Network Interconnect]

When a customer uses CNI to establish a connection between a fabric partner and Cloudflare, that connection runs over layer 2, configured via the partner user interface. Our new partnership model allows customers to connect privately and securely to Cloudflare’s network even when the customer is not located in the same data center as Cloudflare.

[Figure: shorter customer path with new partner locations]

The diagram above shows a shorter customer path thanks to an incremental partner-enabled location. Every time Cloudflare brings a new data center online and connects with a partner fabric, all customers in that region immediately benefit from the closer proximity and reduced latency.

Network fabrics in action

For those who want to self-serve, we’ve published documentation that details the steps for provisioning these connections with each partner.

As we expand our network, it’s critical we provide more ways to allow our customers to connect easily. We will continue to shorten the time it takes to set up a new interconnection, drive down costs, strengthen security and improve customer experience with all of our Network On-Ramp partners.

If you are using one of our software-defined networking partners and would like to connect to Cloudflare via their fabric, contact your fabric partner account team or reach out to us using the Cloudflare Network Interconnect page. If you are not using a fabric today, but would like to take advantage of software-defined networking to connect to Cloudflare, reach out to your account team.

Flow-based monitoring for Magic Transit

Post Syndicated from Annika Garbers original https://blog.cloudflare.com/flow-based-monitoring-for-magic-transit/

Network-layer DDoS attacks are on the rise, prompting security teams to rethink their L3 DDoS mitigation strategies to prevent business impact. Magic Transit protects customers’ entire networks from DDoS attacks by placing our network in front of theirs, either always on or on demand. Today, we’re announcing new functionality to improve the experience for on-demand Magic Transit customers: flow-based monitoring. Flow-based monitoring allows us to detect threats and notify customers when they’re under attack so they can activate Magic Transit for protection.

Magic Transit is Cloudflare’s solution to secure and accelerate your network at the IP layer. With Magic Transit, you get DDoS protection, traffic acceleration, and other network functions delivered as a service from every Cloudflare data center. With Cloudflare’s global network (59 Tbps of capacity across 200+ cities) and under three seconds to mitigate at the edge, you’re covered from even the largest and most sophisticated attacks without compromising performance. Learn more about Magic Transit here.

Using Magic Transit on demand

With Magic Transit, Cloudflare advertises customers’ IP prefixes to the Internet with BGP in order to attract traffic to our network for DDoS protection. Customers can choose to use Magic Transit always on or on demand. With always on, we advertise their IPs and mitigate attacks all the time; with on demand, customers activate advertisement only when their networks are under active attack. But there’s a problem with on demand: if your traffic isn’t routed through Cloudflare’s network, then by the time you notice you’re being targeted and activate Magic Transit, the attack may have already impacted your business.

On demand with flow-based monitoring

Flow-based monitoring solves this problem by enabling Cloudflare to detect and notify you about attacks based on traffic flows from your data centers. You can configure your routers to continuously send NetFlow or sFlow (coming soon) to Cloudflare. We’ll ingest your flow data and analyze it for volumetric DDoS attacks.
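
As a rough illustration of the idea (not Cloudflare’s actual detection pipeline), a volumetric check over one window of flow records could look like the sketch below; the record format and threshold are invented for the example:

```python
# Aggregate bytes per destination over a window of flow records and flag
# destinations whose average rate exceeds a threshold.
from collections import defaultdict

WINDOW_SECONDS = 60
THRESHOLD_BPS = 2_000_000_000  # flag anything over ~2 Gbps (example value)

def detect_volumetric(flow_records):
    """flow_records: iterable of dicts like {'dst': '203.0.113.7', 'bytes': 1500}."""
    bytes_per_dst = defaultdict(int)
    for record in flow_records:
        bytes_per_dst[record["dst"]] += record["bytes"]
    return [
        dst for dst, total in bytes_per_dst.items()
        if total * 8 / WINDOW_SECONDS > THRESHOLD_BPS
    ]

window = [
    {"dst": "203.0.113.7", "bytes": 40_000_000_000},  # ~5.3 Gbps averaged
    {"dst": "203.0.113.8", "bytes": 1_000_000},
]
print(detect_volumetric(window))  # ['203.0.113.7']
```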

[Figure: send flow data from your network to Cloudflare for analysis]

When an attack is detected, we’ll notify you automatically (by email, webhook, and/or PagerDuty) with information about the attack.

[Figure: Cloudflare detects attacks based on your flow data]

You can choose whether you’d like to activate IP advertisement with Magic Transit manually – we support activation via the Cloudflare dashboard or API – or automatically, to minimize the time to mitigation. Once Magic Transit is activated and your traffic is flowing through Cloudflare, you’ll receive only the clean traffic back to your network over your GRE tunnels.

[Figure: activate Magic Transit for DDoS protection]

Using flow-based monitoring with Magic Transit on demand will provide your team peace of mind. Rather than acting in response to an attack after it impacts your business, you can complete a simple one-time setup and rest assured that Cloudflare will notify you (and/or start protecting your network automatically) when you’re under attack. And once Magic Transit is activated, Cloudflare’s global network and industry-leading DDoS mitigation has you covered: your users can continue business as usual with no impact to performance.

Example flow-based monitoring workflow: faster time to mitigate for Acme Corp

Let’s walk through an example customer deployment and workflow with Magic Transit on demand and flow-based monitoring. Acme Corp’s network was hit by a large ransom DDoS attack recently, which caused downtime for both external-facing and internal applications. To make sure they’re not impacted again, the Acme network team chose to set up on-demand Magic Transit. They authorize Cloudflare to advertise their IP space to the Internet in case of an attack, and set up Anycast GRE tunnels to receive clean traffic from Cloudflare back to their network. Finally, they configure their routers at each data center to send NetFlow data to a Cloudflare Anycast IP.

Cloudflare receives Acme’s NetFlow data at a location close to the data center sending it (thanks, Anycast!) and analyzes it for DDoS attacks. When traffic exceeds attack thresholds, Cloudflare triggers an automatic PagerDuty incident for Acme’s NOC team and starts advertising Acme’s IP prefixes to the Internet with BGP. Acme’s traffic, including the attack, starts flowing through Cloudflare within minutes, and the attack is blocked at the edge. Clean traffic is routed back to Acme through their GRE tunnels, causing no disruption to end users – they’ll never even know Acme was attacked. When the attack has subsided, Acme’s team can withdraw their prefixes from Cloudflare with one click, returning their traffic to its normal path.
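
For teams that script the activation step, it can be a single authenticated API call. The sketch below follows the general shape of Cloudflare’s dynamic advertisement API, but treat the exact path and fields as an assumption to verify against the current API documentation; all identifiers are placeholders.

```python
# Toggling prefix advertisement programmatically. The endpoint path and
# request body are assumptions based on Cloudflare's dynamic advertisement
# API docs; account, prefix, and token values are placeholders.
import requests

API = "https://api.cloudflare.com/client/v4"
ACCOUNT_ID = "your-account-id"  # placeholder
PREFIX_ID = "your-prefix-id"    # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}  # placeholder token

def set_advertisement(advertised: bool) -> dict:
    """Advertise (True) or withdraw (False) the prefix via BGP."""
    url = f"{API}/accounts/{ACCOUNT_ID}/addressing/prefixes/{PREFIX_ID}/bgp/status"
    resp = requests.patch(url, headers=HEADERS, json={"advertised": advertised})
    resp.raise_for_status()
    return resp.json()

set_advertisement(True)   # activate Magic Transit when the alert fires
set_advertisement(False)  # withdraw when the attack has subsided
```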

[Figure: when the attack subsides, withdraw your prefixes from Cloudflare to return traffic to its normal path]

Get started

To learn more about Magic Transit and flow-based monitoring, contact us today.

Ransom DDoS attacks target a Fortune Global 500 company

Post Syndicated from Omer Yoachimik original https://blog.cloudflare.com/ransom-ddos-attacks-target-a-fortune-global-500-company/

In late 2020, a major Fortune Global 500 company was targeted by a Ransom DDoS (RDDoS) attack by a group claiming to be the Lazarus Group. Cloudflare quickly onboarded them to the Magic Transit service and protected them against the lingering threat. This extortion attempt was part of wider ransom campaigns that have been unfolding throughout the year, targeting thousands of organizations around the world. Extortionists are threatening organizations with crippling DDoS attacks if they do not pay a ransom.

Throughout 2020, Cloudflare onboarded and protected many organizations with Magic Transit, Cloudflare’s DDoS protection service for critical network infrastructure, the WAF service for HTTP applications, and the Spectrum service for TCP/UDP based applications — ensuring their business’s availability and continuity.

Unwinding the attack timeline

I spoke with Daniel (a pseudonym) and his team, who work on the Incident Response and Forensics team at the aforementioned company. I wanted to learn about their experience and share it with our readers so they could learn how to better prepare for such an event. The company has requested to stay anonymous, so some details have been omitted. In this blog post, I will refer to the company as X.

Initially, the attacker sent ransom emails to a handful of X’s publicly listed email aliases such as press@, shareholder@, and hostmaster@. We’ve heard from other customers that in some cases, non-technical employees received the email and ignored it as being spam which delayed the incident response team’s time to react by hours. However, luckily for X, a network engineer that was on the email list of the hostmaster@ alias saw it and immediately forwarded it to Daniel’s incident response team.

In the ransom email, the attackers demanded 20 bitcoin and gave them a week to pay up, or else a second larger attack would strike, and the ransom would increase to 30 bitcoin. Daniel says that they had a contingency plan ready for this situation and that they did not intend to pay. Paying the ransom funds illegitimate activities, motivates the attackers, and does not guarantee that they won’t attack anyway.

…Please perform a google search of “Lazarus Group” to have a look at some of our previous work. Also, perform a search for “NZX” or “New Zealand Stock Exchange” in the news. You don’t want to be like them, do you?…

The current fee is 20 Bitcoin (BTC). It’s a small price to pay for what will happen if your whole network goes down. Is it worth it? You decide!…

If you decide not to pay, we will start the attack on the indicated date and uphold it until you do. We will completely destroy your reputation and make sure your services will remain offline until you pay…

–An excerpt of the ransom note

The contingency plan

Upon receiving the email from the network engineer, Daniel called him and they started combing through the network data — they noticed a significant increase in traffic towards one of their global data centers. This attacker was not playing around, firing gigabits per second towards a single server. The attack, despite just being a proof of intention, saturated the Internet uplink to that specific data center, causing a denial of service event and generating a series of failure events.

This first “teaser” attack came on a work day, towards the end of business hours as people were already wrapping up their day. At the time, X was not protected by Cloudflare and relied on an on-demand DDoS protection service. Daniel activated the contingency plan which relied on the on-demand scrubbing center service.

Daniel contacted their DDoS protection service. It took them over 30 minutes to activate the service and redirect X’s traffic to the scrubbing center. Activating the on-demand service caused networking failures and resulted in multiple incidents for X on various services — even ones that were not under attack. Daniel says hindsight is 2020, and he realized that an always-on service would have been much more effective than an on-demand, reactionary control that takes time to implement after the impact is already felt. The networking failures amplified the one-hour attack, resulting in incidents that lasted much longer than expected.

Onboarding to Cloudflare

Following the initial attack, Daniel’s team reached out to Cloudflare and wanted to onboard to our automated always-on DDoS protection service, Magic Transit. The goal was to onboard before the second attack would strike. Cloudflare explained the pre-onboarding steps, provided details on the process, and helped onboard X’s network in a process Daniel described as “quite painless and very professional. The speed and responsiveness were impressive. One of the key differentiators is the attack and traffic analytics that we see, which our incumbent provider couldn’t provide us. We’re seeing attacks we never knew about being mitigated automatically.”

The attackers promised a second, huge attack, which never happened. Perhaps it was just an empty threat, or perhaps the attackers detected that X was now protected by Cloudflare, were deterred, and decided to move on to their next target.

Recommendations for organizations

I asked Daniel if he has any recommendations for businesses so they can learn from his experience and be better prepared, should they be targeted by ransom attacks:

1. Utilize an automated always-on DDoS protection service

Do not rely on reactive, on-demand, SOC-based DDoS protection services that require humans to analyze attack traffic. It just takes too long. Don’t be tempted to use an on-demand service: “you get all of the pain and none of the benefits”. Instead, onboard to a cloud service that has sufficient network capacity and automated DDoS mitigation systems.

2. Work with your vendor to build and understand your threat model

Work together with your DDoS protection vendor to tailor mitigation strategies to your workload. Every network is different, and each poses unique challenges when integrating with DDoS mitigation systems.

3. Create a contingency plan and educate your employees

Be prepared. Have plans ready and train your teams on them. Educate all of your employees, even the non-techies, on what to do if they receive a ransom email. They should report it immediately to your Security Incident Response team.

Cloudflare customers need not worry, as they are protected. Enterprise customers can reach out to their account team if they are being extorted, in order to review and optimize their security posture as needed. Customers on all other plans can reach out to our support teams to learn more about how to optimize their Cloudflare security configuration.

Not a Cloudflare customer yet? Speak to an expert or sign up.

Beat – An Acoustics Inspired DDoS Attack

Post Syndicated from Omer Yoachimik original https://blog.cloudflare.com/beat-an-acoustics-inspired-ddos-attack/

During the week of Black Friday, Cloudflare automatically detected and mitigated a unique ACK DDoS attack, which we’ve codenamed “Beat”, that targeted a Magic Transit customer. Usually, when attacks make headlines, it’s because of their size. In this case, however, it’s not the size that is unique but the method, which appears to have been borrowed from the world of acoustics.

Acoustics-inspired attack

As can be seen in the graph below, the attack’s packet rate follows a wave-shaped pattern for over 8 hours. It seems as though the attacker was inspired by an acoustics concept called beat. In acoustics, a beat is the interference pattern produced by two waves of slightly different frequencies — the superposition of the two waves. As the waves drift in and out of phase with one another, they alternately amplify the sound when they merge and cancel one another when they are out of sync, creating the beating effect.

[Figure: Beat DDoS attack packet rate over time]

Academo.org has a nice tool where you can create your own beat wave. As you can see in the screenshot below, the two waves in blue and red are out of phase, and the purple wave is their superposition: the beat wave.

[Screenshot: two out-of-phase waves and their beat superposition]
Source: https://academo.org/demos/wave-interference-beat-frequency/ 
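
You can reproduce the same effect in a few lines of NumPy; the frequencies below are arbitrary demo values chosen close together so the beat envelope is easy to see:

```python
# Two slightly detuned sine waves sum to a wave whose envelope rises and
# falls at the difference frequency: the beat.
import numpy as np

t = np.linspace(0, 2.0, 20_000)  # two seconds of samples
f1, f2 = 440.0, 444.0            # two slightly detuned tones (demo values)
y_beat = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

# The loud/quiet cycle repeats |f1 - f2| = 4 times per second, and the
# peaks reach amplitude ~2 where the waves line up in phase.
print(f"beat frequency: {abs(f1 - f2):.0f} Hz, "
      f"peak amplitude: {np.max(np.abs(y_beat)):.2f}")
```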

Reverse engineering the attack

It looks like the attacker launched a flood of packets where the packet rate is determined by the equation of the beat wave: y_beat = y1 + y2, where y1 and y2 represent the two waves.

Each wave is expressed as

y_i = sin(2π · f_i · t)

where f_i is the frequency of each wave and t is time.

Therefore, the packet rate of the attack is determined by manipulating the equation

y_beat = sin(2π · f1 · t) + sin(2π · f2 · t)

to achieve a packet rate that ranges from ~18M to ~42M pps.

To get to the scale of this attack, we need to multiply y_beat by a scaling variable a and add a constant c, giving us y = a · y_beat + c. Now, it’s been a while since I played around with equations, so I’m only going to try to get an approximation of the equation.

By observing the attack graph, we can guesstimate that a ≈ 6M and c ≈ 30M, since y_beat swings between −2 and 2 while the attack rate swings between roughly 18M and 42M pps.

By playing around with Desmos’s cool graph visualizer tool, if we set f1 = 0.0000345 and f2 = 0.00003455, we can generate a graph that resembles the attack graph. Plugging in those variables, we get:

[Graph: the reconstructed beat wave, resembling the attack pattern]

Now, this formula assumes just one node firing the packets. However, this specific attack was globally distributed, and if we assume that each node, or bot, in this botnet was firing an equal number of packets at an equal rate, then we can divide the equation by the size of the botnet, the number of bots b. The final equation is something in the form of:

y = (a · (sin(2π · f1 · t) + sin(2π · f2 · t)) + c) / b

In the screenshot below, g = f1. You can view this graph here.

[Graph: the per-bot beat equation visualized]
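
As a sanity check on this reconstruction, plugging the guesstimated parameters into the formula does span the observed packet-rate range; every constant below is an approximation derived above, not a measured value:

```python
# Evaluate the reconstructed beat-wave packet rate over eight hours.
import numpy as np

a, c = 6e6, 30e6                       # scale and offset guesstimates
f1, f2 = 0.0000345, 0.00003455         # frequencies from the Desmos fit
t = np.linspace(0, 8 * 3600, 100_000)  # eight hours, in seconds

rate = a * (np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)) + c
print(f"min ~{rate.min() / 1e6:.0f}M pps, max ~{rate.max() / 1e6:.0f}M pps")
# Roughly 18M to 42M pps, matching the attack graph.
```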

Beating the drum

The attacker may have utilized this method in order to try and overcome our DDoS protection systems (perhaps thinking that the rhythmic rise and fall of the attack would fool our systems). However, flowtrackd, our unidirectional TCP state tracking machine, detected it as being a flood of ACK packets that do not belong to any existing TCP connection. Therefore, flowtrackd automatically dropped the attack packets at Cloudflare’s edge.

The attacker was beating the drum for over 19 hours with an amplitude of ~7 Mpps, a wavelength of ~4 hours, and a peak of ~42 Mpps. During the two days in which the attack took place, Cloudflare systems automatically detected and mitigated over 700 DDoS attacks that targeted this customer. The attack traffic accumulated to almost 500 terabytes, out of a total of 3.6 petabytes of attack traffic that targeted this single customer in November alone. During those two days, the attackers utilized mainly ACK floods, UDP floods, SYN floods, Christmas tree floods (where all of the TCP flags are ‘lit’), ICMP floods, and RST floods.

The challenge of TCP-based attacks

TCP is a stateful protocol, which means that in some cases, you’d need to keep track of a TCP connection’s state in order to know if a packet is legitimate or part of an attack, i.e. out of state. We were able to provide protection against out-of-state TCP packet attacks for our “classic” WAF/CDN service and Spectrum service because in both cases Cloudflare serves as a reverse-proxy seeing both ingress and egress traffic.

However, when we launched Magic Transit, which relies on an asymmetric routing topology with a direct server return (DSR), we couldn’t utilize our existing TCP connection tracking systems.

And so, being a software-defined company, we’re able to write code and spin up software when and where needed — as opposed to vendors that utilize dedicated DDoS protection hardware appliances. And that is what we did. We built flowtrackd, which runs autonomously on each server at our network’s edge. flowtrackd is able to classify the state of TCP flows by analyzing only the ingress traffic, and then drops, challenges, or rate-limits attack packets that do not correspond to an existing flow.
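
A drastically simplified sketch of the core idea follows; flowtrackd itself is far more sophisticated, but the question it answers per packet is the same: does this packet belong to a flow we have seen?

```python
# Unidirectional TCP state tracking in miniature: remember flows that opened
# with an ingress SYN, and drop ACKs that match no known flow.
SYN, ACK = 0x02, 0x10

known_flows = set()

def handle_packet(src, dst, sport, dport, flags):
    flow = (src, dst, sport, dport)
    if flags & SYN:
        # An ingress SYN starts a candidate flow. A real tracker would also
        # age entries out and guard against SYN floods.
        known_flows.add(flow)
        return "pass"
    if flags & ACK and flow not in known_flows:
        return "drop"  # out-of-state ACK: the Beat attack's packets land here
    return "pass"

print(handle_packet("203.0.113.9", "198.51.100.1", 1234, 443, SYN))  # pass
print(handle_packet("203.0.113.9", "198.51.100.1", 1234, 443, ACK))  # pass
print(handle_packet("198.18.0.77", "198.51.100.1", 9999, 443, ACK))  # drop
```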

flowtrackd works together with our two additional DDoS protection systems, dosd and Gatebot, to ensure our customers are protected against DDoS attacks, regardless of their size or sophistication — in this case, serving as a noise-canceling system for the Beat attack and reducing the headaches for our customers.

Read more about how our DDoS protection systems work here.