All posts by Annika Garbers

A stronger bridge to Zero Trust

Post Syndicated from Annika Garbers original

We know that migration to Zero Trust architecture won’t be an overnight process for most organizations, especially those with years of traditional hardware deployments and networks stitched together through M&A. But part of why we’re so excited about Cloudflare One is that it provides a bridge to Zero Trust for companies migrating from legacy network architectures.

Today, we’re doubling down on this — announcing more enhancements to the Cloudflare One platform that make a transition from legacy architecture to the Zero Trust network of the future easier than ever: new plumbing for more Cloudflare One on-ramps, expanded support for additional IPsec parameters, and easier on-ramps from your existing SD-WAN appliances.

Any on- or off-ramp: fully composable and interoperable

When we announced our vision for Cloudflare One, we emphasized the importance of allowing customers to connect to our network however they want — with hardware devices they’ve already deployed, with any carrier they already have in place, with existing technology standards like IPsec tunnels or more Zero Trust approaches like our lightweight application connector. In hundreds of customer conversations since that launch, we’ve heard you reiterate the importance of this flexibility. You need a platform that meets you where you are today and gives you a smooth path to your future network architecture by acting as a global router with a single control plane for any way you want to connect and manage your network traffic.

We’re excited to share that over the past few months, the last pieces of this puzzle have fallen into place, and customers can now use any Cloudflare One on-ramp and off-ramp together to route traffic seamlessly between devices, offices, data centers, cloud properties, and self-hosted or SaaS applications. This includes (new since our last announcement, and rounding out the compatibility matrix below) the ability to route traffic from networks connected with a GRE tunnel, IPsec tunnel, or CNI to applications connected with Cloudflare Tunnel.

Fully composable Cloudflare One on-ramps

| From ↓ To → | BYOIP | WARP client | CNI | GRE tunnel | IPsec tunnel | Cloudflare Tunnel |
| --- | --- | --- | --- | --- | --- | --- |
| WARP client | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| GRE tunnel | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| IPsec tunnel | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |

This interoperability is key to organizations’ strategy for migrating from legacy network architecture to Zero Trust. You can start by improving performance and enhancing security using technologies that look similar to what you’re used to today, and incrementally upgrade to Zero Trust at a pace that makes sense for your organization.

Expanded options and easier management of Anycast IPsec tunnels

We’ve seen incredibly exciting demand since our launch of Anycast IPsec as an on-ramp for Cloudflare One back in December. Since IPsec has been the industry standard for encrypted network connectivity for almost thirty years, there are many implementations and parameters available to choose from, and our customers are using a wide variety of network devices to terminate these tunnels. To make the process of setting up and managing IPsec tunnels from any network easier, we’ve built on top of our initial release with support for new parameters, a new UI and Terraform provider support, and step-by-step guides for popular implementations.

  • Expanded support for additional configuration parameters: We started with a small set of default parameters based on industry best practices, and have expanded from there – you can see the up-to-date list in our developer docs. Since we wrote our own IPsec implementation from scratch (read more about why in our announcement blog), we’re able to add support for new parameters with just a single (quick!) development cycle. If the settings you’re looking for aren’t on our list yet, contact us to learn about our plans for supporting them.
  • Configure and manage tunnels from the Cloudflare dashboard: Anycast IPsec and GRE tunnel configuration can be managed with just a few clicks from the Cloudflare dashboard. After creating a tunnel, you can view connectivity to it from every Cloudflare location worldwide and run traceroutes or packet captures on-demand to get a more in-depth view of your traffic for troubleshooting.
  • Terraform provider support to manage your network as code: Busy IT teams love that they can manage all their network configuration as code, from a single place, with our Terraform provider.
  • Step-by-step guides for setup with your existing devices: We’ve developed and will continue to add new guides in our developer docs to walk you through establishing IPsec tunnels with Cloudflare from a variety of devices.
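As a sketch of what managing tunnel configuration as code can look like, the snippet below assembles the JSON body for a tunnel-creation API call in Python. The field names and the supported-parameter list here are illustrative assumptions, not the real schema – consult the developer docs for the authoritative list of supported IPsec parameters.

```python
import json

# Illustrative subset of IPsec parameters – see the developer docs for the
# authoritative, up-to-date list of supported options.
SUPPORTED_ENCRYPTION = {"aes-gcm-16-256", "aes-gcm-16-128"}

def build_ipsec_tunnel_payload(name, customer_endpoint, cloudflare_endpoint,
                               interface_address, encryption="aes-gcm-16-256"):
    """Assemble the JSON body for a hypothetical tunnel-creation API call."""
    if encryption not in SUPPORTED_ENCRYPTION:
        raise ValueError(f"unsupported encryption algorithm: {encryption}")
    return {
        "name": name,
        "customer_endpoint": customer_endpoint,      # your device's public IP
        "cloudflare_endpoint": cloudflare_endpoint,  # anycast IP assigned to you
        "interface_address": interface_address,      # interior /31 for the tunnel
        "encryption": encryption,
    }

payload = build_ipsec_tunnel_payload(
    "branch-office-1", "203.0.113.10", "192.0.2.1", "10.100.0.0/31")
print(json.dumps(payload, indent=2))
```

Keeping configuration in a reviewable structure like this (or in Terraform) is the point of managing your network as code: changes become repeatable and auditable.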

(Even) easier on-ramp from your existing SD-WAN appliances

We’ve heard from you consistently that you want to be able to use whatever hardware you have in place today to connect to Cloudflare One. One of the easiest on-ramp methods is leveraging your existing SD-WAN appliances to connect to us, especially for organizations with many locations. Previously, we announced partnerships with leading SD-WAN providers to make on-ramp configuration even smoother; today, we’re expanding on this by introducing new integration guides for additional devices and tunnel mechanisms including Cisco Viptela. Your IT team can follow these verified step-by-step instructions to easily configure connectivity to Cloudflare’s network.

Get started on your Zero Trust journey today

Our team is helping thousands of organizations like yours transition from legacy network architecture to Zero Trust – and we love hearing from you about the new products and features we can continue building to make this journey even easier. Learn more about Cloudflare One or reach out to your account team to talk about how we can partner to transform your network, starting today!

Next generation intrusion detection: an update on Cloudflare’s IDS capabilities

In an ideal world, intrusion detection would apply across your entire network – data centers, cloud properties, and branch locations. It wouldn’t impact the performance of your traffic. And there’d be no capacity constraints. Today, we’re excited to bring this one step closer to reality by announcing the private beta of Cloudflare’s intrusion detection capabilities: live monitoring for threats across all of your network traffic, delivered as-a-service — with none of the constraints of legacy hardware approaches.

Cloudflare’s Network Services, part of Cloudflare One, help you connect and secure your entire corporate network — data center, cloud, or hybrid — from DDoS attacks and other malicious traffic. You can apply Firewall rules to keep unwanted traffic out or enforce a positive security model, and integrate custom or managed IP lists into your firewall policies to block traffic associated with known malware, bots, or anonymizers. Our new Intrusion Detection System (IDS) capabilities expand on these critical security controls by actively monitoring for a wide range of known threat signatures in your traffic.

What is an IDS?

Intrusion Detection Systems are traditionally deployed as standalone appliances, but are often incorporated as features in more modern or higher-end firewalls. They expand the security coverage of traditional firewalls – which focus on blocking traffic you know you don’t want in your network – by analyzing traffic against a broader threat database, detecting a variety of sophisticated attacks such as ransomware, data exfiltration, and network scanning based on signatures or “fingerprints” in network traffic. Many IDSs also incorporate anomaly detection, monitoring activity against a baseline to identify unexpected traffic patterns that could indicate malicious activity. (If you’re interested in the evolution of network firewall capabilities, we recommend our blog post where we dive deeper on the topic.)
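The signature-matching idea can be illustrated in a few lines of Python. The patterns below are toy stand-ins, not real Suricata rules – production signatures also match on ports, protocol fields, and flow state:

```python
import re

# Toy signature set, loosely modeled on Suricata-style rules: each entry pairs
# a payload pattern ("fingerprint") with an alert message.
SIGNATURES = [
    (re.compile(rb"(?i)select\s+.+\s+from\s+.+--"), "SQL injection attempt"),
    (re.compile(rb"\x4d\x5a\x90\x00"), "Windows executable in payload (possible malware)"),
    (re.compile(rb"(?i)user-agent:\s*sqlmap"), "known scanner user agent"),
]

def inspect(payload: bytes):
    """Return alert messages for every signature the payload matches."""
    return [msg for pattern, msg in SIGNATURES if pattern.search(payload)]

alerts = inspect(b"GET /?q=SELECT name FROM users-- HTTP/1.1\r\nUser-Agent: sqlmap\r\n")
print(alerts)  # ['SQL injection attempt', 'known scanner user agent']
```

Anomaly detection complements this approach: instead of matching known fingerprints, it flags deviations from a learned baseline of normal traffic.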

What problems have users encountered with existing IDS solutions?

We’ve interviewed tons of customers about their experiences deploying IDS and the pain points they’re hoping we can solve. Customers mentioned the full list of historical problems we hear frequently with other hardware-based security solutions: capacity planning; location planning and backhauling traffic through a central location for monitoring; downtime for installation, maintenance, and upgrades; and vulnerability to congestion or failure under large volumes of traffic (e.g. DDoS attacks).

Customers we talked to also consistently cited the challenge of making trade-off decisions between security and performance for their network traffic. One network engineer explained:

“I know my security team hates me for this, but I can’t let them enable the IDS function on our on-prem firewalls – in the tests my team ran, it cut my throughput by almost a third. I know we have this gap in our security now, and we’re looking for an alternative way to get IDS coverage for our traffic, but I can’t justify slowing down the network for everyone in order to catch some theoretical bad traffic.”

Finally, customers who did choose to take the performance hit and invest in an IDS appliance reported that they often mute or ignore the feed of alerts coming into their SOC after turning it on. With the amount of noise on the Internet and the potential risk of missing an important signal, IDSs can end up generating a lot of false positives or non-actionable notifications. This volume can lead busy SOC teams to get alert fatigue and end up silencing potentially important signals buried in the noise.

How is Cloudflare tackling these problems?

We believe there’s a more elegant, efficient, and effective way to monitor all of your network traffic for threats without introducing performance bottlenecks or burning your team out with non-actionable alerts. Over the past year and a half, we’ve learned from your feedback, experimented with different technology approaches, and developed a solution that takes those tough trade-off decisions out of the picture.

One interface across all your traffic

Cloudflare’s IDS capabilities operate across all of your network traffic – any IP, port, or protocol – whether it flows to your IPs that we advertise on your behalf, IPs we lease to you, or, soon, traffic within your private network. You can enforce consistent monitoring and security control across your entire network in one place.

No more hardware headaches

Like all of our security functions, we built our IDS from scratch in software, and it is deployed across every server on Cloudflare’s global Anycast network. This means:

  • No more capacity planning: Cloudflare’s entire global network capacity is now the capacity of your IDS – currently 142 Tbps and counting.
  • No more location planning: No more picking regions, backhauling traffic to central locations, or deploying primary and backup appliances – because every server runs our IDS software and traffic is automatically attracted to the closest network location to its source, redundancy and failover are built in.
  • No maintenance downtime: Improvements to Cloudflare’s IDS capabilities, like all of our products, are deployed continuously across our global network.

Threat intelligence from across our interconnected global network

The attack landscape is constantly evolving, and you need an IDS that stays ahead of it. Because Cloudflare’s IDS is delivered in software we wrote from the ground up and maintain, we’re able to continuously feed threat intelligence from the 20+ million Internet properties on Cloudflare back into our policies, keeping you protected from both known and new attack patterns.

Our threat intelligence combines open-source feeds that are maintained and trusted by the security community – like Suricata threat signatures – with information collected from our unique vantage point as an incredibly interconnected network carrying a significant percentage of all Internet traffic. Not only do we share these insights publicly through tools like Cloudflare Radar; we also feed them back into our security tools including IDS so that our customers are protected as quickly as possible from emerging threats. Cloudflare’s newly announced Threat Intel team will augment these capabilities even further, applying additional expertise to understanding and deriving insights from our network data.

Excited to get started?

If you’re an Advanced Magic Firewall customer, you can get access to these features in private beta starting now. Reach out to your account team to learn more or get started – we can’t wait to hear your feedback as we continue to develop these capabilities!

Zero Trust, SASE and SSE: foundational concepts for your next-generation network

If you’re a security, network, or IT leader, you’ve most likely heard the terms Zero Trust, Secure Access Service Edge (SASE) and Secure Service Edge (SSE) used to describe a new approach to enterprise network architecture. These frameworks are shaping a wave of technology that will fundamentally change the way corporate networks are built and operated, but the terms are often used interchangeably and inconsistently. It can be easy to get lost in a sea of buzzwords and lose track of the goals behind them: a more secure, faster, more reliable experience for your end users, applications, and networks. Today, we’ll break down each of these concepts — Zero Trust, SASE, and SSE — and outline the critical components required to achieve these goals. An evergreen version of this content is available at our Learning Center here.

What is Zero Trust?

Zero Trust is an IT security model that requires strict identity verification for every person and device trying to access resources on a private network, regardless of whether they are sitting within or outside the network perimeter. This is in contrast to the traditional perimeter-based security model, where users are able to access resources once they’re granted access to the network — also known as a “castle and moat” architecture.

More simply put: traditional IT network security trusts anyone and anything inside the network. A Zero Trust architecture trusts no one and nothing. You can learn more about Zero Trust security here.
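A minimal sketch of the difference, with simplified, assumed checks (real deployments verify identity, device posture, MFA, and context on every request):

```python
from dataclasses import dataclass

@dataclass
class Request:
    user: str
    device_healthy: bool
    on_corporate_network: bool
    mfa_passed: bool

def castle_and_moat_allows(req: Request) -> bool:
    # Traditional model: being inside the perimeter is enough.
    return req.on_corporate_network

def zero_trust_allows(req: Request) -> bool:
    # Zero Trust: identity and device posture are verified on every request;
    # network location grants nothing by itself.
    return req.mfa_passed and req.device_healthy

compromised_laptop = Request("mallory", device_healthy=False,
                             on_corporate_network=True, mfa_passed=False)
print(castle_and_moat_allows(compromised_laptop))  # True – the perimeter model trusts it
print(zero_trust_allows(compromised_laptop))       # False – it fails verification
```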

What is Secure Access Service Edge (SASE)?

Gartner introduced SASE as the framework to implement a Zero Trust architecture across any organization. SASE combines software-defined networking capabilities with a number of network security functions, all of which are delivered from a single cloud platform. In this way, SASE enables employees to authenticate and securely connect to internal resources from anywhere, and gives organizations better control over the traffic and data that enters and leaves their internal network.

The Secure Access component of SASE includes defining Zero Trust security policies across user devices and applications as well as branch, data center, and cloud traffic. The Service Edge component allows all traffic, regardless of its location, to pass through the Secure Access controls — without requiring backhauling to a central “hub” where those controls are enforced. You can learn more about SASE here.

What is Security Service Edge (SSE)?

SSE, also coined by Gartner, is a subset of SASE functionality specifically focused on security enforcement capabilities. It is a common stepping stone to a full SASE deployment, which extends SSE security controls to the corporate Wide Area Network (WAN) and includes software-defined networking capabilities such as traffic shaping and quality of service. You can learn more about SSE here.

What makes up SASE?

The most commonly available definitions of SASE list a number of security functions like Zero Trust Network Access (ZTNA) and Cloud Access Security Broker (CASB), focusing on what a SASE platform needs to do. Security functions are a critical piece of the story, but these definitions are incomplete: they miss describing how the functions are achieved, which is just as important.

The complete definition of SASE builds on this list of security functions to include three distinct aspects: secure access, on-ramps, and service edge.

Cloudflare One: a comprehensive SASE platform

Cloudflare One is a complete SASE platform that combines a holistic set of secure access functions with flexible on-ramps to connect any traffic source and destination, all delivered on Cloudflare’s global network that acts as a blazing fast and reliable service edge. For organizations who want to start with SSE as a stepping stone to SASE, Cloudflare One also has you covered. It’s completely composable, so components can be deployed individually to address immediate use cases and build toward a full SASE architecture at your own pace.

Let’s break down each of the components of a SASE architecture in more detail and explain how Cloudflare One delivers them.

Secure access: security functions

Secure Access functions operate across your traffic to keep your users, applications, network, and data secure. In an input/process/output (IPO) model, you can think of secure access as the processes that monitor and act on your traffic.

Zero Trust Network Access (ZTNA)

Zero Trust Network Access is the technology that makes it possible to implement a Zero Trust security model by requiring strict verification for every user and every device before authorizing them to access internal resources. Compared to traditional virtual private networks (VPNs), which grant access to an entire local network at once, ZTNA only grants access to the specific application requested and denies access to applications and data by default.

ZTNA can work together with other application security functions, like Web Application Firewalls, DDoS protection, and bot management, to provide complete protection for applications on the public Internet. More on ZTNA here.
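The default-deny, per-application behavior can be sketched like this – the policy shape, application names, and groups are illustrative:

```python
# ZTNA sketch: access is granted per application, default-deny, unlike a VPN
# that exposes the whole local network once a user connects.
POLICIES = {
    "wiki.internal":    {"allowed_groups": {"engineering", "sales"}},
    "payroll.internal": {"allowed_groups": {"finance"}},
}

def ztna_decision(app: str, user_groups: set) -> str:
    policy = POLICIES.get(app)
    if policy is None:
        return "deny"  # application not in the catalog: denied by default
    if user_groups & policy["allowed_groups"]:
        return "allow"
    return "deny"

print(ztna_decision("wiki.internal", {"engineering"}))    # allow
print(ztna_decision("payroll.internal", {"engineering"})) # deny
print(ztna_decision("git.internal", {"engineering"}))     # deny
```

Contrast this with a VPN, where the equivalent decision is made once, at connect time, for the entire network.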

Cloudflare One includes a ZTNA solution, Cloudflare Access, which operates in client-based or clientless modes to grant access to self-hosted and SaaS applications.

Secure Web Gateway (SWG)

A Secure Web Gateway operates between a corporate network and the Internet to enforce security policies and protect company data. Whether traffic originates from a user device, branch office, or application, SWGs provide layers of protection including URL filtering, malware detection and blocking, and application control. As a higher and higher percentage of corporate network traffic shifts from private networks to the Internet, deploying SWG has become critical to keeping company devices, networks, and data safe from a variety of security threats.

SWGs can work together with other tools including Web Application Firewalls and Network Firewalls to secure both inbound and outbound traffic flows across a corporate network. They can also integrate with Remote Browser Isolation (RBI) to prevent malware and other attacks from affecting corporate devices and networks, without completely blocking user access to Internet resources. More on SWG here.
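As a toy illustration of those filtering layers, the sketch below applies category and file-type checks to an outbound request. The categories and lists are invented stand-ins for a provider’s threat feeds:

```python
# Secure Web Gateway sketch: layered checks on an outbound request.
BLOCKED_CATEGORIES = {"malware", "phishing"}
DOMAIN_CATEGORIES = {
    "examplemalware.test": "malware",   # hypothetical flagged domain
    "news.example": "news",
}
BLOCKED_FILE_TYPES = {".exe", ".scr"}   # example application-control rule

def swg_filter(domain: str, path: str) -> str:
    category = DOMAIN_CATEGORIES.get(domain, "uncategorized")
    if category in BLOCKED_CATEGORIES:
        return f"block (category: {category})"
    if any(path.endswith(ext) for ext in BLOCKED_FILE_TYPES):
        return "block (disallowed file type)"
    return "allow"

print(swg_filter("examplemalware.test", "/index.html"))  # block (category: malware)
print(swg_filter("news.example", "/tool.exe"))           # block (disallowed file type)
print(swg_filter("news.example", "/article"))            # allow
```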

Cloudflare One includes a SWG solution, Cloudflare Gateway, which provides DNS, HTTP, and network filtering for traffic from user devices and network locations.

Remote Browser Isolation (RBI)

Browser isolation is a technology that keeps browsing activity secure by separating the process of loading webpages from the user devices displaying the webpages. This way, potentially malicious webpage code does not run on a user’s device, preventing malware infections and other cyber attacks from impacting both user devices and internal networks.

RBI works together with other secure access functions – for example, security teams can configure Secure Web Gateway policies to automatically isolate traffic to known or potentially suspicious websites. More on Browser Isolation here.

Cloudflare One includes Browser Isolation. In contrast to legacy remote browser approaches, which send a slow and clunky version of the web page to the user, Cloudflare Browser Isolation draws an exact replica of the page on the user’s device, and then delivers that replica so quickly that it feels like a regular browser.

Cloud Access Security Broker (CASB)

A cloud access security broker scans, detects, and continuously monitors for security issues in SaaS applications. Organizations use CASB for:

  • Data security – e.g. ensuring sensitive files or folders are not shared publicly in Dropbox
  • User activity – e.g. alerting on suspicious user permission changes in Workday at 2:00 AM
  • Misconfigurations – e.g. keeping Zoom recordings from becoming publicly accessible
  • Compliance – e.g. tracking and reporting who modified Bitbucket branch permissions
  • Shadow IT – e.g. detecting users that signed up for an unapproved application with their work email

API-driven CASBs leverage API integrations with various SaaS applications and take just a few minutes to connect. CASB can also be used in tandem with RBI to detect and then prevent unwanted behaviors to both approved and unsanctioned SaaS applications, like disabling the ability to download files or copy text out of documents. More on CASB here.
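A minimal sketch of an API-driven posture check – the file metadata shape is an invented stand-in for what a SaaS provider’s API might return:

```python
# API-driven CASB sketch: scan sharing metadata pulled from a SaaS API and
# flag files exposed beyond the organization. Field names are illustrative.
files = [
    {"name": "roadmap.pdf",  "shared_with": "anyone_with_link", "owner": "pat"},
    {"name": "notes.txt",    "shared_with": "team",             "owner": "sam"},
    {"name": "payroll.xlsx", "shared_with": "public",           "owner": "finance"},
]

def find_exposed(files):
    """Return the names of files whose sharing settings are risky."""
    risky = {"public", "anyone_with_link"}
    return [f["name"] for f in files if f["shared_with"] in risky]

print(find_exposed(files))  # ['roadmap.pdf', 'payroll.xlsx']
```

A real CASB runs checks like this continuously across many SaaS applications and surfaces the findings as alerts or automated remediations.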

Cloudflare One includes an API-driven CASB, which gives comprehensive visibility and control over SaaS apps, so you can easily prevent data leaks and compliance violations.

Data Loss Prevention (DLP)

Data loss prevention tools detect and prevent data exfiltration (data moving without company authorization) or data destruction. Many DLP solutions analyze network traffic and internal “endpoint” devices to identify the leakage or loss of confidential information such as credit card numbers and personally identifiable information (PII). DLP uses a number of techniques to detect sensitive data including data fingerprinting, keyword matching, pattern matching, and file matching. More on DLP here.
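As an illustration of pattern matching, the sketch below flags likely credit card numbers by combining a regex with a Luhn checksum to cut down on false positives. Real DLP engines layer many more techniques, such as fingerprinting and file matching:

```python
import re

# Candidate card numbers: 13-16 digits, optionally separated by spaces/hyphens.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_valid(number: str) -> bool:
    """Luhn checksum – filters out most random digit runs."""
    digits = [int(d) for d in number if d.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

def contains_card_number(text: str) -> bool:
    return any(luhn_valid(m.group()) for m in CARD_RE.finditer(text))

print(contains_card_number("card: 4111 1111 1111 1111"))  # True (test card number)
print(contains_card_number("invoice #1234, total $56.78"))  # False
```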

DLP capabilities for Cloudflare One are coming soon. These will include the ability to check data against common patterns like PII, label and index specific data you need to protect, and combine DLP rules with other Zero Trust policies.


Firewall-as-a-service (FWaaS)

Firewall-as-a-service, also referred to as cloud firewall, filters out potentially malicious traffic without requiring a physical hardware presence within a customer network. More on firewall-as-a-service here.

Cloudflare One includes Magic Firewall, a firewall-as-a-service that allows you to filter any IP traffic from a single control plane and (new!) enforce IDS policies across your traffic.

Email security

Email security is the process of preventing email-based cyber attacks and unwanted communications. It spans protecting inboxes from takeover, protecting domains from spoofing, stopping phishing attacks, preventing fraud, blocking malware delivery, filtering spam, and using encryption to protect the contents of emails from unauthorized persons.

Email security tools can be used in conjunction with other secure access functions including DLP and RBI – for example, potentially suspicious links in emails can be launched in an isolated browser without blocking false positives. More on email security here.

Cloudflare One includes Area 1 email security, which crawls the Internet to stop phishing, Business Email Compromise (BEC), and email supply chain attacks at the earliest stages of the attack cycle. Area 1 enhances built-in security from cloud email providers with deep integrations into Microsoft and Google environments and workflows.

On-ramps: get connected

In order to apply secure access functions to your traffic, you need mechanisms to get that traffic from its source (whether that’s a remote user device, branch office, data center, or cloud) to the service edge (see below) where those functions operate. On-ramps are those mechanisms – the inputs and outputs in the IPO model, or in other words, the ways your traffic gets from point A to point B after filters have been applied.

Reverse proxy (for applications)

A reverse proxy sits in front of web servers and forwards client (e.g. web browser) requests to those web servers. Reverse proxies are typically implemented to help increase security, performance, and reliability. When used in conjunction with identity and endpoint security providers, a reverse proxy can be used to grant network access to web-based applications.

Cloudflare One includes one of the world’s most-used reverse proxies, which processes over 1.39 billion DNS requests every day.
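To make the mechanics concrete, here is a toy reverse proxy built with Python’s standard library: the client talks only to the proxy, which fetches from an origin it fronts. The ports and responses are arbitrary, and a real proxy adds TLS termination, caching, and security filtering:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

ORIGIN = "http://127.0.0.1:18081"  # hypothetical origin server address

class ReverseProxyHandler(BaseHTTPRequestHandler):
    """Terminates the client connection and fetches from the origin."""
    def do_GET(self):
        with urllib.request.urlopen(ORIGIN + self.path) as upstream:
            body = upstream.read()
        # A real proxy would also filter, cache, and rewrite headers here.
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence request logging for the sketch
        pass

class OriginHandler(BaseHTTPRequestHandler):
    """Stand-in web server the client never reaches directly."""
    def do_GET(self):
        body = b"hello from origin"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass

def serve(port, handler):
    server = HTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

serve(18081, OriginHandler)
serve(18080, ReverseProxyHandler)  # clients talk only to this address

with urllib.request.urlopen("http://127.0.0.1:18080/") as resp:
    print(resp.read().decode())  # hello from origin
```

Because the proxy sits in the request path, it is the natural place to enforce identity checks, which is exactly how a reverse proxy becomes a ZTNA enforcement point.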

Application connector (for applications)

For private or non-web-based applications, IT teams can install a lightweight daemon in their infrastructure and create an outbound-only connection to the service edge. These application connectors enable connectivity to HTTP web servers, SSH servers, remote desktops, and other applications/protocols without opening the applications to potential attacks.

Cloudflare One includes Cloudflare Tunnel. Users can install a lightweight daemon that creates an encrypted tunnel between their origin web server and Cloudflare’s nearest data center without opening any public inbound ports.

Device client (for users)

In order to get traffic from devices, including laptops and phones, to the service edge for filtering and private network access, users can install a client. This client, or “roaming agent,” acts as a forward proxy to direct some or all traffic from the device to the service edge.

Cloudflare One includes the WARP device client, which is used by millions of users worldwide and available for iOS, Android, ChromeOS, Mac, Linux, and Windows.

Bring-your-own or lease IPs (for branches, data centers, and clouds)

Depending on the capabilities of a SASE provider’s network/service edge, organizations may elect to bring their own IPs or lease IPs to enable entire network connectivity via BGP advertisement.

Cloudflare One includes BYOIP and leased IP options, both of which involve advertising ranges across our entire Anycast network.

Network tunnels (for branches, data centers, and clouds)

Most hardware or virtual hardware devices that sit at physical network perimeters are able to support one or multiple types of industry-standard tunneling mechanisms such as GRE and IPsec. These tunnels can be established to the service edge from branches, data centers and public clouds to enable network level connectivity.

Cloudflare One includes Anycast GRE and IPsec tunnel options, which are configured like traditional point-to-point tunnels but grant automatic connectivity to Cloudflare’s entire Anycast network for ease of management and redundancy. These options also enable easy connectivity from existing SD-WAN devices, which can enable simple to manage or entirely automated tunnel configuration.

Direct connection (for branches and data centers)

A final on-ramp option for networks with high reliability and capacity needs is to directly connect to the service edge, either with a physical cross-connect/last mile connection or a virtual interconnection through a virtual fabric provider.

Cloudflare One includes Cloudflare Network Interconnect (CNI), which enables you to connect with Cloudflare’s network via a direct physical connection or virtual connection through a partner. Cloudflare for Offices brings CNI directly to your physical premise for even simpler connectivity.

Service edge: the network that powers it all

Secure access functions need somewhere to operate. In the traditional perimeter architecture model, that place was a rack of hardware boxes in a corporate office or data center; with SASE, it’s a distributed network that is located as close as possible to users and applications wherever they are in the world. But not all service edges are created equal: for a SASE platform to deliver a good experience for your users, applications, and networks, the underlying network needs to be fast, intelligent, interoperable, programmable, and transparent. Let’s break down each of these platform capabilities in more detail.

Performance: locations, interconnectivity, speed, capacity

Historically, IT teams have had to make tough trade-off decisions between security and performance. These could include whether and which traffic to backhaul to a central location for security filtering, and which security functions to enable to balance throughput with processing overhead. With SASE, those trade-offs are no longer required, as long as the service edge is:

  • Geographically dispersed: it’s important to have service edge locations as close as possible to where your users and applications are, which increasingly means potentially anywhere in the world.
  • Interconnected: your service edge needs to be interconnected with other networks, including major transit, cloud, and SaaS providers, in order to deliver reliable and fast connectivity to the destinations you’re ultimately routing traffic to.
  • Fast: as expectations for user experience continue to rise, your service edge needs to keep up. Perceived application performance is influenced by many factors, from the availability of fast last-mile Internet connectivity to the impact of security filtering and encryption/decryption steps, so SASE providers need to take a holistic approach to measuring and improving network performance.
  • High capacity: with a SASE architecture model, you should never need to think about capacity planning for your security functions – “what size box to buy” is a question of the past. This means that your service edge needs to have enough capacity at each location where your network traffic can land, and the ability to intelligently load balance traffic to use that capacity efficiently across the service edge.

Cloudflare One is built on Cloudflare’s global network, which spans over 270 cities in over 100 countries, 10,500+ interconnected networks, and 140+ Tbps capacity.

Traffic intelligence: shaping, QoS, telemetry-based routing

On top of the inherent performance attributes of a network/service edge, it’s also important to be able to influence traffic based on characteristics of your individual network. Techniques like traffic shaping, quality of service (QoS), and telemetry-based routing can further improve performance for traffic across the security service edge by prioritizing bandwidth for critical applications and routing around congestion, latency, and other problems along intermediate paths.
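Traffic shaping is often implemented with token buckets: each class of traffic spends tokens that refill at its allocated rate, so bulk flows can’t starve latency-sensitive ones. A minimal sketch with illustrative rates:

```python
# Token-bucket shaper sketch: high-priority traffic draws from its own bucket.
# Rates are illustrative; real QoS engines classify traffic by application,
# DSCP markings, and other signals.
class TokenBucket:
    def __init__(self, rate_bytes_per_sec, burst_bytes):
        self.rate = rate_bytes_per_sec
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, packet_bytes, now):
        # Refill according to elapsed time, then spend tokens if possible.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True
        return False

voice = TokenBucket(rate_bytes_per_sec=50_000, burst_bytes=10_000)
print(voice.allow(1_500, now=0.0))   # True – within the configured burst
print(voice.allow(20_000, now=0.1))  # False – exceeds available tokens
```

A packet that fails its bucket check can be queued or dropped, depending on policy, which is how bandwidth is prioritized for critical applications.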

Cloudflare One includes Argo Smart Routing, which optimizes Layer 3 through 7 traffic to intelligently route around congestion, packet loss, and other issues on the Internet. Additional traffic shaping and QoS capabilities are on the Cloudflare One roadmap.

Threat intelligence

In order to power the secure access functions, your service edge needs a continuously updating feed of intelligence that includes known and new attack types across all layers of the OSI stack. The ability to integrate third party threat feeds is a good start, but native threat intelligence from the traffic flowing across the service edge is even more powerful.

Cloudflare One includes threat intelligence gathered from the 20M+ Internet properties on Cloudflare’s network, which is continuously fed back into our secure access policies to keep customers protected from emerging threats.

Interoperability: integrations, standards, and composability

Your SASE platform will replace many of the components of your legacy network architecture, but you may choose to keep some of your existing tools and introduce new ones in the future. Your service edge needs to be compatible with your existing connectivity providers, hardware, and tools in order to enable a smooth migration to SASE.

At the same time, the service edge should help you stay ahead of new technology and security standards like TLS 1.3 and HTTP/3. It should also be fully composable, with every service working together to drive better outcomes than a stack of point solutions could achieve alone.

Cloudflare One integrates with platforms like Identity Provider and Endpoint Protection solutions, SD-WAN appliances, interconnection providers, and Security Incident and Event Management tools (SIEMs). Existing security and IT tools can be used alongside Cloudflare One with minimal integration work.

Cloudflare is also a leader in advancing Internet and networking standards; many new web standards and protocols have been influenced by our research team.

Cloudflare One is also fully composable, allowing you to start with one use case and layer in additional functionality to create a “1+1=3” effect for your network.

Orchestration: automation and programmability

Deploying and managing your SASE configuration can be complex after scaling beyond a few users, applications, and locations. Your service edge should offer full automation and programmability, including the ability to manage your infrastructure as code with tools like Terraform.

Cloudflare One includes full API and Terraform support for easily deploying and managing configuration.

Visibility: analytics and logs

Your team should have full visibility into all the traffic routing through the service edge. In the classic perimeter security model, IT and security teams could get visibility by configuring network taps at the handful of locations where traffic entered and left the corporate network. As applications left the data center and users left the office, it became much more challenging to get access to this data. With a SASE architecture, because all of your traffic is routed through a service edge with a single control plane, you can get that visibility back – both via familiar formats like flow data and packet captures as well as rich logs and analytics.

All secure access components of Cloudflare One generate rich analytics and logs that can be evaluated directly in the Cloudflare One Dashboard or pushed to SIEM tools for advanced analysis.

Get started on your SASE journey with Cloudflare One

Over the next week, we will be announcing new features that further augment the capabilities of the Cloudflare One platform to make it even easier for your team to realize the vision of SASE. You can follow along at our Innovation Week homepage here or contact us to get started today.

Magic NAT: everywhere, unbounded, and lower cost

Post Syndicated from Annika Garbers original

Network Address Translation (NAT) is one of the most common and versatile network functions, used by everything from your home router to the largest ISPs. Today, we’re delighted to introduce a new approach to NAT that solves the problems of traditional hardware and virtual solutions. Magic NAT is free from capacity constraints, available everywhere through our global Anycast architecture, and operates across any network (physical or cloud). For Internet connectivity providers, Magic NAT for Carriers operates across high volumes of traffic, removing the complexity and cost associated with NATing thousands or millions of connections.

What does NAT do?

The main function of NAT is in its name: NAT is responsible for translating the network address in the header of an IP packet from one address to another – for example, translating a private IP like to a publicly routable IP like Organizations use NAT to grant Internet connectivity from private networks, enable routing within private networks with overlapping IP space, and preserve limited IP resources by mapping thousands of connections to a single IP. These use cases are typically accomplished with a hardware appliance within a physical network or a managed service delivered by a cloud provider.

Let’s look at those different use cases.

Allowing traffic from private subnets to connect to the Internet

Resources within private subnets often need to reach out to the public Internet. The most common example of this is connectivity from your laptop, which might be allocated a private address like, reaching out to a public resource like google.com. In order for Google to respond to a request from your laptop, the source IP of your request needs to be publicly routable on the Internet. To accomplish this, your ISP translates the private source IP in your request to a public IP (and reverse-translates for the responses back to you). This use case is often referred to as public NAT, performed by hardware or software acting as a “NAT gateway.”
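The bookkeeping a NAT gateway performs can be sketched in a few lines. This is a toy model for illustration only: the addresses and port range are invented, and real NAT devices also track protocol state and connection timeouts.

```python
# Conceptual sketch of a NAT gateway's translation table (illustrative only).
# Outbound packets from private (ip, port) pairs are rewritten to a shared
# public IP with a unique port; replies are reverse-translated on the way back.
import itertools

PUBLIC_IP = ""  # documentation-range address standing in for a real public IP

class NatTable:
    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self._ports = itertools.count(40000)  # next free public source port
        self.out = {}   # (private_ip, private_port) -> public source port
        self.back = {}  # public source port -> (private_ip, private_port)

    def translate_out(self, private_ip: str, private_port: int):
        key = (private_ip, private_port)
        if key not in self.out:
            port = next(self._ports)
            self.out[key] = port
            self.back[port] = key
        return self.public_ip, self.out[key]

    def translate_in(self, public_port: int):
        # Reverse-translate a reply back to the original private endpoint.
        return self.back[public_port]

nat = NatTable(PUBLIC_IP)
src = nat.translate_out("", 51000)  # laptop -> Internet
print(src)                                      # ('', 40000)
print(nat.translate_in(src[1]))                 # ('', 51000)
```

The same table, read in the reverse direction, is what lets responses find their way back to the right device behind the gateway.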

Public NAT translates private IP addresses to public ones so that traffic from within private networks can access the Internet.

Users might also have requirements around the specific IP addresses that outgoing packets are NAT’d to. For example, they may need packets to egress from only one or a small subset of IPs so that the services they’re reaching out to can positively identify them – e.g. “only allow traffic from this specific source IP and block everything else.” They might also want traffic to NAT to IPs that accurately reflect the source’s geolocation, in order to pass the “pizza test”: are the results returned for the search term “pizza near me” geographically relevant? These requirements can increase the complexity of a customer’s NAT setup.

Enabling communication between private subnets with overlapping IP space

NATs are also used for routing traffic within fully private networks, in order to enable communication between resources with overlapping IP space. One example: imagine that you’re an IT architect at a retail company with a hundred geographically distributed store locations and a central data center. To make your life easier, you want to use the same IP address management scheme for all of your stores – e.g. host all of your printers on, point of sale devices on, and security cameras on These devices need to reach out to resources hosted in your data center, which is also on your private network. The challenge: if multiple devices across your stores have the same source IP, how do return packets from your data center get back to the right device? This is where private NAT comes in.

Private NAT translates IPs into a different private range so that devices with overlapping IP space can communicate with each other.

A NAT gateway sitting in a private network can enable connectivity between overlapping subnets by translating the original source IP (the one shared by multiple resources) to an IP in a different range. This can enable communication between mirrored subnets and other resources (like in our store → datacenter example), as well as between the mirrored subnets themselves – e.g. if traffic needed to flow between our store locations directly, such as a VoIP call from one store to another.

Conserving IP address space

As of 2019, the available pool of allocatable IPv4 space has been exhausted, making addresses a limited resource. In order to conserve their IPv4 space while the industry slowly transitions to IPv6, ISPs have adopted carrier-grade NAT solutions to map multiple users to a single IP, maximizing the mileage of the space they have available. This uses the same mechanisms for address translation we’ve already described, but at a large scale – ISPs need to deploy devices that can handle thousands or millions of concurrent connections without impacting traffic performance.
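To make the scale concrete, here is a toy port-block allocation scheme of the kind a carrier-grade NAT might use. The block size and port range are illustrative, not any ISP’s real configuration.

```python
# Illustrative sketch of carrier-grade NAT port-block allocation: many
# subscribers share one public IPv4 address, each owning a fixed slice of
# the usable source-port range. All numbers here are invented examples.
PORTS_PER_SUBSCRIBER = 64
FIRST_PORT = 1024
LAST_PORT = 65535

def port_block(subscriber_index: int):
    """Return the (start, end) port range owned by one subscriber."""
    start = FIRST_PORT + subscriber_index * PORTS_PER_SUBSCRIBER
    end = start + PORTS_PER_SUBSCRIBER - 1
    if end > LAST_PORT:
        raise ValueError("public IP exhausted; allocate another address")
    return start, end

subscribers_per_ip = (LAST_PORT - FIRST_PORT + 1) // PORTS_PER_SUBSCRIBER
print(subscribers_per_ip)  # 1008 subscribers behind a single IPv4 address
print(port_block(0))       # (1024, 1087)
print(port_block(1007))    # (65472, 65535)
```

Even with this simple scheme, a single public IPv4 address can front around a thousand subscribers, which is why CGNAT devices must handle enormous numbers of concurrent translations without adding latency.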

Challenges with existing NAT solutions

Today, users accomplish the use cases we’ve described with a physical appliance (often a firewall) or a virtual appliance delivered as a managed service from a cloud provider. These approaches have the same fundamental limitations as other hardware and virtualized hardware solutions traditionally used to accomplish most network functions.

Geography constraints

Physical or virtual devices performing NAT are deployed in one or a few specific locations (e.g. within a company’s data center or in a specific cloud region). Traffic may need to be backhauled out of its way through those specific locations to be NAT’d. A common example is the hub and spoke network architecture, where all Internet-bound traffic is backhauled from geographically distributed locations to be filtered and passed through a NAT gateway to the Internet at a central “hub.” (We’ve written about this challenge previously in the context of hardware firewalls.)

Managed NAT services offered by cloud providers require customers to deploy NAT gateway instances in specific availability zones. This means that if customers have origin services in multiple availability zones, they either need to backhaul traffic from one zone to another, incurring fees and latency, or deploy instances in multiple zones. They also need to plan for redundancy – for example, AWS recommends configuring a NAT gateway in every availability zone for “zone-independent architecture.”

Capacity constraints

Each appliance or virtual device can only support up to a certain amount of traffic, and higher supported traffic volumes usually come at a higher cost. Beyond these limits, users need to deploy multiple NAT instances and design mechanisms to load balance traffic across them, adding additional hardware and network hops to their stack.

Cost challenges

Physical devices that perform NAT functionality have several costs associated – in addition to the upfront CAPEX for device purchases, organizations need to plan for installation, maintenance, and upgrade costs. While managed cloud services don’t carry the same cost line items of traditional hardware, leading providers’ models include multiple costs and variable pricing that can be hard to predict. A combination of hourly charges, data processing charges, and data transfer charges can lead to surprises at the end of the month, especially if traffic experiences momentary spikes.

Hybrid infrastructure challenges

More and more customers we talk to are embracing hybrid (datacenter/cloud), multi-cloud, or poly-cloud infrastructure to diversify their spend and leverage the best of breed features offered by each provider. This means deploying separate NAT instances across each of these networks, which introduces additional complexity, management overhead, and cost.

Magic NAT: everywhere, unbounded, cross-platform, and predictably priced

Over the past few years, as we’ve been growing our portfolio of network services, we’ve heard over and over from customers that you want an alternative to the NAT solutions currently available on the market and a better way to address the challenges we described. We’re excited to introduce Magic NAT, the latest entrant in our “Magic” family of services designed to help customers build their next-generation networks on Cloudflare.

How does it work?

Magic NAT is built on the foundational components of Cloudflare One, our Zero Trust network-as-a-service platform. You can follow a few simple steps to get set up:

  1. Connect to Cloudflare. Magic NAT works with all of our network-layer on-ramps including Anycast GRE or IPsec, CNI, and WARP. Users set up a tunnel or direct connection and route privately sourced traffic across it; packets land at the closest Cloudflare location automatically.
  2. Upgrade for Internet connectivity. Users can enable Internet-bound TCP and UDP traffic (any port) to access resources on the Internet from Cloudflare IPs.
  3. (Optional) Enable dedicated egress IPs. Available if you need traffic to egress from one or multiple dedicated IPs rather than a shared pool. Dedicated egress IPs may be useful if you interact with services that “allowlist” specific IP addresses or otherwise care about which IP addresses are seen by servers on the Internet.
  4. (Optional) Layer on security policies for safe access. Magic NAT works natively with Cloudflare One security tools including Magic Firewall and our Secure Web Gateway. Users can add policies on top of East/West and Internet-bound traffic to secure all network traffic with L3 through L7 protection.

Address translation between IP versions will also be supported, including 4to6 and 6to4 NAT capabilities to ensure backwards and forwards compatibility when clients or servers are only reachable via IPv4 or IPv6.

Anycast: Magic NAT is everywhere, automatically

With Cloudflare’s Anycast architecture and global network of over 275 cities across the world, users no longer need to think about deploying NAT capabilities in specific locations or “availability zones.” Anycast on-ramps mean that traffic automatically lands at the closest Cloudflare location. If that location becomes unavailable (e.g. for maintenance), traffic fails over automatically to the next closest – zero configuration work from customers required. Failover from Cloudflare to customer networks is also automatic; we’ll always route traffic across the healthiest available path to you.

Scale: Magic NAT leverages Cloudflare’s entire network capacity

Cloudflare’s global capacity is at 141 Tbps and counting, and automated traffic management systems like Unimog allow us to take full advantage of that capacity to serve high volumes of traffic smoothly. We absorb some of the largest DDoS attacks on the Internet, process hundreds of Gbps for customers through Magic Firewall, and provide privacy for millions of user devices across the world – and Magic NAT is built with this scale in mind. You’ll never need to provision and load balance across multiple instances or worry about traffic throttling or congestion again.

Cost: no more hardware costs and no surprises

Magic NAT, like our other network services, is priced based on the 95th percentile of clean bandwidth for your network: no installation, maintenance, or upgrades, and no surprise charges for data transfer spikes. Unlike managed services offered by cloud providers, we won’t charge you for traffic twice. This means fair, predictable billing based on what you actually use.
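As a concrete illustration of how 95th-percentile (“burstable”) billing works in general, here is a minimal sketch; the sample values are invented, and real billing systems define sampling intervals and rounding precisely.

```python
# Sketch of 95th-percentile billing: sample bandwidth at regular intervals,
# sort the samples, discard the top 5%, and bill on the highest remaining
# sample. All numbers below are made up for illustration.
def p95(samples_mbps):
    ordered = sorted(samples_mbps)
    idx = int(len(ordered) * 0.95) - 1  # top 5% of samples are ignored
    return ordered[idx]

# 100 samples: steady ~200 Mbps with a brief spike to 900 Mbps
samples = [200] * 95 + [900] * 5
print(p95(samples))  # 200 -- the short spike falls in the discarded top 5%
```

The practical effect is that short traffic spikes, like a momentary burst or an absorbed attack, don’t set your bill; sustained usage does.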

Hybrid and multi-cloud: simplify networking across environments

Today, customers deploying NAT across on-prem environments and cloud properties need to manage separate instances for each network. As with Cloudflare’s other products that provide an overlay across multiple environments (e.g. Magic Firewall), we can dramatically simplify this architecture by giving users a single place for all their traffic to NAT through regardless of source/origin network.


| Traditional NAT solutions | Magic NAT |
| --- | --- |
| Deploy physical or virtual appliances in one or more locations; additional cost for redundancy. | No more planning availability zones: Magic NAT is everywhere and extremely fault-tolerant, automatically. |
| Physical and virtual appliances have upper limits for throughput; need to deploy and load balance across multiple devices to overcome. | No more planning for capacity and deploying multiple instances to load balance traffic across – Magic NAT leverages Cloudflare’s entire network capacity, automatically. |
| High (hardware) and/or unpredictable (cloud) cost: CAPEX plus installation, maintenance, and upgrades, or triple charges for a managed cloud service. | Fairly and predictably priced: no more sticker shock from unexpected data processing charges at the end of the month. |
| Tied to a physical network or single cloud; need to deploy multiple instances to cover traffic flows across the entire network. | Simplify networking across environments; one control plane across all of your traffic flows. |

Learn more

Magic NAT is currently in beta, translating network addresses globally for a variety of workloads, large and small. We’re excited to get your feedback about it and other new capabilities we’re cooking up to help you simplify and future-proof your network – learn more or contact your account team about getting access today!

A bridge to Zero Trust

Post Syndicated from Annika Garbers original

Cloudflare One enables customers to build their corporate networks on a faster, more secure Internet by connecting any source or destination and configuring routing, security, and performance policies from a single control plane. Today, we’re excited to announce another piece of the puzzle to help organizations on their journey from traditional network architecture to Zero Trust: the ability to route traffic from user devices with our lightweight roaming agent (WARP) installed to any network connected with our Magic IP-layer tunnels (Anycast GRE, IPsec, or CNI). From there, users can upgrade to Zero Trust over time, providing an easy path from traditional castle and moat to next-generation architecture.

The future of corporate networks

Customers we talk to describe three distinct phases of architecture for their corporate networks that mirror the shifts we’ve seen with storage and compute, just with a 10 to 20 year delay. Traditional networks (“Generation 1”) existed within the walls of a datacenter or headquarters, with business applications hosted on company-owned servers and access granted via private LAN or WAN through perimeter security appliances. As applications shifted to the cloud and users left the office, companies have adopted “Generation 2” technologies like SD-WAN and virtualized appliances to handle increasingly fragmented and Internet-dependent traffic. What they’re left with now is a frustrating patchwork of old and new technologies, gaps in visibility and security, and headaches for overworked IT and networking teams.

We think there’s a better future to look forward to: the architecture Gartner describes as SASE, where security and network functions shift from physical or virtual appliances to true cloud-native services delivered just milliseconds away from users and applications regardless of where they are in the world. This new paradigm will mean vastly more secure, more performant, and more reliable networks, creating better experiences for users and reducing total cost of ownership. IT will shift from being viewed as a cost center and bottleneck for business changes to a driver of innovation and efficiency.

Generation 1: Castle and Moat; Generation 2: Virtualized Functions; Generation 3: Zero Trust Network

But transformative change can’t happen overnight. For many organizations, especially those transitioning from legacy architecture, it’ll take months or years to fully embrace Generation 3. The good news: Cloudflare is here to help, providing a bridge from your current network architecture to Zero Trust, no matter where you are on your journey.

How do we get there?

Cloudflare One, our combined Zero Trust network-as-a-service platform, allows customers to connect to our global network from any traffic source or destination with a variety of “on-ramps” depending on your needs. To connect individual devices, users can install the WARP client, which acts as a forward proxy to tunnel traffic to the closest Cloudflare location regardless of where users are in the world. Cloudflare Tunnel allows you to establish a secure, outbound-only connection between your origin servers and Cloudflare by installing a lightweight daemon.

Last year, we announced the ability to route private traffic from WARP-enrolled devices to applications connected with Cloudflare Tunnel, enabling private network access for any TCP or UDP applications. This is the best practice architecture we recommend for Zero Trust network access, but we’ve also heard from customers with legacy architecture that you want options to enable a more gradual transition.

For network-level (OSI Layer 3) connectivity, we offer standards-based GRE or IPsec options, with a Cloudflare twist: these tunnels are Anycast, meaning one tunnel from your network connects automatically to Cloudflare’s entire network in 250+ cities, providing redundancy and simplifying network management. Customers also have the option to leverage Cloudflare Network Interconnect, which enables direct connectivity to the Cloudflare network through a physical or virtual connection in over 1,600 locations worldwide. These Layer 1 through 3 on-ramps allow you to connect your public and private networks to Cloudflare with familiar technologies that automatically make all of your IP traffic faster and more resilient.
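As a rough sketch of what a GRE on-ramp like this looks like from the customer side, here is an illustrative iproute2 configuration for a Linux router. Every address below is a placeholder: the actual Anycast endpoint and interface addressing come from your account configuration, not these values.

```shell
# Illustrative GRE tunnel setup on a Linux router (all addresses are
# placeholders; real values are assigned during onboarding).
CF_ANYCAST=  # placeholder for the Anycast tunnel endpoint

# Create the GRE tunnel from this router's public IP to the Anycast endpoint.
ip tunnel add cf-gre mode gre remote "$CF_ANYCAST" local ttl 255

# Address the tunnel interface and bring it up.
ip addr add dev cf-gre
ip link set cf-gre up

# Steer traffic for remote private networks into the tunnel.
ip route add dev cf-gre
```

Because the remote endpoint is Anycast, this single tunnel definition reaches whichever Cloudflare location is closest at any moment, which is what removes the need for per-location tunnel configuration.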

Now, traffic from WARP-enrolled devices can route automatically to any network connected with an IP-layer on-ramp. This additional “plumbing” for Cloudflare One increases the flexibility that users have to connect existing network infrastructure, allowing organizations to transition from traditional VPN architecture to Zero Trust with application-level connectivity over time.

How does it work?

Users can install the WARP client on any device to proxy traffic to the closest Cloudflare location. From there, if the device is enrolled in a Cloudflare account with Zero Trust and private routing enabled, its traffic will get delivered to the account’s dedicated, isolated network “namespace,” a logical copy of the Linux networking stack specific to a single customer. This namespace, which exists on every server in every Cloudflare data center, holds all the routing and tunnel configuration for a customer’s connected network.
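The namespace concept maps loosely onto standard Linux tooling. A rough sketch with invented names, not Cloudflare’s actual implementation:

```shell
# Illustrative use of Linux network namespaces: each customer gets an
# isolated copy of the networking stack. Names here are invented examples.
ip netns add customer-acme               # isolated network stack for one customer

ip link add cust0 type dummy             # stand-in for the customer's tunnel interface
ip link set cust0 netns customer-acme    # move it into the customer's namespace

ip netns exec customer-acme ip link set cust0 up
ip netns exec customer-acme ip route add dev cust0

# Routes added here are visible only inside this namespace, so one
# customer's (possibly overlapping) address space never collides with another's.
ip netns exec customer-acme ip route show
```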

Once traffic lands in a customer namespace, it’s routed to the destination network over the configured GRE, IPsec, or CNI tunnels. Customers can configure route prioritization to load balance traffic over multiple tunnels and automatically fail over to the healthiest possible traffic path from each Cloudflare location.

On the return path, traffic from customer networks to Cloudflare is also routed via Anycast to the closest Cloudflare location—but this location may be different from the one holding the WARP session, so return traffic is forwarded to the server where the WARP session is active. In order to do this, we leverage a new internal service called Hermes that allows data to be shared across all servers in our network. Just as our Quicksilver service propagates key-value data from our core infrastructure throughout our network, Hermes allows servers to write data that can be read by other servers. When a WARP session is established, its location is written to Hermes. And when return traffic is received, the WARP session’s location is read from Hermes, and the traffic is tunneled appropriately.
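A toy model of that lookup flow, with names and data shapes invented purely for illustration:

```python
# Toy model of a Hermes-style shared lookup: the server holding a WARP
# session writes its location; any server receiving return traffic reads it
# and forwards. Names and data shapes here are invented for illustration.
hermes = {}  # stand-in for the network-wide shared key-value store

def register_warp_session(session_id: str, server: str) -> None:
    hermes[session_id] = server  # written once, at session establishment

def forward_return_traffic(session_id: str) -> str:
    # Any server receiving return traffic can find where the session lives.
    return hermes.get(session_id, "no-session: drop or re-route")

register_warp_session("device-123", "server-17.sjc")
print(forward_return_traffic("device-123"))  # server-17.sjc
```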

What’s next?

This on-ramp method is available today for all Cloudflare One customers. Contact your account team to get set up! We’re excited to add more functionality to make it even easier for customers to transition to Zero Trust, including layering additional security policies on top of connected network traffic and providing service discovery to help organizations prioritize applications to migrate to Zero Trust connectivity.

Protect all network traffic with Cloudflare

Post Syndicated from Annika Garbers original

Magic Transit protects customers’ entire networks—any port/protocol—from DDoS attacks and provides built-in performance and reliability. Today, we’re excited to extend the capabilities of Magic Transit to customers with any size network, from home networks to offices to large cloud properties, by offering Cloudflare-maintained and Magic Transit-protected IP space as a service.

What is Magic Transit?

Magic Transit extends the power of Cloudflare’s global network to customers, absorbing all traffic destined for your network at the location closest to its source. Once traffic lands at the closest Cloudflare location, it flows through a stack of security protections including industry-leading DDoS mitigation and cloud firewall. Detailed Network Analytics, alerts, and reporting give you deep visibility into all your traffic and attack patterns. Clean traffic is forwarded to your network using Anycast GRE or IPsec tunnels or Cloudflare Network Interconnect. Magic Transit includes load balancing and automatic failover across tunnels to steer traffic across the healthiest path possible, from everywhere in the world.

Magic Transit architecture: Internet BGP advertisement attracts traffic to Cloudflare’s network, where attack mitigation and security policies are applied before clean traffic is forwarded back to customer networks with an Anycast GRE tunnel or Cloudflare Network Interconnect.

The “Magic” is in our Anycast architecture: every server across our network runs every Cloudflare service, so traffic can be processed wherever it lands. This means the entire capacity of our network—121+ Tbps as of this post—is available to block even the largest attacks. It also drives huge benefits for performance versus traditional “scrubbing center” solutions that route traffic to specialized locations for processing, and makes onboarding much easier for network engineers: one tunnel to Cloudflare automatically connects customer infrastructure to our entire network in over 250 cities worldwide.

What’s new?

Historically, Magic Transit has required customers to bring their own IP addresses—a minimum of a class C IP block, or /24—in order to use this service. This is because a /24 is the minimum prefix length that can be advertised via BGP on the public Internet, which is how we attract traffic for customer networks.

But not all customers have this much IP space; we’ve talked to many of you who want IP layer protection for a smaller network than we’re able to advertise to the Internet on your behalf. Today, we’re extending the availability of Magic Transit to customers with smaller networks by offering Magic Transit-protected, Cloudflare-managed IP space. Starting now, you can direct your network traffic to dedicated static IPs and receive all the benefits of Magic Transit including industry leading DDoS protection, visibility, performance, and resiliency.

Let’s talk through some new ways you can leverage Magic Transit to protect and accelerate any network.

Consistent cross-cloud security

Organizations adopting a hybrid or poly-cloud strategy have struggled to maintain consistent security controls across different environments. Where they used to manage a single firewall appliance in a datacenter, security teams now have a myriad of controls across different providers—physical, virtual, and cloud-based—all with different capabilities and control mechanisms.

Cloudflare is the single control plane across your hybrid cloud deployment, allowing you to manage security policies from one place, get uniform protection across your entire environment, and get deep visibility into your traffic and attack patterns.

Protecting branches of any size

As DDoS attack frequency and variety continues to grow, attackers are getting more creative with angles to target organizations. Over the past few years, we have seen a consistent rise in attacks targeted at corporate infrastructure including internal applications. As the percentage of a corporate network dependent on the Internet continues to grow, organizations need consistent protection across their entire network.

Now, you can get any network location covered—branch offices, stores, remote sites, event venues, and more—with Magic Transit-protected IP space. Organizations can also replace legacy hardware firewalls at those locations with our built-in cloud firewall, which filters bidirectional traffic and propagates changes globally within seconds.

Keeping streams alive without worrying about leaked IPs

Generally, DDoS attacks target a specific application or network in order to impact the availability of an Internet-facing resource. But you don’t have to be hosting anything in order to get attacked, as many gamers and streamers have unfortunately discovered. The public IP associated with a home network can easily be leaked, giving attackers the ability to directly target and take down a live stream.

As a streamer, you can now route traffic from your home network through a Magic Transit-protected IP. This means no more worrying about leaking your IP: attackers targeting you will have traffic blocked at the closest Cloudflare location to them, far away from your home network. And no need to worry about impact to your game: thanks to Cloudflare’s globally distributed and interconnected network, you can get protected without sacrificing performance.

Get started today

This solution is available today; learn more or contact your account team to get started.

Packet captures at the edge

Post Syndicated from Annika Garbers original

Packet captures are a critical tool used by network and security engineers every day. As more network functions migrate from legacy on-prem hardware to cloud-native services, teams risk losing the visibility they used to get by capturing 100% of traffic funneled through a single device in a datacenter rack. We know having easy access to packet captures across all your network traffic is important for troubleshooting problems and deeply understanding traffic patterns, so today, we’re excited to announce the general availability of on-demand packet captures from Cloudflare’s global network.

What are packet captures and how are they used?

A packet capture is a file that contains all packets that were seen by a particular network box, usually a firewall or router, during a specific time frame. Packet captures are a powerful and commonly used tool for debugging network issues or getting better visibility into attack traffic to tighten security (e.g. by adding firewall rules to block a specific attack pattern).

A network engineer might use a pcap file in combination with other tools, like mtr, to troubleshoot problems with reachability to their network. For example, if an end user reports intermittent connectivity to a specific application, an engineer can set up a packet capture filtered to the user’s source IP address to record all packets received from their device. They can then analyze that packet capture and compare it to other sources of information (e.g. pcaps from the end user’s side of the network path, traffic logs and analytics) to understand the magnitude and isolate the source of the problem.
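For instance, the filtered capture described above might be taken with tcpdump on a traditional device; the interface name and IP address below are placeholders.

```shell
# Capture traffic to/from the reporting user's source IP ( is a
# placeholder) on interface eth0, writing packets to a pcap file.
tcpdump -i eth0 -w user-debug.pcap 'host'

# Later, read the capture back (or open user-debug.pcap in Wireshark)
# to inspect the recorded packets.
tcpdump -nn -r user-debug.pcap | head
```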

Security engineers can also use packet captures to gain a better understanding of potentially malicious traffic. Let’s say an engineer notices an unexpected spike in traffic that they suspect could be an attempted attack. They can grab a packet capture to record the traffic as it’s hitting their network and analyze it to determine whether the packets are valid. If they’re not, for example, if the packet payload is randomly generated gibberish, the security engineer can create a firewall rule to block traffic that looks like this from entering their network.

Example of a packet capture from a recent DDoS attack targeted at Cloudflare infrastructure. The contents of this pcap can be used to create a “signature” to block the attack.

Fragmenting traffic creates gaps in visibility

Traditionally, users capture packets by logging into their router or firewall and starting a process like tcpdump. They set up a filter to match only certain packets and grab the resulting file. But as networks have become more fragmented and users move security functions out to the edge, it’s become increasingly challenging to collect packet captures for relevant traffic. Instead of just one device that all traffic flows through (think of a drawbridge in the “castle and moat” analogy), engineers may have to capture packets across many different physical and virtual devices spread across locations. Many of these devices don’t support taking pcaps at all, and users then have to try to stitch the captures they can get back together into a full picture of their network traffic. This is a nearly impossible task today, and it’s only getting harder as networks become more fractured and complex.
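Even when every device does support captures, reassembling them is essentially a merge problem. A minimal sketch, assuming each device's packets are already time-sorted and the clocks are perfectly synchronized (which, in practice, they rarely are):

```python
import heapq

def merge_captures(*device_captures):
    """Merge per-device packet lists of (timestamp, device, summary) tuples,
    each already sorted by timestamp, into one global timeline."""
    return list(heapq.merge(*device_captures, key=lambda pkt: pkt[0]))

# Hypothetical fragments of the same flow, captured on two different boxes.
firewall = [(1.00, "firewall", "SYN"), (1.30, "firewall", "ACK")]
router = [(1.10, "router", "SYN"), (1.20, "router", "SYN-ACK")]

timeline = merge_captures(firewall, router)
print([ts for ts, _, _ in timeline])  # [1.0, 1.1, 1.2, 1.3]
```

The sketch hides the hard parts: clock skew between devices, duplicate observations of the same packet at multiple hops, and devices that can't capture at all, which is exactly why the stitching rarely works in real networks.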


On-demand packet captures from the Cloudflare global network

With Cloudflare, you can regain this visibility. With Magic Transit and Magic WAN, customers route all their public and private IP traffic through Cloudflare’s network to make it more secure, faster, and more reliable, but also to increase visibility. You can think of Cloudflare like a giant, globally distributed version of the drawbridge in our old analogy: because we act as a single cloud-based router and firewall across all your traffic, we can capture packets across your entire network and deliver them back to you in one place.

How does it work?

Customers can request a packet capture using our Packet Captures API. To get the packets you’re looking for, you can provide a filter with the IP address, ports, and protocol of the packets you want.

curl -X POST ${account_id}/pcaps \
-H 'Content-Type: application/json' \
-H 'X-Auth-Email: [email protected]' \
-H 'X-Auth-Key: 00000000000' \
--data '{
        "filter_v1": {
               "source_address": "",
               "protocol": 6
        },
        "time_limit": 300,
        "byte_limit": "10mb",
        "packet_limit": 10000,
        "type": "simple",
        "system": "magic-transit"
}'
Example of a request for packet capture using our API.
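For teams scripting against the API, the same request can be sketched in Python with the standard library. The endpoint URL and credentials below are placeholders, not the real API path:

```python
import json
import urllib.request

def pcap_request_body(source_address: str = "", protocol: int = 6) -> dict:
    """Build the JSON body for a 'simple' Magic Transit packet capture,
    mirroring the curl example above."""
    return {
        "filter_v1": {"source_address": source_address, "protocol": protocol},
        "time_limit": 300,      # capture duration limit (seconds)
        "byte_limit": "10mb",   # capture size limit
        "packet_limit": 10000,  # capture packet-count limit
        "type": "simple",
        "system": "magic-transit",
    }

# Hypothetical endpoint and credentials -- substitute the real API URL.
req = urllib.request.Request(
    "https://example.invalid/accounts/ACCOUNT_ID/pcaps",
    data=json.dumps(pcap_request_body("203.0.113.1")).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to actually send the request
```

Keeping the body builder separate from the request makes it easy to validate the filter before submitting a capture job.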

We leverage nftables to apply the filter to the customer’s incoming packets and log them using nflog:

table inet pcaps_1 {
    chain pcap_1 {
        ip protocol 6 ip saddr log group 1 comment "packet capture"
    }
}

Example nftables configuration used to filter and log customer packets.

nflog creates a netfilter socket through which packet logs are sent from the Linux kernel to user space. In user space, we use tcpdump to read packets off the netfilter socket and generate a packet capture file:

tcpdump -i nflog:1 -w pcap_1.pcap

Example tcpdump command to create a packet capture file.

Usually, tcpdump listens for incoming packets directly on a network interface, but in our case we configure it to read packet logs from an nflog group. tcpdump then converts the packet logs into a packet capture file.

Once we have a packet capture file, we need to deliver it to customers. Because packet capture files can be large and contain sensitive information (e.g. packet payloads), we send them to customers directly from our machines to a cloud storage service of their choice. This means we never store sensitive data, and it’s easy for customers to manage and store these large files.

Get started today

On-demand packet captures are now generally available for customers who have purchased the Advanced features of Magic Firewall. The packet capture API allows customers to capture the first 160 bytes of packets, sampled at a default rate of 1/100. More functionality including full packet captures and on-demand packet capture control in the Cloudflare Dashboard is coming in the following weeks. Contact your account team to stay updated on the latest!
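The snapshot length and sampling defaults also bound how large a capture can get. A back-of-the-envelope sketch (the 16-byte figure is the classic libpcap per-record header; treat the result as an illustrative upper bound, since real packets may be shorter than the snap length):

```python
def capture_size_bytes(pps: float, seconds: float,
                       snap_len: int = 160, sample_rate: float = 1 / 100) -> float:
    """Rough upper bound on sampled capture size: packets seen * sampling
    rate * (bytes kept per packet + per-record file overhead)."""
    per_packet_bytes = snap_len + 16  # classic pcap record header is 16 bytes
    return pps * seconds * sample_rate * per_packet_bytes

# A 1M packet-per-second flood captured for the full 300-second limit at
# the default 1/100 sampling comes to roughly half a gigabyte.
print(capture_size_bytes(1_000_000, 300))  # 528000000.0
```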

Announcing Anycast IPsec: a new on-ramp to Cloudflare One

Post Syndicated from Annika Garbers original


Today, we’re excited to announce support for IPsec as an on-ramp to Cloudflare One. As a customer, you should be able to use whatever method you want to get your traffic to Cloudflare’s network. We’ve heard from you that IPsec is your method of choice for connecting to us at the network layer, because of its near-universal vendor support and blanket layer of encryption across all traffic. So we built support for it! Read on to learn how our IPsec implementation is faster and easier to use than traditional IPsec connectivity, and how it integrates deeply with our Cloudflare One suite to provide unified security, performance, and reliability across all your traffic.

Using the Internet as your corporate network

With Cloudflare One, customers can connect any traffic source or destination — branch offices, data centers, cloud properties, user devices — to our network. Traffic is routed to the closest Cloudflare location, where security policies are applied before we send it along optimized routes to its destination — whether that’s within your private network or on the Internet. It is good practice to encrypt any traffic that’s sensitive at the application level, but for customers who are transitioning from forms of private connectivity like Multiprotocol Label Switching (MPLS), this often isn’t a reality. We’ve talked to many customers who have legacy file transfer and other applications running across their MPLS circuits unencrypted, and are relying on the fact that these circuits are “private” to provide security. In order to start sending this traffic over the Internet, customers need a blanket layer of encryption across all of it; IPsec tunnels are traditionally an easy way to accomplish this.

Traditional IPsec implementations

IPsec as a technology has been around since 1995, and is broadly implemented across many hardware and software platforms. Many companies have adopted IPsec VPNs for securely transferring corporate traffic over the Internet. These VPNs tend to have one of two main architectures: hub and spoke, or mesh.

[Diagram: hub and spoke VPN architecture]

In the hub and spoke model, each “spoke” node establishes an IPsec tunnel back to a core “hub,” usually a headquarters or data center location. Traffic between spokes flows through the hub for routing and so that security policies can be applied (e.g. by an on-premise firewall). This architecture is simple because each node only needs to maintain one tunnel to reach other locations, but it can introduce significant performance penalties. Imagine a global network with two “spokes,” one in India and another in Singapore, but a “hub” located in the United States: traffic between the spokes needs to travel thousands of miles to the US and back in order to reach its destination.
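The performance penalty is easy to estimate from first principles, using the rule of thumb that light in fiber covers roughly 200 km per millisecond. The distances below are rough illustrations, not measured fiber paths:

```python
FIBER_KM_PER_MS = 200.0  # light in fiber covers roughly 200 km per millisecond

def min_rtt_ms(one_way_km: float) -> float:
    """Theoretical minimum round-trip time over a fiber path of the given
    one-way length, ignoring routing detours and processing delays."""
    return 2 * one_way_km / FIBER_KM_PER_MS

backhauled = min_rtt_ms(25_000)  # Singapore -> US hub -> India, very roughly
direct = min_rtt_ms(3_500)       # Singapore -> India directly, very roughly
print(backhauled, direct)  # 250.0 35.0
```

Even this best-case math shows an order-of-magnitude latency penalty for backhauling regional traffic through a distant hub, before any queuing or processing overhead is added.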

[Diagram: mesh VPN architecture]

In the mesh model, every node is connected to every other node with a dedicated IPsec tunnel. This improves performance because traffic can take more direct paths, but in practice means an unmanageable number of tunnels after even a handful of locations are added.

Customers we’ve talked to about IPsec know they want it for the blanket layer of encryption and broad vendor support, but they haven’t been particularly excited about it because of the problems with existing architecture models. We knew we wanted to develop something that was easier to use and left those problems in the past, so that customers could get excited about building their next-generation network on Cloudflare. So how are we bringing IPsec out of the 90s? By delivering it on our global Anycast network: customers establish one IPsec tunnel to us and get automatic connectivity to 250+ locations. It’s conceptually similar to the hub and spoke model, but the “hub” is everywhere, blazing fast, and easy to manage.

So how does IPsec actually work?

IPsec was designed back in 1995 to provide authentication, integrity, and confidentiality for IP packets. One of the ways it does this is by creating tunnels between two hosts, encrypting the IP packets, and adding a new IP header onto encrypted packets. To make this happen, IPsec has two components working together: a userspace Internet Key Exchange (IKE) daemon and an IPsec stack in kernel-space. IKE is the protocol which creates Security Associations (SAs) for IPsec. An SA is a collection of all the security parameters, like those for authentication and encryption, that are needed to establish an IPsec tunnel.

When a new IPsec tunnel needs to be set up, one IKE daemon will initiate a session with another and create an SA. All the complexity of configuration, key negotiation, and key generation happens in a handful of packets between the two IKE daemons, safely in userspace. Once the IKE daemons have finished their exchange, they hand off their nice and neat SA to the IPsec stack in kernel-space, which now has all the information it needs to intercept the right packets for encryption and decryption.
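To make the division of labor concrete, here's a toy model of the negotiate-then-install handoff. The field names are simplified illustrations, not a real IKE implementation:

```python
import secrets
from dataclasses import dataclass, field

@dataclass
class SecurityAssociation:
    """Toy model of what an IKE negotiation produces (real SAs carry many
    more parameters: lifetimes, traffic selectors, and so on)."""
    spi: int            # Security Parameter Index identifying this SA
    encryption_alg: str
    integrity_alg: str
    key: bytes = field(repr=False)  # keep key material out of logs

def ike_negotiate() -> SecurityAssociation:
    """Stand-in for the IKE exchange: in reality this is an authenticated,
    multi-packet key agreement between the two userspace daemons."""
    return SecurityAssociation(
        spi=secrets.randbits(32),
        encryption_alg="aes256gcm",
        integrity_alg="sha256",
        key=secrets.token_bytes(32),
    )

# Simulated kernel security association database (SADB).
kernel_sadb: dict[int, SecurityAssociation] = {}

def install_sa(sa: SecurityAssociation) -> None:
    """Hand the finished SA down to the (here simulated) kernel IPsec
    stack, which then encrypts/decrypts matching packets."""
    kernel_sadb[sa.spi] = sa

install_sa(ike_negotiate())
```

The point of the split is that all the fiddly protocol logic stays in userspace, while the kernel only ever sees a finished, self-contained bundle of parameters.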

There are plenty of open source IKE daemons, including strongSwan, Libreswan, and Openswan, that we considered using for our IPsec implementation. These “swans” all tie speaking the IKE protocol tightly with configuring the IPsec stack. This is great for establishing point-to-point tunnels — installing one “swan” is all you need to speak IKE and configure an encrypted tunnel. But we’re building the next-generation network that takes advantage of Cloudflare’s entire global Anycast edge. So how do we make it so that a customer sets up one tunnel with Cloudflare with every single edge server capable of exchanging data on it?

Anycast IPsec: an implementation for next-generation networks

The fundamental problem in the way of Anycast IPsec is that the SA needs to be handed off to the kernel-space IPsec stack on every Cloudflare edge server, but the SA is created on only one server — the one running the IKE daemon that the customer’s IKE daemon connects to. How do we solve this problem? The first thing that needs to be true is that every server needs to be able to create that SA.

Every Cloudflare server now runs an IKE daemon, so customers can have a fast, reliable connection to start a tunnel anywhere in the world. We looked at using one of the existing “swans” but that tight coupling of IKE with the IPsec stack meant that the SA was hard to untangle from configuring the dataplane. We needed the SA totally separate and neatly sharable from the server that created it to every other server on our edge. Naturally, we built our own “swan” to do just that.

To send our SA worldwide, we put a new spin on an old trick. With Cloudflare Tunnels, a customer’s cloudflared tunnel process creates connections to a few nearby Cloudflare edge servers. But traffic destined for that tunnel could arrive at any edge server, which needs to know how to proxy traffic to the tunnel-connected edge servers. So, we built technology that enables an edge server to rapidly distribute information about its Cloudflare Tunnel connections to all other edge servers.

Fundamentally, the problem of SA distribution is similar — a customer’s IKE daemon connects to a single Cloudflare edge server’s IKE daemon, and information about that connection needs to be distributed to every other edge server. So, we upgraded the Cloudflare Tunnel technology to make it more general and are now using it to distribute SAs as part of Anycast IPsec support. Within seconds of an SA being created, it is distributed to every Cloudflare edge server where a streaming protocol applies the configuration to the kernel-space IPsec stack. Cloudflare’s Anycast IPsec benefits from the same reliability and resilience we’ve built in Cloudflare Tunnels and turns our network into one massively scalable, resilient IPsec tunnel to your network.
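A highly simplified model of that fan-out (an illustration of the concept, not Cloudflare's actual streaming protocol):

```python
class EdgeServer:
    """Toy edge server keeping a local copy of the SA database."""
    def __init__(self, name: str):
        self.name = name
        self.sa_db: dict[int, dict] = {}

    def apply(self, spi: int, sa: dict) -> None:
        # In production, a streaming protocol applies this configuration
        # to the kernel-space IPsec stack; here we just store it.
        self.sa_db[spi] = sa

def distribute(spi: int, sa: dict, fleet: list[EdgeServer]) -> None:
    """Fan a newly created SA out to every server in the fleet, so any
    of them can handle packets arriving on the anycast tunnel."""
    for server in fleet:
        server.apply(spi, sa)

fleet = [EdgeServer(f"edge-{i}") for i in range(4)]
distribute(0x1234, {"enc": "aes256gcm"}, fleet)
print(all(0x1234 in server.sa_db for server in fleet))  # True
```

Once every server holds the SA, it no longer matters which data center a customer's packets land in: any of them can decrypt and process the tunnel's traffic.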

On-ramp with IPsec, access all of Cloudflare One

We built IPsec as an on-ramp to Cloudflare One on top of our existing global system architecture, putting the principles customers care about first. You care about ease of deployment, so we made it possible for you to connect to your entire virtual network on Cloudflare One with a single IPsec tunnel. You care about performance, so we built technology that connects your IPsec tunnel to every Cloudflare location, eliminating hub-and-spoke performance penalties. You care about enforcing security policies across all your traffic regardless of source, so we integrated IPsec with the entire Cloudflare One suite including Magic Transit, Magic Firewall, Zero Trust, and more.

IPsec is in early access for Cloudflare One customers. If you’re interested in trying it out, contact your account team today!

Welcome to CIO Week and the future of corporate networks

Post Syndicated from Annika Garbers original


The world of a CIO has changed — today’s corporate networks look nothing like those of even five or ten years ago — and these changes have created gaps in visibility and security, introduced high costs and operational burdens, and made networks fragile and brittle.

We’re optimistic that CIOs have a brighter future to look forward to. The Internet has evolved from a research project into integral infrastructure companies depend on, and we believe a better Internet is the path forward to solving the most challenging problems CIOs face today. Cloudflare is helping build an Internet that’s faster, more secure, more reliable, more private, and programmable, and by doing so, we’re enabling organizations to build their next-generation networks on ours.

This week, we’ll demonstrate how Cloudflare One, our Zero Trust Network-as-a-Service, is helping CIOs transform their corporate networks. We’ll also introduce new functionality that expands the scope of Cloudflare’s platform to address existing and emerging needs for CIOs. But before we jump into the week, we wanted to spend some time on our vision for the corporate network of the future. We hope this explanation will clarify language and acronyms used by vendors and analysts who have realized the opportunity in this space (what does Zero Trust Network-as-a-Service mean, anyway?) and set context for how our innovative approach is realizing this vision for real CIOs today.


Generation 1: Castle and moat

For years, corporate networks looked like this:

[Diagram: a traditional castle-and-moat corporate network]

Companies built or rented space in data centers that were physically located within or close to major office locations. They hosted business applications — email servers, ERP systems, CRMs, etc. — on servers in these data centers. Employees in offices connected to these applications through the local area network (LAN) or over private wide area network (WAN) links from branch locations. A stack of security hardware (e.g., firewalls) in each data center enforced security for all traffic flowing in and out. Once on the corporate network, users could move laterally to other connected devices and hosted applications, but basic forms of network authentication and physical security controls like employee badge systems generally prevented untrusted users from getting access.

Network Architecture Scorecard: Generation 1

| Characteristic | Score | Description |
| --- | --- | --- |
| Security | ⭐⭐ | All traffic flows through perimeter security hardware. Network access is restricted with physical controls. Lateral movement is only possible once on the network. |
| Performance | ⭐⭐⭐ | The majority of users and applications stay within the same building or regional network. |
| Reliability | ⭐⭐ | Dedicated data centers, private links, and security hardware present single points of failure. There are cost tradeoffs to purchase redundant links and hardware. |
| Cost | ⭐⭐ | Private connectivity and hardware are high-cost capital expenditures, creating a high barrier to entry for small or new businesses. However, a limited number of links/boxes are required (trade-off with redundancy/reliability). Operational costs are low to medium after initial installation. |
| Visibility | ⭐⭐⭐ | All traffic is routed through a central location, so it's possible to access NetFlow/packet captures and more for 100% of flows. |
| Agility | | Significant network changes have a long lead time. |
| Precision | | Controls are primarily exercised at the network layer (e.g., IP ACLs). Accomplishing "allow only HR to access employee payment data" looks like: IP in range X allowed to access IP in range Y (and requires an accompanying spreadsheet to track IP allocation). |

Applications and users left the castle

So what changed? In short, the Internet. Faster than anyone expected, the Internet became critical to how people communicate and get work done. The Internet introduced a radical shift in how organizations thought about their computing resources: if any computer can talk to any other computer, why would companies need to keep servers in the same building as employees’ desktops? And even more radical, why would they need to buy and maintain their own servers at all? From these questions, the cloud was born, enabling companies to rent space on other servers and host their applications while minimizing operational overhead. An entire new industry of Software-as-a-Service emerged to simplify things even further, allowing companies to completely abstract away questions of capacity planning, server reliability, and other operational struggles.

This golden, Internet-enabled future — cloud and SaaS everything — sounds great! But CIOs quickly ran into problems. Established corporate networks with castle-and-moat architecture can’t just go down for months or years during a large-scale transition, so most organizations are in a hybrid state, one foot still firmly in the world of data centers, hardware, and MPLS. And traffic to applications still needs to stay secure, so even if it’s no longer headed to a server in a company-owned data center, many companies have continued to send it there (backhauled through private lines) to flow through a stack of firewall boxes and other hardware before it’s set free.

As more applications moved to the Internet, the volume of traffic leaving branches — and being backhauled through MPLS lines through data centers for security — continued to increase. Many CIOs faced an unpleasant surprise in their bandwidth charges the month after adopting Office 365: with traditional network architecture, more traffic to the Internet meant more traffic over expensive private links.

As if managing this first dramatic shift — which created complex hybrid architectures and brought unexpected cost increases — wasn’t enough, CIOs had another to handle in parallel. The Internet changed the game not just for applications, but also for users. Just as servers don’t need to be physically located at a company’s headquarters anymore, employees don’t need to be on the office LAN to access their tools. VPNs allow people working outside of offices to get access to applications hosted on the company network (whether physical or in the cloud).

These VPNs grant remote users access to the corporate network, but they’re slow, clunky to use, and can only support a limited number of people before performance degrades to the point of unusability. And from a security perspective, they’re terrifying — once a user is on the VPN, they can move laterally to discover and gain access to other resources on the corporate network. It’s much harder for CIOs and CISOs to control laptops with VPN access that could feasibly be brought anywhere — parks, public transportation, bars — than computers used by badged employees in the traditional castle-and-moat office environment.

In 2020, COVID-19 turned these emerging concerns about VPN cost, performance, and security into mission-critical, business-impacting challenges, and they’ll continue to be even as some employees return to offices.


Generation 2: Smörgåsbord of point solutions

Lots of vendors have emerged to tackle the challenges introduced by these major shifts, often focusing on one or a handful of use cases. Some providers offer virtualized versions of hardware appliances, delivered over different cloud platforms; others have cloud-native approaches that address a specific problem like application access or web filtering. But stitching together a patchwork of point solutions has caused even more headaches for CIOs, and most available products focus only on shoring up identity, endpoint, and application security without truly addressing network security.

Gaps in visibility

Compared to the castle and moat model, where traffic all flowed through a central stack of appliances, modern networks have extremely fragmented visibility. IT teams need to piece together information from multiple tools to understand what’s happening with their traffic. Often, a full picture is impossible to assemble, even with the support of tools including SIEM and SOAR applications that consolidate data from multiple sources. This makes troubleshooting issues challenging: IT support ticket queues are full of unsolved mysteries. How do you manage what you can’t see?

Gaps in security

This patchwork architecture — coupled with the visibility gaps it introduced — also creates security challenges. The concept of “Shadow IT” emerged to describe services that employees have adopted and are using without explicit IT permission or integration into the corporate network’s traffic flow and security policies. Exceptions to filtering policies for specific users and use cases have become unmanageable, and our customers have described a general “wild west” feeling about their networks as Internet use grew faster than anyone could have anticipated. And it’s not just gaps in filtering that scare CIOs — the proliferation of Shadow IT means company data can and does now exist in a huge number of unmanaged places across the Internet.

Poor user experience

Backhauling traffic through central locations to enforce security introduces latency for end users, amplified as they work in locations farther and farther away from their former offices. And the Internet, while it’s come a long way, is still fundamentally unpredictable and unreliable, leaving IT teams struggling to ensure availability and performance of apps for users with many factors (even down to shaky coffee shop Wi-Fi) out of their control.

High (and growing) cost

CIOs are still paying for MPLS links and hardware to enforce security across as much traffic as possible, but they’ve now taken on additional costs of point solutions to secure increasingly complex networks. And because of fragmented visibility and security gaps, coupled with performance challenges and rising expectations for a higher quality of user experience, the cost of providing IT support is growing.

Network fragility

All this complexity means that making changes can be really hard. On the legacy side of current hybrid architectures, provisioning MPLS lines and deploying new security hardware come with long lead times, only worsened by recent issues in the global hardware supply chain. And with the medley of point solutions introduced to manage various aspects of the network, a change to one tool can have unintended consequences for another. These effects compound in IT departments often being the bottleneck for business changes, limiting the flexibility of organizations to adapt to an only-accelerating rate of change.

Network Architecture Scorecard: Generation 2

| Characteristic | Score | Description |
| --- | --- | --- |
| Security | | Many traffic flows are routed outside of perimeter security hardware, Shadow IT is rampant, and the controls that do exist are enforced inconsistently and across a hodgepodge of tools. |
| Performance | | Traffic backhauled through central locations introduces latency as users move further away; VPNs and a bevy of security tools introduce processing overhead and additional network hops. |
| Reliability | ⭐⭐ | The redundancy/cost tradeoff from Generation 1 is still present; partial cloud adoption grants some additional resiliency, but growing use of the unreliable Internet introduces new challenges. |
| Cost | | Costs from Generation 1 architecture are retained (few companies have successfully deprecated MPLS/security hardware so far), but new costs of additional tools are added, and operational overhead is growing. |
| Visibility | | Traffic flows and visibility are fragmented; IT stitches a partial picture together across multiple tools. |
| Agility | ⭐⭐ | Some changes are easier to make for aspects of the business migrated to cloud; others have grown more painful as additional tools introduce complexity. |
| Precision | ⭐⭐ | Mix of controls exercised at the network layer and application layer. Accomplishing "allow only HR to access employee payment data" looks like: Users in group X allowed to access IP in range Y (and an accompanying spreadsheet to track IP allocation). |

In summary — to reiterate where we started — modern CIOs have really hard jobs. But we believe there’s a better future ahead.

Generation 3: The Internet as the new corporate network

The next generation of corporate networks will be built on the Internet. This shift is already well underway, but CIOs need a platform that can help them get access to a better Internet — one that’s more secure, faster, more reliable, and preserves user privacy while navigating complex global data regulations.

Zero Trust security at Internet scale

CIOs are hesitant to give up expensive forms of private connectivity because they feel more secure than the public Internet. But a Zero Trust approach, delivered on the Internet, dramatically increases security versus the classic castle and moat model or a patchwork of appliances and point software solutions adopted to create “defense in depth.” Instead of trusting users once they’re on the corporate network and allowing lateral movement, Zero Trust dictates authenticating and authorizing every request into, out of, and between entities on your network, ensuring that visitors can only get to applications they’re explicitly allowed to access. And delivering this authentication and policy enforcement from an edge location close to the user enables radically better performance, rather than forcing traffic to backhaul through central data centers or traverse a huge stack of security tools.
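The core idea reduces to evaluating every single request against explicit allow rules, with a default deny. A minimal sketch, with a hypothetical policy shape:

```python
def authorize(request: dict, policies: list[dict]) -> bool:
    """Zero Trust in miniature: every request is checked against explicit
    allow policies, and anything unmatched is denied by default."""
    for policy in policies:
        if (request["app"] == policy["app"]
                and request["user_group"] in policy["allowed_groups"]
                and request["device_trusted"]):
            return True
    return False  # default deny: no implicit trust from network location

# Hypothetical policy: only HR, on trusted devices, may reach payroll.
policies = [{"app": "payroll", "allowed_groups": {"hr"}}]

print(authorize({"app": "payroll", "user_group": "hr", "device_trusted": True}, policies))   # True
print(authorize({"app": "payroll", "user_group": "eng", "device_trusted": True}, policies))  # False
```

Contrast this with the castle-and-moat model, where the first check (are you on the network?) was effectively the only check.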

In order to enable this new model, CIOs need a platform that can:

Connect all the entities on their corporate network.

It has to not just be possible, but also easy and reliable to connect users, applications, offices, data centers, and cloud properties to each other as flexibly as possible. This means support for the hardware and connectivity methods customers have today, from enabling mobile clients to operate across OS versions to compatibility with standard tunneling protocols and network peering with global telecom providers.

Apply comprehensive security policies.

CIOs need a solution that integrates tightly with their existing identity and endpoint security providers and provides Zero Trust protection at all layers of the OSI stack across traffic within their network. This includes end-to-end encryption, microsegmentation, sophisticated and precise filtering and inspection for traffic between entities on their network (“East/West”) and to/from the Internet (“North/South”), and protection from other threats like DDoS and bot attacks.

Visualize and provide insight on traffic.

At a base level, CIOs need to understand the full picture of their traffic: who’s accessing what resources and what does performance (latency, jitter, packet loss) look like? But beyond providing the information necessary to answer basic questions about traffic flows and user access, next-generation visibility tools should help users understand trends and highlight potential problems proactively, and they should provide easy-to-use controls to respond to those potential problems. Imagine logging into one dashboard that provides a comprehensive view of your network’s attack surface, user activity, and performance/traffic health, receiving customized suggestions to tighten security and optimize performance, and being able to act on those suggestions with a single click.

Better quality of experience, everywhere in the world

More classic critiques of the public Internet: it’s slow, unreliable, and increasingly subject to complicated regulations that make operating on the Internet as a CIO of a globally distributed company exponentially challenging. The platform CIOs need will make intelligent decisions to optimize performance and ensure reliability, while offering flexibility to make compliance easy.

Fast, in the ways that matter most.

Traditional methods of measuring network performance, like speed tests, don’t tell the full story of actual user experience. Next-generation platforms will measure performance holistically and consider application-specific factors, along with using real-time data on Internet health, to optimize traffic end-to-end.

Reliable, despite factors out of your control.

Scheduled downtime is a luxury of the past: today’s CIOs need to operate 24×7 networks with as close as possible to 100% uptime and reachability from everywhere in the world. They need a provider that’s resilient in its own services, but also has the capacity to handle massive attacks with grace and flexibility to route around issues with intermediary providers. Network teams should also not need to take action for their provider’s planned or unplanned data center outages, such as needing to manually configure new data center connections. And they should be able to onboard new locations at any time without waiting for vendors to provision additional capacity close to their network.

Localized and compliant with data privacy regulations.

Data sovereignty laws are rapidly evolving. CIOs need to bet on a platform that will give them the flexibility to adapt as new protections are rolled out across the globe, with one interface to manage their data (not fractured solutions in different regions).

A paradigm shift that’s possible starting today

These changes sound radical and exciting. But they’re also intimidating — wouldn’t a shift this large be impossible to execute, or at least take an unmanageably long time, in complex modern networks? Our customers have proven this doesn’t have to be the case.

Meaningful change starting with just one flow

Generation 3 platforms should prioritize ease of use. It should be possible for companies to start their Zero Trust journey with just one traffic flow and grow momentum from there. There are lots of potential angles to start with, but we think one of the easiest is configuring clientless Zero Trust access for one application. Anyone, from the smallest to the largest organizations, should be able to pick an app and prove the value of this approach within minutes.

A bridge between the old & new world

Shifting from network-level access controls (IP ACLs, VPNs, etc.) to application and user-level controls to enforce Zero Trust across your entire network will take time. CIOs should pick a platform that makes it easy to migrate infrastructure over time by allowing:

  • Upgrading from IP-level to application-level architecture over time: Start by connecting with a GRE or IPsec tunnel, then use automatic service discovery to identify high-priority applications to target for finer-grained connection.
  • Upgrading from more open to more restrictive policies over time: Start with security rules that mirror your legacy architecture, then leverage analytics and logs to implement more restrictive policies once you can see who’s accessing what.
  • Making changes quickly and easily: Design your next-generation network using a modern SaaS interface.

Network Architecture Scorecard: Generation 3

| Characteristic | Score | Description |
| --- | --- | --- |
| Security | ⭐⭐⭐ | Granular security controls are exercised on every traffic flow; attacks are blocked close to their source; technologies like Browser Isolation keep malicious code entirely off of user devices. |
| Performance | ⭐⭐⭐ | Security controls are enforced at the location closest to each user; intelligent routing decisions ensure optimal performance for all types of traffic. |
| Reliability | ⭐⭐⭐ | The platform leverages redundant infrastructure to ensure 100% availability; no one device is responsible for holding policy, and no one link is responsible for carrying all critical traffic. |
| Cost | ⭐⭐ | Total cost of ownership is reduced by consolidating functions. |
| Visibility | ⭐⭐⭐ | Data from across the edge is aggregated, processed, and presented along with insights and controls to act on it. |
| Agility | ⭐⭐⭐ | Making changes to network configuration or policy is as simple as pushing buttons in a dashboard; changes propagate globally within seconds. |
| Precision | ⭐⭐⭐ | Controls are exercised at the user and application layer. Accomplishing “allow only HR to access employee payment data” looks like: users in HR on trusted devices are allowed to access employee payment data. |
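To make the “Precision” row concrete: enforcing “allow only HR to access employee payment data” is, at its core, a rule over user identity, device posture, and target application. Here is a minimal sketch (the types and field names are hypothetical illustrations, not Cloudflare's actual policy engine):

```python
from dataclasses import dataclass

@dataclass
class Request:
    user_group: str       # e.g. group claim from the identity provider
    device_trusted: bool  # e.g. result of a device posture check
    application: str

def allow(req: Request) -> bool:
    """Allow only HR users on trusted devices to reach employee payment data."""
    if req.application == "employee-payment-data":
        return req.user_group == "HR" and req.device_trusted
    return True  # other applications are governed by their own rules

assert allow(Request("HR", True, "employee-payment-data"))
assert not allow(Request("Sales", True, "employee-payment-data"))
assert not allow(Request("HR", False, "employee-payment-data"))
```

The point of the Generation 3 model is that this decision runs on every request, close to the user, rather than once at a network perimeter.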

Cloudflare One is the first built-from-scratch, unified platform for next-generation networks

In order to achieve the ambitious vision we’ve laid out, CIOs need a platform that can combine Zero Trust and network services operating on a world-class global network. We believe Cloudflare One is the first platform to enable CIOs to fully realize this vision.

We built Cloudflare One, our combined Zero Trust network-as-a-service platform, on our global network in software on commodity hardware. We initially started on this journey to serve the needs of our own IT and security teams and extended capabilities to our customers over time as we realized their potential to help other companies transform their networks. Every Cloudflare service runs on every server in over 250 cities with over 100 Tbps of capacity, providing unprecedented scale and performance. Our security services themselves are also faster — our DNS filtering runs on the world’s fastest public DNS resolver and identity checks run on Cloudflare Workers, the fastest serverless platform.

We leverage insights from over 28 million requests per second and 10,000+ interconnects to make smarter security and performance decisions for all of our customers. We provide both network connectivity and security services in a single platform with single-pass inspection and single-pane management to fill visibility gaps and deliver exponentially more value than the sum of point solutions could alone. We’re giving CIOs access to our globally distributed, blazing-fast, intelligent network to use as an extension of theirs.

This week, we’ll recap and expand on Cloudflare One, with examples from real customers who are building their next-generation networks on Cloudflare. We’ll dive more deeply into the capabilities that are available today and how they’re solving the problems introduced in Generation 2, as well as introduce some new product areas that will make CIOs’ lives easier by eliminating the cost and complexity of legacy hardware, hardening security across their networks and from multiple angles, and making all traffic routed across our already fast network even faster.

We’re so excited to share how we’re making our dreams for the future of corporate networks reality — we hope CIOs (and everyone!) reading this are excited to hear about it.

Magic makes your network faster

Post Syndicated from Annika Garbers original


We launched Magic Transit two years ago, followed more recently by its siblings Magic WAN and Magic Firewall, and have talked at length about how this suite of products helps security teams sleep better at night by protecting entire networks from malicious traffic. Today, as part of Speed Week, we’ll break down the other side of the Magic: how using Cloudflare can automatically make your entire network faster. Our scale and interconnectivity, use of data to make more intelligent routing decisions, and inherent architecture differences versus traditional networks all contribute to performance improvements across all IP traffic.

What is Magic?

Cloudflare’s “Magic” services help customers connect and secure their networks without the cost and complexity of maintaining legacy hardware. Magic Transit provides connectivity and DDoS protection for Internet-facing networks; Magic WAN enables customers to replace legacy WAN architectures by routing private traffic through Cloudflare; and Magic Firewall protects all connected traffic with a built-in firewall-as-a-service. All three share underlying architecture principles that form the basis of the performance improvements we’ll dive deeper into below.

Anycast everything

In contrast to traditional “point-to-point” architecture, Cloudflare uses Anycast GRE or IPsec (coming soon) tunnels to send and receive traffic for customer networks. This means that customers can set up a single tunnel to Cloudflare, but effectively get connected to every single Cloudflare location, dramatically simplifying the process to configure and maintain network connectivity.

Every service everywhere

In addition to being able to send and receive traffic from anywhere, Cloudflare’s edge network is also designed to run every service on every server in every location. This means that incoming traffic can be processed wherever it lands, which allows us to block DDoS attacks and other malicious traffic within seconds, apply firewall rules, and route traffic efficiently, without bouncing it between different servers or locations before it’s dispatched to its destination.

Zero Trust + Magic: the next-gen network of the future

With Cloudflare One, customers can seamlessly combine Zero Trust and network connectivity to build a faster, more secure, more reliable experience for their entire corporate network. Everything we’ll talk about today applies even more to customers using the entire Cloudflare One platform – stacking these products together means the performance benefits multiply (check out our post on Zero Trust and speed from today for more on this).

More connectivity = faster traffic

So where does the Magic come in? This part isn’t intuitive, especially for customers using Magic Transit in front of their network for DDoS protection: how can adding a network hop subtract latency?

The answer lies in Cloudflare’s network architecture — our web of connectivity to the rest of the Internet. Cloudflare has invested heavily in building one of the world’s most interconnected networks (9800 interconnections and counting, including with major ISPs, cloud services, and enterprises). We’re also continuing to grow our own private backbone and giving customers the ability to directly connect with us. And our expansive connectivity to last mile providers means we’re just milliseconds away from the source of all your network traffic, regardless of where in the world your users or employees are.

This toolkit of varying connectivity options means traffic routed through the Cloudflare network is often meaningfully faster than paths across the public Internet alone, because more options available for BGP path selection mean increased ability to choose more performant routes. Imagine having only one possible path between your house and the grocery store versus ten or more – chances are, adding more options means better alternatives will be available. A cornucopia of connectivity methods also means more resiliency: if there’s an issue on one of the paths (like construction happening on what is usually the fastest street), we can easily route around it to avoid impact to your traffic.
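The intuition that “more options mean better alternatives” can be sketched in a few lines: the best latency over a larger set of candidate paths is never worse than over a smaller one. (The latencies below are randomly generated and purely illustrative.)

```python
import random

random.seed(7)

def best_latency(paths_ms):
    """Route selection stand-in: pick the lowest-latency candidate path."""
    return min(paths_ms)

# One available transit path vs. ten, latencies drawn from the same distribution.
single = [random.uniform(20, 200)]
many = single + [random.uniform(20, 200) for _ in range(9)]

# The minimum over a superset is never worse than the minimum over the subset.
assert best_latency(many) <= best_latency(single)
print(f"best of 1 path: {best_latency(single):.0f} ms, "
      f"best of 10 paths: {best_latency(many):.0f} ms")
```

Real BGP path selection weighs far more than latency, but the superset property is what extensive interconnection buys you: more candidates to choose from, and fallbacks when a path degrades.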

One common comparison customers are interested in is latency for inbound traffic. From the end user perspective, does routing through Cloudflare speed up or slow down traffic to networks protected by Magic Transit? Our response: let’s test it out and see! We’ve repeatedly compared Magic Transit vs standard Internet performance for customer networks across geographies and industries and consistently seen really exciting results. Here’s an example from one recent test where we used third-party probes to measure the ping time to the same customer network location (their data center in Qatar) before and after onboarding with Magic Transit:

| Probe location | RTT w/o Magic (ms) | RTT w/ Magic (ms) | Difference (ms) | Improvement (%) |
| --- | --- | --- | --- | --- |
| Dubai | 27 | 23 | 4 | 13% |
| Marseille | 202 | 188 | 13 | 7% |
| Global (averaged across 800+ distributed probes) | 194 | 124 | 70 | 36% |

All of these results were collected without the use of Argo Smart Routing for Packets, which we announced on Tuesday. Early data indicates that networks using Smart Routing will see even more substantial gains.

Modern architecture eliminates traffic trombones

In addition to the performance boost available for traffic routed across the Cloudflare network versus the public Internet, customers using Magic products benefit from a new architecture model that eliminates up to thousands of miles’ worth of backhaul, and the latency that comes with it.

Traditionally, enterprises adopted a “hub and spoke” model for granting employees access to applications within and outside their network. All traffic from within a connected network location was routed through a central “hub” where a stack of network hardware (e.g. firewalls) was maintained. This model worked great in locations where the hub and spokes were geographically close, but started to strain as companies became more global and applications moved to the cloud.

Now, networks using hub and spoke architecture are often backhauling traffic thousands of miles, between continents and across oceans, just to apply security policies before packets are dispatched to their final destination, which is often physically closer to where they started! This creates a “trombone” effect, where precious seconds are wasted bouncing traffic back and forth across the globe, and performance problems are amplified by packet loss and instability along the way.

Network and security teams have tried to combat this issue by installing hardware at more locations to establish smaller, regional hubs, but this quickly becomes prohibitively expensive and hard to manage. The price of purchasing multiple hardware boxes and dedicated private links adds up quickly, both in network gear and connectivity itself as well as the effort required to maintain additional infrastructure. Ultimately, this cost usually outweighs the benefit of the seconds regained with shorter network paths.

The “hub” is everywhere

There’s a better way — with the Anycast architecture of Magic products, all traffic is automatically routed to the closest Cloudflare location to its source. There, security policies are applied with single-pass inspection before traffic is routed to its destination. This model is conceptually similar to a hub and spoke, except that the hub is everywhere: 95% of the entire Internet-connected world is within 50 ms of a Cloudflare location (check out this week’s updates on our quickly-expanding network presence for the latest). This means instead of tromboning traffic between locations, it can stop briefly at a Cloudflare hop in-path before it goes on its way: dramatically faster architecture without compromising security.

To demonstrate how this architecture shift can make a meaningful difference, we created a lab to mirror the setup we’ve heard many customers describe as they’ve explained performance issues with their existing network. This example customer network is headquartered in South Carolina and has branch office locations on the west coast, in California and Oregon. Traditionally, traffic from each branch would be backhauled through the South Carolina “hub” before being sent on to its destination, either another branch or the public Internet.

In our alternative setup, we’ve connected each customer network location to Cloudflare with an Anycast GRE tunnel, simplifying configuration and removing the South Carolina trombone. We can also enforce network and application-layer filtering on all of this traffic, ensuring that the faster network path doesn’t compromise security.

Here’s a summary of results from performance tests on this example network comparing the traditional hub and spoke setup with the Magic “global hub” — we saw up to 70% improvement in these tests, demonstrating the dramatic impact this architecture shift can make.

| | LAX <> OR (ms) |
| --- | --- |
| ICMP round-trip for “Regular” (hub and spoke) WAN | 127 |
| ICMP round-trip for Magic WAN | 38 |
| Latency savings for Magic WAN vs “Regular” WAN | 70% |

This effect can be amplified for networks with globally distributed locations — imagine the benefits for customers who are used to delays from backhauling traffic between different regions across the world.

Getting smarter

Adding more connectivity options and removing traffic trombones provide a performance boost for all Magic traffic, but we’re not stopping there. In the same way we leverage insights from hundreds of billions of requests per day to block new types of malicious traffic, we’re also using our unique perspective on Internet traffic to make more intelligent decisions about routing customer traffic versus relying on BGP alone. Earlier this week, we announced updates to Argo Smart Routing including the brand-new Argo Smart Routing for Packets. Customers using Magic products can enable it to automatically boost performance for any IP traffic routed through Cloudflare (by 10% on average according to results so far, and potentially more depending on the individual customer’s network topology) — read more on this in the announcement blog.

What’s next?

The modern architecture, well-connected network, and intelligent optimizations we’ve talked about today are just the start. Our vision is for any customer using Magic to connect and protect their network to have the best performance possible for all of their traffic, automatically. We’ll keep investing in expanding our presence, interconnections, and backbone, as well as continuously improving Smart Routing — but we’re also already cooking up brand-new products in this space to deliver optimizations in new ways, including WAN Optimization and Quality of Service functions. Stay tuned for more Magic coming soon, and get in touch with your account team to learn more about how we can help make your network faster starting today.

Introducing Greencloud

Post Syndicated from Annika Garbers original


Over the past few days, as part of Cloudflare’s Impact Week, we’ve written about the work we’re doing to help build a greener Internet. We’re making bold climate commitments for our own network and facilities and introducing new capabilities that help customers understand and reduce their impact. And in addition to organization-level initiatives, we also recognize the importance of individual impact — which is why we’re excited to publicly introduce Greencloud, our sustainability-focused employee working group.

What is Greencloud?

Greencloud is a coalition of Cloudflare employees who are passionate about the environment. Initially founded in 2019, we’re a cross-functional, global team with a few areas of focus:

  1. Awareness: Greencloud compiles and shares resources about environmental activism with each other and the broader organization. We believe that collective action — not just conscious consumerism, but also engagement in local policy and community movements — is critical to a more sustainable future, and that the ability to effect change starts with education. We’re also consistently inspired by the great work other folks in tech are doing in this space, and love sharing updates from peers that push us to do better within our own spheres of influence.
  2. Support: Our membership includes Cloudflare team members from across the org chart, which enables us to be helpful in supporting multidisciplinary projects led by functional teams within Cloudflare.
  3. Advocacy: We recognize the importance of both individual and organization-level action. We continue to challenge ourselves, each other and the broader organization to think about environmental impact in every decision we make as a company.

Our vision is to contribute on every level to addressing the climate crisis and creating a more sustainable future, helping Cloudflare become a clear leader in sustainable practices among tech companies. Moreover, we want to empower our colleagues to make more sustainable decisions in each of our individual lives.

What has Greencloud done so far?

Since launching in 2019, Greencloud has created a space for conversation and idea generation around Cloudflare’s sustainability initiatives, many of which have been implemented across our organization. As a group, we’ve created content to educate ourselves and external audiences about a broad range of sustainability topics:

  • Benchmarked Cloudflare’s sustainability practices against peer companies to understand our baseline and source ideas for improvement.
  • Curated guides for colleagues on peer-reviewed content, product recommendations, and “low-hanging fruit” actions we all have the ability to take, such as choosing a sustainable 401k investment plan and using a paperless option for all employee documents.
  • Hosted events such as sustainability-themed trivia/quiz nights to spark discussion and teach participants techniques for making more sustainable decisions in our own homes and lives.

In addition to creating “evergreen” resources and hosting events, Greencloud threw a special celebration for April 22, 2021 — the 51st global Earth Day. For the surrounding week, we hosted a series of events to engage our employees and community in sustainability education and actions.

Greencloud TV Takeover

You can catch reruns of our Earth Week content on Cloudflare TV, covering a broad range of topics:

Tuesday: Infrastructure
A chat with Michael Aylward, Head of Cloudflare’s Network Partners Program and renewable energy expert, about the carbon footprint of Internet infrastructure. We explored how the Internet contributes to climate change and what tech companies, including Cloudflare, are doing to minimize this footprint.

Wednesday: Policy
An interview with Doug Kramer, Cloudflare’s General Counsel, and Patrick Day, Cloudflare’s Senior Policy Counsel, on the overlap between sustainability, tech, and public policy. We dove into how tech companies, including Cloudflare, are working with policymakers to build a more sustainable future.

Thursday: Cloudflare and the Climate
Francisco Ponce de León interviewed Sagar Aryal, the CTO of Plant for the Planet, an organization of young Climate Justice Ambassadors with the goal of planting one trillion trees. Plant for the Planet is a participant in Project Galileo, Cloudflare’s program providing free protection for at-risk public interest groups.

In addition, Amy Bibeau, our Greencloud Places team lead, interviewed Cloudflare’s Head of Real Estate and Workplace Operations, Caroline Quick, and LinkedIn’s Senior Project Manager of Global Sustainability, Dana Jennings, for a look into the opportunities and challenges around creating sustainable workplaces. Like most companies, Cloudflare is rethinking what our workplace will look like post-COVID. Baking sustainability into those plans, and being a model for other companies, can be game-changing.

Friday: Personal Impact & Trivia
A panel of Greencloud employees addressed the challenge of personal versus collective/system-level action and broke down some of the highest value actions we’re working on taking in our own lives.

Finally, Greencloud took over Cloudflare TV’s signature game show Silicon Valley Squares with Earth Day-themed questions!

Get engaged

No one person, group, or organization working alone can save our planet — the degree of collective action required to reverse climate change is staggering, but we’re excited and inspired by the work that leaders across every industry are putting in every day. We’d love for you and/or your organization to join us in this calling to create a more sustainable planet, and to tell us about your initiatives so we can exchange ideas.

Flow-based monitoring for Magic Transit

Post Syndicated from Annika Garbers original


Network-layer DDoS attacks are on the rise, prompting security teams to rethink their L3 DDoS mitigation strategies to prevent business impact. Magic Transit protects customers’ entire networks from DDoS attacks by placing our network in front of theirs, either always on or on demand. Today, we’re announcing new functionality to improve the experience for on-demand Magic Transit customers: flow-based monitoring. Flow-based monitoring allows us to detect threats and notify customers when they’re under attack so they can activate Magic Transit for protection.

Magic Transit is Cloudflare’s solution to secure and accelerate your network at the IP layer. With Magic Transit, you get DDoS protection, traffic acceleration, and other network functions delivered as a service from every Cloudflare data center. With Cloudflare’s global network (59 Tbps capacity across 200+ cities) and <3sec time to mitigate at the edge, you’re covered from even the largest and most sophisticated attacks without compromising performance. Learn more about Magic Transit here.

Using Magic Transit on demand

With Magic Transit, Cloudflare advertises customers’ IP prefixes to the Internet with BGP in order to attract traffic to our network for DDoS protection. Customers can choose to use Magic Transit always on or on demand. With always on, we advertise their IPs and mitigate attacks all the time; with on demand, customers activate advertisement only when their networks are under active attack. But there’s a problem with on demand: if your traffic isn’t routed through Cloudflare’s network, by the time you notice you’re being targeted by an attack and activate Magic Transit to mitigate it, the attack may already have impacted your business.

On demand with flow-based monitoring

Flow-based monitoring solves the problem with on-demand by enabling Cloudflare to detect and notify you about attacks based on traffic flows from your data centers. You can configure your routers to continuously send NetFlow or sFlow (coming soon) to Cloudflare. We’ll ingest your flow data and analyze it for volumetric DDoS attacks.
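To illustrate the idea (this is a toy model, not Cloudflare's actual detection pipeline), a simple volumetric check over aggregated flow records could look like the sketch below; the record shape and the 2 Gbps threshold are assumptions made for the example:

```python
from collections import defaultdict

VOLUMETRIC_THRESHOLD_BPS = 2_000_000_000  # 2 Gbps, purely illustrative

def detect_volumetric(records, window_seconds):
    """Return destination prefixes whose aggregate bit rate exceeds the threshold.

    Each record is a (destination prefix, bytes observed during the window) pair,
    as might be aggregated from sampled NetFlow/sFlow export.
    """
    bytes_per_prefix = defaultdict(int)
    for prefix, num_bytes in records:
        bytes_per_prefix[prefix] += num_bytes
    return [prefix for prefix, total in bytes_per_prefix.items()
            if total * 8 / window_seconds > VOLUMETRIC_THRESHOLD_BPS]

records = [
    ("203.0.113.0/24", 30_000_000_000),  # ~4 Gbps toward this prefix: suspicious
    ("198.51.100.0/24", 750_000_000),    # ~0.1 Gbps: normal
]
print(detect_volumetric(records, window_seconds=60))  # ['203.0.113.0/24']
```

Production detection is adaptive per customer and per traffic profile, but the shape of the problem is the same: aggregate flow telemetry, compare rates against expectations, and alert on the outliers.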

Send flow data from your network to Cloudflare for analysis

When an attack is detected, we’ll notify you automatically (by email, webhook, and/or PagerDuty) with information about the attack.

Cloudflare detects attacks based on your flow data

You can choose whether you’d like to activate IP advertisement with Magic Transit manually – we support activation via the Cloudflare dashboard or API – or automatically, to minimize the time to mitigation. Once Magic Transit is activated and your traffic is flowing through Cloudflare, you’ll receive only the clean traffic back to your network over your GRE tunnels.
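As a sketch of what scripted activation could look like, the snippet below toggles BGP advertisement for a prefix through the Cloudflare API. The endpoint path and payload shape here are assumptions based on Cloudflare's public addressing API; verify them against the current API documentation before relying on them.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4"

def build_advertisement_request(account_id: str, prefix_id: str, token: str,
                                advertised: bool) -> urllib.request.Request:
    """Build the PATCH request that toggles BGP advertisement for a prefix.

    NOTE: the endpoint path and payload shape are assumptions for this sketch;
    check the current Cloudflare API documentation.
    """
    url = f"{API_BASE}/accounts/{account_id}/addressing/prefixes/{prefix_id}/bgp/status"
    return urllib.request.Request(
        url,
        data=json.dumps({"advertised": advertised}).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="PATCH",
    )

def set_advertisement(account_id: str, prefix_id: str, token: str,
                      advertised: bool) -> dict:
    """Send the request: advertised=True activates advertisement, False withdraws it."""
    req = build_advertisement_request(account_id, prefix_id, token, advertised)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

An attack-notification webhook handler could call `set_advertisement(..., advertised=True)` to begin mitigation, then `advertised=False` once the attack subsides.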

Activate Magic Transit for DDoS protection

Using flow-based monitoring with Magic Transit on demand will give your team peace of mind. Rather than acting in response to an attack after it impacts your business, you can complete a simple one-time setup and rest assured that Cloudflare will notify you (and/or start protecting your network automatically) when you’re under attack. And once Magic Transit is activated, Cloudflare’s global network and industry-leading DDoS mitigation have you covered: your users can continue business as usual with no impact to performance.

Example flow-based monitoring workflow: faster time to mitigate for Acme Corp

Let’s walk through an example customer deployment and workflow with Magic Transit on demand and flow-based monitoring. Acme Corp’s network was hit by a large ransom DDoS attack recently, which caused downtime for both external-facing and internal applications. To make sure they’re not impacted again, the Acme network team chose to set up on-demand Magic Transit. They authorize Cloudflare to advertise their IP space to the Internet in case of an attack, and set up Anycast GRE tunnels to receive clean traffic from Cloudflare back to their network. Finally, they configure their routers at each data center to send NetFlow data to a Cloudflare Anycast IP.

Cloudflare receives Acme’s NetFlow data at a location close to the data center sending it (thanks, Anycast!) and analyzes it for DDoS attacks. When traffic exceeds attack thresholds, Cloudflare triggers an automatic PagerDuty incident for Acme’s NOC team and starts advertising Acme’s IP prefixes to the Internet with BGP. Acme’s traffic, including the attack, starts flowing through Cloudflare within minutes, and the attack is blocked at the edge. Clean traffic is routed back to Acme through their GRE tunnels, causing no disruption to end users – they’ll never even know Acme was attacked. When the attack has subsided, Acme’s team can withdraw their prefixes from Cloudflare with one click, returning their traffic to its normal path.

When the attack subsides, withdraw your prefixes from Cloudflare to return to normal

Get started

To learn more about Magic Transit and flow-based monitoring, contact us today.

How our network powers Cloudflare One

Post Syndicated from Annika Garbers original

Earlier this week, we announced Cloudflare One™, a unified approach to solving problems in enterprise networking and security. With Cloudflare One, your organization’s data centers, offices, and devices can all be protected and managed in a single control plane. Cloudflare’s network is central to the value of all of our products, and today I want to dive deeper into how our network powers Cloudflare One.

Over the past ten years, Cloudflare has encountered the same challenges that face every organization trying to grow and protect a global network: we need to protect our infrastructure and devices from attackers and malicious outsiders, but traditional solutions aren’t built for distributed networks and teams. And we need visibility into the activity across our network and applications, but stitching together logging and analytics tools across multiple solutions is painful and creates information gaps.

We’ve architected our network to meet these challenges, and with Cloudflare One, we’re extending the advantages of these decisions to your company’s network to help you solve them too.


Enterprises and small organizations alike have team members around the world. Legacy models of networking forced traffic back through central choke points, slowing down users and constraining network scale. We keep hearing from our customers who want to stop buying appliances and expensive MPLS links just to try to outpace the increased demand their distributed teams place on their network.

Wherever your users are, we are too

Global companies have enough of a challenge managing widely distributed corporate networks, let alone the additional geographic dispersion introduced as users are enabled to work from home or from anywhere. Because Cloudflare has data centers close to Internet users around the world, all traffic can be processed close to its source (your users), regardless of their location. This delivers performance benefits across all of our products.

We built our network to meet users where they are. Today, we have data centers in over 200 cities and over 100 countries. As the geographical reach of Cloudflare’s network has expanded, so has our capacity, which currently tops 42 Tbps. This reach and capacity is extended to your enterprise with Cloudflare One.

The same Cloudflare, everywhere

Traditional solutions for securing enterprise networks often involve managing a plethora of regional providers with different capabilities. This means that traffic from two users in different parts of the world may be treated completely differently, for example, with respect to quality of DDoS attack detection. With Cloudflare One, you can manage security for your entire global network from one place, consolidating and standardizing control.

Capacity for the good & the bad

With 42 Tbps of network capacity, you can rest assured that Cloudflare can handle all of your traffic – the clean, legitimate traffic you want, and the malicious and attack traffic you don’t.


Every product on every server

All of Cloudflare’s services are standardized across our entire network. Every service runs on every server, which means that traffic through all of the products you use can be processed close to its source, rather than being sent around to different locations for different services. This also means that as our network continues to grow, all products benefit: new data centers will automatically process traffic for every service you use.

For example, your users who connect to the Internet through Cloudflare Gateway in South America connect to one of our data centers in the region, rather than backhauling to another location. When those users need to reach an origin located on the other side of the world, we can also route them over our private backbone to get them there faster.

Commodity hardware, software-based functions

We built our network using commodity hardware, which allows us to scale quickly without relying on one single vendor or getting stuck in supply chain bottlenecks. And the services that process your traffic are software-based – no specialized, third-party hardware performing specific functions. This means that the development, maintenance, and support for the products you use all lives within Cloudflare, reducing the complexity of getting help when you need it.

This approach also lets us build efficiency into our network. We use that efficiency to serve customers on our free plan and deliver a more cost-effective platform to our larger customers.


Cloudflare interconnects with over 8,800 networks globally, including major ISPs, cloud services, and enterprises. Because we’ve built one of the most interconnected networks in the world, Cloudflare One can deliver a better experience for your users and applications, regardless of your network architecture or connectivity/transit vendors.

Broad interconnectivity with eyeball networks

Because of our CDN product (among others), being close to end users (“eyeballs”) has always been critical for our network. Now that more people than ever are working from home, eyeball → datacenter connectivity is more crucial than ever. We’ve spoken to customers who, since transitioning to a work-from-home model earlier this year, have had congestion issues with providers who are not well-connected with eyeball networks. With Cloudflare One, your employees can do their jobs from anywhere with Cloudflare smoothly keeping their traffic (and your infrastructure) secure.

Extensive presence in peering facilities

Earlier this year, we announced Cloudflare Network Interconnect (CNI), the ability for you to connect your network with Cloudflare’s via a secure physical or virtual connection. Using CNI means more secure, reliable traffic to your network through Cloudflare One. With our highly-connected network, there’s a good chance we’re colocated with your organization in at least one peering facility, making CNI setup a no-brainer. We’ve also partnered with five interconnect platforms to provide even more flexibility with virtual (software-defined layer 2) connections with Cloudflare. Finally, we peer with major cloud providers all over the world, providing even more flexibility for organizations at any stage of hybrid/cloud transition.

Making the Internet smarter

Traditional approaches to secure, reliable network connectivity rely on expensive MPLS links to provide point-to-point connections. Cloudflare is built from the ground up on the Internet, relying on and improving the same Internet links that customers use today. We’ve built software and techniques that help us use the Internet more intelligently, delivering better performance and reliability to our customers. We’ve also built the Cloudflare Global Private Backbone to further enhance performance and reliability where they’re needed most.

This approach allows us to use the variety of connectivity options in our toolkit intelligently, building toward a more performant network than what we could accomplish with a traditional MPLS solution. And because we use transit from a wide variety of providers, chances are that whoever your ISP is, you already have high-quality connectivity to Cloudflare’s network.


Diverse traffic workload yields attack intelligence

We process all kinds of traffic thanks to our network’s reach and the diversity of our customer base. That scale gives us unique insight into the Internet. We can analyze trends and identify new types of attacks before they hit the mainstream, allowing us to better prepare and protect customers as the security landscape changes.

We also provide you with visibility into these network and threat intelligence insights with tools like Cloudflare Radar and Cloudflare One Intel. Earlier this week, we launched a feature to block DNS tunneling attempts. We analyze a tremendous number of DNS queries and have built a model of what they should look like. We use that model to block suspicious queries which might leak data from devices.
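For a rough sense of how such a model can work, here is an illustrative sketch (not Cloudflare’s actual detector, and the thresholds are invented for the example): DNS tunneling tools typically encode data into long, random-looking query labels, so measuring label length and character entropy catches many naive attempts.

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of a string."""
    counts = Counter(s)
    total = len(s)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_like_tunneling(qname: str,
                         entropy_threshold: float = 4.0,
                         label_len_threshold: int = 40) -> bool:
    """Flag queries whose leftmost label is unusually long or random-looking,
    a common signature of data encoded into DNS queries."""
    label = qname.split(".")[0]
    if not label:
        return False
    return (len(label) > label_len_threshold
            or shannon_entropy(label) > entropy_threshold)
```

A production model would of course use far richer signals (query volume, timing, per-domain baselines), but the core idea of comparing queries against a model of “normal” is the same.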

Unique network visibility enables Smart Routing

In addition to attacks and malicious traffic across our network, we’re paying attention to the state of the Internet. Visibility across carriers throughout the world allows us to identify congestion and automatically route traffic along the fastest and most reliable paths. Contrary to the experience delivered by traditional scrubbing providers, Magic Transit customers experience minimal latency and sometimes even performance improvements with Cloudflare in path, thanks to our extensive connectivity and transit diversity.
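As a simplified illustration of that routing decision (the function and path names here are hypothetical, not Cloudflare’s implementation), path selection can be framed as choosing the next hop with the best observed latency:

```python
import statistics

def best_path(rtt_samples: dict) -> str:
    """Given recent round-trip-time samples (ms) per candidate path,
    pick the path with the lowest median RTT. Median is used so a
    single congested probe doesn't swing the decision."""
    return min(rtt_samples, key=lambda path: statistics.median(rtt_samples[path]))
```

For example, `best_path({"transit_a": [30, 32, 31], "private_backbone": [12, 14, 13]})` would steer traffic onto the backbone path.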

Argo Smart Routing, powered by our extensive network visibility, improves performance for web assets by 30% on average; we’re excited to bring these benefits to any traffic through Cloudflare One with Argo Smart Routing for Magic Transit (coming soon!).

What’s next?

Cloudflare’s network is the foundation of the value and vision for Cloudflare One. With Cloudflare One, you can put our network between the Internet and your entire enterprise, gaining the powerful benefits of our global reach, scalability, connectivity, and insight. All of the products we’ve launched this week, like everything we’ve built so far, benefit from the unique advantages of our network.

We’re excited to see these effects multiply as organizations adopt Cloudflare One to protect and accelerate all of their traffic. And we’re just getting started: we’re going to continue to expand our network, and the products that run on it, to deliver an even faster, more secure, more reliable experience across all of Cloudflare One.

Helping sites get back online: the origin monitoring intern project

Post Syndicated from Annika Garbers original

The most impactful internship experiences involve building something meaningful from scratch and learning along the way. Those can be tough goals to accomplish during a short summer internship, but our experience with Cloudflare’s 2019 intern program met both of them and more! Over the course of ten weeks, our team of three interns (two engineering, one product management) went from a problem statement to a new feature, which is still working in production for all Cloudflare customers.

The project

Cloudflare sits between customers’ origin servers and end users. This means that all traffic to the origin server runs through Cloudflare, so we know when something goes wrong with a server and sometimes reflect that status back to users. For example, if an origin is refusing connections and there’s no cached version of the site available, Cloudflare will display a 521 error. If customers don’t have monitoring systems configured to detect and notify them when failures like this occur, their websites may go down silently, and they may hear about the issue for the first time from angry users.
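The 521 mentioned above is one of a documented family of 52x status codes Cloudflare returns when it cannot get a usable response from an origin. A minimal lookup table (descriptions paraphrased from Cloudflare’s public documentation; the code itself is just an illustrative sketch) looks like this:

```python
# Cloudflare 52x status codes that indicate an origin-side failure.
ORIGIN_ERRORS = {
    521: "Web server is down (origin refused the connection)",
    522: "Connection timed out",
    523: "Origin is unreachable",
    525: "SSL handshake failed",
    526: "Invalid SSL certificate",
}

def describe_origin_error(status: int) -> str:
    """Map an HTTP status code to a human-readable origin failure reason."""
    return ORIGIN_ERRORS.get(status, "Not an origin connectivity error")
```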

When a customer’s origin server is unreachable, Cloudflare sends a 5xx error back to the visitor.

This problem became the starting point for our summer internship project: since Cloudflare knows when customers’ origins are down, let’s send them a notification when it happens so they can take action to get their sites back online and reduce the impact to their users! This work became Cloudflare’s passive origin monitoring feature, which is currently available on all Cloudflare plans.

Over the course of our internship, we ran into lots of interesting technical and product problems, like:

Making big data small

Working with data from all requests going through Cloudflare’s 26 million+ Internet properties to look for unreachable origins is unrealistic from a data volume and performance perspective. Figuring out what datasets were available to analyze for the errors we were looking for, and how to adapt our whiteboarded algorithm ideas to use this data, was a challenge in itself.
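For intuition, here is a hedged sketch of the kind of scaling math involved when working from a sampled dataset rather than raw request logs; the 1% sample rate and function names are hypothetical, not details of Cloudflare’s pipeline:

```python
def estimate_from_sample(sampled_errors: int, sampled_requests: int,
                         sample_rate: float = 0.01):
    """Project sampled counts up to an estimated total request count.
    The error *rate* is sample-rate-invariant, which is why ratios are
    more trustworthy than absolute counts when data is sampled."""
    if sampled_requests == 0:
        return 0, 0.0
    estimated_total = round(sampled_requests / sample_rate)
    return estimated_total, sampled_errors / sampled_requests
```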

Ensuring high alert quality

Because only a fraction of requests show up in the sampled timing and error dataset we chose to use, false positives/negatives were disproportionately likely to occur for low-traffic sites. These are the sites that are least likely to have sophisticated monitoring systems in place (and therefore are most in need of this feature!). In order to make the notifications as accurate and actionable as possible, we analyzed patterns of failed requests throughout different types of Cloudflare Internet properties. We used this data to determine thresholds that would maximize the number of true positive notifications, while making sure they weren’t so sensitive that we ended up spamming customers with emails about sporadic failures.
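To make the trade-off concrete, a simplified thresholding rule might require both a minimum number of samples and a high failure ratio before alerting. The specific numbers below are illustrative only, not the thresholds the team actually shipped:

```python
def should_alert(failed_samples: int, total_samples: int,
                 min_samples: int = 20, failure_ratio: float = 0.9) -> bool:
    """Alert only when there is enough sampled data to trust, and when
    nearly all of it failed. Low-traffic sites with a handful of sporadic
    failures never clear the min_samples bar, suppressing false positives."""
    if total_samples < min_samples:
        return False
    return failed_samples / total_samples >= failure_ratio
```

Tuning `min_samples` trades alert latency for confidence: a higher bar means fewer spurious emails, but slower detection for the smallest sites.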

Designing actionable notifications

Cloudflare has lots of different kinds of customers, from people running personal blogs with interest in DDoS mitigation to large enterprise companies with extremely sophisticated monitoring systems and global teams dedicated to incident response. We wanted to make sure that our notifications were understandable and actionable for people with varying technical backgrounds, so we enabled the feature for small samples of customers and tested many variations of the “origin monitoring email”. Customers responded right back to our notification emails, sent in support questions, and posted on our community forums. These were all great sources of feedback that helped us improve the message’s clarity and actionability.

We frontloaded our internship with lots of research (both digging into request data to understand patterns in origin unreachability problems and talking to customers/poring over support tickets about origin unreachability) and then spent the next few weeks iterating. We enabled passive origin monitoring for all customers with some time remaining before the end of our internships, so we could spend time improving the supportability of our product, documenting our design decisions, and working with the team that would be taking ownership of the project.

We were also able to develop some smaller internal capabilities that built on the work we’d done for the customer-facing feature, like notifications on origin outage events for larger sites to help our account teams provide proactive support to customers. It was super rewarding to see our work in production, helping Cloudflare users get their sites back online faster after receiving origin monitoring notifications.

Our internship experience

The Cloudflare internship program was a whirlwind ten weeks, with each day presenting new challenges and learnings! Some factors that led to our productive and memorable summer included:

A well-scoped project

It can be tough to find a project that’s meaningful enough to make an impact but still doable within the short time period available for summer internships. We’re grateful to our managers and mentors for identifying an interesting problem that was the perfect size for us to work on, and for keeping us on the rails if the technical or product scope started to creep beyond what would be realistic for the time we had left.

Working as a team of interns

The immediate team working on the origin monitoring project consisted of three interns: Annika in product management and Ilya and Zhengyao in engineering. Having a dedicated team with similar goals and perspectives on the project helped us stay focused and work together naturally.

Quick, agile cycles

Since our project faced strict time constraints and our team was distributed across two offices (Champaign and San Francisco), it was critical for us to communicate frequently and work in short, iterative sprints. Daily standups, weekly planning meetings, and frequent feedback from customers and internal stakeholders helped us stay on track.

Great mentorship & lots of freedom

Our managers challenged us, but also gave us room to explore our ideas and develop our own work process. Their trust encouraged us to set ambitious goals for ourselves and enabled us to accomplish way more than we may have under strict process requirements.

After the internship

In the last week of our internships, the engineering interns, who were based in the Champaign, IL office, visited the San Francisco office to meet with the team that would be taking over the project when we left and present our work to the company at our all hands meeting. The most exciting aspect of the visit: our presentation was preempted by Cloudflare’s co-founders announcing the company’s public S-1 filing at the all hands! 🙂


Over the next few months, Cloudflare added a notifications page for easy configurability and announced the availability of passive origin monitoring along with some other tools to help customers monitor their servers and avoid downtime.

Ilya is working for Cloudflare part-time during the school semester and heading back for another internship this summer, and Annika is joining the team full-time after graduation this May. We’re excited to keep working on tools that help make the Internet a better place!

Also, Cloudflare is doubling the size of the 2020 intern class. If you or someone you know is interested in an internship like this one, check out the open positions in software engineering, security engineering, and product management.