Tag Archives: Network Interconnect

Simplifying how enterprises connect to Cloudflare with Express Cloudflare Network Interconnect

Post Syndicated from Ben Ritter original https://blog.cloudflare.com/announcing-express-cni


We’re excited to announce the largest update to Cloudflare Network Interconnect (CNI) since its launch, and because we’re making CNIs faster and easier to deploy, we’re calling this Express CNI. At the most basic level, a CNI is a cable between a customer’s network router and Cloudflare that lets the two networks exchange traffic directly instead of over the public Internet. CNIs are fast, secure, and reliable, and have connected customer networks directly to Cloudflare for years. We’ve been listening to feedback on how we can improve the CNI experience, and today we’re sharing how we’re making it faster and easier to order CNIs and connect them to Magic Transit and Magic WAN.

Interconnection services and what to consider

Interconnection services provide a private connection that allows you to connect your networks to other networks like the Internet, cloud service providers, and other businesses directly. This private connection benefits from improved connectivity versus going over the Internet and reduced exposure to common threats like Distributed Denial of Service (DDoS) attacks.

Cost is an important consideration when evaluating any vendor for interconnection services. The cost of an interconnection is typically composed of a fixed port fee, based on the capacity (speed) of the port, and a variable charge for the amount of data transferred. Some cloud providers also add complex inter-region bandwidth charges.
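To make that cost structure concrete, here is a toy cost model in Python. Every number in the example call is a hypothetical placeholder, not any vendor’s actual rate; the point is only that the total is a fixed term plus usage-driven terms.

    # Toy annual-cost model for an interconnection port. All prices are
    # hypothetical placeholders, not any specific vendor's rates.
    def annual_interconnect_cost(
        port_fee_per_month: float,     # fixed fee, set by port capacity
        data_transferred_gb: float,    # data moved over the year
        price_per_gb: float,           # variable transfer rate
        inter_region_gb: float = 0.0,  # traffic crossing provider regions
        inter_region_price_per_gb: float = 0.0,
    ) -> float:
        fixed = port_fee_per_month * 12
        variable = data_transferred_gb * price_per_gb
        inter_region = inter_region_gb * inter_region_price_per_gb
        return fixed + variable + inter_region

    # Example with made-up numbers: $1,600/month port, 500 TB moved at
    # $0.02/GB, plus 100 TB of inter-region traffic at $0.01/GB.
    print(annual_interconnect_cost(1600, 500_000, 0.02, 100_000, 0.01))  # 30200.0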

Other important considerations include the following:

  • How much capacity do you need?
  • Are there variable or fixed costs associated with the port?
  • Is the provider located in the same colocation facility as your business?
  • Can the provider scale with your network infrastructure?
  • Can you predict your costs without any unwanted surprises?
  • What additional products and services does the vendor offer?

Cloudflare does not charge a port fee for Cloudflare Network Interconnect, nor do we charge for inter-region bandwidth. Using CNI with products like Magic Transit and Magic WAN may even reduce bandwidth spending with Internet service providers. For example, you can deliver Magic Transit-cleaned traffic to your data center with a CNI instead of via your Internet connection, reducing the amount of bandwidth that you would pay an Internet service provider for.

To underscore the value of CNI: one vendor charges nearly $20,000 a year for a 10 Gigabit per second (Gbps) direct connect port. The same 10 Gbps CNI on Cloudflare for one year is $0. That vendor’s port fee also excludes charges for data transferred between different regions or geographies, or out of their cloud. We have never charged for CNIs, and we are committed to making it even easier for customers to connect to Cloudflare, and to destinations beyond on the open Internet.

3 Minute Provisioning

Our first big announcement is a new, faster approach to CNI provisioning and deployment. Starting today, all Magic Transit and Magic WAN customers can order CNIs directly from their Cloudflare account. The entire process is about 3 clicks and takes less than 3 minutes (roughly the time to make coffee). We’re going to show you how simple it is to order a CNI.

The first step is to find out whether Cloudflare is in the same data center or colocation facility as your routers, servers, and network hardware. Let’s navigate to the new “Interconnects” section of the Cloudflare dashboard, and order a new Direct CNI.

Search for the city of your data center, and quickly find out if Cloudflare is in the same facility. I’m going to stand up a CNI to connect my example network located in Ashburn, VA.

It looks like Cloudflare is in the same facility as my network, so I’m going to select the location where I’d like to connect.

As of right now, my data center is only exchanging a few hundred Megabits per second of traffic on Magic Transit, so I’m going to select a 1 Gigabit per second interface, which is the smallest port speed available. I can also order a 10 Gbps link if I have more than 1 Gbps of traffic in a single location. Cloudflare also supports 100 Gbps CNIs, but if you have this much traffic to exchange with us, we recommend that you coordinate with your account team.

After selecting your preferred port speed, you can name your CNI; you will reference this name later when you direct your Magic Transit or Magic WAN traffic to the interconnect. We are then given the opportunity to verify that everything looks correct before confirming our CNI order.

Once we click the “Confirm Order” button, Cloudflare provisions an interface on our router for your CNI within 3 minutes of your order and assigns IP addresses for you to configure on your router interface. Cloudflare also issues you a Letter of Authorization (LOA), which you use to order a cross-connect from the local facility. You will be able to ping across the CNI as soon as the interface line status comes up.

After downloading the LOA to order the cross-connect, we’ll navigate back to our Interconnects area. Here we can see the point-to-point IP addressing and the CNI name that is used in our Magic Transit or Magic WAN configuration. We can also re-download the LOA if needed.

Simplified Magic Transit and Magic WAN onboarding

Our second major announcement is that Express CNI dramatically simplifies how Magic Transit and Magic WAN customers connect to Cloudflare. In the past, getting packets into Magic Transit or Magic WAN over a CNI required customers to configure a GRE (Generic Routing Encapsulation) tunnel on their router. These configurations are complex, and not all routers and switches support them. Since both Magic Transit and Magic WAN protect networks and operate on packets at the network layer, customers rightly asked us, “If I connect directly to Cloudflare with CNI, why do I also need a GRE tunnel for Magic Transit and Magic WAN?”

Starting today, GRE tunnels are no longer required with Express CNI. This means that Cloudflare supports standard 1500-byte packets on the CNI, and there’s no need for complex GRE or MSS adjustment configurations to get traffic into Magic Transit or Magic WAN. This significantly reduces the amount of configuration required on a router for Magic Transit and Magic WAN customers who can connect over Express CNI. If you’re not familiar with Magic Transit, the key takeaway is that we’ve reduced the complexity of changes you must make on your router to protect your network with Cloudflare.
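The arithmetic behind that simplification is worth spelling out. Here is a sketch of the standard header math; the IPv4 and GRE header sizes are fixed by the protocols, and nothing in it is Cloudflare-specific:

    # Why GRE tunnels force MTU/MSS adjustments: encapsulation overhead.
    ETHERNET_MTU = 1500        # bytes available for an IP packet
    OUTER_IPV4_HEADER = 20     # delivery (outer) IP header added by GRE
    GRE_HEADER = 4             # basic GRE header with no optional fields

    # The inner packet must fit in what's left after encapsulation:
    gre_mtu = ETHERNET_MTU - OUTER_IPV4_HEADER - GRE_HEADER  # 1476

    # TCP MSS is the payload left after the inner IP and TCP headers:
    INNER_IPV4_HEADER, TCP_HEADER = 20, 20
    mss_over_gre = gre_mtu - INNER_IPV4_HEADER - TCP_HEADER    # 1436
    mss_plain = ETHERNET_MTU - INNER_IPV4_HEADER - TCP_HEADER  # 1460

    print(gre_mtu, mss_over_gre, mss_plain)  # 1476 1436 1460

With a direct 1500-byte path, those 24 bytes of overhead, and the MSS clamping they force, simply disappear.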

What’s next for CNI?

We’re excited about how Express CNI simplifies connecting to Cloudflare’s network. Some customers connect to Cloudflare through our Interconnection Platform Partners, like Equinix and Megaport, and we plan to bring the Express CNI features to our partners too.

We have upgraded a number of our data centers to support Express CNI, and we are rapidly adding more as we install new network hardware over the next few months. If you’re interested in connecting to Cloudflare with Express CNI but are unable to find your data center, please let your account team know.

If you’re on an existing classic CNI today, and you don’t need Express CNI features, there is no obligation to migrate to Express CNI. Magic Transit and Magic WAN customers have been asking for BGP support to control how Cloudflare routes traffic back to their networks, and we expect to extend BGP support to Express CNI first, so keep an eye out for more Express CNI announcements later this year.

Get started with Express CNI today

As we’ve demonstrated above, Express CNI makes it fast and easy to connect your network to Cloudflare. If you’re a Magic Transit or Magic WAN customer, the new “Interconnects” area is now available on your Cloudflare dashboard. To deploy your first CNI, you can follow along with the screenshots above, or refer to our updated interconnects documentation.

Cloudflare’s global network grows to 300 cities and ever closer to end users with connections to 12,000 networks

Post Syndicated from Damian Matacz original http://blog.cloudflare.com/cloudflare-connected-in-over-300-cities/

We make no secret about how passionate we are about building a world-class global network to deliver the best possible experience for our customers. This means an unwavering and continual dedication to always improving the breadth (number of cities) and depth (number of interconnects) of our network.

This is why we are pleased to announce that Cloudflare is now connected to over 12,000 Internet networks in over 300 cities around the world!

The Cloudflare global network runs every service in every data center, so your users have a consistent experience everywhere, whether you are in Reykjavík, Guam, or near any of the 300 cities where Cloudflare lives. This means all customer traffic is processed at the data center closest to its source, with no backhauling or performance tradeoffs.

Having Cloudflare’s network present in hundreds of cities globally is critical to providing new and more convenient ways to serve our customers and their customers. But the breadth of our network serves other critical purposes, too. Let’s take a closer look at why we build, and the real-world impact on customer experience we’ve seen:

Reduce latency

Our network allows us to sit approximately 50 ms from 95% of the Internet-connected population globally. Still, we constantly review network performance metrics and work with local and regional Internet service providers to make sure we focus our growth on underserved markets where we can add value and improve performance. So far in 2023, we’ve already added 12 new cities, bringing our network to over 300 cities spanning 122 unique countries!

The new cities:

  • Albuquerque, New Mexico, US
  • Austin, Texas, US
  • Bangor, Maine, US
  • Campos dos Goytacazes, Brazil
  • Fukuoka, Japan
  • Kingston, Jamaica
  • Kinshasa, Democratic Republic of the Congo
  • Lyon, France
  • Oran, Algeria
  • São José dos Campos, Brazil
  • Stuttgart, Germany
  • Vitoria, Brazil

In May, we activated a new data center in Campos dos Goytacazes, Brazil, where we interconnected with a regional network provider serving 100+ local ISPs. While the city is not far from Rio de Janeiro (270 km), the new site still cut in half our 50th and 75th percentile latency, measured from the TCP handshake between Cloudflare’s servers and the user’s device, a noticeable performance improvement!
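For readers unfamiliar with the metric: the percentiles come from sorting many handshake round-trip samples and reading off values by rank. A minimal sketch, using invented RTT samples rather than real measurement data:

    # Nearest-rank percentile over TCP-handshake RTT samples (milliseconds).
    def percentile(samples: list[float], p: float) -> float:
        ordered = sorted(samples)
        rank = max(0, round(p / 100 * len(ordered)) - 1)
        return ordered[rank]

    handshake_rtts = [12.0, 15.5, 18.2, 22.9, 31.4, 48.0, 95.3]  # made up
    print("p50:", percentile(handshake_rtts, 50))  # p50: 22.9
    print("p75:", percentile(handshake_rtts, 75))  # p75: 31.4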

[Chart: 50th and 75th percentile latency in Campos dos Goytacazes before and after the new data center]

Improve interconnections

A larger number of local interconnections facilitates direct connections between network providers, content delivery networks, and regional Internet Service Providers. These interconnections enable faster and more efficient data exchange, content delivery, and collaboration between networks.

Currently, there are approximately 74,000[1] AS numbers routed globally. An Autonomous System (AS) number is a unique number allocated to each ISP, enterprise, cloud, or similar network that participates in Internet routing using BGP. Of these roughly 74,000 ASNs, 43,000[2] are stub ASNs, connected to only one other network. These are often enterprise or internal-use ASNs that connect only to their own ISP or internal network, not to other networks.

It’s mind-blowing to consider that Cloudflare is directly connected to 12,372 unique Internet networks, roughly a third of the networks it is possible to connect to globally! This direct connectivity builds resilience and enables performance, ensuring there are multiple places to connect between networks, ISPs, and enterprises, and that those connections are as fast as possible.

We saw an earlier example of this as we began connecting more locally: as described in this blog post, local connections even increased how much our network was used. Better performance drives further usage!

At Cloudflare, we make sure that infrastructure expansion strategically aligns with building in markets where we can interconnect more deeply, because increasing our network’s breadth is only as valuable as the number of local interconnections it enables. For example, we recently connected to a local ISP (representing a new ASN connection) in Pakistan, where 50th percentile latency improved from ~90 ms to 5 ms!

[Chart: 50th percentile latency improvement in Pakistan after the new ASN connection]

Build resilience

Network expansion may be driven by reducing latency and improving interconnections, but it’s equally valuable to our existing network infrastructure. Increasing our geographic reach strengthens our redundancy, localizes failover and helps further distribute compute workload resulting in more effective capacity management. This improved resilience reduces the risk of service disruptions and ensures network availability even in the event of hardware failures, natural disasters, or other unforeseen circumstances. It enhances reliability and prevents single points of failure in the network architecture.

Ultimately, our commitment to strategically expanding the breadth and depth of our network delivers improved latency, stronger interconnections and a more resilient architecture – all critical components of a better Internet! If you’re a network operator, and are interested in how, together, we can deliver an improved user experience, we’re here to help! Please check out our Edge Partner Program and let’s get connected.

______
[1] CIDR Report
[2] Origin ASes announced via a single AS path

Cloudflare’s network expansion in Indonesia

Post Syndicated from Joanne Liew original https://blog.cloudflare.com/indonesia/

Home to over 200 million Internet users and the fourth-largest population in the world, Indonesia depends on fast and reliable Internet, but this has always been a challenging part of the world for Internet infrastructure. That has real-world implications for performance and reliability: IP transit there is, on average, 6x more expensive than in our major Southeast Asian interconnection markets. First, we want to share what makes things challenging in Indonesia: geography, infrastructure, and market dynamics.

Geography: The Internet backbone of most countries is delivered almost entirely by terrestrial fiber optic cables, and connectivity is more affordable and easier to build when the land mass is contiguous and the population distribution is concentrated. Indonesia, however, is a collection of over 18,000 islands, spanning three time zones and approximately 3,200 miles (5,100 km) east to west; by comparison, the United States spans 2,800 miles (4,500 km) east to west. While parts of Indonesia are geographically close to Singapore (the regional Internet hub, home to over 60% of the region’s data centers), Indonesia is so large that much of it is far away.

Infrastructure: Indonesia is a large country, and to connect to the rest of the Internet it currently relies on submarine fiber optic cables. There are 22 separate submarine cables connecting Indonesia to Singapore, Malaysia, Australia, and onward. Many of the cable systems cross the Strait of Malacca, the narrow stretch of water between the Malay Peninsula and the Indonesian island of Sumatra that connects the Indian and Pacific Oceans. Because the strait is one of the world’s top five busiest shipping lanes, human activity, such as ships dropping anchor, fishing trawlers, and dredging, makes reliability challenging. Additionally, Indonesia sits in a very active seismic zone and is highly earthquake-prone.

A number of new submarine cable systems have come online, and four significant builds are planned (Apricot, ACC-1, Echo and Nui) that will improve both available capacity and cost economics in the market. For now, though, costs remain significantly higher than for comparable distances elsewhere: a 100 Gbps wavelength service from Jakarta to Singapore is approximately 60 times more expensive than a service of the same distance in the continental US or Europe. Staying in Asia, a similar distance from Hong Kong to Taiwan costs around one sixth of Jakarta to Singapore.

[Photo: Cyber 1 and Cyber 3 (NTT NexCenter) data center buildings in Jakarta, 2019. Photo credit: Tom Paseka]

[Photo: the Cyber 1 lobby directory]

While areas like Batam are becoming increasingly popular for data center builds due to their proximity to Singapore, Jakarta is still the most developed and mature market. It has the largest and best-interconnected data centers in the country, including the two pictured above.

Cloudflare is deployed in the facility on the right (NTT NexCenter), but most ISPs are inside the building on the left (Cyber 1). The two buildings are only about 30-50 meters apart, yet it’s surprisingly difficult to connect between them. One reason is market fragmentation: in the picture of the Cyber 1 lobby directory, many of the listings are distinct data centers, each with its own policies and access conditions.

In the past, we’ve talked about the Cost of Bandwidth around the world (and updated here), but we’ve never talked about Indonesia specifically. Using the same methodology as before, Indonesia’s cost is 43x more expensive than North America or Europe, and multiples more expensive than other countries in Asia.

Market dynamics: While Indonesia has good and functioning Internet Exchanges, there are a few ISPs who dominate the market. The three largest ISPs in the country (Telkom Indonesia, Indosat Ooredoo Hutchison and XL Axiata) collectively control 80% of the market, while Telkom Indonesia alone has a market share of around 60% by revenue.

This heavily dominant market position allows Telkom Indonesia to refuse to peer, or exchange Internet traffic, within Indonesia without expensive payments, or to instead connect to other networks outside of Indonesia, introducing latency and diminishing performance.

Despite all of these challenges, our network has come a long way since our initial deployment to Jakarta in 2019.

We’ve established:

  • A carrier-neutral local point of presence at NTT Indonesia Nexcenter Data Center, one of the major interconnection hubs in Jakarta
  • An edge partnership point of presence in Yogyakarta with CitranetIX
  • Direct in-country interconnections with two of the top three networks
  • Peering across three of the larger local Internet exchanges: Indonesia Internet Exchange, Jakarta Internet Exchange, and Biznet Internet Exchange
  • Dedicated 100G wavelength transport back to Singapore

All of this results in a more performant and reliable network for our local customers.

We wanted to see how our network has performed since deployment. During Speed Week in 2021, we described how we benchmark ourselves against different networks, and we’re sharing some of those benchmarks here.

At the end of December 2021, Cloudflare was faster than other providers in Indonesia on only a few networks.

[Chart: network performance benchmarks in Indonesia, December 2021]

Fast-forward twelve months to December 2022, and Cloudflare is significantly faster on many more networks.

[Chart: network performance benchmarks in Indonesia, December 2022]

TCP is a connection-oriented protocol: a connection is established and maintained until the application programs at each end have finished exchanging messages. Connect time summarizes how quickly a session can be set up between a client and a server over a network. TTLB (time to last byte) is the time taken to send the entire response to the web browser, and is a good measure of how long a complete download takes. Check out our recent blog on Benchmarking Edge Network Performance for more information on how we measure the performance of our network and benchmark ourselves against industry players.
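To make those two metrics concrete, here is a rough single-sample sketch in Python; the hostname is a placeholder, and a real benchmark would take many samples across many networks:

    # Measure TCP connect time and TTLB for one plain-HTTP request.
    import socket, time

    host, path = "example.com", "/"   # placeholder target

    start = time.perf_counter()
    sock = socket.create_connection((host, 80), timeout=10)
    connect_time = time.perf_counter() - start    # handshake complete

    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    sock.sendall(request.encode())
    while sock.recv(4096):            # drain until the server closes
        pass
    ttlb = time.perf_counter() - start            # last byte received
    sock.close()

    print(f"connect: {connect_time*1000:.1f} ms, TTLB: {ttlb*1000:.1f} ms")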

On closer inspection of the three major ISPs specifically, we’re the top provider on two of the three networks. Cloudflare’s performance has improved year-on-year (a 16% reduction) and continues to lead relative to the other networks, meaning faster and more responsive services for our customers.

[Chart: performance against the three major Indonesian ISPs]

Helping build a better Internet for Indonesia doesn’t stop here and there is always more work to be done! We want to be the number one network everywhere and won’t rest until we are. We are continuing to connect to more networks locally, invest in direct submarine cable capacity, as well as further deployments into new data center buildings, Internet Exchanges and new cities too!

Are you operating a network and not yet peering with Cloudflare? Log in to our Peering Portal or find more information here on ways to set up peering, or request that we deploy nodes directly into your network.

How Cloudflare Is Solving Network Interconnection for CIOs

Post Syndicated from David Tuber original https://blog.cloudflare.com/more-offices-faster/

Building a corporate network is hard. We want to enable IT teams to focus on exploring and deploying cutting edge technologies to make employees happier and more productive — not figuring out how to add 100 Mbps of capacity on the third floor of a branch office building.

And yet, as we speak to CIOs and IT teams, we consistently hear about the challenges of managing organizational connectivity. Today, we’re sharing more about how we’re solving those challenges. There are three parts to our approach: we’re making our network more valuable in terms of the benefit you get from connecting to us; we’re expanding our reach, so we can offer connectivity in more places; and we’re further reducing our provisioning times, so there’s no need to plan six months in advance.

Making Interconnection Valuable

Cloudflare delivers security, reliability, and performance products as a service, all from our global network. We’ve spent the past week talking about new releases and enhanced functionality; if you haven’t yet, please check out some exciting posts on replacing your hardware firewall, managing third-party tools in the cloud, and protecting your web pages from malicious scripts. By interconnecting with us, you get access to all of these new products and features with zero additional latency and simple configuration. This includes, for example, leveraging private paths from Cloudflare’s Magic Transit to your data centers, completely bypassing the public Internet. It also includes the ability to leverage our private backbone and global network for dramatic performance improvements throughout your network. You can read more examples of how interconnection gives you faster, more secure access to our products in our Cloudflare Network Interconnect blog.

But it’s not just the products and features you gain access to. Over 28 million Internet properties rely on Cloudflare to protect and accelerate their Internet presence, and every time a new property connects, our network becomes more useful. Our free customers and the consumers who use 1.1.1.1 give us unparalleled visibility into the Internet, which we use to improve our network’s performance. Similarly, as we expand our surface area on the Internet, our threat detection improves; it’s like an immune system that learns as it gets exposed to more pathogens. Each customer we make faster and more secure helps others in turn. We have a vast network of customers, including the titans of ecommerce, banking, ERP and CRM systems, and other cloud services, and it’s only continuing to grow, to your advantage.

Making Interconnection Available Everywhere

Building corporate networks requires diverse types of locations to connect to each other: data centers, remote workers, branches in various locations, factories, and more. To accommodate the diversity and geographic spread of modern networks, Cloudflare offers many interconnection options, from our 250 locations around the world to 1000 new interconnection locations that will be enabled over the next year as a part of Cloudflare for Offices.

Connecting data centers to Cloudflare

You can interconnect with Cloudflare in over 250 locations around the world. Check out our PeeringDB page to learn more about where you can connect with us.

We also have several Interconnect Partners who provide even more locations for interconnection. If you already have a data center presence in these locations, interconnecting with Cloudflare becomes even easier. Go to our partnership page to learn more about how to get connected through one of our partners.

Connecting your branch offices

A refresher on our Birthday Week post: Cloudflare for Offices is our initiative to bring Cloudflare’s presence to office buildings and multi-dwelling units. Simply put, Cloudflare is coming to an office near you. By plugging into Cloudflare, you get direct, private, performant access to all Cloudflare services, particularly Cloudflare One. With Cloudflare for Offices, your Gateway queries never traverse the public Internet before reaching Cloudflare, your private network built on Magic WAN is even more private, and Argo for Packets makes your offices faster than before. Cloudflare for Offices is the ultimate on-ramp for all on-premises traffic.

If we’re going to 1000 new locations, there has to be a method to the madness! The process for selecting new locations includes a number of factors. Our goal for each location is to allow the most customers to interconnect with us, while also leveraging our network partners to get connected as fast as possible.

What does a building need to have?

We want to offer reliable, turnkey connectivity to the zero trust security and other services that customers connect to our network to consume.

When we evaluate any building, it has to meet the following criteria:

  1. It must be connected to the Internet with one or more telecom partners. Working with existing providers reduces overhead and time to provision. Plugging into our network to get protected doesn’t work if we have to lay fiber for three months.
  2. It must be multi-tenant and in a large metro area. Eventually, we want to go everywhere, even to buildings with only one tenant. But as we’re starting from zero, we want to go first to the places where we can have the most immediate impact: buildings that are large, have many potential or active customers, and sit in areas with large populations.

However, once we’ve chosen a building, the journey is far from over. Getting connected in a building involves a host of challenges beyond choosing a connectivity partner. After a building is selected, Cloudflare works with building operators and network providers to deliver connectivity to the building’s tenants. Regardless of how we get to your office, we want to make getting connected as easy as possible. And our expansion into 1,000 more buildings means we’re on the path to being everywhere.

Once a building is provisioned for connectivity, you have to get connected. We’ve been working to provide a one-stop solution for all your office and datacenter connectivity that will look the same, regardless of location.

Getting Interconnection Done Quickly

Interconnection should be easy, and should just involve plugging in and getting connected. Cloudflare has been hard at work since the release of Cloudflare Network Interconnect thinking through the best ways to streamline connectivity to make provisioning an interconnection as seamless as plugging in a cable. With Cloudflare for Offices expanding its reach as we detailed above, this will be easy: users who are connecting via offices are using pre-established connectivity through partners.

But for customers who aren’t in a building covered by Cloudflare for Offices, or who use Cloudflare Network Interconnect, it’s not that simple. Provisioning network connectivity has traditionally been a time-consuming process for everyone involved. Customers need to deal with datacenter providers, receive letters of authorization (or LOAs for short), contract remote hands to plug in cables, read light levels, and that’s before software gets involved. This process has typically taken weeks in the industry, and Cloudflare has spent a lot of time shrinking that down. We don’t want weeks, we want minutes, and we’re excited that we are finally getting there.

There are three main initiatives we are pursuing to get this done: automating BGP configurations, streamlining cross-connect provisioning, and improving uptime. Let’s dive into each of those.

Instant BGP session turnup

When you provision a CNI, you’re essentially creating a brand new road between your neighborhood and the Cloudflare neighborhood. If the cross-connect cable is the paving of the actual street, BGP sessions are the street signs and map applications that tell everyone the new road is open. Establishing a BGP session is critical to using a CNI because it lets both Cloudflare and your network know that a new private path exists between them.

But when you pave a new road, you update the street signs in parallel with building the road. So why not do the same with interconnection? Cloudflare now provisions BGP sessions as soon as the cross-connect is ordered, so the session is ready for you to configure. This cuts down on back-and-forth and parallelizes critical work, reducing overall provisioning time.

Cross-connect provisioning and Interconnect partners

Building the road itself takes a lot of time, and, to continue the metaphor, provisioning cross-connects runs into similar issues. Although we all wish robots could manage cross-connects in every data center, we still rely on booking time with humans, filling out purchase orders, completing methods of procedure (MOPs) to tell them what to do, and hoping that nobody bumps any cables or is accidentally clumsy during the maintenance. Imagine trying to plug your cables into one of these.

[Photo: a chaotic tangle of cables]

To fix this and reduce complexity, Cloudflare is standardizing connectivity in our data centers to make it easy for humans to know where things get plugged in. We’re also making better use of patch panels, which allow operators to interconnect with us without entering cages. This reduces time and complexity, because operators are less likely to bump into equipment in cages and cause outages.

In addition, we also have our Interconnect Partners, which leverage existing connectivity with Cloudflare to provide virtual interconnection. Our list of partners is ever growing, and they’re super excited to work with us and you to give you the best, fastest, most secure connectivity experience possible.

“Megaport’s participation in Cloudflare Network Interconnect as an Interconnection Platform Partner helps make connectivity easier for our mutual customers. Reducing the time it takes for customers to go live with new Virtual Cross Connects and Megaport Cloud Routers helps them realize the promise of software-defined networking.”
Peter Gallagher, Head of Channel, Megaport

“Console Connect and Cloudflare are continuing our partnership as part of Cloudflare’s Network Interconnect program, helping our mutual customers enhance the performance and control of their network through Software-Defined Interconnection®. As more and more customers move from physical to virtual connectivity, our partnership will help shorten onboarding times and make interconnecting easier than ever before.”
Michael Glynn, VP of Digital Automated Innovation, Console Connect

Improving connection resilience and uptime

One customer quote that always resonates is, “I love using your services and products, but if you’re not up, then that doesn’t matter.” Nowhere is that more true than interconnection. To that end, Cloudflare is excited to announce Bidirectional Forwarding Detection (BFD) support on physical CNI links. BFD is a networking protocol that constantly monitors links and BGP sessions by sending a steady stream of packets across the session; if a small number of those packets fail to arrive at the other side, the session is considered down. This is useful for CNI customers who cannot tolerate any amount of packet loss. If you’re a CNI customer, or any Cloudflare customer with a low-loss requirement, CNI with BFD ensures that quick decisions are made about your CNI so that your traffic always gets through.
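To illustrate the detection rule (not the real wire protocol), here is a minimal sketch of BFD-style liveness tracking; the interval and multiplier values are illustrative, not Cloudflare’s actual settings:

    # BFD-style failure detection: a session is down once `detect_mult`
    # consecutive receive intervals pass without a control packet.
    import time

    class BfdSession:
        def __init__(self, rx_interval_ms: int = 300, detect_mult: int = 3):
            self.rx_interval = rx_interval_ms / 1000.0
            self.detect_mult = detect_mult
            self.last_rx = time.monotonic()

        def on_packet_received(self) -> None:
            self.last_rx = time.monotonic()  # any received packet resets the timer

        def is_up(self) -> bool:
            # Down after the detection time (interval x multiplier) elapses.
            return time.monotonic() - self.last_rx < self.rx_interval * self.detect_mult

    session = BfdSession()
    session.on_packet_received()
    print(session.is_up())  # True while packets keep arriving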

Get connected today

Cloudflare is always trying to push the boundaries of what’s possible. We built a better path through the Internet with Argo, took on edge computing with Workers, and showed that zero trust networking could be done in the cloud with Cloudflare One. Pushing the boundaries of improving connectivity is the next step in Cloudflare’s journey to help build a better Internet. There are hard problems for people to solve on the Internet, like how to best protect what belongs to you. Figuring out how to get connected and protected should be fast and easy. With Cloudflare for Offices and CNI, we want to make it that easy.

If you are interested in CNI or Cloudflare for Offices, visit our landing page or reach out to your account team to get plugged in today!

Making Magic Transit health checks faster and more responsive

Post Syndicated from Meyer Zinn original https://blog.cloudflare.com/making-magic-transit-health-checks-faster-and-more-responsive/

Magic Transit advertises our customers’ IP prefixes directly from our edge network, applying DDoS mitigation and firewall policies to all traffic destined for the customer’s network. After the traffic is scrubbed, we deliver clean traffic to the customer over GRE tunnels (over the public Internet or Cloudflare Network Interconnect). But sometimes, we experience inclement weather on the Internet: network paths between Cloudflare and the customer can become unreliable or go down. Customers often configure multiple tunnels through different network paths and rely on Cloudflare to pick the best tunnel to use if, for example, some router on the Internet is having a stormy day and starts dropping traffic.

[Diagram: Magic Transit delivering scrubbed traffic to a customer network over GRE tunnels]

Because we use Anycast GRE, every server across Cloudflare’s 200+ locations globally can send GRE traffic to customers. Every server needs to know the status of every tunnel, and every location has completely different network routes to customers. Where to start?

In this post, I’ll break down my work to improve the Magic Transit GRE tunnel health check system, creating a more stable experience for customers and dramatically reducing CPU and memory usage at Cloudflare’s edge.

Everybody has their own weather station

To decide where to send traffic, Cloudflare edge servers need to periodically send health checks to each customer tunnel endpoint.

When Magic Transit was first launched, every server sent a health check to every tunnel once per minute. This naive, “shared-nothing” approach was simple to implement and served customers well, but would occasionally deliver less than optimal health check behavior in two specific ways.

Way #1: Inconsistent weather reports

Sometimes a server just runs into bad luck, and a check randomly fails. From there, the server would mark the tunnel as degraded and immediately start shifting traffic towards a fallback tunnel. Imagine you and I were standing right next to each other under a clear sky, and I felt a single drop of water and declared, “It’s raining!” whereas you felt no raindrops and declared, “It’s all clear!”

With relatively little data per server, health determinations can be imprecise, and individual servers can overreact to individual failures. From a customer’s point of view, it looks like Cloudflare detected a problem with the primary tunnel, but in reality the server just got a bad weather reading and made a different judgement call.

Way #2: Slow to respond to storms

Even when tunnel states are consistent across servers, they can be slow to respond. In this case, if a server runs a health check which succeeds, but a second later the tunnel goes down, the next health check won’t happen for another 59 seconds. Until that next health check fails, the server has no idea anything is wrong, so it keeps sending traffic over unhealthy tunnels, leading to packet loss and latency for the customer.

Much like how a live, up-to-the-minute rain forecast helps you decide when to leave to avoid the rain, servers that send tunnel checks more frequently get a finer view of the Internet weather and can respond faster to localized storms. But if every server across Cloudflare’s edge sent health checks too frequently, we would very quickly start to overwhelm our customers’ networks.

All of the weather stations nearby start sharing observations

Clearly, we needed to hammer out some kinks. We wanted servers in the same location to come to the same conclusions about where to send traffic, and we wanted faster detection of issues without increasing the frequency of tunnel checks.

Health checks sent from servers in the same data center take the same route across the Internet. Why not share the results among them?

Instead of a single raindrop causing me to declare that it’s raining, I’d tell you about the raindrop I felt, and you’d tell me about the clear sky you’re looking at. Together, we come to the same conclusion: there isn’t enough rain to open an umbrella.

There is even a special networking protocol that allows us to easily share information between servers in the same private network. From the makers of Unicast and Anycast, now presenting: Multicast!

A single IP address does not necessarily represent a single machine in a network. The Internet Protocol specifies a way to send one message that gets delivered to a group of machines, like writing to an email list. Every machine has to opt into the group—we can’t just enroll people at random for our email list—but once a machine joins, it receives a copy of any message sent to the group’s address.
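In code, joining a group comes down to a couple of socket options. A minimal sketch, using a placeholder group address from the administratively scoped 239.0.0.0/8 range:

    # Join a multicast group and print every message sent to it.
    import socket
    import struct

    GROUP, PORT = "239.1.2.3", 5007   # hypothetical group and port

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))

    # Opt in: ask the kernel to deliver datagrams addressed to the group.
    membership = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)

    while True:
        data, sender = sock.recvfrom(4096)
        print(f"{sender} sent {data!r}")  # every member receives a copy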

[Diagram: a message sent to a multicast group address is delivered to every machine that joined the group]

The servers in a Cloudflare edge data center are part of the same private network, so for “version 2” of our health check system, we had each server in a data center join a multicast group and share their health check results with one another. Each server still made an independent assessment for each tunnel, but that assessment was based on data collected by all servers in the same location.

This second version of tunnel health checks resulted in more consistent tunnel health determinations by servers in the same data center. It also resulted in faster response times, especially in large data centers where servers receive updates from their peers very rapidly.

However, we started seeing scaling problems. As we added more customers, we added more tunnel endpoints where we need to check the weather. In some of our larger data centers, each server was receiving close to half a billion messages per minute.
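The blow-up is easy to see with back-of-the-envelope numbers (these counts are hypothetical, chosen only to show the multiplication, not Cloudflare’s actual figures):

    # Per-server message volume in the share-everything multicast design.
    servers = 400                      # peers in one large data center
    tunnels = 25_000                   # tunnel endpoints being checked
    checks_per_tunnel_per_minute = 4   # per-server check frequency

    # Every server multicasts each check result to all peers, so each
    # server receives every other server's result for every tunnel:
    received_per_minute = (servers - 1) * tunnels * checks_per_tunnel_per_minute
    print(f"{received_per_minute:,} messages/minute per server")  # 39,900,000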

Imagine it’s not just you and me telling each other about the weather above us. You’re in a crowd of hundreds of people, and now everyone is shouting the weather updates for thousands of cities around the world!

One weather station to rule them all

As an engineering intern on the Magic Transit team, I spent this summer developing a third approach. Rather than having every server infrequently check the weather for every tunnel and shout its observations to everyone else, every server now frequently checks the weather for just a few tunnels. With this new approach, servers only tell the others the overall weather report, not every individual measurement.

That scenario sounds more efficient, but we need to distribute the task of sending tunnel health checks across all the servers in a location so one server doesn’t get an overwhelming amount of work. So how can we assign tunnels to servers in a way that doesn’t require a centralized orchestrator or shared database? Enter consistent hashing, the single coolest distributed computing concept I got to apply this summer.

Every server sends a multicast “heartbeat” every few seconds. Then, by listening for multicast heartbeats, each server can construct a list of the IP addresses of peers known to be alive, including its own address, sorted by taking the hash of each address. Every server in a data center has the same list of peers in the same order.

When a server needs to decide which tunnels it is responsible for sending health checks to, the server simply hashes each tunnel to an integer and searches through the list of peer addresses to find the peer with the smallest hash greater than the tunnel’s hash, wrapping around to the first peer if no peer is found. The server is responsible for sending health checks to the tunnel when the assigned peer’s address equals the server’s address.

If a server stops sending messages for a long enough period of time, the server gets removed from the known peers list. As a consequence, the next time another server tries to hash a tunnel the removed peer was previously assigned, the tunnel simply gets reassigned to the next peer in the list.
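Putting those rules together, here is a condensed sketch of the assignment logic; the hash function, peer addresses, and tunnel names are all illustrative, not the production implementation:

    # Consistent hashing: assign each tunnel to the live peer with the
    # smallest hash greater than the tunnel's hash, wrapping around.
    import hashlib

    def h(value: str) -> int:
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def assigned_peer(tunnel: str, peers: list[str]) -> str:
        ring = sorted(peers, key=h)   # same order on every server
        t = h(tunnel)
        for peer in ring:
            if h(peer) > t:
                return peer
        return ring[0]                # wrap around to the first peer

    # Each server runs the same computation over the same heartbeat-derived
    # peer list, so all servers agree without a central coordinator.
    peers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]   # learned from heartbeats
    me = "10.0.0.2"
    tunnels = ["tunnel-a", "tunnel-b", "tunnel-c", "tunnel-d"]
    mine = [t for t in tunnels if assigned_peer(t, peers) == me]
    print(mine)   # the subset of tunnels this server health-checks

If a peer disappears from the heartbeat list, only the tunnels that hashed to it move to the next peer on the ring; every other assignment stays put.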

And like magic, we have devised a scheme to consistently assign tunnels to servers in a way that is resilient to server failures and does not require any extra coordination between servers beyond heartbeats. Now, the assigned server can send health checks way more frequently, compose more precise weather forecasts, and share those forecasts without being drowned out by the crowd.

Results

Releasing the new health check system globally reduced Magic Transit’s CPU usage by over 70% and memory usage by nearly 85%.

Memory usage (measured in terabytes):

[Chart: memory usage before and after the rollout]

CPU usage (measured in CPU-seconds per two minute interval, averaged over two days):

[Chart: CPU usage before and after the rollout]

Reducing the number of multicast messages means that servers can now keep up with the Internet weather, even in the largest data centers. We’re now poised for the next stage of Magic Transit’s growth, just in time for our two-year anniversary.

If you want to help build the future of networking, join our team.

Announcing Project Pangea: Helping Underserved Communities Expand Access to the Internet For Free

Post Syndicated from Marwan Fayed original https://blog.cloudflare.com/pangea/

Half of the world’s population has no access to the Internet, with many more limited to poor, expensive, and unreliable connectivity. This problem persists despite large levels of public investment, private infrastructure, and effort by local organizers.

Today, Cloudflare is excited to announce Project Pangea: a piece of the puzzle to help solve this problem. We’re launching a program that provides secure, performant, reliable access to the Internet for community networks that support underserved communities, and we’re doing it for free[1] because we want to help build an Internet for everyone.

What is Cloudflare doing to help?

Project Pangea is Cloudflare’s project to help bring underserved communities secure connectivity to the Internet through Cloudflare’s global and interconnected network.

Cloudflare is offering our suite of network services — Cloudflare Network Interconnect, Magic Transit, and Magic Firewall — for free to nonprofit community networks, local networks, or other networks primarily focused on providing Internet access to local underserved or developing areas. This service would dramatically reduce the cost for communities to connect to the Internet, with industry-leading security and performance functions built in:

  • Cloudflare Network Interconnect provides access to Cloudflare’s edge in 200+ cities across the globe through physical and virtual connectivity options.
  • Magic Transit acts as a conduit to and from the broader Internet and protects community networks by mitigating DDoS attacks within seconds at the edge.
  • Magic Firewall gives community networks access to a network-layer firewall as a service, providing further protection from malicious traffic.

We’ve learned from working with customers that pure connectivity is not enough to keep a network sustainably connected to the Internet. Malicious traffic, such as DDoS attacks, can target a network and saturate Internet service links, which can lead to providers aggressively rate limiting or even entirely shutting down incoming traffic until the attack subsides. This is why we’re including our security services in addition to connectivity as part of Project Pangea: no attacker should be able to keep communities closed off from accessing the Internet.

What is a community network?

Community networks have existed almost as long as commercial subscribership to the Internet that began with dial-up service. The Internet Society, or ISOC, describes community networks as happening “when people come together to build and maintain the necessary infrastructure for Internet connection.”

Most often, community networks emerge from need, in response to the lack or absence of available Internet connectivity. They consistently demonstrate success where public and private-sector initiatives have either failed or under-delivered. We’re not talking about stop-gap solutions here, either — community networks around the world have been providing reliable, sustainable, high-quality connections for years.

Many will operate only within their communities, but many others can grow, and have grown, to regional or national scale. The most common models of governance and operation are as not-for-profits or cooperatives, models that ensure reinvestment within the communities being served. For example, we see networks that reinvest their proceeds to replace Wi-Fi infrastructure with fibre-to-the-home.

Cloudflare celebrates these networks’ successes, and also the diversity of the communities that these networks represent. In that spirit, we’d like to dispel myths that we encountered during the launch of this program — many of which we wrongly assumed or believed to be true — because the myths turn out to be barriers that communities so often are forced to overcome. Community networks are built on knowledge sharing, and so we’re sharing some of that knowledge so others can help accelerate community projects and policies, rather than rely on the assumptions that impede progress.

Myth #1: Only very rural or remote regions are underserved and in need. It’s true that remote regions are underserved. It is also true that underserved regions exist within 10 km (about six miles) of large city centers, and even within the largest cities themselves, as evidenced by the existence of some of our launch partners.

Myth #2: Remote, rural, or underserved means low-income. This might just be the biggest myth of all. Rural and remote populations are often thriving communities that can afford service, but have no access. In contrast, urban community networks often emerge for egalitarian reasons: the access that is available is unaffordable to many.

Myth #3: Service is necessarily more expensive. This myth is sometimes expressed by statements such as, “if large service providers can’t offer affordable access, then no one can.” More than a myth, this is a lie. Community networks (including our launch partners) use novel governance and cost models to ensure that subscribers pay rates similar to the wider market.

Myth #4: Technical expertise is a hard requirement and is unavailable. There is a rich body of evidence and examples showing that, with small amounts of training and support, communities can build their own local networks cheaply and reliably with commodity hardware and non-specialist equipment.

These myths aside, there is one truth: the path to sustainability is hard. The start and initial growth of community networks often consists of volunteer time or grant funding, which are difficult to sustain in the long-term. Eventually the starting models need to transition to models of “willing to charge and willing to pay” — Project Pangea is designed to help fill this gap.

What is the problem?

Communities around the world can and have put up Wi-Fi antennas and laid their own fibre. Even so, and however well-connected the community is to itself, Internet services are prohibitively expensive — if they can be found at all.

Two elements are required to connect to the Internet, and each incurs its own cost:

  • Backhaul connections to an interconnection point — the connection point may be anything from a local cabinet to a large Internet exchange point (IXP).
  • Internet service is provided by a network that interfaces with the wider Internet and agrees to route traffic to and from it on behalf of the community network.

These are distinct elements. Backhaul service carries data packets along a physical link (a fibre cable or wireless medium). Internet service is separate and may be provided over that link, or at its endpoint.

The cost of Internet service for networks is both dominant and variable (with usage), so in most cases it is cheaper to purchase both as a bundle from service providers that also own or operate their own physical network. Telecommunications and energy companies are prime examples.

However, the operating costs and complexity of long-distance backhaul are significantly lower than the costs of Internet service. If reliable, high-capacity service were affordable, community networks could sustainably extend their knowledge and governance models to provide their own backhaul as well.

For all that community networks can build, establish, and operate, the one element entirely outside their control is the cost of Internet service — a problem that Project Pangea helps to solve.

Why does the problem persist?

On this subject, I (Marwan) can only share insights drawn from prior experience as a computer science professor and a co-founder of HUBS c.i.c., launched with talented professors and a network engineer. HUBS is a not-for-profit backhaul and Internet provider in Scotland. It is a cooperative of more than a dozen community networks, some serving communities with no roads in or out, across thousands of square kilometers along Scotland’s West Coast and Borders regions. As is true of many community networks, not least some of Pangea’s launch partners, HUBS is award-winning and engages in advocacy and policy.

During that time my co-founders and I engaged with research funders, economic development agencies, three levels of government, and so many communities that I lost track. After all that, the answer to the question is still far from clear. There are, however, noteworthy observations and experiences that stood out, and often came from surprising places:

  • Cables on the ground get chewed by animals that, small or large, might never be seen.
  • Burying power and Ethernet cables, even 15 centimeters below soil, makes no difference because (we think) animals are drawn by the electrical current.
  • Property owners sometimes need to be convinced that giving up 8 to 10 square meters for a small tower, in exchange for free Internet and community benefit, is a good thing.
  • Raising small towers, even ones no one will see, is sometimes blocked by legislation or regulation that assumes a private non-residential structure can only be a shed, or never taller than a shed.
  • Private fibre backbones installed with public funds are often inaccessible, or are priced by distance, even though the cost to light 100 meters of fibre is identical to the cost of lighting 1 km.
  • Civil service agencies may be enthusiastic, but are also cautious, even in the face of evidence. Be patient, suffer frustration, be more patient, and repeat. Success is possible.
  • If and where possible, it’s best to avoid attempts to deliver service where national telecommunications companies have plans to do so.
  • Never underestimate tidal fading — twice a day, wireless signals over water will be amazing, and will completely disappear. We should have known!

All anecdotes aside, the best policies and practices are non-trivial — but because of so many prior community efforts, and organizations such as ISOC, the APC, the A4AI, and more, the challenges and solutions are better understood than ever before.

How does a community network reach the Internet?

First, we’d like to honor the many organisations we’ve learned from who might say that there are no technical barriers to success. Connections within the community networks may be shaped by geographical features or regional regulations. For example, wireless lines of sight between antenna towers on personal property are guided by hills or restricted by regulations. Similarly, Ethernet cables and fibre deployments are guided by property ownership, digging rights, and the presence or migration of grazing animals that dig into soil and gnaw at cables — yes, they do, even small rabbits.

Once the community establishes its own area network, the connections to meet Internet services are more conventional, more familiar. In part, the choice is influenced or determined by proximity to Internet exchanges, PoPs, or regional fibre cabinet installations. The connections with community networks fall into three broad categories.

Colocation. A community network may be fortunate enough to have service coverage that overlaps with, or is near to, an Internet exchange point (IXP), as shown in the figure below. In this case a natural choice is to colocate a router within the exchange, near the Internet service provider’s router (labeled as Cloudflare in the figure). Our launch partner NYC Mesh connects in this manner. Unfortunately, because exchanges are most often located in urban settings, colocation is unavailable to many, if not most, community networks.

[Figure: a community network colocated with the Internet service router at an Internet exchange point]

Conventional point-to-point backhaul. Community networks that are remote must establish a point-to-point backhaul connection to the Internet exchange. This connection method is shown in the figure below, in which the community network from the previous figure has moved to the left and is joined by a physical long-distance link to the Internet service router that remains in the exchange on the right.

[Figure: a remote community network connected to the Internet exchange by point-to-point backhaul]

Point-to-point backhaul is familiar. If the infrastructure is available — and this is a big ‘if’ — then backhaul is most often available from a utility company, such as a telecommunications or energy provider, that may also bundle Internet service as a way to reduce total costs. Even bundled, the total cost is variable and unaffordable to individual community networks, and is exacerbated by distance. Some community networks have succeeded in acquiring backhaul through university, research and education, or publicly-funded networks that are compelled or convinced to offer the service in the public interest. On the west coast of Scotland, for example, Tegola launched with service from the University of Highlands and Islands and the University of Edinburgh.

Start a backhaul cooperative for point-to-point and colocation. The last connection option we see among our launch partners overcomes the prohibitive costs by forming a cooperative network in which the individual subscriber community networks are also members. The cooperative model can be seen in the figure below. The exchange remains on the right. On the left the community network in the previous figure is now replaced by a collection of community networks that may optionally connect with each other (for example, to establish reliable routing if any link fails). Either directly or indirectly via other community networks, each of these community networks has a connection to a remote router at the near-end of the point-to-point connection. Crucially, the point-to-point backhaul service — as well as the co-located end-points — are owned and operated by the cooperative. In this manner, an otherwise expensive backhaul service is made affordable by being a shared cost.

[Figure: a cooperative of community networks sharing point-to-point backhaul and colocation at the Internet exchange]

Two of our launch partners, Guifi.net and HUBS c.i.c., are organised this way and their 10+ years in operation demonstrate both success and sustainability. Since the backhaul provider is a cooperative, the community network members have a say in the ways that revenue is saved, spent, and — best of all — reinvested back into the service and infrastructure.

Why is Cloudflare doing this?

Cloudflare’s mission is to help build a better Internet, for everyone, not just those with privileged access based on their geographical location. Project Pangea aligns with this mission by extending the Internet we’re helping to build — a faster, more reliable, more secure Internet — to otherwise underserved communities.

How can my community network get involved?

Check out our landing page to learn more and apply for Project Pangea today.

The ‘community’ in Cloudflare

Lastly, in a blog post about community networks, we feel it is appropriate to acknowledge the ‘community’ at Cloudflare: Project Pangea is the culmination of multiple projects, and multiple peoples’ hours, effort, dedication, and community spirit. Many, many thanks to all.
______

[1] For eligible networks, free up to 5 Gbps at p95 levels.