Tag Archives: RPKI

Fall 2020 RPKI Update

Post Syndicated from Louis Poinsignon original https://blog.cloudflare.com/rpki-2020-fall-update/

Fall 2020 RPKI Update

The Internet is a network of networks. In order to find the path between two points and exchange data, network devices rely on information from their peers. This information consists of IP addresses and the Autonomous Systems (ASes) that announce them using the Border Gateway Protocol (BGP).

One problem arises from this design: what protects against a malevolent peer who decides to announce incorrect information? The damage caused by route hijacks can be major.

Resource Public Key Infrastructure (RPKI) is a framework created in 2008. Its goal is to provide a source of truth for Internet resources (IP addresses) and ASes in cryptographically signed records called Route Origin Authorizations (ROAs).

Recently, we’ve seen the significant threshold of two hundred thousand ROAs being passed. This represents a big step in making the Internet more secure against accidental and deliberate BGP tampering.

We have talked about RPKI in the past but we thought it would be a good time for an update.

In a more technical context, the RPKI framework consists of two parts:

  • IP addresses need to be cryptographically signed by their owners in a database managed by a Trust Anchor: Afrinic, APNIC, ARIN, LACNIC and RIPE. Those five organizations are in charge of allocating Internet resources. The ROA indicates which Network Operator is allowed to announce the addresses using BGP.
  • Network operators download the list of ROAs, perform the cryptographic checks and then apply filters on the prefixes they receive: this is called BGP Origin Validation (the decision logic is sketched in the example below).
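
To make the second bullet concrete, here is a minimal sketch in Python of the origin-validation decision described in RFC 6811, as a router (or validator) would apply it to each received route. It assumes the ROA list has already been downloaded and cryptographically verified; the ROA data at the bottom is illustrative only, not a real record.

import ipaddress
from dataclasses import dataclass

@dataclass
class ROA:
    prefix: str      # the signed prefix, e.g. "103.21.244.0/23"
    asn: int         # the origin AS authorized to announce it
    max_length: int  # the longest (most specific) announcement allowed

def origin_validation(route: str, origin_asn: int, roas: list[ROA]) -> str:
    announced = ipaddress.ip_network(route)
    covering = [r for r in roas
                if announced.subnet_of(ipaddress.ip_network(r.prefix))]
    if not covering:
        return "not-found"  # no ROA covers this space: accepted by default
    if any(r.asn == origin_asn and announced.prefixlen <= r.max_length
           for r in covering):
        return "valid"
    return "invalid"        # covered, but wrong origin AS or too specific

# Illustrative data only (a hypothetical ROA, not a real record):
roas = [ROA("103.21.244.0/23", 13335, 23)]
print(origin_validation("103.21.244.0/23", 13335, roas))  # valid
print(origin_validation("103.21.244.0/24", 13335, roas))  # invalid: /24 exceeds max length
print(origin_validation("103.21.244.0/23", 64500, roas))  # invalid: wrong origin AS

A route classified “invalid” is the one a validating network drops; “not-found” routes are still accepted, which is why signing prefixes matters as much as validating.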

The “Is BGP Safe Yet” website

The launch of the website isbgpsafeyet.com to test whether your ISP correctly performs BGP Origin Validation was a success. Since launch, it has been visited more than five million times from 223 countries and territories and 13,000 unique networks (20% of the entire Internet), generating half a million BGP Origin Validation tests.

Many providers subsequently indicated on social media (for example, here or here) that they had an RPKI deployment in the works. This increase in Origin Validation by networks is increasing the security of the Internet globally.

The site’s test for Origin Validation consists of queries towards two addresses, one behind an RPKI-invalid prefix and the other behind an RPKI-valid prefix. If the query towards the invalid prefix succeeds, the test fails, as the ISP does not implement Origin Validation. We counted the number of queries that failed to reach invalid.cloudflare.com. This also included a few thousand RIPE Atlas tests that were started by Cloudflare and various contributors, providing coverage for smaller networks.

Every month since launch we’ve seen that around 10 to 20 networks are deploying RPKI Origin Validation. Among the major providers we can build the following table:

Month Networks
August Swisscom (Switzerland), Salt (Switzerland)
July Telstra (Australia), Quadranet (USA), Videotron (Canada)
June Colocrossing (USA), Get Norway (Norway), Vocus (Australia), Hurricane Electric (Worldwide), Cogent (Worldwide)
May Sengked Fiber (Indonesia), Online.net (France), WebAfrica Networks (South Africa), CableNet (Cyprus), IDnet (Indonesia), Worldstream (Netherlands), GTT (Worldwide)

With the help of many contributors, we have compiled a list of network operators and public statements at the top of the isbgpsafeyet.com page.

We excluded providers that manually blocked the traffic towards the prefix instead of using RPKI. Among the techniques we see are firewall filtering and manual prefix rejection. The filtering is often propagated to other customer ISPs. In a unique case, an ISP generated a “more-specific” blackhole route that leaked to multiple peers over the Internet.

The deployment of RPKI by major transit providers, also known as Tier 1 networks, such as Cogent, GTT, Hurricane Electric, NTT and Telia, made many downstream networks more secure without them having to deploy validation software themselves.

Overall, we looked at the evolution of successful tests per ASN and noticed a steady increase of 8% over recent months.

Fall 2020 RPKI Update

Furthermore, when we probed the entire IPv4 space this month, using a similar technique to the isbgpsafeyet.com test, many more networks were unable to reach an RPKI invalid prefix than during the same period last year. This confirms an increase in RPKI Origin Validation deployment across all network operators. The picture below shows the IPv4 space behind a network with RPKI Origin Validation enabled in yellow and the active space in blue. It uses a Hilbert curve to plot IP addresses efficiently: for example, one /20 prefix (4,096 IPs) is a single pixel, and a /16 prefix (65,536 IPs) forms a 4×4 pixel square.
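
For readers curious how such a map is indexed, here is a minimal sketch of the textbook iterative Hilbert curve conversion, under the same assumption as the figure: one cell per /20, which for the whole IPv4 space gives 2^20 cells on a 1024×1024 grid. This illustrates the technique; it is not the exact code behind the picture.

import ipaddress

def hilbert_d2xy(n: int, d: int) -> tuple[int, int]:
    # Convert distance d along a Hilbert curve into (x, y) on an n-by-n grid
    # (n must be a power of two). Standard iterative algorithm.
    x = y = 0
    s = 1
    while s < n:
        rx = 1 & (d // 2)
        ry = 1 & (d ^ rx)
        if ry == 0:                        # rotate the quadrant when needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        d //= 4
        s *= 2
    return x, y

# One cell per /20: index a prefix by its first address divided by 2**12.
prefix = ipaddress.ip_network("103.21.244.0/20", strict=False)
cell = int(prefix.network_address) >> 12
print(hilbert_d2xy(1024, cell))  # pixel coordinates on the 1024x1024 map

The appeal of the Hilbert curve is locality: adjacent IP ranges land in adjacent pixels, so whole allocations show up as contiguous squares rather than scattered dots.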

The more the yellow spreads, the safer the Internet becomes.

Fall 2020 RPKI Update

What does it mean exactly? If you were hijacking a prefix, the users behind the yellow space would likely not be affected. This also applies if you mis-sign your prefixes: you would not be able to reach the services or users behind the yellow space. Once RPKI is enabled everywhere, there will only be yellow squares.

Progression of signed prefixes

Owners of IP addresses indicate the networks allowed to announce them. They do this by signing prefixes: they create Route Origin Authorizations (ROAs). As of today, there are more than 200,000 ROAs. The distribution shows that the RIPE region is still leading in ROA count, followed by the APNIC region.

Fall 2020 RPKI Update

2020 started with 172,000 records, and the count is getting close to 200,000 at the beginning of November, approximately a quarter of all Internet routes. Since last year, the database of ROAs has grown by more than 70 percent, an average pace of roughly 5% every month.

On the following graph of unique ROA count per day, we can see two inflection points, each followed by a change in the ROA creation rate: 140/day, then 231/day, and, since August, 351 new ROAs per day.

It is not yet clear what caused the increase in August.

Fall 2020 RPKI Update

Free services and software

In 2018 and 2019, Cloudflare was impacted by BGP route hijacks. Both could have been avoided with RPKI. Not long after the first incident, we started signing prefixes and developing RPKI software. It was necessary to make BGP safer and we wanted to do more than talk about it. But we also needed enough networks to deploy RPKI. By making deployment easier for everyone, we hoped to increase adoption.

The following is a reminder of what we built over the years around RPKI and how it grew.

OctoRPKI is Cloudflare’s open source RPKI Validation software. It periodically generates a JSON document of validated prefixes that we pass onto our routers using GoRTR. It generates most of the data behind the graphs here.

The latest version of OctoRPKI, 1.2.0, was released at the end of October. It implements important security fixes, better memory management, and extended logging. This is the first validator to report detailed information about cryptographically invalid records to Sentry and performance data to distributed tracing tools.
GoRTR remains heavily used in production, including by transit providers. It can natively connect to other validators like rpki-client.

When we released our public rpki.json endpoint in early 2019, the idea was to enable anyone to see what Cloudflare was filtering.

The file is also used as a bootstrap by GoRTR, so that users can test a deployment. The file is cached on more than 200 data centers, ensuring quick and secure delivery of a list of valid prefixes, making RPKI more accessible for smaller networks and developers.
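
As a sketch of how the file can be consumed for ad-hoc analysis, the snippet below fetches it and counts payloads per origin ASN. The field names (a top-level roas list with prefix, asn and maxLength entries) reflect the format at the time of writing; verify them against the live file before relying on this.

import json
import urllib.request
from collections import Counter

with urllib.request.urlopen("https://rpki.cloudflare.com/rpki.json") as resp:
    data = json.load(resp)

roas = data["roas"]  # assumed key, see note above
print(len(roas), "validated ROA payloads")

# Top origin ASNs by number of signed prefixes.
for asn, count in Counter(str(r["asn"]) for r in roas).most_common(5):
    print(asn, count)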

Between March 2019 and November 2020, the number of queries more than doubled and there are five times more networks querying this file.

The growth of queries follows approximately the rate of ROA creation (~5% per month).

Fall 2020 RPKI Update

A public RTR server is also available at rtr.rpki.cloudflare.com. It includes a plaintext endpoint on port 8282 and an SSH endpoint on port 8283. This allows us to test new versions of GoRTR before release.

Later in 2019, we also built a public dashboard where you can see in-depth RPKI validation. With a GraphQL API, you can now explore the validation data, test a list of prefixes, or see the status of the current routing table.

Fall 2020 RPKI Update

Currently, the API is used by BGPalerter, an open-source tool that detects routing issues (including hijacks!) from a stream of BGP updates.

Additionally, starting in November, you can access the historical data from May 2019. Data is computed daily and contains the unique records. The team behind the dashboard worked hard to provide a fast and accurate visualization of the daily ROA changes and the volumes of files changed over the day.

Fall 2020 RPKI Update

The future

We believe RPKI is going to continue growing, and we would like to thank the hundreds of network engineers around the world who are making Internet routing more secure by deploying RPKI.

25% of routes are signed and 20% of the Internet is doing Origin Validation, and those numbers grow every day. We believe BGP will be safer well before deployment reaches 100%; for instance, once the remaining transit providers enable Origin Validation, it is unlikely a BGP hijack will make it to the front page of world news outlets.

While difficult to quantify, we believe that critical mass of protected resources will be reached in late 2021.

We will keep improving the tooling; OctoRPKI and GoRTR are open-source and we welcome contributions. In the near future, we plan on releasing a packaged version of GoRTR that can be directly installed on certain routers. Stay tuned!

Is BGP Safe Yet? No. But we are tracking it carefully

Post Syndicated from Louis Poinsignon original https://blog.cloudflare.com/is-bgp-safe-yet-rpki-routing-security-initiative/

Is BGP Safe Yet? No. But we are tracking it carefully

BGP leaks and hijacks have been accepted as an unavoidable part of the Internet for far too long. We relied on protection at the upper layers like TLS and DNSSEC to ensure an untampered delivery of packets, but a hijacked route often results in an unreachable IP address, which in turn means an Internet outage.

The Internet is too vital to allow this known problem to continue any longer. It’s time networks prevented leaks and hijacks from having any impact. It’s time to make BGP safe. No more excuses.

Border Gateway Protocol (BGP), the protocol used to exchange routes, has existed and evolved since the 1980s. Over the years it has gained security features. The most notable security addition is Resource Public Key Infrastructure (RPKI), a security framework for routing. It has been the subject of a few blog posts following our deployment in mid-2018.

Today, the industry considers RPKI mature enough for widespread use, with a sufficient ecosystem of software and tools, including tools we’ve written and open sourced. We have fully deployed Origin Validation on all our BGP sessions with our peers and signed our prefixes.

However, the Internet can only be safe if the major network operators deploy RPKI. Those networks have the ability to spread a leak or hijack far and wide, and it’s vital that they take part in stamping out the scourge of BGP problems, whether inadvertent or deliberate.

Networks like AT&T and Telia pioneered global deployments of RPKI in 2019. They were successfully followed by Cogent and NTT in 2020. Hundreds of networks of all sizes have done a tremendous job over the last few years, but there is still work to be done.

If we observe the customer-cones of the networks that have deployed RPKI, we see around 50% of the Internet is more protected against route leaks. That’s great, but it’s nothing like enough.

Is BGP Safe Yet? No. But we are tracking it carefully

Today, we are releasing isBGPSafeYet.com, a website to track deployments and filtering of invalid routes by the major networks.

We are hoping this will help the community and we will crowdsource the information on the website. The source code is available on GitHub; we welcome suggestions and contributions.

We expect this initiative will make RPKI more accessible to everyone and ultimately will reduce the impact of route leaks. Share the message with your Internet Service Providers (ISPs), hosting providers, and transit networks to build a safer Internet.

Additionally, to monitor and test deployments, we decided to announce two bad prefixes from our 200+ data centers and via the 233+ Internet Exchange Points (IXPs) we are connected to:

  • 103.21.244.0/24
  • 2606:4700:7000::/48

Both these prefixes should be considered invalid and should not be routed by your provider if RPKI is implemented within their network. This makes it easy to demonstrate how far a bad route can go, and test whether RPKI is working in the real world.

Is BGP Safe Yet? No. But we are tracking it carefully
A Route Origin Authorization for 103.21.244.0/24 on rpki.cloudflare.com

In the test you can run on isBGPSafeYet.com, your browser will attempt to fetch two pages: the first one, valid.rpki.cloudflare.com, is behind an RPKI-valid prefix, and the second one, invalid.rpki.cloudflare.com, is behind the RPKI-invalid prefix.

The test has two outcomes:

  • If both pages were correctly fetched, your ISP accepted the invalid route. It does not implement RPKI.
  • If only valid.rpki.cloudflare.com was fetched, your ISP implements RPKI. You will be less sensitive to route leaks. (A command-line version of this check is sketched below.)
Is BGP Safe Yet? No. But we are tracking it carefully
a simple test of RPKI invalid reachability
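
The same check is easy to reproduce outside a browser. Here is a minimal command-line sketch in Python; it assumes both test hostnames serve over HTTPS, and the timeout value is an arbitrary choice.

import urllib.request

def reachable(url: str, timeout: float = 5.0) -> bool:
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except OSError:  # DNS failure, timeout, connection refused...
        return False

valid = reachable("https://valid.rpki.cloudflare.com")
invalid = reachable("https://invalid.rpki.cloudflare.com")

if valid and not invalid:
    print("Your ISP appears to drop RPKI-invalid routes.")
elif valid and invalid:
    print("Your ISP accepted the invalid route: no Origin Validation.")
else:
    print("Inconclusive: the valid prefix was unreachable.")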

We will be performing tests using those prefixes to check for propagation. Traceroutes and probing helped us in the past by creating visualizations of deployment.

A simple indicator is the number of networks sending the accepted route to their peers and collectors:

Is BGP Safe Yet? No. But we are tracking it carefully
Routing status from online route collection tool RIPE Stat

In December 2019, we released a Hilbert curve map of the IPv4 address space. Every pixel represents a /20 prefix. If a dot is yellow, the prefix responded only to the probe from an RPKI-valid IP space. If it is blue, the prefix responded to probes from both RPKI-valid and invalid IP space.

To summarize, the yellow areas are IP space behind networks that drop RPKI invalid prefixes. The Internet isn’t safe until the blue becomes yellow.

Is BGP Safe Yet? No. But we are tracking it carefully
Hilbert Curve Map of IP address space behind networks filtering RPKI invalid prefixes

Last but not least, we would like to thank every network that has already deployed RPKI and every developer that contributed to validator-software code bases. The last two years have shown that the Internet can become safer and we are looking forward to the day where we can call route leaks and hijacks an incident of the past.

RPKI and the RTR protocol

Post Syndicated from Martin J Levy original https://blog.cloudflare.com/rpki-and-the-rtr-protocol/

RPKI and the RTR protocol

Today’s Internet requires stronger protection within its core routing system and as we have already said: it’s high time to stop BGP route leaks and hijacks by deploying operationally-excellent RPKI!

Luckily, over the last year or so, a lot of good work has happened in this arena. If you’ve been following the growth of RPKI’s validation data, then you’ll know that more and more networks are signing their routes and creating ROAs, or Route Origin Authorizations. These are cryptographically signed assertions of the validity of an announced IP block and contribute to further securing the global routing table that makes for a safer Internet.

The protocol that we have not written much about is RTR, the Resource Public Key Infrastructure (RPKI) to Router Protocol. Today we’re fixing that.

RPKI rewind

We have written a few times about RPKI (here and here). We have written about how Cloudflare both signs its announced routes and filters its routing inbound from other networks (both transits and peers) using RPKI data. We also added our efforts in the open-source software space with the release of the Cloudflare RPKI Toolkit.

The primary part of the RPKI (Resource Public Key Infrastructure) system is a cryptographically signed database which is read and processed by an RPKI validator. The validator works with the published ROAs to build a list of validated routes. A ROA consists of an IP address block plus an ASN (Autonomous System Number) that together define who can announce which IP block.

RPKI and the RTR protocol

After that step, it is then the job of that validator (or some associated software module) to communicate its list of valid routes to an Internet router. That’s where the RTR protocol (the RPKI to Router Protocol) comes in. Its job is to communicate between the validator and the device in charge of allowing or rejecting routes in its table.

RTR

The IETF defines the RTR protocol in RFC 8210. This blog post focuses on version 1 and ignores previous versions.

In order to verifiably validate the origin Autonomous Systems and Autonomous System Paths of BGP announcements, routers need a simple but reliable mechanism to receive Resource Public Key Infrastructure (RFC 6480) prefix origin data and router keys from a trusted cache. This document describes a protocol to deliver them.

This document describes version 1 of the RPKI-Router protocol.

The Internet’s routers are, to put it bluntly, not the best place to run a routing table’s cryptographic processing. The RTR protocol allows the heavy lifting to be done outside of the valuable processing modules that routers have. RTR is a very lightweight protocol with a low memory footprint. The router simply decides yay-or-nay when a route is received (called “announce” in BGP speak) and hence never needs to touch the complex cryptographic validation algorithms. RTR also provides some isolation between the router and the outside world, where certificates need to be fetched from across the globe and then stored, checked, and processed locally. In many cases the control plane (where RTR communication happens) exists on private or protected networks. Separation is a good thing.

Cloudflare’s open-source software for RPKI validation also includes GoRTR, an implementation of the RTR protocol. As mentioned, in Cloudflare’s operational model, we separate the validation (done with OctoRPKI) from the RTR process.

RPKI and the RTR protocol

RTR protocol implementations are also provided in other RPKI validation software packages. In fact, RPKI is unable to filter routes without the final step of running RTR (or something similar – should it exist). Here’s a current list of RPKI software packages that either validate or validate and run RTR.

Each of these open source software packages has its own specific database model and operational methods. Because GoRTR reads a somewhat common JSON file format, you can mix and match between different validators and GoRTR’s code.

The RTR protocol

The protocol’s core is all about synchronizing a database between a validator and a router. This is done using serial-numbers and session-ids.

It’s kicked off with a router setting up a TCP connection towards a backend RTR server followed by a series of serial-number exchanges and data records exchanges such that a cache on the validator (or RTR server) can be synced fully with a cache on the router. As mentioned, the lightweight protocol is void of all the cryptographic data that RPKI is built upon and simply deals with the validated routing list, which consists of CIDRs, ASNs and maybe a MaxLength parameter.
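
To make the wire format concrete, here is a minimal sketch in Python of the start of an RTR v1 session against the public cache Cloudflare operates (used again below with rpki-rtr-client): it sends a Reset Query and collects IPv4 prefix records until End of Data. The PDU layouts follow RFC 8210; error handling and the IPv6 PDUs are omitted, so treat this as a toy client rather than a replacement for GoRTR.

import socket
import struct

RESET_QUERY, IPV4_PREFIX, END_OF_DATA = 2, 4, 7  # PDU type codes (RFC 8210)

with socket.create_connection(("rtr.rpki.cloudflare.com", 8282)) as s:
    # PDU header: version (1 byte), type (1), session id or zero (2), length (4).
    s.sendall(struct.pack("!BBHI", 1, RESET_QUERY, 0, 8))
    vrps = []
    while True:
        version, pdu_type, session_id, length = struct.unpack(
            "!BBHI", s.recv(8, socket.MSG_WAITALL))
        body = s.recv(length - 8, socket.MSG_WAITALL) if length > 8 else b""
        if pdu_type == IPV4_PREFIX:
            # Body: flags (1), prefix length (1), max length (1), zero (1),
            # prefix (4), origin ASN (4).
            flags, plen, maxlen, _, prefix, asn = struct.unpack("!BBBB4sI", body)
            vrps.append((socket.inet_ntoa(prefix), plen, maxlen, asn))
        elif pdu_type == END_OF_DATA:
            break
print(len(vrps), "IPv4 validated ROA payloads received")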

Here’s a simple Cisco configuration for enabling RTR on a router:

router bgp 65001
 rpki server 192.168.1.100
  transport tcp port 8282
 !
!

The configuration can take additional parameters in order to enable SSH or similar transport options. Other platforms (such as Juniper, Arista, Bird 2.0, etc) have their own specific configuration language.

The RTR protocol supports IPv4 and IPv6 routing information (as you would expect).

RPKI and the RTR protocol

Being specified as a lightweight protocol, RTR allows the data to be transferred quickly. With a session-id created by the RTR cache server plus serial-numbers exchanged between cache servers and routers, there’s the solid ability for route authentication data on the router to be kept fresh with a minimum amount of actual data being transferred. Remember, as we said above, the router has much better things to do with its control plane processor like running the BGP convergence algorithm, or SRv6, or ISIS, or any of the protocols needed to manage routing tables.
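
Building on the PDU layout sketched above, the refresh cycle is just as small on the wire: once the router holds the session-id and serial-number from an End of Data PDU, it polls with a Serial Query and receives only the delta (or a Cache Reset if the cache cannot serve it). A sketch of the 12-byte Serial Query, following RFC 8210:

import struct

def serial_query(session_id: int, serial: int) -> bytes:
    # Header (version 1, PDU type 1, session id, total length 12)
    # followed by the last serial number this router has seen.
    return struct.pack("!BBHII", 1, 1, session_id, 12, serial)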

Is RTR a weak link in the RPKI story?

All aspects of RPKI data processing are built around solid cryptographic principles. The five RIRs each hold a root key called a Trust Anchor (TA). Each publishes data fully signed up/down so that every piece of information can be proven to be correct and untampered with. A validator’s job is to do that processing and spit out (or store) a list of valid ROAs (Route Origin Authorizations): assertions traceable back to a known source. If you want to study this protocol, you can start with RFC6480 and work forward through all the other relevant RFCs (hint: it’s at least thirty more RFCs from RFC6483 through RFC8210 and counting).

However, RTR does not carry that trust through to the Internet router. All that complexity (and hence those assertions) is stripped away before a router sees anything. It is 100% up to the network operator to build a reliable and secure path between validator or RTR cache and router so that this lightweight transfer is still trusted.

RTR helps somewhat in this space: the RFC defines more than one way to communicate between cache server and router.

  • A plain TCP connection (which is clearly insecure). In this case the RFC states: “the cache and routers MUST be on the same trusted and controlled network.”.
  • A TCP connection with TCP-AO transport.
  • A Secure Shell version 2 (SSHv2) transport.
  • A TCP connection with TCP MD5 transport (which is already obsoleted by TCP-AO).
  • A TCP connection over IPsec transport.
  • Transport Layer Security (TLS) transport.

This plethora of options is all well and good; however, there’s no useful implementation of TCP-AO out in the production world and hence (ironically) a lot of early implementations are living with plain-text communications. SSH and TLS are much better options; however, this comes with classic operational problems to solve. For example, in SSH’s case, the RFC states:

It is assumed that the router and cache have exchanged keys out of band by some reasonably secured means.

For a TLS connection, there’s also some worthwhile security setup mentioned in the RFC. It starts off as follows:

Client routers using TLS transport MUST present client-side certificates to authenticate themselves to the cache in order to allow the cache to manage the load by rejecting connections from unauthorized routers.

Then the RFC continues with enough information to secure the connection fully. If implemented correctly, then security is correctly provided between RTR cache and router such that no MITM attack can take place.

Assuming that these operational issues are handled fully, the RTR protocol is a perfect protocol for operationally implementing RPKI’s final linkage into the routers.

Testing the RTR protocol and open-source rpki-rtr-client

A modern router software stack can be configured to run RTR against a cache. If you have a test lab (as most modern networks do), then you have all you need to see RPKI route filtering (and the dropping of invalid routes).

However, if you are without a router and want to see RTR in action, Cloudflare has just placed rpki-rtr-client on GitHub. This software, written in Python, performs the router portion of the RTR protocol and comes with enough debug output that it can also be used to help write new RTR caches, or test existing code bases. The code was written directly from the RFC and then tested against a public RTR cache that Cloudflare operates.

$ pip3 install netaddr
...
$ git clone https://github.com/cloudflare/rpki-rtr-client.git
...
$ cd rpki-rtr-client
$

Operating the client is easy (and doubly so if you use the Cloudflare-provided cache).

$ ./rtr_client.py -h rtr.rpki.cloudflare.com -p 8282
...
^C
$

As there is no router (and hence no dropping of invalids) this code simply creates data files for later review. See the README file for more information.

$ ls -lt data/2020-02
total 21592
-rw-r--r--  1 martin martin  5520676 Feb 16 18:22 2020-02-17-022209.routes.00000365.json
-rw-r--r--  1 martin martin  5520676 Feb 16 18:42 2020-02-17-024242.routes.00000838.json
-rw-r--r--  1 martin martin      412 Feb 16 19:56 2020-02-17-035645.routes.00000841.json
-rw-r--r--  1 martin martin      272 Feb 16 20:16 2020-02-17-041647.routes.00000842.json
-rw-r--r--  1 martin martin      643 Feb 16 20:36 2020-02-17-043649.routes.00000843.json
$

As the RTR protocol communicates and increments its serial-number, the rpki-rtr-client software writes the routing information to a fresh file for later review.

$ for f in data/2020-02/*.json ; do echo "$f `jq -r '.routes.announce[]|.ip' < $f | wc -l` `jq -r '.routes.withdraw[]|.ip' < $f | wc -l`" ; done
data/2020-02/2020-02-17-022209.routes.00000365.json   128483        0
data/2020-02/2020-02-17-024242.routes.00000838.json   128483        0
data/2020-02/2020-02-17-035645.routes.00000841.json        3        6
data/2020-02/2020-02-17-041647.routes.00000842.json        5        0
data/2020-02/2020-02-17-043649.routes.00000843.json        9        5
$

Valid ROAs are listed as follows:

$ jq -r '.routes.announce[]|.ip,.asn,.maxlen' data/2020-02/*0838.json | paste - - - | sort -V | head
1.0.0.0/24      13335   null
1.1.1.0/24      13335   null
1.9.0.0/16      4788    24
1.9.12.0/24     65037   null
1.9.21.0/24     24514   null
1.9.23.0/24     65120   null
1.9.31.0/24     65077   null
1.9.65.0/24     24514   null
1.34.0.0/15     3462    24
1.36.0.0/16     4760    null
$

The code can also dump the raw binary protocol and then replay that data to debug the protocol.

As the code is on GitHub, any protocol developer can feel free to expand on the code.

Future of RTR protocol

The present RFC defines version 1 of the protocol, and it is expected that the protocol will progress to include additional functions while staying lightweight. RPKI is a Route Origin Validation protocol (i.e. it maps an IP route or CIDR to an ASN). It does not provide support for validating the AS-PATH. Neither does it provide any support for IRR databases (which are non-cryptographically-signed routing definitions). Presently IRR data is the primary method used for filtering routing on the global Internet. Today that is done by building massive filter lists within a router’s configuration file and not via a lightweight protocol like RTR.

At the present time there’s an IETF proposal for RTR version 2. This is draft work that sits alongside the ASPA (Autonomous System Provider Authorization) drafts. These draft documents from Alexander Azimov et al. define ASPA, extending the RPKI data structures to handle BGP path information. Version 2 of the RTR protocol should provide the required messaging to move ASPA data into the router.

RPKI and the RTR protocol

Additionally, RPKI will potentially expand further, at some point, beyond today’s singular data type (the ROA object). Just like with the ASPA draft, RTR will need to advance in lock-step. Hopefully the open-source code we have published will help this effort.

Some final thoughts on RTR and RPKI

If RPKI is to become ubiquitous, then RTR support in all BGP speaking Internet routers is going to be required. Vendors need to complete their RTR software delivery and additionally support some of the more secure transport definitions from the RFC. Additionally, should the protocol advance, then timely support for the new version will be needed.

Cloudflare continues to be committed to a secure Internet. If you share these thoughts and like what you’ve read here or elsewhere on our blog, please take a look at our jobs page. We have software and network engineering roles open in many of our offices around the world.

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

Post Syndicated from Martin J Levy original https://blog.cloudflare.com/the-deep-dive-into-how-verizon-and-a-bgp-optimizer-knocked-large-parts-of-the-internet-offline-monday/

A recap on what happened Monday

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

On Monday we wrote about a painful Internet wide route leak. We wrote that this should never have happened because Verizon should never have forwarded those routes to the rest of the Internet. That blog entry came out around 19:58 UTC, just over seven hours after the route leak finished (which, as we will see below, was around 12:39 UTC). Today we will dive into the archived routing data and analyze it. The format of the code below is meant to use simple shell commands so that any reader can follow along and, more importantly, do their own investigations on the routing tables.

This was a very public BGP route leak event. It was both reported online via many news outlets and the event’s BGP data was reported via social media as it was happening. Andree Toonk tweeted a quick list of 2,400 ASNs that were affected.


Using RIPE NCC archived data

The RIPE NCC operates a very useful archive of BGP routing. It runs collectors globally and provides an API for querying the data. More can be seen at https://stat.ripe.net/. In the world of BGP, all routing is public (within the ability of anyone collecting data to have enough collection points). The archived data is very valuable for research and that’s what we will do in this blog. The site can create some very useful data visualizations.

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

Dumping the RIPEstat data for this event

Presently, the RIPEstat data gets ingested around eight to twelve hours after real-time. It’s not meant to be a real-time service. The data can be queried in many ways, including a full web interface and an API. We are using the API to extract the data in a JSON format.

We are going to focus only on the Cloudflare routes that were leaked. Many other ASNs were leaked (see the tweet above); however, we want to deal with a finite data set and focus on what happened to Cloudflare’s routing. All the commands below can be run with ease on many systems. Both the scripts and the raw data file are available on GitHub. The following was done on a MacBook Pro running macOS Mojave.

First we collect 24 hours of route announcements and AS-PATH data that RIPEstat sees coming from AS13335 (Cloudflare).

$ # Collect 24 hours of data - more than enough
$ ASN="AS13335"
$ START="2019-06-24T00:00:00"
$ END="2019-06-25T00:00:00"
$ ARGS="resource=${ASN}&starttime=${START}&endtime=${END}"
$ URL="https://stat.ripe.net/data/bgp-updates/data.json?${ARGS}"
$ # Fetch the data from RIPEstat
$ curl -sS "${URL}" | jq . > 13335-routes.json
$ ls -l 13335-routes.json
-rw-r--r--  1 martin  staff  339363899 Jun 25 08:47 13335-routes.json
$

That’s 340MB of data – which seems like a lot, but it contains plenty of white space and plenty of data we just don’t need. Our second task is to reduce this raw data down to just the required data – that’s timestamps, actual routes, and AS-PATH. The third item will be very useful. Note we are using jq, which can be installed on macOS with the brew package manager.

$ # Extract just the times, routes, and AS-PATH
$ jq -rc '.data.updates[]|.timestamp,.attrs.target_prefix,.attrs.path' < 13335-routes.json | paste - - - > 13335-listing-a.txt
$ wc -l 13335-listing-a.txt
691318 13335-listing-a.txt
$

We are down to just below seven hundred thousand routing events, however, that’s not a leak, that’s everything that includes Cloudflare’s ASN (the number 13335 above). For that we need to go back to Monday’s blog and realize it was AS396531 (Allegheny Technologies) that showed up with 701 (Verizon) in the leak. Now we reduce the data further:

$ # Extract the route leak 701,396531
$ # AS701 is Verizon and AS396531 is Allegheny Technologies
$ egrep '701,396531' < 13335-listing-a.txt > 13335-listing-b.txt
$ wc -l 13335-listing-b.txt
204568 13335-listing-b.txt
$

At 204 thousand data points, we are looking better. It’s still a lot of data because BGP can be very chatty if topology is changing. A route leak will cause exactly that. Now let’s see how many routes were affected:

$ # Extract the actual routes affected by the route leak
$ cut -f2 < 13335-listing-b.txt | sort -V -u > 13335-listing-c.txt
$ wc -l 13335-listing-c.txt
101 13335-listing-c.txt
$

It’s a much smaller number. We now have a listing of at least 101 routes that were leaked via Verizon. This may not be the full list because route collectors like RIPEstat don’t have direct feeds from Verizon, so this data is a blended view with Verizon’s path and other paths. We can see that if we look at the AS-PATH in the above files. Here’s a partial listing of affected routes.

$ cat 13335-listing-c.txt
8.39.214.0/24
8.42.245.0/24
8.44.58.0/24
...
104.16.80.0/21
104.17.168.0/21
104.18.32.0/21
104.19.168.0/21
104.20.64.0/21
104.22.8.0/21
104.23.128.0/21
104.24.112.0/21
104.25.144.0/21
104.26.0.0/21
104.27.160.0/21
104.28.16.0/21
104.31.0.0/21
141.101.120.0/23
162.159.224.0/21
172.68.60.0/22
172.69.116.0/22
...
$

This is an interesting list, as some of these routes do not originate from Cloudflare’s network; however, they show up with AS13335 (our ASN) as the originator. For example, the 104.26.0.0/21 route is not announced from our network, but we do announce 104.26.0.0/20 (which covers that route). More importantly, we have an IRR (Internet Routing Registries) route object plus an RPKI ROA for that block. Here’s the IRR object:

route:          104.26.0.0/20
origin:         AS13335
source:         ARIN

And here’s the RPKI ROA. This ROA has Max Length set to /20, so no smaller route should be accepted.

Prefix:       104.26.0.0/20
Max Length:   /20
ASN:          13335
Trust Anchor: ARIN
Validity:     Thu, 02 Aug 2018 04:00:00 GMT - Sat, 31 Jul 2027 04:00:00 GMT
Emitted:      Thu, 02 Aug 2018 21:45:37 GMT
Name:         535ad55d-dd30-40f9-8434-c17fc413aa99
Key:          4a75b5de16143adbeaa987d6d91e0519106d086e
Parent Key:   a6e7a6b44019cf4e388766d940677599d0c492dc
Path:         rsync://rpki.arin.net/repository/arin-rpki-ta/5e4a23ea-...

The Max Length field in an ROA sets the longest (most specific) prefix that may be announced, in other words the minimum size of an acceptable announcement. The fact that this is a /20 route with a /20 Max Length says that a /21 (or /22 or /23 or /24) within this IP space isn’t allowed. Looking further at the route list above we get the following listing:

Route Seen            Cloudflare IRR & ROA    ROA Max Length
104.16.80.0/21    ->  104.16.80.0/20          /20
104.17.168.0/21   ->  104.17.160.0/20         /20
104.18.32.0/21    ->  104.18.32.0/20          /20
104.19.168.0/21   ->  104.19.160.0/20         /20
104.20.64.0/21    ->  104.20.64.0/20          /20
104.22.8.0/21     ->  104.22.0.0/20           /20
104.23.128.0/21   ->  104.23.128.0/20         /20
104.24.112.0/21   ->  104.24.112.0/20         /20
104.25.144.0/21   ->  104.25.144.0/20         /20
104.26.0.0/21     ->  104.26.0.0/20           /20
104.27.160.0/21   ->  104.27.160.0/20         /20
104.28.16.0/21    ->  104.28.16.0/20          /20
104.31.0.0/21     ->  104.31.0.0/20           /20
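
The table’s reasoning can be checked mechanically. Here is a minimal sketch in Python using the ROA shown above (104.26.0.0/20, AS13335, Max Length /20) against the leaked /21:

from ipaddress import ip_network

roa_prefix, roa_maxlen = ip_network("104.26.0.0/20"), 20

leaked = ip_network("104.26.0.0/21")          # the synthesized /21 seen in the leak
covered = leaked.subnet_of(roa_prefix)        # True: it falls inside the ROA prefix
too_specific = leaked.prefixlen > roa_maxlen  # True: /21 exceeds Max Length /20

print("RPKI invalid" if covered and too_specific else "check the origin ASN")

Any network doing RPKI Origin Validation would therefore have dropped every /21 in the table above, regardless of the origin AS in the announcement.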

So how did all these /21’s show up? That’s where we dive into the world of BGP route optimization systems and their propensity to synthesize routes that should not exist. If those routes leak (and it’s very clear after this week that they can), all hell breaks loose. That can be compounded when not one, but two ISPs allow invalid routes to be propagated outside their autonomous network. We will explore the AS-PATH further down this blog.

More than 20 years ago, RFC1997 added the concept of “communities” to BGP. Communities are a way of tagging or grouping route advertisements. Communities are often used to label routes so that specific handling policies can be applied. RFC1997 includes a small number of universal “well-known” communities. One of these is the NO_EXPORT community, which has the following specification:

    All routes received carrying a communities attribute
    containing this value MUST NOT be advertised outside a BGP
    confederation boundary (a stand-alone autonomous system that
    is not part of a confederation should be considered a
    confederation itself).

The use of the NO_EXPORT community is very common within BGP enabled networks and is a community tag that would have helped alleviate this route leak immensely.

How BGP route optimization systems work (or don’t work in this case) can be a subject for a whole other blog entry.

Timing of the route leak

As we saved away the timestamps in the JSON file and in the text files, we can confirm the time for every route in the route leak by looking at the first and the last timestamp of a route in the data. We saved data from 00:00:00 UTC until 00:00:00 the next day, so we know we have covered the period of the route leak. We write a script that checks the first and last entry for every route and reports the information sorted by start time:

$ while read cidr
do
  echo $cidr
  fgrep $cidr < 13335-listing-b.txt | head -1 | cut -f1
  fgrep $cidr < 13335-listing-b.txt | tail -1 | cut -f1
done < 13335-listing-c.txt |\
paste - - - | sort -k2,3 | column -t | sed -e 's/2019-06-24T//g'
104.25.144.0/21   10:34:25  12:38:54
104.22.8.0/21     10:34:27  12:29:39
104.20.64.0/21    10:34:27  12:30:00
104.23.128.0/21   10:34:27  12:30:34
141.101.120.0/23  10:34:27  12:30:39
162.159.224.0/21  10:34:27  12:30:39
104.18.32.0/21    10:34:29  12:30:34
104.24.112.0/21   10:34:29  12:30:34
104.27.160.0/21   10:34:29  12:30:34
104.28.16.0/21    10:34:29  12:30:34
104.31.0.0/21     10:34:29  12:30:34
8.39.214.0/24     10:34:31  12:19:24
104.26.0.0/21     10:34:36  12:29:53
172.68.60.0/22    10:34:38  12:19:24
172.69.116.0/22   10:34:38  12:19:24
8.44.58.0/24      10:34:38  12:19:24
8.42.245.0/24     11:52:49  11:53:19
104.17.168.0/21   12:00:13  12:29:34
104.16.80.0/21    12:00:13  12:30:00
104.19.168.0/21   12:09:39  12:29:34
$

Now we know the times. The route leak started at 10:34:25 UTC (just before lunchtime London time) on 2019-06-24 and ended at 12:38:54 UTC. That’s a hair over two hours. Here’s that same time data in a graphical form showing the near-instant start of the event and the duration of each route leaked:

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

We can also go back to RIPEstat and look at the activity graph for Cloudflare’s AS13335 network:

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

Clearly between 10:30 UTC and 12:40 UTC there’s a lot of route activity – far more than normal.

Note that as we mentioned above, RIPEstat doesn’t get a full view of Verizon’s network routing and hence some of the propagated routes won’t show up.

Drilling down on the AS-PATH part of the data

Having the routes is useful, but now we want to look at the paths of these leaked routes to see which ASNs are involved. We knew the offending ASNs during the route leak on Monday. Now we want to dig deeper using the archived data. This allows us to see the extent and reach of this route leak.

# Use the list of routes to extract the full AS-PATH
# Merge the results together to show an amalgamation of paths.
# We know (luckily) the last few ASNs in the AS-PATH are consistent
$ cut -f3 < 13335-listing-b.txt | tr -d '[\[\]]' |\
awk '{
  n=split($0, a, ",");
  printf "%50s\n",
    a[n-5] "_" a[n-4] "_" a[n-3] "_" a[n-2] "_" a[n-1] "_" a[n];
}' | sort -u
174_701_396531_33154_3356_13335
2497_701_396531_33154_174_13335
577_701_396531_33154_3356_13335
6939_701_396531_33154_174_13335
1239_701_396531_33154_3356_13335
1273_701_396531_33154_3356_13335
1280_701_396531_33154_3356_13335
2497_701_396531_33154_3356_13335
2516_701_396531_33154_3356_13335
3320_701_396531_33154_3356_13335
3491_701_396531_33154_3356_13335
4134_701_396531_33154_3356_13335
4637_701_396531_33154_3356_13335
6453_701_396531_33154_3356_13335
6461_701_396531_33154_3356_13335
6762_701_396531_33154_3356_13335
6830_701_396531_33154_3356_13335
6939_701_396531_33154_3356_13335
7738_701_396531_33154_3356_13335
12956_701_396531_33154_3356_13335
17639_701_396531_33154_3356_13335
23148_701_396531_33154_3356_13335
$

This script clearly shows the AS-PATH of the leaked routes. It’s very consistent. Reading from the back of the line to the front, we have 13335 (Cloudflare), 3356 or 174 (Level3/CenturyLink or Cogent – both tier 1 transit providers for Cloudflare). So far, so good. Then we have 33154 (DQE Communications) and 396531 (Allegheny Technologies Inc) which is still technically not a leak, but trending that way. The reason why we can state this is still technically not a leak is because we don’t know the relationship between those two ASs. It’s possible they have a mutual transit agreement between them. That’s up to them.

Back to the AS-PATHs. Still reading leftwards, we see 701 (Verizon), which is very, very bad and clear evidence of a leak. It’s a leak for two reasons. First, this matches the path when a transit provider is leaking a non-customer route learned from a customer. This Verizon customer does not have 13335 (Cloudflare) listed as a customer. Second, the route contains within its path a tier 1 ASN. This is the point where a route leak should have been absolutely squashed by filtering on the customer BGP session. Beyond this point there be dragons.

And dragons there be! Everything above is about how Verizon filtered (or didn’t filter) its customer. What follows the 701 (i.e. the number to the left of it) is the peers or customers of Verizon that have accepted these leaked routes. They are mainly other tier 1 networks in this list: 174 (Cogent), 1239 (Sprint), 1273 (Vodafone), 3320 (DTAG), 3491 (PCCW), 6461 (Zayo), 6762 (Telecom Italia), etc.

What’s missing from that list are two networks worthy of mentioning – 2914 (NTT) & 7018 (AT&T). Both implement a very simple AS-PATH filter which saved the day for their network. They do not allow one tier 1 ISP to send them a route which has another tier 1 further down the path. That’s because when that happens, it’s officially a leak as each tier 1 is fully connected to all other tier 1’s (which is part of the definition of a tier 1 network). The topology of the Internet’s global BGP routing tables simply states that if you see another tier 1 in the path, then it’s a bad route and it should be filtered away.

Additionally we know that 7018 (AT&T) operates a network which drops RPKI invalids. Because Cloudflare routes are RPKI signed, this also means that AT&T would have dropped these routes when it received them from Verizon. This shows a clear win for RPKI (and for AT&T when you see their bandwidth graph below)!


That all said, keep in mind we are still talking about routes that Cloudflare didn’t announce. They all came from the route optimizer.

What should 701 (Verizon) accept from their customer 396531?

This is a great question to ask. Normally we would look at the IRR (Internet Routing Registries) to see what policy an ASN wants for its routes.

$ whois -h whois.radb.net AS396531 ; the Verizon customer
%  No entries found for the selected source(s).
$ whois -h whois.radb.net AS33154  ; the downstream of that customer
%  No entries found for the selected source(s).
$ 

That’s enough to say that we should not be seeing this ASN anywhere on the Internet, however, we should go further into checking this. As we know the ASN of the network, we can search for any routes that are listed for that ASN. We find one:

$ whois -h whois.radb.net ' -i origin AS396531' | egrep '^route|^origin|^mnt-by|^source'
route:          192.92.159.0/24
origin:         AS396531
mnt-by:         MNT-DCNSL
source:         ARIN
$

More importantly, now we have a maintainer (the owner of the routing registry entries). We can see what else is there for that network and we are specifically looking for this:

$ whois -h whois.radb.net ' -i mnt-by -T as-set MNT-DCNSL' | egrep '^as-set|^members|^mnt-by|^source'
as-set:         AS-DQECUST
members:        AS4130, AS5050, AS11199, AS11360, AS12017, AS14088, AS14162,
                AS14740, AS15327, AS16821, AS18891, AS19749, AS20326,
                AS21764, AS26059, AS26257, AS26461, AS27223, AS30168,
                AS32634, AS33039, AS33154, AS33345, AS33358, AS33504,
                AS33726, AS40549, AS40794, AS54552, AS54559, AS54822,
                AS393456, AS395440, AS396531, AS15204, AS54119, AS62984,
                AS13659, AS54934, AS18572, AS397284
mnt-by:         MNT-DCNSL
source:         ARIN
$

This object is important. It lists all the downstream ASNs that this network is expected to announce to the world. It does not contain Cloudflare’s ASN (or any of the leaked ASNs). Clearly this as-set was not used for any BGP filtering.
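
To see how a transit provider would consume this object when building filters, here is a minimal sketch in Python that expands an as-set over the whois protocol, using the same whois.radb.net queries as above. Real filter generators handle nested as-sets and multiple registries; this only collects the members lines of a single object.

import socket

def whois_radb(query: str) -> str:
    # Send one query to whois.radb.net (port 43) and return the raw reply.
    with socket.create_connection(("whois.radb.net", 43)) as s:
        s.sendall(query.encode() + b"\r\n")
        reply = b""
        while chunk := s.recv(4096):
            reply += chunk
    return reply.decode(errors="replace")

def as_set_members(name: str) -> set[str]:
    members, in_members = set(), False
    for line in whois_radb(name).splitlines():
        if line.startswith("members:"):
            in_members, line = True, line[len("members:"):]
        elif not line.startswith((" ", "\t")):  # RPSL continuations are indented
            in_members = False
        if in_members:
            members.update(tok.rstrip(",") for tok in line.split())
    return members

print(sorted(as_set_members("AS-DQECUST")))  # AS33154, AS396531, ... as listed above

A provider that used this expansion to build a prefix or AS-PATH filter for its customer would have rejected every leaked route, since AS13335 is nowhere in the set.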

Just for completeness the same exercise can be done for the other ASN (the downstream of the customer of Verizon). In this case, we just searched for the maintainer object (as there are plenty of route and route6 objects listed).

$ whois -h whois.radb.net ' -i origin AS33154' | egrep '^mnt-by' | sort -u
mnt-by:         MNT-DCNSL
mnt-by:     MAINT-AS3257
mnt-by:     MAINT-AS5050
$

None of these maintainers are directly related to 33154 (DQE Communications). They have been created by other parties and hence they become a dead-end in that search.

It’s worth doing a secondary search to see if any as-set object exists with 33154 or 396531 included. We turned to the most excellent IRR Explorer website run by NLNOG. It provides deep insight into the routing registry data. We did a simple search for 33154 using http://irrexplorer.nlnog.net/search/33154 and we found these as-set objects.

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

It’s interesting to see this ASN listed in other as-sets, but none are important or related to Monday’s route leak. Next we looked at 396531.

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

This shows that there’s nowhere else we need to check. AS-DQECUST is the as-set macro that controls (or should control) filtering for any transit provider of their network.

The summary of all the investigation is a solid statement that no Cloudflare routes or ASNs are listed anywhere within the routing registry data for the customer of Verizon. As there were 2,300 ASNs listed in the tweet above, we can conclusively state no filtering was in place and hence this route leak went on its way unabated.

IPv6? Where is the IPv6 route leak?

In what could be considered the only plus from Monday’s route leak, we can confirm that there was no route leak within IPv6 space. Why?

It turns out that 396531 (Allegheny Technologies Inc) is a network without IPv6 enabled. Normally you would hear Cloudflare chastise anyone that’s yet to enable IPv6, however, in this case we are quite happy that one of the two protocol families survived. IPv6 was stable during this route leak, which now can be called an IPv4-only route leak.

Yet that’s not really the whole story. Let’s look at the percentage of traffic Cloudflare sends Verizon that’s IPv6 (vs IPv4). Normally the IPv4/IPv6 percentage holds steady.

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

This uptick in IPv6 traffic could be the direct result of Happy Eyeballs on mobile handsets picking a working IPv6 path into Cloudflare vs a non-working IPv4 path. Happy Eyeballs is meant to protect against IPv6 failure, however in this case it did a wonderful job of protecting against an IPv4 failure. Yet we have to be careful with this graph, because after further thought and investigation we realized the percentage only increased because IPv4 traffic dropped. Sometimes graphs can be misinterpreted, yet Happy Eyeballs still did a good job even as end users were being affected.

Happy Eyeballs, described in RFC8305, is a mechanism where a client (let’s say a mobile device) tries to connect to a website both on IPv4 and IPv6 simultaneously. IPv6 is sometimes given a head start. The theory is that, should a failure exist on one of the paths (sadly, IPv6 failure is the norm), then IPv4 will save the day. Monday was a day of opposites for Verizon.
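
For the curious, the mechanism is available in stock libraries. Here is a minimal sketch using Python’s asyncio, whose happy_eyeballs_delay option implements the RFC 8305 staggered-connect behaviour; 0.25 seconds is the RFC’s recommended Connection Attempt Delay.

import asyncio

async def main() -> None:
    # Race IPv6 and IPv4 connection attempts, staggered by 250 ms;
    # the first address family to succeed wins.
    reader, writer = await asyncio.open_connection(
        "valid.rpki.cloudflare.com", 443, ssl=True,
        happy_eyeballs_delay=0.25,
    )
    print("connected via", writer.get_extra_info("peername")[0])
    writer.close()
    await writer.wait_closed()

asyncio.run(main())

On Monday, a device running this against a hijack-affected IPv4 path would quietly have ended up on the IPv6 address instead, exactly the behaviour the traffic graph suggests.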

In fact, enabling IPv6 for mobile users is the one area where we can praise the Verizon network this week (or at least the Verizon mobile network), unlike the residential Verizon networks where IPv6 is almost non-existent.

Using bandwidth graphs to confirm routing leaks and stability.

As we have already stated, Verizon impacted its own users and customers. Let’s start with their bandwidth graph:

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

The red line is 24 June 2019 (00:00 UTC to 00:00 UTC the next day). The gray lines are previous days for comparison. This graph includes both Verizon fixed-line services like FiOS along with mobile.

The AT&T graph is quite different.

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

There’s no perturbation. This, along with some direct confirmation, shows that 7018 (AT&T) was not affected. This is an important point.

Going back and looking at a third tier 1 network, we can see 6762 (Telecom Italia) affected by this route leak and yet Cloudflare has a direct interconnect with them.

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

We will be asking Telecom Italia to improve their route filtering as we now have this data.

Future work that could have helped on Monday

The IETF is doing work in the area of BGP path protection within the Secure Inter-Domain Routing Operations Working Group (sidrops) area. The charter of this IETF group is:

The SIDR Operations Working Group (sidrops) develops guidelines for the operation of SIDR-aware networks, and provides operational guidance on how to deploy and operate SIDR technologies in existing and new networks.

One new effort from this group should be called out to show how important the issue of route leaks like today’s event is. The draft document from Alexander Azimov et al. named draft-ietf-sidrops-aspa-profile (ASPA stands for Autonomous System Provider Authorization) extends the RPKI data structures to handle BGP path information. This is ongoing work and Cloudflare and other companies are clearly interested in seeing it progress further.

However, as we said in Monday’s blog and something we should reiterate again and again: Cloudflare encourages all network operators to deploy RPKI now!

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

Post Syndicated from Martin J Levy original https://blog.cloudflare.com/the-deep-dive-into-how-verizon-and-a-bgp-optimizer-knocked-large-parts-of-the-internet-offline-monday/

A recap on what happened Monday

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

On Monday we wrote about a painful Internet wide route leak. We wrote that this should never have happened because Verizon should never have forwarded those routes to the rest of the Internet. That blog entry came out around 19:58 UTC, just over seven hours after the route leak finished (which will we see below was around 12:39 UTC). Today we will dive into the archived routing data and analyze it. The format of the code below is meant to use simple shell commands so that any reader can follow along and, more importantly, do their own investigations on the routing tables.

This was a very public BGP route leak event. It was both reported online via many news outlets and the event’s BGP data was reported via social media as it was happening. Andree Toonk tweeted a quick list of 2,400 ASNs that were affected.


This blog contains a large number of acronyms and those are explained at the end of the blog.

Using RIPE NCC archived data

The RIPE NCC operates a very useful archive of BGP routing. It runs collectors globally and provides an API for querying the data. More can be seen at https://stat.ripe.net/. In the world of BGP all routing is public (within the ability of anyone collecting data to have enough collections points). The archived data is very valuable for research and that’s what we will do in this blog. The site can create some very useful data visualizations.

The deep-dive into how Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Monday

Dumping the RIPEstat data for this event

Presently, the RIPEstat data gets ingested around eight to twelve hours after real-time. It’s not meant to be a real-time service. The data can be queried in many ways, including a full web interface and an API. We are using the API to extract the data in a JSON format.

We are going to focus only on the Cloudflare routes that were leaked. Many other ASNs were leaked (see the tweet above); however, we want to deal with a finite data set and focus on what happened to Cloudflare’s routing. All the commands below can be run with ease on many systems. Both the scripts and the raw data file are now available on GitHub. The following was done on MacBook Pro running macOS Mojave.

First we collect 24 hours of route announcements and AS-PATH data that RIPEstat sees coming from AS13335 (Cloudflare).

$ # Collect 24 hours of data - more than enough
$ ASN="AS13335"
$ START="2019-06-24T00:00:00"
$ END="2019-06-25T00:00:00"
$ ARGS="resource=${ASN}&starttime=${START}&endtime=${END}"
$ URL="https://stat.ripe.net/data/bgp-updates/data.json?${ARGS}"
$ # Fetch the data from RIPEstat
$ curl -sS "${URL}" | jq . > 13335-routes.json
$ ls -l 13335-routes.json
-rw-r--r--  1 martin  staff  339363899 Jun 25 08:47 13335-routes.json
$

That’s 340MB of data – which seems like a lot, but it contains plenty of white space and plenty of data we just don’t need. Our second task is to reduce this raw data down to just the required data – that’s timestamps, actual routes, and AS-PATH. The third item will be very useful. Note we are using jq, which can be installed on macOS with the  brew package manager.

$ # Extract just the times, routes, and AS-PATH
$ jq -rc '.data.updates[]|.timestamp,.attrs.target_prefix,.attrs.path' < 13335-routes.json | paste - - - > 13335-listing-a.txt
$ wc -l 13335-listing-a.txt
691318 13335-listing-a.txt
$

We are down to just below seven hundred thousand routing events, however, that’s not a leak, that’s everything that includes Cloudflare’s ASN (the number 13335 above). For that we need to go back to Monday’s blog and realize it was AS396531 (Allegheny Technologies) that showed up with 701 (Verizon) in the leak. Now we reduce the data further:

$ # Extract the route leak 701,396531
$ # AS701 is Verizon and AS396531 is Allegheny Technologies
$ egrep '701,396531' < 13335-listing-a.txt > 13335-listing-b.txt
$ wc -l 13335-listing-b.txt
204568 13335-listing-b.txt
$

At 204 thousand data points, we are looking better. It’s still a lot of data because BGP can be very chatty if topology is changing. A route leak will cause exactly that. Now let’s see how many routes were affected:

$ # Extract the actual routes affected by the route leak
$ cut -f2 < 13335-listing-b.txt | sort -V -u > 13335-listing-c.txt
$ wc -l 13335-listing-c.txt
101 13335-listing-c.txt
$

It’s a much smaller number. We now have a listing of at least 101 routes that were leaked via Verizon. This may not be the full list because route collectors like RIPEstat don’t have direct feeds from Verizon, so this data is a blended view with Verizon’s path and other paths. We can see that if we look at the AS-PATH in the above files. Please note that I had a typo in this script when this blog was first published and only 20 routes showed up because the -n vs -V option was used on sort. Now the list is correct with 101 affected routes. Please see this short article from stackoverflow to see the issue.

Here’s a partial listing of affected routes.

$ cat 13335-listing-c.txt
8.39.214.0/24
8.42.245.0/24
8.44.58.0/24
...
104.16.80.0/21
104.17.168.0/21
104.18.32.0/21
104.19.168.0/21
104.20.64.0/21
104.22.8.0/21
104.23.128.0/21
104.24.112.0/21
104.25.144.0/21
104.26.0.0/21
104.27.160.0/21
104.28.16.0/21
104.31.0.0/21
141.101.120.0/23
162.159.224.0/21
172.68.60.0/22
172.69.116.0/22
...
$

This is an interesting list, as some of these routes do not originate from Cloudflare’s network; however, they show up with AS13335 (our ASN) as the originator. For example, the 104.26.0.0/21 route is not announced from our network, but we do announce 104.26.0.0/20 (which covers that route). More importantly, we have an IRR (Internet Routing Registries) route object plus an RPKI ROA for that block. Here’s the IRR object:

route:          104.26.0.0/20
origin:         AS13335
source:         ARIN
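That object can be pulled straight from a routing registry; querying RADb (which mirrors ARIN’s IRR data, as the source field shows) by prefix is one way to do it, with the output trimmed to the interesting fields:

$ whois -h whois.radb.net 104.26.0.0/20 | egrep '^route|^origin|^source'
route:          104.26.0.0/20
origin:         AS13335
source:         ARIN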

And here’s the RPKI ROA. This ROA has a Max Length of /20, so no more-specific route should be accepted.

Prefix:       104.26.0.0/20
Max Length:   /20
ASN:          13335
Trust Anchor: ARIN
Validity:     Thu, 02 Aug 2018 04:00:00 GMT - Sat, 31 Jul 2027 04:00:00 GMT
Emitted:      Thu, 02 Aug 2018 21:45:37 GMT
Name:         535ad55d-dd30-40f9-8434-c17fc413aa99
Key:          4a75b5de16143adbeaa987d6d91e0519106d086e
Parent Key:   a6e7a6b44019cf4e388766d940677599d0c492dc
Path:         rsync://rpki.arin.net/repository/arin-rpki-ta/5e4a23ea-...

The Max Length field in a ROA specifies the longest prefix (i.e., the most specific announcement) that is acceptable. The fact that this is a /20 route with a /20 Max Length says that a /21 (or /22 or /23 or /24) within this IP space isn’t allowed. Looking further at the route list above, we get the following listing (a validation check follows the table):

Route Seen            Cloudflare IRR & ROA    ROA Max Length
104.16.80.0/21    ->  104.16.80.0/20          /20
104.17.168.0/21   ->  104.17.160.0/20         /20
104.18.32.0/21    ->  104.18.32.0/20          /20
104.19.168.0/21   ->  104.19.160.0/20         /20
104.20.64.0/21    ->  104.20.64.0/20          /20
104.22.8.0/21     ->  104.22.0.0/20           /20
104.23.128.0/21   ->  104.23.128.0/20         /20
104.24.112.0/21   ->  104.24.112.0/20         /20
104.25.144.0/21   ->  104.25.144.0/20         /20
104.26.0.0/21     ->  104.26.0.0/20           /20
104.27.160.0/21   ->  104.27.160.0/20         /20
104.28.16.0/21    ->  104.28.16.0/20          /20
104.31.0.0/21     ->  104.31.0.0/20           /20
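RIPEstat also offers an rpki-validation data call, so we can ask it directly how one of these leaked /21s validates against the signed /20 ROA. This is a sketch: the endpoint follows RIPEstat’s documented pattern, but the exact response fields may differ, so we simply dump the data object; it should report the prefix as invalid given the /20 Max Length.

$ # Ask RIPEstat to validate a leaked /21 against AS13335's ROAs
$ URL="https://stat.ripe.net/data/rpki-validation/data.json"
$ curl -sS "${URL}?resource=AS13335&prefix=104.26.0.0/21" | jq .data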

So how did all these /21’s show up? That’s where we dive into the world of BGP route optimization systems and their propensity to synthesize routes that should not exist. If those routes leak (and it’s very clear after this week that they can), all hell breaks loose. That can be compounded when not one, but two ISPs allow invalid routes to be propagated outside their autonomous network. We will explore the AS-PATH further down this blog.

More than 20 years ago, RFC1997 added the concept of communities to BGP. Communities are a way of tagging or grouping route advertisements. Communities are often used to label routes so that specific handling policies can be applied. RFC1997 includes a small number of universal well-known communities. One of these is the NO_EXPORT community, which has the following specification:

    All routes received carrying a communities attribute
    containing this value MUST NOT be advertised outside a BGP
    confederation boundary (a stand-alone autonomous system that
    is not part of a confederation should be considered a
    confederation itself).

The use of the NO_EXPORT community is very common within BGP enabled networks and is a community tag that would have helped alleviate this route leak immensely.

How BGP route optimization systems work (or don’t work in this case) can be a subject for a whole other blog entry.

Timing of the route leak

As we saved the timestamps in the JSON file and in the text files, we can confirm the time for every route in the route leak by looking at the first and last timestamp of each route in the data. We saved data from 00:00:00 UTC until 00:00:00 UTC the next day, so we know we have covered the period of the route leak. We wrote a script that checks the first and last entry for every route and reports the information sorted by start time:

$ # Extract the timing of the route leak
$ while read cidr
do
  echo $cidr
  fgrep $cidr < 13335-listing-b.txt | head -1 | cut -f1
  fgrep $cidr < 13335-listing-b.txt | tail -1 | cut -f1
done < 13335-listing-c.txt |\
paste - - - | sort -k2,3 | column -t | sed -e 's/2019-06-24T//g'
...
104.25.144.0/21   10:34:25  12:38:54
104.22.8.0/21     10:34:27  12:29:39
104.20.64.0/21    10:34:27  12:30:00
104.23.128.0/21   10:34:27  12:30:34
141.101.120.0/23  10:34:27  12:30:39
162.159.224.0/21  10:34:27  12:30:39
104.18.32.0/21    10:34:29  12:30:34
104.24.112.0/21   10:34:29  12:30:34
104.27.160.0/21   10:34:29  12:30:34
104.28.16.0/21    10:34:29  12:30:34
104.31.0.0/21     10:34:29  12:30:34
8.39.214.0/24     10:34:31  12:19:24
104.26.0.0/21     10:34:36  12:29:53
172.68.60.0/22    10:34:38  12:19:24
172.69.116.0/22   10:34:38  12:19:24
8.44.58.0/24      10:34:38  12:19:24
8.42.245.0/24     11:52:49  11:53:19
104.17.168.0/21   12:00:13  12:29:34
104.16.80.0/21    12:00:13  12:30:00
104.19.168.0/21   12:09:39  12:29:34
...
$

Now we know the times. The route leak started at 10:34:25 UTC (just before lunchtime London time) on 2019-06-24 and ended at 12:38:54 UTC. That’s a hair over two hours. Here’s that same time data in a graphical form showing the near-instant start of the event and the duration of each route leaked:

[Figure: start time and duration of each leaked route]

We can also go back to RIPEstat and look at the activity graph for Cloudflare’s AS13335 network:

[Figure: RIPEstat activity graph for AS13335]

Clearly between 10:30 UTC and 12:40 UTC there’s a lot of route activity – far more than normal.

Note that as we mentioned above, RIPEstat doesn’t get a full view of Verizon’s network routing and hence some of the propagated routes won’t show up.

Drilling down on the AS-PATH part of the data

Having the routes is useful, but now we want to look at the paths of these leaked routes to see which ASNs are involved. We knew the offending ASNs during the route leak on Monday. Now we want to dig deeper using the archived data. This allows us to see the extent and reach of this route leak.

$ # Extract the AS-PATH of the route leak
$ # Use the list of routes to extract the full AS-PATH
$ # Merge the results together to show an amalgamation of paths.
$ # We know (luckily) the last few ASNs in the AS-PATH are consistent
$ cut -f3 < 13335-listing-b.txt | tr -d '[\[\]]' |\
awk '{
  n=split($0, a, ",");
  printf "%50s\n",
    a[n-5] "_" a[n-4] "_" a[n-3] "_" a[n-2] "_" a[n-1] "_" a[n];
}' | sort -u
   174_701_396531_33154_3356_13335
   2497_701_396531_33154_174_13335
   577_701_396531_33154_3356_13335
   6939_701_396531_33154_174_13335
  1239_701_396531_33154_3356_13335
  1273_701_396531_33154_3356_13335
  1280_701_396531_33154_3356_13335
  2497_701_396531_33154_3356_13335
  2516_701_396531_33154_3356_13335
  3320_701_396531_33154_3356_13335
  3491_701_396531_33154_3356_13335
  4134_701_396531_33154_3356_13335
  4637_701_396531_33154_3356_13335
  6453_701_396531_33154_3356_13335
  6461_701_396531_33154_3356_13335
  6762_701_396531_33154_3356_13335
  6830_701_396531_33154_3356_13335
  6939_701_396531_33154_3356_13335
  7738_701_396531_33154_3356_13335
 12956_701_396531_33154_3356_13335
 17639_701_396531_33154_3356_13335
 23148_701_396531_33154_3356_13335
$

This script clearly shows the AS-PATHs of the leaked routes, and they are very consistent. Reading from the back of each line to the front, we have 13335 (Cloudflare), then 3356 or 174 (Level3/CenturyLink or Cogent – both tier 1 transit providers for Cloudflare). So far, so good. Then we have 33154 (DQE Communications) and 396531 (Allegheny Technologies Inc), which is still technically not a leak, but trending that way. We can say it’s still technically not a leak because we don’t know the relationship between those two ASes; it’s possible they have a mutual transit agreement between them. That’s up to them.

Back to the AS-PATHs. Still reading leftwards, we see 701 (Verizon), which is very, very bad and clear evidence of a leak. It’s a leak for two reasons. First, this matches the pattern of a transit provider leaking, from a customer, a route that isn’t the customer’s own: this Verizon customer does not have 13335 (Cloudflare) listed as a customer. Second, the route contains a tier 1 ASN within its path. This is the point where the route leak should have been absolutely squashed by filtering on the customer BGP session. Beyond this point there be dragons.

And dragons there be! Everything above is about how Verizon filtered (or didn’t filter) its customer. What follows 701 (i.e., the number to the left of it) is the peer or customer of Verizon that accepted these leaked routes. In this list they are mainly other tier 1 networks: 174 (Cogent), 1239 (Sprint), 1273 (Vodafone), 3320 (DTAG), 3491 (PCCW), 6461 (Zayo), 6762 (Telecom Italia), etc.

What’s missing from that list are three networks worth mentioning: 1299 (Telia), 2914 (NTT), and 7018 (AT&T). All three implement a very simple AS-PATH filter that saved the day for their networks: they do not allow one tier 1 ISP to send them a route that has another tier 1 further down the path. When that happens, it’s officially a leak, because each tier 1 is fully connected to all other tier 1s (which is part of the definition of a tier 1 network). The topology of the Internet’s global BGP routing tables simply states that if you see another tier 1 in the path, it’s a bad route and should be filtered away.
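We can apply the same sanity check to our archived data. The sketch below flags any path in which one tier 1 ASN appears behind another; the tier 1 list is partial and purely illustrative, and every leaked path in our data set matches it:

$ # Flag AS-PATHs where a tier 1 ASN appears behind another tier 1
$ TIER1='174|701|1239|1273|1299|2914|3320|3356|3491|6453|6461|6762|7018'
$ cut -f3 < 13335-listing-b.txt | tr -d '[]' |\
  egrep "(^|,)(${TIER1}),([0-9]+,)*(${TIER1})(,|\$)" | sort -u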

Additionally, we know that 7018 (AT&T) operates a network that drops RPKI invalids. Because Cloudflare routes are RPKI-signed, this also means that AT&T would have dropped these routes when it received them from Verizon. This shows a clear win for RPKI (and for AT&T, as you can see in their bandwidth graph below)!


That all said, keep in mind we are still talking about routes that Cloudflare didn’t announce. They all came from the route optimizer.

What should 701 (Verizon) accept from its customer 396531?

This is a great question to ask. Normally we would look at the IRR (Internet Routing Registries) to see what policy an ASN wants for its routes.

$ whois -h whois.radb.net AS396531   # the Verizon customer
%  No entries found for the selected source(s).
$ whois -h whois.radb.net AS33154    # the downstream of that customer
%  No entries found for the selected source(s).
$ 

That’s enough to say that we should not be seeing this ASN anywhere on the Internet; however, we should go further into checking this. As we know the ASN of the network, we can search for any routes that are listed for that ASN. We find one:

$ whois -h whois.radb.net ' -i origin AS396531' | egrep '^route|^origin|^mnt-by|^source'
route:          192.92.159.0/24
origin:         AS396531
mnt-by:         MNT-DCNSL
source:         ARIN
$

More importantly, now we have a maintainer (the owner of the routing registry entries). We can see what else is there for that network and we are specifically looking for this:

$ whois -h whois.radb.net ' -i mnt-by -T as-set MNT-DCNSL' | egrep '^as-set|^members|^mnt-by|^source'
as-set:         AS-DQECUST
members:        AS4130, AS5050, AS11199, AS11360, AS12017, AS14088, AS14162,
                AS14740, AS15327, AS16821, AS18891, AS19749, AS20326,
                AS21764, AS26059, AS26257, AS26461, AS27223, AS30168,
                AS32634, AS33039, AS33154, AS33345, AS33358, AS33504,
                AS33726, AS40549, AS40794, AS54552, AS54559, AS54822,
                AS393456, AS395440, AS396531, AS15204, AS54119, AS62984,
                AS13659, AS54934, AS18572, AS397284
mnt-by:         MNT-DCNSL
source:         ARIN
$

This object is important. It lists all the downstream ASNs that this network is expected to announce to the world. It does not contain Cloudflare’s ASN (or any of the leaked ASNs). Clearly this as-set was not used for any BGP filtering.
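This is exactly the object a transit provider should be expanding into a filter. As a sketch (assuming the widely used bgpq3 tool, which expands as-sets via the IRR and emits router prefix-list configuration; the list name here is made up), a filter for this customer could have been generated like so:

$ # Hypothetical: expand the customer's as-set into an IPv4 prefix filter
$ bgpq3 -4 -l DQECUST-IN AS-DQECUST

Because AS13335 is not in AS-DQECUST, no Cloudflare prefix would have made it into the resulting filter, and the leaked routes would have been rejected at the session.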

Just for completeness, the same exercise can be done for the other ASN (the downstream of the customer of Verizon). In this case, we just searched for the maintainer objects (as there are plenty of route and route6 objects listed).

$ whois -h whois.radb.net ' -i origin AS33154' | egrep '^mnt-by' | sort -u
mnt-by:         MNT-DCNSL
mnt-by:     MAINT-AS3257
mnt-by:     MAINT-AS5050
$

None of these maintainers are directly related to 33154 (DQE Communications). They were created by other parties and hence become a dead end in that search.

It’s worth doing a secondary search to see if any as-set objects exist with 33154 or 396531 included. We turned to the most excellent IRR Explorer website run by NLNOG, which provides deep insight into routing registry data. We did a simple search for 33154 using http://irrexplorer.nlnog.net/search/33154 and found these as-set objects:

[Figure: IRR Explorer results for AS33154]

It’s interesting to see this ASN listed in other as-sets, but none are important or related to Monday’s route leak. Next we looked at 396531:

[Figure: IRR Explorer results for AS396531]

This shows that there’s nowhere else we need to check. AS-DQECUST is the as-set macro that controls (or should control) filtering for any transit provider of their network.

The summary of all this investigation is a solid statement: no Cloudflare routes or ASNs are listed anywhere within the routing registry data for the customer of Verizon. As there were 2,300 ASNs listed in the tweet above, we can conclusively state that no filtering was in place, and hence this route leak went on its way unabated.

IPv6? Where is the IPv6 route leak?

In what could be considered the only plus from Monday’s route leak, we can confirm that there was no route leak within IPv6 space. Why?

It turns out that 396531 (Allegheny Technologies Inc) is a network without IPv6 enabled. Normally you would hear Cloudflare chastise anyone that’s yet to enable IPv6; however, in this case we are quite happy that one of the two protocol families survived. IPv6 was stable during this event, which can now be called an IPv4-only route leak.

Yet that’s not really the whole story. Let’s look at the percentage of traffic Cloudflare sends Verizon that’s IPv6 (vs IPv4). Normally the IPv4/IPv6 percentage holds steady.

[Figure: share of the traffic Cloudflare sends Verizon that is IPv6 vs IPv4]

This uptick in IPv6 traffic could be the direct result of Happy Eyeballs on mobile handsets picking a working IPv6 path into Cloudflare over a non-working IPv4 path. Happy Eyeballs is meant to protect against IPv6 failure; in this case, it did a wonderful job protecting against an IPv4 failure. Yet we have to be careful with this graph: on further investigation, the percentage only increased because IPv4 traffic dropped. Graphs can be misinterpreted, but Happy Eyeballs still did a good job even as end users were being affected.

Happy Eyeballs, described in RFC8305, is a mechanism where a client (let’s say a mobile device) tries to connect to a website over both IPv4 and IPv6 simultaneously, with IPv6 sometimes given a head start. The theory is that, should a failure exist on one of the paths (sadly, a failing IPv6 path is the norm), then IPv4 will save the day. Monday was a day of opposites for Verizon.
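You can race the two address families yourself, much as Happy Eyeballs does, by forcing curl onto each one in turn (a rough sketch; a real client runs both attempts in parallel):

$ # Time an IPv4-only and an IPv6-only connection to the same host
$ curl -4 -sS -o /dev/null -w 'IPv4 connect: %{time_connect}s\n' https://www.cloudflare.com/
$ curl -6 -sS -o /dev/null -w 'IPv6 connect: %{time_connect}s\n' https://www.cloudflare.com/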

In fact, enabling IPv6 for mobile users is the one area where we can praise the Verizon network this week (or at least the Verizon mobile network), unlike the residential Verizon networks where IPv6 is almost non-existent.

Using bandwidth graphs to confirm routing leaks and stability

As we have already stated, Verizon impacted its own users and customers. Let’s start with their bandwidth graph:

[Figure: Verizon bandwidth graph during the route leak]

The red line is 24 June 2019 (00:00 UTC to 00:00 UTC the next day). The gray lines are previous days for comparison. This graph includes both Verizon fixed-line services like FiOS and its mobile network.

The AT&T graph is quite different.

[Figure: AT&T bandwidth graph during the route leak]

There’s no perturbation. This, along with some direct confirmation, shows that 7018 (AT&T) was not affected. This is an important point.

Going back and looking at a third tier 1 network, we can see that 6762 (Telecom Italia) was affected by this route leak, even though Cloudflare has a direct interconnect with them.

[Figure: Telecom Italia bandwidth graph during the route leak]

We will be asking Telecom Italia to improve their route filtering as we now have this data.

Future work that could have helped on Monday

The IETF is doing work in the area of BGP path protection within the Secure Inter-Domain Routing Operations Working Group (sidrops) area. The charter of this IETF group is:

The SIDR Operations Working Group (sidrops) develops guidelines for the operation of SIDR-aware networks, and provides operational guidance on how to deploy and operate SIDR technologies in existing and new networks.

One new effort from this group should be called out because it shows how important the issue of route leaks like Monday’s event is. The draft document from Alexander Azimov et al. named draft-ietf-sidrops-aspa-profile (ASPA stands for Autonomous System Provider Authorization) extends the RPKI data structures to handle BGP path information. This is ongoing work, and Cloudflare and other companies are clearly interested in seeing it progress further.
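To give a feel for the idea: an ASPA object binds a customer ASN to the set of providers allowed to propagate its routes, so routers can reject paths that violate those relationships. A purely illustrative record for Monday’s leaker might look like the following (this is a conceptual sketch, not the draft’s actual format):

Customer ASN:  396531
Providers:     AS701 (Verizon), AS33154 (DQE Communications)

With records like this deployed, a path showing 396531 passing routes from one of its providers to the other (a so-called valley) can be detected and discarded.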

However, as we said in Monday’s blog and something we should reiterate again and again: Cloudflare encourages all network operators to deploy RPKI now!

Acronyms used in the blog

  • API – Application Programming Interface
  • AS-PATH – The list of ASNs that a route has traversed so far
  • ASN – Autonomous System Number – A unique number assigned for each network on the Internet
  • BGP – Border Gateway Protocol (version 4) – the core routing protocol for the Internet
  • IETF – Internet Engineering Task Force – an open standards organization
  • IPv4 – Internet Protocol version 4
  • IPv6 – Internet Protocol version 6
  • IRR – Internet Routing Registries – a database of Internet route objects
  • ISP – Internet Service Provider
  • JSON – JavaScript Object Notation – a lightweight data-interchange format
  • RFC – Request For Comment – published by the IETF
  • RIPE NCC – Réseaux IP Européens Network Coordination Centre – a regional Internet registry
  • ROA – Route Origin Authorization – a cryptographically signed attestation of a BGP route announcement
  • RPKI – Resource Public Key Infrastructure – a public key infrastructure framework for routing information
  • Tier 1 – A network that has no default route and peers with all other tier 1s
  • UTC – Coordinated Universal Time – a time standard for clocks and time
  • "there be dragons" – a mistype, as it was meant to be "here be dragons" which means dangerous or unexplored territories

How Verizon and a BGP Optimizer Affected Many Parts of the Internet Today

Post Syndicated from Tom Strickx original https://blog.cloudflare.com/comment-verizon-et-un-optimiseur-bgp-ont-affecte-de-nombreuses-partie-dinternet-aujourdhu/

A massive route leak impacted many parts of the Internet, including Cloudflare

What happened?

Today at 10:30 UTC, the Internet had a sort of mini heart attack. A small company in northern Pennsylvania became the preferred path for many Internet routes because of Verizon (AS701), a major Internet transit provider. It was a bit as if Waze had routed the traffic of an entire highway onto a small neighborhood street: many websites on Cloudflare, and on many other providers, were unavailable from large parts of the Internet. This incident should never have happened, because Verizon should never have forwarded these routes to the rest of the Internet. To understand why, read on.

We have written a number of articles in the past about these unfortunate events, which are more frequent than you might think. This time, the effects could be observed worldwide. Today the problem was aggravated by the involvement of a “BGP optimizer” product from Noction. This product has a feature that splits received IP prefixes into smaller contributing parts (called more-specific prefixes). For example, our own IPv4 route 104.20.0.0/20 was turned into 104.20.0.0/21 and 104.20.8.0/21. Imagine a single road sign directing traffic from Paris to Normandy being replaced by two signs, one for Le Havre and one for Dieppe. By splitting these large IP blocks into smaller parts, a network gets a mechanism for steering traffic within its own network, but this split should never have been announced to the whole world. Announcing it caused today’s outage.

To explain what happened next, here is a quick reminder of how the basic “map” of the Internet works. “Internet” literally means a network of networks, made up of networks called Autonomous Systems (AS). Each of these networks has a unique identifier, its AS number. All these networks are interconnected by a routing protocol called the Border Gateway Protocol (BGP). BGP joins these networks together and builds the Internet “map” that enables traffic to travel from, say, your ISP to a popular website on the other side of the world.

Using BGP, networks exchange route information: how to reach them from wherever you are. These routes can be specific, like when you look up a city on your GPS, or more general, like when you ask your GPS to head toward a region. This is where things went wrong today.

An Internet service provider in Pennsylvania (AS33154 – DQE Communications) was using a BGP optimizer in its network, which meant there were many more-specific routes within its own network. Specific routes override more general routes (to keep the Waze analogy, a route to, say, the Palais de l’Elysée is much more specific than a route to Paris).

DQE announced these specific routes to its customer (AS396531 – Allegheny Technologies Inc). All of this routing information was then sent to their other transit provider (AS701 – Verizon), which proceeded to tell the whole Internet about these “optimized” routes. These routes were supposedly “optimized” because they were more granular, more specific.

The leak should have stopped at Verizon. However, contrary to the best practices outlined below, Verizon’s lack of filtering turned this into a major incident, causing outages for many Internet services such as Amazon, Linode and Cloudflare.

In other words, Verizon, Allegheny and DQE suddenly had to cope with a massive influx of Internet users trying to reach those services through their networks. None of these networks was appropriately equipped to handle this dramatic increase in traffic, which caused disruption to the services. Even if they had had enough capacity, DQE, Allegheny and Verizon were not allowed to claim they had the best route to Cloudflare, Amazon, Linode, etc.

[Figure: the BGP leak process with a BGP optimizer]

During the incident, we observed a loss of about 15% of our global traffic at the peak of the event.

[Figure: traffic levels at Cloudflare during the incident]

How could this leak have been prevented?

This leak could have been prevented in several ways:

A BGP session can be configured with a hard limit on the number of prefixes to be received, meaning a router can decide to shut down a session if the number of prefixes exceeds the threshold. Had Verizon had such a prefix limit in place, this incident would not have occurred. Setting limits is a best practice, and there was nothing stopping Verizon from having them. There is no reason, other than negligence or laziness, for the company not to have done so.

Another way network operators can prevent leaks like this one is by implementing IRR filtering. The IRR (Internet Routing Registry) is the Internet’s routing registry, and networks can add entries to these distributed databases. Other network operators can then use these IRR records to generate specific prefix lists for the BGP sessions with their peers. If IRR filtering had been in use, none of the networks involved would have accepted the faulty more-specific prefixes. It is quite shocking that Verizon applied none of these filters on its BGP session with Allegheny Technologies, given that IRR filtering has existed (and been well documented) for over 24 years. IRR filtering would not have added any cost for Verizon, nor would it have limited its service in any way. Once again, the only explanation for its absence is negligence or laziness.

RPKI, which we implemented and deployed globally last year, is designed to prevent this type of leak. It enables filtering on the origin network and prefix size. The prefixes Cloudflare announces are signed for a maximum length of 20. RPKI then indicates that any more-specific prefix should not be accepted, whatever the path. For this mechanism to work, a network has to enable BGP Origin Validation. Many providers, such as AT&T, have already enabled it on their networks.

If Verizon had used RPKI, it would have seen that the advertised routes were not valid, and the router could have automatically dropped them.

Cloudflare encourages all network operators to deploy RPKI now!

[Figure: route leak prevention using IRR, RPKI and prefix limits]

All of the above suggestions are well summarized in MANRS (Mutually Agreed Norms for Routing Security).

How the incident was resolved

Cloudflare’s network team contacted the networks involved, AS33154 (DQE Communications) and AS701 (Verizon). We had some difficulty reaching either network, probably because of the timing of the incident: it was still early on the US East Coast when the route leak started.

[Figure: screenshot of the email sent to Verizon]

One of our network engineers quickly made contact with DQE Communications, who, after a little while, put us in touch with someone able to fix the problem. We worked with DQE over the phone to get them to stop advertising these “optimized” routes to Allegheny Technologies Inc. We are grateful to them for their help. Once this was done, the Internet stabilized and things returned to normal.

[Figure: screenshot of attempts to communicate with DQE and Verizon support]

It is unfortunate that our attempts to reach Verizon by email and phone came to nothing. At the time of writing (more than eight hours after the incident), we still have not heard back from them, nor do we know whether they have taken any action to resolve the problem.

At Cloudflare, we wish events like this never happened. Unfortunately, the current state of the Internet does very little to prevent them from recurring. It is time for the industry to adopt better routing security through systems such as RPKI. We hope that major providers will follow the lead of Cloudflare, Amazon and AT&T in validating routes. In particular, we are looking at you, Verizon, and we are still waiting for your reply.

Although this incident was caused by events beyond our control, we are sorry for the disruption it caused. Our team cares deeply about the quality of our service, and within minutes of identifying the problem we were online with engineers in the United States, the United Kingdom, Australia and Singapore.


Welcome to Crypto Week 2019

Post Syndicated from Nick Sullivan original https://blog.cloudflare.com/welcome-to-crypto-week-2019/


The Internet is an extraordinarily complex and evolving ecosystem. Its constituent protocols range from the ancient and archaic (hello FTP) to the modern and sleek (meet WireGuard), with a fair bit of everything in between. This evolution is ongoing, and as one of the most connected networks on the Internet, Cloudflare has a duty to be a good steward of this ecosystem. We take this responsibility to heart: Cloudflare’s mission is to help build a better Internet. In this spirit, we are very proud to announce Crypto Week 2019.

Every day this week we’ll announce a new project or service that uses modern cryptography to build a more secure, trustworthy Internet. Everything we release this week will be free and immediately useful. This blog is a fun exploration of the themes of the week.

  • Monday: Coming Soon
  • Tuesday: Coming Soon
  • Wednesday: Coming Soon
  • Thursday: Coming Soon
  • Friday: Coming Soon

The Internet of the Future

Many pieces of the Internet in use today were designed in a different era with different assumptions. The Internet’s success is based on strong foundations that support constant reassessment and improvement. Sometimes these improvements require deploying new protocols.

Performing an upgrade on a system as large and decentralized as the Internet can’t be done by decree:

  • There are too many economic, cultural, political, and technological factors at play.
  • Changes must be compatible with existing systems and protocols to even be considered for adoption.
  • To gain traction, new protocols must provide tangible improvements for users. Nobody wants to install an update that doesn’t improve their experience!

The last time the Internet had a complete reboot and upgrade was during TCP/IP flag day in 1983. Back then, the Internet (called ARPANET) had fewer than ten thousand hosts! To have an Internet-wide flag day today to switch over to a core new protocol is inconceivable; the scale and diversity of the components involved is way too massive. Too much would break. It’s challenging enough to deprecate outmoded functionality. In some ways, the open Internet is a victim of its own success. The bigger a system grows and the longer it stays the same, the harder it is to change. The Internet is like a massive barge: it takes forever to steer in a different direction and it’s carrying a lot of garbage.

[Figure: ARPANET, 1983 (Computer History Museum)]

As you would expect, many of the warts of the early Internet still remain. Both academic security researchers and real-life adversaries are still finding and exploiting vulnerabilities in the system. Many vulnerabilities are due to the fact that most of the protocols in use on the Internet have a weak notion of trust inherited from the early days. With 50 hosts online, it’s relatively easy to trust everyone, but in a world-scale system, that trust breaks down in fascinating ways. The primary tool to scale trust is cryptography, which helps provide some measure of accountability, though it has its own complexities.

In an ideal world, the Internet would provide a trustworthy substrate for human communication and commerce. Some people naïvely assume that this is the natural direction the evolution of the Internet will follow. However, constant improvement is not a given. It’s possible that the Internet of the future will actually be worse than the Internet today: less open, less secure, less private, less trustworthy. There are strong incentives to weaken the Internet on a fundamental level by Governments, by businesses such as ISPs, and even by the financial institutions entrusted with our personal data.

In a system with as many stakeholders as the Internet, real change requires principled commitment from all invested parties. At Cloudflare, we believe everyone is entitled to an Internet built on a solid foundation of trust. Crypto Week is our way of helping nudge the Internet’s evolution in a more trust-oriented direction. Each announcement this week helps bring the Internet of the future to the present in a tangible way.

Ongoing Internet Upgrades

Before we explore the Internet of the future, let’s explore some of the previous and ongoing attempts to upgrade the Internet’s fundamental protocols.

Routing Security

As we highlighted in last year’s Crypto Week, one of the weak links on the Internet is routing. Not all networks are directly connected.

To send data from one place to another, you might have to rely on intermediary networks to pass your data along. A packet sent from one host to another may have to be passed through up to a dozen of these intermediary networks. No single network knows the full path the data will have to take to get to its destination; it only knows which network to pass it to next. The protocol that determines how packets are routed is called the Border Gateway Protocol (BGP). Generally speaking, networks use BGP to announce to each other which addresses they know how to route packets for, and (dependent on a set of complex rules) these networks share what they learn with their neighbors.

Unfortunately, BGP is completely insecure:

  • Any network can announce any set of addresses to any other network, even addresses they don’t control. This leads to a phenomenon called BGP hijacking, where networks are tricked into sending data to the wrong network.
  • A BGP hijack is most often caused by accidental misconfiguration, but can also be the result of malice on the network operator’s part.
  • During a BGP hijack, a network inappropriately announces a set of addresses to other networks, which results in packets destined for the announced addresses being routed through the illegitimate network.

Understanding the risk

If the packets represent unencrypted data, this can be a big problem as it allows the hijacker to read or even change the data.

Mitigating the risk

The Resource Public Key Infrastructure (RPKI) system helps bring some trust to BGP by enabling networks to utilize cryptography to digitally sign network routes with certificates, making BGP hijacking much more difficult.

This enables participants of the network to gain assurances about the authenticity of route advertisements. Certificate Transparency (CT) is a tool that enables additional trust for certificate-based systems, and Cloudflare operates the Cirrus CT log to support RPKI.

Since we announced our support of RPKI last year, routing security has made big strides. More routes are signed, more networks validate RPKI, and the software ecosystem has matured, but this work is not complete. Most networks are still vulnerable to BGP hijacking. For example, Pakistan knocked YouTube offline with a BGP hijack back in 2008, and could likely do the same today. Adoption here is driven less by providing a benefit to users than by reducing systemic risk, which is not the strongest motivating factor for adopting a complex new technology. Full routing security on the Internet could take decades.

DNS Security

The Domain Name System (DNS) is the phone book of the Internet. Or, for anyone under 25 who doesn’t remember phone books, it’s the system that takes hostnames (like cloudflare.com or facebook.com) and returns the Internet address where that host can be found. For example, as of this publication, www.cloudflare.com is 104.17.209.9 and 104.17.210.9 (IPv4) and 2606:4700::c629:d7a2, 2606:4700::c629:d6a2 (IPv6). Like BGP, DNS is completely insecure. Queries and responses sent unencrypted over the Internet are modifiable by anyone on the path.
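You can watch this phone book in action with any DNS lookup tool; for example (the addresses returned will change over time):

$ dig +short www.cloudflare.com A
$ dig +short www.cloudflare.com AAAA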

There are many ongoing attempts to add security to DNS, such as:

  • DNSSEC that adds a chain of digital signatures to DNS responses
  • DoT/DoH that wraps DNS queries in the TLS encryption protocol (more on that later)

Both technologies are slowly gaining adoption, but have a long way to go.
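DoH in particular is easy to try from the command line against Cloudflare’s 1.1.1.1 resolver, which also speaks a JSON flavor of DNS over HTTPS:

$ # Resolve a name over HTTPS instead of plain-text UDP port 53
$ curl -sS -H 'accept: application/dns-json' \
    'https://cloudflare-dns.com/dns-query?name=cloudflare.com&type=A' | jq .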

[Figure: DNSSEC-signed responses served by Cloudflare]

[Figure: Cloudflare’s 1.1.1.1 resolver queries are already over 5% DoT/DoH]

Just like RPKI, securing DNS comes with a performance cost, making it less attractive to users.

The Web

Transport Layer Security (TLS) is a cryptographic protocol that gives two parties the ability to communicate over an encrypted and authenticated channel. TLS protects communications from eavesdroppers even in the event of a BGP hijack. TLS is what puts the “S” in HTTPS. TLS protects web browsing against multiple types of network adversaries.

[Figure: Requests hop from network to network over the Internet]

[Figure: For unauthenticated protocols, an attacker on the path can impersonate the server]

[Figure: Attackers can use BGP hijacking to change the path so that communication can be intercepted]

[Figure: Authenticated protocols are protected from interception attacks]

The adoption of TLS on the web is partially driven by the fact that:

  • It’s easy and free for websites to get an authentication certificate (via Let’s Encrypt, Universal SSL, etc.)
  • Browsers make TLS adoption appealing to website operators by only supporting new web features such as HTTP/2 over HTTPS.

This has led to the rapid adoption of HTTPS over the last five years.

[Figure: HTTPS adoption curve (from Google Chrome)]

To further that adoption, TLS recently got an upgrade in TLS 1.3, making it faster and more secure (a combination we love). It’s taking over the Internet!

[Figure: TLS 1.3 adoption over the last 12 months (from Cloudflare’s perspective)]

Despite this fantastic progress in the adoption of security for routing, DNS, and the web, there are still gaps in the trust model of the Internet. There are other things needed to help build the Internet of the future. To find and identify these gaps, we lean on research experts.

Research Farm to Table

Cryptographic security on the Internet is a hot topic and there have been many flaws and issues recently pointed out in academic journals. Researchers often study the vulnerabilities of the past and ask:

  • What other critical components of the Internet have the same flaws?
  • What underlying assumptions can subvert trust in these existing systems?

The answers to these questions help us decide what to tackle next. Some recent research topics we’ve learned about include:

  • Quantum Computing
  • Attacks on Time Synchronization
  • DNS attacks affecting Certificate issuance
  • Scaling distributed trust

Cloudflare keeps abreast of these developments and we do what we can to bring these new ideas to the Internet at large. In this respect, we’re truly standing on the shoulders of giants.

Future-proofing Internet Cryptography

The new protocols we are currently deploying (RPKI, DNSSEC, DoT/DoH, TLS 1.3) use relatively modern cryptographic algorithms published in the 1970s and 1980s.

  • The security of these algorithms is based on hard mathematical problems in the field of number theory, such as factoring and the elliptic curve discrete logarithm problem.
  • If you can solve the hard problem, you can crack the code. Using a bigger key makes the problem harder, making it more difficult to break, but also slows performance.

Modern Internet protocols typically pick keys large enough to make it infeasible to break with classical computers, but no larger. The sweet spot is around 128 bits of security, meaning a computer has to do approximately 2¹²⁸ operations to break it.

Arjen Lenstra and others created a useful measure of security levels by comparing the amount of energy it takes to break a key to the amount of water you can boil using that much energy. You can think of this as the electric bill you’d get if you run a computer long enough to crack the key.

  • 35-bit security is “Teaspoon security” – it takes about the same amount of energy to break a 35-bit key as it does to boil a teaspoon of water (pretty easy).
  • 65 bits gets you up to “Pool security” – the energy needed to boil the average amount of water in a swimming pool.
  • 105 bits is “Sea security” – the energy needed to boil the Mediterranean Sea.
  • 114 bits is “Global security” – the energy needed to boil all water on Earth.
  • 128-bit security is safely beyond that of Global security – anything larger is overkill.
  • 256-bit security corresponds to “Universal security” – the estimated mass-energy of the observable universe. So, if you ever hear someone suggest 256-bit AES, you know they mean business.

Post-Quantum of Solace

As far as we know, the algorithms we use for cryptography are functionally uncrackable with all known algorithms that classical computers can run. Quantum computers change this calculus. Instead of transistors and bits, a quantum computer uses the effects of quantum mechanics to perform calculations that just aren’t possible with classical computers. As you can imagine, quantum computers are very difficult to build. However, despite large-scale quantum computers not existing quite yet, computer scientists have already developed algorithms that can only run efficiently on quantum computers. Surprisingly, it turns out that with a sufficiently powerful quantum computer, most of the hard mathematical problems we rely on for Internet security become easy!

Although there are still quantum-skeptics out there, some experts estimate that within 15-30 years these large quantum computers will exist, which poses a risk to every security protocol online. Progress is moving quickly; every few months a more powerful quantum computer is announced.


Luckily, there are cryptography algorithms that rely on different hard math problems that seem to be resistant to attack from quantum computers. These math problems form the basis of so-called quantum-resistant (or post-quantum) cryptography algorithms that can run on classical computers. These algorithms can be used as substitutes for most of our current quantum-vulnerable algorithms.

  • Some quantum-resistant algorithms (such as McEliece and Lamport Signatures) were invented decades ago, but there’s a reason they aren’t in common use: they lack some of the nice properties of the algorithms we’re currently using, such as key size and efficiency.
  • Some quantum-resistant algorithms require much larger keys to provide 128-bit security.
  • Some are very CPU intensive.
  • And some just haven’t been studied enough to know if they’re secure.

It is possible to swap our current set of quantum-vulnerable algorithms with new quantum-resistant algorithms, but it’s a daunting engineering task. With widely deployed protocols, it is hard to make the transition from something fast and small to something slower, bigger or more complicated without providing concrete user benefits. When exploring new quantum-resistant algorithms, minimizing user impact is of utmost importance to encourage adoption. This is a big deal, because almost all the protocols we use to protect the Internet are vulnerable to quantum computers.

Cryptography-breaking quantum computing is still in the distant future, but we must start the transition to ensure that today’s secure communications are safe from tomorrow’s quantum-powered onlookers; however, that’s not the most timely problem with the Internet. We haven’t addressed that…yet.

Attacking time

Just like DNS, BGP, and HTTP, the Network Time Protocol (NTP) is fundamental to how the Internet works. And like these other protocols, it is completely insecure.

  • Last year, Cloudflare introduced Roughtime as a mechanism for computers to access the current time from a trusted server in an authenticated way.
  • Roughtime is powerful because it provides a way to distribute trust among multiple time servers so that if one server attempts to lie about the time, it will be caught.

However, Roughtime is not exactly a secure drop-in replacement for NTP.

  • Roughtime lacks the complex mechanisms of NTP that allow it to compensate for network latency and yet maintain precise time, especially if the time servers are remote. This leads to imprecise time.
  • Roughtime also involves expensive cryptography that can further reduce precision. This lack of precision makes Roughtime useful for browsers and other systems that need only coarse time to validate certificates (most certificates are valid for three months or more), but some systems (such as those used for financial trading) require precision to the millisecond or below. (The sketch after this list shows why coarse time suffices for certificate validation.)
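
As a rough illustration of why coarse time is enough for certificates, consider a check like the one below. This is a simplified model of our own, not how any particular TLS library implements it: even with seconds of clock uncertainty, a validity window measured in months leaves enormous margin.

```python
from datetime import datetime, timedelta, timezone

def validity_ok(not_before, not_after, now, uncertainty):
    # Accept only if the entire uncertainty window around "now" falls
    # inside the certificate's validity period. With certificates valid
    # for months, a clock that is off by seconds changes nothing.
    return (now - uncertainty) >= not_before and (now + uncertainty) <= not_after

# Example: a 90-day certificate checked with ten seconds of uncertainty.
start = datetime(2019, 6, 1, tzinfo=timezone.utc)
print(validity_ok(start, start + timedelta(days=90),
                  datetime(2019, 7, 1, tzinfo=timezone.utc),
                  timedelta(seconds=10)))  # True
```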

With Roughtime we supported the time protocol of the future, but there are things we can do to help improve the health of security online today.

Some academic researchers, including Aanchal Malhotra of Boston University, have demonstrated a variety of attacks against NTP, including BGP hijacking and off-path User Datagram Protocol (UDP) attacks.

  • Some of these attacks can be avoided by connecting to an NTP server that is close to you on the Internet.
  • However, to bring cryptographic trust to time while maintaining precision, we need something in between NTP and Roughtime.
  • To solve this, it’s natural to turn to the same system of trust that enabled us to patch HTTP and DNS: Web PKI.

Attacking the Web PKI

The Web PKI is similar to the RPKI, but is more widely visible since it relates to websites rather than routing tables.

  • If you’ve ever clicked the lock icon on your browser’s address bar, you’ve interacted with it.
  • The PKI relies on a set of trusted organizations called Certificate Authorities (CAs) to issue certificates to websites and web services.
  • Websites use these certificates to authenticate themselves to clients as part of the TLS protocol in HTTPS.

TLS provides encryption and integrity from the client to the server with the help of a digital certificate.

TLS connections are safe against MITM because the client doesn’t trust the attacker’s certificate.

While we were all patting ourselves on the back for moving the web to HTTPS, some researchers managed to find and exploit a weakness in the system: the process for getting HTTPS certificates.

Certificate Authorities (CAs) use a process called domain control validation (DCV) to ensure that they only issue certificates to website owners who legitimately request them.

  • Some CAs do this validation manually, which is secure but can’t scale to the total number of websites deployed today.
  • More progressive CAs have automated the validation process, but they rely on insecure methods (HTTP and DNS) to validate domain ownership. (A minimal sketch of HTTP-based validation follows this list.)
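
For a sense of what the automated check looks like, here is a minimal sketch in the style of ACME’s http-01 challenge; the helper and its error handling are our own simplification. The weakness is visible right in the code: the fetch happens over plain HTTP, so anyone who can hijack the route to the domain can forge a passing answer.

```python
import urllib.request

def http_dcv_check(domain, token, expected):
    # The CA fetches a well-known URL on the applicant's domain and checks
    # that the expected token was placed there. Because this travels over
    # plain HTTP, an on-path or BGP-hijacking attacker can spoof the reply.
    url = f"http://{domain}/.well-known/acme-challenge/{token}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8").strip() == expected
```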

Without ubiquitous cryptography in place (DNSSEC may never reach 100% deployment), there is no completely secure way to bootstrap this system. So, let’s look at how to distribute trust using other methods.

One tool at our disposal is the distributed nature of the Cloudflare network.

Cloudflare is global. We have locations all over the world connected to dozens of networks. That means we have different vantage points, resulting in different ways to traverse networks. This diversity can prove an advantage when dealing with BGP hijacking, since an attacker would have to hijack multiple routes from multiple locations to affect all the traffic between Cloudflare and other distributed parts of the Internet. The natural diversity of the network raises the cost of the attacks.

Using a distributed set of connections to the Internet as a quorum is a powerful way to distribute trust, with or without cryptography. The sketch below shows the idea in miniature.
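
As a toy model of the quorum idea, imagine performing the same check from several vantage points and only accepting a result that a strict majority agrees on. This sketch is ours, and it glosses over the hard parts of real multipath validation.

```python
from collections import Counter

def quorum_result(observations):
    # Accept a result only if a strict majority of independent vantage
    # points saw the same thing; a single hijacked path can't win a vote.
    value, count = Counter(observations).most_common(1)[0]
    if count <= len(observations) // 2:
        raise ValueError("no majority agreement; possible hijack or fault")
    return value

# Three vantage points agree, one saw a hijacked answer: majority wins.
print(quorum_result([b"good", b"good", b"forged", b"good"]))  # b'good'
```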

Distributed Trust

This idea of distributing the source of trust is powerful. Last year we announced the Distributed Web Gateway, which enables users to access content on the InterPlanetary File System (IPFS), a network structured to reduce the trust placed in any single party. Even if a participant in the network is compromised, it can’t be used to distribute tampered content, because the network is content-addressed. The sketch below shows the idea in simplified form.
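
This is a simplified illustration of why content addressing removes trust in the server, using plain SHA-256 in place of IPFS’s actual multihash-based content identifiers:

```python
import hashlib

def fetch_by_hash(address, fetch_fn):
    # The address *is* the hash of the content, so the client can verify
    # what it received no matter which (possibly compromised) peer sent it.
    data = fetch_fn(address)
    if hashlib.sha256(data).digest() != address:
        raise ValueError("content does not match its address; discard it")
    return data
```
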
However, content-based addressing is not the only way to distribute trust between multiple independent parties.

Another way to distribute trust is to literally split authority between multiple independent parties. We’ve explored this topic before. In the context of Internet services, this means ensuring that no single server can authenticate itself to a client on its own. For example,

  • In HTTPS, the server’s private key is the linchpin of its security. Compromising the owner of the private key (by hook or by crook) gives an attacker the ability to impersonate (spoof) that service. This single point of failure puts services at risk. You can mitigate this risk by distributing the authority to authenticate the service among multiple independently operated services. (A toy sketch of secret sharing follows.)
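
To show the idea in its simplest form, here is an n-of-n XOR secret-sharing sketch: no single share reveals anything about the key. Real deployments go further and use threshold signing so the key is never reassembled in one place; this is only a toy model of the sharing step.

```python
import os
from functools import reduce

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def split_key(key, n):
    # n-of-n sharing: n-1 shares are random, and the last one is the key
    # XORed with all of them. Any subset of n-1 shares is pure noise.
    shares = [os.urandom(len(key)) for _ in range(n - 1)]
    shares.append(reduce(xor_bytes, shares, key))
    return shares

def recover_key(shares):
    return reduce(xor_bytes, shares)

shares = split_key(b"super-secret-tls-key", 3)
assert recover_key(shares) == b"super-secret-tls-key"
```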

TLS doesn’t protect against server compromise.

With distributed trust, multiple parties combine to protect the connection.

An attacker that has compromised one of the servers cannot break the security of the system.

The Internet is old and slow to change, and we’ve only been able to improve it through the meticulous process of patching it piece by piece. Another option is to build new secure systems on top of this insecure foundation. IPFS is doing this, and IPFS is not alone in its design. There has been more research into secure systems with decentralized trust in the last ten years than ever before.

The result is radical new protocols and designs that use exotic new algorithms. These protocols do not supplant those at the core of the Internet (like TCP/IP), but instead, they sit on top of the existing Internet infrastructure, enabling new applications, much like HTTP did for the web.

Gaining Traction

Some of the most innovative technical projects were considered failures because they couldn’t attract users. New technology has to bring tangible benefits to its users to sustain it: useful functionality, content, and a decent user experience. Distributed projects, such as IPFS and others, are gaining popularity, but have not found mass adoption. This is a chicken-and-egg problem. New protocols have a high barrier to entry (users have to install new software), and because of the small audience, there is less incentive to create compelling content. Decentralization and distributed trust are nice security features to have, but they are not products. Users still need to get some benefit out of using the platform.

An example of a system to break this cycle is the web. In 1992 the web was hardly a cornucopia of awesomeness. What helped drive the dominance of the web was its users.

  • The growth of the user base meant more incentive for people to build services, and the availability of more services attracted more users. It was a virtuous cycle.
  • It’s hard for a platform to gain momentum, but once the cycle starts, a flywheel effect kicks in to help the platform grow.

The Distributed Web Gateway project Cloudflare launched last year in Crypto Week is our way of exploring what happens if we try to kickstart that flywheel. By providing a secure, reliable, and fast interface from the classic web with its two billion users to the content on the distributed web, we give the fledgling ecosystem an audience.

  • If the advantages provided by building on the distributed web are appealing to users, then the larger audience will help these services grow in popularity.
  • This is somewhat reminiscent of how IPv6 gained adoption. It started as a niche technology only accessible using IPv4-to-IPv6 translation services.
  • IPv6 adoption has now grown so much that it is becoming a requirement for new services. For example, Apple is requiring that all apps work in IPv6-only contexts.

Eventually, as user-side implementations of distributed web technologies improve, people may move to using the distributed web natively rather than through an HTTP gateway. Or they may not! By leveraging Cloudflare’s global network to give users access to new technologies based on distributed trust, we give these technologies a better chance at gaining adoption.

Happy Crypto Week

At Cloudflare, we always support new technologies that help make the Internet better. Part of making a better Internet is scaling the systems of trust that underpin web browsing and protecting them from attack. We provide the tools to create better systems of assurance with fewer points of vulnerability. We work with academic security researchers to get a vision of the future and engineer away vulnerabilities before they can become widespread. It’s a constant journey.

Cloudflare knows that none of this is possible without the work of researchers. From award-winning researchers publishing papers in top journals to clever hobbyists writing blog posts, dedicated and curious people are moving the state of the world’s knowledge forward. However, the push to publish new and novel research sometimes holds researchers back from committing enough time and resources to fully realize their ideas. Great research can be powerful on its own, but it can have an even broader impact when combined with practical applications. We relish the opportunity to stand on the shoulders of these giants and use our engineering know-how and global reach to expand on their work to help build a better Internet.

So, to all of you dedicated researchers, thank you for your work! Crypto Week is yours as much as ours. If you’re working on something interesting and you want help to bring the results of your research to the broader Internet, please contact us at [email protected]. We want to help you realize your dream of making the Internet safe and trustworthy.