Tag Archives: Internet Performance

Gone offline: how Cloudflare Radar detects Internet outages

Post Syndicated from Carlos Azevedo original http://blog.cloudflare.com/detecting-internet-outages/

Currently, Cloudflare Radar curates a list of observed Internet disruptions (which may include partial or complete outages) in the Outage Center. These disruptions are recorded whenever we have sufficient context to correlate with an observed drop in traffic, found by checking status updates and related communications from ISPs, or finding news reports related to cable cuts, government orders, power outages, or natural disasters.

However, we observe more disruptions than we currently report in the Outage Center, because there are cases where we can’t find any source of information that provides a likely cause for what we are observing, even though we can still validate the drop against external data sources such as Georgia Tech’s IODA. This curation process involves manual work, and is supported by internal tooling that allows us to analyze traffic volumes and detect anomalies automatically, triggering the workflow to find an associated root cause. While the Cloudflare Radar Outage Center is a valuable resource, it has key shortcomings: we are not reporting all disruptions, and the current curation process is not as timely as we’d like, because we still need to find the context.

As we announced today in a related blog post, Cloudflare Radar will be publishing anomalous traffic events for countries and Autonomous Systems (ASes). These events are the same ones referenced above that have been triggering our internal workflow to validate and confirm disruptions. (Note that at this time “anomalous traffic events” are associated with drops in traffic, not unexpected traffic spikes.) In addition to adding traffic anomaly information to the Outage Center, we are also launching the ability for users to subscribe to notifications at a location (country) or network (autonomous system) level whenever a new anomaly event is detected, or a new entry is added to the outage table. Please refer to the related blog post for more details on how to subscribe.

The current status of each detected anomaly will be shown in the new “Traffic anomalies” table on the Outage Center page:

  • When the anomaly is automatically detected its status will initially be Unverified
  • After attempting to validate ‘Unverified’ entries:
    • We will change the status to ‘Verified’ if we can confirm that the anomaly appears across multiple internal data sources, and possibly external ones as well. If we find associated context for it, we will also create an outage entry.
    • We will change status to ‘False Positive’ if we cannot confirm it across multiple data sources. This will remove it from the “Traffic anomalies” table. (If a notification has been sent, but the anomaly isn’t shown in Radar anymore, it means we flagged it as ‘False Positive’.)
  • We might also manually add an entry with a “Verified” status. This might occur if we observe, and validate, a drop in traffic that is noticeable, but was not large enough for the algorithm to catch it.

A glimpse at what Internet traffic volume looks like

At Cloudflare, we have several internal data sources that can give us insights into what the traffic for a specific entity looks like. We identify the entity based on IP address geolocation in the case of locations, and IP address allocation in the case of ASes, and can analyze traffic from different sources, such as DNS, HTTP, NetFlows, and Network Error Logs (NEL). All the signals used in the figures below come from one of these data sources and in this blog post we will treat this as a univariate time-series problem — in the current algorithm, we use more than one signal just to add redundancy and identify anomalies with a higher level of confidence. In the discussion below, we intentionally select various examples to encompass a broad spectrum of potential Internet traffic volume scenarios.

1. Ideally, the signals would resemble the pattern depicted below for Australia (AU): a stable weekly pattern with a slightly positive trend meaning that the trend average is moving up over time (we see more traffic over time from users in Australia).

These statements can be clearly seen when we perform time-series decomposition, which allows us to break a time-series down into its constituent parts to better understand and analyze its underlying patterns. Decomposing the traffic volume for Australia above, assuming a weekly pattern, with Seasonal-Trend decomposition using LOESS (STL), we get the following:

The weekly pattern we are referring to is represented by the seasonal component of the signal, which is expected because we are interested in eyeball / human Internet traffic. As observed in the image above, the trend component is expected to move slowly compared with the signal level, while the residual component would ideally resemble white noise, meaning that all existing patterns in the signal are captured by the seasonal and trend components.
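To make the decomposition concrete, here is a minimal, self-contained sketch on synthetic traffic data. The production analysis uses STL; this classical moving-average variant merely illustrates the trend / seasonal / residual split, and all numbers below are made up:

```python
import numpy as np

# Synthetic hourly traffic with a weekly seasonal pattern (period = 168 hours),
# a slight upward trend, and noise -- values are illustrative only.
period = 168
n = period * 6
t = np.arange(n)
rng = np.random.default_rng(0)
traffic = 100 + 0.02 * t + 30 * np.sin(2 * np.pi * t / period) + rng.normal(0, 2, n)

# Trend: moving average over one full period.
kernel = np.ones(period) / period
trend = np.convolve(traffic, kernel, mode="same")

# Seasonal: average detrended value at each position within the period.
detrended = traffic - trend
seasonal = np.array([detrended[i::period].mean() for i in range(period)])
seasonal_full = np.tile(seasonal, n // period)

# Residual: whatever the trend and seasonal components don't explain.
resid = traffic - trend - seasonal_full

# The three components reconstruct the original series exactly.
assert np.allclose(trend + seasonal_full + resid, traffic)
```

With a clean signal like Australia’s, `resid` is dominated by the injected noise, which is exactly the "white noise" property described above.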

2. Below we have the traffic volume for AS15964 (CAMNET-AS) that appears to have more of a daily pattern, as opposed to weekly.

We also observe that there’s a value offset of the signal right after the first four days (blue dashed line), and the red background shows an outage for which we didn’t find any reporting beyond seeing it in our data and that of other Internet data providers — our intention here is to develop an algorithm that will trigger an event when it comes across this or similar patterns.

3. Here we have a similar example for French Guiana (GF). We observe some data offsets (August 9 and 23), a change in the amplitude (between August 15 and 23) and another outage for which we do have context that is observable in Cloudflare Radar.

4. Another scenario is the series of scheduled outages for AS203214 (HulumTele), for which we also have context. These anomalies are the easiest to detect, since traffic drops to values that are unique to outages (they cannot be mistaken for regular traffic), but they pose another challenge: if we only checked for weekly patterns, then, because these government-directed outages recur at the same frequency, the algorithm would eventually come to see them as expected traffic.

5. This outage in Kenya could be seen as similar to the above: the traffic volume went down to unseen values although not as significantly. We also observe some upward spikes in the data that are not following any specific pattern — possibly outliers — that we should clean depending on the approach we use to model the time-series.

6. Lastly, here's the data that will be used throughout this post as an example of how we are approaching this problem. For Madagascar (MG), we observe a clear pattern with pronounced weekends (blue background). There’s also a holiday (Assumption of Mary), highlighted with a green background, and an outage, with a red background. In this example, weekends, holidays, and outages all seem to have roughly the same traffic volume. Fortunately, the outage gives itself away: traffic started to climb as on a normal working day, and then dropped suddenly — we will look at it more closely later in this post.

In summary, we have looked at six examples out of the ~700 entities for which we are currently detecting anomalies automatically, and we see a wide range of variability. This means that in order to effectively model the time-series we would have to run many preprocessing steps before the modeling itself: removing outliers, detecting short- and long-term data offsets and readjusting, and detecting changes in variance, mean, or magnitude. Time is also a factor in preprocessing, as we would need to know in advance when to expect events / holidays that will push traffic down, apply daylight saving time adjustments that cause a time shift in the data, and be able to apply local time zones for each entity, including dealing with locations that have multiple time zones and AS traffic that is shared across different time zones.

To add to the challenge, some of these steps cannot even be performed in a close-to-real-time fashion (example: we can only say there’s a change in seasonality after some time of observing the new pattern). Considering the challenges mentioned earlier, we have chosen an algorithm that combines basic preprocessing and statistics. This approach aligns with our expectations for the data's characteristics, offers ease of interpretation, allows us to control the false positive rate, and ensures fast execution while reducing the need for many of the preprocessing steps discussed previously.

Above, we noted that we are detecting anomalies for around 700 entities (locations and autonomous systems) at launch. This obviously does not represent the entire universe of countries and networks, and for good reason. As we discuss in this post, we need to see enough traffic from a given entity (have a strong enough signal) to be able to build relevant models and subsequently detect anomalies. For some smaller or sparsely populated countries, the traffic signal simply isn’t strong enough, and for many autonomous systems, we see little-to-no traffic from them, again resulting in a signal too weak to be useful. We are initially focusing on locations where we have a sufficiently strong traffic signal and/or are likely to experience traffic anomalies, as well as major or notable autonomous systems — those that represent a meaningful percentage of a location’s population and/or those that are known to have been impacted by traffic anomalies in the past.

Detecting anomalies

The approach we took to solve this problem involves creating a forecast: a set of data points that correspond to what we expect to see based on historical data. This will be explained in the section Creating a forecast. We take this forecast and compare it to what we are actually observing — if what we are observing is significantly different from what we expect, we call it an anomaly. Since we are interested in traffic drops, an anomaly will always correspond to lower traffic than the forecast / expected traffic. This comparison is elaborated in the section Comparing forecast with actual traffic.

In order to compute the forecast we need to fulfill the following business requirements:

  • We are mainly interested in traffic related to human activity.
  • The more timely we detect the anomaly, the more useful it is. This needs to take into account constraints such as data ingestion and data processing times, but once the data is available, we should be able to use the latest data point and detect if it is an anomaly.
  • A low False Positive (FP) rate is more important than a high True Positive (TP) rate. For an internal tool this is not necessarily true, but for a publicly visible notification service, we want to limit spurious entries at the cost of not reporting some anomalies.

Selecting which entities to observe

Aside from the examples given above, the quality of the data depends heavily on its volume, which means that we have different levels of data quality depending on which entity (location / AS) we are considering. As an extreme example, we don’t have enough data from Antarctica to reliably detect outages. The following is the process we used to select which entities are eligible to be observed.

For ASes, since we are mainly interested in Internet traffic that represents human activity, we use the number of users estimation provided by APNIC. We then compute the total number of users per location by summing up the number of users of each AS in that location, and then we calculate what percentage of users an AS has for that location (this number is also provided by the APNIC table in column ‘% of country’). We filter out ASes that have less than 1% of the users in that location. Here’s what the list looks like for Portugal — AS15525 (MEO-EMPRESAS) is excluded because it has less than 1% of users of the total number of Internet users in Portugal (estimated).

At this point we have a subset of ASes and a set of locations (we don’t exclude any location a priori because we want to cover as much as possible), but we will have to narrow it down based on the quality of the data to be able to reliably detect anomalies automatically. After testing several metrics and visually analyzing the results, we came to the conclusion that the best predictor of a stable signal is the volume of data, so we removed the entities that don’t meet a minimum number of daily unique IPs over a two-week period — the threshold is based on visual inspection.
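As an illustration, the two filters could be sketched like this. Field names, numbers, and thresholds are hypothetical; APNIC's actual schema and Cloudflare's pipeline differ:

```python
MIN_USER_SHARE = 0.01        # an AS must serve at least 1% of a location's users
MIN_DAILY_UNIQUE_IPS = 1000  # illustrative signal-strength threshold

# Hypothetical per-AS user estimates for one location.
ases = [
    {"asn": 2860,  "location": "PT", "users": 1_900_000},
    {"asn": 3243,  "location": "PT", "users": 800_000},
    {"asn": 15525, "location": "PT", "users": 9_000},   # below 1% -> excluded
]

# Total estimated users per location, then each AS's share of that total.
totals = {}
for a in ases:
    totals[a["location"]] = totals.get(a["location"], 0) + a["users"]

eligible_asns = [
    a["asn"] for a in ases
    if a["users"] / totals[a["location"]] >= MIN_USER_SHARE
]

def has_strong_signal(daily_unique_ips):
    """Keep an entity only if every day in a two-week window clears the bar
    (one plausible reading of the minimum-unique-IPs criterion)."""
    return (len(daily_unique_ips) >= 14
            and min(daily_unique_ips) >= MIN_DAILY_UNIQUE_IPS)
```

With these made-up numbers the third AS drops out of `eligible_asns`, mirroring the Portugal example above.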

Creating a forecast

In order to detect the anomalies in a timely manner, we decided to go with traffic aggregated every fifteen minutes, and we are forecasting one hour of data (four data points / blocks of fifteen minutes) that are compared with the actual data.

After selecting the entities for which we will detect anomalies, the approach is quite simple:

1. We look at the 24 hours immediately before the forecast window and use that interval as the reference. The assumption is that the last 24 hours will contain information about the shape of what follows. In the figure below, the last 24 hours (in blue) correspond to data transitioning from Friday to Saturday. Using the Euclidean distance, we get the six most similar matches to that reference (orange) — four of those six matches correspond to other transitions from Friday to Saturday. It also captures the transition from the Monday holiday (August 14, 2023) to Tuesday, and we see one match that is least similar to the reference: a regular working day from Wednesday to Thursday. Capturing a day that doesn’t represent the reference properly is not a problem, because the forecast is the median of the most similar 24-hour periods, so that day’s data ends up being discarded.

  1. There are two important parameters we use to make this approach work:
    • We take into consideration the last 28 days (plus the reference day, for 29 in total). This ensures that the weekly seasonality is seen at least four times, controls the risk associated with the trend changing over time, and sets an upper bound on the amount of data we need to process. Looking at the example above, the first day was among the most similar to the reference because it also corresponds to a transition from Friday to Saturday.
    • The other parameter is the number of most similar days. We use six days as a result of empirical knowledge: given the weekly seasonality, with six days we expect to match at most four days for the same weekday, plus two more that might be completely different. Since we use the median to create the forecast, the majority is still four, and those extra days end up not being used as reference. Another scenario involves holidays, such as the example below:

A holiday in the middle of the week in this case looks like a transition from Friday to Saturday. Since we are using the last 28 days and the holiday starts on a Tuesday, we only see three such matching transitions (orange), plus another three regular working days, because that pattern is not found anywhere else in the time-series and those are the closest matches. This is why, when computing the median of an even number of values, we take the lower of the two middle values (rounding the data down) and use the result as the forecast. This also allows us to be more conservative and plays a role in the true positive / false positive tradeoff.
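The matching-and-median procedure described above can be sketched roughly as follows. This is a simplified, hypothetical implementation: the parameter names and the choice to slide over every 15-minute offset are our own, not necessarily Cloudflare's:

```python
import numpy as np

POINTS_PER_DAY = 96   # 15-minute buckets in a day
HISTORY_DAYS = 28     # plus the 24h reference = 29 days of data
N_MATCHES = 6         # most similar days kept
FORECAST_LEN = 4      # one hour = four 15-minute points

def forecast_next_hour(series):
    """Forecast the next hour from the most similar historical 24h windows."""
    series = np.asarray(series, dtype=float)
    assert len(series) >= (HISTORY_DAYS + 1) * POINTS_PER_DAY

    ref = series[-POINTS_PER_DAY:]      # last 24 hours (the reference)
    history = series[:-POINTS_PER_DAY]  # the 28 days before that

    # Each candidate is a 24h window plus the hour that followed it.
    candidates = []
    for start in range(len(history) - POINTS_PER_DAY - FORECAST_LEN + 1):
        window = history[start:start + POINTS_PER_DAY]
        following = history[start + POINTS_PER_DAY:
                            start + POINTS_PER_DAY + FORECAST_LEN]
        dist = np.linalg.norm(window - ref)  # Euclidean distance to reference
        candidates.append((dist, following))

    candidates.sort(key=lambda c: c[0])
    best = np.array([f for _, f in candidates[:N_MATCHES]])

    # "Low median": with an even number of matches, take the lower of the
    # two middle values for each forecast point (the conservative choice).
    best.sort(axis=0)
    return best[(N_MATCHES - 1) // 2]
```

On a perfectly periodic signal, the six closest windows are all phase-aligned copies of the reference, so the forecast reproduces the true continuation.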

Lastly let's look at the outage example:

In this case, the matches are always connected to low traffic, because the last 24 hours (the reference) correspond to a transition from Sunday to Monday, and due to the low traffic, the windows with the lowest Euclidean distance (the most similar 24-hour periods) are either Saturdays (twice) or Sundays (four times). So the forecast is what we would expect to see on a regular Monday, which is why the forecast (red) has an upward trend; but since we had an outage, the actual volume of traffic (black) is considerably lower than the forecast.

This approach works for regular seasonal patterns, as would several other modeling approaches, and it has also been shown to work in case of holidays and other moving events (such as festivities that don’t happen at the same day every year) without having to actively add that information in. Nevertheless, there are still use cases where it will fail specifically when there’s an offset in the data. This is one of the reasons why we use multiple data sources to reduce the chances of the algorithm being affected by data artifacts.

Below we have an example of how the algorithm behaves over time.

Comparing forecast with actual traffic

Once we have the forecast and the actual traffic volume, we do the following steps.

We calculate relative change, which measures how much one value has changed relative to another. Since we are detecting anomalies based on traffic drops, the actual traffic will always be lower than the forecast.

relative change = (actual − forecast) / forecast

After calculating this metric, we apply the following rules:

  • The difference between the actual and the forecast must be at least 10% of the magnitude of the signal, where magnitude is computed as the difference between the 95th and 5th percentiles of the selected data. The idea is to avoid scenarios where traffic is low, particularly during off-peak hours, and scenarios where small changes in actual traffic correspond to large changes in relative change because the forecast is also low. As an example:
    • a forecast of 100 Gbps compared with an actual value of 80 Gbps gives us a relative change of -0.20 (-20%).
    • a forecast of 20 Mbps compared with an actual value of 10 Mbps gives us a much smaller decrease in total volume than the previous example, but a relative change of -0.50 (-50%).
  • Then we have two rules for detecting considerably low traffic:
    • Sustained anomaly: The relative change is below a given threshold α throughout the forecast window (for all four data points). This allows us to detect weaker anomalies (with smaller relative changes) that are extended over time.
    • Point anomaly: The relative change of the last data point of the forecast window is below a given threshold β, where β < α (these thresholds are negative; as an example, β and α might be -0.6 and -0.4, respectively). We need β < α to avoid triggering anomalies due to the stochastic nature of the data, while still being able to detect sudden, short-lived traffic drops.
  • The values of α and β were chosen empirically to maximize the detection rate while keeping the false positive rate at an acceptable level.
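Putting the rules together, a hedged sketch of the detection logic might look like this. The threshold values and data are illustrative; the real values are not published, and only the relationship β < α (both negative) is taken from the text:

```python
import numpy as np

ALPHA = -0.4              # sustained-anomaly threshold (illustrative)
BETA = -0.6               # point-anomaly threshold (illustrative, beta < alpha)
MIN_DROP_FRACTION = 0.10  # drop must be >= 10% of the signal magnitude

def detect_anomaly(forecast, actual, recent_data):
    """Apply the rules above to a one-hour window (four 15-minute points)."""
    forecast = np.asarray(forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)

    # Signal magnitude: 95th minus 5th percentile of the selected data.
    magnitude = np.percentile(recent_data, 95) - np.percentile(recent_data, 5)

    drop = forecast - actual
    big_enough = drop >= MIN_DROP_FRACTION * magnitude
    rel_change = (actual - forecast) / forecast

    # Sustained anomaly: every point in the window is below alpha.
    if np.all(big_enough & (rel_change < ALPHA)):
        return "sustained"
    # Point anomaly: only the latest point needs to be below beta.
    if big_enough[-1] and rel_change[-1] < BETA:
        return "point"
    return None
```

A uniform 50% drop across the hour trips the sustained rule, while a single steep drop in the last 15-minute bucket trips the point rule.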

Closing an anomaly event

Although the most important message that we want to convey is when an anomaly starts, it is also crucial to detect when the Internet traffic volume goes back to normal for two main reasons:

  • We need the notion of an active anomaly, meaning an anomaly we detected that is still ongoing. This lets us stop considering new data for the reference while the anomaly is active, since that data would skew the reference and the selection of the most similar sets of 24 hours.
  • Once the traffic goes back to normal, knowing the duration of the anomaly allows us to flag those data points as outliers and replace them, so we don’t end up using them as the reference or as best matches to it. Although we use the median to compute the forecast, and in most cases that would be enough to overcome the presence of anomalous data, there are scenarios such as AS203214 (HulumTele), used as example four, where outages frequently occur at the same time of day, which would make the anomalous data become the expectation after a few days.

Whenever we detect an anomaly, we keep the same reference until the data comes back to normal; otherwise our reference would start including anomalous data. To determine when traffic is back to normal, we use lower thresholds than α and require a time period (currently four hours) with no anomalies before the event can close. This avoids situations where we observe drops in traffic that bounce back to normal and then drop again. In such cases we want to detect a single anomaly and aggregate it, to avoid sending multiple notifications — and semantically, there’s a high chance it’s all the same anomaly.
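As a rough illustration, the open/close lifecycle described above could be modeled like this. It is a hypothetical sketch: the four-hour quiet window comes from the description above, everything else (class and field names) is invented:

```python
from datetime import datetime, timedelta

CLOSE_AFTER = timedelta(hours=4)  # quiet period required before closing

class AnomalyEvent:
    """Track an anomaly's lifecycle: open while active, closed after 4 quiet hours."""

    def __init__(self, start):
        self.start = start
        self.last_anomalous = start
        self.closed_at = None

    def observe(self, ts, is_anomalous):
        if self.closed_at is not None:
            return
        if is_anomalous:
            # Traffic dropped again within the quiet window: same event.
            self.last_anomalous = ts
        elif ts - self.last_anomalous >= CLOSE_AFTER:
            self.closed_at = ts

    @property
    def active(self):
        return self.closed_at is None
```

While an event is `active`, the caller would keep the pre-anomaly reference frozen; once closed, the event's data points can be flagged as outliers and replaced.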

Conclusion

Internet traffic data is generally predictable, which in theory would allow us to build a very straightforward anomaly detection algorithm to detect Internet disruptions. However, due to the heterogeneity of the time series across entities (locations or ASes) and the presence of artifacts in the data, a lot of context is needed, which poses challenges for tracking in real time. Here we’ve shown particular examples of what makes this problem challenging, and we have explained how we approached it in order to overcome most of the hurdles. This approach has proven very effective at detecting traffic anomalies while keeping a low false positive rate, which is one of our priorities. Since it is a static-threshold approach, one downside is that we do not detect anomalies that are less steep than the ones we’ve shown.

We will keep working on adding more entities and refining the algorithm to be able to cover a broader range of anomalies.

Visit Cloudflare Radar for additional insights around Internet disruptions, routing issues, Internet traffic trends, attacks, Internet quality, and more. Follow us on social media at @CloudflareRadar (Twitter), cloudflare.social/@radar (Mastodon), and radar.cloudflare.com (Bluesky), or contact us via e-mail.

Making home Internet faster has little to do with “speed”

Post Syndicated from Mike Conlow original https://blog.cloudflare.com/making-home-internet-faster/

More than ten years ago, researchers at Google published a paper with the seemingly heretical title “More Bandwidth Doesn’t Matter (much)”. We published our own blog showing it is faster to fly 1TB of data from San Francisco to London than it is to upload it on a 100 Mbps connection. Unfortunately, things haven’t changed much. When you make purchasing decisions about home Internet plans, you probably consider the bandwidth of the connection when evaluating Internet performance. More bandwidth is faster speed, or so the marketing goes. In this post, we’ll use real-world data to show both bandwidth and – spoiler alert! – latency impact the speed of an Internet connection. By the end, we think you’ll understand why Cloudflare is so laser focused on reducing latency everywhere we can find it.

First, we should quickly define bandwidth and latency. Bandwidth is the amount of data that can be transmitted at any single time. It’s the maximum throughput, or capacity, of the communications link between two servers that want to exchange data. Usually, the bottleneck – the place in the network where the connection is constrained by the amount of bandwidth available – is in the “last mile”, either the wire that connects a home, or the modem or router in the home itself.

If the Internet is an information superhighway, bandwidth is the number of lanes on the road. The wider the road, the more traffic can fit on the highway at any time. Bandwidth is useful for downloading large files like operating system updates and big game updates. While bandwidth, throughput and capacity are synonyms, confusingly “speed” has come to mean bandwidth when talking about Internet plans. More on this later.

We use bandwidth when streaming video, though probably less than you think. Netflix recommends 15 Mbps of bandwidth to watch a stream in 4K/Ultra HD. A 1 Gbps connection could stream more than 60 Netflix shows in 4K at the same time!

Latency, on the other hand, is how fast data is moving through the Internet. To extend our analogy, latency is the speed at which traffic is moving on the highway. If traffic is moving quickly, you’ll get to your destination faster. Latency is measured as the number of milliseconds it takes a packet of data to go from a client (such as your laptop computer) to a server, and then back to the client. If you’re practicing tennis against a wall, round-trip latency is the time the ball was in the air. On the Internet “backbone”, data travels at almost 200,000 kilometers per second as it bounces off the glass inside fiber optic wires. That’s fast!

While we can’t make light travel through glass much faster, we can improve latency by moving the content closer to users, shortening the distance data needs to travel. That’s the effect of our presence in more than 285 cities globally: when you’re on the Internet superhighway trying to reach Cloudflare, we want to be just off the next exit.

We know that low-latency (low delay; high speed) connections are important for gaming, where tiny bits of data, such as the change in position of players in a game, need to reach another computer quickly. And increasingly, we’re becoming aware of high latency when it makes our videoconferencing choppy and unpleasant.

But we don’t use the Internet only to play games, nor only watch streaming video. We do those, and we visit a lot of normal web pages in between.

In the 2010 paper from Google, the author simulated loading web pages while varying the throughput and latency of the connection. What he found is that above about 5 Mbps, the page doesn’t load much faster. Increasing bandwidth from 1 Mbps to 2 Mbps yields almost a 40% improvement in page load time; going from 5 Mbps to 6 Mbps yields less than a 5% improvement.

When he varies the latency (the Round Trip Time, or RTT), something interesting happens: there is a linear return to better latency. For every 20 milliseconds of reduced latency, the page load time improves by about 10%.

Let’s see what this looks like in real life with empirical data. Below is a chart from an excellent recent paper by two researchers from MIT. Using data from the FCC’s Measuring Broadband America program, these researchers produce a similar chart to what we expect from the 2010 simulation. Though the point of diminishing returns to more bandwidth has moved higher – to about 20 Mbps – the overall trend is exactly the same.

For the latency relationship, we use aggregate and anonymized data from Cloudflare’s own delivery network. It’s a familiar pattern. For every 200 milliseconds of latency we can save, we cut the page load time by over 1 second. That relationship applies when the latency is 950 milliseconds. And it applies when the latency is 50 milliseconds.

Let’s try to understand why this relationship exists. When you connect to a website, the first thing your browser does is establish an encrypted connection and make sure the site you want to access is authenticated to be the site you think you’re accessing. This process involves the TCP connection setup and the TLS handshake. Before you even see a pixel of content, your browser and the website will exchange information up to seven times: three to establish the TCP connection, and up to four more to perform the TLS handshake.
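A quick back-of-the-envelope calculation shows why those exchanges matter. The round-trip counts below are rough approximations (TLS 1.3 and QUIC need fewer flights), chosen only to illustrate how setup cost scales with latency:

```python
# Back-of-the-envelope: latency cost before the first byte of content.
# Roughly ~1 RTT for the TCP handshake, ~2 for a TLS 1.2 handshake, and
# 1 for the HTTP request itself -- approximate, protocol-dependent counts.
def time_to_first_byte_ms(rtt_ms, round_trips=4):
    """Pure latency cost before any content arrives (ignores server time)."""
    return rtt_ms * round_trips

# Halving the RTT halves the setup cost, regardless of bandwidth.
print(time_to_first_byte_ms(100))  # 400 ms of pure latency
print(time_to_first_byte_ms(50))   # 200 ms
```

No amount of extra bandwidth shortens these round trips; only lower latency does.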

On top of that, when we load a webpage after we establish encryption and verify website authority, we might be asking the browser to load hundreds of different files across dozens of different domains. Some of these files can be loaded in parallel, but others need to be loaded sequentially. As the browser races to compile all these different files, it’s the speed at which it can get to the server and back that determines how fast it can put the page together. The files are often quite small, but there’s a lot of them.

The chart below shows the beginning of what the browser does when it loads cnn.com. After a 301 redirect, the browser loads the main HTML page in step two. Only then, more than 1 second into the load, does it know about all the javascript files it needs to render the page. Files 3-19 become unblocked once the browser has the HTML file in step two, but we see more blocking later on. Files 20-27 are all blocked on earlier files. They can’t start until the browser has the previous file back from the server. There are 650 assets in this page load, and the blocking happens all the way through the page load. Here’s why this matters: better latency makes every file load faster, which in turn unblocks other files faster, and so on.
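A toy model makes the bandwidth-versus-latency tradeoff in this waterfall concrete. All numbers here are invented for illustration; real page loads overlap these phases in more complicated ways:

```python
# Toy model: files on the critical path are fetched one after another, each
# costing at least one round trip, while bandwidth mostly affects how long
# the parallel bulk of assets takes to transfer.
def page_load_ms(rtt_ms, chain_depth, parallel_bytes, bandwidth_bps):
    sequential = rtt_ms * chain_depth                      # latency-bound part
    transfer = parallel_bytes * 8 / bandwidth_bps * 1000   # bandwidth-bound part
    return sequential + transfer

# 500 KB of parallel assets behind a 5-deep blocking chain:
slow_link = page_load_ms(rtt_ms=100, chain_depth=5,
                         parallel_bytes=5e5, bandwidth_bps=20e6)
fast_link = page_load_ms(rtt_ms=100, chain_depth=5,
                         parallel_bytes=5e5, bandwidth_bps=1e9)
low_rtt = page_load_ms(rtt_ms=25, chain_depth=5,
                       parallel_bytes=5e5, bandwidth_bps=20e6)
# 50x more bandwidth shaves only the transfer (~200 ms); cutting the RTT
# to a quarter shaves the chain (~375 ms) and wins.
print(round(slow_link), round(fast_link), round(low_rtt))
```

Even in this crude model, the latency-bound chain dominates once bandwidth is adequate, matching the empirical charts above.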

The browser will do its best to use all the bandwidth available, but often is using just a fraction of what’s available. It’s no wonder then that adding more bandwidth doesn’t speed up the page load, but better latency does. While developments like Early Hints help this by allowing browsers to pre-fetch files and content that don’t need to be serialized, this is still a problem for many websites on the Internet today.

Recently, Internet researchers have turned their attention to using our understanding of the relationship between throughput and latency to improve Internet Quality of Experience (QoE). A paper from the Broadband Internet Technical Advisory Group (BITAG) summarizes:

But we now recognize that it is not just greater throughput that matters, but also consistently low latency. Unfortunately, the way that we’ve historically understood and characterized latency was flawed, and our latency measurements and metrics were not aligned with end-user QoE.

There’s a difference between latency on an idle Internet connection and latency measured in working conditions, which we call “working latency” or “responsiveness”. Since responsiveness is what the user experiences as the speed of their Internet connection, it’s important to understand and measure this particular latency.

An Internet connection can suffer from poor responsiveness (even if it has good idle latency) when data is delayed in buffers. If you download a large file, for example an operating system update, the server sending the file might send the file with higher throughput than the Internet connection can accept. That’s ok. Extra bits of the file will sit in a buffer until it’s their turn to go through the funnel. Adding extra lanes to the highway allows more cars to pass through, and is a good strategy if we aren’t particularly concerned with the speed of the traffic.

Say for example, Christabel is watching a stream of the news while on a video meeting. When Christabel starts watching the video, her browser fetches a bunch of content and stores it in various buffers on the way from the content host to the browser. Those same buffers also contain data packets pertaining to the video meeting Christabel is currently in. If the data generated as part of a video conference sits in the same buffer as the video files, the video files will fill up the buffer and cause delay for the video meeting packets as well. That type of buffering delay is called bufferbloat, and leads to jittery video calls and delays where people speak over each other and get frustrated.

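The delay those buffers add is easy to estimate: it is the amount of data queued ahead of a packet, divided by the rate at which the link can drain the queue. A back-of-the-envelope sketch with illustrative numbers:

```python
def queueing_delay_ms(queued_bytes: int, link_mbps: float) -> float:
    """Extra latency a packet suffers waiting behind queued_bytes in a buffer."""
    drain_rate_bytes_per_ms = link_mbps * 1_000_000 / 8 / 1_000
    return queued_bytes / drain_rate_bytes_per_ms

# 5 MB of a file download queued ahead of a video-call packet on a 20 Mbps link
print(queueing_delay_ms(5_000_000, 20))  # 2000.0 ms of added delay: bufferbloat
```

Two full seconds of delay is far more than enough to make a video call unusable, even though the connection’s “speed” is unchanged.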

To help users understand the strengths and weaknesses of their connection, we recently added Aggregated Internet Measurement (AIM) scores to our own “Speed” Test. These scores set aside the technical metrics and give users a real-world, plain-English understanding of what their connection will be good at, and where it might struggle. We’d also like to collect more data from our speed test to help track Page Load Times (PLT) and see how they correlate with lower working latency. You’ll start seeing those numbers on our speed test soon!


We all use our Internet connections in slightly different ways, but we share the desire for our connections to be as fast as possible. As more and more services move into the cloud – documents, music, websites, communications, and more – the speed at which we can access those services becomes critical. While bandwidth plays a part, the latency of the connection – the real Internet “speed” – is more important.

At Cloudflare, we’re working every day to help build a more performant Internet. Want to help? Apply for one of our open engineering roles here.

The unintended consequences of blocking IP addresses

Post Syndicated from Alissa Starzak original https://blog.cloudflare.com/consequences-of-ip-blocking/


In late August 2022, Cloudflare’s customer support team began to receive complaints about sites on our network being down in Austria. Our team immediately went into action to try to identify the source of what looked from the outside like a partial Internet outage in Austria. We quickly realized that it was an issue with local Austrian Internet Service Providers.

But the service disruption wasn’t the result of a technical problem. As we later learned from media reports, what we were seeing was the result of a court order. Without any notice to Cloudflare, an Austrian court had ordered Austrian Internet Service Providers (ISPs) to block 11 of Cloudflare’s IP addresses.

In an attempt to block 14 websites that copyright holders argued were violating copyright, the court-ordered IP block rendered thousands of websites inaccessible to ordinary Internet users in Austria over a two-day period. What did the thousands of other sites do wrong? Nothing. They were a temporary casualty of the failure to build legal remedies and systems that reflect the Internet’s actual architecture.

Today, we are going to dive into a discussion of IP blocking: why we see it, what it is, what it does, who it affects, and why it’s such a problematic way to address content online.

Collateral effects, large and small

The craziest thing is that this type of blocking happens on a regular basis, all around the world. But unless that blocking happens at the scale of what happened in Austria, or someone decides to highlight it, it is typically invisible to the outside world. Even Cloudflare, with deep technical expertise and understanding about how blocking works, can’t routinely see when an IP address is blocked.

For Internet users, it’s even more opaque. They generally don’t know why they can’t connect to a particular website, where the connection problem is coming from, or how to address it. They simply know they cannot access the site they were trying to visit. And that can make it challenging to document when sites have become inaccessible because of IP address blocking.

Blocking practices are also widespread. In its Freedom on the Net report, Freedom House recently reported that 40 of the 70 countries it examined – which range from countries like Russia, Iran and Egypt to Western democracies like the United Kingdom and Germany – engaged in some form of website blocking. Although the report doesn’t delve into exactly how those countries block, many of them use forms of IP blocking, with the same kind of potential for a partial Internet shutdown that we saw in Austria.

Although it can be challenging to assess the amount of collateral damage from IP blocking, we do have examples where organizations have attempted to quantify it. In conjunction with a case before the European Court of Human Rights, the European Information Society Institute, a Slovakia-based nonprofit, reviewed Russia’s regime for website blocking in 2017. Russia exclusively used IP addresses to block content. The European Information Society Institute concluded that IP blocking led to “collateral website blocking on a massive scale” and noted that as of June 28, 2017, “6,522,629 Internet resources had been blocked in Russia, of which 6,335,850 – or 97% – had been blocked collaterally, that is to say, without legal justification.”

In the UK, overbroad blocking prompted the non-profit Open Rights Group to create the website Blocked.org.uk. The website has a tool enabling users and site owners to report overblocking and request that ISPs remove blocks. The group also has hundreds of individual stories about the effect of blocking on those whose websites were inappropriately blocked, from charities to small business owners. Although it’s not always clear which blocking methods are being used, the fact that the site is necessary at all conveys the extent of the overblocking. Imagine a dressmaker, watchmaker or car dealer looking to advertise their services and potentially gain new customers with their website. That doesn’t work if local users can’t access the site.

One reaction might be, “Well, just make sure there are no restricted sites sharing an address with unrestricted sites.” But as we’ll discuss in more detail, this ignores the large difference between the number of possible domain names and the number of available IP addresses, and runs counter to the very technical specifications that empower the Internet. Moreover, the definitions of restricted and unrestricted differ across nations, communities, and organizations. Even if it were possible to know all the restrictions, the designs of the protocols — of the Internet, itself — mean that it is simply infeasible, if not impossible, to satisfy every agency’s constraints.

Overblocking websites is not only a problem for users; it has legal implications. Because of the effect it can have on ordinary citizens looking to exercise their rights online, government entities (both courts and regulatory bodies) have a legal obligation to make sure that their orders are necessary and proportionate, and don’t unnecessarily affect those who are not contributing to the harm.

It would be hard to imagine, for example, that a court in response to alleged wrongdoing would blindly issue a search warrant or an order based solely on a street address without caring if that address was for a single family home, a six-unit condo building, or a high rise with hundreds of separate units. But those sorts of practices with IP addresses appear to be rampant.

In 2020, the European Court of Human Rights (ECHR) – the court overseeing the implementation of the Council of Europe’s European Convention on Human Rights – considered a case involving a website that was blocked in Russia not because it had been targeted by the Russian government, but because it shared an IP address with a blocked website. The website owner brought suit over the block. The ECHR concluded that the indiscriminate blocking was impermissible, ruling that the block on the lawful content of the site “amounts to arbitrary interference with the rights of owners of such websites.” In other words, the ECHR ruled that it was improper for a government to issue orders that resulted in the blocking of sites that were not targeted.

Using Internet infrastructure to address content challenges

Ordinary Internet users don’t think a lot about how the content they are trying to access online is delivered to them. They assume that when they type a domain name into their browser, the content will automatically pop up. And if it doesn’t, they tend to assume the website itself is having problems unless their entire Internet connection seems to be broken. But those basic assumptions ignore the reality that connections to a website are often used to limit access to content online.

Why do countries block connections to websites? Maybe they want to limit their own citizens from accessing what they believe to be illegal content – like online gambling or explicit material – that is permissible elsewhere in the world. Maybe they want to prevent the viewing of a foreign news source that they believe to be primarily disinformation. Or maybe they want to support copyright holders seeking to block access to a website to limit viewing of content that they believe infringes their intellectual property.

To be clear, blocking access is not the same thing as removing content from the Internet. There are a variety of legal obligations and authorities designed to permit actual removal of illegal content. Indeed, the legal expectation in many countries is that blocking is a matter of last resort, after attempts have been made to remove content at the source.

Blocking just prevents certain viewers – those whose Internet access depends on the ISP that is doing the blocking – from being able to access websites. The site itself continues to exist online and is accessible by everyone else. But when the content originates from a different place and can’t be easily removed, a country may see blocking as their best or only approach.

We recognize the concerns that sometimes drive countries to implement blocking. But fundamentally, we believe it’s important for users to know when the websites they are trying to access have been blocked, and, to the extent possible, who has blocked them from view and why. And it’s critical that any restrictions on content should be as limited as possible to address the harm, to avoid infringing on the rights of others.

Brute force IP address blocking doesn’t allow for those things. It’s fully opaque to Internet users. The practice has unintended, unavoidable consequences on other content. And the very fabric of the Internet means that there is no good way to identify what other websites might be affected either before or during an IP block.

To understand what happened in Austria and what happens in many other countries around the world that seek to block content with the bluntness of IP addresses, we have to understand what is going on behind the scenes. That means diving into some technical details.

Identity is attached to names, never addresses

Before we even get started describing the technical realities of blocking, it’s important to stress that the first and best option to deal with content is at the source. A website owner or hosting provider has the option of removing content at a granular level, without having to take down an entire website. On the more technical side, a domain name registrar or registry can potentially withdraw a domain name, and therefore a website, from the Internet altogether.

But how do you block access to a website, if for whatever reason the content owner or content source is unable or unwilling to remove it from the Internet?  There are only three possible control points.

The first is via the Domain Name System (DNS), which translates domain names into IP addresses so that the site can be found. Instead of returning a valid IP address for a domain name, the DNS resolver could lie and respond with the code NXDOMAIN, meaning “there is no such name.” A better approach would be to use one of the honest error numbers standardized in 2020, including error 15 for blocked, error 16 for censored, 17 for filtered, or 18 for prohibited, although these are not widely used yet.
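For reference, those “honest” error numbers are the Extended DNS Errors (EDE) standardized in RFC 8914. A small sketch of how client software might translate them into plain English (the helper function is our own illustration, not a standard API):

```python
# Extended DNS Error (EDE) codes from RFC 8914 relevant to content blocking
EDE_BLOCKING_CODES = {
    15: "Blocked",     # blocked by the resolver operator's own policy
    16: "Censored",    # blocked due to an external requirement (e.g. a court order)
    17: "Filtered",    # blocked at the client's own request (e.g. parental controls)
    18: "Prohibited",  # the client is not authorized to receive the answer
}

def describe_block(ede_code: int) -> str:
    """Translate an EDE code from a DNS response into plain English."""
    return EDE_BLOCKING_CODES.get(ede_code, "No blocking-related extended error")

print(describe_block(16))  # Censored
```

Unlike a bare NXDOMAIN, these codes let the user know not just that resolution failed, but why.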

Interestingly, the precision and effectiveness of DNS as a control point depend on whether the DNS resolver is private or public. Private or ‘internal’ DNS resolvers are operated by ISPs and enterprise environments for their own known clients, which means that operators can be precise in applying content restrictions. By contrast, that level of precision is unavailable to open or public resolvers, not least because routing and addressing on the Internet are global and ever-changing, in stark contrast to the fixed addresses and routes of a postal or street map. For example, private DNS resolvers may be able to block access to websites within specified geographic regions with at least some level of accuracy in a way that public DNS resolvers cannot, which becomes profoundly important given the disparate (and inconsistent) blocking regimes around the world.

The second approach is to block individual connection requests to a restricted domain name. When a user or client wants to visit a website, a connection is initiated from the client to a server name, i.e. the domain name. If a network or on-path device is able to observe the server name, then the connection can be terminated. Unlike DNS, there is no mechanism to communicate to the user that access to the server name was blocked, or why.

The third approach is to block access to an IP address where the domain name can be found. This is a bit like blocking the delivery of all mail to a physical address. Consider, for example, if that address is a skyscraper with its many unrelated and independent occupants. Halting delivery of mail to the address of the skyscraper causes collateral damage by invariably affecting all parties at that address. IP addresses work the same way.

Notably, the IP address is the only one of the three options that has no attachment to the domain name. The website domain name is not required for routing and delivery of data packets; in fact it is fully ignored. A website can be available on any IP address, or even on many IP addresses, simultaneously. And the set of IP addresses that a website is on can change at any time. The set of IP addresses cannot definitively be known by querying DNS, which has been able to return any valid address at any time for any reason, since 1995.

The idea that an address is representative of an identity is anathema to the Internet’s design, because the decoupling of address from name is deeply embedded in the Internet standards and protocols, as is explained next.

The Internet is a set of protocols, not a policy or perspective

Many people still incorrectly assume that an IP address represents a single website. We’ve previously stated that the association between names and addresses is understandable given that the earliest connected components of the Internet appeared as one computer, one interface, one address, and one name. This one-to-one association was an artifact of the ecosystem in which the Internet Protocol was deployed, and satisfied the needs of the time.

Despite the one-to-one naming practice of the early Internet, it has always been possible to assign more than one name to a server (or ‘host’). For example, a server was (and is still) often configured with names to reflect its service offerings such as mail.example.com and www.example.com, but these shared a base domain name.  There were few reasons to have completely different domain names until the need to colocate completely different websites onto a single server. That practice was made easier in 1997 by the Host header in HTTP/1.1, a feature preserved by the SNI field in a TLS extension in 2003.

Throughout these changes, the Internet Protocol and, separately, the DNS protocol, have not only kept pace, but have remained fundamentally unchanged. They are the very reason that the Internet has been able to scale and evolve, because they are about addresses, reachability, and arbitrary name to IP address relationships.

The designs of IP and DNS are also entirely independent, which only reinforces that names are separate from addresses. A closer inspection of the protocols’ design elements illuminates the misperceptions of policies that lead to today’s common practice of controlling access to content by blocking IP addresses.

By design, IP is for reachability and nothing else

Much like large public civil engineering projects rely on building codes and best practice, the Internet is built using a set of open standards and specifications informed by experience and agreed by international consensus. The Internet standards that connect hardware and applications are published by the Internet Engineering Task Force (IETF) in the form of “Requests for Comment” or RFCs — so named not to suggest incompleteness, but to reflect that standards must be able to evolve with knowledge and experience. The IETF and its RFCs are cemented in the very fabric of communications, for example, with the first RFC 1 published in 1969. The Internet Protocol (IP) specification reached RFC status in 1981.

Alongside the standards organizations, the Internet’s success has been helped by a core idea known as the end-to-end (e2e) principle, codified also in 1981, based on years of trial and error experience. The end-to-end principle is a powerful abstraction that, despite taking many forms, manifests a core notion of the Internet Protocol specification: the network’s only responsibility is to establish reachability, and every other possible feature has a cost or a risk.

The idea of “reachability” in the Internet Protocol is also enshrined in the design of IP addresses themselves. Looking at the Internet Protocol specification, RFC 791, the following excerpt from Section 2.3 is explicit about IP addresses having no association with names, interfaces, or anything else.

Addressing

    A distinction is made between names, addresses, and routes [4].   A
    name indicates what we seek.  An address indicates where it is.  A
    route indicates how to get there.  The internet protocol deals
    primarily with addresses.  It is the task of higher level (i.e.,
    host-to-host or application) protocols to make the mapping from
    names to addresses.   The internet module maps internet addresses to
    local net addresses.  It is the task of lower level (i.e., local net
    or gateways) procedures to make the mapping from local net addresses
    to routes.
                            [ RFC 791, 1981 ]

Just like postal addresses for skyscrapers in the physical world, IP addresses are no more than street addresses written on a piece of paper. And just like a street address on paper, one can never be confident about the entities or organizations that exist behind an IP address. In a network like Cloudflare’s, any single IP address represents thousands of servers, and can have even more websites and services — in some cases numbering into the millions — expressly because the Internet Protocol is designed to enable it.

Here’s an interesting question: could we, or any content service provider, ensure that every IP address matches to one and only one name? The answer is an unequivocal no, and here too, because of a protocol design — in this case, DNS.

The number of names in DNS always exceeds the available addresses

A one-to-one relationship between names and addresses is impossible given the Internet specifications for the same reasons that it is infeasible in the physical world. Ignore for a moment that people and organizations can change addresses. Fundamentally, the number of people and organizations on the planet exceeds the number of postal addresses. We not only want, but need for the Internet to accommodate more names than addresses.

The difference in magnitude between names and addresses is also codified in the specifications. IPv4 addresses are 32 bits, and IPv6 addresses are 128 bits. A domain name that can be queried in DNS can be as long as 253 octets, or 2,024 bits (from Section 2.3.4 in RFC 1035, published in 1987). The table below helps to put those differences into perspective:

Identifier      Size                Number of possibilities
IPv4 address    32 bits             2^32 (≈ 4.3 × 10^9)
IPv6 address    128 bits            2^128 (≈ 3.4 × 10^38)
DNS name        up to 2,024 bits    up to 2^2024

On November 15, 2022, the United Nations announced that the population of the Earth had surpassed eight billion people. Intuitively, we know that there cannot be anywhere near as many postal addresses. The number of possible names, on the planet and similarly on the Internet, does and must exceed the number of available addresses.
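The arithmetic behind that mismatch is easy to verify from the specifications themselves:

```python
# Sizes come from the specifications: RFC 791 (IPv4), RFC 8200 (IPv6), RFC 1035 (DNS)
ipv4_addresses = 2 ** 32      # 32-bit addresses: about 4.3 billion
ipv6_addresses = 2 ** 128     # 128-bit addresses
max_name_bits = 253 * 8       # a DNS name can be up to 253 octets = 2,024 bits
possible_names = 2 ** max_name_bits

print(ipv4_addresses)                    # 4294967296
print(possible_names > ipv6_addresses)   # True: the name space dwarfs both address spaces
```

There are fewer IPv4 addresses than people on Earth, so name sharing on addresses is not an accident of deployment; it is a mathematical necessity.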

The proof is in the ~~pudding~~ names!

Now that we understand those two relevant principles from the international standards – that IP addresses and domain names serve distinct purposes, and that there is no one-to-one relationship between the two – a recent case of content blocking by IP address helps show why the practice is problematic. Take the IP blocking incident in Austria in late August 2022. The goal was to restrict access to 14 target domains by blocking 11 IP addresses (source: RTR.Telekom.Post via the Internet Archive) – the mismatch between those two numbers should have been a warning flag that IP blocking might not have the desired effect.

Analogies and international standards may explain the reasons that IP blocking should be avoided, but we can see the scale of the problem by looking at Internet-scale data. To better understand and explain the severity of IP blocking, we decided to generate a global view of domain names and IP addresses (thanks are due to a PhD research intern, Sudheesh Singanamalla, for the effort). In September 2022, we used the authoritative zone files for the top-level domains (TLDs) .com, .net, .info, and .org, together with top-1M website lists, to find a total of 255,315,270 unique names. We then queried DNS from each of five regions and recorded the set of IP addresses returned. The table below summarizes our findings:

[Table: total unique names resolved and unique IP addresses returned, per DNS query region]

The table above makes clear that it takes no more than 10.7 million addresses to reach all 255,315,270 names from any region on the planet, and the total set of IP addresses for those names from everywhere is about 16 million – the ratio of names to IP addresses is nearly 24x in Europe and 16x globally.

There is one more worthwhile detail about the numbers above: The IP addresses are the combined totals of both IPv4 and IPv6 addresses, meaning that far fewer addresses are needed to reach all 255M websites.
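At its core, the measurement boils down to a simple aggregation: resolve many names, then count how many names share each returned address. A simplified sketch with made-up data (a real run queries DNS from multiple regions, as ours did):

```python
from collections import Counter

def names_per_address(dns_results):
    """Given {domain: set of IPs returned by DNS}, count the names sharing each IP."""
    counts = Counter()
    for domain, ips in dns_results.items():
        for ip in ips:
            counts[ip] += 1
    return counts

# Hypothetical results: three unrelated sites behind one shared address
sample = {
    "a.example": {"192.0.2.1"},
    "b.example": {"192.0.2.1"},
    "c.example": {"192.0.2.1", "192.0.2.2"},
}
shared = names_per_address(sample)
print(shared["192.0.2.1"])  # 3 -- blocking this one IP affects all three sites
```

Even in this tiny example, a block aimed at one site on 192.0.2.1 takes two bystanders down with it.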

We’ve also inspected the data a few different ways to find some interesting observations. For example, the figure below shows the cumulative distribution (CDF) of the proportion of websites that can be visited with each additional IP address. On the y-axis is the proportion of websites that can be reached given some number of IP addresses. On the x-axis, the 16M IP addresses are ranked from the most domains on the left to the least domains on the right. Note that any IP address in this set is a response from DNS, so it must have at least one domain name; the highest number of domains on a single IP address in the set runs to eight digits – tens of millions.

[Figure: CDF of the share of domains covered as IP addresses are added, most domain-dense first]

By looking at the CDF there are a few eye-watering observations:

  • Fewer than 10 IP addresses are needed to reach 20% of, or approximately 51 million, domains in the set;
  • 100 IPs are enough to reach almost 50% of domains;
  • 1000 IPs are enough to reach 60% of domains;
  • 10,000 IPs are enough to reach 80%, or about 204 million, domains.
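The curve behind those bullet points is just a cumulative-share computation over the per-IP domain counts, ranked highest first. A minimal sketch with an invented, deliberately skewed distribution:

```python
def coverage(domains_per_ip, k):
    """Fraction of all domains reachable via the k most domain-dense IP addresses."""
    ranked = sorted(domains_per_ip, reverse=True)
    return sum(ranked[:k]) / sum(ranked)

# 2 heavily shared IPs plus 50 single-domain IPs: a few blocks touch most names
counts = [800, 150] + [1] * 50
print(coverage(counts, 2))  # 0.95
```

The skew is the point: because shared hosting concentrates so many domains on so few addresses, blocking even a handful of IPs can render a huge share of the web unreachable.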

In fact, from the total set of 16 million addresses, fewer than half – 7.1M (43.7%) – of the addresses in the dataset had only one name. On this ‘one’ point we must be additionally clear: we are unable to ascertain whether there were truly no other names on those addresses, because there are many more domain names than those contained in .com, .org, .info, and .net – there might very well be other names on those addresses.

In addition to hosting a number of domains on a single IP address, any of those domains may change its IP address over time. Changing IP addresses periodically can help with security and performance, and can improve reliability for websites. One common example, used by many operators, is load balancing. This means DNS queries may return different IP addresses over time, or in different places, for the same website. This is a further, and separate, reason why blocking based on IP addresses will not serve its intended purpose over time.

Ultimately, there is no reliable way to know the number of domains on an IP address without inspecting all names in the DNS, from every location on the planet, at every moment in time — an entirely infeasible proposition.

Any action on an IP address must, by the very definitions of the protocols that rule and empower the Internet, be expected to have collateral effects.

Lack of transparency with IP blocking

So if we have to expect that the blocking of an IP address will have collateral effects, and it’s generally agreed that it’s inappropriate or even legally impermissible to overblock by blocking IP addresses that have multiple domains on them, why does it still happen? That’s hard to know for sure, so we can only speculate. Sometimes it reflects a lack of technical understanding about the possible effects, particularly from entities like judges who are not technologists. Sometimes governments just ignore the collateral damage – as they do with Internet shutdowns – because they see the blocking as in their interest. And when there is collateral damage, it’s not usually obvious to the outside world, so there can be very little external pressure to have it addressed.

It’s worth stressing that point. When an IP is blocked, a user just sees a failed connection. They don’t know why the connection failed, or who caused it to fail. On the other side, the server acting on behalf of the website doesn’t even know it’s been blocked until it starts getting complaints about the fact that it is unavailable. There is virtually no transparency or accountability for the overblocking. And it can be challenging, if not impossible, for a website owner to challenge a block or seek redress for being inappropriately blocked.

Some governments, including Austria, do publish active block lists, which is an important step for transparency. But for all the reasons we’ve discussed, publishing an IP address does not reveal all the sites that may have been blocked unintentionally. And it doesn’t give those affected a means to challenge the overblocking. Again, in the physical world example, it’s hard to imagine a court order on a skyscraper that wouldn’t be posted on the door, but we often seem to jump over such due process and notice requirements in virtual space.

We think talking about the problematic consequences of IP blocking is more important than ever as an increasing number of countries push to block content online. Unfortunately, ISPs often use IP blocks to implement those requirements. It may be that the ISP is newer or less robust than larger counterparts, but larger ISPs engage in the practice, too, and understandably so because IP blocking takes the least effort and is readily available in most equipment.

And as more and more domains are included on the same number of IP addresses, the problem is only going to get worse.

Next steps

So what can we do?

We believe the first step is to improve transparency around the use of IP blocking. Although we’re not aware of any comprehensive way to document the collateral damage caused by IP blocking, we believe there are steps we can take to expand awareness of the practice. We are committed to working on new initiatives that highlight those insights, as we’ve done with the Cloudflare Radar Outage Center.

We also recognize that this is a whole Internet problem, and therefore has to be part of a broader effort. The significant likelihood that blocking by IP address will result in restricting access to a whole series of unrelated (and untargeted) domains should make it a non-starter for everyone. That’s why we’re engaging with civil society partners and like-minded companies to lend their voices to challenge the use of blocking IP addresses as a way of addressing content challenges and to point out collateral damage when they see it.

To be clear, to address the challenges of illegal content online, countries need legal mechanisms that enable the removal or restriction of content in a rights-respecting way. We believe that addressing the content at the source is almost always the best and the required first step. Laws like the EU’s new Digital Services Act or the Digital Millennium Copyright Act provide tools that can be used to address illegal content at the source, while respecting important due process principles. Governments should focus on building and applying legal mechanisms in ways that least affect other people’s rights, consistent with human rights expectations.

Very simply, these needs cannot be met by blocking IP addresses.

We’ll continue to look for new ways to talk about network activity and disruption, particularly when it results in unnecessary limitations on access. Check out Cloudflare Radar for more insights about what we see online.

New cities on the Cloudflare global network: March 2022 edition

Post Syndicated from Mike Conlow original https://blog.cloudflare.com/mid-2022-new-cities/


If you follow the Cloudflare blog, you know that we love to add cities to our global map. With each new city we add, we help make the Internet faster, more reliable, and more secure. Today, we are announcing the addition of 18 new cities in Africa, South America, Asia, and the Middle East, bringing our network to over 270 cities globally. We’ll also look closely at how adding new cities improves Internet performance, such as our new locations in Israel, which reduced median response time (latency) from 86ms to 29ms (a 66% improvement) in a matter of weeks for subscribers of one Israeli Internet service provider (ISP).

The Cities

Without further ado, here are the 18 new cities in 10 countries we welcomed to our global network: Accra, Ghana; Almaty, Kazakhstan; Bhubaneshwar, India; Chiang Mai, Thailand; Joinville, Brazil; Erbil, Iraq; Fukuoka, Japan; Goiânia, Brazil; Haifa, Israel; Harare, Zimbabwe; Juazeiro do Norte, Brazil; Kanpur, India; Manaus, Brazil; Naha, Japan; Patna, India; São José do Rio Preto, Brazil; Tashkent, Uzbekistan; Uberlândia, Brazil.

Cloudflare’s ISP Edge Partnership Program

But let’s take a step back and understand why and how adding new cities helps make the Internet better. First, we should reintroduce the Cloudflare Edge Partnership Program. Cloudflare is used as a reverse proxy by nearly 20% of all Internet properties, which means the volume of ISP traffic trying to reach us can be significant. In some cases, as we’ll see in Israel, the distance data needs to travel can also be significant, adding latency and reducing Internet performance for the user. Our solution is partnering with ISPs to embed our servers inside their network. Not only does the ISP avoid lots of backhaul traffic, but their subscribers also get much better performance because the website is served on-net, and close to them geographically. It is a win-win-win.

Consider a large Israeli ISP we did not peer with locally in Tel Aviv. Last year, if a subscriber wanted to reach a website on the Cloudflare network, their request had to travel on the Internet backbone – the large carriers that connect networks together on behalf of smaller ISPs – from Israel to Europe before reaching Cloudflare and going back. The map below shows where they were able to find Cloudflare content before our deployment went live: 48% in Frankfurt, 33% in London, and 18% in Amsterdam. That’s a long way!

[Map: where the ISP’s subscribers found Cloudflare content before the local deployment: 48% in Frankfurt, 33% in London, and 18% in Amsterdam]

In January and March 2022, we turned up deployments with the ISP in Tel Aviv and Haifa. Now live, these two locations serve practically all requests from their subscribers locally within Israel. Instead of traveling 3,000 km to reach one of the millions of websites on our network, most requests from Israel now travel 65 km or less. The improvement has been dramatic: now we’re serving 66% of requests in under 50ms; before the deployment we couldn’t serve any in under 50ms because the distance was too great. Now, 85% are served in under 100ms; before, we served 66% of requests in under 100ms.
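Aggregate claims like these reduce to computing, from request latency samples, the share of requests served under fixed thresholds. A minimal sketch with invented numbers (not real measurement data):

```python
def share_under(latencies_ms, threshold_ms):
    """Fraction of requests served faster than threshold_ms."""
    if not latencies_ms:
        return 0.0
    return sum(1 for l in latencies_ms if l < threshold_ms) / len(latencies_ms)

# Hypothetical request latencies (ms) before and after a local deployment.
before = [86, 92, 110, 78, 95, 130, 88, 102]
after = [29, 31, 45, 27, 60, 33, 48, 25]

for label, samples in (("before", before), ("after", after)):
    print(label,
          f"<50ms: {share_under(samples, 50):.0%}",
          f"<100ms: {share_under(samples, 100):.0%}")
```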


As we continue to put dots on the map, we’ll keep putting updates here on how Internet performance is improving. As we like to say, we’re just getting started.

If you’re an ISP that is interested in hosting a Cloudflare cache to improve performance and reduce back haul, get in touch on our Edge Partnership Program page. And if you’re a software, data, or network engineer – or just the type of person who is curious and wants to help make the Internet better – consider joining our team.

Internet outage in Yemen amid airstrikes

Post Syndicated from João Tomé original https://blog.cloudflare.com/internet-outage-in-yemen-amid-airstrikes/


The early hours of Friday, January 21, 2022, started in Yemen with a country-wide Internet outage. According to local and global news reports, airstrikes are happening in the country, and the outage is likely related: there are reports that a telecommunications building in Al-Hudaydah, where the FALCON undersea cable lands, was hit.

Cloudflare Radar shows that Internet traffic began dropping at around 21:30 UTC on January 20, 2022, and was close to zero by 22:00 UTC (01:00 local time).
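A drop like this is the kind of signal an automated detector can flag: compare current traffic against a rolling baseline and alert when volume falls below some fraction of it. The sketch below is a simplified illustration of the idea, not Cloudflare’s actual detection logic; the window size and threshold are invented:

```python
from collections import deque

def detect_drops(samples, window=12, threshold=0.2):
    """Yield indices where traffic falls below `threshold` times the
    median of the previous `window` samples."""
    history = deque(maxlen=window)
    for i, value in enumerate(samples):
        if len(history) == window:
            baseline = sorted(history)[window // 2]  # rolling median
            if baseline > 0 and value < threshold * baseline:
                yield i
        history.append(value)

# Hypothetical per-interval request counts: steady, then near zero.
traffic = [100, 98, 103, 97, 101, 99, 102, 100, 98, 101, 100, 99, 5, 3]
print(list(detect_drops(traffic)))  # flags the two near-zero intervals
```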


The outage affected the main state-owned ISP, Public Telecommunication Corporation (AS30873 in blue in the next chart), which represents almost all the Internet traffic in the country.

[Chart: Internet traffic for Yemen’s ASNs, with AS30873 (Public Telecommunication Corporation) in blue]

Looking at BGP (Border Gateway Protocol) updates from Yemen’s ASNs around the time of the outage, we see a clear spike at the same time the main ASN was affected, ~21:55 UTC on January 20, 2022. These update messages are BGP’s way of signalling that Yemen’s main ASN was no longer routable, something similar to what we saw happening in The Gambia and Kazakhstan, but for very different reasons.


So far, 2022 has started with a few significant Internet disruptions for different reasons:

1. An Internet outage in The Gambia because of a cable problem.
2. An Internet shutdown in Kazakhstan because of unrest.
3. A mobile Internet shutdown in Burkina Faso because of a coup plot.
4. An Internet outage in Tonga because of a volcanic eruption (still ongoing).

You can keep an eye on Cloudflare Radar to monitor this situation as it unfolds.

Where is mobile traffic the most and least popular?

Post Syndicated from João Tomé original https://blog.cloudflare.com/where-mobile-traffic-more-and-less-popular/


You’re having dinner, you look at the table next to yours, and everyone is checking their phone, scrolling, browsing, and interacting with that little (and getting bigger) piece of hardware that puts you in contact with friends, family, work, and the giant public square of sorts that social media has become. That could happen in the car (hopefully only for the passengers, never the driver), at home on the sofa or in bed, or even when you’re commuting or just bored waiting in line at the grocery store.

Or perhaps you use your mobile phone as your only connection to the Internet. It might be your one means of communication and doing business. For many, the mobile Internet opened up access and opportunity that simply was not possible before.

Around the world the use of mobile Internet differs widely. In some countries mobile traffic dominates, in others desktop still reigns supreme.

Mobile Internet traffic has changed the way we relate to the online world — work (once, for some, done on desktop/laptop computers) is just one part of it — and Cloudflare Radar can help us not only get a better understanding of global Internet traffic, but also see regional trends and monitor emerging security threats. So let’s dig into the mobile traffic trends, starting with a kind of contest (the data reflected here is from the 30 days before October 4).
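Behind percentages like these is request-level classification, typically keyed off the User-Agent header. A toy sketch of the idea (Radar’s actual classification is more involved, and the sample strings below are truncated, invented examples):

```python
MOBILE_HINTS = ("Mobile", "Android", "iPhone", "iPad")

def is_mobile(user_agent):
    """Crude check: does the user-agent contain a mobile marker?"""
    return any(hint in user_agent for hint in MOBILE_HINTS)

def mobile_share(user_agents):
    """Fraction of requests classified as mobile."""
    if not user_agents:
        return 0.0
    return sum(is_mobile(ua) for ua in user_agents) / len(user_agents)

requests = [
    "Mozilla/5.0 (iPhone; CPU iPhone OS 15_0 like Mac OS X) Mobile/15E148",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Linux; Android 12; Pixel 6) Mobile Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
]
print(f"mobile share: {mobile_share(requests):.0%}")  # 50% for this sample
```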

In this area of Cloudflare Radar, users can check mobile traffic trends by country or worldwide (the case shown here) over the past 7 or 30 days. Worldwide, mobile wins over desktop with 52% of traffic.

The country that has the greatest proportion of mobile Internet traffic is…

Cloudflare Radar has information on countries across the world, so we looked, over the past month, for the country with the highest proportion of mobile Internet traffic. And the answer is… Sudan, where 83% of Internet traffic comes from mobile devices — actually it’s a tie with Yemen, which we talk about a little further below.

In many emerging economies (Sudan is one), a large percentage of the population had its first contact with the Internet through a smartphone. In these countries it is common not to own a computer, and some people even opened their first bank account thanks to a mobile device.

How about Sudan’s neighbours? South Sudan follows the same pattern, with mobile traffic representing 74% of Internet use. The same is true in Chad (74%), Libya (75%), Egypt (68%), Eritrea (67%) and Ethiopia (58%). It’s a clear trend throughout Africa, especially in the central and eastern parts of the continent, where mobile traffic wins in every country (over the past 30 days).

World map showing (in yellow) the areas of the planet where most Internet traffic comes from mobile devices. Africa, the Middle East, and Asia have the highest percentages of mobile traffic.

The Vatican goes for the desktop experience (but Italy loves mobile)

On the other hand, the country we found with the least mobile traffic in the past 30 days is… Vatican City, with only 13% (since the Vatican is literally inside Rome, this might be an anomaly caused by mobile devices inside the Vatican connecting to Italian networks). Small countries like Seychelles (29%), Andorra (29%), Estonia (34%) and San Marino (36%) show the same pattern, and Madagascar (27%), Haiti (34%) and Greenland (37%) also have low mobile traffic percentages.

We can also see that the pattern inside Vatican City differs greatly from the pattern in Italy. Italy is one of the most mobile-friendly European countries — Italians seem to prefer mobile to desktop. About 57% of Internet traffic is via mobile devices. Italy is only matched, in Europe, by its neighbour Croatia — on the other side of the Adriatic Sea — that in the past month has had 58% mobile traffic.

European countries have differing mobile preferences

While we’re talking about Italy and Croatia, let’s dig a bit more into Europe. Only six countries have more mobile than desktop (laptops included) traffic. Besides Italy and Croatia, there’s Romania (54%), Slovakia (52%) and Greece (51%) — all more to the east of Europe.

At the end of this mobile ranking we have one of the most digitally advanced countries in the world: Estonia (a truly digital society, according to Wired). The small country only has 34% of mobile traffic. Other countries in the north of Europe like Denmark (38%) and Finland (39%) follow the same trend.

Spain (47%), France (48%) and Ireland (49%) are getting close to being mobile-first countries. The UK (50%) has the same trend as its neighbours — Russia is actually in the same ‘neighbourhood’ (with 49%). On the other hand, Portugal (42%), the Netherlands (43%) and Germany (44%) are still a little further away.

How about the American continent?

Peru (36%) and Canada (38%) are among the countries in the American continent with the least mobile use; Cuba has the most, at 70%.

Peru (36%) and Canada (38%) are among the countries in the American continent with the least mobile use in the past 30 days.

Then there’s Brazil (50%) and Mexico (52%) — Chile is not far behind, with 48% mobile use. Cuba takes the crown, with 70%, followed by the Dominican Republic (56%), Puerto Rico (51%) and Jamaica (51%), all Caribbean countries. The exception is Haiti, the least mobile country in the Americas, with 34% mobile use.

Let’s go to the Middle East: the champion of mobile traffic

Most Internet traffic in Yemen comes from mobile devices, as this Radar chart of the previous 30 days shows.

In this part of our planet there are no doubts whatsoever: mobile traffic rules completely. At the top of the list is Yemen, with the same 83% of mobile traffic as Sudan (which we talked about before). Syria is a close second, with 82%.

Iran (71%), Iraq (70%), Pakistan (70%) and Egypt (69%) show the same trend. The exceptions here are the United Arab Emirates, with 44% mobile traffic, and Israel (45%). Nearby, Saudi Arabia (the country with the highest GDP in the region) is at 55%.

A (mobile) giant called India

Of the top 10 most populated countries, the clear winner of our mobile ranking is, without any doubt, India, with 80% mobile use. The country of 1.3 billion people surpasses the most populous country on the planet, China (1.4 billion live there), which stands at 65% mobile.

Also in Asia, the fourth most populous country in the world (after the US), Indonesia, sees 68% of its traffic come from mobile devices. The same mobile-first trend is followed by Thailand (65%), Vietnam (64%), Malaysia (64%), South Korea (56%), Japan (56%) and the Philippines (51%). Singapore is in the middle. Down under, Australia is desktop-first (37% mobile traffic), just like its neighbour New Zealand (38%).

Just as a curiosity, Vanuatu, the South Pacific Ocean nation (population 307,150) ranked for some years as the happiest nation on the planet (by the Happy Planet Index), has 52% mobile traffic. The current number one in that same index, Costa Rica, is at 50%.

Conclusion

Mobile devices are here to stay and have already become a bridge bringing more humans to the vast opportunities that the Internet offers. Of the top 15 countries with the most mobile Internet traffic, just one, India, is also in the top 15 in terms of GDP.

As we already showed, there is a world of trends and even human habits (differing from country to country) to discover on our Cloudflare Radar platform. It’s all a matter of asking a question that could be reflected in our data and searching for the answers.

Project Myriagon: Cloudflare Passes 10,000 Connected Networks

Post Syndicated from Ticiane Takami original https://blog.cloudflare.com/10000-networks-and-beyond/


During Speed Week, we’ve talked a lot about the products we’ve improved and the places we’ve expanded to. Today, we have a final exciting announcement: Cloudflare now connects with more than 10,000 other networks. Put another way, over 10,000 networks have direct on-ramps to the Cloudflare network.

This is the culmination of a special project we’ve been working on for the last few months dubbed Project Myriagon, a reference to the 10,000-sided polygon of the same name. In going about this project, we have learned a lot about the performance impact of adding more direct connections to our network — in one recent case, we saw a 90% reduction in median round-trip end-user latency.

But to really explain why this is such a big milestone, we first need to explain a bit about how the Internet works.

More roads leading to Rome

The Internet that we all know and rely on is, on a basic level, an interconnected series of independently run local networks. Each network is defined as its own “autonomous system.” These networks are delineated numerically with Autonomous System Numbers, or ASNs. An ASN is like the Internet version of a zip code: a short number directly mapping to a distinct region of IP space using a clearly defined methodology. Network interconnection is all about bringing together different ASNs to exponentially multiply the number of possible paths between source and destination.

Most of us have home networks behind a modem and router, connecting your individual miniature network to your ISP. Your ISP then connects with other networks, to fetch the web pages or other Internet traffic you request. These networks in turn have connections to different networks, who in turn connect to interconnected networks, and so on, until your data reaches its destination. The fewer networks your request has to traverse, generally, the lower the end-to-end latency and odds that something will get lost along the way.

The average number of hops from any one network on the Internet to any other network is around 5.7 for IPv4 and 4.7 for IPv6.

Source: https://blog.apnic.net/2020/01/14/bgp-in-2019-the-bgp-table/
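Those averages are shortest AS-level path lengths, which you can compute for a toy interconnection graph with a breadth-first search. The graph below is invented purely for illustration:

```python
from collections import deque

def hops(graph, src, dst):
    """Shortest number of inter-network hops from src to dst via BFS."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # unreachable

# Toy AS-level graph: stub networks hang off transit providers.
graph = {
    "AS_home_isp": ["AS_transit1"],
    "AS_transit1": ["AS_home_isp", "AS_transit2", "AS_cdn"],
    "AS_transit2": ["AS_transit1", "AS_origin"],
    "AS_cdn": ["AS_transit1"],
    "AS_origin": ["AS_transit2"],
}
print(hops(graph, "AS_home_isp", "AS_origin"))  # 3 hops via two transits
graph["AS_home_isp"].append("AS_cdn")           # add a direct interconnection
print(hops(graph, "AS_home_isp", "AS_cdn"))     # now just 1 hop
```

Every direct interconnection added removes intermediate networks from some set of paths, which is exactly what lowers latency and loss.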

How do ASNs work?

ASNs are a key part of BGP, the routing protocol that directs traffic along the Internet. The Internet Assigned Numbers Authority (IANA), the global coordinator of the DNS Root, IP addressing, and other Internet protocol resources like AS numbers, delegates ASN assignment authority to Regional Internet Registries (RIRs), who in turn assign individual ASNs to network operators in line with their regional policies. The five RIRs are AFRINIC, APNIC, ARIN, LACNIC and RIPE, each entitled to assign ASNs in its respective region.

Cloudflare’s ASN is 13335, one of the approximately 70,000 ASNs advertised on the Internet. While we’d like to — and plan on — connecting to every one of these ASNs eventually, our team tries to prioritize those with the greatest impact on our overall breadth and improving the proximity to as many people on Earth as possible.

As enabling optimal routes is key to our core business and services, we continuously track how many ASNs we connect to (technically referred to as “adjacent networks”). With Project Myriagon, we aimed to speed up our rate of interconnection and pass 10,000 adjacent networks by the end of the year. By September 2021, we reached that milestone, bringing us from 8,300 at the start of 2020 to over 10,000 today.

As shown in the table below, that milestone is part of a continuous effort towards gradually hitting more of the total advertised ASNs on the Internet.

The Regional Internet Registries and their Regions

Table 1: Cloudflare’s peer ASNs and their respective RIR


Given that there are 70,000+ ASNs out there, you might be wondering: why is 10,000 a big deal? To understand this, we need to look deeply at BGP, the protocol that glues the Internet together. There are three different classes of ASNs:

  • Transit Only ASNs: these networks only provide connectivity to other networks. They don’t have any IP addresses inside their networks. These networks are quite rare, as it’s very unusual to not have any IP addresses inside your network. Instead, these networks are often used primarily for distinct management purposes within a single organization.
  • Origin Only ASNs: these are networks that do not provide connectivity to other networks. They are a stub network, and often, like your home network, only connected to a single ISP.
  • Mixed ASNs: these networks both have IP addresses inside their network, and provide connectivity to other networks.

Origin Only ASNs: 61,127
Mixed ASNs: 11,128
Transit Only ASNs: 443

Source: https://bgp.potaroo.net/as6447/

One interesting fact: of the 61,127 origin only ASNs, nearly 43,000 of them are only connected to their ISP. As such, our direct connections to over 10,000 networks indicates that of the networks that connect more than one network, a very good percentage are now already connected to Cloudflare.
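This three-way classification can be derived from observed BGP AS paths: an ASN that only ever appears at the end of a path originates prefixes but transits nothing, one that only appears mid-path is transit-only, and one that does both is mixed. A hedged sketch on made-up paths (real analyses like potaroo’s use full route-collector feeds):

```python
def classify_asns(as_paths):
    """Classify ASNs as origin-only, transit-only, or mixed based on
    where they appear in observed AS paths (last ASN = origin)."""
    origins, transits = set(), set()
    for path in as_paths:
        origins.add(path[-1])        # last ASN originates the prefix
        transits.update(path[1:-1])  # middle ASNs carried it in transit
        # path[0] is the collector's direct peer, skipped for simplicity.
    return {
        "origin_only": origins - transits,
        "transit_only": transits - origins,
        "mixed": origins & transits,
    }

# Invented AS paths, as a route collector might see them.
paths = [
    [64500, 64501, 64510],  # 64510 originates via transit 64501
    [64500, 64501, 64511],
    [64500, 64502, 64501],  # 64501 also originates a prefix -> mixed
    [64500, 64502, 64512],
]
result = classify_asns(paths)
print(sorted(result["origin_only"]))   # stub networks
print(sorted(result["transit_only"]))  # pure transit
print(sorted(result["mixed"]))
```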

Cutting out the middle man

Directly connecting to a network — and eliminating the hops in between — can greatly improve performance in two ways. First, connecting with a network directly allows for Internet traffic to be exchanged locally rather than detouring through remote cities; and secondly, direct connections help avoid the congestion caused by bottlenecks that sometimes happen between networks.

To take a recent real-world example, turning up a direct peering session with a European network caused a 90% improvement in median end-user latency, from an average of 76ms to an average of 7ms.

Immediate 90% improvement in median end-user latency after peering with a new network. 

By using our own on-ramps to other networks, we both ensure superior performance for our users and avoid adding load and causing congestion on the Internet at large.

And AS13335 is just getting started


Cloudflare is an anycast network, meaning that the better connected we are, the faster and better-protected we are — obviating legacy concepts like scrubbing centers and slow origins. Hitting five digits of connected networks is something we’re really proud of as a step on our goal to helping to build a better Internet. As we’ve mentioned throughout the week, we’re all about high speed without having to pay a security or reliability cost.

There’s still work to do! While Project Myriagon has, we believe, made us one of the top five most connected networks in the world, we estimate Google is connected to 12,000-15,000 networks. And so, today, we are kicking off Project CatchG. We won’t rest until we’re #1.

Interested in peering with us to help build a better Internet? Reach out to [email protected] with your request. More details on the locations we are present at can be found at http://as13335.peeringdb.com/.

Cloudflare Passes 250 Cities, Triples External Network Capacity, 8x-es Backbone

Post Syndicated from Jon Rolfe original https://blog.cloudflare.com/250-cities-is-just-the-start/


It feels like just the other week that we announced ten new cities and our expansion to 25+ cities in Brazil — probably because it was. Today, I have three speedy infrastructure updates: we’ve passed 250 on-network cities, more than tripled our external network capacity, and increased our long-haul internal backbone network by over 800% since the start of 2020.

Light travels through fiber only so fast and with only so much bandwidth — and things are worse still over the copper or mobile networks that make up most end users’ connections to the Internet. At some point, there’s only so much software you can throw at the problem before you run into the fundamental constraint that an edge network addresses: if you want your users to see incredible performance, you have to have servers incredibly physically close. For example, over the past three months, we’ve added another 10 cities in Brazil. Here’s how that lowered the connection time to Cloudflare. The red line shows the latency prior to the expansion, the blue shows after.

[Chart: connection time to Cloudflare from Brazil; the red line shows latency before the expansion, the blue line after]

We’re exceptionally proud of all the teams at Cloudflare that came together to raise the bar for the entire industry in terms of global performance despite border closures, semiconductor shortages, and a sudden shift to working from home. 95% of the entire Internet-connected world is now within 50ms of a Cloudflare presence, and 80% of the entire Internet-connected world is within 20ms (for reference, it takes 300-400ms for a human to blink):

[Map: the share of the Internet-connected world within 50ms and 20ms of a Cloudflare presence]

Today, when we ask ourselves what it means to have a fast website, it means having a server less than 0.05 seconds away from your user, no matter where on Earth they are. This is only possible by adding new cities, partners, capacity, and cables — so let’s talk about those.

New Cities

Cutting straight to the point, let’s start with cities and countries: in the last two-ish months, we’ve added another 17 cities (outside of mainland China) split across eight countries: Guayaquil, Ecuador; Dammam, Saudi Arabia; Algiers, Algeria; Surat Thani, Thailand; Hagåtña, Guam, United States; Krasnoyarsk, Russia; Cagayan, Philippines; and ten cities in Brazil: Caçador, Ribeirão Preto, Brasília, Florianópolis, Sorocaba, Itajaí, Belém, Americana, Blumenau, and Belo Horizonte.

Meanwhile, with our partner, JD Cloud and AI, we’re up to 37 cities in mainland China: Anqing and Huainan, Anhui; Beijing, Beijing; Fuzhou and Quanzhou, Fujian; Lanzhou, Gansu; Foshan, Guangzhou, and Maoming, Guangdong; Guiyang, Guizhou; Chengmai and Haikou, Hainan; Langfang and Qinhuangdao, Hebei; Zhengzhou, Henan; Shiyan and Yichang, Hubei; Changde and Yiyang, Hunan; Hohhot, Inner Mongolia; Changzhou, Suqian, and Wuxi, Jiangsu; Nanchang and Xinyu, Jiangxi; Dalian and Shenyang, Liaoning; Xining, Qinghai; Baoji and Xianyang, Shaanxi; Jinan and Qingdao, Shandong; Shanghai, Shanghai; Chengdu, Sichuan; Jinhua, Quzhou, and Taizhou, Zhejiang. These are subject to change: as we ramp up, we have been working with JD Cloud to “trial” cities for a few weeks or months to observe performance and tweak the cities to match.

More Capacity: What and Why?

In addition to all these new cities, we’re also proud to announce that we have seen a 3.5x increase in external network capacity from the start of 2020 to now. This is just as key to our network strategy as new cities: it wouldn’t matter if we were in every city on Earth if we weren’t interconnected with other networks. Last-mile ISPs will sometimes still “trombone” their traffic, but in general, end users will get faster Internet as we interconnect more.


This interconnection is spread far and wide, both to user networks and those of website hosts and other major cloud networks. This has involved a lot of middleman-removal: rather than run fiber optics from our routers through a third-party network to an origin or user’s network, we’re running more and more Private Network Interconnects (PNIs) and, better yet, Cloudflare Network Interconnects (CNIs) to our customers.

These PNIs and CNIs can not only reduce egress costs for our customers (particularly with our Bandwidth Alliance partners) but also increase the speed, reliability, and privacy of connections. The fewer networks and less distance your Internet traffic flows through, the better off everyone is. To put some numbers on that, only 30% of this new capacity was transit, leaving 70% flowing directly, either physically over PNIs/CNIs or logically over peering sessions at Internet exchange points.

The Backbone


At the same time as this increase in external capacity, we’ve quietly been adding hundreds of new segments to our backbone. Our backbone consists of dedicated fiber optic lines and reserved portions of wavelength that connect Cloudflare data centers together. This is split approximately 55/45 between “metro” capacity, which redundantly connects data centers in which we have a presence, and “long-haul” capacity, which connects Cloudflare data centers in different cities.

The backbone is used to increase the speed of our customer traffic, e.g., for Argo Smart Routing, Argo Tiered Caching, and WARP+. Our backbone is like a private highway connecting cities, while public Internet routing is like local roads: not only does the backbone directly connect two cities, but it’s reliably faster and sees fewer issues. We’ll dive into some benchmarks of the speed improvements of the backbone in a more comprehensive future blog post.

The backbone is also more secure. While Cloudflare signs all of its BGP routes with RPKI, pushes adjacent networks to use RPKI to avoid route hijacks, and encrypts external and internal traffic, the most secure and private way to safeguard our users’ traffic is to keep it on-network as much as possible.
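RPKI origin validation works by checking each BGP announcement against Route Origin Authorizations (ROAs), which bind a prefix (up to a maximum length) to the ASN allowed to originate it. Below is a heavily simplified, IPv4-only sketch of that check; the ROA entry is invented, and real validators handle trust anchors, covering prefixes, and far more:

```python
import ipaddress

# Toy ROA set: (prefix, max_length, authorized origin ASN).
ROAS = [
    (ipaddress.ip_network("198.51.100.0/22"), 24, 13335),
]

def validate(prefix, origin_asn):
    """RPKI origin validation: 'valid', 'invalid', or 'unknown' (no ROA)."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_net, max_len, asn in ROAS:
        if net.subnet_of(roa_net):
            covered = True  # some ROA covers this prefix
            if net.prefixlen <= max_len and asn == origin_asn:
                return "valid"
    return "invalid" if covered else "unknown"

print(validate("198.51.100.0/24", 13335))  # valid: right origin, length OK
print(validate("198.51.100.0/24", 64512))  # invalid: wrong origin ASN
print(validate("198.51.100.0/25", 13335))  # invalid: exceeds max length
print(validate("203.0.113.0/24", 13335))   # unknown: no covering ROA
```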

Internal load balancing between cities has also been greatly improved, thanks to the use of the backbone for traffic management with a technology we call Plurimog (a reference to our in-colo Layer 4 load balancer, Unimog). A surge of traffic into Portland can be shifted instantaneously over diverse links to Seattle, Denver, or San Jose with a single hop, without waiting for changes to propagate over anycast or running the risk of an interim increase in errors.

From an expansion perspective, two key areas of focus have been our undersea North America to Europe (transatlantic) and Asia to North America (transpacific) backbone rings. These links use geographically diverse subsea cable systems and connect into diverse routers and data centers on both ends — four transatlantic cables from North America to Europe, three transamerican cables connecting South and North America, and three transpacific cables connecting Asia and North America. User traffic coming from Los Angeles could travel to an origin as far west as Singapore or as far east as Moscow without leaving our network.

This rate of growth has been enabled by improved traffic forecast modeling, rapid internal feedback loops on link utilization, and more broadly by growing our teams and partnerships. We are creating a global view of capacity, pricing, and desirability of backbone links in the same way that we have for transit and peering. The result is a backbone that doubled in long-haul capacity this year, increased more than 800% from the start of last year, and will continue to expand to intelligently crisscross the globe.

The backbone has taken on a huge amount of traffic that would otherwise go over external transit and peering connections, freeing up capacity for when it is explicitly needed (last-hop routes, failover, etc.) and avoiding any outages on other major global networks (e.g., CenturyLink, Verizon).

In Conclusion

A map of the world highlighting all 250+ cities in which Cloudflare is deployed.

More cities, more capacity, and more backbone: these are further steps in going from being the most global network on Earth to the most local one as well. We believe in providing security, privacy, and reliability for all — not just those who have the money to pay for something we consider fundamental Internet rights. We have seen the investment in our network pay huge dividends this past year.

Happy Speed Week!

Do you want to work on the future of a globally local network? Are you passionate about edge networks? Do you thrive in an exciting, rapid-growth environment? If so, good news: Cloudflare Infrastructure is hiring; check our open roles here!

Alternatively — if you work at an ISP we aren’t already deployed with and want to bring this level of speed and control to your users, we’re here to make that happen. Please reach out to our Edge Partnerships team at [email protected].

Cloudflare’s Network Doubles CPU Capacity and Expands Into Ten New Cities in Four New Countries

Post Syndicated from Jon Rolfe original https://blog.cloudflare.com/ten-new-cities-four-new-countries/


Cloudflare’s global network is always expanding, and 2021 has been no exception. Today, I’m happy to give a mid-year update: we’ve added ten new Cloudflare cities, with four new countries represented among them. And we’ve doubled our computational footprint since the start of pandemic-related lockdowns.

No matter what else we do at Cloudflare, constant expansion of our infrastructure to new places is a requirement to help build a better Internet. 2021, like 2020, has been a difficult time to be a global network — from semiconductor shortages to supply-chain disruptions — but regardless, we have continued to expand throughout the entire globe, experimenting with technologies like ARM, ASICs, and Nvidia all the way.

The Cities


Without further ado, here are the new Cloudflare cities: Tbilisi, Georgia; San José, Costa Rica; Tunis, Tunisia; Yangon, Myanmar; Nairobi, Kenya; Jashore, Bangladesh; Canberra, Australia; Palermo, Italy; and Salvador and Campinas, Brazil.

These deployments are spread across every continent except Antarctica.

We’ve solidified our presence in every country of the Caucasus with our first deployment in the country of Georgia in the capital city of Tbilisi. And on the other side of the world, we’ve also established our first deployment in Costa Rica’s capital of San José with NIC.CR, run by the Academia Nacional de Ciencias.

In the northernmost country in Africa comes another capital city deployment, this time in Tunis, bringing us one country closer towards fully circling the Mediterranean Sea. Wrapping up the new country docket is our first city in Myanmar: Yangon, the country’s largest city and former capital.

Our second Kenyan city is the country’s capital, Nairobi, bringing our city count in sub-Saharan Africa to a total of fifteen. In Bangladesh, Jashore puts us in the capital of its namesake Jashore District; it is our third city in the country, after Chittagong and Dhaka, both already Cloudflare cities.

In the land way down under, our Canberra deployment puts us in Australia’s capital city, located, unsurprisingly, in the Australian Capital Territory. In differently warm lands is Palermo, Italy, capital of the island of Sicily, which we already see boosting performance throughout Italy.

25th percentile latency of non-bot traffic in Italy, year-to-date.

Finally, we’ve gone live in Salvador (capital of the state of Bahia) and Campinas, Brazil, the only city announced today that isn’t a capital. These are some of the first few steps in a larger Brazilian expansion — watch this blog for more news on that soon.

This is in addition to the dozens of new cities we’ve added in Mainland China with our partner, JD Cloud, with whom we have been working closely to quickly deploy and test new cities since last year.

The Impact

While we’re proud of our provisioning process, the work with new cities begins, not ends, with deployment. Each city is not only a new source of opportunity, but of risk: Internet routing is fickle, and things that should improve network quality don’t always do so. While we have always had a slew of ways to track performance, we’ve found that a significant, consistent improvement in the 25th percentile latency of non-bot traffic is an ideal approximation of the latency affected only by physical distance.
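As a metric, this is just the 25th percentile computed over successive time windows; a sustained downward step after a deployment goes live is the signal. A minimal sketch with invented daily samples:

```python
import math

def p25(values):
    """25th percentile via the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, math.ceil(0.25 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical daily latency samples (ms); a new PoP goes live on day 3.
daily_latencies = {
    "day1": [40, 55, 62, 80], "day2": [42, 50, 66, 75],
    "day3": [12, 18, 30, 45], "day4": [11, 20, 28, 41],
}
for day, samples in daily_latencies.items():
    print(day, p25(samples))  # the p25 steps down after the deployment
```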

Using this metric, we can quickly see the improvement that comes from adding new cities. For example, in Kenya, we can see that the addition of our Nairobi presence improved real user performance:

[Chart: 25th percentile latency of non-bot Kenyan traffic, before and after Nairobi gained a Cloudflare point of presence]

Latency variations in general are expected on the Internet — particularly in countries where a high proportion of Internet traffic originates from non-fixed connections, like mobile phones — but in aggregate, the more consistently low the latency, the better. From this chart, you can clearly see not only a reduction in latency, but also fewer frustrating variations in user latency. We all get annoyed when a page loads quickly one second and slowly the next, and the lower jitter that comes from being closer to the server helps to eliminate those swings.

As a reminder, while these measurements are in thousandths of a second, they add up quickly. Popular sites often require hundreds of individual requests for assets, some of which are initiated serially, so the difference between 25 milliseconds and 5 milliseconds can mean the difference between single and multi-second page load times.
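To make that arithmetic concrete, here is a toy model (all numbers hypothetical) of how round-trip time compounds on a page’s critical path when requests are initiated serially:

```python
# Requests that are serialized (each one waits for the previous response)
# each pay the full round trip on the critical path, so a per-request
# saving is multiplied by the depth of the request chain.
def network_wait_ms(rtt_ms: float, serial_requests: int) -> float:
    """Network wait on the critical path, ignoring server and render time."""
    return rtt_ms * serial_requests

# Suppose 40 of a page's few hundred requests form a serial chain:
slow = network_wait_ms(25, 40)  # a full second spent waiting on the network
fast = network_wait_ms(5, 40)
print(f"25 ms RTT: {slow:.0f} ms of wait; 5 ms RTT: {fast:.0f} ms of wait")
```

Real pages parallelize many requests, so this overstates the simple case, but the serial fraction is exactly where a 20-millisecond improvement turns into whole seconds.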

To sum things up, users in and around these cities should expect an improved Internet experience when using everything from our free, private 1.1.1.1 DNS resolver to the tens of millions of Internet properties that trust Cloudflare with their traffic. We have dozens more cities in the works at any given time, including right now. Watch this space for more!

Join Our Team

Like our network, Cloudflare continues to grow rapidly. If working at a rapidly expanding, globally diverse company interests you, we’re hiring for scores of positions, including in the Infrastructure group. Or, if you work at a global ISP and would like to improve your users’ experience and be part of building a better Internet, get in touch with our Edge Partners Program at [email protected], and we’ll look into sending some servers your way!

Sudan’s exam-related Internet shutdowns

Post Syndicated from John Graham-Cumming original https://blog.cloudflare.com/sudans-exam-related-internet-shutdowns/


To prevent cheating in exams, many countries restrict or even shut down Internet access during critical exam hours. I wrote two weeks ago about Syria’s planned Internet shutdowns during its June exams.

Sudan is doing the same thing and has had four shutdowns so far. Here’s the Internet traffic pattern for Sudan over the last seven days. I’ve circled the shutdowns on Saturday, Sunday, Monday and Tuesday (today, June 22, 2021).

[Chart: Internet traffic for Sudan over the last seven days, with the exam-hour shutdowns circled]

Cloudflare Radar allows anyone to track Internet traffic patterns around the world, and it has country-specific pages. The chart for the last seven days (shown above) came from the dedicated page for Sudan.

The Internet outages start at 0600 UTC (0800 local time) and end three hours later at 0900 UTC (1100 local time). This corresponds to the timings announced by the Sudanese Ministry of Education.
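The conversion is straightforward because Sudan observes a fixed UTC+2 offset with no daylight saving time. A small sketch using only Python’s standard library:

```python
from datetime import datetime, timedelta, timezone

# Sudan is on Central Africa Time, a fixed UTC+2 offset with no DST,
# so local exam-window times follow directly from the UTC ones.
CAT = timezone(timedelta(hours=2))

start_utc = datetime(2021, 6, 22, 6, 0, tzinfo=timezone.utc)  # 0600 UTC
end_utc = datetime(2021, 6, 22, 9, 0, tzinfo=timezone.utc)    # 0900 UTC

start_local = start_utc.astimezone(CAT)
end_local = end_utc.astimezone(CAT)

print(start_local.strftime("%H%M"), "to", end_local.strftime("%H%M"),
      "local;", end_utc - start_utc, "total")  # 0800 to 1100 local; 3:00:00 total
```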

[Image: announcement of the exam-hour shutdown timings, noting that mobile Internet use would be restricted]

Further shutdowns are likely in Sudan on June 24, 26, 27, 29 and 30 (thanks to Twitter user _adonese for his assistance). Looking deeper into the data, we see that the largest drop in use is for mobile Internet access (the message above talks about mobile Internet use being restricted), while some non-mobile access appears to continue.

That can be seen by looking at the traffic mix from Sudan. During the exam times mobile use drops (as a percentage of traffic) and desktop use increases. This chart also shows how popular mobile Internet access is in Sudan: it’s typically above 75% of traffic (compare with, for example, the US).

[Chart: traffic mix for Sudan, showing the mobile share of traffic dropping during exam hours]

If you want to follow the other outages for the remaining five exams, you can see live data on the Cloudflare Radar Sudan page.

Internet traffic disruption caused by the Christmas Day bombing in Nashville

Post Syndicated from John Graham-Cumming original https://blog.cloudflare.com/internet-traffic-disruption-caused-by-the-christmas-day-bombing-in-nashville/


On Christmas Day 2020, a bomb exploded in an apparent suicide attack in Nashville, TN. The explosion happened outside an AT&T network building on Second Avenue in Nashville at 1230 UTC. Damage to the AT&T building and its power supply and generators quickly knocked out telephone and Internet service for people in the area. These outages continued for two days.

Looking at traffic flow data to Cloudflare from AT&T in the Nashville area, we can see that services continued operating (on battery power, according to reports) for over five hours after the explosion, but at 1748 UTC we saw a dramatic drop in traffic. 1748 UTC is close to noon in Nashville, when reports indicate that people lost phone and Internet service.

[Chart: AT&T traffic from the Nashville area to Cloudflare, showing the drop at 1748 UTC on December 25]

We saw traffic from Nashville via AT&T start to recover at 1822 UTC on December 27, over a roughly 45-minute period, making the total outage 2 days and 34 minutes.
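The quoted outage duration falls straight out of the two timestamps in the traffic data:

```python
from datetime import datetime, timezone

# Outage window from the traffic data: the drop at 1748 UTC on
# December 25, 2020, and the start of recovery at 1822 UTC on December 27.
drop = datetime(2020, 12, 25, 17, 48, tzinfo=timezone.utc)
recovery = datetime(2020, 12, 27, 18, 22, tzinfo=timezone.utc)

outage = recovery - drop
print(f"{outage.days} days and {outage.seconds // 60} minutes")  # 2 days and 34 minutes
```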

[Chart: AT&T traffic from the Nashville area recovering, beginning at 1822 UTC on December 27]

Traffic flows continue to be normal and no further disruption has been seen.