Creating a single-table design with Amazon DynamoDB

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/creating-a-single-table-design-with-amazon-dynamodb/

Amazon DynamoDB is a highly performant NoSQL database that provides data storage for many serverless applications. Unlike traditional SQL databases, it does not use table joins and other relational database constructs. However, you can model many common relational designs in a single DynamoDB table; the modeling process is simply different with a NoSQL approach.

This blog post uses the Alleycat racing application to explain the benefits of a single-table DynamoDB table. It also shows how to approach modeling data access requirements in a DynamoDB table. Alleycat is a home fitness system that allows users to compete in an intense series of 5-minute virtual bicycle races. Up to 1,000 racers at a time take the saddle and push the limits of cadence and resistance to set personal records and rank on virtual leaderboards.

Alleycat requirements

In the Alleycat example, the application offers a number of exercise classes. Each class has multiple races, and there are multiple racers in each race. The system logs the output for each racer per second of the race. An entity-relationship diagram in a traditional relational database shows how you could use normalized tables and relationships to store this data:

Relational model for Alleycat

In a relational database, often each table has a key that relates to a foreign key in another table. By joining multiple tables, you can query related tables and return the results in a single table view. While this is flexible and convenient, it’s also computationally expensive and difficult to scale horizontally.

Many serverless architectures are built for scale and the relational database paradigm often does not scale as efficiently as a workload demands. DynamoDB scales to almost any level of traffic but one of the tradeoffs is the lack of joins. Fortunately, it offers alternative ways to model the data to meet Alleycat’s requirements.

DynamoDB terminology and concepts

Unlike traditional databases, there is no limit to how much data can be stored in a DynamoDB table. The service is also designed to provide predictable performance at any scale, so you can expect similar query latency regardless of the level of traffic.

The most important operational aspect of running DynamoDB in production is setting and managing throughput. There are two capacity modes: provisioned, where you set the throughput yourself, and on-demand, where the service manages it for you. In provisioned mode, you can also use automatic scaling to let the service adjust the throughput between lower and upper limits that you define.

The choice here is determined by the traffic patterns in your workload. For applications with predictable traffic with gradual changes, provisioned mode is the better choice and is more cost effective. If traffic patterns are unknown or you prefer to have capacity managed automatically, choose on-demand. To learn more about the capacity modes, visit the documentation page.
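
As a minimal illustration of the two capacity modes, here is a hedged boto3 sketch. The table name and capacity numbers are placeholders, not values from the Alleycat application.

import boto3

dynamodb = boto3.client("dynamodb")

# Provisioned mode: you set the read and write throughput yourself.
dynamodb.create_table(
    TableName="example-provisioned",   # hypothetical table name
    AttributeDefinitions=[{"AttributeName": "PK", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "PK", "KeyType": "HASH"}],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)

# On-demand mode: omit ProvisionedThroughput and pass
# BillingMode="PAY_PER_REQUEST" instead; the service manages throughput for you.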

Within each table, you must have a partition key, which is a string, numeric, or binary value. This key is hashed to locate items in constant time regardless of table size. It is conceptually different from an ID or primary key field in a SQL-based database and does not relate to data in other tables. When a table has only a partition key, these values must be unique across the items in the table.

Each table can optionally have a sort key. This allows you to search and sort within the items that share a given partition key. While you must match the partition key on an exact value, you can apply range and begins-with conditions to the sort key. It’s common to use a numeric sort key with timestamps to find items within a date range, or to use string operators to find data in hierarchical relationships.
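
For example, a query can match the partition key on an exact value while applying a range condition to a numeric timestamp sort key. This is a minimal boto3 sketch with hypothetical table, key, and value names, not code from the Alleycat application.

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("example-table")   # hypothetical

# Exact match on the partition key, range condition on a numeric
# timestamp sort key to return items within a date range.
response = table.query(
    KeyConditionExpression=(
        Key("PK").eq("user-123")
        & Key("SK").between(1625097600, 1625184000)
    )
)
items = response["Items"]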

A partition key and sort key alone limit the types of query you can support without duplicating data in the table. To solve this issue, DynamoDB also offers two types of index:

  • Local secondary indexes (LSIs): these must be created at the same time the table is created and effectively enable another sort key using the same partition key.
  • Global secondary indexes (GSIs): create and delete these at any time, and optionally use a different partition key from the existing table.

There are other important differences between the two index types:

Feature | LSI | GSI
Create | At table creation | Anytime
Delete | At table deletion | Anytime
Size | Up to 10 GB per partition | Unlimited
Throughput | Shared with table | Separate throughput
Key type | Composite key only (partition key and sort key) | Partition key only, or composite key
Consistency model | Both eventual and strong consistency | Eventual consistency only

Determining data access requirements

Relational database design focuses on the normalization process without regard to data access patterns. However, designing NoSQL data schemas starts with the list of questions the application must answer. It’s important to develop a list of data access patterns before building the schema, since NoSQL databases offer less dynamic query flexibility than their SQL equivalents.

To determine data access patterns in new applications, user stories and use-cases can help identify the types of query. If you are migrating an existing application, use the query logs to identify the typical queries used. In the Alleycat example, the frontend application has the following queries:

  1. Get the results for each race by racer ID.
  2. Get a list of races by class ID.
  3. Get the best performance by racer for a class ID.
  4. Get the list of top scores by race ID.
  5. Get the second-by-second performance by racer for all races.

While it’s possible to implement the design with multiple DynamoDB tables, it’s unnecessary and inefficient. A key goal in querying DynamoDB data is to retrieve all the required data in a single query request. This is one of the more difficult conceptual ideas when working with NoSQL databases but the single-table design can help simplify data management and maximize query throughput.

Modeling many-to-many relationships with DynamoDB

In traditional SQL, a many-to-many relationship is classically represented with three tables. In the earlier diagram for the Alleycat application, these tables are racers, raceResults, and races. Populated with sample data, the tables look like this:

Relational tables with data

In DynamoDB, the adjacency list design pattern enables you to combine multiple SQL-type tables into a single NoSQL table. It has multiple uses but in this case can model many-to-many relationships. To do this, the partition key contains both types of item – races and racers. The key value contains the type of data expected in the item (for example, “race-1” or “racer-2”):

Equivalent data structure in DynamoDB

With this table design, you can query by racer ID or by race ID. For a single race, you can query by partition key to return all results for a single race, or use the sort key to limit by a single racer or for the overall results. For per racer results, the second-by-second data is stored in a nested JSON structure.

To allow sorting by output to create leaderboard results, the output value must be a sort key. However, a sort key cannot be updated once an item is written. If the application used the main table’s sort key for output, it could only write a single, final race result per racer in order to query and sort on this data.

To resolve this problem, use an index. The index can use a separate sort key where the value can be updated. This allows Alleycat to store the latest results in this field, and then for queries to sort by output to create a leaderboard.
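
A minimal sketch of this approach, assuming the PK/SK item layout shown earlier and a numeric output attribute (called Numeric here, matching the index keys listed further below); the names and values are illustrative rather than the application’s actual code.

import boto3

table = boto3.resource("dynamodb").Table("alleycat-table")   # hypothetical name

# As new readings arrive, overwrite the racer's latest output for this race.
# The attribute is only an index sort key, so it can be updated freely, and
# queries against that index can sort racers by output.
table.update_item(
    Key={"PK": "racer-2", "SK": "race-1"},
    UpdateExpression="SET #n = :output",
    ExpressionAttributeNames={"#n": "Numeric"},
    ExpressionAttributeValues={":output": 352},
)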

The preceding table does not represent the races table in the normalized view, so you cannot query by class ID to retrieve a list of races. Depending on your design, you can solve this by adding a second index to the table to enable querying by class ID and returning a list of partition keys (race IDs). However, you can also overload GSIs to contain multiple types of value.

The Alleycat application uses both an LSI and GSI to accommodate all the data access patterns. This table shows how this is modeled, although the results attribute names are shorter in the application (a table-definition sketch follows the key list below):

Data modeled with LSI and GSI

  • Main composite key: PK and SK.
  • Local secondary index: Partition key is PK and sort key is Numeric.
  • Global secondary index: Partition key is SK and sort key is Numeric.
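
A hedged sketch of how a table with these keys and indexes might be defined with boto3; the table and index names are placeholders and may differ from the application’s actual infrastructure template.

import boto3

dynamodb = boto3.client("dynamodb")

dynamodb.create_table(
    TableName="alleycat-table",   # hypothetical name
    AttributeDefinitions=[
        {"AttributeName": "PK", "AttributeType": "S"},
        {"AttributeName": "SK", "AttributeType": "S"},
        {"AttributeName": "Numeric", "AttributeType": "N"},
    ],
    KeySchema=[
        {"AttributeName": "PK", "KeyType": "HASH"},
        {"AttributeName": "SK", "KeyType": "RANGE"},
    ],
    LocalSecondaryIndexes=[{
        "IndexName": "lsi-numeric",   # partition key PK, sort key Numeric
        "KeySchema": [
            {"AttributeName": "PK", "KeyType": "HASH"},
            {"AttributeName": "Numeric", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }],
    GlobalSecondaryIndexes=[{
        "IndexName": "gsi-numeric",   # partition key SK, sort key Numeric
        "KeySchema": [
            {"AttributeName": "SK", "KeyType": "HASH"},
            {"AttributeName": "Numeric", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }],
    BillingMode="PAY_PER_REQUEST",
)

With on-demand billing, the GSI does not need its own provisioned throughput setting; in provisioned mode, it would.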

Reviewing the data access patterns for Alleycat

Before creating the DynamoDB table, test the proposed schema against the list of data access patterns. In this section, I review Alleycat’s list of queries to ensure that each is supported by the table schema. I use the Item explorer feature to run queries against a test table, after running the Alleycat simulator for multiple races.

1. Get the results for each race by racer ID

Use the table’s partition key, searching for PK = racer ID. This returns a list of all races (PK) for a given racer. See the updateRaceResults function for an example of how this is used:

Results by racer ID

2. Get a list of races by class ID

Use the local secondary index, searching for partition key = class ID. This results in a list of races (PK) for a given class ID. See the getRaces function code for an example of this query:

Results by class ID

3. Get the best performance by racer for a class ID.

Use the table’s partition key, searching for PK = class ID. This returns a list of racers and their best outputs for the given class ID. See the getLeaderboard function code for an example of this query:

Best performance by racer for a class ID

4. Get the list of top scores by race ID.

Use the global secondary index, searching for PK = race ID, sorting by the GSI sort key (descending) to rank the results. This returns a sorted list of results for a race. See the updateRaceResults function for an example of how this is used:

Top scores by race ID
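
As a hedged illustration (the function’s actual code may differ), such a leaderboard query might look like the following with boto3, reusing the hypothetical table and index names from the earlier sketch.

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("alleycat-table")   # hypothetical

# Query the GSI (partition key = SK = race ID) and sort by the Numeric
# output attribute in descending order to build a top-10 leaderboard.
response = table.query(
    IndexName="gsi-numeric",            # hypothetical index name
    KeyConditionExpression=Key("SK").eq("race-1"),
    ScanIndexForward=False,             # highest output first
    Limit=10,
)
leaderboard = response["Items"]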

5. Get the second-by-second performance by racer for all races.

Use the main table index, searching for PK = racer ID. Optionally use the sort key to restrict to a single race. This returns items with second-by-second performance stored in a nested JSON attribute. See the loadRealtimeHistory function for an example of how this is used:

Second-by-second performance for all racers

Optimizing items and capacity

In the Alleycat application, races are only 5 minutes long so the results attribute only contains 300 separate data points (once per second). By using a nested JSON structure in the items, the schema flattens data that otherwise would use 300 rows in the earlier SQL-based design.

The maximum item size in DynamoDB is 400 KB, which includes attribute names. If you have many more data points, you may reach this limit. To work around this, split the data across multiple items and provide the item order in the sort key. This way, when your application retrieves the items, it can reassemble the attributes to create the original dataset.

For example, if races in Alleycat were an hour long, there would be 3,600 data points. These may be stored in six rows containing 600 second-by-second results each:

Data set split across multiple items
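
Here is a minimal sketch of the split-and-reassemble approach; the chunk size, sort key format, and attribute names are assumptions for illustration, not the application’s actual schema.

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("alleycat-table")   # hypothetical

def save_results(racer_id, race_id, results, chunk_size=600):
    # Write 3,600 data points as six items, encoding the part order in the sort key.
    for part, start in enumerate(range(0, len(results), chunk_size)):
        table.put_item(Item={
            "PK": racer_id,
            "SK": f"{race_id}#part-{part:02d}",   # e.g. "race-1#part-03"
            "results": results[start:start + chunk_size],
        })

def load_results(racer_id, race_id):
    # Items return in sort key order, so the parts reassemble in sequence.
    response = table.query(
        KeyConditionExpression=(
            Key("PK").eq(racer_id) & Key("SK").begins_with(f"{race_id}#part-")
        )
    )
    data = []
    for item in response["Items"]:
        data.extend(item["results"])
    return data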

Additionally, to maximize the storage per row, choose short attribute names. You can also compress data in attributes by storing as GZIP output instead of raw JSON, and using a binary data type for the attribute. This increases processing for the producing and consuming applications, which must compress and decompress the items. However, it can significantly increase the amount of data stored per row.
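
For instance, a producer could compress the nested results before writing and the consumer decompress them after reading. This sketch assumes integer per-second values and a hypothetical binary attribute named results_gz.

import gzip
import json

import boto3

table = boto3.resource("dynamodb").Table("alleycat-table")   # hypothetical

def put_compressed(racer_id, race_id, results):
    # Store the per-second results as GZIP-compressed JSON in a binary attribute.
    payload = gzip.compress(json.dumps(results).encode("utf-8"))
    table.put_item(Item={"PK": racer_id, "SK": race_id, "results_gz": payload})

def get_decompressed(racer_id, race_id):
    item = table.get_item(Key={"PK": racer_id, "SK": race_id})["Item"]
    # Binary attributes come back as a Binary wrapper; .value holds the raw bytes.
    return json.loads(gzip.decompress(item["results_gz"].value))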

To learn more, read Best practices for storing large items and attributes.

Conclusion

This post looks at implementing common relational database patterns using DynamoDB. Instead of using multiple tables, the single-table design pattern can use adjacency lists to provide many-to-many relational functionality.

Using the Alleycat example, I show how to list the data access patterns required by an application, and then model the data using composite keys and indexes to return the relevant data using single queries. Finally, I show how to optimize items and capacity for workloads storing large amounts of data.

For more serverless learning resources, visit Serverless Land.

Analyze Fraud Transactions using Amazon Fraud Detector and Amazon Athena

Post Syndicated from Raghavarao Sodabathina original https://aws.amazon.com/blogs/architecture/analyze-fraud-transactions-using-amazon-fraud-detector-and-amazon-athena/

Organizations with online businesses have to be on guard constantly for fraudulent activity, such as fake accounts or payments made with stolen credit cards. One way they try to identify fraudsters is by using fraud detection applications. Some of these applications use machine learning (ML).

A common challenge with ML is the need for a large, labeled dataset to create ML models to detect fraud. You will also need the skill set and infrastructure to build, train, deploy, and scale your ML model.

In this post, I discuss how to perform fraud detection on a batch of many events using Amazon Fraud Detector. Amazon Fraud Detector is a fully managed service that can identify potentially fraudulent online activities. These can be situations such as the creation of fake accounts or online payment fraud. Unlike general-purpose ML packages, Amazon Fraud Detector is designed specifically to detect fraud. You can analyze fraud transaction prediction results by using Amazon Athena and Amazon QuickSight. I will explain how to review fraud using Amazon Fraud Detector and Amazon SageMaker built-in algorithms.

Batch fraud prediction use cases

You can use a batch predictions job in Amazon Fraud Detector to get predictions for a set of events that do not require real-time scoring. You may want to generate fraud predictions for a batch of events, such as payment fraud, account takeover or compromise, and free tier misuse, while performing an offline proof of concept. You can also use batch predictions to evaluate the risk of events on an hourly, daily, or weekly basis, depending upon your business need.

Batch fraud insights using Amazon Fraud Detector

Organizations such as ecommerce companies and credit card companies use ML to detect fraud. Some of the most common types of fraud include email account compromise (personal or business), new account fraud, and non-payment or non-delivery (which includes compromised card numbers).

Amazon Fraud Detector automates the time-consuming and expensive steps to build, train, and deploy an ML model for fraud detection. Amazon Fraud Detector customizes each model it creates to your dataset, making the accuracy of models higher than current one-size-fits-all ML solutions. And because you pay only for what you use, you can avoid large upfront expenses.

If you want to analyze fraud transactions after the fact, you can perform batch fraud predictions using Amazon Fraud Detector. Then you can store fraud prediction results in an Amazon S3 bucket. Amazon Athena helps you analyze the fraud prediction results. You can create fraud prediction visualization dashboards using Amazon QuickSight.

The following diagram illustrates how to perform fraud predictions for a batch of events and analyze them using Amazon Athena.

Figure 1. Example architecture for analyzing fraud transactions using Amazon Fraud Detector and Amazon Athena

The architecture flow follows these general steps:

  1. Create and publish a detector. First create and publish a detector using Amazon Fraud Detector. It should contain your fraud prediction model and rules. For additional details, see Get started (console).
  2. Create an input Amazon S3 bucket and upload your CSV file. Prepare a CSV file that contains the events you want to evaluate. Then upload your CSV file into the input S3 bucket. In this file, include a column for each variable in the event type associated with your detector. In addition, include columns for EVENT_ID, ENTITY_ID, EVENT_TIMESTAMP, ENTITY_TYPE. Refer to Amazon Fraud Detector batch input and output files for more details. Read Create a variable for additional information on Amazon Fraud Detector variable data types and formatting.
  3. Create an output Amazon S3 bucket. Create an output Amazon S3 bucket to store your Amazon Fraud Detector prediction results.
  4. Perform a batch prediction. You can use a batch predictions job in Amazon Fraud Detector to get predictions for a set of events that do not require real-time scoring. Read more here about Batch predictions. (A minimal API sketch follows this list.)
  5. Review your prediction results. Review your results in the CSV file that is generated and stored in the Amazon S3 output bucket.
  6. Analyze your fraud prediction results.
    • After creating a Data Catalog by using AWS Glue, you can use Amazon Athena to analyze your fraud prediction results with standard SQL.
    • You can develop user-friendly dashboards to analyze fraud prediction results using Amazon QuickSight by creating new datasets with Amazon Athena as your data source.
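
To make steps 4 and 6 more concrete, here is a hedged boto3 sketch. The bucket names, detector, event type, IAM role, and the Glue database, table, and column names are all hypothetical placeholders.

import boto3

fraud = boto3.client("frauddetector")
athena = boto3.client("athena")

# Step 4: score a batch of events from the input CSV in Amazon S3.
fraud.create_batch_prediction_job(
    jobId="fraud-batch-2021-07-26",
    inputPath="s3://example-fraud-input/events.csv",
    outputPath="s3://example-fraud-output/",
    eventTypeName="online_payment",            # hypothetical event type
    detectorName="payment_fraud_detector",     # hypothetical detector
    iamRoleArn="arn:aws:iam::123456789012:role/example-fraud-batch-role",
)

# Step 6: once AWS Glue has cataloged the output bucket, query it with Athena.
athena.start_query_execution(
    QueryString=(
        "SELECT event_id, outcomes, score "
        "FROM fraud_predictions "              # hypothetical Glue table
        "ORDER BY score DESC LIMIT 100"
    ),
    QueryExecutionContext={"Database": "fraud_db"},   # hypothetical database
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)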

Fraud detection using Amazon SageMaker

The Amazon Web Services (AWS) Solutions Implementation, Fraud Detection Using Machine Learning, enables you to run automated transaction processing. This can be on an example dataset or your own dataset. The included ML model detects potentially fraudulent activity and flags that activity for review. The diagram following presents the architecture you can automatically deploy using the solution’s implementation guide and accompanying AWS CloudFormation template.

SageMaker provides several built-in machine learning algorithms that you can use for a variety of problem types. This solution leverages the built-in Random Cut Forest algorithm for unsupervised learning and the built-in XGBoost algorithm for supervised learning. In the SageMaker Developer Guide, you can see how Random Cut Forest and XGBoost algorithms work.

Figure 2. Fraud detection using machine learning architecture on AWS

This architecture can be segmented into three phases.

  1. Develop a fraud prediction machine learning model. The AWS CloudFormation template deploys an example dataset of credit card transactions contained in an Amazon Simple Storage Service (Amazon S3) bucket. An Amazon SageMaker notebook instance then trains different ML models on the dataset.
  2. Perform fraud prediction. The solution also deploys an AWS Lambda function that processes transactions from the example dataset. It invokes the two SageMaker endpoints that assign anomaly scores and classification scores to incoming data points (a minimal sketch of such a call follows this list). An Amazon API Gateway REST API initiates predictions using signed HTTP requests. An Amazon Kinesis Data Firehose delivery stream loads the processed transactions into another Amazon S3 bucket for storage. The solution also provides an example of how to invoke the prediction REST API as part of the Amazon SageMaker notebook.
  3. Analyze fraud transactions. Once the transactions have been loaded into Amazon S3, you can use analytics tools and services for visualization, reporting, ad-hoc queries, and more detailed analysis.
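
As a hedged sketch of the endpoint calls in step 2, a caller such as the solution’s Lambda function might invoke the two endpoints like this; the endpoint names and the CSV payload format are assumptions, not the solution’s actual values.

import boto3

runtime = boto3.client("sagemaker-runtime")

# A single transaction serialized as CSV (the feature order is hypothetical).
payload = "0.12,1.00,57.20,0.00,1.00"

# Unsupervised anomaly score from the Random Cut Forest endpoint.
rcf_response = runtime.invoke_endpoint(
    EndpointName="fraud-detection-rcf",   # hypothetical endpoint name
    ContentType="text/csv",
    Body=payload,
)

# Supervised fraud classification score from the XGBoost endpoint.
xgb_response = runtime.invoke_endpoint(
    EndpointName="fraud-detection-xgb",   # hypothetical endpoint name
    ContentType="text/csv",
    Body=payload,
)

print(rcf_response["Body"].read(), xgb_response["Body"].read())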

By default, the solution is configured to process transactions from the example dataset. To use your own dataset, you must modify the solution. For more information, see Customization.

Conclusion

In this post, we showed you how to analyze fraud transactions using Amazon Fraud Detector and Amazon Athena. You can build fraud insights using Amazon Fraud Detector and the Amazon SageMaker built-in algorithms Random Cut Forest and XGBoost. With the information in this post, you can build your own fraud insights models on AWS, detect fraud faster, and address a variety of fraud types, such as new account fraud, online transaction fraud, and fake reviews, among others.

Read more and get started on building fraud detection models on AWS.

Decrypter FOMO No Mo’: Five Years of the No More Ransom Project

Post Syndicated from Jen Ellis original https://blog.rapid7.com/2021/07/26/decrypter-fomo-no-mo-five-years-of-the-no-more-ransom-project/

The amazing No More Ransom Project celebrates its fifth anniversary today and so we just wanted to take a moment to talk about what it has accomplished and why you should tell all your friends about it.

The name pretty much says it all — No More Ransom aims to help organizations avoid having to pay ransoms for cyber attacks by providing guidance for defending against attacks, connecting victims with law enforcement, and most crucially, by providing free decryption tools. Just think about that for a second … you get hit by ransomware and you get a demand for a ransom payment of tens of thousands of dollars, or more. Recently we’ve seen ransom demands go up as high as tens of millions of dollars. But there’s a chance that rather than having to shell out piles of your hard earned cash (in crypto form), you could, in fact, get what you need for free with minimal fuss or effort.

Sounds too good to be true, right? Like maybe you’re thinking that they only have decryptor tools for old encryptors that aren’t really being used anymore? Well, despite being just five years old today, No More Ransom’s tools have already been downloaded more than six million times, and have saved organizations an estimated $900 million in ransoms that didn’t have to be paid. In fact, the Project offers a staggering 121 free tools, which decrypt 151 ransomware families. So we’re talking about a project that is having a profound impact every day. See? You should totally check it out and tell all your friends about it!

The Project is a great example of what can be achieved with effective public-private partnership. The main backers are Europol, the Dutch Government, McAfee and Kaspersky. They have now recruited about 170 other partners from law enforcement, the private sector, and nonprofits around the world, which I’m guessing goes a long way towards helping them stay up to date with malware samples and decryption tools. Special shout outs should also go to Amazon Web Services and Barracuda for hosting the site.

Here’s the thing though: recently I co-chaired the Ransomware Task Force (RTF), which was brought together by the Institute for Security and Technology, to come up with recommendations for reducing ransomware on an international, societal level. As part of the RTF’s investigations into what is happening in the ransomware landscape, we spoke to numerous organizations that have suffered ransomware attacks, as well as many of the entities they rely on to help them respond — law enforcement, cyber insurers, incident responders, legal counsel. We were surprised to discover that very few of the organizations we spoke with knew about the No More Ransom Project or thought to look there for free decryption tools before paying the ransom. This seemed to be particularly true in the US. Now granted, the tools have been downloaded 6 million times, so definitely some folks do know to look there, often encouraged by law enforcement teams, but there are clearly also many people and organizations who don’t know about it and should.

I suspect that the astonishing ‘six million’ figure is less about awareness and more about how incredibly prevalent ransom attacks have been over the past few years, which is why this project is so important and valuable. So help the No More Ransom Project celebrate its birthday by telling everyone you know about it. You can casually drop that $900 million saving stat into conversation — it’s so impressive I had to mention it twice.

If you’re interested in hearing more of me being incredibly enthusiastic about the Project, check out this week’s special edition of our Security Nation podcast, which will be published on Wednesday, July 28th and features an interview with Philipp Amann, Head of Strategy at the European Cybercrime Centre (EC3), which is part of Europol.

As a tease for the interview, we’ll give Philipp the final word on the No More Ransom Project:

“No More Ransom offers real hope to the victims, and also delivers a clear message to the criminals: the international community stands together with a common goal – to disrupt this criminal business model and to bring offenders to justice.”

Rapid7 Introduces: Kubernetes Security Guardrails

Post Syndicated from Alon Berger original https://blog.rapid7.com/2021/07/26/rapid7-introduces-kubernetes-security-guardrails/

Cloud and container technology provide tremendous flexibility, speed, and agility, so it’s not surprising that organizations around the globe are continuing to embrace cloud and container technology. Many organizations are using multiple tools to secure their often complex cloud and container environments, while struggling to maintain the flexibility, speed, and agility required to keep security intact.

Cloud Security Just Got Better!

In addition to acquiring DivvyCloud, a top-tier Cloud Security Posture Management (CSPM) platform, in 2020, Rapid7 recently announced another successful acquisition — joining forces with Alcide, a leading Kubernetes security start-up that offers advanced Cloud Workload Protection Platform (CWPP) capabilities.

Rapid7 is taking the lead in the CSPM space by leveraging both DivvyCloud’s and Alcide’s capabilities and incorporating them into a single platform: InsightCloudSec, your one-stop shop for superior cloud security solutions.

Learn more about InsightCloudSec here

Built for DevOps, Trusted by Security

In retrospect, 2020 was a tipping point for the Kubernetes community, with a massive increase in adoption across the globe. Many companies, seeking an efficient, cost-effective way to make this huge shift to the cloud, turned to Kubernetes. But this in turn created a growing need to remove the Kubernetes security blind spots. It is for this reason that we are introducing Kubernetes Security Guardrails.

With Kubernetes Security Guardrails, organizations are equipped with a multi-cluster vulnerability scanner that covers rich Kubernetes security best practices and compliance policies, such as CIS Benchmarks. As part of Rapid7’s InsightCloudSec solution, this new ability introduces a platform-based and easy-to-maintain solution for Kubernetes security that is deployed in minutes and is fully streamlined in the Kubernetes pipeline.

Securing Kubernetes With InsightCloudSec

Kubernetes Security Guardrails is the most comprehensive solution for all relevant Kubernetes security requirements, designed from a DevOps perspective with in-depth visibility for security teams. Integrated within minutes, Kubernetes Guardrails simplifies the security assessment for the entire Kubernetes environment and the CI/CD pipeline while creating baseline profiles for each cluster, highlighting and scoring security risks, misconfigurations, and hygiene drifts.

Both DevOps and Security teams enjoy the continuous and dynamic analysis of their Kubernetes deployments, all while seamlessly complying with regulatory requirements for Kubernetes such as PCI, GDPR, and HIPAA.

With Kubernetes Guardrails, Dev teams are able to create a snapshot of cluster risks, delivered with a detailed list of misconfigurations, while detecting real-time hygiene and conformance drifts for deployments running on any cloud environment.

Some of the most common use cases include:

  • Kubernetes vulnerability scanning
  • Hunting misplaced secrets and excessive secret access
  • Workload hardening (from pod security to network policies)
  • Istio security and configuration best practices
  • Ingress controllers security
  • Kubernetes API server access privileges
  • Kubernetes operators best practices
  • RBAC controls and misconfigurations

Rapid7 proudly brings forward a Kubernetes security solution that encapsulates all-in-one capabilities with incomparable coverage for all things Kubernetes.

With a security-first approach and strict compliance adherence, Kubernetes Guardrails enables a better understanding of and control over distributed projects, and helps organizations maintain smooth business operations.

Want to learn more? Register for the webinar on InsightCloudSec and its Kubernetes protection.

Understanding Where the Internet Isn’t Good Enough Yet

Post Syndicated from John Graham-Cumming original https://blog.cloudflare.com/understanding-where-the-internet-isnt-good-enough-yet/

Since March 2020, the Internet has been the trusty sidekick that’s helped us through the pandemic. Or so it seems to those of us lucky enough to have fast, reliable (and often cheap) Internet access.

With a good connection you could keep working (if you were fortunate enough to have a job that could be done online), go to school or university, enjoy online entertainment like streaming movies and TV, games, keep up with the latest news, find out vital healthcare information, schedule a vaccination and stay in contact with loved ones and friends with whom you’d normally be spending time in person.

Without a good connection though, all those things were hard or impossible.

Sadly, access to the Internet is not uniformly distributed. Some have cheap, fast, low latency, reliable connections; others have some combination of expensive, slow, high latency, and unreliable connections; still others have no connection at all. Close to 60% of the world has Internet access, leaving a huge 40% without it at all.

This inequality of access to the Internet has real-world consequences. Without good access it is so much harder to communicate, to get vital information, to work and to study. Inequality of access isn’t a technical problem, it’s a societal problem.

This week, Cloudflare is announcing Project Pangea with the goal of helping reduce this inequality. We’re helping community networks get onto the Internet cheaply, securely and with good bandwidth and latency. We can’t solve all the challenges of bringing fast, cheap broadband access to everyone (yet) but we can give fast, reliable transit to ISPs in underserved communities to help move in that direction. Please refer to our Pangea announcement for more details.

The Tyranny of Averages

To understand why Project Pangea is important, you need to understand how different the experience of accessing the Internet is around the world. From a distance, the world looks blue and green. But we all know that our planet varies wildly from place to place: deserts and rainforests, urban jungles and placid rural landscapes, mountains, valleys and canyons, volcanos, salt flats, tundra, and verdant, rolling hills.

Cloudflare is in a unique position to measure the performance and reach of the Internet over this vast landscape. We have servers in more than 200 cities in over 100 countries, and we process tens of trillions of Internet requests every month. Our network, our customers, and their users span the globe, across every country and every network.

Zoom out to the level of a city, county, state, or country, and average Internet performance can look good — or, at least, acceptable. Zoom in, however, and the inequalities start to show. Perhaps part of a county has great performance, and another limps along at barely dial-up speeds — or worse. Or perhaps a city has some neighborhoods with fantastic fiber service, and others that are underserved and struggling with spotty access.

Inequality of Internet access isn’t a distant problem, it’s not limited to developing countries, it exists in the richest countries in the world as well as the poorest. There are still many parts of the world  where a Zoom call is hard or impossible to make. And if you’re reading this on a good Internet connection, you may be surprised to learn that places with poor or no Internet are not far from you at all.

Bandwidth and Latency in Eight Countries

For Impact Week, we’ve analyzed Internet data in the United States, Brazil, United Kingdom, Germany, France, South Africa, Japan, and Australia to build a picture of Internet performance.

Below, you’ll find detailed maps of where the Internet is fast and slow (focusing on available bandwidth) and far away from the end user (at least in terms of the latency between the client and server). We’d have loved to have used a single metric, however, it’s hard for a single number to capture the distribution of good, bad, and non-existent Internet traffic in a region. It’s for that reason that we’ve used two metrics to represent performance: latency and bandwidth (otherwise known as throughput). The maps below are colored to show the differences in bandwidth and latency and answer part of the question: “How good is the Internet in different places around the world?”

As we like to say, we’re just getting started with this — we intend to make more of this data and analysis available in the near future. In the meantime, if you’re a local official who wants to better understand your community’s relative performance, please reach out — we’d love to connect with you. Or, if you’re interested in your own Internet performance, you can visit speed.cloudflare.com to run a personalized test on your connection.

A Quick Refresher on Latency and Bandwidth

Before we begin, a quick reminder: latency (usually measured in milliseconds or ms) is the time it takes for communications to go to an Internet destination from your device and back, whereas bandwidth is the amount of data that can be transferred in a second (it’s usually measured in megabits per second or Mbps).

Both latency and bandwidth affect the performance of an Internet connection. High latency particularly affects things like online gaming where quick responses from servers are needed, but also shows up by slowing down the loading of complex web pages, and even interrupting some streaming video. Low bandwidth makes downloading anything slow: be it images on a webpage, the new app you want to try out on your phone, or the latest movie.

Blinking your eyes takes about 100ms. You’ll begin to notice performance changes at around 60ms of latency, and below 30ms is gold-class performance, with little to no delay in video streaming or gaming.

United States
United States median throughput: 50.27Mbps
US median latency: 46.69ms

The US government has long recognized the importance of improving the Internet for underserved communities, but the Federal Communications Commission (FCC), the US agency responsible for determining where investment is most needed, has struggled to accurately map Internet access across the country.  Although the FCC has embarked on a new data collection effort to improve the accuracy of existing maps, the US government still lacks a comprehensive understanding of the areas that would most benefit from broadband investment.

Cloudflare’s data confirms the overall concerns with inconsistent access to the Internet and helps fill in some of the current gaps.  A glance at the two maps of the US below will show that, even zoomed out to county level, there is inequality across the country. High latency and low bandwidth stand out as red areas.

US locations with the lowest latency (best) and highest latency (worst) are as follows.

Best performing geographies by latency | Worst performing geographies by latency
La Habra, California | Parrottsville, Tennessee
Midlothian, Texas | Loganville, Wisconsin
Los Alamitos, California | Mackinaw City, Michigan
St Louis, Missouri | Reno, Nevada
Fort Worth, Texas | Eva, Tennessee
Sugar Grove, North Carolina | Milwaukee, Wisconsin
Rockwall, Texas | Grove City, Minnesota
Justin, Texas | Sacred Heart, Minnesota
Denton, Texas | Scottsboro, Alabama
Hampton, Georgia | Vesta, Minnesota

When thinking about bandwidth, 5 to 10 Mbps is generally good enough for video conferencing, but ultra-HD TV watching might easily consume up to 20 Mbps. For context, the Federal Communications Commission (FCC) defines the minimum bandwidth for “Advanced Service” at 25 Mbps.

The best performing areas (i.e., those with the highest bandwidth) in the US tell an interesting story. New York City comes out on top, but if you were to zoom in on the city you’d find pockets of inequality. You can read more about our partnership with NYC Mesh in the Project Pangea post and how they are helping bring better Internet to underserved parts of the Big Apple. Notice how the tyranny of averages can disguise a problem.

Best performing geographies by throughput | Worst performing geographies by throughput
New York, New York | Ozark, Missouri
Hartford, Connecticut | Stanly, North Carolina
Avery, North Carolina | Ellis, Kansas
Red Willow, Nebraska | Marion, West Virginia
McLean, Kentucky | Sedgwick, Kansas
Franklin, Alabama | Calhoun, West Virginia
Montgomery, Pennsylvania | Jasper, Georgia
Cook, Illinois | Buchanan, Missouri
Montgomery, Maryland | Wetzel, West Virginia
Monroe, Pennsylvania | North Slope, Alaska

Contrary to popular discourse about access to the Internet as a product of the rural-urban divide, we found that poor performance was not unique to rural areas. Los Angeles, Milwaukee, Florida’s Orange County, Fairfax, San Bernardino, Knox County, and even San Francisco have pockets of uniformly poor performance, often while adjoining ZIP codes have stronger performance.

Even in areas with excellent Internet connectivity, the same connectivity to the same resources can cost wildly different amounts. Internet prices for end users correlate with the number of ISPs in an area, i.e. the greater the consumer choice, the better the price. President Biden’s recent competition Executive Order called out the lack of choice for broadband, noting “More than 200 million U.S. residents live in an area with only one or two reliable high-speed internet providers, leading to prices as much as five times higher in these markets than in markets with more options.”

The following cities have the greatest choice of Internet providers:

  • New York, New York
  • Los Angeles, California
  • Chicago, Illinois
  • Dallas, Texas
  • Washington, District of Columbia
  • Jersey City, New Jersey
  • Newark, New Jersey
  • Secaucus, New Jersey
  • Columbus, Ohio

One might expect less populated areas to have uniformly slower performance. There are, however, pockets of poor performance even in densely populated areas such as Los Angeles (California), Milwaukee (Wisconsin), Orange County (Florida), Fairfax (Virginia),  San Bernardino (California), Knox County (Tennessee), and even San Francisco (California).

In as many as 9% of ZIP codes, average latency exceeds 150ms, above the threshold generally considered acceptable for running a videoconferencing service such as Zoom.

Australia
Australia median throughput: 33.34Mbps
Australia median latency: 42.04ms

In general, Australia seems to suffer from very poor broadband speeds, often not capable of sustaining a household’s video streaming and possibly struggling with multiple video calls. The problem isn’t just a rural one either: while the inner cities showed good broadband speeds, often with fiber-to-the-building Internet access, suburban areas suffered. Larger suburban areas like the Illawarra had similar speeds to more rural centers like Wagga Wagga, showing this is more than just an urban divide.

Best performing geographies by throughput | Worst performing geographies by throughput
Inner West Sydney, New South Wales | West Tamar, Tasmania
Port Phillip, Victoria | Bassendean, Western Australia
Woollahra, New South Wales | Alexandrina, South Australia
Brimbank, Victoria | Bayswater, Western Australia
Lake Macquarie, New South Wales | Augusta-Margaret River, Western Australia
Hawkesbury, New South Wales | Goulburn Mulwaree, New South Wales
Sydney, New South Wales | Goyder, South Australia
Wentworth, New South Wales | Kingborough, Tasmania
Hunters Hill, New South Wales | Cottesloe, Western Australia
Blacktown, New South Wales | Lithgow, New South Wales

The irony is that, from a latency perspective, Australia actually performs quite well.

Best performing geographies by latency | Worst performing geographies by latency
Port Phillip, Victoria | Narromine, New South Wales
Mornington Peninsula, Victoria | North Sydney, New South Wales
Whittlesea, Victoria | Northern Midlands, Tasmania
Penrith, New South Wales | Swan, Western Australia
Mid-Coast, New South Wales | Wanneroo, Western Australia
Campbelltown, New South Wales | Snowy Valleys, New South Wales
Northern Beaches, New South Wales | Parkes, New South Wales
Strathfield, New South Wales | Broome, Western Australia
Latrobe, Victoria | Griffith, New South Wales
Surf Coast, Victoria | Busselton, Western Australia

Japan
Japan median throughput: 61.4Mbps
Japan median latency: 31.89ms

Japan’s Internet has consistently low latency, including in distant areas such as Okinawa prefecture, 1,000 miles away from Tokyo.

Best performing geographies by latency | Worst performing geographies by latency
Nara | Yamagata
Osaka | Okinawa
Shiga | Miyazaki
Kōchi | Nagasaki
Kyoto | Ōita
Tochigi | Kagoshima
Tokushima | Yamaguchi
Wakayama | Tottori
Kanagawa | Saga
Aichi | Ehime

However, it’s a different story when it comes to bandwidth. Several prefectures in Kyushu Island, Okinawa Prefecture, and Western Honshu have performance falling behind the rest of the country. Unsurprisingly, the best Internet performance is seen in Tokyo, with the highest concentration of people and data centers.

Best performing geographies by throughput | Worst performing geographies by throughput
Osaka | Tottori
Tokyo | Shimane
Kanagawa | Yamaguchi
Nara | Okinawa
Chiba | Saga
Aomori | Miyazaki
Hyōgo | Kagoshima
Kyoto | Yamagata
Tokushima | Nagasaki
Kōchi | Fukui

United Kingdom
United Kingdom median throughput: 53.8Mbps
United Kingdom median latency: 34.12ms

The United Kingdom has good latency throughout most of the country; bandwidth, however, is a different story. The best performance is seen in inner London as well as some other larger cities like Manchester. London and Manchester are also home to the UK’s largest Internet exchange points. More effort to localize data in other cities, like Edinburgh, would be an important step toward improving performance for those regions.

Best performing geographies by latency | Worst performing geographies by latency
Sutton | Brent
Milton Keynes | Ceredigion
Lambeth | Westminster
Cardiff | Scottish Borders
Harrow | Shetland Islands
Hackney | Middlesbrough
Islington | Fermanagh and Omagh
Kensington and Chelsea | Slough
Thurrock | Highland
Kingston upon Thames | Denbighshire

Best performing geographies by throughput | Worst performing geographies by throughput
City of London | Orkney Islands
Slough | Shetland Islands
Lambeth | Blaenau Gwent
Surrey | Ceredigion
Tower Hamlets | Isle of Anglesey
Coventry | Fermanagh and Omagh
Wrexham | Scottish Borders
Islington | Denbighshire
Vale of Glamorgan | Midlothian
Leicester | Rutland

Germany
Germany median throughput: 48.79Mbps
Germany median latency: 42.1ms

Germany has some of its best performance centered on Frankfurt am Main, which is one of the major Internet hubs of the world. What was formerly East Germany, however, has higher latency and slower speeds, leading to poorer Internet performance.

Best performing geographies by latency | Worst performing geographies by latency
Erlangen | Harz
Coesfeld | Nordwestmecklenburg
Weißenburg-Gunzenhausen | Saale-Holzland-Kreis
Heinsberg | Elbe-Elster
Main-Taunus-Kreis | Vorpommern-Greifswald
Main-Kinzig-Kreis | Vorpommern-Rügen
Darmstadt | Kyffhäuserkreis
Peine | Barnim
Herzogtum Lauenburg | Rostock
Segeberg | Meißen

Best performing geographies by throughput | Worst performing geographies by throughput
Weißenburg-Gunzenhausen | Saale-Holzland-Kreis
Frankfurt am Main | Weimarer Land
Kassel | Vulkaneifel
Cochem-Zell | Kusel
Dingolfing-Landau | Spree-Neiße
Bodenseekreis | Eisenach
Sankt Wendel | Unstrut-Hainich-Kreis
Landshut | Saale-Orla-Kreis
Ludwigsburg | Weimar
Speyer | Südliche Weinstraße

France
France median throughput: 48.51Mbps
France median latency: 54.2ms

Paris has long been the Internet hub in France. Marseille has started to grow as a hub, especially with the large number of submarine cables landing there. Lyon and Bordeaux are the interconnection hubs where we’ll start to see growth next. These four cities are also where we see the best performance, with the highest speeds and lowest latencies.

Best performing geographies by latency | Worst performing geographies by latency
Antony | Clamecy
Boulogne-Billancourt | Beaune
Lyon | Ambert
Lille | Commercy
Versailles | Vitry-le-François
Nogent-sur-Marne | Villefranche-de-Rouergue
Bobigny | Lure
Marseille | Avranches
Saint-Germain-en-Laye | Oloron-Sainte-Marie
Créteil | Privas

Best performing geographies by throughput | Worst performing geographies by throughput
Boulogne-Billancourt | Clamecy
Antony | Bellac
Marseille | Issoudun
Lille | Vitry-le-François
Nanterre | Sarlat-la-Canéda
Paris | Segré
Lyon | Rethel
Bobigny | Avallon
Versailles | Privas
Saverne | Sartène

Brazil
Brazil median throughput: 26.28Mbps
Brazil median latency: 49.25ms

Much of Brazil has good, low latency Internet performance, given geographic proximity to the major Internet hubs in São Paulo and Rio de Janeiro. Much of the Amazon has low speeds and high latency, for those parts that are actually connected to the Internet.

Campinas is one stand out, with some of the best performing Internet across Brazil, and is also the site of a recent Cloudflare data center launch.

Best performing geographies by latency | Worst performing geographies by latency
Vale do Paraiba Paulista | Vale do Acre
Assis | Sul Amazonense
Sudoeste Amazonense | Marajo
Litoral Sul Paulista | Vale do Jurua
Baixadas | Sul de Roraima
Centro Fluminense | Centro Amazonense
Sul Catarinense | Madeira-Guapore
Vale do Paraiba Paulista | Sul do Amapa
Noroeste Fluminense | Metropolitana de Belem
Bauru | Baixo Amazonas

Best performing geographies by throughput | Worst performing geographies by throughput
Metropolitana do Rio de Janeiro | Sudoeste Amazonense
Campinas | Marajo
Metropolitana de São Paulo | Norte Amazonense
Oeste Catarinense | Baixo Amazonas
Marilia | Sudeste Rio-Grandense
Vale do Itajaí | Sul Amazonense
Sul Catarinense | Centro-Sul Cearense
Sudoeste Paranaense | Sudoeste Paraense
Grande Florianópolis | Sertão Sergipano
Norte Catarinense | Sertoes Cearenses

South Africa
South Africa median throughput: 6.4Mbps
South Africa median latency: 59.78ms

Johannesburg has been the historical hub for South Africa’s Internet. This is where many Internet giants have built data centers, and it shows: latency increases with distance from Johannesburg. South Africa has since grown two more Internet hubs, in Cape Town and Durban, and Internet performance follows these three cities. However, much of South Africa lacks the performance needed for high-definition video streaming and video conferencing.

Best performing geographies by latency | Worst performing geographies by latency
Siyancuma | Dr Beyers Naude
uMshwathi | Mogalakwena
City of Tshwane | Ulundi
Breede Valley | Modimolle/Mookgophong
City of Cape Town | Maluti a Phofung
Overstrand | Moqhaka
Local Municipality of Madibeng | Thulamela
Metsimaholo | Walter Sisulu
Stellenbosch | Dawid Kruiper
Ekurhuleni | Ga-Segonyana

Best performing geographies by throughput | Worst performing geographies by throughput
Siyancuma | Dr Beyers Naude
City of Cape Town | Walter Sisulu
City of Johannesburg | Lekwa-Teemane
Ekurhuleni | Dr Nkosazana Dlamini Zuma
Drakenstein | Emthanjeni
eThekwini | Dawid Kruiper
Buffalo City | Swellendam
uMhlathuze | Merafong City
City of Tshwane | Blue Crane Route
City of Matlosana | Modimolle/Mookgophong

Case Study on ISP Concentration’s Impact on Performance: Alabama, USA

One question we had as we went through a lot of this data: does ISP concentration impact Internet performance?

On one hand, there’s a case to be made that more ISP competition results in no one vendor being able to invest sufficient resources to build out a fast network. On the other hand, well, classical economics would suggest that monopolies are bad, right?

To investigate the question further, we did a deep dive into Alabama in the United States, the 24th most populous state in the US. We tracked two key metrics across 65 counties: Internet performance as defined by average download speed, and ISP concentration, as measured by the largest ISP’s traffic share.

Here is the raw data:

County | Avg. Download Speed (Mbps) | Largest ISP’s Traffic Share | County | Avg. Download Speed (Mbps) | Largest ISP’s Traffic Share
Marion | 53.77 | 41% | Franklin | 32.01 | 83%
Escambia | 29.14 | 43% | Coosa | 82.15 | 83%
Etowah | 56.07 | 49% | Crenshaw | 44.49 | 84%
Jackson | 37.77 | 52% | Randolph | 21.4 | 86%
Winston | 59.25 | 56% | Lamar | 33.94 | 86%
Montgomery | 79.5 | 58% | Autauga | 65.55 | 86%
Baldwin | 49.06 | 58% | Choctaw | 23.97 | 87%
Houston | 73.73 | 61% | Butler | 29.86 | 90%
Dallas | 86.92 | 62% | Pike | 50.54 | 92%
Marshall | 59.93 | 62% | Sumter | 38.52 | 91%
Chambers | 72.05 | 63% | Pickens | 43.76 | 92%
Jefferson | 99.84 | 64% | Marengo | 42.89 | 92%
Elmore | 71.05 | 66% | Macon | 12.69 | 92%
Fayette | 41.7 | 68% | Lawrence | 62.87 | 92%
Lauderdale | 62.87 | 69% | Bullock | 23.89 | 92%
Colbert | 47.91 | 70% | Chilton | 17.13 | 95%
DeKalb | 58.55 | 70% | Wilcox | 62.12 | 93%
Morgan | 61.78 | 71% | Monroe | 20.74 | 96%
Washington | 5.14 | 72% | Dale | 55.46 | 97%
Geneva | 32.01 | 73% | Coffee | 58.18 | 97%
Lee | 78.1 | 73% | Conecuh | 34.94 | 97%
Tuscaloosa | 58.85 | 76% | Cleburne | 38.25 | 97%
Cullman | 61.03 | 77% | Clarke | 38.14 | 97%
Covington | 35.48 | 78% | Calhoun | 64.19 | 97%
Shelby | 69.66 | 79% | Lowndes | 9.91 | 98%
St. Clair | 33.05 | 79% | Russell | 49.48 | 98%
Blount | 40.58 | 80% | Henry | 4.69 | 98%
Mobile | 68.77 | 80% | Limestone | 71.6 | 98%
Walker | 39.36 | 81% | Bibb | 70.14 | 98%
Barbour | 51.48 | 82% | Cherokee | 17.13 | 99%
Tallapoosa | 60 | 82% | Greene | 4.76 | 99%
Madison | 99 | 83% | Clay | 3.42 | 100%

Across most of Alabama, we see very high ISP concentration. For the majority of counties, the largest ISP has an 80% (or higher) share of traffic, while all the other ISPs combined operate at a considerably smaller scale. In only three counties (Marion, Escambia, and Etowah) does the largest ISP carry less than 50% of user traffic. Interestingly, Etowah is one of the best performing counties in the state, while Henry, a county where 98% of Internet traffic is concentrated behind a single ISP, is among the worst performing.

Where it gets interesting is when you plot the data, tracking the non-dominant ISPs’ traffic share (which is simply 100% minus the traffic share of the dominant ISP) against performance (as measured by download speed), and then use a linear line of best fit to find the relationship. Here’s what you get:

Chart: average download speed plotted against the non-dominant ISPs’ traffic share, with a linear line of best fit

As you can see, there is a strong positive relationship between the non-dominant ISP’s traffic share and the average download speed. As the non-dominant ISP increases its traffic share, Internet speeds tend to improve. The conclusion is clear: if you want to improve Internet performance in a region, foster more competition between multiple Internet service providers.
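
To illustrate the method, here is a minimal Python sketch that computes a linear line of best fit for a handful of counties taken from the table above (the analysis described here used all of the counties).

import numpy as np

# (county, non-dominant ISPs' traffic share in %, average download speed in Mbps),
# sampled from the Alabama table above.
samples = [
    ("Marion", 59, 53.77),
    ("Etowah", 51, 56.07),
    ("Jefferson", 36, 99.84),
    ("Madison", 17, 99.0),
    ("Monroe", 4, 20.74),
    ("Henry", 2, 4.69),
    ("Greene", 1, 4.76),
    ("Clay", 0, 3.42),
]

x = np.array([share for _, share, _ in samples])
y = np.array([speed for _, _, speed in samples])

# Least-squares line of best fit: speed ~ slope * share + intercept.
slope, intercept = np.polyfit(x, y, 1)
print(f"slope = {slope:.2f} Mbps per percentage point, intercept = {intercept:.2f} Mbps")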

The Other Performance Challenge: Limited ISP Exchanges, and Tromboning

There is more to the story, however, than just concentration. Alabama, like a lot of other regions that aren’t served well by ISPs, faces another performance challenge: poor routing, also sometimes known as “tromboning”.

Consider Tuskegee in Alabama, home to a local university.

In Tuskegee, choice is limited. Consumers only have a single choice for high-speed broadband. But even once an off-campus student has local access to the Internet, it isn’t truly local: Tuskegee students on a different ISP than their university will likely see their traffic detour all the way through Atlanta (two hours northeast by car!) before making its way back to school.

This doesn’t happen in isolation: today, the largest ISPs only exchange traffic with other networks in a handful of cities, notably Seattle, San Jose, Los Angeles, Dallas, Chicago, Atlanta, Miami, Ashburn, and New York City.

If you’re in one of these big cities, you’re unlikely to suffer from tromboning. But if you’re not? Your Internet traffic can often have to travel further away before looping back, similar to the shape of a trombone, reducing your Internet performance. Tromboning contributes to inefficiency and drives up the cost of Internet access. An increasing amount of traffic is wastefully carried to cities far away, instead of keeping the data local.

You can visualize how your Internet traffic is flowing, by using tools like traceroute.

As an example, we ran tests using RIPE Atlas probes to Facebook from Alabama, and unfortunately found extremes where traffic can sometimes take a highly circuitous route — traffic going to Atlanta, then Ashburn, Paris, Amsterdam, before making its way back to Alabama. The path begins on AT&T’s network and goes to Atlanta where it enters the network for Telia (an IP transit provider), crosses the Atlantic, meets Facebook, and then comes back.

Traceroute to 157.240.201.35 (157.240.201.35), 48 byte packets
 1  192.168.6.1  1.435ms 0.912ms 0.636ms
 2  99.22.36.1 99-22-36-1.lightspeed.dctral.sbcglobal.net AS7018  1.26ms 1.134ms 1.107ms
 3  99.173.216.214 AS7018  3.185ms 3.173ms 3.099ms
 4  12.122.140.70 cr84.attga.ip.att.net AS7018  11.572ms 13.552ms 15.038ms
 5  * * *
 6  192.205.33.42 AS7018  8.695ms 9.185ms 8.703ms
 7  62.115.125.129 ash-bb2-link.ip.twelve99.net AS1299  23.53ms 22.738ms 23.012ms
 8  62.115.112.243 prs-bb1-link.ip.twelve99.net AS1299  115.516ms 115.52ms 115.211ms
 9  62.115.134.96 adm-bb3-link.ip.twelve99.net AS1299  113.487ms 113.405ms 113.25ms
10  62.115.136.195 adm-b1-link.ip.twelve99.net AS1299  115.443ms 115.703ms 115.45ms
11  62.115.148.231 facebook-ic331939-adm-b1.ip.twelve99-cust.net AS1299  134.149ms 113.885ms 114.246ms
12  129.134.51.84 po151.asw02.ams2.tfbnw.net AS32934  113.27ms 113.078ms 113.149ms
13  129.134.48.101 po226.psw04.ams4.tfbnw.net AS32934  114.529ms 114.439ms 117.257ms
14  157.240.38.227 AS32934  113.281ms 113.365ms 113.448ms
15  157.240.201.35 edge-star-mini-shv-01-ams4.facebook.com AS32934  115.013ms 115.223ms 115.112ms

The intent here isn’t to shame AT&T, Telia, or Facebook — nor is this challenge unique to them. Facebook’s content is undoubtedly cached in Atlanta and the request from Alabama should go no further than that. While many possible conditions within and between these three networks could have caused this tromboning, in the end, the consumer suffers.

The solution? Have more major ISPs exchange in more cities and with more networks. Of course, there’d be an upfront cost involved in doing so, even if it would reduce cost more over the long run.

Conclusion

As William Gibson famously observed: the future is here, but it’s just not evenly distributed.

One of the clearest takeaways from the data and analysis presented here is that Internet access varies tremendously across geographies. But it’s not just a case of the developed world vs the developing, or even rural vs urban. There are underserved urban communities and regions of the developed world that do not score as highly as you might expect.

Furthermore, our case study of Alabama shows that the structure of the ISP market is incredibly important to promoting performance. We found a strong positive correlation between more competition and faster performance. Similarly, there’s a lot of opportunity for more networks to interconnect in more places, to avoid bad routing.

Finally, if we want to get the other 40% of the world online, we are going to need more initiatives that drive up access and drive down cost. There’s plenty of scope to help — and we’re excited to be launching Project Pangea to help.

Why I joined Cloudflare — and why I’m excited about Project Pangea

Post Syndicated from Roderick Fanou original https://blog.cloudflare.com/why-i-joined-cloudflare-and-why-im-excited-about-project-pangea/

If you are well-prepared to take up the challenge, you will get to experience a moment where you are stepping forward to help build a better world. Personally, I felt exactly that when about a month ago, after a long and (COVID) complicated visa process, I joined Cloudflare as a Systems Engineer in Austin, Texas.

In the early 2000s, while travelling throughout the Benin Republic (my home country) and West Africa more generally, I experienced how challenging accessing the Internet was. I recall that, as students, we often connected to the web from cybercafés through limited bandwidth purchased at high cost. It was a luxury to have a broadband connection at home. When access was free (say, from high school premises or at university), we still had bandwidth constraints, and often we could not connect for long. The Internet can efficiently help tackle the issues encountered by populations in such regions (in areas like education, health, communications, …), but the lack of easy and affordable access made it difficult to leverage. It is in this context that I chose to pursue my studies in telecoms, with the hope of somehow giving back to the community by helping improve Internet access in the region.

My internship at Euphorbia Sarl, a local ISP, introduced me to the process of designing, finding, and deploying suitable technologies to satisfy the interconnection needs of the region. But more than that, it showed me first hand the day-to-day challenges encountered by network operators in Africa. It highlighted the need for more research on the Internet in developing regions, most notably measurement studies, to identify the root causes of the lack of connectivity in the (West) African region.

It was with this experience that I then pursued my doctoral studies at IMDEA Networks Institute and UC3M (Spain) and collaborated with stakeholders and researchers to investigate the characteristics and routing dynamics of the Internet in Africa; and then my postdoc at CAIDA/UCSD (US), looking at the occurrence of network congestion worldwide, and the impact of the SACS cable deployment between Angola and Brazil on Internet routing. While studying the network in those underserved and geographically large regions, we noticed that much of the web content was still served from the US and Europe. We also identified a lack of physical infrastructure and interconnections between local and global networks, alongside a lack of local content, as the root causes of packet-tromboning, high transit costs, and the persistently poor quality of service delivered to the users in the region.

Of course, local communities, network operators, stakeholders, and Internet bodies such as the Internet Society or Packet Clearing House have been working towards bridging this gap. But there is still much room for improvement. I believe this (hopefully soon) post-pandemic era — where more and more activities are shifting online — represents the best opportunity to solve this persistent issue. COVID has forced us to reflect, and one of the critical questions I asked myself was: after so many years of research, how can I — like a frontline doctor or nurse in the pandemic — actively and effectively help mitigate these connectivity issues, creating a better Internet for everyone, notably for those in underserved areas? The answer for me was to switch out of academia into tech. But which company?

As I progressed through the interview process with Cloudflare, it soon became clear that this was the answer to my question above. I discovered that Cloudflare’s values and mission were very much aligned with my own. I also loved the culture, how welcoming and diverse the team is, and how attentive and approachable the leadership is. I was impressed by the network footprint and, notably, by its spread across every Internet region, especially the growing number of data centers in Latin America and Africa. I had to travel back to West Africa during my visa process, and my experience there only reinforced what I already knew: we need more local content in developing regions, we need more support for local communities, and we need to better enable developing regions.

Fast-forward to my starting date, I was pleased to find out that Cloudflare frequently organizes innovation weeks — like Birthday Week — during which the company gives back to the community. There have been several noteworthy initiatives, including Project Fair Shot to enable communities to vaccinate fairly, and Project Galileo, protecting at-risk public interest groups.

But what has me truly excited is Project Pangea, which launches today as part of Impact Week. Project Pangea helps improve security and connectivity for community networks at no cost. Cloudflare’s network spans 200+ cities and has one of the largest numbers of interconnects and peers worldwide. It also delivers a state-of-the-art DNS service designed with privacy in mind, and an intelligent routing system that constantly learns the best and least congested Internet routes from and towards any region in the world. My research on Internet performance in developing regions makes me believe that community networks — and their end users — will benefit tremendously from such a partnership. It is so exciting to be part of this journey, which is why I am sharing my excitement through this post.

I would like to conclude by making an appeal to all stakeholders in developing regions — including network operators, and bodies such as the ISOC and the RIRs. Please do not hesitate to enquire about Project Pangea. I truly believe that Cloudflare will be a tremendous partner to you, and that your network — and your community — will benefit from it.

Announcing Project Pangea: Helping Underserved Communities Expand Access to the Internet For Free

Post Syndicated from Marwan Fayed original https://blog.cloudflare.com/pangea/

Half of the world’s population has no access to the Internet, with many more limited to poor, expensive, and unreliable connectivity. This problem persists despite large levels of public investment, private infrastructure, and effort by local organizers.

Today, Cloudflare is excited to announce Project Pangea: a piece of the puzzle to help solve this problem. We’re launching a program that provides secure, performant, reliable access to the Internet for community networks that support underserved communities, and we’re doing it for free[1] because we want to help build an Internet for everyone.

What is Cloudflare doing to help?

Project Pangea is Cloudflare’s project to help bring underserved communities secure connectivity to the Internet through Cloudflare’s global and interconnected network.

Cloudflare is offering our suite of network services — Cloudflare Network Interconnect, Magic Transit, and Magic Firewall — for free to nonprofit community networks, local networks, or other networks primarily focused on providing Internet access to local underserved or developing areas. This service would dramatically reduce the cost for communities to connect to the Internet, with industry leading security and performance functions built-in:

  • Cloudflare Network Interconnect provides access to Cloudflare’s edge in 200+ cities across the globe through physical and virtual connectivity options.
  • Magic Transit acts as a conduit to and from the broader Internet and protects community networks by mitigating DDoS attacks within seconds at the edge.
  • Magic Firewall gives community networks access to a network-layer firewall as a service, providing further protection from malicious traffic.

We’ve learned from working with customers that pure connectivity is not enough to keep a network sustainably connected to the Internet. Malicious traffic, such as DDoS attacks, can target a network and saturate Internet service links, which can lead to providers aggressively rate limiting or even entirely shutting down incoming traffic until the attack subsides. This is why we’re including our security services in addition to connectivity as part of Project Pangea: no attacker should be able to keep communities closed off from accessing the Internet.

What is a community network?

Community networks have existed almost as long as commercial subscribership to the Internet that began with dial-up service. The Internet Society, or ISOC, describes community networks as happening “when people come together to build and maintain the necessary infrastructure for Internet connection.”

Most often, community networks emerge from need, and in response to the lack or absence of available Internet connectivity. They consistently demonstrate success where public and private-sector initiatives have either failed or under-delivered. We’re not talking about stop-gap solutions here, either — community networks around the world have been providing reliable, sustainable, high-quality connections for years.

Many will operate only within their communities, but many others can grow, and have grown, to regional or national scale. The most common models of governance and operation are as not-for-profits or cooperatives, models that ensure reinvestment within the communities being served. For example, we see networks that reinvest their proceeds to replace Wi-Fi infrastructure with fibre-to-the-home.

Cloudflare celebrates these networks’ successes, and also the diversity of the communities that these networks represent. In that spirit, we’d like to dispel myths that we encountered during the launch of this program — many of which we wrongly assumed or believed to be true — because these myths turn out to be barriers that communities are so often forced to overcome. Community networks are built on knowledge sharing, so we’re sharing some of that knowledge here to help others accelerate community projects and policies, rather than rely on assumptions that impede progress.

Myth #1: Only very rural or remote regions are underserved and in need. It’s true that remote regions are underserved. It is also true that underserved regions exist within 10 km (about six miles) of large city centers, and even within the largest cities themselves, as evidenced by the existence of some of our launch partners.

Myth #2: Remote, rural, or underserved is also low-income. This might just be the biggest myth of all. Rural and remote populations are often thriving communities that can afford service, but have no access. In contrast, urban community networks often emerge for egalitarian reasons: the access that is available is unaffordable to many.

Myth #3: Service is necessarily more expensive. This myth is sometimes expressed by statements such as, “if large service providers can’t offer affordable access, then no one can.”  More than a myth, this is a lie. Community networks (including our launch partners) use novel governance and cost models to ensure that subscribers pay rates similar to the wider market.

Myth #4: Technical expertise is a hard requirement and is unavailable. There is a rich body of evidence and examples showing that, with small amounts of training and support, communities can build their own local networks cheaply and reliably with commodity hardware and non-specialist equipment.

These myths aside, there is one truth: the path to sustainability is hard. The start and initial growth of community networks often consists of volunteer time or grant funding, which are difficult to sustain in the long-term. Eventually the starting models need to transition to models of “willing to charge and willing to pay” — Project Pangea is designed to help fill this gap.

What is the problem?

Communities around the world can and have put up Wi-Fi antennas and laid their own fibre. Even so, and however well-connected the community is to itself, Internet services are prohibitively expensive — if they can be found at all.

Two elements are required to connect to the Internet, and each incurs its own cost:

  • Backhaul connections to an interconnection point — the connection point may be anything from a local cabinet to a large Internet exchange point (IXP).
  • Internet service is provided by a network that interfaces with the wider Internet and agrees to route traffic to and from it on behalf of the community network.

These are distinct elements. Backhaul service carries data packets along a physical link (a fibre cable or wireless medium). Internet service is separate and may be provided over that link, or at its endpoint.

The cost of Internet service for networks is both dominant and variable (with usage), so in most cases it is cheaper to purchase both as a bundle from service providers that also own or operate their own physical network. Telecommunications and energy companies are prime examples.

However, the operating costs and complexity of long-distance backhaul are significantly lower than the costs of Internet service. If reliable, high-capacity service were affordable, then community networks could extend their knowledge and governance models sustainably to also provide their own backhaul.

For all that community networks can build, establish, and operate, the one element entirely outside their control is the cost of Internet service — a problem that Project Pangea helps to solve.

Why does the problem persist?

On this subject, I — Marwan — can only share insights drawn from prior experience as a computer science professor, and a co-founder of HUBS c.i.c., launched with talented professors and a network engineer. HUBS is a not-for-profit backhaul and Internet provider in Scotland. It is a cooperative of more than a dozen community networks — some that serve communities with no roads in or out — across thousands of square kilometers along Scotland’s West Coast and Borders regions. As is true of many community networks, not least some of Pangea’s launch partners, HUBS is award-winning and engages in advocacy and policy work.

During that time my co-founders and I engaged with research funders, economic development agencies, three levels of government, and so many communities that I lost track. After all that, the answer to the question is still far from clear. There are, however, noteworthy observations and experiences that stood out, and often came from surprising places:

  • Cables on the ground get chewed by animals that, small or large, might never be seen.
  • Burying power and Ethernet cables, even 15 centimeters below soil, makes no difference because (we think) animals are drawn by the electrical current.
  • Property owners sometimes need to be convinced that giving up 8 to 10 square meters to build a small tower, in exchange for free Internet and community benefit, is a good thing.
  • The raising of small towers, even ones that no one will see, is sometimes blocked by legislation or regulation that assumes private non-residential structures can only be a shed, or never taller than a shed.
  • Private fibre backbones installed with public funds are often inaccessible, or are charged by distance, even though the cost to light 100 meters of fibre is identical to the cost of lighting 1 km of fibre.
  • Civil service agencies may be enthusiastic, but are also cautious, even in the face of evidence. Be patient, suffer frustration, be more patient, and repeat. Success is possible.
  • If and where possible, it’s best to avoid attempts to deliver service where national telecommunications companies have plans to do so.
  • Never underestimate tidal fading — twice a day, wireless signals over water will be amazing, and will completely disappear. We should have known!

All anecdotes aside, the best policies and practices are non-trivial — but because of so many prior community efforts, and organizations such as ISOC, the APC, the A4AI, and more, the challenges and solutions are better understood than ever before.

How does a community network reach the Internet?

First, we’d like to honor the many organisations we’ve learned from who might say that there are no technical barriers to success. Connections within the community networks may be shaped by geographical features or regional regulations. For example, wireless lines of sight between antenna towers on personal property are guided by hills or restricted by regulations. Similarly, Ethernet cables and fibre deployments are guided by property ownership, digging rights, and the presence or migration of grazing animals that dig into soil and gnaw at cables — yes, they do, even small rabbits.

Once the community establishes its own area network, the connections to reach Internet services are more conventional and familiar. In part, the choice is influenced or determined by proximity to Internet exchanges, PoPs, or regional fibre cabinet installations. The connection options for community networks fall into three broad categories.

Colocation. A community network may be fortunate enough to have service coverage that overlaps with, or is near to, an Internet eXchange Point (IXP), as shown in the figure below. In this case a natural choice is to colocate a router within the exchange, near to the Internet service provider’s router (labeled as Cloudflare in the figure). Our launch partner NYC Mesh connects in this manner. Unfortunately, being that exchanges are most often located in urban settings, colocation is unavailable to many, if not most, community networks.

[Figure: a community network colocated with the Internet service provider’s router at an Internet exchange point]

Conventional point-to-point backhaul. Community networks that are remote must establish a point-to-point backhaul connection to the Internet exchange. This connection method is shown in the figure below in which the community network in the previous figure has moved to the left, and is joined by a physical long-distance link to the Internet service router that remains in the exchange on the right.

[Figure: a remote community network joined to the Internet service router at the exchange by a point-to-point backhaul link]

Point-to-point backhaul is familiar. If the infrastructure is available — and this is a big ‘if’ — then backhaul is most often available from a utility company, such as a telecommunications or energy provider, that may also bundle Internet service as a way to reduce total costs. Even bundled, the total cost is variable and unaffordable to individual community networks, and is exacerbated by distance. Some community networks have succeeded in acquiring backhaul through university, research and education, or publicly-funded networks that are compelled or convinced to offer the service in the public interest. On the west coast of Scotland, for example, Tegola launched with service from the University of Highlands and Islands and the University of Edinburgh.

Start a backhaul cooperative for point-to-point and colocation. The last connection option we see among our launch partners overcomes the prohibitive costs by forming a cooperative network in which the individual subscriber community networks are also members. The cooperative model can be seen in the figure below. The exchange remains on the right. On the left the community network in the previous figure is now replaced by a collection of community networks that may optionally connect with each other (for example, to establish reliable routing if any link fails). Either directly or indirectly via other community networks, each of these community networks has a connection to a remote router at the near-end of the point-to-point connection. Crucially, the point-to-point backhaul service — as well as the co-located end-points — are owned and operated by the cooperative. In this manner, an otherwise expensive backhaul service is made affordable by being a shared cost.

[Figure: multiple community networks sharing a cooperatively owned point-to-point backhaul link and colocated endpoint at the exchange]

Two of our launch partners, Guifi.net and HUBS c.i.c., are organised this way and their 10+ years in operation demonstrate both success and sustainability. Since the backhaul provider is a cooperative, the community network members have a say in the ways that revenue is saved, spent, and — best of all — reinvested back into the service and infrastructure.

Why is Cloudflare doing this?

Cloudflare’s mission is to help build a better Internet, for everyone, not just those with privileged access based on their geographical location. Project Pangea aligns with this mission by extending the Internet we’re helping to build — a faster, more reliable, more secure Internet — to otherwise underserved communities.

How can my community network get involved?

Check out our landing page to learn more and apply for Project Pangea today.

The ‘community’ in Cloudflare

Lastly, in a blog post about community networks, we feel it is appropriate to acknowledge the ‘community’ at Cloudflare: Project Pangea is the culmination of multiple projects, and multiple people’s hours, effort, dedication, and community spirit. Many, many thanks to all.
______

[1] For eligible networks, free up to 5Gbps at p95 levels.

Introducing Flarability, Cloudflare’s Accessibility Employee Resource Group

Post Syndicated from Janae Frischer original https://blog.cloudflare.com/introducing-flarability-cloudflares-accessibility-employee-resource-group/

Hello, folks! I’m pleased to introduce myself and Cloudflare’s newest Employee Resource Group (ERG), Flarability, to the world. The 31st anniversary of the signing of the Americans with Disabilities Act (ADA), which happens to fall during Cloudflare’s Impact Week, is an ideal time to raise the subject of accessibility at Cloudflare and around the world.

There are multiple accessibility-related projects and programs at Cloudflare, including office space accessibility and website and product accessibility programs, some of which we will highlight in the stories below. I wanted to share my accessibility story and the story of the birth  and growth of our accessibility community with you.

About Flarability

Flarability began with a conversation between a couple of colleagues, almost two years ago. Some of us had noticed some things about the workspace that weren’t as inclusive of people with disabilities as they could have been. For example, the open floor plan in our San Francisco office, as well as the positioning of our interview rooms, made it difficult for some to concentrate in the space. To kick off a community discussion, we formed a chat room, spread the word about our existence, and started hosting some meetings for interested employees and our allies. After a short time, we were talking about what to name our group, what our mission should be, and what kind of logo image would best represent our group.  

Our Mission: We curate and share resources about disabilities, provide a community space for those with disabilities and our allies to find support and thrive, and encourage and guide Cloudflare’s accessibility programs.

An example of how we have worked with the company was a recent Places Team consultation. As we redevelop our offices and workspaces for a return to what we are calling “back to better”, our Places Team wanted to be sure the way we design our future offices is as inclusive and accessible as possible. You may read more about how we have partnered with the Places Team in Nicole’s story below.

About the Disability Community

There is a lot of diversity amongst disabled people as there are many types of physical or mental impairments. Flarability includes employees with many of them. Some of us have intellectual disabilities such as autism and depression. Some of us have physical disabilities such as deafness and blindness. Several of us are not “out” about our disabilities and that’s definitely okay. The idea of our community is to provide a space for people to feel they can express themselves and feel comfortable. Historically, people with disabilities have been marginalized, even institutionalized. These days, there is much more awareness about and acceptance of disabilities, but there is a lot more work to be done. We are honored to take a central role in that work at Cloudflare.

Stories from Flarability

I am not the only person with a disability at Cloudflare or who works to make Cloudflare more accessible to those with disabilities. We are proud to have many people with disabilities working at our company and I wanted to enable some key individuals with disabilities and supportive team members to share their experiences and stories.

What does accessibility mean to you?

Watson: “Accessibility means integration, having the same opportunities as everyone else to participate in society. My disability was seen as shameful and limiting, and it was only a few years before I started elementary school that New Jersey integrated children with disabilities into the classroom, ensuring that they received an adequate education. Growing up I was taught to hide who I was, and it’s thanks to the self-advocacy that I am now proudly autistic.”

Do you have a story to share about how workplace accessibility initiatives have impacted you?

Nicole: “Workplace accessibility is one of the top priorities of Cloudflare’s Places Team while we design and build our future office spaces. Feedback from our teammates in all our offices has always been a collaborative experience at Cloudflare. In previous years when opening a new office, the Places Team would crowdsource feedback from the company to adjust, or repair office features. Today, the Places Team involves a sync with Flarability leaders in the office design/construction process to discuss feedback and requests from coworkers with accessibility needs.

We also have an ergonomics and work accommodations program to ensure each of our teammates is sorted with workplace equipment that fits their individual needs.

Lastly, we want to provide multiple outlets for our teams to advocate for change. The Places Team hosts an internal anonymous feedback form, available to any teammate who feels comfortable submitting requests in a safe space.”

Why is accessibility advocacy important?

Janae: “Accessibility is important in the workplace. However, when people are not advocating for themselves, accessibility initiatives might not be leveraged to their fullest extent. When you don’t communicate what is holding you back from being more productive, you are doing a disservice to the company, but most importantly you. Perhaps you work more efficiently with fewer distractions, yet your boss has assigned you a desk that is right next to a noisy area of the office. What would happen if you asked them for a different workspace? For example, I am hard of hearing. As an outsider, you may not notice, as I appear to be able to carry on a verbal, face-to-face conversation with ease. In reality, I am lip reading, attempting to filter ambient noise, and watching others’ body/facial movements to fully understand what is going on. I work best when in quieter, less distracting environments. However, I am able to work in loud, distracting environments, too; I am just not able to perform at my best in this kind of environment.

Lastly, I’d like to highlight that one day I was casually chatting with a co-worker about my struggles and a company co-founder overheard me. They offered to support me in any and all ways possible. The noisy, distracting office space I had was changed to a workspace in a corner, where less foot traffic and cross conversations happened. This simple adjustment and small deed that our co-founder acted on inspired me to help start Flarability. I want all employees to feel they can advocate for themselves and if they are not comfortable enough to do so, then to know that there are people who are willing and able to help them.”

What’s next for our group?

We are looking forward to growing our Flarability membership, globally. We have already come a long way in our brief history, but we have many more employees to reach and support, company initiatives to advise, and future employees to recruit.

Thank you for reading our personal stories and the story of Flarability. I encourage all of you who are reading this to do some more reading about accessibility and find at least one way to support people with disabilities in your own community.

We would also love to connect with accessibility ERG leaders from other companies. If you’re reading this and are interested in collaborating, please hit me up at [email protected].

Disrupting Ransomware by Disrupting Bitcoin

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/07/disrupting-ransomware-by-disrupting-bitcoin.html

Ransomware isn’t new; the idea dates back to 1986 with the “Brain” computer virus. Now, it’s become the criminal business model of the internet for two reasons. The first is the realization that no one values data more than its original owner, and it makes more sense to ransom it back to them — sometimes with the added extortion of threatening to make it public — than it does to sell it to anyone else. The second is a safe way of collecting ransoms: bitcoin.

This is where the suggestion to ban cryptocurrencies as a way to “solve” ransomware comes from. Lee Reiners, executive director of the Global Financial Markets Center at Duke Law, proposed this in a recent Wall Street Journal op-ed. Journalist Jacob Silverman made the same proposal in a New Republic essay. Without this payment channel, they write, the major ransomware epidemic is likely to vanish, since the only payment alternatives are suitcases full of cash or the banking system, both of which have severe limitations for criminal enterprises.

It’s the same problem kidnappers have had for centuries. The riskiest part of the operation is collecting the ransom. That’s when the criminal exposes themselves, by telling the payer where to leave the money. Or gives out their banking details. This is how law enforcement tracks kidnappers down and arrests them. The rise of an anonymous, global, distributed money-transfer system outside of any national control is what makes computer ransomware possible.

This problem is made worse by the nature of the criminals. They operate out of countries that don’t have the resources to prosecute cybercriminals, like Nigeria; or protect cybercriminals that only attack outside their borders, like Russia; or use the proceeds as a revenue stream, like North Korea. So even when a particular group is identified, it is often impossible to prosecute. Which leaves the only tools left a combination of successfully blocking attacks (another hard problem) and eliminating the payment channels that the criminals need to turn their attacks into profit.

In this light, banning cryptocurrencies like bitcoin is an obvious solution. But while the solution is conceptually simple, it’s also impossible because — despite its overwhelming problems — there are so many legitimate interests using cryptocurrencies, albeit largely for speculation and not for legal payments.

We suggest an easier alternative: merely disrupt the cryptocurrency markets. Making them harder to use will have the effect of making them less useful as a ransomware payment vehicle, and not just because victims will have more difficulty figuring out how to pay. The reason requires understanding how criminals collect their profits.

Paying a ransom starts with a victim turning a large sum of money into bitcoin and then transferring it to a criminal controlled “account.” Bitcoin is, in itself, useless to the criminal. You can’t actually buy much with bitcoin. It’s more like casino chips, only usable in a single establishment for a single purpose. (Yes, there are companies that “accept” bitcoin, but that is mostly a PR stunt.) A criminal needs to convert the bitcoin into some national currency that he can actually save, spend, invest, or whatever.

This is where it gets interesting. Conceptually, bitcoin combines numbered Swiss bank accounts with public transactions and balances. Anyone can create as many anonymous accounts as they want, but every transaction is posted publicly for the entire world to see. This creates some important challenges for these criminals.

First, the criminal needs to take steps to conceal the bitcoin. In the old days, criminals used “mixing services” (https://www.justice.gov/opa/pr/individual-arrested-and-charged-operating-notorious-darknet-cryptocurrency-mixer): third parties that would accept bitcoin into one account and then return it (minus a fee) from an unconnected set of accounts. Modern bitcoin tracing tools make this money laundering trick ineffective. Instead, the modern criminal does something called “chain swaps.”

In a chain swap, the criminal transfers the bitcoin to a shady offshore cryptocurrency exchange. These exchanges are notoriously weak about enforcing money laundering laws and — for the most part — don’t have access to the banking system. Once on this alternate exchange, the criminal sells his bitcoin and buys some other cryptocurrency like Ethereum, Dogecoin, Tether, Monero, or one of dozens of others. They then transfer it to another shady offshore exchange and transfer it back into bitcoin. Voila — they now have “clean” bitcoin.

Second, the criminal needs to convert that bitcoin into spendable money. They take their newly cleaned bitcoin and transfer it to yet another exchange, one connected to the banking system. Or perhaps they hire someone else to do this step. These exchanges conduct greater oversight of their customers, but the criminal can use a network of bogus accounts, recruit a bunch of users to act as mules, or simply bribe an employee at the exchange to evade whatever laws apply there. The end result of this activity is to turn the bitcoin into dollars, euros, or some other easily usable currency.

Both of these steps — the chain swapping and currency conversion — require a large amount of normal activity to keep from standing out. That is, they will be easy for law enforcement to identify unless they are hiding among lots of regular, noncriminal transactions. If speculators stopped buying and selling cryptocurrencies and the market shrunk drastically, these criminal activities would no longer be easy to conceal: there’s simply too much money involved.

This is why disruption will work. It doesn’t require an outright ban to stop these criminals from using bitcoin — just enough sand in the gears in the cryptocurrency space to reduce its size and scope.

How do we do this?

The first mechanism observes that the criminal’s flows have a unique pattern. The overall cryptocurrency space is “zero sum”: Every dollar made was provided by someone else. And the primary legal use of cryptocurrencies involves speculation: people effectively betting on a currency’s future value. So the background speculators are mostly balanced: One bitcoin in results in one bitcoin out. There are exceptions involving offshore exchanges and speculation among different cryptocurrencies, but they’re marginal, and only involve turning one bitcoin into a little more (if a speculator is lucky) or a little less (if unlucky).

Criminals and their victims act differently. Victims are net buyers, turning millions of dollars into bitcoin and never going the other way. Criminals are net sellers, only turning bitcoin into currency. The only other net sellers are the cryptocurrency miners, and they are easy to identify.

Any banked exchange that cares about enforcing money laundering laws must consider all significant net sellers of cryptocurrencies as potential criminals and report them to both in-country and US financial authorities. Any exchange that doesn’t should have its banking forcefully cut.

The US Treasury can ensure these exchanges are cut out of the banking system. By designating a rogue but banked exchange, the Treasury says that it is illegal not only to do business with the exchange but for US banks to do business with the exchange’s bank. As a consequence, the rogue exchange would quickly find its banking options eliminated.

A second mechanism involves the IRS. In 2019, it started demanding information from cryptocurrency exchanges and added a check box to the 1040 form that requires disclosure from those who both buy and sell cryptocurrencies. And while this is intended to target tax evasion, it has the side consequence of disrupting those offshore exchanges criminals rely on to launder their bitcoin. Speculation on cryptocurrency is far less attractive since the speculators have to pay taxes but most exchanges don’t help out by filing 1099-Bs that make it easy to calculate the taxes owed.

A third mechanism involves targeting the cryptocurrency Tether. While most cryptocurrencies have values that fluctuate with demand, Tether is a “stablecoin” that is supposedly backed one-to-one with dollars. Of course, it probably isn’t, as its claim to be the seventh largest holder of commercial paper (short-term loans to major businesses) is blatantly untrue. Instead, they appear part of a cycle where new Tether is issued, used to buy cryptocurrencies, and the resulting cryptocurrencies now “back” Tether and drive up the price.

This behavior is clearly that of a “wildcat bank,” an 1800s fraudulent banking style that has long been illegal. Tether also bears a striking similarity to Liberty Reserve, an online currency that the Department of Justice successfully prosecuted for money laundering in 2013. Shutting down Tether would have the side effect of eliminating the value proposition for the exchanges that support chain swapping, since these exchanges need a “stable” value for the speculators to trade against.

There are further possibilities. One involves treating the cryptocurrency miners, those who validate all transactions and add them to the public record, as money transmitters — and subject to the regulations around that business. Another option involves requiring cryptocurrency exchanges to actually deliver the cryptocurrencies into customer-controlled wallets.

Effectively, all cryptocurrency exchanges avoid transferring cryptocurrencies between customers. Instead, they simply record entries in a central database. This makes sense because actual “on chain” transactions can be particularly expensive for cryptocurrencies like bitcoin or Ethereum. If all speculators needed to actually receive their bitcoins, it would make clear that its value proposition as a currency simply doesn’t exist, as the already strained system would grind to a halt.

And, of course, law enforcement can already target criminals’ bitcoin directly. An example of this just occurred, when US law enforcement was able to seize 85% of the $4 million ransom Colonial Pipeline paid to the criminal organization DarkSide. That the bitcoin had lost more than 30% of its value by the time the seizure occurred is just one more reminder of how unworkable bitcoin is as a “store of value.”

There is no single silver bullet to disrupt either cryptocurrencies or ransomware. But enough little disruptions, a “death of a thousand cuts” through new and existing regulation, should make bitcoin no longer usable for ransomware. And if there’s no safe way for a criminal to collect the ransom, their business model becomes no longer viable.

This essay was written with Nicholas Weaver, and previously appeared on Slate.com.

Archimedes the AI robot | HackSpace #45

Post Syndicated from Ashley Whittaker original https://www.raspberrypi.org/blog/archimedes-the-ai-robot-hackspace-45/

When we saw Alex Glow’s name in the latest issue of HackSpace magazine, we just had to share their project. HackSpace #45 celebrates the best Raspberry Pi builds of all time, and we remembered spotting Alex’s wearable robotic owl familiar back in the day. For those of you yet to have had the pleasure, meet Archimedes…

archimedes owl on maker's shoulder
Archimedes taking a perch on his maker’s shoulder

Back in 2018, Hackster’s Alex Glow built Archimedes, an incredible robot companion using a combination of Raspberry Pi Zero W and Arduino with the Google AIY Vision Kit for its ‘brain’.

An updated model, Archie 2, using Raspberry Pi 3B, ESP32-powered Matrix Voice, and an SG90 micro-servo motor saw the personable owl familiar toughen up – Alex says the 3D-printed case is far more durable – as well as having better voice interaction options using Matrix HAL (for which installer packages are provided for Raspberry Pi and Python), plus Mycroft and Snips.ai voice assistant software.

archimedes owl insides laid out on table
Owl innards

Other refinements included incorporating compact discs into the owl’s wings to provide an iridescent sheen. Slots in the case allowed Alex to feed through cable ties to attach Archie’s wings, which she says now “provide a lively bounce to the wings, in tune with his active movements (as well as my own).”

archimedes owl wing detail
Raspberry Pi getting stuffed into Archimedes’ head

HackSpace magazine issue 45 out NOW!

Each month, HackSpace magazine brings you the best projects, tips, tricks and tutorials from the makersphere. You can get it from the Raspberry Pi Press online store or your local newsagents.

Hack space magazine issue 45 front cover

As always, every issue is free to download from the HackSpace magazine website.

The post Archimedes the AI robot | HackSpace #45 appeared first on Raspberry Pi.

Protecting Personal Data in Grab’s Imagery

Post Syndicated from Grab Tech original https://engineering.grab.com/protecting-personal-data-in-grabs-imagery

Image Collection Using KartaView

A few years ago, we recognised the strong demand to better understand the streets where our drivers and clients go, so that we could better fulfil their needs and also quickly adapt to the rapidly changing environment of Southeast Asian cities.

One way to fulfil that demand was to create an image collection platform named KartaView, which is Grab Geo’s platform for geotagged imagery. It supports the collection, indexing, storage, and retrieval of imagery, as well as map data extraction.

KartaView is a public, partially open-sourced product, used both internally and externally by the OpenStreetMap community and other users. As of 2021, KartaView has public imagery in over 100 countries with varying degrees of coverage, including 60+ cities of Southeast Asia. Check it out at www.kartaview.com.

Figure 1 – KartaView platform

Why Image Blurring is Important

Many incidental people and licence plates appear in the collected images, and their privacy is a serious concern. We deeply respect their privacy and consequently use image obfuscation as the most effective anonymisation method for ensuring privacy protection.

Because manually annotating the regions of each picture where faces and licence plates are located is impractical, this problem has to be solved using machine learning and engineering techniques. Hence we automatically detect and blur all faces and licence plates, which could be considered personal data.

Figure 2 – Sample blurred picture

In our case, we have a wide range of picture types: regular planar pictures, very wide field of view pictures, and 360-degree pictures in equirectangular format collected with 360 cameras. Also, because we are collecting imagery globally, the vehicle types, licence plates, and human environments are quite diverse in appearance, and are not handled well by off-the-shelf blurring software. So we built our own custom blurring solution, which yielded higher accuracy and better cost-efficiency overall with respect to blurring of personal data.

Figure 3 – Example of equirectangular image where personal data has to be blurred

Behind the scenes, KartaView runs a set of services that derive useful information from the pictures, such as image quality, traffic signs, and roads. A large portion of them use deep learning algorithms, which could potentially be negatively affected by running on blurred pictures. In fact, based on the assessment we have done so far, the impact is extremely low, similar to the one reported in a well-known study of face obfuscation in ImageNet [9].

Outline of Grab’s Blurring Process

Roughly, the processing steps are the following:

  1. Transform each picture into a set of planar images. In this way, all pictures are processed downstream in the same way, regardless of their original format.
  2. Use an object detector able to detect all faces and licence plates in a planar image having a standard field of view.
  3. Transform the coordinates of the detected regions into original coordinates and blur those regions.
Figure 4 – Picture’s processing steps [8]
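
To make step 3 concrete, below is a minimal sketch of how the detected regions could be obfuscated once their coordinates have been mapped back to the original image. It assumes OpenCV, NumPy images, and axis-aligned (x, y, w, h) boxes; the function name, kernel size, and the choice of Gaussian blur (rather than, say, pixelation) are illustrative assumptions and not the exact production implementation.

import cv2

def blur_regions(image, boxes, kernel=(51, 51)):
    # Gaussian-blur every detected face / licence-plate region on a copy of the image.
    # 'boxes' are assumed to be (x, y, w, h) tuples in original-image pixel coordinates.
    out = image.copy()
    for x, y, w, h in boxes:
        roi = out[y:y + h, x:x + w]
        out[y:y + h, x:x + w] = cv2.GaussianBlur(roi, kernel, 0)
    return out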

In the following sections, we describe in detail the interesting aspects of the second step, sharing the challenges and how we solved them. Let’s start with the first and most important part, the dataset.

Dataset

Our current dataset consists of images from a wide range of cameras, including normal perspective cameras from mobile phones, wide field of view cameras and also 360 degree cameras.

It is the result of a series of data collections contributed by Grab’s data tagging teams, covering the two label classes that are of interest to us: FACE and LICENSE_PLATE.

The data was collected using Grab internal tools and stored in queryable databases. This makes it possible to revisit the data and correct it if necessary, and also allows data engineers to select and filter the data of interest.

Dataset Evolution

Each iteration of the dataset was created to address issues discovered while using models in a production environment and observing situations where they lacked performance.

                 Dataset v1   Dataset v2   Dataset v3
Nr. images            15226        17636        30538
Nr. of labels         64119        86676       242534

The first version was basic, with a rough tagging strategy, and we quickly noticed that it was not detecting a special situation that appeared due to the pandemic: people wearing masks.

This led to another round of data annotation to include those scenarios.
The third iteration addressed a broader range of issues:

  • Small regions of interest (objects far away from the camera)
  • Objects in very dark backgrounds
  • Rotated objects or even upside down
  • Variation of the licence plate design due to images from different countries and regions
  • People wearing masks
  • Faces in mirrors (for example, in the rear-view mirror of a motorcycle)
  • But the main reason was a scenario where the recording, at the start or end (but not only), had close-ups of the operator who was checking the camera. This led to images with large regions of interest containing the camera operator’s face – too large to be detected by the model.

An investigation in the dataset structure, by splitting the data into bins based on the bbox sizes (in pixels), made something clear: the dataset was unbalanced.

We made bins for tag sizes with a stride of 100 pixels and went up to the maximum present in the dataset, which was a single sample of size 2000 pixels. The majority of the labels were small in size, and the larger the size, the fewer tags we had. This made it clear that we would need more targeted annotations to try to balance the dataset.
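
As an illustration, this binning analysis can be reproduced with a few lines of NumPy; the (width, height) input format and the use of the longest box side as the "size" are assumptions made for the sketch.

import numpy as np

def tag_size_histogram(boxes_px, stride=100):
    # Count labels per size bin, where 'size' is the longest side of the box in pixels.
    sizes = np.array([max(w, h) for w, h in boxes_px])
    edges = np.arange(0, sizes.max() + stride, stride)
    counts, _ = np.histogram(sizes, bins=edges)
    for lo, n in zip(edges[:-1], counts):
        print(f"{int(lo):5d}-{int(lo) + stride:5d} px: {n}")
    return counts, edges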

All these scenarios required the tagging team to revisit the data multiple times and also change the tagging strategy by including tags that had previously been considered borderline. It also required them to pay more attention to small details that may have been missed in a previous iteration.

Data Splitting

To better understand the strategy chosen for splitting the data, we also need to understand the source of the data. The images come from different devices used in different geographic locations (different countries) and are part of continuous trip recordings. The annotation team used an internal tool to visualise the trips image by image and mark the faces and licence plates present in them. We would then have access to all those images and their respective metadata.

The chosen ratios for splitting are:

  • Train 70%
  • Validation 10%
  • Test 20%
                    Train   Validation    Test
Number of images    12733         1682    3221
Number of labels    60630         7658   18388

The split is not trivial, as we have some requirements and conditions to satisfy:

  • An image can have multiple tags from one or both classes but must belong to just one subset.
  • The tags should be split as close as possible to the desired ratios.
  • As different images can belong to the same trip and be geographically close, we need to force them into the same subset, thus avoiding near-identical tags appearing in both the train and test subsets, which would result in incorrect evaluations (a sketch of such a trip-aware split follows below).
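
The sketch below shows one greedy, trip-aware way of producing such a split; the record fields (trip_id, num_tags) and the tag-based balancing heuristic are assumptions, not the exact internal tooling.

import random
from collections import defaultdict

def split_by_trip(images, ratios=(0.7, 0.1, 0.2), seed=0):
    # Group images by trip so that frames from the same trip never straddle subsets.
    rng = random.Random(seed)
    trips = defaultdict(list)
    for img in images:
        trips[img["trip_id"]].append(img)
    trip_list = list(trips.values())
    rng.shuffle(trip_list)

    names = ("train", "validation", "test")
    subsets = {n: [] for n in names}
    tag_totals = {n: 0 for n in names}
    grand_total = sum(img["num_tags"] for img in images) or 1

    # Assign each whole trip to the subset currently furthest below its target tag share.
    for trip in trip_list:
        trip_tags = sum(img["num_tags"] for img in trip)
        deficits = {n: ratios[i] - tag_totals[n] / grand_total for i, n in enumerate(names)}
        target = max(deficits, key=deficits.get)
        subsets[target].extend(trip)
        tag_totals[target] += trip_tags
    return subsets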

Data Augmentation

The application of data augmentation plays a crucial role while training the machine learning model. There are three main ways in which data augmentation techniques can be applied:

  1. Offline data augmentation – enriching a dataset by physically multiplying some of its images and applying modifications to them.
  2. Online data augmentation – on the fly modifications of the image during train time with configurable probability for each modification.
  3. Combination of both offline and online data augmentation.

In our case, we are using the third option, which is the combination of both.
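
For intuition, the snippet below shows what online, probability-driven augmentation with bounding-box awareness looks like, using the albumentations library and dummy inputs purely for illustration; the augmentations we actually rely on are the ones built into the detector, described later.

import numpy as np
import albumentations as A

# Dummy inputs for illustration only: one 640x640 RGB frame with a single box.
image = np.zeros((640, 640, 3), dtype=np.uint8)
bboxes = [(0.5, 0.5, 0.2, 0.1)]            # YOLO format: x_center, y_center, w, h (normalised)
labels = ["LICENSE_PLATE"]

transform = A.Compose(
    [
        A.HorizontalFlip(p=0.5),           # applied on the fly with probability 0.5
        A.HueSaturationValue(hue_shift_limit=10, sat_shift_limit=30, val_shift_limit=30, p=0.5),
        A.RandomBrightnessContrast(p=0.3),
    ],
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

out = transform(image=image, bboxes=bboxes, class_labels=labels)
aug_image, aug_bboxes = out["image"], out["bboxes"]   # boxes stay consistent with the image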

The first method that contributes to offline augmentation is a method called image view splitting. This is necessary for us due to the different image types: perspective camera images, wide field of view images, and 360-degree images in equirectangular format. All these formats and fields of view, with their respective distortions, would complicate the data and make it hard for the model to generalise, and also to handle new image types that could be added in the future.

For this, we defined the concept of image views, which are extracted portions (views) of an image with some predefined properties, for example the perspective projection of 75-by-75-degree field of view patches from the original image.

Here we can see a perspective camera image and the image views generated from it:

Figure 5 – Original image
Figure 6 – Two image views generated
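
The sketch below shows one way such a view can be extracted from an equirectangular image, by casting pinhole-camera rays and remapping them onto the sphere (see also [8]). The field of view, rotation conventions, and output size are illustrative assumptions rather than our exact implementation.

import numpy as np
import cv2

def equirect_to_perspective(equi, fov_deg=75, yaw_deg=0, pitch_deg=0, out_hw=(512, 512)):
    # Extract a pinhole-camera "image view" from an equirectangular image.
    H, W = equi.shape[:2]
    h, w = out_hw
    f = 0.5 * w / np.tan(np.radians(fov_deg) / 2)      # focal length in pixels

    # Rays through each output pixel, with the optical axis along +z.
    xs, ys = np.meshgrid(np.arange(w) - w / 2, np.arange(h) - h / 2)
    dirs = np.stack([xs, ys, np.full_like(xs, f)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

    # Rotate the rays by the requested yaw (around y) and pitch (around x).
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    Ry = np.array([[np.cos(yaw), 0, np.sin(yaw)], [0, 1, 0], [-np.sin(yaw), 0, np.cos(yaw)]])
    Rx = np.array([[1, 0, 0], [0, np.cos(pitch), -np.sin(pitch)], [0, np.sin(pitch), np.cos(pitch)]])
    dirs = dirs @ (Ry @ Rx).T

    # Convert rays to longitude/latitude, then to source pixel coordinates.
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])
    lat = np.arcsin(np.clip(dirs[..., 1], -1, 1))
    map_x = np.clip((lon / np.pi + 1) / 2 * W, 0, W - 1).astype(np.float32)
    map_y = np.clip((lat / np.pi + 0.5) * H, 0, H - 1).astype(np.float32)
    return cv2.remap(equi, map_x, map_y, cv2.INTER_LINEAR)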

The important thing here is that each generated view is an image in its own right, with its associated tags. The views also have an overlapping area, so the same tag can appear in two views from different perspectives. This is an indirect outcome of the first offline augmentation method.

The second method for offline augmentation is the oversampling of some of the images (views). As mentioned above, we faced the problem of an unbalanced dataset: specifically, we were missing tags that occupied large regions of the image, and even though our tagging teams tried to annotate as many as they could find, these were still scarce.

As our object detection model is an anchor-based detector, we did not even have enough of these large tags to generate the anchor boxes correctly. This could be clearly seen in the accuracy of previously trained models, as they performed poorly on the large-size bins.

By randomly oversampling images that contained big tags, up to a minimum required number, we managed to obtain better anchors and increase the recall for those scenarios. As described below, the object detector chosen for blurring was YOLOv4, which offers a large variety of online augmentations. The online augmentations used are saturation, exposure, hue, flip, and mosaic.
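
A simplified version of that oversampling step might look like the following; the size threshold, minimum count, and record layout are hypothetical values chosen for the sketch.

import random

def has_large_box(sample, size_threshold_px):
    # A sample counts as "large" if any of its boxes has a side above the threshold.
    return any(max(b["w"], b["h"]) >= size_threshold_px for b in sample["boxes"])

def oversample_large_boxes(samples, size_threshold_px=800, min_count=500, seed=0):
    # Duplicate views containing large boxes until a minimum number of such views is reached.
    rng = random.Random(seed)
    large = [s for s in samples if has_large_box(s, size_threshold_px)]
    if not large:
        return list(samples)
    extra = max(0, min_count - len(large))
    return list(samples) + [rng.choice(large) for _ in range(extra)]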

Model

As of the summer of 2021, the go-to solution for object detection in images is convolutional neural networks (CNNs), a mature approach able to fulfil the needs efficiently.

Architecture

Most CNN based object detectors have three main parts: Backbone, Neck and (Dense or Sparse Prediction) Heads. From the input image, the backbone extracts features which can be combined in the neck part to be used by the prediction heads to predict object bounding-boxes and their labels.

Figure 7 – Anatomy of one and two-stage object detectors [1]

The backbone is usually a CNN classification network pretrained on some dataset, like ImageNet-1K. The neck combines features from different layers in order to produce rich representations for both large and small objects. Since the objects to be detected have varying sizes, the topmost features are too coarse to represent smaller objects, so the first CNN-based object detectors were fairly weak at detecting small objects. The multi-scale pyramid hierarchy is inherent to CNNs, so [2] introduced the Feature Pyramid Network, which at marginal cost combines features from multiple scales and makes predictions on them. This technique, or improved variants of it, is used by most detectors nowadays. The head part makes the predictions for bounding boxes and their labels.
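
To make the backbone/neck/head split concrete, here is a deliberately tiny PyTorch sketch of the anatomy described above; it is not YOLOv4, and the layer sizes, channel counts, and two-level "pyramid" are arbitrary placeholders.

import torch
import torch.nn as nn

class TinyDetector(nn.Module):
    def __init__(self, num_classes=2, num_anchors=3):
        super().__init__()
        # Backbone: progressively downsample and extract features at two scales.
        self.stage1 = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        # Neck: upsample the coarse features and fuse them with the finer ones (FPN-style).
        self.lateral = nn.Conv2d(32, 64, 1)
        self.upsample = nn.Upsample(scale_factor=2, mode="nearest")
        # Head: per-location predictions of (x, y, w, h, objectness, class scores) per anchor.
        self.head = nn.Conv2d(64, num_anchors * (5 + num_classes), 1)

    def forward(self, x):
        c1 = self.stage1(x)                          # fine features
        c2 = self.stage2(c1)                         # coarse features
        p1 = self.lateral(c1) + self.upsample(c2)    # fused pyramid level
        return self.head(p1)                         # dense predictions on the fused map

preds = TinyDetector()(torch.zeros(1, 3, 64, 64))    # shape: [1, 3 * (5 + 2), 32, 32]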

YOLO is part of the family of anchor-based one-stage object detectors and was originally developed in Darknet, an open source neural network framework written in C and CUDA. Back in 2015 it was the first end-to-end differentiable network of this kind that offered joint learning of object bounding boxes and their labels.

One reason for the big success of newer YOLO versions is that the authors carefully merged new ideas into one architecture, the overall speed of the model being always the north star.

YOLOv4 introduces several changes to its v3 predecessor:

  • Backbone – CSPDarknet53: YOLOv3 Darknet53 backbone was modified to use Cross Stage Partial Network (CSPNet [5]) strategy, which aims to achieve richer gradient combinations by letting the gradient flow propagate through different network paths.
  • Multiple configurable augmentation and loss function types, so called “Bag of freebies”, which by changing the training strategy can yield higher accuracy without impacting the inference time.
  • Configurable necks and different activation functions, they call “Bag of specials”.

Insights

For this task, we found that YOLOv4 gave a good compromise between speed and accuracy, as it doubles the speed of a more accurate two-stage detector while maintaining a very good overall precision/recall. For blurring, the main metric for model selection was the overall recall, while the precision and intersection over union (IoU) of the predicted boxes come second, as we want to catch all personal data even if some detections are wrong. Having a multitude of possibilities to configure the detector architecture and train it on our own dataset, we conducted several experiments with different configurations of backbones, necks, augmentations, and loss functions to come up with our current solution.
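
For reference, the recall we optimise for can be computed as the fraction of ground-truth boxes matched by at least one prediction at a given IoU threshold; a minimal sketch follows, with the (x1, y1, x2, y2) box format and the 0.5 threshold as assumptions.

def iou(a, b):
    # Intersection over union of two (x1, y1, x2, y2) boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def recall_at_iou(gt_boxes, pred_boxes, thr=0.5):
    # Fraction of ground-truth boxes matched by at least one prediction.
    if not gt_boxes:
        return 1.0
    hits = sum(1 for g in gt_boxes if any(iou(g, p) >= thr for p in pred_boxes))
    return hits / len(gt_boxes)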

We faced challenges in training a good model as the dataset posed a large object/box-level scale imbalance, with small objects being over-represented. As described in [3] and [4], this affects the scales of the estimated regions and the overall detection performance. In [3], several solutions are proposed for this, out of which the SPP [6] blocks and PANet [7] neck used in YOLOv4, together with heavy offline data augmentation, increased the performance of the current model in comparison to the former ones.

As we evaluated the model, we found that it still has some issues:

  • Occlusion of the object, either by the camera view, head accessories or other elements:

These cases would need extra annotation in the dataset, just like the faces or licence plates that are really close to the camera and occupy a large region of interest in the image.

  • As we have a limited number of annotations of objects close to the camera view, the model has learnt this incorrectly, sometimes producing false positives in these situations.

Again, one solution for this would be to include more of these scenarios in the dataset.

What’s Next?

Grab spends a lot of effort ensuring privacy protection for its users, so we are always looking for ways to further improve our related models and processes.

As far as efficiency is concerned, there are multiple directions to consider for both the dataset and the model. Two main factors drive cost and quality: further development of the dataset to cover additional edge cases (e.g. more training data of people wearing masks), and the operational cost of running the model.

As the vast majority of current models require a fully labelled dataset, creating a new model puts a large workload on the Data Entry team. Our dataset grew 4x for its third version, yet there is still room for improvement, as described in the Dataset section.

As Grab extends its operations to more cities, new data is collected that has to be processed, which puts an increased focus on running detection models more efficiently.

Directions we could pursue to increase our efficiency include the following:

  • As plenty of unlabelled data is available from imagery collection, a natural direction to explore is self-supervised visual representation learning, to derive a general vision backbone with superior transfer performance for downstream tasks such as detection and classification.
  • Experiment with optimisation techniques like pruning and quantisation to get a faster model without sacrificing too much accuracy; see the sketch after this list.
  • Explore new architectures: YOLOv5, EfficientDet or Swin-Transformer for Object Detection.
  • Introduce semi-supervised learning techniques to improve our model performance on the long tail of the data.
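
As an illustration of the quantisation direction above, here is a minimal sketch of post-training dynamic quantisation in PyTorch. The toy model is a placeholder: dynamic quantisation targets Linear/LSTM layers, so a convolutional detector like ours would more likely use static quantisation or a deployment toolchain instead.

```python
import torch
import torch.nn as nn

# Placeholder model standing in for a real network's fully connected layers.
model = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, 10))
model.eval()

# Replace Linear weights with int8 versions; activations are quantised on the fly.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 1024)
print(quantized(x).shape)  # same interface, smaller weights, faster CPU inference
```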

References

  1. Alexey Bochkovskiy et al. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv:2004.10934
  2. Tsung-Yi Lin et al. Feature Pyramid Networks for Object Detection. arXiv:1612.03144
  3. Kemal Oksuz et al. Imbalance Problems in Object Detection: A Review. arXiv:1909.00169
  4. Bharat Singh, Larry S. Davis. An Analysis of Scale Invariance in Object Detection – SNIP. arXiv:1711.08189
  5. Chien-Yao Wang et al. CSPNet: A New Backbone that can Enhance Learning Capability of CNN. arXiv:1911.11929
  6. Kaiming He et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. arXiv:1406.4729
  7. Shu Liu et al. Path Aggregation Network for Instance Segmentation. arXiv:1803.01534
  8. http://blog.nitishmutha.com/equirectangular/360degree/2017/06/12/How-to-project-Equirectangular-image-to-rectilinear-view.html
  9. Kaiyu Yang et al. A Study of Face Obfuscation in ImageNet. arXiv:2103.06191
  10. Zhenda Xie et al. Self-Supervised Learning with Swin Transformers. arXiv:2105.04553

Join Us

Grab is the leading superapp platform in Southeast Asia, providing everyday services that matter to consumers. More than just a ride-hailing and food delivery app, Grab offers a wide range of on-demand services in the region, including mobility, food, package and grocery delivery services, mobile payments, and financial services across 428 cities in eight countries.

Powered by technology and driven by heart, our mission is to drive Southeast Asia forward by creating economic empowerment for everyone. If this mission speaks to you, join our team today!

Welcome to Cloudflare Impact Week

Post Syndicated from Matthew Prince original https://blog.cloudflare.com/welcome-to-cloudflare-impact-week/

If I’m completely honest, Cloudflare didn’t start out as a mission-driven company. When Lee, Michelle, and I first started thinking about starting a company in 2009, we saw an opportunity as the world was shifting from on-premise hardware and software to services in the cloud. It seemed inevitable to us that the same shift would come to security, performance, and reliability services. And, getting ahead of that trend, we could build a great business.

[Photo: Matthew Prince, Michelle Zatlyn, and Lee Holloway, Cloudflare’s cofounders, in 2009.]

One problem we had was that we knew in order to have a great business we needed to win large organizations with big IT budgets as customers. And, in order to do that, we needed to have the data to build a service that would keep them safe. But we only could get data on security threats once we had customers. So we had a chicken and egg problem.

Our solution was to provide a basic version of Cloudflare’s services for free. We reasoned that individual developers and small businesses would sign up for the free service. We’d learn a lot about security threats and performance and reliability opportunities based on their traffic data. And, from that, we would build a service we could sell to large businesses.

And, generally, Cloudflare’s business model made sense. We found that, for the most part, small companies got a low volume of cyber attacks, and so we could charge them a relatively small amount. Large businesses faced more attacks, so we could charge them more.

But what surprised us, and we only discovered because we were providing a free version of our service, was that there was a certain set of small organizations with very limited resources that received very large attacks. Servicing them was what made Cloudflare the mission-driven company we are today.

The Committee to Protect Journalists

If you ever want to be depressed, sign up for the newsletter of the Committee to Protect Journalists (CPJ). They’re the organization that, when a journalist is kidnapped or killed anywhere in the world, negotiates their release or, far too often, recovers their body.

I’d met the director of the organization at an event in early 2012. Not long after, he called me and asked if I wanted to meet three Cloudflare customers who were in town. I didn’t, I have to confess, but Michelle pushed me to take the meeting.

On a rainy San Francisco afternoon, the director of CPJ brought three African journalists to our office. All three of them hugged me. One was from Ethiopia, another was from Angola, and, for the third, they wouldn’t tell us his name or where he was from because he was “currently being hunted by death squads.”

For the next 90 minutes, I listened to stories of how the journalists were covering corruption in their home countries, how their work put them constantly in harm’s way, how powerful forces worked to silence them, how cyberattacks had been a constant struggle, and how, today, they depended on Cloudflare’s free service to keep their work online. That last bit hit me like a ton of bricks.

After our meeting finished, and we saw the journalists out, with Cloudflare T-shirts and other swag in hand, I turned to Michelle and said, “Whoa. What have we gotten ourselves into?”

Becoming Mission Driven

I’ve thought about that meeting often since. It was the moment I realized that Cloudflare had a mission beyond just being a good business. The Internet was a critically important resource for those three journalists and many others like them. At the same time, forces that sought to limit their work would use cyberattacks to shut them down. While we hadn’t set out to ensure everyone had world-class cybersecurity, regardless of their ability to pay, now it seemed critically important.

With that realization, Cloudflare’s mission came naturally: we aim to help build a better Internet. One where you don’t need to be a giant company to be fast and reliable. And where even a journalist, working on their own against daunting odds, can be secure online.

This is why we’ve prioritized projects that give back to the Internet. We launched Project Galileo, which provides our enterprise-grade services to organizations performing politically or artistically important work. We launched the Athenian Project to help protect elections against cyber attacks. We launched Project Fair Shot to make sure the organizations distributing the COVID-19 vaccine had the technical resources they needed to do so equitably.

And, even on the technical side, we work hard to make the Internet better even when there’s no clear economic benefit to us, or even when it’s against our economic benefit. We don’t monetize user data because it seems clear to us that a better Internet is a more private Internet. We enabled encryption for everyone even though, when we did it, it was the biggest differentiator between our free and paid plans and the number one reason people upgraded. But clearly a better Internet was an encrypted Internet, and it seemed silly that someone should have to pay extra for a little bit of math.

Our First Impact Week

This week we kick off Cloudflare’s first Impact Week. We originally conceived the idea of the week as a way to highlight some of the things we were doing as a company around our environmental, social, and governance (ESG) initiatives. But, as is the nature of innovation weeks at Cloudflare, as soon as we announced it internally our team started proposing new products and features to take some of our existing initiatives even further.

So, over the course of the week, in addition to talking about how we’ve built our network to consume less power, we’ll also be demonstrating how we’re increasingly using hyper power-efficient Arm-based servers to achieve even higher levels of efficiency in order to lessen the environmental impact of running the Internet. We’ll launch a new Workers option for developers who want to be more environmentally conscious. And we’ll announce an initiative in partnership with other leading Internet companies that we hope, if broadly adopted, could cut down as much as 25% of global web traffic and the corresponding energy wasted to serve it.

We’ll also focus on how we can bring the Internet to more people. While broadband has been a revolution where it’s available, rural and underserved urban communities around the world still suffer from slow Internet speeds and limited ISP choice. We can’t completely solve that problem (yet), but we’ll be announcing an initiative that will help with some critical aspects.

Finally, as Cloudflare becomes a larger part of the Internet, we’ll be announcing programs to monitor the network’s health, affirm our commitments to human rights, and extend our protections of critical societal functions such as elections.

When I first was trying to convince Michelle that we should start a business together, I pitched her a bunch of ideas. Most of them involved finding a clever way to extract rents from some group or another, often for not much benefit to society at large. Sitting in an Ethiopian restaurant in Central Square, I remember so clearly her saying to me, “Matthew, those are all great business ideas. But they’re not for me. I want to do something where I can be proud of the work we’re doing and the positive impact we’ve made.”

That sentence made me go back to the drawing board. The next business idea I pitched to her turned out to be Cloudflare. Today, Cloudflare’s mission remains helping build a better Internet. And, as we kick off Impact Week, we are proud to continue to live that mission in everything we do.

Hey, does this guy take us for fools!?

Post Syndicated from original https://bivol.bg/%D0%B0%D0%B1%D0%B5-%D1%82%D0%BE%D1%8F-%D0%BD%D0%B0-%D0%BB%D1%83%D0%B4%D0%B8-%D0%BB%D0%B8-%D0%BD%D0%B8-%D0%BF%D1%80%D0%B0%D0%B8-%D0%B1%D0%B5.html

Sunday, 25 July 2021


In 2020, EUROACTIV recorded 19 (in words: nineteen) convictions for corruption in Bulgaria. Nineteen. Every last one of them a petty crook. Purées, tripe-soup joints, rakia stills, baby purées, compotes, all kinds of photos…
