Tag Archives: infrastructure

NIST Cybersecurity Framework 2.0

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/03/nist-cybersecurity-framework-2-0.html

NIST has released version 2.0 of the Cybersecurity Framework:

The CSF 2.0, which supports implementation of the National Cybersecurity Strategy, has an expanded scope that goes beyond protecting critical infrastructure, such as hospitals and power plants, to all organizations in any sector. It also has a new focus on governance, which encompasses how organizations make and carry out informed decisions on cybersecurity strategy. The CSF’s governance component emphasizes that cybersecurity is a major source of enterprise risk that senior leaders should consider alongside others such as finance and reputation.

[…]

The framework’s core is now organized around six key functions: Identify, Protect, Detect, Respond and Recover, along with CSF 2.0’s newly added Govern function. When considered together, these functions provide a comprehensive view of the life cycle for managing cybersecurity risk.

The updated framework anticipates that organizations will come to the CSF with varying needs and degrees of experience implementing cybersecurity tools. New adopters can learn from other users’ successes and select their topic of interest from a new set of implementation examples and quick-start guides designed for specific types of users, such as small businesses, enterprise risk managers, and organizations seeking to secure their supply chains.

This is a big deal. The CSF is widely used, and has been in need of an update. And NIST is exactly the sort of respected organization to do this correctly.

Some news articles.

Upgrading GitHub.com to MySQL 8.0

Post Syndicated from Jiaqi Liu original https://github.blog/2023-12-07-upgrading-github-com-to-mysql-8-0/


Over 15 years ago, GitHub started as a Ruby on Rails application with a single MySQL database. Since then, GitHub has evolved its MySQL architecture to meet the scaling and resiliency needs of the platform—including building for high availability, implementing testing automation, and partitioning the data. Today, MySQL remains a core part of GitHub’s infrastructure and our relational database of choice.

This is the story of how we upgraded our fleet of 1200+ MySQL hosts to 8.0. Upgrading the fleet with no impact to our Service Level Objectives (SLO) was no small feat–planning, testing and the upgrade itself took over a year and collaboration across multiple teams within GitHub.

Motivation for upgrading

Why upgrade to MySQL 8.0? With MySQL 5.7 nearing end of life, we upgraded our fleet to the next major version, MySQL 8.0. We also wanted to be on a version of MySQL that gets the latest security patches, bug fixes, and performance enhancements. There are also new features in 8.0 that we want to test and benefit from, including Instant DDLs, invisible indexes, and compressed bin logs, among others.

GitHub’s MySQL infrastructure

Before we dive into how we did the upgrade, let’s take a 10,000-foot view of our MySQL infrastructure:

  • Our fleet consists of 1200+ hosts. It’s a combination of Azure Virtual Machines and bare metal hosts in our data center.
  • We store 300+ TB of data and serve 5.5 million queries per second across 50+ database clusters.
  • Each cluster is configured for high availability with a primary plus replicas cluster setup.
  • Our data is partitioned. We leverage both horizontal and vertical sharding to scale our MySQL clusters. We have MySQL clusters that store data for specific product-domain areas. We also have horizontally sharded Vitess clusters for large-domain areas that outgrew the single-primary MySQL cluster.
  • We have a large ecosystem of tools consisting of Percona Toolkit, gh-ost, orchestrator, freno, and in-house automation used to operate the fleet.

All this sums up to a diverse and complex deployment that needs to be upgraded while maintaining our SLOs.

Preparing the journey

As the primary data store for GitHub, we hold ourselves to a high standard for availability. Due to the size of our fleet and the criticality of MySQL infrastructure, we had a few requirements for the upgrade process:

  • We must be able to upgrade each MySQL database while adhering to our Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
  • We are unable to account for all failure modes in our testing and validation stages. So, in order to remain within SLO, we needed to be able to roll back to the prior version of MySQL 5.7 without a disruption of service.
  • We have a very diverse workload across our MySQL fleet. To reduce risk, we needed to upgrade each database cluster atomically and schedule around other major changes. This meant the upgrade process would be a long one. Therefore, we knew from the start we needed to be able to sustain operating a mixed-version environment.

Preparation for the upgrade started in July 2022 and we had several milestones to reach even before upgrading a single production database.

Prepare infrastructure for upgrade

We needed to determine appropriate default values for MySQL 8.0 and perform some baseline performance benchmarking. Since we needed to operate two versions of MySQL, our tooling and automation needed to be able to handle mixed versions and be aware of new, different, or deprecated syntax between 5.7 and 8.0.

Ensure application compatibility

We added MySQL 8.0 to Continuous Integration (CI) for all applications using MySQL. We ran MySQL 5.7 and 8.0 side-by-side in CI to ensure that there wouldn’t be regressions during the prolonged upgrade process. We detected a variety of bugs and incompatibilities in CI, helping us remove any unsupported configurations or features and escape any new reserved keywords.

To help application developers transition towards MySQL 8.0, we also enabled an option to select a MySQL 8.0 prebuilt container in GitHub Codespaces for debugging and provided MySQL 8.0 development clusters for additional pre-prod testing.

Communication and transparency

We used GitHub Projects to create a rolling calendar to communicate and track our upgrade schedule internally. We created issue templates that tracked the checklist for both application teams and the database team to coordinate an upgrade.

Project Board for tracking the MySQL 8.0 upgrade schedule
Project Board for tracking the MySQL 8.0 upgrade schedule

Upgrade plan

To meet our availability standards, we had a gradual upgrade strategy that allowed for checkpoints and rollbacks throughout the process.

Step 1: Rolling replica upgrades

We started with upgrading a single replica and monitoring while it was still offline to ensure basic functionality was stable. Then, we enabled production traffic and continued to monitor for query latency, system metrics, and application metrics. We gradually brought 8.0 replicas online until we upgraded an entire data center and then iterated through other data centers. We left enough 5.7 replicas online in order to rollback, but we disabled production traffic to start serving all read traffic through 8.0 servers.

The replica upgrade strategy involved gradual rollouts in each data center (DC).
The replica upgrade strategy involved gradual rollouts in each data center (DC).

Step 2: Update replication topology

Once all the read-only traffic was being served via 8.0 replicas, we adjusted the replication topology as follows:

  • An 8.0 primary candidate was configured to replicate directly under the current 5.7 primary.
  • Two replication chains were created downstream of that 8.0 replica:
    • A set of only 5.7 replicas (not serving traffic, but ready in case of rollback).
    • A set of only 8.0 replicas (serving traffic).
  • The topology was only in this state for a short period of time (hours at most) until we moved to the next step.
To facilitate the upgrade, the topology was updated to have two replication chains.
To facilitate the upgrade, the topology was updated to have two replication chains.

Step 3: Promote MySQL 8.0 host to primary

We opted not to do direct upgrades on the primary database host. Instead, we would promote a MySQL 8.0 replica to primary through a graceful failover performed with Orchestrator. At that point, the replication topology consisted of an 8.0 primary with two replication chains attached to it: an offline set of 5.7 replicas in case of rollback and a serving set of 8.0 replicas.

Orchestrator was also configured to blacklist 5.7 hosts as potential failover candidates to prevent an accidental rollback in case of an unplanned failover.

Primary failover and additional steps to finalize MySQL 8.0 upgrade for a database
Primary failover and additional steps to finalize MySQL 8.0 upgrade for a database

Step 4: Internal facing instance types upgraded

We also have ancillary servers for backups or non-production workloads. Those were subsequently upgraded for consistency.

Step 5: Cleanup

Once we confirmed that the cluster didn’t need to rollback and was successfully upgraded to 8.0, we removed the 5.7 servers. Validation consisted of at least one complete 24 hour traffic cycle to ensure there were no issues during peak traffic.

Ability to Rollback

A core part of keeping our upgrade strategy safe was maintaining the ability to rollback to the prior version of MySQL 5.7. For read-replicas, we ensured enough 5.7 replicas remained online to serve production traffic load, and rollback was initiated by disabling the 8.0 replicas if they weren’t performing well. For the primary, in order to roll back without data loss or service disruption, we needed to be able to maintain backwards data replication between 8.0 and 5.7.

MySQL supports replication from one release to the next higher release but does not explicitly support the reverse (MySQL Replication compatibility). When we tested promoting an 8.0 host to primary on our staging cluster, we saw replication break on all 5.7 replicas. There were a couple of problems we needed to overcome:

  1. In MySQL 8.0, utf8mb4 is the default character set and uses a more modern utf8mb4_0900_ai_ci collation as the default. The prior version of MySQL 5.7 supported the utf8mb4_unicode_520_ci collation but not the latest version of Unicode utf8mb4_0900_ai_ci.
  2. MySQL 8.0 introduces roles for managing privileges but this feature did not exist in MySQL 5.7. When an 8.0 instance was promoted to be a primary in a cluster, we encountered problems. Our configuration management was expanding certain permission sets to include role statements and executing them, which broke downstream replication in 5.7 replicas. We solved this problem by temporarily adjusting defined permissions for affected users during the upgrade window.

To address the character collation incompatibility, we had to set the default character encoding to utf8 and collation to utf8_unicode_ci.

For the GitHub.com monolith, our Rails configuration ensured that character collation was consistent and made it easier to standardize client configurations to the database. As a result, we had high confidence that we could maintain backward replication for our most critical applications.

Challenges

Throughout our testing, preparation and upgrades, we encountered some technical challenges.

What about Vitess?

We use Vitess for horizontally sharding relational data. For the most part, upgrading our Vitess clusters was not too different from upgrading the MySQL clusters. We were already running Vitess in CI, so we were able to validate query compatibility. In our upgrade strategy for sharded clusters, we upgraded one shard at a time. VTgate, the Vitess proxy layer, advertises the version of MySQL and some client behavior depends on this version information. For example, one application used a Java client that disabled the query cache for 5.7 servers—since the query cache was removed in 8.0, it generated blocking errors for them. So, once a single MySQL host was upgraded for a given keyspace, we had to make sure we also updated the VTgate setting to advertise 8.0.

Replication delay

We use read-replicas to scale our read availability. GitHub.com requires low replication delay in order to serve up-to-date data.

Earlier on in our testing, we encountered a replication bug in MySQL that was patched on 8.0.28:

Replication: If a replica server with the system variable replica_preserve_commit_order = 1 set was used under intensive load for a long period, the instance could run out of commit order sequence tickets. Incorrect behavior after the maximum value was exceeded caused the applier to hang and the applier worker threads to wait indefinitely on the commit order queue. The commit order sequence ticket generator now wraps around correctly. Thanks to Zhai Weixiang for the contribution. (Bug #32891221, Bug #103636)

We happen to meet all the criteria for hitting this bug.

  • We use replica_preserve_commit_order because we use GTID based replication.
  • We have intensive load for long periods of time on many of our clusters and certainly for all of our most critical ones. Most of our clusters are very write-heavy.

Since this bug was already patched upstream, we just needed to ensure we are deploying a version of MySQL higher than 8.0.28.

We also observed that the heavy writes that drove replication delay were exacerbated in MySQL 8.0. This made it even more important that we avoid heavy bursts in writes. At GitHub, we use freno to throttle write workloads based on replication lag.

Queries would pass CI but fail on production

We knew we would inevitably see problems for the first time in production environments—hence our gradual rollout strategy with upgrading replicas. We encountered queries that passed CI but would fail on production when encountering real-world workloads. Most notably, we encountered a problem where queries with large WHERE IN clauses would crash MySQL. We had large WHERE IN queries containing over tens of thousands of values. In those cases, we needed to rewrite the queries prior to continuing the upgrade process. Query sampling helped to track and detect these problems. At GitHub, we use Solarwinds DPM (VividCortex), a SaaS database performance monitor, for query observability.

Learnings and takeaways

Between testing, performance tuning, and resolving identified issues, the overall upgrade process took over a year and involved engineers from multiple teams at GitHub. We upgraded our entire fleet to MySQL 8.0 – including staging clusters, production clusters in support of GitHub.com, and instances in support of internal tools. This upgrade highlighted the importance of our observability platform, testing plan, and rollback capabilities. The testing and gradual rollout strategy allowed us to identify problems early and reduce the likelihood for encountering new failure modes for the primary upgrade.

While there was a gradual rollout strategy, we still needed the ability to rollback at every step and we needed the observability to identify signals to indicate when a rollback was needed. The most challenging aspect of enabling rollbacks was holding onto the backward replication from the new 8.0 primary to 5.7 replicas. We learned that consistency in the Trilogy client library gave us more predictability in connection behavior and allowed us to have confidence that connections from the main Rails monolith would not break backward replication.

However, for some of our MySQL clusters with connections from multiple different clients in different frameworks/languages, we saw backwards replication break in a matter of hours which shortened the window of opportunity for rollback. Luckily, those cases were few and we didn’t have an instance where the replication broke before we needed to rollback. But for us this was a lesson that there are benefits to having known and well-understood client-side connection configurations. It emphasized the value of developing guidelines and frameworks to ensure consistency in such configurations.

Prior efforts to partition our data paid off—it allowed us to have more targeted upgrades for the different data domains. This was important as one failing query would block the upgrade for an entire cluster and having different workloads partitioned allowed us to upgrade piecemeal and reduce the blast radius of unknown risks encountered during the process. The tradeoff here is that this also means that our MySQL fleet has grown.

The last time GitHub upgraded MySQL versions, we had five database clusters and now we have 50+ clusters. In order to successfully upgrade, we had to invest in observability, tooling, and processes for managing the fleet.

Conclusion

A MySQL upgrade is just one type of routine maintenance that we have to perform – it’s critical for us to have an upgrade path for any software we run on our fleet. As part of the upgrade project, we developed new processes and operational capabilities to successfully complete the MySQL version upgrade. Yet, we still had too many steps in the upgrade process that required manual intervention and we want to reduce the effort and time it takes to complete future MySQL upgrades.

We anticipate that our fleet will continue to grow as GitHub.com grows and we have goals to partition our data further which will increase our number of MySQL clusters over time. Building in automation for operational tasks and self-healing capabilities can help us scale MySQL operations in the future. We believe that investing in reliable fleet management and automation will allow us to scale github and keep up with required maintenance, providing a more predictable and resilient system.

The lessons from this project provided the foundations for our MySQL automation and will pave the way for future upgrades to be done more efficiently, but still with the same level of care and safety.

If you are interested in these types of engineering problems and more, check out our Careers page.

The post Upgrading GitHub.com to MySQL 8.0 appeared first on The GitHub Blog.

EPA Won’t Force Water Utilities to Audit Their Cybersecurity

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/10/epa-wont-force-water-utilities-to-audit-their-cybersecurity.html

The industry pushed back:

Despite the EPA’s willingness to provide training and technical support to help states and public water system organizations implement cybersecurity surveys, the move garnered opposition from both GOP state attorneys and trade groups.

Republican state attorneys that were against the new proposed policies said that the call for new inspections could overwhelm state regulators. The attorney generals of Arkansas, Iowa and Missouri all sued the EPA—claiming the agency had no authority to set these requirements. This led to the EPA’s proposal being temporarily blocked back in June.

So now we have a piece of our critical infrastructure with substandard cybersecurity. This seems like a really bad outcome.

Hacking Gas Pumps via Bluetooth

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/10/hacking-gas-pumps-via-bluetooth.html

Turns out pumps at gas stations are controlled via Bluetooth, and that the connections are insecure. No details in the article, but it seems that it’s easy to take control of the pump and have it dispense gas without requiring payment.

It’s a complicated crime to monetize, though. You need to sell access to the gas pump to others.

EDITED TO ADD (10/13): Reader Jeff Hall says that story is not accurate, and that the gas pumps do not have a Bluetooth connection.

December’s Reimagining Democracy Workshop

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/08/decembers-reimagining-democracy-workshop.html

Imagine that we’ve all—all of us, all of society—landed on some alien planet, and we have to form a government: clean slate. We don’t have any legacy systems from the US or any other country. We don’t have any special or unique interests to perturb our thinking.

How would we govern ourselves?

It’s unlikely that we would use the systems we have today. The modern representative democracy was the best form of government that mid-eighteenth-century technology could conceive of. The twenty-first century is a different place scientifically, technically and socially.

For example, the mid-eighteenth-century democracies were designed under the assumption that both travel and communications were hard. Does it still make sense for all of us living in the same place to organize every few years and choose one of us to go to a big room far away and create laws in our name?

Representative districts are organized around geography, because that’s the only way that made sense 200-plus years ago. But we don’t have to do it that way. We can organize representation by age: one representative for the thirty-one-year-olds, another for the thirty-two-year-olds, and so on. We can organize representation randomly: by birthday, perhaps. We can organize any way we want.

US citizens currently elect people for terms ranging from two to six years. Is ten years better? Is ten days better? Again, we have more technology and therefor more options.

Indeed, as a technologist who studies complex systems and their security, I believe the very idea of representative government is a hack to get around the technological limitations of the past. Voting at scale is easier now than it was 200 year ago. Certainly we don’t want to all have to vote on every amendment to every bill, but what’s the optimal balance between votes made in our name and ballot measures that we all vote on?

In December 2022, I organized a workshop to discuss these and other questions. I brought together fifty people from around the world: political scientists, economists, law professors, AI experts, activists, government officials, historians, science fiction writers and more. We spent two days talking about these ideas. Several themes emerged from the event.

Misinformation and propaganda were themes, of course—and the inability to engage in rational policy discussions when people can’t agree on the facts.

Another theme was the harms of creating a political system whose primary goals are economic. Given the ability to start over, would anyone create a system of government that optimizes the near-term financial interest of the wealthiest few? Or whose laws benefit corporations at the expense of people?

Another theme was capitalism, and how it is or isn’t intertwined with democracy. And while the modern market economy made a lot of sense in the industrial age, it’s starting to fray in the information age. What comes after capitalism, and how does it affect how we govern ourselves?

Many participants examined the effects of technology, especially artificial intelligence. We looked at whether—and when—we might be comfortable ceding power to an AI. Sometimes it’s easy. I’m happy for an AI to figure out the optimal timing of traffic lights to ensure the smoothest flow of cars through the city. When will we be able to say the same thing about setting interest rates? Or designing tax policies?

How would we feel about an AI device in our pocket that voted in our name, thousands of times per day, based on preferences that it inferred from our actions? If an AI system could determine optimal policy solutions that balanced every voter’s preferences, would it still make sense to have representatives? Maybe we should vote directly for ideas and goals instead, and leave the details to the computers. On the other hand, technological solutionism regularly fails.

Scale was another theme. The size of modern governments reflects the technology at the time of their founding. European countries and the early American states are a particular size because that’s what was governable in the 18th and 19th centuries. Larger governments—the US as a whole, the European Union—reflect a world in which travel and communications are easier. The problems we have today are primarily either local, at the scale of cities and towns, or global—even if they are currently regulated at state, regional or national levels. This mismatch is especially acute when we try to tackle global problems. In the future, do we really have a need for political units the size of France or Virginia? Or is it a mixture of scales that we really need, one that moves effectively between the local and the global?

As to other forms of democracy, we discussed one from history and another made possible by today’s technology.

Sortition is a system of choosing political officials randomly to deliberate on a particular issue. We use it today when we pick juries, but both the ancient Greeks and some cities in Renaissance Italy used it to select major political officials. Today, several countries—largely in Europe—are using sortition for some policy decisions. We might randomly choose a few hundred people, representative of the population, to spend a few weeks being briefed by experts and debating the problem—and then decide on environmental regulations, or a budget, or pretty much anything.

Liquid democracy does away with elections altogether. Everyone has a vote, and they can keep the power to cast it themselves or assign it to another person as a proxy. There are no set elections; anyone can reassign their proxy at any time. And there’s no reason to make this assignment all or nothing. Perhaps proxies could specialize: one set of people focused on economic issues, another group on health and a third bunch on national defense. Then regular people could assign their votes to whichever of the proxies most closely matched their views on each individual matter—or step forward with their own views and begin collecting proxy support from other people.

This all brings up another question: Who gets to participate? And, more generally, whose interests are taken into account? Early democracies were really nothing of the sort: They limited participation by gender, race and land ownership.

We should debate lowering the voting age, but even without voting we recognize that children too young to vote have rights—and, in some cases, so do other species. Should future generations get a “voice,” whatever that means? What about nonhumans or whole ecosystems?

Should everyone get the same voice? Right now in the US, the outsize effect of money in politics gives the wealthy disproportionate influence. Should we encode that explicitly? Maybe younger people should get a more powerful vote than everyone else. Or maybe older people should.

Those questions lead to ones about the limits of democracy. All democracies have boundaries limiting what the majority can decide. We all have rights: the things that cannot be taken away from us. We cannot vote to put someone in jail, for example.

But while we can’t vote a particular publication out of existence, we can to some degree regulate speech. In this hypothetical community, what are our rights as individuals? What are the rights of society that supersede those of individuals?

Personally, I was most interested in how these systems fail. As a security technologist, I study how complex systems are subverted—hacked, in my parlance—for the benefit of a few at the expense of the many. Think tax loopholes, or tricks to avoid government regulation. I want any government system to be resilient in the face of that kind of trickery.

Or, to put it another way, I want the interests of each individual to align with the interests of the group at every level. We’ve never had a system of government with that property before—even equal protection guarantees and First Amendment rights exist in a competitive framework that puts individuals’ interests in opposition to one another. But—in the age of such existential risks as climate and biotechnology and maybe AI—aligning interests is more important than ever.

Our workshop didn’t produce any answers; that wasn’t the point. Our current discourse is filled with suggestions on how to patch our political system. People regularly debate changes to the Electoral College, or the process of creating voting districts, or term limits. But those are incremental changes.

It’s hard to find people who are thinking more radically: looking beyond the horizon for what’s possible eventually. And while true innovation in politics is a lot harder than innovation in technology, especially without a violent revolution forcing change, it’s something that we as a species are going to have to get good at—one way or another.

This essay previously appeared in The Conversation.

White House Announces AI Cybersecurity Challenge

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/08/white-house-announces-ai-cybersecurity-challenge.html

At Black Hat last week, the White House announced an AI Cyber Challenge. Gizmodo reports:

The new AI cyber challenge (which is being abbreviated “AIxCC”) will have a number of different phases. Interested would-be competitors can now submit their proposals to the Small Business Innovation Research program for evaluation and, eventually, selected teams will participate in a 2024 “qualifying event.” During that event, the top 20 teams will be invited to a semifinal competition at that year’s DEF CON, another large cybersecurity conference, where the field will be further whittled down.

[…]

To secure the top spot in DARPA’s new competition, participants will have to develop security solutions that do some seriously novel stuff. “To win first-place, and a top prize of $4 million, finalists must build a system that can rapidly defend critical infrastructure code from attack,” said Perri Adams, program manager for DARPA’s Information Innovation Office, during a Zoom call with reporters Tuesday. In other words: the government wants software that is capable of identifying and mitigating risks by itself.

This is a great idea. I was a big fan of DARPA’s AI capture-the-flag event in 2016, and am happy to see that DARPA is again inciting research in this area. (China has been doing this every year since 2017.)

Backdoor in TETRA Police Radios

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/07/backdoor-in-tetra-police-radios.html

Seems that there is a deliberate backdoor in the twenty-year-old TErrestrial Trunked RAdio (TETRA) standard used by police forces around the world.

The European Telecommunications Standards Institute (ETSI), an organization that standardizes technologies across the industry, first created TETRA in 1995. Since then, TETRA has been used in products, including radios, sold by Motorola, Airbus, and more. Crucially, TETRA is not open-source. Instead, it relies on what the researchers describe in their presentation slides as “secret, proprietary cryptography,” meaning it is typically difficult for outside experts to verify how secure the standard really is.

The researchers said they worked around this limitation by purchasing a TETRA-powered radio from eBay. In order to then access the cryptographic component of the radio itself, Wetzels said the team found a vulnerability in an interface of the radio.

[…]

Most interestingly is the researchers’ findings of what they describe as the backdoor in TEA1. Ordinarily, radios using TEA1 used a key of 80-bits. But Wetzels said the team found a “secret reduction step” which dramatically lowers the amount of entropy the initial key offered. An attacker who followed this step would then be able to decrypt intercepted traffic with consumer-level hardware and a cheap software defined radio dongle.

Looks like the encryption algorithm was intentionally weakened by intelligence agencies to facilitate easy eavesdropping.

Specifically on the researchers’ claims of a backdoor in TEA1, Boyer added “At this time, we would like to point out that the research findings do not relate to any backdoors. The TETRA security standards have been specified together with national security agencies and are designed for and subject to export control regulations which determine the strength of the encryption.”

And I would like to point out that that’s the very definition of a backdoor.

Why aren’t we done with secret, proprietary cryptography? It’s just not a good idea.

Details of the security analysis. Another news article.

Chinese Hacking of US Critical Infrastructure

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/05/chinese-hacking-of-us-critical-infrastructure.html

Everyone is writing about an interagency and international report on Chinese hacking of US critical infrastructure.

Lots of interesting details about how the group, called Volt Typhoon, accesses target networks and evades detection.

PIPEDREAM Malware against Industrial Control Systems

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/05/pipedream-malware-against-industrial-control-systems.html

Another nation-state malware, Russian in origin:

In the early stages of the war in Ukraine in 2022, PIPEDREAM, a known malware was quietly on the brink of wiping out a handful of critical U.S. electric and liquid natural gas sites. PIPEDREAM is an attack toolkit with unmatched and unprecedented capabilities developed for use against industrial control systems (ICSs).

The malware was built to manipulate the network communication protocols used by programmable logic controllers (PLCs) leveraged by two critical producers of PLCs for ICSs within the critical infrastructure sector, Schneider Electric and OMRON.

CISA advisory. Wired article.

Cloudflare’s network expansion in Indonesia

Post Syndicated from Joanne Liew original https://blog.cloudflare.com/indonesia/

Cloudflare's network expansion in Indonesia

Cloudflare's network expansion in Indonesia

As home to over 200 million Internet users and the fourth-largest population in the world, Indonesians depend on fast and reliable Internet, but this has always been a challenging part of the world for Internet infrastructure. This has real world implications on performance and reliability (IP transit is on average 6x more expensive than our major South East Asian interconnection markets). That said, first we wanted to share what makes things challenging in Indonesia; geography, infrastructure, and market dynamics.

Geography: The Internet backbone for many countries is almost entirely delivered by terrestrial fiber optic cables, where connectivity is more affordable and easier to build when the land mass is contiguous and there is a concentrated population distribution. However, Indonesia is a collection of over 18,000 islands, spanning three time zones, and approximately 3,200 miles (5,100 km) east to west. By comparison, the United States is 2,800 miles (4,500 km) east to west. While parts of Indonesia are geographically close to Singapore (the regional Internet hub with over 60% of the region’s data centers) given how large Indonesia is, much of it is far away.

Infrastructure: Indonesia is a large country and to connect it to the rest of the Internet it currently relies on submarine fiber optic cables. There are a total of 22 separate submarine cables connecting Indonesia to Singapore, Malaysia, Australia and onward. Many of the cable systems cross the Strait of Malacca, a narrow stretch of water, between the Malay Peninsula (Peninsular Malaysia) and the Indonesian island of Sumatra to the southwest, connecting the Indian and Pacific Oceans. This makes reliability challenging as a result of human activities, such as ships dropping their anchors, fishing trawlers, and dredging as it is one of the world’s top five busiest shipping lanes. Additionally, Indonesia is geographically located in a very active seismic zone and is very earthquake prone.

There are a number of new submarine cable systems that have come online and four significant builds planned (Apricot, ACC-1, Echo and Nui) that will improve both available capacity and cost economics in the market. Right now the cost is still significantly higher than comparable distances. For example Jakarta to Singapore is approximately 60 times more expensive than a service the same distance would be in the continental US or Europe for a 100Gbps wavelength service. Staying in Asia, a similar distance from Hong Kong to Taiwan costs around 1/6th that of Jakarta to Singapore.

Cloudflare's network expansion in Indonesia

Cloudflare's network expansion in Indonesia
Cyber 1 and Cyber 3 (NTT NexCenter) Data Center Buildings in Jakarta, 2019 (Photo Credit: Tom Paseka).

Cloudflare's network expansion in Indonesia
Picture of Cyber 1 Lobby Directory

While areas like Batam are becoming increasingly popular for data center builds due to its proximity to Singapore, Jakarta is still the most developed and mature market. It has the largest and best interconnected data centers in the country, including the two pictured.

Cloudflare is deployed in the facility on the right (NTT NexCenter), however most ISPs are inside the building on the left (Cyber 1). The two buildings are approximately 30-50 meters apart, yet it’s surprisingly difficult to be able to connect between them. One of the reasons why is market fragmentation and how many options are available. In the adjacent picture of the Cyber 1 building lobby directory many of the listings are unique data centers each with different policies and access conditions.

In the past, we’ve talked about the Cost of Bandwidth around the world (and updated here), but we’ve never talked about Indonesia specifically. Using the same methodology as we’ve used in the past, Indonesia’s cost is 43x times more expensive than North America or Europe, or even multiples more expensive than other countries in Asia.

Market dynamics: While Indonesia has good and functioning Internet Exchanges, there are a few ISPs who dominate the market. The three largest ISPs in the country (Telkom Indonesia, Indosat Ooredoo Hutchison and XL Axiata) collectively control 80% of the market, while Telkom Indonesia alone has a market share of around 60% by revenue.

This results in Telkom Indonesia having a heavily dominant market share position to leverage resulting in refusal to peer, or exchange Internet traffic in Indonesia without expensive payments, or instead, preferring to connect to other networks outside of Indonesia, introducing latency and diminished performance.

Despite all of these challenges, our network has come a long way since our initial deployment to Jakarta in 2019.

We’ve established:

  • A carrier neutral local point of presence at NTT Indonesia Nexcenter Data Center, one of the major interconnection hubs in Jakarta
  • An edge partnership point of presence in Yogyakarta with CitranetIX
  • Direct interconnections in country with two of the top three networks.
  • Peering across three of the larger local internet exchanges, Indonesia Internet Exchange, Jakarta Internet Exchange and Biznet Internet Exchange
  • Dedicated 100G wavelength transport back to Singapore

All of this results in a more performant and reliable network for our local customers.

We wanted to see how our network is performing since deployment. We mentioned during Speed Week in 2021 how we benchmark against different networks, and sharing some of those benchmarks here.

At the end of December 2021, Cloudflare was only faster in a few networks, as compared to other providers in Indonesia.

Cloudflare's network expansion in Indonesia

Fast forward twelve months to December 2022, Cloudflare is significantly faster in even more networks.

Cloudflare's network expansion in Indonesia

The TCP protocol is a connection-oriented protocol, which means that a connection is established and maintained until the application programs at each end have finished exchanging messages. The Connect Time summarizes how fast a session can be set up between a client and a server over a network. TTLB (or time to last byte) is the time taken to send the entire response to the web browser. It’s a good measure of how long a complete download takes. Check out our recent blog on Benchmarking Edge Network Performance for more information on how we measure the performance of our network and benchmark ourselves against industry players.

On closer inspection against the three major ISPs specifically, we’re the top provider for two out of the three networks. Cloudflare’s performance has improved year-on-year (16% reduction) and continues to lead (comparative to the other networks) meaning faster and more responsive services for our customers.

Cloudflare's network expansion in Indonesia

Helping build a better Internet for Indonesia doesn’t stop here and there is always more work to be done! We want to be the number one network everywhere and won’t rest until we are. We are continuing to connect to more networks locally, invest in direct submarine cable capacity, as well as further deployments into new data center buildings, Internet Exchanges and new cities too!

Are you operating a network and not yet peering with Cloudflare? Log-in to our Peering Portal or find out more information here for ways to set up peering, or request we deploy nodes into your network directly.

Cyberwar Lessons from the War in Ukraine

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/02/cyberwar-lessons-from-the-war-in-ukraine.html

The Aspen Institute has published a good analysis of the successes, failures, and absences of cyberattacks as part of the current war in Ukraine: “The Cyber Defense Assistance Imperative ­ Lessons from Ukraine.”

Its conclusion:

Cyber defense assistance in Ukraine is working. The Ukrainian government and Ukrainian critical infrastructure organizations have better defended themselves and achieved higher levels of resiliency due to the efforts of CDAC and many others. But this is not the end of the road—the ability to provide cyber defense assistance will be important in the future. As a result, it is timely to assess how to provide organized, effective cyber defense assistance to safeguard the post-war order from potential aggressors.

The conflict in Ukraine is resetting the table across the globe for geopolitics and international security. The US and its allies have an imperative to strengthen the capabilities necessary to deter and respond to aggression that is ever more present in cyberspace. Lessons learned from the ad hoc conduct of cyber defense assistance in Ukraine can be institutionalized and scaled to provide new approaches and tools for preventing and managing cyber conflicts going forward.

I am often asked why where weren’t more successful cyberattacks by Russia against Ukraine. I generally give four reasons: (1) Cyberattacks are more effective in the “grey zone” between peace and war, and there are better alternatives once the shooting and bombing starts. (2) Setting these attacks up takes time, and Putin was secretive about his plans. (3) Putin was concerned about attacks spilling outside the war zone, and affecting other countries. (4) Ukrainian defenses were good, aided by other countries and companies. This paper gives a fifth reasons: they were technically successful, but keeping them out of the news made them operationally unsuccessful.

What Will It Take?

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/02/what-will-it-take.html

What will it take for policy makers to take cybersecurity seriously? Not minimal-change seriously. Not here-and-there seriously. But really seriously. What will it take for policy makers to take cybersecurity seriously enough to enact substantive legislative changes that would address the problems? It’s not enough for the average person to be afraid of cyberattacks. They need to know that there are engineering fixes—and that’s something we can provide.

For decades, I have been waiting for the “big enough” incident that would finally do it. In 2015, Chinese military hackers hacked the Office of Personal Management and made off with the highly personal information of about 22 million Americans who had security clearances. In 2016, the Mirai botnet leveraged millions of Internet-of-Things devices with default admin passwords to launch a denial-of-service attack that disabled major Internet platforms and services in both North America and Europe. In 2017, hackers—years later we learned that it was the Chinese military—hacked the credit bureau Equifax and stole the personal information of 147 million Americans. In recent years, ransomware attacks have knocked hospitals offline, and many articles have been written about Russia inside the U.S. power grid. And last year, the Russian SVR hacked thousands of sensitive networks inside civilian critical infrastructure worldwide in what we’re now calling Sunburst (and used to call SolarWinds).

Those are all major incidents to security people, but think about them from the perspective of the average person. Even the most spectacular failures don’t affect 99.9% of the country. Why should anyone care if the Chinese have his or her credit records? Or if the Russians are stealing data from some government network? Few of us have been directly affected by ransomware, and a temporary Internet outage is just temporary.

Cybersecurity has never been a campaign issue. It isn’t a topic that shows up in political debates. (There was one question in a 2016 Clinton–Trump debate, but the response was predictably unsubstantive.) This just isn’t an issue that most people prioritize, or even have an opinion on.

So, what will it take? Many of my colleagues believe that it will have to be something with extreme emotional intensity—sensational, vivid, salient—that results in large-scale loss of life or property damage. A successful attack that actually poisons a water supply, as someone tried to do in January by raising the levels of lye at a Florida water-treatment plant. (That one was caught early.) Or an attack that disables Internet-connected cars at speed, something that was demonstrated by researchers in 2014. Or an attack on the power grid, similar to what Russia did to the Ukraine in 2015 and 2016. Will it take gas tanks exploding and planes falling out of the sky for the average person to read about the casualties and think “that could have been me”?

Here’s the real problem. For the average nonexpert—and in this category I include every lawmaker—to push for change, they not only need to believe that the present situation is intolerable, they also need to believe that an alternative is possible. Real legislative change requires a belief that the never-ending stream of hacks and attacks is not inevitable, that we can do better. And that will require creating working examples of secure, dependable, resilient systems.

Providing alternatives is how engineers help facilitate social change. We could never have eliminated sales of tungsten-filament household light bulbs if fluorescent and LED replacements hadn’t become available. Reducing the use of fossil fuel for electricity generation requires working wind turbines and cost-effective solar cells.

We need to demonstrate that it’s possible to build systems that can defend themselves against hackers, criminals, and national intelligence agencies; secure Internet-of-Things systems; and systems that can reestablish security after a breach. We need to prove that hacks aren’t inevitable, and that our vulnerability is a choice. Only then can someone decide to choose differently. When people die in a cyberattack and everyone asks “What can be done?” we need to have something to tell them.

We don’t yet have the technology to build a truly safe, secure, and resilient Internet and the computers that connect to it. Yes, we have lots of security technologies. We have older secure systems—anyone still remember Apollo’s DomainOS and MULTICS?—that lost out in a market that didn’t reward security. We have newer research ideas and products that aren’t successful because the market still doesn’t reward security. We have even newer research ideas that won’t be deployed, again, because the market still prefers convenience over security.

What I am proposing is something more holistic, an engineering research task on a par with the Internet itself. The Internet was designed and built to answer this question: Can we build a reliable network out of unreliable parts in an unreliable world? It turned out the answer was yes, and the Internet was the result. I am asking a similar research question: Can we build a secure network out of insecure parts in an insecure world? The answer isn’t obviously yes, but it isn’t obviously no, either.

While any successful demonstration will include many of the security technologies we know and wish would see wider use, it’s much more than that. Creating a secure Internet ecosystem goes beyond old-school engineering to encompass the social sciences. It will include significant economic, institutional, and psychological considerations that just weren’t present in the first few decades of Internet research.

Cybersecurity isn’t going to get better until the economic incentives change, and that’s not going to change until the political incentives change. The political incentives won’t change until there is political liability that comes from voter demands. Those demands aren’t going to be solely the results of insecurity. They will also be the result of believing that there’s a better alternative. It is our task to research, design, build, test, and field that better alternative—even though the market couldn’t care less right now.

This essay originally appeared in the May/June 2021 issue of IEEE Security & Privacy. I forgot to publish it here.

Reimagining Democracy

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/12/reimagining-democracy.html

Last week, I hosted a two-day workshop on reimagining democracy.

The idea was to bring together people from a variety of disciplines who are all thinking about different aspects of democracy, less from a “what we need to do today” perspective and more from a blue-sky future perspective. My remit to the participants was this:

The idea is to start from scratch, to pretend we’re forming a new country and don’t have any precedent to deal with. And that we don’t have any unique interests to perturb our thinking. The modern representative democracy was the best form of government mid-eighteenth century politicians technology could invent. The twenty-first century is a very different place technically, scientifically, and philosophically. What could democracy look like if it were reinvented today? Would it even be democracy­—what comes after democracy?

Some questions to think about:

  • Representative democracies were built under the assumption that travel and communications were difficult. Does it still make sense to organize our representative units by geography? Or to send representatives far away to create laws in our name? Is there a better way for people to choose collective representatives?
  • Indeed, the very idea of representative government is due to technological limitations. If an AI system could find the optimal solution for balancing every voter’s preferences, would it still make sense to have representatives­—or should we vote for ideas and goals instead?
  • With today’s technology, we can vote anywhere and any time. How should we organize the temporal pattern of voting—­and of other forms of participation?
  • Starting from scratch, what is today’s ideal government structure? Does it make sense to have a singular leader “in charge” of everything? How should we constrain power­—is there something better than the legislative/judicial/executive set of checks and balances?
  • The size of contemporary political units ranges from a few people in a room to vast nation-states and alliances. Within one country, what might the smaller units be­—and how do they relate to one another?
  • Who has a voice in the government? What does “citizen” mean? What about children? Animals? Future people (and animals)? Corporations? The land?
  • And much more: What about the justice system? Is the twelfth-century jury form still relevant? How do we define fairness? Limit financial and military power? Keep our system robust to psychological manipulation?

My perspective, of course, is security. I want to create a system that is resilient against hacking: one that can evolve as both technologies and threats evolve.

The format was one that I have used before. Forty-eight people meet over two days. There are four ninety-minute panels per day, with six people on each. Everyone speaks for ten minutes, and the rest of the time is devoted to questions and comments. Ten minutes means that no one gets bogged down in jargon or details. Long breaks between sessions and evening dinners allow people to talk more informally. The result is a very dense, idea-rich environment that I find extremely valuable.

It was amazing event. Everyone participated. Everyone was interesting. (Details of the event—emerging themes, notes from the speakers—are in the comments.) It’s a week later and I am still buzzing with ideas. I hope this is only the first of an ongoing series of similar workshops.

A Digital Red Cross

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/11/a-digital-red-cross.html

The International Committee of the Red Cross wants some digital equivalent to the iconic red cross, to alert would-be hackers that they are accessing a medical network.

The emblem wouldn’t provide technical cybersecurity protection to hospitals, Red Cross infrastructure or other medical providers, but it would signal to hackers that a cyberattack on those protected networks during an armed conflict would violate international humanitarian law, experts say, Tilman Rodenhäuser, a legal adviser to the International Committee of the Red Cross, said at a panel discussion hosted by the organization on Thursday.

I can think of all sorts of problems with this idea and many reasons why it won’t work, but those also apply to the physical red cross on buildings, vehicles, and people’s clothing. So let’s try it.

EDITED TO ADD: Original reference.

NSA on Supply Chain Security

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/11/nsa-on-supply-chain-security.html

The NSA (together with CISA) has published a long report on supply-chain security: “Securing the Software Supply Chain: Recommended Practices Guide for Suppliers.“:

Prevention is often seen as the responsibility of the software developer, as they are required to securely develop and deliver code, verify third party components, and harden the build environment. But the supplier also holds a critical responsibility in ensuring the security and integrity of our software. After all, the software vendor is responsible for liaising between the customer and software developer. It is through this relationship that additional security features can be applied via contractual agreements, software releases and updates, notifications and mitigations of vulnerabilities.

Software suppliers will find guidance from NSA and our partners on preparing organizations by defining software security checks, protecting software, producing well-secured software, and responding to vulnerabilities on a continuous basis. Until all stakeholders seek to mitigate concerns specific to their area of responsibility, the software supply chain cycle will be vulnerable and at risk for potential compromise.

They previously published “Securing the Software Supply Chain: Recommended Practices Guide for Developers.” And they plan on publishing one focused on customers.

Montenegro Is the Victim of a Cyberattack

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/09/montenegro-is-the-victim-of-a-cyberattack.html

Details are few, but Montenegro has suffered a cyberattack:

A combination of ransomware and distributed denial-of-service attacks, the onslaught disrupted government services and prompted the country’s electrical utility to switch to manual control.

[…]

But the attack against Montenegro’s infrastructure seemed more sustained and extensive, with targets including water supply systems, transportation services and online government services, among many others.

Government officials in the country of just over 600,000 people said certain government services remained temporarily disabled for security reasons and that the data of citizens and businesses were not endangered.

The Director of the Directorate for Information Security, Dusan Polovic, said 150 computers were infected with malware at a dozen state institutions and that the data of the Ministry of Public Administration was not permanently damaged. Polovic said some retail tax collection was affected.

Russia is being blamed, but I haven’t seen any evidence other than “they’re the obvious perpetrator.”

EDITED TO ADD (9/12): The Montenegro government is hedging on that Russia attribution. It seems to be a regular criminal ransomware attack. The Cuba Ransomware gang has Russian members, but that’s not the same thing as the government.

Securing Open-Source Software

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/07/securing-open-source-software.html

Good essay arguing that open-source software is a critical national-security asset and needs to be treated as such:

Open source is at least as important to the economy, public services, and national security as proprietary code, but it lacks the same standards and safeguards. It bears the qualities of a public good and is as indispensable as national highways. Given open source’s value as a public asset, an institutional structure must be built that sustains and secures it.

This is not a novel idea. Open-source code has been called the “roads and bridges” of the current digital infrastructure that warrants the same “focus and funding.” Eric Brewer of Google explicitly called open-source software “critical infrastructure” in a recent keynote at the Open Source Summit in Austin, Texas. Several nations have adopted regulations that recognize open-source projects as significant public assets and central to their most important systems and services. Germany wants to treat open-source software as a public good and launched a sovereign tech fund to support open-source projects “just as much as bridges and roads,” and not just when a bridge collapses. The European Union adopted a formal open-source strategy that encourages it to “explore opportunities for dedicated support services for open source solutions [it] considers critical.”

Designing an institutional framework that would secure open source requires addressing adverse incentives, ensuring efficient resource allocation, and imposing minimum standards. But not all open-source projects are made equal. The first step is to identify which projects warrant this heightened level of scrutiny—projects that are critical to society. CISA defines critical infrastructure as industry sectors “so vital to the United States that [its] incapacity or destruction would have a debilitating impact on our physical or economic security or public health or safety.” Efforts should target the open-source projects that share those features.

Attacks on Managed Service Providers Expected to Increase

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/05/attacks-on-managed-service-providers-expected-to-increase.html

CISA, NSA, FBI, and similar organizations in the other Five Eyes countries are warning that attacks on MSPs — as a vector to their customers — are likely to increase. No details about what this prediction is based on. Makes sense, though. The SolarWinds attack was incredibly successful for the Russian SVR, and a blueprint for future attacks.

News articles.

Industrial Control System Malware Discovered

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/04/industrial-control-system-malware-discovered.html

The Department of Energy, CISA, the FBI, and the NSA jointly issued an advisory describing a sophisticated piece of malware called Pipedream that’s designed to attack a wide range of industrial control systems. This is clearly from a government, but no attribution is given. There’s also no indication of how the malware was discovered. It seems not to have been used yet.

More information. News article.

White House Warns of Possible Russian Cyberattacks

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/03/white-house-warns-of-possible-russian-cyberattacks.html

News:

The White House has issued its starkest warning that Russia may be planning cyberattacks against critical-sector U.S. companies amid the Ukraine invasion.

[…]

Context: The alert comes after Russia has lobbed a series of digital attacks at the Ukrainian government and critical industry sectors. But there’s been no sign so far of major disruptive hacks against U.S. targets even as the government has imposed increasingly harsh sanctions that have battered the Russian economy.

  • The public alert followed classified briefings government officials conducted last week for more than 100 companies in sectors at the highest risk of Russian hacks, Neuberger said. The briefing was prompted by “preparatory activity” by Russian hackers, she said.
  • U.S. analysts have detected scanning of some critical sectors’ computers by Russian government actors and other preparatory work, one U.S. official told my colleague Ellen Nakashima on the condition of anonymity because of the matter’s sensitivity. But whether that is a signal that there will be a cyberattack on a critical system is not clear, Neuberger said.
  • Neuberger declined to name specific industry sectors under threat but said they’re part of critical infrastructure ­– a government designation that includes industries deemed vital to the economy and national security, including energy, finance, transportation and pipelines.

President Biden’s statement. White House fact sheet. And here’s a video of the extended Q&A with deputy national security adviser Anne Neuberger.

EDITED TO ADD (3/23): Long — three hour — conference call with CISA.