Tag Archives: internet

Web 3.0 Requires Data Integrity

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2025/04/web-3-0-requires-data-integrity.html

If you’ve ever taken a computer security class, you’ve probably learned about the three legs of computer security (confidentiality, integrity, and availability), known as the CIA triad. When we talk about a system being secure, that’s what we’re referring to. All are important, but to different degrees in different contexts. In a world populated by artificial intelligence (AI) systems and artificially intelligent agents, integrity will be paramount.

What is data integrity? It’s ensuring that no one can modify data—that’s the security angle—but it’s much more than that. It encompasses accuracy, completeness, and quality of data—all over both time and space. It’s preventing accidental data loss; the “undo” button is a primitive integrity measure. It’s also making sure that data is accurate when it’s collected—that it comes from a trustworthy source, that nothing important is missing, and that it doesn’t change as it moves from format to format. The ability to restart your computer is another integrity measure.

The CIA triad has evolved with the Internet. The first iteration of the Web—Web 1.0 of the 1990s and early 2000s—prioritized availability. This era saw organizations and individuals rush to digitize their content, creating what has become an unprecedented repository of human knowledge. Organizations worldwide established their digital presence, leading to massive digitization projects where quantity took precedence over quality. The emphasis on making information available overshadowed other concerns.

As Web technologies matured, the focus shifted to protecting the vast amounts of data flowing through online systems. This is Web 2.0: the Internet of today. Interactive features and user-generated content transformed the Web from a read-only medium to a participatory platform. The increase in personal data, and the emergence of interactive platforms for e-commerce, social media, and online everything demanded both data protection and user privacy. Confidentiality became paramount.

We stand at the threshold of a new Web paradigm: Web 3.0. This is a distributed, decentralized, intelligent Web. Peer-to-peer social-networking systems promise to break the tech monopolies’ control on how we interact with each other. Tim Berners-Lee’s open W3C protocol, Solid, represents a fundamental shift in how we think about data ownership and control. A future filled with AI agents requires verifiable, trustworthy personal data and computation. In this world, data integrity takes center stage.

For example, the 5G communications revolution isn’t just about faster access to videos; it’s about Internet-connected things talking to other Internet-connected things without our intervention. Without data integrity, there are no real-time car-to-car communications about road movements and conditions. There’s no drone swarm coordination, smart power grid, or reliable mesh networking. And there’s no way to securely empower AI agents.

In particular, AI systems require robust integrity controls because of how they process data. This means technical controls to ensure data is accurate, that its meaning is preserved as it is processed, that it produces reliable results, and that humans can reliably alter it when it’s wrong. Just as a scientific instrument must be calibrated to measure reality accurately, AI systems need integrity controls that preserve the connection between their data and ground truth.

This goes beyond preventing data tampering. It means building systems that maintain verifiable chains of trust between their inputs, processing, and outputs, so humans can understand and validate what the AI is doing. AI systems need clean, consistent, and verifiable control processes to learn and make decisions effectively. Without this foundation of verifiable truth, AI systems risk becoming a series of opaque boxes.

Recent history provides many sobering examples of integrity failures that naturally undermine public trust in AI systems. Machine-learning (ML) models trained without thought on expansive datasets have produced predictably biased results in hiring systems. Autonomous vehicles with incorrect data have made incorrect—and fatal—decisions. Medical diagnosis systems have given flawed recommendations without being able to explain themselves. A lack of integrity controls undermines AI systems and harms people who depend on them.

They also highlight how AI integrity failures can manifest at multiple levels of system operation. At the training level, data may be subtly corrupted or biased even before model development begins. At the model level, mathematical foundations and training processes can introduce new integrity issues even with clean data. During execution, environmental changes and runtime modifications can corrupt previously valid models. And at the output level, the challenge of verifying AI-generated content and tracking it through system chains creates new integrity concerns. Each level compounds the challenges of the ones before it, ultimately manifesting in human costs, such as reinforced biases and diminished agency.

Think of it like protecting a house. You don’t just lock a door; you also use safe concrete foundations, sturdy framing, a durable roof, secure double-pane windows, and maybe motion-sensor cameras. Similarly, we need digital security at every layer to ensure the whole system can be trusted.

This layered approach to understanding security becomes increasingly critical as AI systems grow in complexity and autonomy, particularly with large language models (LLMs) and deep-learning systems making high-stakes decisions. We need to verify the integrity of each layer when building and deploying digital systems that impact human lives and societal outcomes.

At the foundation level, bits are stored in computer hardware. This represents the most basic encoding of our data, model weights, and computational instructions. The next layer up is the file system architecture: the way those binary sequences are organized into structured files and directories that a computer can efficiently access and process. In AI systems, this includes how we store and organize training data, model checkpoints, and hyperparameter configurations.

On top of that are the application layers—the programs and frameworks, such as PyTorch and TensorFlow, that allow us to train models, process data, and generate outputs. This layer handles the complex mathematics of neural networks, gradient descent, and other ML operations.

Finally, at the user-interface level, we have visualization and interaction systems—what humans actually see and engage with. For AI systems, this could be everything from confidence scores and prediction probabilities to generated text and images or autonomous robot movements.

Why does this layered perspective matter? Vulnerabilities and integrity issues can manifest at any level, so understanding these layers helps security experts and AI researchers perform comprehensive threat modeling. This enables the implementation of defense-in-depth strategies—from cryptographic verification of training data to robust model architectures to interpretable outputs. This multi-layered security approach becomes especially crucial as AI systems take on more autonomous decision-making roles in critical domains such as healthcare, finance, and public safety. We must ensure integrity and reliability at every level of the stack.
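As one concrete reading of "cryptographic verification of training data," here is a minimal sketch that records a SHA-256 digest for every file in a dataset directory and later reports any file that was added, removed, or changed. The manifest format and the function names (`build_manifest`, `verify_manifest`, `demo`) are assumptions made for illustration; the essay does not prescribe a mechanism, and a real deployment would also need to sign or otherwise protect the manifest itself.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its digest (hypothetical format)."""
    return {
        path.relative_to(data_dir).as_posix(): sha256_of_file(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def verify_manifest(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing, added, or changed."""
    current = build_manifest(data_dir)
    return sorted(
        name
        for name in manifest.keys() | current.keys()
        if manifest.get(name) != current.get(name)
    )


def demo() -> tuple[list[str], list[str]]:
    """Build a manifest for a tiny made-up dataset, then tamper with one file."""
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp)
        (data_dir / "train.csv").write_text("label,text\n1,hello\n")
        (data_dir / "labels.txt").write_text("0\n1\n")
        manifest = build_manifest(data_dir)
        clean = verify_manifest(data_dir, manifest)
        # Flip a single label: the kind of subtle change that is hard to spot by eye.
        (data_dir / "train.csv").write_text("label,text\n0,hello\n")
        tampered = verify_manifest(data_dir, manifest)
    return clean, tampered
```

Calling `demo()` returns an empty list for the untouched dataset and a list naming `train.csv` after the edit. A check like this only covers the storage layer; it says nothing about whether the data was accurate when it was collected.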

The risks of deploying AI without proper integrity control measures are severe and often underappreciated. When AI systems operate without sufficient security measures to handle corrupted or manipulated data, they can produce subtly flawed outputs that appear valid on the surface. The failures can cascade through interconnected systems, amplifying errors and biases. Without proper integrity controls, an AI system might train on polluted data, make decisions based on misleading assumptions, or have outputs altered without detection. The results of this can range from degraded performance to catastrophic failures.

We see four areas where integrity is paramount in this Web 3.0 world. The first is granular access, which allows users and organizations to maintain precise control over who can access and modify what information and for what purposes. The second is authentication, much more nuanced than the simple “Who are you?” authentication mechanisms of today, which ensures that data access is properly verified and authorized at every step. The third is transparent data ownership, which allows data owners to know when and how their data is used and creates an auditable trail of data provenance. Finally, the fourth is access standardization: common interfaces and protocols that enable consistent data access while maintaining security.
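To make the idea of an auditable trail a little more concrete, here is a minimal sketch of a hash-chained access log, in which each entry commits to the one before it, so a later edit to any entry breaks verification. This is one well-known technique for tamper evidence, not something the essay specifies; the entry layout, the `GENESIS` placeholder, and the actor and record names are all assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_event(log: list[dict], event: dict) -> None:
    """Append an event, linking it to the hash of the last entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": entry_hash(prev_hash, event)})


def verify_log(log: list[dict]) -> bool:
    """Recompute every link; any edited, reordered, or dropped entry fails."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical access events for one piece of personal data.
log: list[dict] = []
append_event(log, {"actor": "clinic-app", "action": "read", "record": "blood-test-17"})
append_event(log, {"actor": "research-agent", "action": "read", "record": "blood-test-17"})

ok_before = verify_log(log)                 # chain is intact
log[0]["event"]["actor"] = "someone-else"   # rewrite history
ok_after = verify_log(log)                  # the stored hash no longer matches
```

A chain like this detects modification but does not prevent it, and whoever holds the log could still rebuild the entire chain from scratch. Practical designs usually publish or countersign the latest hash somewhere the log keeper cannot rewrite.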

Luckily, we’re not starting from scratch. There are open W3C protocols that address some of this: decentralized identifiers for verifiable digital identity, the verifiable credentials data model for expressing digital credentials, ActivityPub for decentralized social networking (that’s what Mastodon uses), Solid for distributed data storage and retrieval, and WebAuthn for strong authentication standards. By providing standardized ways to verify data provenance and maintain data integrity throughout its lifecycle, Web 3.0 creates the trusted environment that AI systems require to operate reliably. This architectural leap for integrity control in the hands of users helps ensure that data remains trustworthy from generation and collection through processing and storage.

Integrity is essential to trust, on both technical and human levels. Looking forward, integrity controls will fundamentally shape AI development by moving from optional features to core architectural requirements, much as SSL certificates evolved from a banking luxury to a baseline expectation for any Web service.

Web 3.0 protocols can build integrity controls into their foundation, creating a more reliable infrastructure for AI systems. Today, we take availability for granted; anything less than 100% uptime for critical websites is intolerable. In the future, we will need the same assurances for integrity. Success will require following practical guidelines for maintaining data integrity throughout the AI lifecycle—from data collection through model training and finally to deployment, use, and evolution. These guidelines will address not just technical controls but also governance structures and human oversight, similar to how privacy policies evolved from legal boilerplate into comprehensive frameworks for data stewardship. Common standards and protocols, developed through industry collaboration and regulatory frameworks, will ensure consistent integrity controls across different AI systems and applications.

Just as the HTTPS protocol created a foundation for trusted e-commerce, it’s time for new integrity-focused standards to enable the trusted AI services of tomorrow.

This essay was written with Davi Ottenheimer, and originally appeared in Communications of the ACM.

The First Password on the Internet

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2025/01/the-first-password-on-the-internet.html

It was created in 1973 by Peter Kirstein:

So from the beginning I put password protection on my gateway. This had been done in such a way that even if UK users telephoned directly into the communications computer provided by Darpa in UCL, they would require a password.

In fact this was the first password on Arpanet. It proved invaluable in satisfying authorities on both sides of the Atlantic for the 15 years I ran the service – during which no security breach occurred over my link. I also put in place a system of governance that any UK users had to be approved by a committee which I chaired but which also had UK government and British Post Office representation.

I wish he’d told us what that password was.

Cloudflare Reports that Almost 7% of All Internet Traffic Is Malicious

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/07/cloudflare-reports-that-almost-7-of-all-internet-traffic-is-malicious.html

6.8%, to be precise.

From ZDNet:

However, Distributed Denial of Service (DDoS) attacks continue to be cybercriminals’ weapon of choice, making up over 37% of all mitigated traffic. The scale of these attacks is staggering. In the first quarter of 2024 alone, Cloudflare blocked 4.5 million unique DDoS attacks. That total is nearly a third of all the DDoS attacks they mitigated the previous year.

But it’s not just about the sheer volume of DDoS attacks. The sophistication of these attacks is increasing, too. Last August, Cloudflare mitigated a massive HTTP/2 Rapid Reset DDoS attack that peaked at 201 million requests per second (RPS). That number is three times bigger than any previously observed attack.

It wasn’t just Cloudflare that was hit by the largest DDoS attack in its history. Google Cloud reported the same attack peaked at an astonishing 398 million RPS. So, how big is that number? According to Google, Google Cloud was slammed by more RPS in two minutes than Wikipedia saw traffic during September 2023.

Google Reportedly Disconnecting Employees from the Internet

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/07/google-reportedly-disconnecting-employees-from-the-internet.html

Supposedly Google is starting a pilot program of disabling Internet connectivity from employee computers:

The company will disable internet access on the select desktops, with the exception of internal web-based tools and Google-owned websites like Google Drive and Gmail. Some workers who need the internet to do their job will get exceptions, the company stated in materials.

Google has not confirmed this story.


Facebook Is Down

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/10/facebook-is-down.html

Facebook — along with Instagram and WhatsApp — went down globally today. Basically, someone deleted their BGP records, which made their DNS fall apart.

…at approximately 11:39 a.m. ET today (15:39 UTC), someone at Facebook caused an update to be made to the company’s Border Gateway Protocol (BGP) records. BGP is a mechanism by which Internet service providers of the world share information about which providers are responsible for routing Internet traffic to which specific groups of Internet addresses.

In simpler terms, sometime this morning Facebook took away the map telling the world’s computers how to find its various online properties. As a result, when one types Facebook.com into a web browser, the browser has no idea where to find Facebook.com, and so returns an error page.

In addition to stranding billions of users, the Facebook outage also has stranded its employees from communicating with one another using their internal Facebook tools. That’s because Facebook’s email and tools are all managed in house and via the same domains that are now stranded.
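The "map" metaphor in the quoted explanation can be illustrated with a toy routing table. The sketch below is not BGP itself (there are no peers, paths, or policies); it only shows longest-prefix lookup and what happens once the prefix covering a name server is withdrawn. Every prefix and AS number is a documentation placeholder (RFC 5737 and RFC 5398 ranges), not one of Facebook's, and `lookup` is a hypothetical helper.

```python
import ipaddress

# Toy routing table: announced prefix -> origin network (placeholder values).
routes = {
    ipaddress.ip_network("192.0.2.0/24"): "AS64500",       # holds the name server below
    ipaddress.ip_network("198.51.100.0/24"): "AS64501",
    ipaddress.ip_network("198.51.100.128/25"): "AS64502",  # more specific route
}


def lookup(table, address):
    """Return the origin of the most specific prefix covering address, or None."""
    ip = ipaddress.ip_address(address)
    matches = [(net, origin) for net, origin in table.items() if ip in net]
    if not matches:
        return None
    return max(matches, key=lambda item: item[0].prefixlen)[1]


NAMESERVER = "192.0.2.53"  # hypothetical authoritative DNS server

before = lookup(routes, NAMESERVER)            # "AS64500"
specific = lookup(routes, "198.51.100.200")    # "AS64502", the /25 wins over the /24

# Withdraw the announcement that covers the name server.
del routes[ipaddress.ip_network("192.0.2.0/24")]

after = lookup(routes, NAMESERVER)             # None: no route, so DNS queries cannot arrive
```

Once `after` is `None`, nothing is wrong with the name server itself; the rest of the network simply has no entry telling it where to send the packets, which is why the failure showed up to users as a DNS problem.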

What I heard is that none of the employee keycards work, since they have to ping a now-unreachable server. So people can’t get into buildings and offices.

And every third-party site that relies on “log in with Facebook” is stuck as well.

The fix won’t be quick:

As a former network admin who worked on the internet at this level, I anticipate Facebook will be down for hours more. I suspect it will end up being Facebook’s longest and most severe failure to date before it’s fixed.

We all know the security risks of monocultures.

EDITED TO ADD (10/6): Good explanation of what happened. Shorter from Jonathan Zittrain: “Facebook basically locked its keys in the car.”

Surveillance of the Internet Backbone

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/08/surveillance-of-the-internet-backbone.html

Vice has an article about how data brokers sell access to the Internet backbone. This is netflow data. It’s useful for cybersecurity forensics, but can also be used for things like tracing VPN activity.

At a high level, netflow data creates a picture of traffic flow and volume across a network. It can show which server communicated with another, information that may ordinarily only be available to the server owner or the ISP carrying the traffic. Crucially, this data can be used for, among other things, tracking traffic through virtual private networks, which are used to mask where someone is connecting to a server from, and by extension, their approximate physical location.
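A small sketch may help show why flow metadata alone is revealing. The code below uses a made-up, simplified flow record (real NetFlow records carry more fields, such as ports and protocol) to total traffic per source and destination pair and to pair up flows entering and leaving a relay within a short time window. The `Flow` layout, the one-second window, and all addresses are assumptions for illustration, and real traffic analysis is far noisier than this.

```python
from collections import defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    start: float      # seconds since an arbitrary reference time
    src: str
    dst: str
    byte_count: int


def bytes_by_pair(flows):
    """Total bytes for each (source, destination) pair: who talked to whom, and how much."""
    totals = defaultdict(int)
    for flow in flows:
        totals[(flow.src, flow.dst)] += flow.byte_count
    return dict(totals)


def correlate_through_relay(flows, relay, window=1.0):
    """Pair each flow into the relay with flows leaving it shortly afterwards."""
    inbound = [f for f in flows if f.dst == relay]
    outbound = [f for f in flows if f.src == relay]
    return [
        (i.src, o.dst)
        for i in inbound
        for o in outbound
        if 0 <= o.start - i.start <= window
    ]


RELAY = "203.0.113.10"  # stands in for a VPN server (documentation address)

flows = [
    Flow(0.0, "198.51.100.7", RELAY, 1200),   # client A -> relay
    Flow(0.2, RELAY, "192.0.2.80", 1150),     # relay -> server X
    Flow(0.5, "198.51.100.7", RELAY, 300),    # client A -> relay again
    Flow(5.0, "198.51.100.9", RELAY, 800),    # client B -> relay
    Flow(5.1, RELAY, "192.0.2.81", 790),      # relay -> server Y
]

totals = bytes_by_pair(flows)
links = correlate_through_relay(flows, RELAY)
```

Here `links` pairs client A with server X and client B with server Y, even though no packet contents were inspected. That is the sense in which an observer with a wide enough view of the backbone can see through the masking a VPN is meant to provide.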

In the hands of some governments, that could be dangerous.

17000ft | The MagPi 98

Post Syndicated from Rob Zwetsloot original https://www.raspberrypi.org/blog/17000ft-the-magpi-98/

How do you get internet over three miles up the Himalayas? That’s what the 17000 ft Foundation and Sujata Sahu had to figure out. Rob Zwetsloot reports in the latest issue of the MagPi magazine, out now.

Living in the more urban areas of the UK, it can be easy to take decent internet and mobile phone signal for granted. In more remote areas of the country, internet can be a bit spotty, but that’s nothing compared with living up a mountain.

Tablet computers are provided that connect to a Raspberry Pi-powered network

“17000 ft Foundation is a not-for-profit organisation in India, set up to improve the lives of people settled in very remote mountainous hamlets, in areas that are inaccessible and isolated due to reasons of harsh mountainous terrain,” explains its founder, Sujata Sahu. “17000 ft has its roots in high-altitude Ladakh, a region in the desolate cold desert of the Himalayan mountain region of India. Situated in altitudes upwards of 9300 ft and with temperatures dropping to -50°C in inhabited areas, this area is home to indigenous tribal communities settled across hundreds of tiny, scattered hamlets. These villages are remote, isolated, and suffer from bare minimum infrastructure and a centuries-old civilisation unwilling but driven to migrate to faraway cities in search of a better life. Ladakh has a population of just under 300,000 people living across 60,000 km² of harsh mountain terrain, whose sustenance and growth depends on the infrastructure, resources, and support provided by the government.”

A huge number of students have already benefited from the program

The local governments have built schools. However, they don’t have enough resources or qualified teachers to be truly effective, resulting in a problem with students dropping out or having to be sent off to cities. 17000 ft’s mission is to transform the education in these communities.

High-altitude Raspberry Pi

“The Foundation today works in over 200 remote government schools to upgrade school infrastructure, build the capacity of teachers, provide better resources for learning, thereby improving the quality of education for its children,” says Sujata. “17000 ft Foundation has designed and implemented a unique solar-powered offline digital learning solution called the DigiLab, using Raspberry Pi, which brings the power of digital learning to areas which are truly off-grid and have neither electricity nor mobile connectivity, helping children to learn better, while also enabling the local administration to monitor performance remotely.”

Each school is provided with solar power, Raspberry Pi computers to act as a local internet for the school, and tablets to connect to it. The system provides ‘last mile connectivity’ between a remote school and the cloud: an app on a teacher’s phone downloads data when it can and then updates the Raspberry Pi installed in their school.

Remote success

“The solution has now been implemented in 120 remote schools of Ladakh and is being considered to be implemented at scale to cover the entire region,” adds Sujata. “It has now run successfully across three winters of Ladakh, withstanding even the harshest of -50°C temperatures with no failure. In the first year of its implementation alone, 5000 students were enrolled, with over 93% being active. The system has now delivered over 60,000 hours of learning to students in remote villages and improved learning outcomes.”

Not all children stay in the villages year round

It’s already helping to change education in the area during the winter. Many villages (and schools) can shut down for up to six months, and families who can’t move away are usually left without a functioning school. 17000 ft has changed this.

“In the winter of 2018 and 2019, for the first time in a few decades, parents and community members from many of these hamlets decided to take advantage of their DigiLabs and opened them up for their children to learn despite the harsh winters and lack of teachers,” Sujata explains. “Parents pooled in to provide basic heating facilities (a Bukhari – a wood- or dung-based stove with a long pipe chimney) to bring in some warmth and scheduled classes for the senior children, allowing them to learn at their own pace, with student data continuing to be recorded in Raspberry Pi and available for the teachers to assess when they got back. The DigiLab Program, which has been made possible due to the presence of the Raspberry Pi Server, has solved a major problem that the Ladakhis have been facing for years!”

Some of the village schools go unused in the winter

How can people help?

Sujata says, “17000 ft Foundation is a non-profit organisation and is dependent on donations and support from individuals and companies alike. This solution was developed by the organisation in a limited budget and was implemented successfully across over a hundred hamlets. Raspberry Pi has been a boon for this project, with its low cost and its computing capabilities which helped create this solution for such a remote area. However, the potential of Raspberry Pi is as yet untapped and the solution still needs upgrades to be able to scale to cover more schools and deliver enhanced functionality within the school. 17000 ft is very eager to help take this to other similar regions and cover more schools in Ladakh that still remain ignored. What we really need is funds and technical support to be able to reach the good of this solution to more children who are still out of the reach of Ed Tech and learning. We welcome contributions of any size to help us in this project.”

For donations from outside India, write to [email protected]. Indian citizens can donate through 17000ft.org/donate.

The MagPi magazine is out now, available in print from the Raspberry Pi Press online store, your local newsagents, and the Raspberry Pi Store, Cambridge.

You can also download the PDF directly from the MagPi magazine website.

Subscribers to the MagPi for 12 months get a free Adafruit Circuit Playground, or can choose from one of our other subscription offers, including this amazing limited-time offer of three issues and a book for only £10!
