Tag Archives: Facebook

AI and the Evolution of Social Media

2024-03-19 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/03/ai-and-the-evolution-of-social-media.html

Oh, how the mighty have fallen. A decade ago, social media was celebrated for sparking democratic uprisings in the Arab world and beyond. Now front pages are splashed with stories of social platforms’ role in misinformation, business conspiracy, malfeasance, and risks to mental health. In a 2022 survey, Americans blamed social media for the coarsening of our political discourse, the spread of misinformation, and the increase in partisan polarization.

Today, tech’s darling is artificial intelligence. Like social media, it has the potential to change the world in many ways, some favorable to democracy. But at the same time, it has the potential to do incredible damage to society.

There is a lot we can learn about social media’s unregulated evolution over the past decade that directly applies to AI companies and technologies. These lessons can help us avoid making the same mistakes with AI that we did with social media.

In particular, five fundamental attributes of social media have harmed society. AI also has those attributes. Note that they are not intrinsically evil. They are all double-edged swords, with the potential to do either good or ill. The danger comes from who wields the sword, and in what direction it is swung. This has been true for social media, and it will similarly hold true for AI. In both cases, the solution lies in limits on the technology’s use.

#1: Advertising

The role advertising plays in the internet arose more by accident than anything else. When commercialization first came to the internet, there was no easy way for users to make micropayments to do things like viewing a web page. Moreover, users were accustomed to free access and wouldn’t accept subscription models for services. Advertising was the obvious business model, if never the best one. And it’s the model that social media also relies on, which leads it to prioritize engagement over anything else.

Both Google and Facebook believe that AI will help them keep their stranglehold on an 11-figure online ad market (yep, 11 figures), and the tech giants that are traditionally less dependent on advertising, like Microsoft and Amazon, believe that AI will help them seize a bigger piece of that market.

Big Tech needs something to persuade advertisers to keep spending on their platforms. Despite bombastic claims about the effectiveness of targeted marketing, researchers have long struggled to demonstrate where and when online ads really have an impact. When major brands like Uber and Procter & Gamble recently slashed their digital ad spending by the hundreds of millions, they proclaimed that it made no dent at all in their sales.

AI-powered ads, industry leaders say, will be much better. Google assures you that AI can tweak your ad copy in response to what users search for, and that its AI algorithms will configure your campaigns to maximize success. Amazon wants you to use its image generation AI to make your toaster product pages look cooler. And IBM is confident its Watson AI will make your ads better.

These techniques border on the manipulative, but the biggest risk to users comes from advertising within AI chatbots. Just as Google and Meta embed ads in your search results and feeds, AI companies will be pressured to embed ads in conversations. And because those conversations will be relational and human-like, they could be more damaging. While many of us have gotten pretty good at scrolling past the ads in Amazon and Google results pages, it will be much harder to determine whether an AI chatbot is mentioning a product because it’s a good answer to your question or because the AI developer got a kickback from the manufacturer.

#2: Surveillance

Social media’s reliance on advertising as the primary way to monetize websites led to personalization, which led to ever-increasing surveillance. To convince advertisers that social platforms can tweak ads to be maximally appealing to individual people, the platforms must demonstrate that they can collect as much information about those people as possible.

It’s hard to exaggerate how much spying is going on. A recent analysis by Consumer Reports about Facebook—just Facebook—showed that every user has more than 2,200 different companies spying on their web activities on its behalf.

AI-powered platforms that are supported by advertisers will face all the same perverse and powerful market incentives that social platforms do. It’s easy to imagine that a chatbot operator could charge a premium if it were able to claim that its chatbot could target users on the basis of their location, preference data, or past chat history and persuade them to buy products.

The possibility of manipulation is only going to get greater as we rely on AI for personal services. One of the promises of generative AI is the prospect of creating a personal digital assistant advanced enough to act as your advocate with others and as a butler to you. This requires more intimacy than you have with your search engine, email provider, cloud storage system, or phone. You’re going to want it with you constantly, and to most effectively work on your behalf, it will need to know everything about you. It will act as a friend, and you are likely to treat it as such, mistakenly trusting its discretion.

Even if you choose not to willingly acquaint an AI assistant with your lifestyle and preferences, AI technology may make it easier for companies to learn about you. Early demonstrations illustrate how chatbots can be used to surreptitiously extract personal data by asking you mundane questions. And with chatbots increasingly being integrated with everything from customer service systems to basic search interfaces on websites, exposure to this kind of inferential data harvesting may become unavoidable.

#3: Virality

Social media allows any user to express any idea with the potential for instantaneous global reach. A great public speaker standing on a soapbox can spread ideas to maybe a few hundred people on a good night. A kid with the right amount of snark on Facebook can reach a few hundred million people within a few minutes.

A decade ago, technologists hoped this sort of virality would bring people together and guarantee access to suppressed truths. But as a structural matter, it is in a social network’s interest to show you the things you are most likely to click on and share, and the things that will keep you on the platform.

As it happens, this often means outrageous, lurid, and triggering content. Researchers have found that content expressing maximal animosity toward political opponents gets the most engagement on Facebook and Twitter. And this incentive for outrage drives and rewards misinformation.

As Jonathan Swift once wrote, “Falsehood flies, and the Truth comes limping after it.” Academics seem to have proved this in the case of social media; people are more likely to share false information—perhaps because it seems more novel and surprising. And unfortunately, this kind of viral misinformation has been pervasive.

AI has the potential to supercharge the problem because it makes content production and propagation easier, faster, and more automatic. Generative AI tools can fabricate unending numbers of falsehoods about any individual or theme, some of which go viral. And those lies could be propelled by social accounts controlled by AI bots, which can share and launder the original misinformation at any scale.

Remarkably powerful AI text generators and autonomous agents are already starting to make their presence felt in social media. In July, researchers at Indiana University revealed a botnet of more than 1,100 Twitter accounts that appeared to be operated using ChatGPT.

AI will help reinforce viral content that emerges from social media. It will be able to create websites and web content, user reviews, and smartphone apps. It will be able to simulate thousands, or even millions, of fake personas to give the mistaken impression that an idea, or a political position, or use of a product, is more common than it really is. What we might perceive to be vibrant political debate could be bots talking to bots. And these capabilities won’t be available just to those with money and power; the AI tools necessary for all of this will be easily available to us all.

#4: Lock-in

Social media companies spend a lot of effort making it hard for you to leave their platforms. It’s not just that you’ll miss out on conversations with your friends. They make it hard for you to take your saved data—connections, posts, photos—and port it to another platform. Every moment you invest in sharing a memory, reaching out to an acquaintance, or curating your follows on a social platform adds a brick to the wall you’d have to climb over to go to another platform.

This concept of lock-in isn’t unique to social media. Microsoft cultivated proprietary document formats for years to keep you using its flagship Office product. Your music service or e-book reader makes it hard for you to take the content you purchased to a rival service or reader. And if you switch from an iPhone to an Android device, your friends might mock you for sending text messages in green bubbles. But social media takes this to a new level. No matter how bad it is, it’s very hard to leave Facebook if all your friends are there. Coordinating everyone to leave for a new platform is impossibly hard, so no one does.

Similarly, companies creating AI-powered personal digital assistants will make it hard for users to transfer that personalization to another AI. If AI personal assistants succeed in becoming massively useful time-savers, it will be because they know the ins and outs of your life as well as a good human assistant; would you want to give that up to make a fresh start on another company’s service? In extreme examples, some people have formed close, perhaps even familial, bonds with AI chatbots. If you think of your AI as a friend or therapist, that can be a powerful form of lock-in.

Lock-in is an important concern because it results in products and services that are less responsive to customer demand. The harder it is for you to switch to a competitor, the more poorly a company can treat you. Absent any way to force interoperability, AI companies have less incentive to innovate in features or compete on price, and fewer qualms about engaging in surveillance or other bad behaviors.

#5: Monopolization

Social platforms often start off as great products, truly useful and revelatory for their consumers, before they eventually start monetizing and exploiting those users for the benefit of their business customers. Then the platforms claw back the value for themselves, turning their products into truly miserable experiences for everyone. This is a cycle that Cory Doctorow has powerfully written about and traced through the history of Facebook, Twitter, and more recently TikTok.

The reason for these outcomes is structural. The network effects of tech platforms push a few firms to become dominant, and lock-in ensures their continued dominance. The incentives in the tech sector are so spectacularly, blindingly powerful that they have enabled six megacorporations (Amazon, Apple, Google, Facebook parent Meta, Microsoft, and Nvidia) to command a trillion dollars each of market value—or more. These firms use their wealth to block any meaningful legislation that would curtail their power. And they sometimes collude with each other to grow yet fatter.

This cycle is clearly starting to repeat itself in AI. Look no further than the industry poster child OpenAI, whose leading offering, ChatGPT, continues to set marks for uptake and usage. Within a year of the product’s launch, OpenAI’s valuation had skyrocketed to about $90 billion.

OpenAI once seemed like an “open” alternative to the megacorps—a common carrier for AI services with a socially oriented nonprofit mission. But the Sam Altman firing-and-rehiring debacle at the end of 2023, and Microsoft’s central role in restoring Altman to the CEO seat, simply illustrated how venture funding from the familiar ranks of the tech elite pervades and controls corporate AI. In January 2024, OpenAI took a big step toward monetization of this user base by introducing its GPT Store, wherein one OpenAI customer can charge another for the use of its custom versions of OpenAI software; OpenAI, of course, collects revenue from both parties. This sets in motion the very cycle Doctorow warns about.

In the middle of this spiral of exploitation, little or no regard is paid to externalities visited upon the greater public—people who aren’t even using the platforms. Even after society has wrestled with their ill effects for years, the monopolistic social networks have virtually no incentive to control their products’ environmental impact, tendency to spread misinformation, or pernicious effects on mental health. And the government has applied virtually no regulation toward those ends.

Likewise, few or no guardrails are in place to limit the potential negative impact of AI. Facial recognition software that amounts to racial profiling, simulated public opinions supercharged by chatbots, fake videos in political ads—all of it persists in a legal gray area. Even clear violators of campaign advertising law might, some think, be let off the hook if they simply do it with AI.

Mitigating the risks

The risks that AI poses to society are strikingly familiar, but there is one big difference: it’s not too late. This time, we know it’s all coming. Fresh off our experience with the harms wrought by social media, we have all the warning we should need to avoid the same mistakes.

The biggest mistake we made with social media was leaving it as an unregulated space. Even now—after all the studies and revelations of social media’s negative effects on kids and mental health, after Cambridge Analytica, after the exposure of Russian intervention in our politics, after everything else—social media in the US remains largely an unregulated “weapon of mass destruction.” Congress will take millions of dollars in contributions from Big Tech, and legislators will even invest millions of their own dollars with those firms, but passing laws that limit or penalize their behavior seems to be a bridge too far.

We can’t afford to do the same thing with AI, because the stakes are even higher. The harm social media can do stems from how it affects our communication. AI will affect us in the same ways and many more besides. If Big Tech’s trajectory is any signal, AI tools will increasingly be involved in how we learn and how we express our thoughts. But these tools will also influence how we schedule our daily activities, how we design products, how we write laws, and even how we diagnose diseases. The expansive role of these technologies in our daily lives gives for-profit corporations opportunities to exert control over more aspects of society, and that exposes us to the risks arising from their incentives and decisions.

The good news is that we have a whole category of tools to modulate the risk that corporate actions pose for our lives, starting with regulation. Regulations can come in the form of restrictions on activity, such as limitations on what kinds of businesses and products are allowed to incorporate AI tools. They can come in the form of transparency rules, requiring disclosure of what data sets are used to train AI models or what new preproduction-phase models are being trained. And they can come in the form of oversight and accountability requirements, allowing for civil penalties in cases where companies disregard the rules.

The single biggest point of leverage governments have when it comes to tech companies is antitrust law. Despite what many lobbyists want you to think, one of the primary roles of regulation is to preserve competition—not to make life harder for businesses. It is not inevitable for OpenAI to become another Meta, an 800-pound gorilla whose user base and reach are several times those of its competitors. In addition to strengthening and enforcing antitrust law, we can introduce regulation that supports competition-enabling standards specific to the technology sector, such as data portability and device interoperability. This is another core strategy for resisting monopoly and corporate control.

Additionally, governments can enforce existing regulations on advertising. Just as the US regulates what media can and cannot host advertisements for sensitive products like cigarettes, and just as many other jurisdictions exercise strict control over the time and manner of politically sensitive advertising, so too could the US limit the engagement between AI providers and advertisers.

Lastly, we should recognize that developing and providing AI tools does not have to be the sovereign domain of corporations. We, the people and our government, can do this too. The proliferation of open-source AI development in 2023, successful to an extent that startled corporate players, is proof of this. And we can go further, calling on our government to build public-option AI tools developed with political oversight and accountability under our democratic system, where the dictatorship of the profit motive does not apply.

Which of these solutions is most practical, most important, or most urgently needed is up for debate. We should have a vibrant societal dialogue about whether and how to use each of these tools. There are lots of paths to a good outcome.

The problem is that this isn’t happening now, particularly in the US. And with a looming presidential election, conflict spreading alarmingly across Asia and Europe, and a global climate crisis, it’s easy to imagine that we won’t get our arms around AI any faster than we have (not) with social media. But it’s not too late. These are still the early years for practical consumer AI applications. We must and can do better.

This essay was written with Nathan Sanders, and was originally published in MIT Technology Review.

Why Facebook’s Lack of Customer Support Is a Problem

2024-02-04 Bozho

Post Syndicated from Bozho original https://techblog.bozho.net/why-facebooks-lack-of-customer-support-is-a-problem/

Facebook is arguably the biggest social network. The network effect makes it hard for people to leave Facebook, and so many businesses, celebrities, institutions, politicians rely on it for reaching out to their customers/fans/citizens/voters.

Yet, at least in my part of the world, the customer support of Facebook is practically non-existent. Because I’m a member of parliament and former minister that had handled disinformation and relations with Meta, many people turn to me for their Facebook woes. And they are almost never resolved.

A few examples: a deep fake of the Bulgarian prime minister was circulating on Facebook for several days, after two institutions submitted official take-down notices. Profiles of fellow members of parliament were blocked/hacked. None of their support requests succeeded and their profiles remained blocked for months. A fellow member of parliament with paid subscription could not change his cover photo during an election campaign for mayor, and Facebook’s support stopped answering. Facebook bulk-deleted our candidate pages after one election campaign (after it has been taking ad money), and its support did not respond adequately (pages remained deleted). One colleague’s ad account was hacked and a malicious actor used his credit card to promote ads. He was unable to remove the intruder and Facebook’s support didn’t manage to do it either, so my colleagues had to remove the credit card. When I became a minister, my request for a blue checkmark was initially rejected and the official support channel didn’t answer. And in all of those cases support was requested in English, so it’s not about language-specific limitations.

I’m sure anyone using Facebook for business has similar experiences. In a nutshell, support is useless, even if you are paying customer or advertiser. And clearly there is no market pressure to change that.

The European Union recently introduced the Digital Services Act which at least pushes forward a long-time proposal of mine for appeals and independent arbitration for decisions that block access. I don’t know if that’s working already, but at least it’s a step.

So why is that a problem? Facebook argues it is not a ‘natural monopoly’, and I’ll agree with that to an extent – it faces competition from different types of social networks. But its scale and the network effect means it is not just a regular market player – it is (as the digital services act puts it) – a very large online platform that has gained a broad influence and therefore needs to be required to bear extra responsibility. The ability for some entity with 4 million users in a country of 7 million to arbitrarily ban members of parliament or candidates for mayors, or to choose (because of inefficiency) to leave a deep fake of a prime minister up for days, is a systemic risk. It’s a systemic risk to leave a business to be reliant on the whims and inefficiencies of the nearly non-existent customer support.

If a company can’t get customer support sorted, market forces usually push it out of the market. But because of the network effect (and its policy of acquiring some potential competitors), this hasn’t been the case. And if one of the most highly-valued companies on earth can’t have a decent support process, regulators should step up and set standards.

The post Why Facebook’s Lack of Customer Support Is a Problem appeared first on Bozho's tech blog.

Facebook’s Extensive Surveillance Network

2024-02-01 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/02/facebooks-extensive-surveillance-network.html

Consumer Reports is reporting that Facebook has built a massive surveillance network:

Using a panel of 709 volunteers who shared archives of their Facebook data, Consumer Reports found that a total of 186,892 companies sent data about them to the social network. On average, each participant in the study had their data sent to Facebook by 2,230 companies. That number varied significantly, with some panelists’ data listing over 7,000 companies providing their data. The Markup helped Consumer Reports recruit participants for the study. Participants downloaded an archive of the previous three years of their data from their Facebook settings, then provided it to Consumer Reports.

This isn’t data about your use of Facebook. This data about your interactions with other companies, all of which is correlated and analyzed by Facebook. It constantly amazes me that we willingly allow these monopoly companies that kind of surveillance power.

Here’s the Consumer Reports study. It includes policy recommendations:

Many consumers will rightly be concerned about the extent to which their activity is tracked by Facebook and other companies, and may want to take action to counteract consistent surveillance. Based on our analysis of the sample data, consumers need interventions that will:

Reduce the overall amount of tracking.
Improve the ability for consumers to take advantage of their right to opt out under state privacy laws.
Empower social media platform users and researchers to review who and what exactly is being advertised on Facebook.
Improve the transparency of Facebook’s existing tools.

And then the report gives specifics.

Facebook Enables Messenger End-to-End Encryption by Default

2023-12-11 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/12/facebook-enables-messenger-end-to-end-encryption-by-default.html

It’s happened. Details here, and tech details here (for messages in transit) and here (for messages in storage)

Rollout to everyone will take months, but it’s a good day for both privacy and security.

Slashdot thread.

Interview with Signal’s New President

2022-10-20 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/10/interview-with-signals-new-president.html

Long and interesting interview with Signal’s new president, Meredith Whittaker:

WhatsApp uses the Signal encryption protocol to provide encryption for its messages. That was absolutely a visionary choice that Brian and his team led back in the day - and big props to them for doing that. But you can’t just look at that and then stop at message protection. WhatsApp does not protect metadata the way that Signal does. Signal knows nothing about who you are. It doesn’t have your profile information and it has introduced group encryption protections. We don’t know who you are talking to or who is in the membership of a group. It has gone above and beyond to minimize the collection of metadata.

WhatsApp, on the other hand, collects the information about your profile, your profile photo, who is talking to whom, who is a group member. That is powerful metadata. It is particularly powerful—and this is where we have to back out into a structural argument for a company to collect the data that is also owned by Meta/Facebook. Facebook has a huge amount, just unspeakable volumes, of intimate information about billions of people across the globe.

It is not trivial to point out that WhatsApp metadata could easily be joined with Facebook data, and that it could easily reveal extremely intimate information about people. The choice to remove or enhance the encryption protocols is still in the hands of Facebook. We have to look structurally at what that organization is, who actually has control over these decisions, and at some of these details that often do not get discussed when we talk about message encryption overall.

I am a fan of Signal and I use it every day. The one feature I want, which WhatsApp has and Signal does not, is the ability to easily export a chat to a text file.

Facebook Has No Idea What Data It Has

2022-09-08 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/09/facebook-has-no-idea-what-data-it-has.html

This is from a court deposition:

Facebook’s stonewalling has been revealing on its own, providing variations on the same theme: It has amassed so much data on so many billions of people and organized it so confusingly that full transparency is impossible on a technical level. In the March 2022 hearing, Zarashaw and Steven Elia, a software engineering manager, described Facebook as a data-processing apparatus so complex that it defies understanding from within. The hearing amounted to two high-ranking engineers at one of the most powerful and resource-flush engineering outfits in history describing their product as an unknowable machine.

The special master at times seemed in disbelief, as when he questioned the engineers over whether any documentation existed for a particular Facebook subsystem. “Someone must have a diagram that says this is where this data is stored,” he said, according to the transcript. Zarashaw responded: “We have a somewhat strange engineering culture compared to most where we don’t generate a lot of artifacts during the engineering process. Effectively the code is its own design document often.” He quickly added, “For what it’s worth, this is terrifying to me when I first joined as well.”

[…]

Facebook’s inability to comprehend its own functioning took the hearing up to the edge of the metaphysical. At one point, the court-appointed special master noted that the “Download Your Information” file provided to the suit’s plaintiffs must not have included everything the company had stored on those individuals because it appears to have no idea what it truly stores on anyone. Can it be that Facebook’s designated tool for comprehensively downloading your information might not actually download all your information? This, again, is outside the boundaries of knowledge.

“The solution to this is unfortunately exactly the work that was done to create the DYI file itself,” noted Zarashaw. “And the thing I struggle with here is in order to find gaps in what may not be in DYI file, you would by definition need to do even more work than was done to generate the DYI files in the first place.”

The systemic fogginess of Facebook’s data storage made answering even the most basic question futile. At another point, the special master asked how one could find out which systems actually contain user data that was created through machine inference.

“I don’t know,” answered Zarashaw. “It’s a rather difficult conundrum.”

I’m not surprised. These systems are so complex that no humans understand them anymore. That allows us to do things we couldn’t do otherwise, but it’s also a problem.

EDITED TO ADD: Another article.

Facebook Is Now Encrypting Links to Prevent URL Stripping

2022-07-18 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2022/07/facebook-is-now-encrypting-links-to-prevent-url-stripping.html

Some sites, including Facebook, add parameters to the web address for tracking purposes. These parameters have no functionality that is relevant to the user, but sites rely on them to track users across pages and properties.

Mozilla introduced support for URL stripping in Firefox 102, which it launched in June 2022. Firefox removes tracking parameters from web addresses automatically, but only in private browsing mode or when the browser’s Tracking Protection feature is set to strict. Firefox users may enable URL stripping in all Firefox modes, but this requires manual configuration. Brave Browser strips known tracking parameters from web addresses as well.

Facebook has responded by encrypting the entire URL into a single ciphertext blob.

Since it is no longer possible to identify the tracking part of the web address, it is no longer possible to remove it from the address automatically. In other words: Facebook has the upper hand in regards to URL-based tracking at the time, and there is little that can be done about it short of finding a way to decrypt the information.

What happened on the Internet during the Facebook outage

2021-10-08 Celso Martinho

Post Syndicated from Celso Martinho original https://blog.cloudflare.com/during-the-facebook-outage/

It’s been a few days now since Facebook, Instagram, and WhatsApp went AWOL and experienced one of the most extended and rough downtime periods in their existence.

When that happened, we reported our bird’s-eye view of the event and posted the blog Understanding How Facebook Disappeared from the Internet where we tried to explain what we saw and how DNS and BGP, two of the technologies at the center of the outage, played a role in the event.

In the meantime, more information has surfaced, and Facebook has published a blog post giving more details of what happened internally.

As we said before, these events are a gentle reminder that the Internet is a vast network of networks, and we, as industry players and end-users, are part of it and should work together.

In the aftermath of an event of this size, we don’t waste much time debating how peers handled the situation. We do, however, ask ourselves the more important questions: “How did this affect us?” and “What if this had happened to us?” Asking and answering these questions whenever something like this happens is a great and healthy exercise that helps us improve our own resilience.

Today, we’re going to show you how the Facebook and affiliate sites downtime affected us, and what we can see in our data.

1.1.1.1

1.1.1.1 is a fast and privacy-centric public DNS resolver operated by Cloudflare, used by millions of users, browsers, and devices worldwide. Let’s look at our telemetry and see what we find.

First, the obvious. If we look at the response rate, there was a massive spike in the number of SERVFAIL codes. SERVFAILs can happen for several reasons; we have an excellent blog called Unwrap the SERVFAIL that you should read if you’re curious.

In this case, we started serving SERVFAIL responses to all facebook.com and whatsapp.com DNS queries because our resolver couldn’t access the upstream Facebook authoritative servers. About 60x times more than the average on a typical day.

What happened on the Internet during the Facebook outage

If we look at all the queries, not specific to Facebook or WhatsApp domains, and we split them by IPv4 and IPv6 clients, we can see that our load increased too.

As explained before, this is due to a snowball effect associated with applications and users retrying after the errors and generating even more traffic. In this case, 1.1.1.1 had to handle more than the expected rate for A and AAAA queries.

Here’s another fun one.

DNS vs. DoT and DoH. Typically, DNS queries and responses are sent in plaintext over UDP (or TCP sometimes), and that’s been the case for decades now. Naturally, this poses security and privacy risks to end-users as it allows in-transit attacks or traffic snooping.

With DNS over TLS (DoT) and DNS over HTTPS, clients can talk DNS using well-known, well-supported encryption and authentication protocols.

Our learning center has a good article on “DNS over TLS vs. DNS over HTTPS” that you can read. Browsers like Chrome, Firefox, and Edge have supported DoH for some time now, WAP uses DoH too, and you can even configure your operating system to use the new protocols.

When Facebook went offline, we saw the number of DoT+DoH SERVFAILs responses grow by over x300 vs. the average rate.

So, we got hammered with lots of requests and errors, causing traffic spikes to our 1.1.1.1 resolver and causing an unexpected load in the edge network and systems. How did we perform during this stressful period?

Quite well. 1.1.1.1 kept its cool and continued serving the vast majority of requests around the famous 10ms mark. An insignificant fraction of p95 and p99 percentiles saw increased response times, probably due to timeouts trying to reach Facebook’s nameservers.

Another interesting perspective is the distribution of the ratio between SERVFAIL and good DNS answers, by country. In theory, the higher this ratio is, the more the country uses Facebook. Here’s the map with the countries that suffered the most:

Here’s the top twelve country list, ordered by those that apparently use Facebook, WhatsApp and Instagram the most:

Country	SERVFAIL/Good Answers ratio
Turkey	7.34
Grenada	4.84
Congo	4.44
Lesotho	3.94
Nicaragua	3.57
South Sudan	3.47
Syrian Arab Republic	3.41
Serbia	3.25
Turkmenistan	3.23
United Arab Emirates	3.17
Togo	3.14
French Guiana	3.00

Impact on other sites

When Facebook, Instagram, and WhatsApp aren’t around, the world turns to other places to look for information on what’s going on, other forms of entertainment or other applications to communicate with their friends and family. Our data shows us those shifts. While Facebook was going down, other services and platforms were going up.

To get an idea of the changing traffic patterns we look at DNS queries as an indicator of increased traffic to specific sites or types of site.

Here are a few examples.

Other social media platforms saw a slight increase in use, compared to normal.

Traffic to messaging platforms like Telegram, Signal, Discord and Slack got a little push too.

Nothing like a little gaming time when Instagram is down, we guess, when looking at traffic to sites like Steam, Xbox, Minecraft and others.

And yes, people want to know what’s going on and fall back on news sites like CNN, New York Times, The Guardian, Wall Street Journal, Washington Post, Huffington Post, BBC, and others:

Attacks

One could speculate that the Internet was under attack from malicious hackers. Our Firewall doesn’t agree; nothing out of the ordinary stands out.

Network Error Logs

Network Error Logging, NEL for short, is an experimental technology supported in Chrome. A website can issue a Report-To header and ask the browser to send reports about network problems, like bad requests or DNS issues, to a specific endpoint.

Cloudflare uses NEL data to quickly help triage end-user connectivity issues when end-users reach our network. You can learn more about this feature in our help center.

If Facebook is down and their DNS isn’t responding, Chrome will start reporting NEL events every time one of the pages in our zones fails to load Facebook comments, posts, ads, or authentication buttons. This chart shows it clearly.

WARP

Cloudflare announced WARP in 2019, and called it “A VPN for People Who Don’t Know What V.P.N. Stands For” and offered it for free to its customers. Today WARP is used by millions of people worldwide to securely and privately access the Internet on their desktop and mobile devices. Here’s what we saw during the outage by looking at traffic volume between WARP and Facebook’s network:

You can see how the steep drop in Facebook ASN traffic coincides with the start of the incident and how it compares to the same period the day before.

Our own traffic

People tend to think of Facebook as a place to visit. We log in, and we access Facebook, we post. It turns out that Facebook likes to visit us too, quite a lot. Like Google and other platforms, Facebook uses an army of crawlers to constantly check websites for data and updates. Those robots gather information about websites content, such as its titles, descriptions, thumbnail images, and metadata. You can learn more about this on the “The Facebook Crawler” page and the Open Graph website.

Here’s what we see when traffic is coming from the Facebook ASN, supposedly from crawlers, to our CDN sites:

The robots went silent.

What about the traffic coming to our CDN sites from Facebook User-Agents? The gap is indisputable.

We see about 30% of a typical request rate hitting us. But it’s not zero; why is that?

We’ll let you know a little secret. Never trust User-Agent information; it’s broken. User-Agent spoofing is everywhere. Browsers, apps, and other clients deliberately change the User-Agent string when they fetch pages from the Internet to hide, obtain access to certain features, or bypass paywalls (because pay-walled sites want sites like Facebook to index their content, so that then they get more traffic from links).

Fortunately, there are newer, and privacy-centric standards emerging like User-Agent Client Hints.

Core Web Vitals

Core Web Vitals are the subset of Web Vitals, an initiative by Google to provide a unified interface to measure real-world quality signals when a user visits a web page. Such signals include Largest Contentful Paint (LCP), First Input Delay (FID), and Cumulative Layout Shift (CLS).

We use Core Web Vitals with our privacy-centric Web Analytics product and collect anonymized data on how end-users experience the websites that enable this feature.

One of the metrics we can calculate using these signals is the page load time. Our theory is that if a page includes scripts coming from external sites (for example, Facebook “like” buttons, comments, ads), and they are unreachable, its total load time gets affected.

We used a list of about 400 domains that we know embed Facebook scripts in their pages and looked at the data.

Now let’s look at the Largest Contentful Paint. LCP marks the point in the page load timeline when the page’s main content has likely loaded. The faster the LCP is, the better the end-user experience.

Again, the page load experience got visibly degraded.

The outcome seems clear. The sites that use Facebook scripts in their pages took 1.5x more time to load their pages during the outage, with some of them taking more than 2x the usual time. Facebook’s outage dragged the performance of some other sites down.

Conclusion

When Facebook, Instagram, and WhatsApp went down, the Web felt it. Some websites got slower or lost traffic, other services and platforms got unexpected load, and people lost the ability to communicate or do business normally.

Hypotheses About What Happened to Facebook

2021-10-06 Bozho

Post Syndicated from Bozho original https://techblog.bozho.net/hypotheses-about-what-happened-to-facebook/

Facebook was down. I’d recommend reading Cloudflare’s summary. Then I recommend reading Facebook’s own account on the incident. But let me expand on that. Facebook published announcements and withdrawals for certain BGP prefixes which lead to removing its DNS servers from “the map of the internet” – they told everyone “the part of our network where our DNS servers are doesn’t exist”. That was the result of a backbone self-inflicted failure due to a bug in the auditing tool that checks whether the commands executed aren’t doing harmful things.

Facebook owns a lot of IPs. According to RIPEstat they are part of 399 prefixes (147 of them are IPv4). The DNS servers are located in two of those 399. Facebook uses a.ns.facebook.com, b.ns.facebook.com, c.ns.facebook.com and d.ns.facebook.com, which get queries whenever someone wants to know the IPs of Facebook-owned domains. These four nameservers are served by the same Autonomous System from just two prefixes – 129.134.30.0/23 and 185.89.218.0/23. Of course “4 nameservers” is a logical construct, there are probably many actual servers behind that (using anycast).

I wrote a simple “script” to fetch all the withdrawals and announcements for all Facebook-owned prefixes (from the great API of RIPEstats). Facebook didn’t remove itself from the map entirely. As CloudFlare points out, it was just some prefixes that are affected. It can be just these two, or a few others as well, but it seems that just a handful were affected. If we sort the resulting CSV from the above script by withdrawals, we’ll notice that 129.134.30.0/23 and 185.89.218.0/23 are the pretty high up (alongside 185.89 and 123.134 with a /24, which are all included in the /23). Now that perfectly matches Facebook’s account that their nameservers automatically withdraw themselves if they fail to connect to other parts of the infrastructure. Everything may have also been down, but the logic for withdrawal is present only in the networks that have nameservers in them.

So first, let me make three general observations that are not as obvious and as universal as they may sound, but they are worth discussing:

Use longer DNS TTLs if possible – if Facebook had 6 hour TTL on its domains, we may have not figured out that their name servers are down. This is hard to ask for such a complex service that uses DNS for load-balancing and geographical distribution, but it’s worth considering. Also, if they killed their backbone and their entire infrastructure was down anyway, the DNS TTL would not have solved the issue. But
We need improved caching logic for DNS. It can’t be just “present or not”; DNS caches may keep “last known good state” in case of SERVFAIL and fallback to that. All of those DNS resolvers that had to ask the authoritative nameserver “where can I find facebook.com” knew where to find facebook.com just a minute ago. Then they got a failure and suddenly they are wondering “oh, where could Facebook be?”. It’s not that simple, of course, but such cache improvement is worth considering. And again, if their entire infrastructure was down, this would not have helped.
Consider having an authoritative nameserver outside your main AS. If something bad happens to your AS routes (regardless of the reason), you may still have DNS working. That may have downsides – generally, it will be hard to manage and sync your DNS infrastructure. But at least having a spare set of nameservers and the option to quickly point glue records there is worth considering. It would not have saved Facebook in this case, as again, they claim the entire infrastructure was inaccessible due to a “broken” backbone.
Have a 100% test coverage on critical tools, such as the auditing tool that had a bug. 100% test coverage is rarely achievable in any project, but in such critical tools it’s a must.

The main explanation is the accidental outage. This is what Facebook engineers explain in the blogpost and other accounts, and that’s what seems to have happened. However, there are alternative hypotheses floating around, so let me briefly discuss all of the options.

Accidental outage due to misconfiguration – a very likely scenario. These things may happen to everyone and Facebook is known for it “break things” mentality, so it’s not unlikely that they just didn’t have the right safeguards in place and that someone ran a buggy update. The scenarios why and how that may have happened are many, and we can’t know from the outside (even after Facebook’s brief description). This remains the primary explanation, following my favorite Hanlon’s razor. A bug in the audit tool is absolutely realistic (btw, I’d love Facebook to publish their internal tools).
Cyber attack – It cannot be known by the data we have, but this would be a sophisticated attack that gained access to their BGP administration interface, which I would assume is properly protected. Not impossible, but a 6-hour outage of a social network is not something a sophisticated actor (e.g. a nation state) would invest resources in. We can’t rule it out, as this might be “just a drill” for something bigger to follow. If I were an attacker that wanted to take Facebook down, I’d try to kill their DNS servers, or indeed, “de-route” them. If we didn’t know that Facebook lets its DNS servers cut themselves from the network in case of failures, the fact that so few prefixes were updated might be in indicator of targeted attack, but this seems less and less likely.
Deliberate self-sabotage – 1.5 billion records are claimed to be leaked yesterday. At the same time, a Facebook whistleblower is testifying in the US congress. Both of these news are potentially damaging to Facebook reputation and shares. If they wanted to drown the news and the respective share price plunge in a technical story that few people understand but everyone is talking about (and then have their share price rebound, because technical issues happen to everyone), then that’s the way to do it – just as a malicious actor would do, but without all the hassle to gain access from outside – de-route the prefixes for the DNS servers and you have a “perfect” outage. These coincidences have lead people to assume such a plot, but from the observed outage and the explanation given by Facebook on why the DNS prefixes have been automatically withdrawn, this sounds unlikely.

Distinguishing between the three options is actually hard. You can mask a deliberate outage as an accident, a malicious actor can make it look like a deliberate self-sabotage. That’s why there are speculations. To me, however, by all of the data we have in RIPEStat and the various accounts by CloudFlare, Facebook and other experts, it seems that a chain of mistakes (operational and possibly design ones) lead to this.

The post Hypotheses About What Happened to Facebook appeared first on Bozho's tech blog.

Facebook Is Down

2021-10-05 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/10/facebook-is-down.html

Facebook — along with Instagram and WhatsApp — went down globally today. Basically, someone deleted their BGP records, which made their DNS fall apart.

…at approximately 11:39 a.m. ET today (15:39 UTC), someone at Facebook caused an update to be made to the company’s Border Gateway Protocol (BGP) records. BGP is a mechanism by which Internet service providers of the world share information about which providers are responsible for routing Internet traffic to which specific groups of Internet addresses.

In simpler terms, sometime this morning Facebook took away the map telling the world’s computers how to find its various online properties. As a result, when one types Facebook.com into a web browser, the browser has no idea where to find Facebook.com, and so returns an error page.

In addition to stranding billions of users, the Facebook outage also has stranded its employees from communicating with one another using their internal Facebook tools. That’s because Facebook’s email and tools are all managed in house and via the same domains that are now stranded.

What I heard is that none of the employee keycards work, since they have to ping a now-unreachable server. So people can’t get into buildings and offices.

And every third-party site that relies on “log in with Facebook” is stuck as well.

The fix won’t be quick:

As a former network admin who worked on the internet at this level, I anticipate Facebook will be down for hours more. I suspect it will end up being Facebook’s longest and most severe failure to date before it’s fixed.

We all know the security risks of monocultures.

EDITED TO ADD (10/6): Good explanation of what happened. Shorter from Jonathan Zittrain: “Facebook basically locked its keys in the car.”

Changes in WhatsApp’s Privacy Policy

2021-01-11 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/01/changes-in-whatsapps-privacy-policy.html

If you’re a WhatsApp user, pay attention to the changes in the privacy policy that you’re being forced to agree with.

In 2016, WhatsApp gave users a one-time ability to opt out of having account data turned over to Facebook. Now, an updated privacy policy is changing that. Come next month, users will no longer have that choice. Some of the data that WhatsApp collects includes:

User phone numbers

Other people’s phone numbers stored in address books

Profile names

Profile pictures and

Status message including when a user was last online

Diagnostic data collected from app logs

Under the new terms, Facebook reserves the right to share collected data with its family of companies.

EDITED TO ADD (1/13): WhatsApp tries to explain.

Manipulating Systems Using Remote Lasers

2020-12-01 Bruce Schneier

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/12/manipulating-systems-using-remote-lasers.html

Many systems are vulnerable:

Researchers at the time said that they were able to launch inaudible commands by shining lasers — from as far as 360 feet — at the microphones on various popular voice assistants, including Amazon Alexa, Apple Siri, Facebook Portal, and Google Assistant.

[…]

They broadened their research to show how light can be used to manipulate a wider range of digital assistants — including Amazon Echo 3 — but also sensing systems found in medical devices, autonomous vehicles, industrial systems and even space systems.

The researchers also delved into how the ecosystem of devices connected to voice-activated assistants — such as smart-locks, home switches and even cars — also fail under common security vulnerabilities that can make these attacks even more dangerous. The paper shows how using a digital assistant as the gateway can allow attackers to take control of other devices in the home: Once an attacker takes control of a digital assistant, he or she can have the run of any device connected to it that also responds to voice commands. Indeed, these attacks can get even more interesting if these devices are connected to other aspects of the smart home, such as smart door locks, garage doors, computers and even people’s cars, they said.

Another article. The researchers will present their findings at Black Hat Europe — which, of course, will be happening virtually — on December 10.

Noise