Tag Archives: Privacy

Google Responds to Warrants for “About” Searches

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/10/google-responds-to-warrants-for-about-searches.html

One of the things we learned from the Snowden documents is that the NSA conducts “about” searches. That is, searches based on activities and not identifiers. A normal search would be on a name, or IP address, or phone number. An about search would something like “show me anyone that has used this particular name in a communications,” or “show me anyone who was at this particular location within this time frame.” These searches are legal when conducted for the purpose of foreign surveillance, but the worry about using them domestically is that they are unconstitutionally broad. After all, the only way to know who said a particular name is to know what everyone said, and the only way to know who was at a particular location is to know where everyone was. The very nature of these searches requires mass surveillance.

The FBI does not conduct mass surveillance. But many US corporations do, as a normal part of their business model. And the FBI uses that surveillance infrastructure to conduct its own about searches. Here’s an arson case where the FBI asked Google who searched for a particular street address:

Homeland Security special agent Sylvette Reynoso testified that her team began by asking Google to produce a list of public IP addresses used to google the home of the victim in the run-up to the arson. The Chocolate Factory [Google] complied with the warrant, and gave the investigators the list. As Reynoso put it:

On June 15, 2020, the Honorable Ramon E. Reyes, Jr., United States Magistrate Judge for the Eastern District of New York, authorized a search warrant to Google for users who had searched the address of the Residence close in time to the arson.

The records indicated two IPv6 addresses had been used to search for the address three times: one the day before the SUV was set on fire, and the other two about an hour before the attack. The IPv6 addresses were traced to Verizon Wireless, which told the investigators that the addresses were in use by an account belonging to Williams.

Google’s response is that this is rare:

While word of these sort of requests for the identities of people making specific searches will raise the eyebrows of privacy-conscious users, Google told The Register the warrants are a very rare occurrence, and its team fights overly broad or vague requests.

“We vigorously protect the privacy of our users while supporting the important work of law enforcement,” Google’s director of law enforcement and information security Richard Salgado told us. “We require a warrant and push to narrow the scope of these particular demands when overly broad, including by objecting in court when appropriate.

“These data demands represent less than one per cent of total warrants and a small fraction of the overall legal demands for user data that we currently receive.”

Here’s another example of what seems to be about data leading to a false arrest.

According to the lawsuit, police investigating the murder knew months before they arrested Molina that the location data obtained from Google often showed him in two places at once, and that he was not the only person who drove the Honda registered under his name.

Avondale police knew almost two months before they arrested Molina that another man ­ his stepfather ­ sometimes drove Molina’s white Honda. On October 25, 2018, police obtained records showing that Molina’s Honda had been impounded earlier that year after Molina’s stepfather was caught driving the car without a license.

Data obtained by Avondale police from Google did show that a device logged into Molina’s Google account was in the area at the time of Knight’s murder. Yet on a different date, the location data from Google also showed that Molina was at a retirement community in Scottsdale (where his mother worked) while debit card records showed that Molina had made a purchase at a Walmart across town at the exact same time.

Molina’s attorneys argue that this and other instances like it should have made it clear to Avondale police that Google’s account-location data is not always reliable in determining the actual location of a person.

“About” searches might be rare, but that doesn’t make them a good idea. We have knowingly and willingly built the architecture of a police state, just so companies can show us ads. (And it is increasingly apparent that the advertising-supported Internet is heading for a crash.)

Free, Privacy-First Analytics for a Better Web

Post Syndicated from Jon Levine original https://blog.cloudflare.com/free-privacy-first-analytics-for-a-better-web/

Free, Privacy-First Analytics for a Better Web

Everyone with a website needs to know some basic facts about their website: what pages are people visiting? Where in the world are they? What other sites sent traffic to my website?

There are “free” analytics tools out there, but they come at a cost: not money, but your users’ privacy. Today we’re announcing a brand new, privacy-first analytics service that’s open to everyone — even if they’re not already a Cloudflare customer. And if you’re a Cloudflare customer, we’ve enhanced our analytics to make them even more powerful than before.

The most important analytics feature: Privacy

The most popular analytics services available were built to help ad-supported sites sell more ads. But, a lot of websites don’t have ads. So if you use those services, you’re giving up the privacy of your users in order to understand how what you’ve put online is performing.

Cloudflare’s business has never been built around tracking users or selling advertising. We don’t want to know what you do on the Internet — it’s not our business. So we wanted to build an analytics service that gets back to what really matters for web creators, not necessarily marketers, and to give web creators the information they need in a simple, clean way that doesn’t sacrifice their visitors’ privacy. And giving web creators these analytics shouldn’t depend on their use of Cloudflare’s infrastructure for performance and security. (More on that in a bit.)

What does it mean for us to make our analytics “privacy-first”? Most importantly, it means we don’t need to track individual users over time for the purposes of serving analytics. We don’t use any client-side state, like cookies or localStorage, for the purposes of tracking users. And we don’t “fingerprint” individuals via their IP address, User Agent string, or any other data for the purpose of displaying analytics. (We consider fingerprinting even more intrusive than cookies, because users have no way to opt out.)

Counting visits without tracking users

One of the most essential stats about any website is: “how many people went there”? Analytics tools frequently show counts of “unique” visitors, which requires tracking individual users by a cookie or IP address.

We use the concept of a visit: a privacy-friendly measure of how people have interacted with your website. A visit is defined simply as a successful page view that has an HTTP referer that doesn’t match the hostname of the request. This tells you how many times people came to your website and clicked around before navigating away, but doesn’t require tracking individuals.

Free, Privacy-First Analytics for a Better Web

A visit has slightly different semantics from a “unique”, and you should expect this number to differ from other analytics tools.

All of the details, none of the bots

Our analytics deliver the most important metrics about your website, like page views and visits. But we know that an essential analytics feature is flexibility: the ability to add arbitrary filters, and slice-and-dice data as you see fit. Our analytics can show you the top hostnames, URLs, countries, and other critical metrics like status codes. You can filter on any of these metrics with a click and see the whole dashboard update.

I’m especially excited about two features in our time series charts: the ability to drag-to-zoom into a narrower time range, and the ability to “group by” different dimensions to see data in a different way. This is a super powerful way to drill into an anomaly in traffic and quickly see what’s going on. For example, you might notice a spike in traffic, zoom into that spike, and then try different groupings to see what contributed the extra clicks. A GIF is worth a thousand words:

And for customers of our Bot Management product, we’re working on the ability to detect (and remove) automated traffic. Coming very soon, you’ll be able to see which bots are reaching your website — with just a click, block them by using Firewall Rules.

This is all possible thanks to our ABR analytics technology, which enables us to serve analytics very quickly for websites large and small. Check out our blog post to learn more about how this works.

Edge or Browser analytics? Why not both?

There are two ways to collect web analytics data: at the edge (or on an origin server), or in the client using a JavaScript beacon.

Historically, Cloudflare has collected analytics data at our edge. This has some nice benefits over traditional, client-side analytics approaches:

  • It’s more accurate because you don’t miss users who block third-party scripts, or JavaScript altogether
  • You can see all of the traffic back to your origin server, even if an HTML page doesn’t load
  • We can detect (and block bots), apply Firewall rules, and generally scrub traffic of unwanted noise
  • You can measure the performance of your origin server

More commonly, most web analytics providers use client-side measurement. This has some benefits as well:

  • You can understand performance as your users see it — e.g. how long did the page actually take to render
  • You can detect errors in client-side JavaScript execution
  • You can define custom event types emitted by JavaScript frameworks

Ultimately, we want our customers to have the best of both worlds. We think it’s really powerful to get web traffic numbers directly from the edge. We also launched Browser Insights a year ago to augment our existing edge analytics with more performance information, and today Browser Insights are taking a big step forward by incorporating Web Vitals metrics.

But, we know not everyone can modify their DNS to take advantage of Cloudflare’s edge services. That’s why today we’re announcing a free, standalone analytics product for everyone.

How do I get it?

For existing Cloudflare customers on our Pro, Biz, and Enterprise plans, just go to your Analytics tab! Starting today, you’ll see a banner to opt-in to the new analytics experience. (We plan to make this the default in a few weeks.)

But when building privacy-first analytics, we realized it’s important to make this accessible even to folks who don’t use Cloudflare today. You’ll be able to use Cloudflare’s web analytics even if you can’t change your DNS servers — just add our JavaScript, and you’re good to go.

We’re still putting on the finishing touches on our JavaScript-based analytics, but you can sign up here and we’ll let you know when it’s ready.

The evolution of analytics at Cloudflare

Just over a year ago, Cloudflare’s analytics consisted of a simple set of metrics: cached vs uncached data transfer, or how many requests were blocked by the Firewall. Today we provide flexible, powerful analytics across all our products, including Firewall, Cache, Load Balancing and Network traffic.

While we’ve been focused on building analytics about our products, we realized that our analytics are also powerful as a standalone product. Today is just the first step on that journey. We have so much more planned: from real-time analytics, to ever-more performance analysis, and even allowing customers to add custom events.

We want to hear what you want most out of analytics — drop a note in the comments to let us know what you want to see next.

On Executive Order 12333

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/09/on-executive-order-12333.html

Mark Jaycox has written a long article on the US Executive Order 12333: “No Oversight, No Limits, No Worries: A Primer on Presidential Spying and Executive Order 12,333“:

Abstract: Executive Order 12,333 (“EO 12333”) is a 1980s Executive Order signed by President Ronald Reagan that, among other things, establishes an overarching policy framework for the Executive Branch’s spying powers. Although electronic surveillance programs authorized by EO 12333 generally target foreign intelligence from foreign targets, its permissive targeting standards allow for the substantial collection of Americans’ communications containing little to no foreign intelligence value. This fact alone necessitates closer inspection.

This working draft conducts such an inspection by collecting and coalescing the various declassifications, disclosures, legislative investigations, and news reports concerning EO 12333 electronic surveillance programs in order to provide a better understanding of how the Executive Branch implements the order and the surveillance programs it authorizes. The Article pays particular attention to EO 12333’s designation of the National Security Agency as primarily responsible for conducting signals intelligence, which includes the installation of malware, the analysis of internet traffic traversing the telecommunications backbone, the hacking of U.S.-based companies like Yahoo and Google, and the analysis of Americans’ communications, contact lists, text messages, geolocation data, and other information.

After exploring the electronic surveillance programs authorized by EO 12333, this Article proposes reforms to the existing policy framework, including narrowing the aperture of authorized surveillance, increasing privacy standards for the retention of data, and requiring greater transparency and accountability.

Cory Doctorow on The Age of Surveillance Capitalism

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/08/cory_doctorow_o_2.html

Cory Doctorow has writtten an extended rebuttal of The Age of Surveillance Capitalism by Shoshana Zuboff. He summarized the argument on Twitter.

Shorter summary: it’s not the surveillance part, it’s the fact that these companies are monopolies.

I think it’s both. Surveillance capitalism has some unique properties that make it particularly unethical and incompatible with a free society, and Zuboff makes them clear in her book. But the current acceptance of monopolies in our society is also extremely damaging — which Doctorow makes clear.

Identifying People by Their Browsing Histories

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/08/identifying_peo_9.html

Interesting paper: “Replication: Why We Still Can’t Browse in Peace: On the Uniqueness and Reidentifiability of Web Browsing Histories“:

We examine the threat to individuals’ privacy based on the feasibility of reidentifying users through distinctive profiles of their browsing history visible to websites and third parties. This work replicates and extends the 2012 paper Why Johnny Can’t Browse in Peace: On the Uniqueness of Web Browsing History Patterns[48]. The original work demonstrated that browsing profiles are highly distinctive and stable.We reproduce those results and extend the original work to detail the privacy risk posed by the aggregation of browsing histories. Our dataset consists of two weeks of browsing data from ~52,000 Firefox users. Our work replicates the original paper’s core findings by identifying 48,919 distinct browsing profiles, of which 99% are unique. High uniqueness hold seven when histories are truncated to just 100 top sites. Wethen find that for users who visited 50 or more distinct do-mains in the two-week data collection period, ~50% can be reidentified using the top 10k sites. Reidentifiability rose to over 80% for users that browsed 150 or more distinct domains.Finally, we observe numerous third parties pervasive enough to gather web histories sufficient to leverage browsing history as an identifier.

One of the authors of the original study comments on the replication.