Tag Archives: artificialintelligence

Fabricated Voice Used in Financial Fraud

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/09/fabricated_voic.html

This seems to be an identity theft first:

Criminals used artificial intelligence-based software to impersonate a chief executive’s voice and demand a fraudulent transfer of €220,000 ($243,000) in March in what cybercrime experts described as an unusual case of artificial intelligence being used in hacking.

Another news article.

AI Emotion-Detection Arms Race

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/08/ai_emotion-dete.html

Voice systems are increasingly using AI techniques to determine emotion. A new paper describes an AI-based countermeasure to mask emotion in spoken words.

Their method for masking emotion involves collecting speech, analyzing it, and extracting emotional features from the raw signal. Next, an AI program trains on this signal and replaces the emotional indicators in speech, flattening them. Finally, a voice synthesizer re-generates the normalized speech using the AIs outputs, which gets sent to the cloud. The researchers say that this method reduced emotional identification by 96 percent in an experiment, although speech recognition accuracy decreased, with a word error rate of 35 percent.

Academic paper.

More on Backdooring (or Not) WhatsApp

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/08/more_on_backdoo.html

Yesterday, I blogged about a Facebook plan to backdoor WhatsApp by adding client-side scanning and filtering. It seems that I was wrong, and there are no such plans.

The only source for that post was a Forbes essay by Kalev Leetaru, which links to a previous Forbes essay by him, which links to a video presentation from a Facebook developers conference.

Leetaru extrapolated a lot out of very little. I watched the video (the relevant section is at the 23:00 mark), and it doesn’t talk about client-side scanning of messages. It doesn’t talk about messaging apps at all. It discusses using AI techniques to find bad content on Facebook, and the difficulties that arise from dynamic content:

So far, we have been keeping this fight [against bad actors and harmful content] on familiar grounds. And that is, we have been training our AI models on the server and making inferences on the server when all the data are flooding into our data centers.

While this works for most scenarios, it is not the ideal setup for some unique integrity challenges. URL masking is one such problem which is very hard to do. We have the traditional way of server-side inference. What is URL masking? Let us imagine that a user sees a link on the app and decides to click on it. When they click on it, Facebook actually logs the URL to crawl it at a later date. But…the publisher can dynamically change the content of the webpage to make it look more legitimate [to Facebook]. But then our users click on the same link, they see something completely different — oftentimes it is disturbing; oftentimes it violates our policy standards. Of course, this creates a bad experience for our community that we would like to avoid. This and similar integrity problems are best solved with AI on the device.

That might be true, but it also would hand whatever secret-AI sauce Facebook has to every one of its users to reverse engineer — which means it’s probably not going to happen. And it is a dumb idea, for reasons Steve Bellovin has pointed out.

Facebook’s first published response was a comment on the Hacker News website from a user named “wcathcart,” which Cardozo assures me is Will Cathcart, the vice president of WhatsApp. (I have no reason to doubt his identity, but surely there is a more official news channel that Facebook could have chosen to use if they wanted to.) Cathcart wrote:

We haven’t added a backdoor to WhatsApp. The Forbes contributor referred to a technical talk about client side AI in general to conclude that we might do client side scanning of content on WhatsApp for anti-abuse purposes.

To be crystal clear, we have not done this, have zero plans to do so, and if we ever did it would be quite obvious and detectable that we had done it. We understand the serious concerns this type of approach would raise which is why we are opposed to it.

Facebook’s second published response was a comment on my original blog post, which has been confirmed to me by the WhatsApp people as authentic. It’s more of the same.

So, this was a false alarm. And, to be fair, Alec Muffet called foul on the first Forbes piece:

So, here’s my pre-emptive finger wag: Civil Society’s pack mentality can make us our own worst enemies. If we go around repeating one man’s Germanic conspiracy theory, we may doom ourselves to precisely what we fear. Instead, we should ­ we must ­ take steps to constructively demand what we actually want: End to End Encryption which is worthy of the name.

Blame accepted. But in general, this is the sort of thing we need to watch for. End-to-end encryption only secures data in transit. The data has to be in the clear on the device where it is created, and it has to be in the clear on the device where it is consumed. Those are the obvious places for an eavesdropper to get a copy.

This has been a long process. Facebook desperately wanted to convince me to correct the record, while at the same time not wanting to write something on their own letterhead (just a couple of comments, so far). I spoke at length with Privacy Policy Manager Nate Cardozo, whom Facebook hired last December from EFF. (Back then, I remember thinking of him — and the two other new privacy hires — as basically human warrant canaries. If they ever leave Facebook under non-obvious circumstances, we know that things are bad.) He basically leveraged his historical reputation to assure me that WhatsApp, and Facebook in general, would never do something like this. I am trusting him, while also reminding everyone that Facebook has broken so many privacy promises that they really can’t be trusted.

Final note: If they want to be trusted, Adam Shostack and I gave them a road map.

Hacker News thread.

EDITED TO ADD (8/4): Slashdot covered my retraction.

Data, Surveillance, and the AI Arms Race

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/06/data_surveillan.html

According to foreign policy experts and the defense establishment, the United States is caught in an artificial intelligence arms race with China — one with serious implications for national security. The conventional version of this story suggests that the United States is at a disadvantage because of self-imposed restraints on the collection of data and the privacy of its citizens, while China, an unrestrained surveillance state, is at an advantage. In this vision, the data that China collects will be fed into its systems, leading to more powerful AI with capabilities we can only imagine today. Since Western countries can’t or won’t reap such a comprehensive harvest of data from their citizens, China will win the AI arms race and dominate the next century.

This idea makes for a compelling narrative, especially for those trying to justify surveillance — whether government- or corporate-run. But it ignores some fundamental realities about how AI works and how AI research is conducted.

Thanks to advances in machine learning, AI has flipped from theoretical to practical in recent years, and successes dominate public understanding of how it works. Machine learning systems can now diagnose pneumonia from X-rays, play the games of go and poker, and read human lips, all better than humans. They’re increasingly watching surveillance video. They are at the core of self-driving car technology and are playing roles in both intelligence-gathering and military operations. These systems monitor our networks to detect intrusions and look for spam and malware in our email.

And it’s true that there are differences in the way each country collects data. The United States pioneered “surveillance capitalism,” to use the Harvard University professor Shoshana Zuboff’s term, where data about the population is collected by hundreds of large and small companies for corporate advantage — and mutually shared or sold for profit The state picks up on that data, in cases such as the Centers for Disease Control and Prevention’s use of Google search data to map epidemics and evidence shared by alleged criminals on Facebook, but it isn’t the primary user.

China, on the other hand, is far more centralized. Internet companies collect the same sort of data, but it is shared with the government, combined with government-collected data, and used for social control. Every Chinese citizen has a national ID number that is demanded by most services and allows data to easily be tied together. In the western region of Xinjiang, ubiquitous surveillance is used to oppress the Uighur ethnic minority — although at this point there is still a lot of human labor making it all work. Everyone expects that this is a test bed for the entire country.

Data is increasingly becoming a part of control for the Chinese government. While many of these plans are aspirational at the moment — there isn’t, as some have claimed, a single “social credit score,” but instead future plans to link up a wide variety of systems — data collection is universally pushed as essential to the future of Chinese AI. One executive at search firm Baidu predicted that the country’s connected population will provide them with the raw data necessary to become the world’s preeminent tech power. China’s official goal is to become the world AI leader by 2030, aided in part by all of this massive data collection and correlation.

This all sounds impressive, but turning massive databases into AI capabilities doesn’t match technological reality. Current machine learning techniques aren’t all that sophisticated. All modern AI systems follow the same basic methods. Using lots of computing power, different machine learning models are tried, altered, and tried again. These systems use a large amount of data (the training set) and an evaluation function to distinguish between those models and variations that work well and those that work less well. After trying a lot of models and variations, the system picks the one that works best. This iterative improvement continues even after the system has been fielded and is in use.

So, for example, a deep learning system trying to do facial recognition will have multiple layers (hence the notion of “deep”) trying to do different parts of the facial recognition task. One layer will try to find features in the raw data of a picture that will help find a face, such as changes in color that will indicate an edge. The next layer might try to combine these lower layers into features like shapes, looking for round shapes inside of ovals that indicate eyes on a face. The different layers will try different features and will be compared by the evaluation function until the one that is able to give the best results is found, in a process that is only slightly more refined than trial and error.

Large data sets are essential to making this work, but that doesn’t mean that more data is automatically better or that the system with the most data is automatically the best system. Train a facial recognition algorithm on a set that contains only faces of white men, and the algorithm will have trouble with any other kind of face. Use an evaluation function that is based on historical decisions, and any past bias is learned by the algorithm. For example, mortgage loan algorithms trained on historic decisions of human loan officers have been found to implement redlining. Similarly, hiring algorithms trained on historical data manifest the same sexism as human staff often have. Scientists are constantly learning about how to train machine learning systems, and while throwing a large amount of data and computing power at the problem can work, more subtle techniques are often more successful. All data isn’t created equal, and for effective machine learning, data has to be both relevant and diverse in the right ways.

Future research advances in machine learning are focused on two areas. The first is in enhancing how these systems distinguish between variations of an algorithm. As different versions of an algorithm are run over the training data, there needs to be some way of deciding which version is “better.” These evaluation functions need to balance the recognition of an improvement with not over-fitting to the particular training data. Getting functions that can automatically and accurately distinguish between two algorithms based on minor differences in the outputs is an art form that no amount of data can improve.

The second is in the machine learning algorithms themselves. While much of machine learning depends on trying different variations of an algorithm on large amounts of data to see which is most successful, the initial formulation of the algorithm is still vitally important. The way the algorithms interact, the types of variations attempted, and the mechanisms used to test and redirect the algorithms are all areas of active research. (An overview of some of this work can be found here; even trying to limit the research to 20 papers oversimplifies the work being done in the field.) None of these problems can be solved by throwing more data at the problem.

The British AI company DeepMind’s success in teaching a computer to play the Chinese board game go is illustrative. Its AlphaGo computer program became a grandmaster in two steps. First, it was fed some enormous number of human-played games. Then, the game played itself an enormous number of times, improving its own play along the way. In 2016, AlphaGo beat the grandmaster Lee Sedol four games to one.

While the training data in this case, the human-played games, was valuable, even more important was the machine learning algorithm used and the function that evaluated the relative merits of different game positions. Just one year later, DeepMind was back with a follow-on system: AlphaZero. This go-playing computer dispensed entirely with the human-played games and just learned by playing against itself over and over again. It plays like an alien. (It also became a grandmaster in chess and shogi.)

These are abstract games, so it makes sense that a more abstract training process works well. But even something as visceral as facial recognition needs more than just a huge database of identified faces in order to work successfully. It needs the ability to separate a face from the background in a two-dimensional photo or video and to recognize the same face in spite of changes in angle, lighting, or shadows. Just adding more data may help, but not nearly as much as added research into what to do with the data once we have it.

Meanwhile, foreign-policy and defense experts are talking about AI as if it were the next nuclear arms race, with the country that figures it out best or first becoming the dominant superpower for the next century. But that didn’t happen with nuclear weapons, despite research only being conducted by governments and in secret. It certainly won’t happen with AI, no matter how much data different nations or companies scoop up.

It is true that China is investing a lot of money into artificial intelligence research: The Chinese government believes this will allow it to leapfrog other countries (and companies in those countries) and become a major force in this new and transformative area of computing — and it may be right. On the other hand, much of this seems to be a wasteful boondoggle. Slapping “AI” on pretty much anything is how to get funding. The Chinese Ministry of Education, for instance, promises to produce “50 world-class AI textbooks,” with no explanation of what that means.

In the democratic world, the government is neither the leading researcher nor the leading consumer of AI technologies. AI research is much more decentralized and academic, and it is conducted primarily in the public eye. Research teams keep their training data and models proprietary but freely publish their machine learning algorithms. If you wanted to work on machine learning right now, you could download Microsoft’s Cognitive Toolkit, Google’s Tensorflow, or Facebook’s Pytorch. These aren’t toy systems; these are the state-of-the art machine learning platforms.

AI is not analogous to the big science projects of the previous century that brought us the atom bomb and the moon landing. AI is a science that can be conducted by many different groups with a variety of different resources, making it closer to computer design than the space race or nuclear competition. It doesn’t take a massive government-funded lab for AI research, nor the secrecy of the Manhattan Project. The research conducted in the open science literature will trump research done in secret because of the benefits of collaboration and the free exchange of ideas.

While the United States should certainly increase funding for AI research, it should continue to treat it as an open scientific endeavor. Surveillance is not justified by the needs of machine learning, and real progress in AI doesn’t need it.

This essay was written with Jim Waldo, and previously appeared in Foreign Policy.

Hacking Instagram to Get Free Meals in Exchange for Positive Reviews

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/04/hacking_instagr.html

This is a fascinating hack:

In today’s digital age, a large Instagram audience is considered a valuable currency. I had also heard through the grapevine that I could monetize a large following — or in my desired case — use it to have my meals paid for. So I did just that.

I created an Instagram page that showcased pictures of New York City’s skylines, iconic spots, elegant skyscrapers ­– you name it. The page has amassed a following of over 25,000 users in the NYC area and it’s still rapidly growing.

I reach out restaurants in the area either via Instagram’s direct messaging or email and offer to post a positive review in return for a free entree or at least a discount. Almost every restaurant I’ve messaged came back at me with a compensated meal or a gift card. Most places have an allocated marketing budget for these types of things so they were happy to offer me a free dining experience in exchange for a promotion. I’ve ended up giving some of these meals away to my friends and family because at times I had too many queued up to use myself.

The beauty of this all is that I automated the whole thing. And I mean 100% of it. I wrote code that finds these pictures or videos, makes a caption, adds hashtags, credits where the picture or video comes from, weeds out bad or spammy posts, posts them, follows and unfollows users, likes pictures, monitors my inbox, and most importantly — both direct messages and emails restaurants about a potential promotion. Since its inception, I haven’t even really logged into the account. I spend zero time on it. It’s essentially a robot that operates like a human, but the average viewer can’t tell the difference. And as the programmer, I get to sit back and admire its (and my) work.

So much going on in this project.

Detecting Shoplifting Behavior

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/03/detecting_shopl.html

This system claims to detect suspicious behavior that indicates shoplifting:

Vaak, a Japanese startup, has developed artificial intelligence software that hunts for potential shoplifters, using footage from security cameras for fidgeting, restlessness and other potentially suspicious body language.

The article has no detail or analysis, so we don’t know how well it works. But this kind of thing is surely the future of video surveillance.

China’s AI Strategy and its Security Implications

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/02/chinas_ai_strat.html

Gregory C. Allen at the Center for a New American Security has a new report with some interesting analysis and insights into China’s AI strategy, commercial, government, and military. There are numerous security — and national security — implications.

Machine Learning to Detect Software Vulnerabilities

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/01/machine_learnin.html

No one doubts that artificial intelligence (AI) and machine learning (ML) will transform cybersecurity. We just don’t know how, or when. While the literature generally focuses on the different uses of AI by attackers and defenders ­ and the resultant arms race between the two ­ I want to talk about software vulnerabilities.

All software contains bugs. The reason is basically economic: The market doesn’t want to pay for quality software. With a few exceptions, such as the space shuttle, the market prioritizes fast and cheap over good. The result is that any large modern software package contains hundreds or thousands of bugs.

Some percentage of bugs are also vulnerabilities, and a percentage of those are exploitable vulnerabilities, meaning an attacker who knows about them can attack the underlying system in some way. And some percentage of those are discovered and used. This is why your computer and smartphone software is constantly being patched; software vendors are fixing bugs that are also vulnerabilities that have been discovered and are being used.

Everything would be better if software vendors found and fixed all bugs during the design and development process, but, as I said, the market doesn’t reward that kind of delay and expense. AI, and machine learning in particular, has the potential to forever change this trade-off.

The problem of finding software vulnerabilities seems well-suited for ML systems. Going through code line by line is just the sort of tedious problem that computers excel at, if we can only teach them what a vulnerability looks like. There are challenges with that, of course, but there is already a healthy amount of academic literature on the topic — and research is continuing. There’s every reason to expect ML systems to get better at this as time goes on, and some reason to expect them to eventually become very good at it.

Finding vulnerabilities can benefit both attackers and defenders, but it’s not a fair fight. When an attacker’s ML system finds a vulnerability in software, the attacker can use it to compromise systems. When a defender’s ML system finds the same vulnerability, he or she can try to patch the system or program network defenses to watch for and block code that tries to exploit it.

But when the same system is in the hands of a software developer who uses it to find the vulnerability before the software is ever released, the developer fixes it so it can never be used in the first place. The ML system will probably be part of his or her software design tools and will automatically find and fix vulnerabilities while the code is still in development.

Fast-forward a decade or so into the future. We might say to each other, “Remember those years when software vulnerabilities were a thing, before ML vulnerability finders were built into every compiler and fixed them before the software was ever released? Wow, those were crazy years.” Not only is this future possible, but I would bet on it.

Getting from here to there will be a dangerous ride, though. Those vulnerability finders will first be unleashed on existing software, giving attackers hundreds if not thousands of vulnerabilities to exploit in real-world attacks. Sure, defenders can use the same systems, but many of today’s Internet of Things systems have no engineering teams to write patches and no ability to download and install patches. The result will be hundreds of vulnerabilities that attackers can find and use.

But if we look far enough into the horizon, we can see a future where software vulnerabilities are a thing of the past. Then we’ll just have to worry about whatever new and more advanced attack techniques those AI systems come up with.

This essay previously appeared on SecurityIntelligence.com.

DARPA Funding in AI-Assisted Cybersecurity

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/04/darpa_funding_i.html

DARPA is launching a program aimed at vulnerability discovery via human-assisted AI. The new DARPA program is called CHESS (Computers and Humans Exploring Software Security), and they’re holding a proposers day in a week and a half.

This is the kind of thing that can dramatically change the offense/defense balance.

Artificial Intelligence and the Attack/Defense Balance

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/03/artificial_inte.html

Artificial intelligence technologies have the potential to upend the longstanding advantage that attack has over defense on the Internet. This has to do with the relative strengths and weaknesses of people and computers, how those all interplay in Internet security, and where AI technologies might change things.

You can divide Internet security tasks into two sets: what humans do well and what computers do well. Traditionally, computers excel at speed, scale, and scope. They can launch attacks in milliseconds and infect millions of computers. They can scan computer code to look for particular kinds of vulnerabilities, and data packets to identify particular kinds of attacks.

Humans, conversely, excel at thinking and reasoning. They can look at the data and distinguish a real attack from a false alarm, understand the attack as it’s happening, and respond to it. They can find new sorts of vulnerabilities in systems. Humans are creative and adaptive, and can understand context.

Computers — so far, at least — are bad at what humans do well. They’re not creative or adaptive. They don’t understand context. They can behave irrationally because of those things.

Humans are slow, and get bored at repetitive tasks. They’re terrible at big data analysis. They use cognitive shortcuts, and can only keep a few data points in their head at a time. They can also behave irrationally because of those things.

AI will allow computers to take over Internet security tasks from humans, and then do them faster and at scale. Here are possible AI capabilities:

  • Discovering new vulnerabilities­ — and, more importantly, new types of vulnerabilities­ in systems, both by the offense to exploit and by the defense to patch, and then automatically exploiting or patching them.
  • Reacting and adapting to an adversary’s actions, again both on the offense and defense sides. This includes reasoning about those actions and what they mean in the context of the attack and the environment.
  • Abstracting lessons from individual incidents, generalizing them across systems and networks, and applying those lessons to increase attack and defense effectiveness elsewhere.
  • Identifying strategic and tactical trends from large datasets and using those trends to adapt attack and defense tactics.

That’s an incomplete list. I don’t think anyone can predict what AI technologies will be capable of. But it’s not unreasonable to look at what humans do today and imagine a future where AIs are doing the same things, only at computer speeds, scale, and scope.

Both attack and defense will benefit from AI technologies, but I believe that AI has the capability to tip the scales more toward defense. There will be better offensive and defensive AI techniques. But here’s the thing: defense is currently in a worse position than offense precisely because of the human components. Present-day attacks pit the relative advantages of computers and humans against the relative weaknesses of computers and humans. Computers moving into what are traditionally human areas will rebalance that equation.

Roy Amara famously said that we overestimate the short-term effects of new technologies, but underestimate their long-term effects. AI is notoriously hard to predict, so many of the details I speculate about are likely to be wrong­ — and AI is likely to introduce new asymmetries that we can’t foresee. But AI is the most promising technology I’ve seen for bringing defense up to par with offense. For Internet security, that will change everything.

This essay previously appeared in the March/April 2018 issue of IEEE Security & Privacy.

Confusing Self-Driving Cars by Altering Road Signs

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/08/confusing_self-.html

Researchers found that they could confuse the road sign detection algorithms of self-driving cars by adding stickers to the signs on the road. They could, for example, cause a car to think that a stop sign is a 45 mph speed limit sign. The changes are subtle, though — look at the photo from the article.

Research paper:

Robust Physical-World Attacks on Machine Learning Models,” by Ivan Evtimov, Kevin Eykholt, Earlence Fernandes, Tadayoshi Kohno, Bo Li, Atul Prakash, Amir Rahmati, and Dawn Song:

Abstract: Deep neural network-based classifiers are known to be vulnerable to adversarial examples that can fool them into misclassifying their input through the addition of small-magnitude perturbations. However, recent studies have demonstrated that such adversarial examples are not very effective in the physical world–they either completely fail to cause misclassification or only work in restricted cases where a relatively complex image is perturbed and printed on paper. In this paper we propose a new attack algorithm–Robust Physical Perturbations (RP2)– that generates perturbations by taking images under different conditions into account. Our algorithm can create spatially-constrained perturbations that mimic vandalism or art to reduce the likelihood of detection by a casual observer. We show that adversarial examples generated by RP2 achieve high success rates under various conditions for real road sign recognition by using an evaluation methodology that captures physical world conditions. We physically realized and evaluated two attacks, one that causes a Stop sign to be misclassified as a Speed Limit sign in 100% of the testing conditions, and one that causes a Right Turn sign to be misclassified as either a Stop or Added Lane sign in 100% of the testing conditions.

US Army Researching Bot Swarms

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/07/us_army_researc.html

The US Army Research Agency is funding research into autonomous bot swarms. From the announcement:

The objective of this CRA is to perform enabling basic and applied research to extend the reach, situational awareness, and operational effectiveness of large heterogeneous teams of intelligent systems and Soldiers against dynamic threats in complex and contested environments and provide technical and operational superiority through fast, intelligent, resilient and collaborative behaviors. To achieve this, ARL is requesting proposals that address three key Research Areas (RAs):

RA1: Distributed Intelligence: Establish the theoretical foundations of multi-faceted distributed networked intelligent systems combining autonomous agents, sensors, tactical super-computing, knowledge bases in the tactical cloud, and human experts to acquire and apply knowledge to affect and inform decisions of the collective team.

RA2: Heterogeneous Group Control: Develop theory and algorithms for control of large autonomous teams with varying levels of heterogeneity and modularity across sensing, computing, platforms, and degree of autonomy.

RA3: Adaptive and Resilient Behaviors: Develop theory and experimental methods for heterogeneous teams to carry out tasks under the dynamic and varying conditions in the physical world.

Slashdot thread.

And while we’re on the subject, this is an excellent report on AI and national security.

The Future of Forgeries

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/07/the_future_of_f_1.html

This article argues that AI technologies will make image, audio, and video forgeries much easier in the future.

Combined, the trajectory of cheap, high-quality media forgeries is worrying. At the current pace of progress, it may be as little as two or three years before realistic audio forgeries are good enough to fool the untrained ear, and only five or 10 years before forgeries can fool at least some types of forensic analysis. When tools for producing fake video perform at higher quality than today’s CGI and are simultaneously available to untrained amateurs, these forgeries might comprise a large part of the information ecosystem. The growth in this technology will transform the meaning of evidence and truth in domains across journalism, government communications, testimony in criminal justice, and, of course, national security.

I am not worried about fooling the “untrained ear,” and more worried about fooling forensic analysis. But there’s an arms race here. Recording technologies will get more sophisticated, too, making their outputs harder to forge. Still, I agree that the advantage will go to the forgers and not the forgery detectors.

Security Orchestration and Incident Response

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/03/security_orches.html

Last month at the RSA Conference, I saw a lot of companies selling security incident response automation. Their promise was to replace people with computers ­– sometimes with the addition of machine learning or other artificial intelligence techniques ­– and to respond to attacks at computer speeds.

While this is a laudable goal, there’s a fundamental problem with doing this in the short term. You can only automate what you’re certain about, and there is still an enormous amount of uncertainty in cybersecurity. Automation has its place in incident response, but the focus needs to be on making the people effective, not on replacing them ­ security orchestration, not automation.

This isn’t just a choice of words ­– it’s a difference in philosophy. The US military went through this in the 1990s. What was called the Revolution in Military Affairs (RMA) was supposed to change how warfare was fought. Satellites, drones and battlefield sensors were supposed to give commanders unprecedented information about what was going on, while networked soldiers and weaponry would enable troops to coordinate to a degree never before possible. In short, the traditional fog of war would be replaced by perfect information, providing certainty instead of uncertainty. They, too, believed certainty would fuel automation and, in many circumstances, allow technology to replace people.

Of course, it didn’t work out that way. The US learned in Afghanistan and Iraq that there are a lot of holes in both its collection and coordination systems. Drones have their place, but they can’t replace ground troops. The advances from the RMA brought with them some enormous advantages, especially against militaries that didn’t have access to the same technologies, but never resulted in certainty. Uncertainty still rules the battlefield, and soldiers on the ground are still the only effective way to control a region of territory.

But along the way, we learned a lot about how the feeling of certainty affects military thinking. Last month, I attended a lecture on the topic by H.R. McMaster. This was before he became President Trump’s national security advisor-designate. Then, he was the director of the Army Capabilities Integration Center. His lecture touched on many topics, but at one point he talked about the failure of the RMA. He confirmed that military strategists mistakenly believed that data would give them certainty. But he took this change in thinking further, outlining the ways this belief in certainty had repercussions in how military strategists thought about modern conflict.

McMaster’s observations are directly relevant to Internet security incident response. We too have been led to believe that data will give us certainty, and we are making the same mistakes that the military did in the 1990s. In a world of uncertainty, there’s a premium on understanding, because commanders need to figure out what’s going on. In a world of certainty, knowing what’s going on becomes a simple matter of data collection.

I see this same fallacy in Internet security. Many companies exhibiting at the RSA Conference promised to collect and display more data and that the data will reveal everything. This simply isn’t true. Data does not equal information, and information does not equal understanding. We need data, but we also must prioritize understanding the data we have over collecting ever more data. Much like the problems with bulk surveillance, the “collect it all” approach provides minimal value over collecting the specific data that’s useful.

In a world of uncertainty, the focus is on execution. In a world of certainty, the focus is on planning. I see this manifesting in Internet security as well. My own Resilient Systems ­– now part of IBM Security –­ allows incident response teams to manage security incidents and intrusions. While the tool is useful for planning and testing, its real focus is always on execution.

Uncertainty demands initiative, while certainty demands synchronization. Here, again, we are heading too far down the wrong path. The purpose of all incident response tools should be to make the human responders more effective. They need both the ability and the capability to exercise it effectively.

When things are uncertain, you want your systems to be decentralized. When things are certain, centralization is more important. Good incident response teams know that decentralization goes hand in hand with initiative. And finally, a world of uncertainty prioritizes command, while a world of certainty prioritizes control. Again, effective incident response teams know this, and effective managers aren’t scared to release and delegate control.

Like the US military, we in the incident response field have shifted too much into the world of certainty. We have prioritized data collection, preplanning, synchronization, centralization and control. You can see it in the way people talk about the future of Internet security, and you can see it in the products and services offered on the show floor of the RSA Conference.

Automation, too, is fixed. Incident response needs to be dynamic and agile, because you are never certain and there is an adaptive, malicious adversary on the other end. You need a response system that has human controls and can modify itself on the fly. Automation just doesn’t allow a system to do that to the extent that’s needed in today’s environment. Just as the military shifted from trying to replace the soldier to making the best soldier possible, we need to do the same.

For some time, I have been talking about incident response in terms of OODA loops. This is a way of thinking about real-time adversarial relationships, originally developed for airplane dogfights, but much more broadly applicable. OODA stands for observe-orient-decide-act, and it’s what people responding to a cybersecurity incident do constantly, over and over again. We need tools that augment each of those four steps. These tools need to operate in a world of uncertainty, where there is never enough data to know everything that is going on. We need to prioritize understanding, execution, initiative, decentralization and command.

At the same time, we’re going to have to make all of this scale. If anything, the most seductive promise of a world of certainty and automation is that it allows defense to scale. The problem is that we’re not there yet. We can automate and scale parts of IT security, such as antivirus, automatic patching and firewall management, but we can’t yet scale incident response. We still need people. And we need to understand what can be automated and what can’t be.

The word I prefer is orchestration. Security orchestration represents the union of people, process and technology. It’s computer automation where it works, and human coordination where that’s necessary. It’s networked systems giving people understanding and capabilities for execution. It’s making those on the front lines of incident response the most effective they can be, instead of trying to replace them. It’s the best approach we have for cyberdefense.

Automation has its place. If you think about the product categories where it has worked, they’re all areas where we have pretty strong certainty. Automation works in antivirus, firewalls, patch management and authentication systems. None of them is perfect, but all those systems are right almost all the time, and we’ve developed ancillary systems to deal with it when they’re wrong.

Automation fails in incident response because there’s too much uncertainty. Actions can be automated once the people understand what’s going on, but people are still required. For example, IBM’s Watson for Cyber Security provides insights for incident response teams based on its ability to ingest and find patterns in an enormous amount of freeform data. It does not attempt a level of understanding necessary to take people out of the equation.

From within an orchestration model, automation can be incredibly powerful. But it’s the human-centric orchestration model –­ the dashboards, the reports, the collaboration –­ that makes automation work. Otherwise, you’re blindly trusting the machine. And when an uncertain process is automated, the results can be dangerous.

Technology continues to advance, and this is all a changing target. Eventually, computers will become intelligent enough to replace people at real-time incident response. My guess, though, is that computers are not going to get there by collecting enough data to be certain. More likely, they’ll develop the ability to exhibit understanding and operate in a world of uncertainty. That’s a much harder goal.

Yes, today, this is all science fiction. But it’s not stupid science fiction, and it might become reality during the lifetimes of our children. Until then, we need people in the loop. Orchestration is a way to achieve that.

This essay previously appeared on the Security Intelligence blog.

Automatically Identifying Government Secrets

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2016/11/automatically_i.html

Interesting research: “Using Artificial Intelligence to Identify State Secrets,” by Renato Rocha Souza, Flavio Codeco Coelho, Rohan Shah, and Matthew Connelly.

Abstract: Whether officials can be trusted to protect national security information has become a matter of great public controversy, reigniting a long-standing debate about the scope and nature of official secrecy. The declassification of millions of electronic records has made it possible to analyze these issues with greater rigor and precision. Using machine-learning methods, we examined nearly a million State Department cables from the 1970s to identify features of records that are more likely to be classified, such as international negotiations, military operations, and high-level communications. Even with incomplete data, algorithms can use such features to identify 90% of classified cables with <11% false positives. But our results also show that there are longstanding problems in the identification of sensitive information. Error analysis reveals many examples of both overclassification and underclassification. This indicates both the need for research on inter-coder reliability among officials as to what constitutes classified material and the opportunity to develop recommender systems to better manage both classification and declassification.

Fooling Facial Recognition Systems

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2016/11/fooling_facial_.html

This is some interesting research. You can fool facial recognition systems by wearing glasses printed with elements of other people’s faces.

Mahmood Sharif, Sruti Bhagavatula, Lujo Bauer, and Michael K. Reiter, “Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition“:

ABSTRACT: Machine learning is enabling a myriad innovations, including new algorithms for cancer diagnosis and self-driving cars. The broad use of machine learning makes it important to understand the extent to which machine-learning algorithms are subject to attack, particularly when used in applications where physical security or safety is at risk. In this paper, we focus on facial biometric systems, which are widely used in surveillance and access control. We define and investigate a novel class of attacks: attacks that are physically realizable and inconspicuous, and allow an attacker to evade recognition or impersonate another individual. We develop a systematic method to automatically generate such attacks, which are realized through printing a pair of eyeglass frames. When worn by the attacker whose image is supplied to a state-of-the-art face-recognition algorithm, the eyeglasses allow her to evade being recognized or to impersonate another individual. Our investigation focuses on white-box face-recognition systems, but we also demonstrate how similar techniques can be used in black-box scenarios, as well as to avoid face detection.

News articles.

Teaching a Neural Network to Encrypt

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2016/11/teaching_a_neur.html

Researchers have trained a neural network to encrypt its communications.

In their experiment, computers were able to make their own form of encryption using machine learning, without being taught specific cryptographic algorithms. The encryption was very basic, especially compared to our current human-designed systems. Even so, it is still an interesting step for neural nets, which the authors state “are generally not meant to be great at cryptography:.

This story is more about AI and neural networks than it is about cryptography. The algorithm isn’t any good, but is a perfect example of what I’ve heard called “Schneier’s Law“: Anyone can design a cipher that they themselves cannot break.

Research paper. Note that the researchers work at Google.

Malicious AI

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2016/10/malicious_ai.html

It’s not hard to imagine the criminal possibilities of automation, autonomy, and artificial intelligence. But the imaginings are becoming mainstream — and the future isn’t too far off.

Along similar lines, computers are able to predict court verdicts. My guess is that the real use here isn’t to predict actual court verdicts, but for well-paid defense teams to test various defensive tactics.