Tag Archives: cybersecurity

Extracting Personal Information from Large Language Models Like GPT-2

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/01/extracting-personal-information-from-large-language-models-like-gpt-2.html

Researchers have been able to find all sorts of personal information within GPT-2. This information was part of the training data, and can be extracted with the right sorts of queries.

Paper: “Extracting Training Data from Large Language Models.”

Abstract: It has become common to publish large (billion parameter) language models that have been trained on private datasets. This paper demonstrates that in such settings, an adversary can perform a training data extraction attack to recover individual training examples by querying the language model.

We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model’s training data. These extracted examples include (public) personally identifiable information (names, phone numbers, and email addresses), IRC conversations, code, and 128-bit UUIDs. Our attack is possible even though each of the above sequences are included in just one document in the training data.

We comprehensively evaluate our extraction attack to understand the factors that contribute to its success. For example, we find that larger models are more vulnerable than smaller models. We conclude by drawing lessons and discussing possible safeguards for training large language models.

From a blog post:

We generated a total of 600,000 samples by querying GPT-2 with three different sampling strategies. Each sample contains 256 tokens, or roughly 200 words on average. Among these samples, we selected 1,800 samples with abnormally high likelihood for manual inspection. Out of the 1,800 samples, we found 604 that contain text which is reproduced verbatim from the training set.

The rest of the blog post discusses the types of data they found.

More on the SolarWinds Breach

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/12/more-on-the-solarwinds-breach.html

The New York Times has more details.

About 18,000 private and government users downloaded a Russian tainted software update –­ a Trojan horse of sorts ­– that gave its hackers a foothold into victims’ systems, according to SolarWinds, the company whose software was compromised.

Among those who use SolarWinds software are the Centers for Disease Control and Prevention, the State Department, the Justice Department, parts of the Pentagon and a number of utility companies. While the presence of the software is not by itself evidence that each network was compromised and information was stolen, investigators spent Monday trying to understand the extent of the damage in what could be a significant loss of American data to a foreign attacker.

It’s unlikely that the SVR (a successor to the KGB) penetrated all of those networks. But it is likely that they penetrated many of the important ones. And that they have buried themselves into those networks, giving them persistent access even if this vulnerability is patched. This is a massive intelligence coup for the Russians and failure for the Americans, even if no classified networks were touched.

Meanwhile, CISA has directed everyone to remove SolarWinds from their networks. This is (1) too late to matter, and (2) likely to take many months to complete. Probably the right answer, though.

This is almost too stupid to believe:

In one previously unreported issue, multiple criminals have offered to sell access to SolarWinds’ computers through underground forums, according to two researchers who separately had access to those forums.

One of those offering claimed access over the Exploit forum in 2017 was known as “fxmsp” and is wanted by the FBI “for involvement in several high-profile incidents,” said Mark Arena, chief executive of cybercrime intelligence firm Intel471. Arena informed his company’s clients, which include U.S. law enforcement agencies.

Security researcher Vinoth Kumar told Reuters that, last year, he alerted the company that anyone could access SolarWinds’ update server by using the password “solarwinds123”

“This could have been done by any attacker, easily,” Kumar said.

Neither the password nor the stolen access is considered the most likely source of the current intrusion, researchers said.

That last sentence is important, yes. But the sloppy security practice is likely not an isolated incident, and speaks to the overall lack of security culture at the company.

And I noticed that SolarWinds has removed its customer page, presumably as part of its damage control efforts. I quoted from it. Did anyone save a copy?

EDITED TO ADD: Both the Wayback Machine and Brian Krebs have saved the SolarWinds customer page.

Another Massive Russian Hack of US Government Networks

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/12/another-massive-russian-hack-of-us-government-networks.html

The press is reporting a massive hack of US government networks by sophisticated Russian hackers.

Officials said a hunt was on to determine if other parts of the government had been affected by what looked to be one of the most sophisticated, and perhaps among the largest, attacks on federal systems in the past five years. Several said national security-related agencies were also targeted, though it was not clear whether the systems contained highly classified material.

[…]

The motive for the attack on the agency and the Treasury Department remains elusive, two people familiar with the matter said. One government official said it was too soon to tell how damaging the attacks were and how much material was lost, but according to several corporate officials, the attacks had been underway as early as this spring, meaning they continued undetected through months of the pandemic and the election season.

The attack vector seems to be a malicious update in SolarWinds’ “Orion” IT monitoring platform, which is widely used in the US government (and elsewhere).

SolarWinds’ comprehensive products and services are used by more than 300,000 customers worldwide, including military, Fortune 500 companies, government agencies, and education institutions. Our customer list includes:

  • More than 425 of the US Fortune 500
  • All ten of the top ten US telecommunications companies
  • All five branches of the US Military
  • The US Pentagon, State Department, NASA, NSA, Postal Service, NOAA, Department of Justice, and the Office of the President of the United States
  • All five of the top five US accounting firms
  • Hundreds of universities and colleges worldwide

I’m sure more details will become public over the next several weeks.

EDITED TO ADD (12/15): More news.

A Cybersecurity Policy Agenda

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/12/a-cybersecurity-policy-agenda.html

The Aspen Institute’s Aspen Cybersecurity Group — I’m a member — has released its cybersecurity policy agenda for the next four years.

The next administration and Congress cannot simultaneously address the wide array of cybersecurity risks confronting modern society. Policymakers in the White House, federal agencies, and Congress should zero in on the most important and solvable problems. To that end, this report covers five priority areas where we believe cybersecurity policymakers should focus their attention and resources as they contend with a presidential transition, a new Congress, and massive staff turnover across our nation’s capital.

  • Education and Workforce Development
  • Public Core Resilience
  • Supply Chain Security
  • Measuring Cybersecurity
  • Promoting Operational Collaboration

Lots of detail in the 70-page report.

Finnish Data Theft and Extortion

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/12/finnish-data-theft-and-extortion.html

The Finnish psychotherapy clinic Vastaamo was the victim of a data breach and theft. The criminals tried extorting money from the clinic. When that failed, they started extorting money from the patients:

Neither the company nor Finnish investigators have released many details about the nature of the breach, but reports say the attackers initially sought a payment of about 450,000 euros to protect about 40,000 patient records. The company reportedly did not pay up. Given the scale of the attack and the sensitive nature of the stolen data, the case has become a national story in Finland. Globally, attacks on health care organizations have escalated as cybercriminals look for higher-value targets.

[…]

Vastaamo said customers and employees had “personally been victims of extortion” in the case. Reports say that on Oct. 21 and Oct. 22, the cybercriminals began posting batches of about 100 patient records on the dark web and allowing people to pay about 500 euros to have their information taken down.

Use Macie to discover sensitive data as part of automated data pipelines

Post Syndicated from Brandon Wu original https://aws.amazon.com/blogs/security/use-macie-to-discover-sensitive-data-as-part-of-automated-data-pipelines/

Data is a crucial part of every business and is used for strategic decision making at all levels of an organization. To extract value from their data more quickly, Amazon Web Services (AWS) customers are building automated data pipelines—from data ingestion to transformation and analytics. As part of this process, my customers often ask how to prevent sensitive data, such as personally identifiable information, from being ingested into data lakes when it’s not needed. They highlight that this challenge is compounded when ingesting unstructured data—such as files from process reporting, text files from chat transcripts, and emails. They also mention that identifying sensitive data inadvertently stored in structured data fields—such as in a comment field stored in a database—is also a challenge.

In this post, I show you how to integrate Amazon Macie as part of the data ingestion step in your data pipeline. This solution provides an additional checkpoint that sensitive data has been appropriately redacted or tokenized prior to ingestion. Macie is a fully managed data security and privacy service that uses machine learning and pattern matching to discover sensitive data in AWS.

When Macie discovers sensitive data, the solution notifies an administrator to review the data and decide whether to allow the data pipeline to continue ingesting the objects. If allowed, the objects will be tagged with an Amazon Simple Storage Service (Amazon S3) object tag to identify that sensitive data was found in the object before progressing to the next stage of the pipeline.

This combination of automation and manual review helps reduce the risk that sensitive data—such as personally identifiable information—will be ingested into a data lake. This solution can be extended to fit your use case and workflows. For example, you can define custom data identifiers as part of your scans, add additional validation steps, create Macie suppression rules to archive findings automatically, or only request manual approvals for findings that meet certain criteria (such as high severity findings).

Solution overview

Many of my customers are building serverless data lakes with Amazon S3 as the primary data store. Their data pipelines commonly use different S3 buckets at each stage of the pipeline. I refer to the S3 bucket for the first stage of ingestion as the raw data bucket. A typical pipeline might have separate buckets for raw, curated, and processed data representing different stages as part of their data analytics pipeline.

Typically, customers will perform validation and clean their data before moving it to a raw data zone. This solution adds validation steps to that pipeline after preliminary quality checks and data cleaning is performed, noted in blue (in layer 3) of Figure 1. The layers outlined in the pipeline are:

  1. Ingestion – Brings data into the data lake.
  2. Storage – Provides durable, scalable, and secure components to store the data—typically using S3 buckets.
  3. Processing – Transforms data into a consumable state through data validation, cleanup, normalization, transformation, and enrichment. This processing layer is where the additional validation steps are added to identify instances of sensitive data that haven’t been appropriately redacted or tokenized prior to consumption.
  4. Consumption – Provides tools to gain insights from the data in the data lake.

 

Figure 1: Data pipeline with sensitive data scan

Figure 1: Data pipeline with sensitive data scan

The application runs on a scheduled basis (four times a day, every 6 hours by default) to process data that is added to the raw data S3 bucket. You can customize the application to perform a sensitive data discovery scan during any stage of the pipeline. Because most customers do their extract, transform, and load (ETL) daily, the application scans for sensitive data on a scheduled basis before any crawler jobs run to catalog the data and after typical validation and data redaction or tokenization processes complete.

You can expect that this additional validation will add 5–10 minutes to your pipeline execution at a minimum. The validation processing time will scale linearly based on object size, but there is a start-up time per job that is constant.

If sensitive data is found in the objects, an email is sent to the designated administrator requesting an approval decision, which they indicate by selecting the link corresponding to their decision to approve or deny the next step. In most cases, the reviewer will choose to adjust the sensitive data cleanup processes to remove the sensitive data, deny the progression of the files, and re-ingest the files in the pipeline.

Additional considerations for deploying this application for regular use are discussed at the end of the blog post.

Application components

The following resources are created as part of the application:

Note: the application uses various AWS services, and there are costs associated with these resources after the Free Tier usage. See AWS Pricing for details. The primary drivers of the solution cost will be the amount of data ingested through the pipeline, both for Amazon S3 storage and data processed for sensitive data discovery with Macie.

The architecture of the application is shown in Figure 2 and described in the text that follows.
 

Figure 2: Application architecture and logic

Figure 2: Application architecture and logic

Application logic

  1. Objects are uploaded to the raw data S3 bucket as part of the data ingestion process.
  2. A scheduled EventBridge rule runs the sensitive data scan Step Functions workflow.
  3. triggerMacieScan Lambda function moves objects from the raw data S3 bucket to the scan stage S3 bucket.
  4. triggerMacieScan Lambda function creates a Macie sensitive data discovery job on the scan stage S3 bucket.
  5. checkMacieStatus Lambda function checks the status of the Macie sensitive data discovery job.
  6. isMacieStatusCompleteChoice Step Functions Choice state checks whether the Macie sensitive data discovery job is complete.
    1. If yes, the getMacieFindingsCount Lambda function runs.
    2. If no, the Step Functions Wait state waits 60 seconds and then restarts Step 5.
  7. getMacieFindingsCount Lambda function counts all of the findings from the Macie sensitive data discovery job.
  8. isSensitiveDataFound Step Functions Choice state checks whether sensitive data was found in the Macie sensitive data discovery job.
    1. If there was sensitive data discovered, run the triggerManualApproval Lambda function.
    2. If there was no sensitive data discovered, run the moveAllScanStageS3Files Lambda function.
  9. moveAllScanStageS3Files Lambda function moves all of the objects from the scan stage S3 bucket to the scanned data S3 bucket.
  10. triggerManualApproval Lambda function tags and moves objects with sensitive data discovered to the manual review S3 bucket, and moves objects with no sensitive data discovered to the scanned data S3 bucket. The function then sends a notification to the ApprovalRequestNotification Amazon SNS topic as a notification that manual review is required.
  11. Email is sent to the email address that’s subscribed to the ApprovalRequestNotification Amazon SNS topic (from the application deployment template) for the manual review user with the option to Approve or Deny pipeline ingestion for these objects.
  12. Manual review user assesses the objects with sensitive data in the manual review S3 bucket and selects the Approve or Deny links in the email.
  13. The decision request is sent from the Amazon API Gateway to the receiveApprovalDecision Lambda function.
  14. manualApprovalChoice Step Functions Choice state checks the decision from the manual review user.
    1. If denied, run the deleteManualReviewS3Files Lambda function.
    2. If approved, run the moveToScannedDataS3Files Lambda function.
  15. deleteManualReviewS3Files Lambda function deletes the objects from the manual review S3 bucket.
  16. moveToScannedDataS3Files Lambda function moves the objects from the manual review S3 bucket to the scanned data S3 bucket.
  17. The next step of the automated data pipeline will begin with the objects in the scanned data S3 bucket.

Prerequisites

For this application, you need the following prerequisites:

You can use AWS Cloud9 to deploy the application. AWS Cloud9 includes the AWS CLI and AWS SAM CLI to simplify setting up your development environment.

Deploy the application with AWS SAM CLI

You can deploy this application using the AWS SAM CLI. AWS SAM uses AWS CloudFormation as the underlying deployment mechanism. AWS SAM is an open-source framework that you can use to build serverless applications on AWS.

To deploy the application

  1. Initialize the serverless application using the AWS SAM CLI from the GitHub project in the aws-samples repository. This will clone the project locally which includes the source code for the Lambda functions, Step Functions state machine definition file, and the AWS SAM template. On the command line, run the following:
    sam init --location gh: aws-samples/amazonmacie-datapipeline-scan
    

    Alternatively, you can clone the Github project directly.

  2. Deploy your application to your AWS account. On the command line, run the following:
    sam deploy --guided
    

    Complete the prompts during the guided interactive deployment. The first deployment prompt is shown in the following example.

    Configuring SAM deploy
    ======================
    
            Looking for config file [samconfig.toml] :  Found
            Reading default arguments  :  Success
    
            Setting default arguments for 'sam deploy'
            =========================================
            Stack Name [maciepipelinescan]:
    

  3. Settings:
    • Stack Name – Name of the CloudFormation stack to be created.
    • AWS RegionRegion—for example, us-west-2, eu-west-1, ap-southeast-1—to deploy the application to. This application was tested in the us-west-2 and ap-southeast-1 Regions. Before selecting a Region, verify that the services you need are available in those Regions (for example, Macie and Step Functions).
    • Parameter StepFunctionName – Name of the Step Functions state machine to be created—for example, maciepipelinescanstatemachine).
    • Parameter BucketNamePrefix – Prefix to apply to the S3 buckets to be created (S3 bucket names are globally unique, so choosing a random prefix helps ensure uniqueness).
    • Parameter ApprovalEmailDestination – Email address to receive the manual review notification.
    • Parameter EnableMacie – Whether you need Macie enabled in your account or Region. You can select yes or no; select yes if you need Macie to be enabled for you as part of this template, select no, if you already have Macie enabled.
  4. Confirm changes and provide approval for AWS SAM CLI to deploy the resources to your AWS account by responding y to prompts, as shown in the following example. You can accept the defaults for the SAM configuration file and SAM configuration environment prompts.
    #Shows you resources changes to be deployed and require a 'Y' to initiate deploy
    Confirm changes before deploy [y/N]: y
    #SAM needs permission to be able to create roles to connect to the resources in your template
    Allow SAM CLI IAM role creation [Y/n]: y
    ReceiveApprovalDecisionAPI may not have authorization defined, Is this okay? [y/N]: y
    ReceiveApprovalDecisionAPI may not have authorization defined, Is this okay? [y/N]: y
    Save arguments to configuration file [Y/n]: y
    SAM configuration file [samconfig.toml]: 
    SAM configuration environment [default]:
    

    Note: This application deploys an Amazon API Gateway with two REST API resources without authorization defined to receive the decision from the manual review step. You will be prompted to accept each resource without authorization. A token (Step Functions taskToken) is used to authenticate the requests.

  5. This creates an AWS CloudFormation changeset. Once the changeset creation is complete, you must provide a final confirmation of y to Deploy the changeset? [y/N] when prompted as shown in the following example.
    Changeset created successfully. arn:aws:cloudformation:ap-southeast-1:XXXXXXXXXXXX:changeSet/samcli-deploy1605213119/db681961-3635-4305-b1c7-dcc754c7XXXX
    
    
    Previewing CloudFormation changeset before deployment
    ======================================================
    Deploy this changeset? [y/N]:
    

Your application is deployed to your account using AWS CloudFormation. You can track the deployment events in the command prompt or via the AWS CloudFormation console.

After the application deployment is complete, you must confirm the subscription to the Amazon SNS topic. An email will be sent to the email address entered in Step 3 with a link that you need to select to confirm the subscription. This confirmation provides opt-in consent for AWS to send emails to you via the specified Amazon SNS topic. The emails will be notifications of potentially sensitive data that need to be approved. If you don’t see the verification email, be sure to check your spam folder.

Test the application

The application uses an EventBridge scheduled rule to start the sensitive data scan workflow, which runs every 6 hours. You can manually start an execution of the workflow to verify that it’s working. To test the function, you will need a file that contains data that matches your rules for sensitive data. For example, it is easy to create a spreadsheet, document, or text file that contains names, addresses, and numbers formatted like credit card numbers. You can also use this generated sample data to test Macie.

We will test by uploading a file to our S3 bucket via the AWS web console. If you know how to copy objects from the command line, that also works.

Upload test objects to the S3 bucket

  1. Navigate to the Amazon S3 console and upload one or more test objects to the <BucketNamePrefix>-data-pipeline-raw bucket. <BucketNamePrefix> is the prefix you entered when deploying the application in the AWS SAM CLI prompts. You can use any objects as long as they’re a supported file type for Amazon Macie. I suggest uploading multiple objects, some with and some without sensitive data, in order to see how the workflow processes each.

Start the Scan State Machine

  1. Navigate to the Step Functions state machines console. If you don’t see your state machine, make sure you’re connected to the same region that you deployed your application to.
  2. Choose the state machine you created using the AWS SAM CLI as seen in Figure 3. The example state machine is maciepipelinescanstatemachine, but you might have used a different name in your deployment.
     
    Figure 3: AWS Step Functions state machines console

    Figure 3: AWS Step Functions state machines console

  3. Select the Start execution button and copy the value from the Enter an execution name – optional box. Change the Input – optional value replacing <execution id> with the value just copied as follows:
    {
        “id”: “<execution id>”
    }
    

    In my example, the <execution id> is fa985a4f-866b-b58b-d91b-8a47d068aa0c from the Enter an execution name – optional box as shown in Figure 4. You can choose a different ID value if you prefer. This ID is used by the workflow to tag the objects being processed to ensure that only objects that are scanned continue through the pipeline. When the EventBridge scheduled event starts the workflow as scheduled, an ID is included in the input to the Step Functions workflow. Then select Start execution again.
     

    Figure 4: New execution dialog box

    Figure 4: New execution dialog box

  4. You can see the status of your workflow execution in the Graph inspector as shown in Figure 5. In the figure, the workflow is at the pollForCompletionWait step.
     
    Figure 5: AWS Step Functions graph inspector

    Figure 5: AWS Step Functions graph inspector

The sensitive discovery job should run for about five to ten minutes. The jobs scale linearly based on object size, but there is a start-up time per job that is constant. If sensitive data is found in the objects uploaded to the <BucketNamePrefix>-data-pipeline-upload S3 bucket, an email is sent to the address provided during the AWS SAM deployment step, notifying the recipient requesting of the need for an approval decision, which they indicate by selecting the link corresponding to their decision to approve or deny the next step as shown in Figure 6.
 

Figure 6: Sensitive data identified email

Figure 6: Sensitive data identified email

When you receive this notification, you can investigate the findings by reviewing the objects in the <BucketNamePrefix>-data-pipeline-manual-review S3 bucket. Based on your review, you can either apply remediation steps to remove any sensitive data or allow the data to proceed to the next step of the data ingestion pipeline. You should define a standard response process to address discovery of sensitive data in the data pipeline. Common remediation steps include review of the files for sensitive data, deleting the files that you do not want to progress, and updating the ETL process to redact or tokenize sensitive data when re-ingesting into the pipeline. When you re-ingest the files into the pipeline without sensitive data, the files will not be flagged by Macie.

The workflow performs the following:

  • If you select Approve, the files are moved to the <BucketNamePrefix>-data-pipeline-scanned-data S3 bucket with an Amazon S3 SensitiveDataFound object tag with a value of true.
  • If you select Deny, the files are deleted from the <BucketNamePrefix>-data-pipeline-manual-review S3 bucket.
  • If no action is taken, the Step Functions workflow execution times out after five days and the file will automatically be deleted from the <BucketNamePrefix>-data-pipeline-manual-review S3 bucket after 10 days.

Clean up the application

You’ve successfully deployed and tested the sensitive data pipeline scan workflow. To avoid ongoing charges for resources you created, you should delete all associated resources by deleting the CloudFormation stack. In order to delete the CloudFormation stack, you must first delete all objects that are stored in the S3 buckets that you created for the application.

To delete the application

  1. Empty the S3 buckets created in this application (<BucketNamePrefix>-data-pipeline-raw S3 bucket, <BucketNamePrefix>-data-pipeline-scan-stage, <BucketNamePrefix>-data-pipeline-manual-review, and <BucketNamePrefix>-data-pipeline-scanned-data).
  2. Delete the CloudFormation stack used to deploy the application.

Considerations for regular use

Before using this application in a production data pipeline, you will need to stop and consider some practical matters. First, the notification mechanism used when sensitive data is identified in the objects is email. Email doesn’t scale: you should expand this solution to integrate with your ticketing or workflow management system. If you choose to use email, subscribe a mailing list so that the work of reviewing and responding to alerts is shared across a team.

Second, the application is run on a scheduled basis (every 6 hours by default). You should consider starting the application when your preliminary validations have completed and are ready to perform a sensitive data scan on the data as part of your pipeline. You can modify the EventBridge Event Rule to run in response to an Amazon EventBridge event instead of a scheduled basis.

Third, the application currently uses a 60 second Step Functions Wait state when polling for the Macie discovery job completion. In real world scenarios, the discovery scan will take 10 minutes at a minimum, likely several orders of magnitude longer. You should evaluate the typical execution times for your application execution and tune the polling period accordingly. This will help reduce costs related to running Lambda functions and log storage within CloudWatch Logs. The polling period is defined in the Step Functions state machine definition file (macie_pipeline_scan.asl.json) under the pollForCompletionWait state.

Fourth, the application currently doesn’t account for false positives in the sensitive data discovery job results. Also, the application will progress or delete all objects identified based on the decision by the reviewer. You should consider expanding the application to handle false positives through automation rather than manual review / intervention (such as deleting the files from the manual review bucket or removing the sensitive data tags applied).

Last, the solution will stop the ingestion of a subset of objects into your pipeline. This behavior is similar to other validation and data quality checks that most customers perform as part of the data pipeline. However, you should test to ensure that this will not cause unexpected outcomes and address them in your downstream application logic accordingly.

Conclusion

In this post, I showed you how to integrate sensitive data discovery using Macie as an additional validation step in an automated data pipeline. You’ve reviewed the components of the application, deployed it using the AWS SAM CLI, tested to validate that the application functions as expected, and cleaned up by removing deployed resources.

You now know how to integrate sensitive data scanning into your ETL pipeline. You can use automation and—where required—manual review to help reduce the risk of sensitive data, such as personally identifiable information, being inadvertently ingested into a data lake. You can take this application and customize it to fit your use case and workflows, such as using custom data identifiers as part of your scans, adding additional validation steps, creating Macie suppression rules to define cases to archive findings automatically, or only request manual approvals for findings that meet certain criteria (such as high severity findings).

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Amazon Macie forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Brandon Wu

Brandon is a security solutions architect helping financial services organizations secure their critical workloads on AWS. In his spare time, he enjoys exploring outdoors and experimenting in the kitchen.

Open Source Does Not Equal Secure

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/12/open-source-does-not-equal-secure.html

Way back in 1999, I wrote about open-source software:

First, simply publishing the code does not automatically mean that people will examine it for security flaws. Security researchers are fickle and busy people. They do not have the time to examine every piece of source code that is published. So while opening up source code is a good thing, it is not a guarantee of security. I could name a dozen open source security libraries that no one has ever heard of, and no one has ever evaluated. On the other hand, the security code in Linux has been looked at by a lot of very good security engineers.

We have some new research from GitHub that bears this out. On average, vulnerabilities in their libraries go four years before being detected. From a ZDNet article:

GitHub launched a deep-dive into the state of open source security, comparing information gathered from the organization’s dependency security features and the six package ecosystems supported on the platform across October 1, 2019, to September 30, 2020, and October 1, 2018, to September 30, 2019.

Only active repositories have been included, not including forks or ‘spam’ projects. The package ecosystems analyzed are Composer, Maven, npm, NuGet, PyPi, and RubyGems.

In comparison to 2019, GitHub found that 94% of projects now rely on open source components, with close to 700 dependencies on average. Most frequently, open source dependencies are found in JavaScript — 94% — as well as Ruby and .NET, at 90%, respectively.

On average, vulnerabilities can go undetected for over four years in open source projects before disclosure. A fix is then usually available in just over a month, which GitHub says “indicates clear opportunities to improve vulnerability detection.”

Open source means that the code is available for security evaluation, not that it necessarily has been evaluated by anyone. This is an important distinction.

More on the Security of the 2020 US Election

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/11/more-on-the-security-of-the-2020-us-election.html

Last week I signed on to two joint letters about the security of the 2020 election. The first was as one of 59 election security experts, basically saying that while the election seems to have been both secure and accurate (voter suppression notwithstanding), we still need to work to secure our election systems:

We are aware of alarming assertions being made that the 2020 election was “rigged” by exploiting technical vulnerabilities. However, in every case of which we are aware, these claims either have been unsubstantiated or are technically incoherent. To our collective knowledge, no credible evidence has been put forth that supports a conclusion that the 2020 election outcome in any state has been altered through technical compromise.

That said, it is imperative that the US continue working to bolster the security of elections against sophisticated adversaries. At a minimum, all states should employ election security practices and mechanisms recommended by experts to increase assurance in election outcomes, such as post-election risk-limiting audits.

The New York Times wrote about the letter.

The second was a more general call for election security measures in the US:

Obviously elections themselves are partisan. But the machinery of them should not be. And the transparent assessment of potential problems or the assessment of allegations of security failure — even when they could affect the outcome of an election — must be free of partisan pressures. Bottom line: election security officials and computer security experts must be able to do their jobs without fear of retribution for finding and publicly stating the truth about the security and integrity of the election.

These pile on to the November 12 statement from Cybersecurity and Infrastructure Security Agency (CISA) and the other agencies of the Election Infrastructure Government Coordinating Council (GCC) Executive Committee. While I’m not sure how they have enough comparative data to claim that “the November 3rd election was the most secure in American history,” they are certainly credible in saying that “there is no evidence that any voting system deleted or lost votes, changed votes, or was in any way compromised.”

We have a long way to go to secure our election systems from hacking. Details of what to do are known. Getting rid of touch-screen voting machines is important, but baseless claims of fraud don’t help.

On Blockchain Voting

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/11/on-blockchain-voting.html

Blockchain voting is a spectacularly dumb idea for a whole bunch of reasons. I have generally quoted Matt Blaze:

Why is blockchain voting a dumb idea? Glad you asked.

For starters:

  • It doesn’t solve any problems civil elections actually have.
  • It’s basically incompatible with “software independence”, considered an essential property.
  • It can make ballot secrecy difficult or impossible.

I’ve also quoted this XKCD cartoon.

But now I have this excellent paper from MIT researchers:

“Going from Bad to Worse: From Internet Voting to Blockchain Voting”
Sunoo Park, Michael Specter, Neha Narula, and Ronald L. Rivest

Abstract: Voters are understandably concerned about election security. News reports of possible election interference by foreign powers, of unauthorized voting, of voter disenfranchisement, and of technological failures call into question the integrity of elections worldwide.This article examines the suggestions that “voting over the Internet” or “voting on the blockchain” would increase election security, and finds such claims to be wanting and misleading. While current election systems are far from perfect, Internet- and blockchain-based voting would greatly increase the risk of undetectable, nation-scale election failures.Online voting may seem appealing: voting from a computer or smart phone may seem convenient and accessible. However, studies have been inconclusive, showing that online voting may have little to no effect on turnout in practice, and it may even increase disenfranchisement. More importantly: given the current state of computer security, any turnout increase derived from with Internet- or blockchain-based voting would come at the cost of losing meaningful assurance that votes have been counted as they were cast, and not undetectably altered or discarded. This state of affairs will continue as long as standard tactics such as malware, zero days, and denial-of-service attacks continue to be effective.This article analyzes and systematizes prior research on the security risks of online and electronic voting, and show that not only do these risks persist in blockchain-based voting systems, but blockchains may introduce additional problems for voting systems. Finally, we suggest questions for critically assessing security risks of new voting system proposals.

You may have heard of Voatz, which uses blockchain for voting. It’s an insecure mess. And this is my general essay on blockchain. Short summary: it’s completely useless.

Verified, episode 2 – A Conversation with Emma Smith, Director of Global Cyber Security at Vodafone

Post Syndicated from Stephen Schmidt original https://aws.amazon.com/blogs/security/verified-episode-2-conversation-with-emma-smith-director-of-global-cyber-security-at-vodafone/

Over the past 8 months, it’s become more important for us all to stay in contact with peers around the globe. Today, I’m proud to bring you the second episode of our new video series, Verified: Presented by AWS re:Inforce. Even though we couldn’t be together this year at re:Inforce, our annual security conference, we still wanted to share some of the conversations with security leaders that would have taken place at the conference. The series showcases conversations with security leaders around the globe. In episode two, I’m talking to Emma Smith, Vodafone’s Global Cyber Security Director.

Vodafone is a global technology communications company with an optimistic culture. Their focus is connecting people and building the digital future for society. During our conversation, Emma detailed how the core values of the Global Cyber Security team were inspired by the company. “We’ve got a team of people who are ultimately passionate about protecting customers, protecting society, protecting Vodafone, protecting all of our services and our employees.” Emma shared experiences about the evolution of the security organization during her past 5 years with the company.

We were also able to touch on one of Emma’s passions, diversity and inclusion. Emma has worked to implement diversity and drive a policy of inclusion at Vodafone. In June, she was named Diversity Champion in the SC Awards Europe. In her own words: “It makes me realize that my job is to smooth the way for everybody else and to try and remove some of those obstacles or barriers that were put in their way… it means that I’m really passionate about trying to get a very diverse team in security, but also in Vodafone, so that we reflect our customer base, so that we’ve got diversity of thinking, of backgrounds, of experience, and people who genuinely feel comfortable being themselves at work—which is easy to say but really hard to create that culture of safety and belonging.”

Stay tuned for future episodes of Verified: Presented by AWS re:Inforce here on the AWS Security Blog. You can watch episode one, an interview with Jason Chan, Vice President of Information Security at Netflix on YouTube. If you have an idea or a topic you’d like covered in this series, please drop us a comment below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Steve Schmidt

Steve is Vice President and Chief Information Security Officer for AWS. His duties include leading product design, management, and engineering development efforts focused on bringing the competitive, economic, and security benefits of cloud computing to business and government customers. Prior to AWS, he had an extensive career at the Federal Bureau of Investigation, where he served as a senior executive and section chief. He currently holds 11 patents in the field of cloud security architecture. Follow Steve on Twitter.

2020 Was a Secure Election

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/11/2020-was-a-secure-election.html

Over at Lawfare: “2020 Is An Election Security Success Story (So Far).”

What’s more, the voting itself was remarkably smooth. It was only a few months ago that professionals and analysts who monitor election administration were alarmed at how badly unprepared the country was for voting during a pandemic. Some of the primaries were disasters. There were not clear rules in many states for voting by mail or sufficient opportunities for voting early. There was an acute shortage of poll workers. Yet the United States saw unprecedented turnout over the last few weeks. Many states handled voting by mail and early voting impressively and huge numbers of volunteers turned up to work the polls. Large amounts of litigation before the election clarified the rules in every state. And for all the president’s griping about the counting of votes, it has been orderly and apparently without significant incident. The result was that, in the midst of a pandemic that has killed 230,000 Americans, record numbers of Americans voted­ — and voted by mail — ­and those votes are almost all counted at this stage.

On the cybersecurity front, there is even more good news. Most significantly, there was no serious effort to target voting infrastructure. After voting concluded, the director of the Cybersecurity and Infrastructure Security Agency (CISA), Chris Krebs, released a statement, saying that “after millions of Americans voted, we have no evidence any foreign adversary was capable of preventing Americans from voting or changing vote tallies.” Krebs pledged to “remain vigilant for any attempts by foreign actors to target or disrupt the ongoing vote counting and final certification of results,” and no reports have emerged of threats to tabulation and certification processes.

A good summary.

Cybersecurity Visuals

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/10/cybersecurity-visuals.html

The Hewlett Foundation just announced its top five ideas in its Cybersecurity Visuals Challenge. The problem Hewlett is trying to solve is the dearth of good visuals for cybersecurity. A Google Images Search demonstrates the problem: locks, fingerprints, hands on laptops, scary looking hackers in black hoodies. Hewlett wanted to go beyond those tropes.

I really liked the idea, but find the results underwhelming. It’s a hard problem.

Hewlett press release.

Introducing the first video in our new series, Verified, featuring Netflix’s Jason Chan

Post Syndicated from Stephen Schmidt original https://aws.amazon.com/blogs/security/introducing-first-video-new-series-verified-featuring-netflix-jason-chan/

The year has been a profoundly different one for us all, and like many of you, I’ve been adjusting, both professionally and personally, to this “new normal.” Here at AWS we’ve seen an increase in customers looking for secure solutions to maintain productivity in an increased work-from-home world. We’ve also seen an uptick in requests for training; it’s clear, a sense of community and learning are critically important as workforces physically distance.

For these reasons, I’m happy to announce the launch of Verified: Presented by AWS re:Inforce. I’m hosting this series, but I’ll be joined by leaders in cloud security across a variety of industries. The goal is to have an open conversation about the common issues we face in securing our systems and tools. Topics will include how the pandemic is impacting cloud security, tips for creating an effective security program from the ground up, how to create a culture of security, emerging security trends, and more. Learn more by following me on Twitter (@StephenSchmidt), and get regular updates from @AWSSecurityInfo. Verified is just one of the many ways we will continue sharing best practices with our customers during this time. You can find more by reading the AWS Security Blog, reviewing our documentation, visiting the AWS Security and Compliance webpages, watching re:Invent and re:Inforce playlists, and/or reviewing the Security Pillar of Well Architected.

Our first conversation, above, is with Jason Chan, Vice President of Information Security at Netflix. Jason spoke to us about the security program at Netflix, his approach to hiring security talent, and how Zero Trust enables a remote workforce. Jason also has solid insights to share about how he started and grew the security program at Netflix.

“In the early days, what we were really trying to figure out is how do we build a large-scale consumer video-streaming service in the public cloud, and how do you do that in a secure way? There wasn’t a ton of expertise in that, so when I was building the security team at Netflix, I thought, ‘how do we bring in folks from a variety of backgrounds, generalists … to tackle this problem?’”

He also gave his view on how a growing security team can measure ROI. “I think it’s difficult to have a pure equation around that. So what we try to spend our time doing is really making sure that we, as a team, are aligned on what is the most important—what are the most important assets to protect, what are the most critical risks that we’re trying to prevent—and then make sure that leadership is aligned with that, because, as we all know, there’s not unlimited resources, right? You can’t hire an unlimited number of folks or spend an unlimited amount of money, so you’re always trying to figure out how do you prioritize, and how do you find where is going to be the biggest impact for your value?”

Check out Jason’s full interview above, and stay tuned for further videos in this series. If you have an idea or a topic you’d like covered in this series, please drop us a comment below. Thanks!

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Steve Schmidt

Steve is Vice President and Chief Information Security Officer for AWS. His duties include leading product design, management, and engineering development efforts focused on bringing the competitive, economic, and security benefits of cloud computing to business and government customers. Prior to AWS, he had an extensive career at the Federal Bureau of Investigation, where he served as a senior executive and section chief. He currently holds 11 patents in the field of cloud security architecture. Follow Steve on Twitter.

Swiss-Swedish Diplomatic Row Over Crypto AG

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/10/swiss-swedish-diplomatic-row-over-crypto-ag.html

Previously I have written about the Swedish-owned Swiss-based cryptographic hardware company: Crypto AG. It was a CIA-owned Cold War operation for decades. Today it is called Crypto International, still based in Switzerland but owned by a Swedish company.

It’s back in the news:

Late last week, Swedish Foreign Minister Ann Linde said she had canceled a meeting with her Swiss counterpart Ignazio Cassis slated for this month after Switzerland placed an export ban on Crypto International, a Swiss-based and Swedish-owned cybersecurity company.

The ban was imposed while Swiss authorities examine long-running and explosive claims that a previous incarnation of Crypto International, Crypto AG, was little more than a front for U.S. intelligence-gathering during the Cold War.

Linde said the Swiss ban was stopping “goods” — which experts suggest could include cybersecurity upgrades or other IT support needed by Swedish state agencies — from reaching Sweden.

She told public broadcaster SVT that the meeting with Cassis was “not appropriate right now until we have fully understood the Swiss actions.”

EDITED TO ADD (10/13): Lots of information on Crypto AG.

CEO of NS8 Charged with Securities Fraud

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/09/ceo-of-ns8-charged-with-securities-fraud.html

The founder and CEO of the Internet security company NS8 has been arrested and “charged in a Complaint in Manhattan federal court with securities fraud, fraud in the offer and sale of securities, and wire fraud.”

I admit that I’ve never even heard of the company before.

US Space Cybersecurity Directive

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/09/us-space-cybersecurity-directive.html

The Trump Administration just published “Space Policy Directive – 5“: “Cybersecurity Principles for Space Systems.” It’s pretty general:

Principles. (a) Space systems and their supporting infrastructure, including software, should be developed and operated using risk-based, cybersecurity-informed engineering. Space systems should be developed to continuously monitor, anticipate,and adapt to mitigate evolving malicious cyber activities that could manipulate, deny, degrade, disrupt,destroy, surveil, or eavesdrop on space system operations. Space system configurations should be resourced and actively managed to achieve and maintain an effective and resilient cyber survivability posture throughout the space system lifecycle.

(b) Space system owners and operators should develop and implement cybersecurity plans for their space systems that incorporate capabilities to ensure operators or automated control center systems can retain or recover positive control of space vehicles. These plans should also ensure the ability to verify the integrity, confidentiality,and availability of critical functions and the missions, services,and data they enable and provide.

These unclassified directives are typically so general that it’s hard to tell whether they actually matter.

News article.

North Korea ATM Hack

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/09/north_korea_atm.html

The US Cybersecurity and Infrastructure Security Agency (CISA) published a long and technical alert describing a North Korea hacking scheme against ATMs in a bunch of countries worldwide:

This joint advisory is the result of analytic efforts among the Cybersecurity and Infrastructure Security Agency (CISA), the Department of the Treasury (Treasury), the Federal Bureau of Investigation (FBI) and U.S. Cyber Command (USCYBERCOM). Working with U.S. government partners, CISA, Treasury, FBI, and USCYBERCOM identified malware and indicators of compromise (IOCs) used by the North Korean government in an automated teller machine (ATM) cash-out scheme­ — referred to by the U.S. Government as “FASTCash 2.0: North Korea’s BeagleBoyz Robbing Banks.”

The level of detail is impressive, as seems to be common in CISA’s alerts and analysis reports.

North Korea ATM Hack

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/09/north_korea_atm.html

The US Cybersecurity and Infrastructure Security Agency (CISA) published a long and technical alert describing a North Korea hacking scheme against ATMs in a bunch of countries worldwide:

This joint advisory is the result of analytic efforts among the Cybersecurity and Infrastructure Security Agency (CISA), the Department of the Treasury (Treasury), the Federal Bureau of Investigation (FBI) and U.S. Cyber Command (USCYBERCOM). Working with U.S. government partners, CISA, Treasury, FBI, and USCYBERCOM identified malware and indicators of compromise (IOCs) used by the North Korean government in an automated teller machine (ATM) cash-out scheme­ — referred to by the U.S. Government as “FASTCash 2.0: North Korea’s BeagleBoyz Robbing Banks.”

The level of detail is impressive, as seems to be common in CISA’s alerts and analysis reports.

US Postal Service Files Blockchain Voting Patent

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/08/us_postal_servi.html

The US Postal Service has filed a patent on a blockchain voting method:

Abstract: A voting system can use the security of blockchain and the mail to provide a reliable voting system. A registered voter receives a computer readable code in the mail and confirms identity and confirms correct ballot information in an election. The system separates voter identification and votes to ensure vote anonymity, and stores votes on a distributed ledger in a blockchain

I wasn’t going to bother blogging this, but I’ve received enough emails about it that I should comment.

As is pretty much always the case, blockchain adds nothing. The security of this system has nothing to do with blockchain, and would be better off without it. For voting in particular, blockchain adds to the insecurity. Matt Blaze is most succinct on that point:

Why is blockchain voting a dumb idea?

Glad you asked.

For starters:

  • It doesn’t solve any problems civil elections actually have.
  • It’s basically incompatible with “software independence”, considered an essential property.
  • It can make ballot secrecy difficult or impossible.

Both Ben Adida and Matthew Green have written longer pieces on blockchain and voting.

News articles.