Tag Archives: artificial intelligence

Takeaways From The Take Command Summit: Understanding Modern Cyber Attacks

Post Syndicated from Emma Burdett original https://blog.rapid7.com/2024/06/21/takeaways-from-the-take-command-summit-understanding-modern-cyber-attacks/

Takeaways From The Take Command Summit: Understanding Modern Cyber Attacks

In today’s cybersecurity landscape, staying ahead of evolving threats is crucial. The State of Security Panel from our Take Command summit held May 21st delved into how artificial intelligence (AI) is reshaping cyber attacks and defenses.

The discussion highlighted the dual role of AI in cybersecurity, presenting both challenges and solutions. To learn more about these insights and protect your organization from sophisticated threats, watch the full video.

Key takeaways from the 30 minute panel:

  1. AI-Enhanced Attacks: Friendly Hacker and CEO of SocialProof Security Rachel Tobac highlighted the growing use of AI by attackers, stating, “Eight times out of ten, I’m using AI tools during my attacks.” AI helps create convincing phishing emails and scripts, making attacks more efficient and scalable.
  2. Voice Cloning and Deepfakes: Attackers are now using AI for voice cloning and deep fakes, making it vital for organizations to verify identities through multiple communication channels. Rachel continued, “We can even do a deep fake, live during a Teams or Zoom call to trick somebody.”
  3. Cloud Vulnerabilities: Rapid7’s Chief Security Officer Jaya Baloo pointed out that roughly  45% of data breaches are due to cloud issues, caused by misconfigurations and vulnerabilities, making cloud security a critical focus.

“Professional paranoia is something that I think we should hold dear to us,” – Jaya Bayloo, Chief Security Officer, Rapid7

Watch the full video here.

Anthropic’s Claude 3.5 Sonnet model now available in Amazon Bedrock: Even more intelligence than Claude 3 Opus at one-fifth the cost

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/anthropics-claude-3-5-sonnet-model-now-available-in-amazon-bedrock-the-most-intelligent-claude-model-yet/

It’s been just 3 months since Anthropic launched Claude 3, a family of state-of-the-art artificial intelligence (AI) models that allows you to choose the right combination of intelligence, speed, and cost that suits your needs.

Today, Anthropic introduced Claude 3.5 Sonnet, its first release in the forthcoming Claude 3.5 model family. We are happy to announce that Claude 3.5 Sonnet is now available in Amazon Bedrock.

Claude 3.5 Sonnet raises the industry bar for intelligence, outperforming other generative AI models on a wide range of evaluations, including Anthropic’s previously most intelligent model, Claude 3 Opus. Claude 3.5 Sonnet is available with the speed and cost of the original Claude 3 Sonnet model. In fact, you can now get intelligence and speed better than Claude 3 Opus at one-fifth of the price because Claude 3.5 Sonnet is 80 percent cheaper than Opus.

Anthropic Claude 3.5 Sonnet Family

The frontier intelligence displayed by Claude 3.5 Sonnet combined with cost-effective pricing, makes the model ideal for complex tasks such as context-sensitive customer support, orchestrating multi-step workflows, and streamlining code translations.

Claude 3.5 Sonnet sets new industry benchmarks for undergraduate-level expert knowledge (MMLU), graduate-level expert reasoning (GPQA), code (HumanEval), and more. As you can see in the following table, according to Anthropic, Claude 3.5 Sonnet outperforms OpenAI’s GPT-4o and Google’s Gemini 1.5 Pro in nearly every benchmark.

Anthropic Claude 3.5 Sonnet Benchmarks

Claude 3.5 Sonnet is also Anthropic’s strongest vision model yet, performing an average of 10 percent better than Claude 3 Opus across the majority of vision benchmarks. According to Anthropic, Claude 3.5 Sonnet also outperforms other generative AI models in nearly every category.

Anthropic Claude 3.5 Sonnet Vision Benchmarks

Anthropic’s Claude 3.5 Sonnet key improvements
The release of Claude 3.5 Sonnet brings significant improvements across multiple domains, empowering software developers and businesses with new generative AI-powered capabilities. Here are some of the key strengths of this new model:

Visual processing and understanding – Claude 3.5 Sonnet demonstrates remarkable capabilities in processing images, particularly in interpreting charts and graphs. It accurately transcribes text from imperfect images, a core capability for industries such as retail, logistics, and financial services, to gather more insights from graphics or illustrations than from text alone. Use Claude 3.5 Sonnet to automate visual data processing tasks, extract valuable information, and enhance data analysis pipelines.

Writing and content generation – Claude 3.5 Sonnet represents a significant leap in its ability to understand nuance and humor. The model produces high-quality written content with a more natural, human tone that feels more authentic and relatable. Use the model to generate engaging and compelling content, streamline your writing workflows, and enhance your storytelling capabilities.

Customer support and natural language processing – With its improved understanding of context and multistep workflow orchestration, Claude 3.5 Sonnet excels at handling intricate customer inquiries. This capability enables round-the-clock support, faster response times, and more natural-sounding interactions, ultimately leading to improved customer satisfaction. Use this model to automate and enhance customer support processes and provide a seamless experience for end users. For an example of a similar implementation, see how DoorDash built a generative AI self-service contact center solution using Anthropic’s Claude 3 models in Amazon Bedrock.

Analytics and insights – Claude 3.5 Sonnet augments human expertise in data science by effortlessly navigating unstructured data and using multiple tools to generate insights. It produces high-quality statistical visualizations and actionable predictions, ranging from business strategies to real-time product trends. Use Claude 3.5 Sonnet to simplify data analysis workflows, uncover valuable insights, and drive data-driven decision-making.

Coding and software development – Claude 3.5 Sonnet can independently write, edit, and execute code with sophisticated reasoning and troubleshooting capabilities when provided with the relevant tools. Use Claude 3.5 Sonnet to streamline developer workflows, accelerate coding tasks, reduce manual effort, and enhance overall productivity.

Using Anthropic’s Claude 3.5 Sonnet in the Amazon Bedrock console
Because this is a new model, I go to the Amazon Bedrock console and choose Model access from the navigation pane to enable access to Claude 3.5 Sonnet.

Console screenshot.

Now that I have access to the model, I’d like to use it to extract information from the following image from the Our World in Data website:

Primary energy consumption from hydropower, 2022

I choose Chat in the Playgrounds section of the navigation pane and select Anthropic and Claude 3.5 Sonnet as the model. I then choose the Image upload button to upload the previous image, type the following prompt, and choose Run:

Which countries consume more than 1000 TWh from hydropower? Think step by step and look at all regions. Output in JSON.

Here’s the result and an animated image recapping what I did.

To determine which countries consume more than 1,000 TWh from hydropower in 2022, I need to carefully examine the map and color-coding. The scale at the bottom shows that dark blue represents consumption between 1,000 TWh and 2,000 TWh.

Let's look at each region systematically:

1. North America: Canada is dark blue, indicating it consumes between 1,000-2,000 TWh from hydropower.
2. South America: Brazil is dark blue, also consuming between 1,000-2,000 TWh.
3. Europe: No country is dark blue; all are light green or have no data.
4. Africa: No data available for any country.
5. Asia: China is dark blue, indicating consumption between 1,000-2,000 TWh.
6. Oceania: No country is dark blue; Australia is light green.

In summary, there are three countries that consume more than 1,000 TWh from hydropower in 2022:
1. Canada (North America)
2. Brazil (South America)
3. China (Asia)

Here's the response in JSON format:

```json
{
  "countries_consuming_over_1000TWh_hydropower": [
    {
      "name": "Canada",
      "region": "North America",
      "consumption_range": "1,000-2,000 TWh"
    },
    {
      "name": "Brazil",
      "region": "South America",
      "consumption_range": "1,000-2,000 TWh"
    },
    {
      "name": "China",
      "region": "Asia",
      "consumption_range": "1,000-2,000 TWh"
    }
  ]
}
```

These three countries stand out as the highest consumers of hydropower energy, each using between 1,000 and 2,000 terawatt-hours in 2022.

Anthropic's Claude 3.5 Sonnet demo in the Amazon Bedrock console.

The model’s ability to reliably extract information from unstructured data, like images, opens up a world of new possibilities.

I choose the three small dots in the corner of the playground window and then View API request to see code examples using the model in the AWS Command Line Interface (AWS CLI) and AWS SDKs. Let’s have a better look at the code syntax.

Using Claude 3.5 Sonnet with AWS SDKs
You can use Claude 3.5 Sonnet with any AWS SDK using the new Amazon Bedrock Converse API or Anthropic Claude Messages API.

To update code already using a Claude 3 model, I just need to replace the model ID with:

anthropic.claude-3-5-sonnet-20240620-v1:0

Here’s a sample implementation with the AWS SDK for Python (Boto3) using the same image as before to show how to use images and text with the Converse API.

import boto3
from botocore.exceptions import ClientError

MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"

IMAGE_NAME = "primary-energy-hydro.png"

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

with open(IMAGE_NAME, "rb") as f:
    image = f.read()

user_message = "Which countries consume more than 1000 TWh from hydropower? Think step by step and look at all regions. Output in JSON."

messages = [
    {
        "role": "user",
        "content": [
            {"image": {"format": "png", "source": {"bytes": image}}},
            {"text": user_message},
        ],
    }
]

response = bedrock_runtime.converse(
    modelId=MODEL_ID,
    messages=messages,
)
response_text = response["output"]["message"]["content"][0]["text"]
print(response_text)

When I run it, I get a similar output as in the console:

Let's approach this step-by-step:

1. First, I'll examine the color scale at the bottom of the map. The darkest blue color represents consumption of 2,000 TWh or more.

2. Now, I'll scan the map region by region:

   North America: Canada is dark blue, indicating over 1,000 TWh.
   South America: Brazil is also dark blue, over 1,000 TWh.
   Europe: No country appears to be dark blue.
   Africa: No country appears to be dark blue.
   Asia: China stands out as dark blue, indicating over 1,000 TWh.
   Oceania: No country appears to be dark blue.

3. To be thorough, I'll double-check for any medium blue countries that might be close to or over 1,000 TWh, but I don't see any that appear to reach that threshold.

4. Based on this analysis, there are three countries that clearly consume more than 1,000 TWh from hydropower.

Now, I'll format the answer in JSON:

```json
{
  "countries_consuming_over_1000TWh_hydropower": [
    "Canada",
    "Brazil",
    "China"
  ]
}
```

This JSON output lists the three countries that visually appear to consume more than 1,000 TWh of primary energy from hydropower according to the 2022 data presented in the map.

Because I didn’t specify a JSON syntax, the two answers use a different format. In your applications, you can describe in the prompt the JSON properties you want or provide a sample to get a standard format in output.

For more examples, see the code samples in the Amazon Bedrock User Guide. For a more advanced use case, here’s a fully functional tool use demo illustrating how to connect a generative AI model with a custom tool or API.

Using Claude 3.5 Sonnet with the AWS CLI
There are times when nothing beats the speed of the command line. This is how you can use the AWS CLI with the new model:

aws bedrock-runtime converse \
    --model-id anthropic.claude-3-5-sonnet-20240620-v1:0 \
    --messages '{"role": "user", "content": [{"text": "Alice has N brothers and she also has M sisters. How many sisters does Alice’s brother have?"}]}' \
    --region us-east-1
    --query output.message.content

In the output, I use the query option to only get the content of the output message:

[
    {
        "text": "Let's approach this step-by-step:\n\n1. First, we need to understand the relationships:\n   - Alice has N brothers\n   - Alice has M sisters\n\n2. Now, let's consider Alice's brother:\n   - He is one of Alice's N brothers\n   - He has the same parents as Alice\n\n3. This means that Alice's brother has:\n   - The same sisters as Alice\n   - One sister more than Alice (because Alice herself is his sister)\n\n4. Therefore, the number of sisters Alice's brother has is:\n   M + 1\n\n   Where M is the number of sisters Alice has.\n\nSo, the answer is: Alice's brother has M + 1 sisters."
    }
]

I copy the text into a small Python program to see it printed on multiple lines:

print("Let's approach this step-by-step:\n\n1. First, we need to understand the relationships:\n   - Alice has N brothers\n   - Alice has M sisters\n\n2. Now, let's consider Alice's brother:\n   - He is one of Alice's N brothers\n   - He has the same parents as Alice\n\n3. This means that Alice's brother has:\n   - The same sisters as Alice\n   - One sister more than Alice (because Alice herself is his sister)\n\n4. Therefore, the number of sisters Alice's brother has is:\n   M + 1\n\n   Where M is the number of sisters Alice has.\n\nSo, the answer is: Alice's brother has M + 1 sisters.")
Let's approach this step-by-step:

1. First, we need to understand the relationships:
   - Alice has N brothers
   - Alice has M sisters

2. Now, let's consider Alice's brother:
   - He is one of Alice's N brothers
   - He has the same parents as Alice

3. This means that Alice's brother has:
   - The same sisters as Alice
   - One sister more than Alice (because Alice herself is his sister)

4. Therefore, the number of sisters Alice's brother has is:
   M + 1

   Where M is the number of sisters Alice has.

So, the answer is: Alice's brother has M + 1 sisters.

Even if this was a quite nuanced question, Claude 3.5 Sonnet got it right and described its reasoning step by step.

Things to know
Anthropic’s Claude 3.5 Sonnet is available in Amazon Bedrock today in the US East (N. Virginia) AWS Region. More information on Amazon Bedrock model support by Region is available in the documentation. View the Amazon Bedrock pricing page to determine the costs for your specific use case.

By providing access to a faster and more powerful model at a lower cost, Claude 3.5 Sonnet makes generative AI easier and more effective to use for many industries, such as:

Healthcare and life sciences – In the medical field, Claude 3.5 Sonnet shows promise in enhancing imaging analysis, acting as a diagnostic assistant for patient triage, and summarizing the latest research findings in an easy-to-digest format.

Financial services – The model can provide valuable assistance in identifying financial trends and creating personalized debt repayment plans tailored to clients’ unique situations.

Legal – Law firms can use the model to accelerate legal research by quickly surfacing relevant precedents and statutes. Additionally, the model can increase paralegal efficiency through contract analysis and assist with drafting standard legal documents.

Media and entertainment – The model can expedite research for journalists, support the creative process of scriptwriting and character development, and provide valuable audience sentiment analysis.

Technology – For software developers, Claude 3.5 Sonnet offers opportunities in rapid application prototyping, legacy code migration, innovative feature ideation, user experience optimization, and identification of friction points.

Education – Educators can use the model to streamline grant proposal writing, develop comprehensive curricula incorporating emerging trends, and receive research assistance through database queries and insight generation.

It’s an exciting time for for generative AI. To start using this new model, see the Anthropic Claude models section of the Amazon Bedrock User Guide. You can also visit our community.aws site to find deep-dive technical content and to discover how our Builder communities are using Amazon Bedrock in their solutions. Let me know what you do with these enhanced capabilities!

Danilo

Announcing the general availability of fully managed MLflow on Amazon SageMaker

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/manage-ml-and-generative-ai-experiments-using-amazon-sagemaker-with-mlflow/

Today, we are thrilled to announce the general availability of a fully managed MLflow capability on Amazon SageMaker. MLflow, a widely-used open-source tool, plays a crucial role in helping machine learning (ML) teams manage the entire ML lifecycle. With this new launch, customers can now effortlessly set up and manage MLflow Tracking Servers with just a few steps, streamlining the process and boosting productivity.

Data Scientists and ML developers can leverage MLflow to track multiple attempts at training models as runs within experiments, compare these runs with visualizations, evaluate models, and register the best models to a Model Registry. Amazon SageMaker eliminates the undifferentiated heavy lifting required to set up and manage MLflow, providing ML administrators with a quick and efficient way to establish secure and scalable MLflow environments on AWS.

Core components of managed MLflow on SageMaker

The fully managed MLflow capability on SageMaker is built around three core components:

  • MLflow Tracking Server – With just a few steps, you can create an MLflow Tracking Server through the SageMaker Studio UI. This stand-alone HTTP server serves multiple REST API endpoints for tracking runs and experiments, enabling you to begin monitoring your ML experiments efficiently. For more granular security customization, you can also use the AWS Command Line Interface (AWS CLI).
  • MLflow backend metadata store – The metadata store is a critical part of the MLflow Tracking Server, where all metadata related to experiments, runs, and artifacts is persisted. This includes experiment names, run IDs, parameter values, metrics, tags, and artifact locations, ensuring comprehensive tracking and management of your ML experiments.
  • MLflow artifact store – This component provides a storage location for all artifacts generated during ML experiments, such as trained models, datasets, logs, and plots. Utilizing an Amazon Simple Storage Service (Amazon S3) bucket, it offers a customer-managed AWS account for storing these artifacts securely and efficiently.

Benefits of Amazon SageMaker with MLflow

Using Amazon SageMaker with MLflow can streamline and enhance your machine learning workflows:

  • Comprehensive Experiment Tracking: Track experiments in MLflow across local integrated development environments (IDEs), managed IDEs in SageMaker Studio, SageMaker training jobs, SageMaker processing jobs, and SageMaker Pipelines.
  • Full MLflow Capabilities: Use all MLflow experimentation capabilities such as MLflow Tracking, MLflow Evaluations, and MLflow Model Registry, are available to easily compare and evaluate the results of training iterations.
  • Unified Model Governance: Models registered in MLflow automatically appear in the SageMaker Model Registry, offering a unified model governance experience that helps you deploy MLflow models to SageMaker inference without building custom containers.
  • Efficient Server Management: Provision, remove, and upgrade MLflow Tracking Servers as desired using SageMaker APIs or the SageMaker Studio UI. SageMaker manages the scaling, patching, and ongoing maintenance of your tracking servers, without customers needing to manage the underlying infrastructure.
  • Enhanced Security: Secure access to MLflow Tracking Servers using AWS Identity and Access Management (IAM). Write IAM policies to grant or deny access to specific MLflow APIs, ensuring robust security for your ML environments.
  • Effective Monitoring and Governance: Monitor the activity on an MLflow Tracking Server using Amazon EventBridge and AWS CloudTrail to support effective governance of their Tracking Servers.

MLflow Tracking Server prerequisites (environment setup)

  1. Create a SageMaker Studio domain
    You can create a SageMaker Studio domain using the new SageMaker Studio experience.
  2. Configure the IAM execution role
    The MLflow Tracking Server needs an IAM execution role to read and write artifacts to Amazon S3 and register models in SageMaker. You can use the Studio domain execution role as the Tracking Server execution role or you can create a separate role for the Tracking Server execution role. If you choose to create a new role for this, refer to the SageMaker Developer Guide for more details on the IAM role. If you choose to update the Studio domain execution role, refer to the SageMaker Developer Guide for details on what IAM policy the role needs.

Create the MLflow Tracking Server
In the walkthrough, I use the default settings for creating an MLflow Tracking Server, which include the Tracking Server version (2.13.2), the Tracking Server size (Small), and the Tracking Server execution role (Studio domain execution role). The Tracking Server size determines how much usage a Tracking Server will support, and we recommend using a Small Tracking Server for teams of up to 25 users. For more details on Tracking Server configurations, read the SageMaker Developer Guide.

To get started, in your SageMaker Studio domain created during your environment set up detailed earlier, select MLflow under Applications and choose Create.

Next, provide a Name and Artifact storage location (S3 URI) for the Tracking Server.

Creating an MLflow Tracking Server can take up to 25 minutes.


Track and compare training runs
To get started with logging metrics, parameters, and artifacts to MLflow, you need a Jupyter Notebook and your Tracking Server ARN that was assigned during the creation step. You can use the MLflow SDK to keep track of training runs and compare them using the MLflow UI.


To register models from MLflow Model Registry to SageMaker Model Registry, you need the sagemaker-mlflow plugin to authenticate all MLflow API requests made by the MLflow SDK using AWS Signature V4.

  1. Install the MLflow SDK and sagemaker-mlflow plugin
    In your notebook, first install the MLflow SDK and sagemaker-mlflow Python plugin.
    pip install mlflow==2.13.2 sagemaker-mlflow==0.1.0
  2. Track a run in an experiment
    To track a run in an experiment, copy the following code into your Jupyter notebook.

    import mlflow
    import mlflow.sklearn
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
    
    # Replace this with the ARN of the Tracking Server you just created
    arn = 'YOUR-TRACKING-SERVER-ARN'
    
    mlflow.set_tracking_uri(arn)
    
    # Load the Iris dataset
    iris = load_iris()
    X, y = iris.data, iris.target
    
    # Split the data into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    
    # Train a Random Forest classifier
    rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
    rf_model.fit(X_train, y_train)
    
    # Make predictions on the test set
    y_pred = rf_model.predict(X_test)
    
    # Calculate evaluation metrics
    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred, average='weighted')
    recall = recall_score(y_test, y_pred, average='weighted')
    f1 = f1_score(y_test, y_pred, average='weighted')
    
    # Start an MLflow run
    with mlflow.start_run():
    # Log the model
    mlflow.sklearn.log_model(rf_model, "random_forest_model")
    
    # Log the evaluation metrics
    mlflow.log_metric("accuracy", accuracy)
    mlflow.log_metric("precision", precision)
    mlflow.log_metric("recall", recall)
    mlflow.log_metric("f1_score", f1)
  3. View your run in the MLflow UI
    Once you run the notebook shown in Step 2, you will see a new run in the MLflow UI.
  4. Compare runs
    You can run this notebook multiple times by changing the random_state to generate different metric values for each training run.

Register candidate models
Once you’ve compared the multiple runs as detailed in Step 4, you can register the model whose metrics best meet your requirements in the MLflow Model Registry. Registering a model indicates potential suitability for production deployment and there will be further testing to validate this suitability. Once a model is registered in MLflow it automatically appears in the SageMaker Model Registry for a unified model governance experience so you can deploy MLflow models to SageMaker inference. This enables data scientists who primarily use MLflow for experimentation to hand off their models to ML engineers who govern and manage production deployments of models using the SageMaker Model Registry.

Here is the model registered in the MLflow Model Registry.


Here is the model registered in the SageMaker Model Registry.

Clean up
Once created, an MLflow Tracking Server will incur costs until you delete or stop it. Billing for Tracking Servers is based on the duration the servers have been running, the size selected, and the amount of data logged to the Tracking Servers. You can stop Tracking Servers when they are not in use to save costs or delete them using API or the SageMaker Studio UI. For more details on pricing, see the Amazon SageMaker pricing.

Now available
SageMaker with MLflow is generally available in all AWS Regions where SageMaker Studio is available, except China and US GovCloud Regions. We invite you to explore this new capability and experience the enhanced efficiency and control it brings to your machine learning projects. To learn more, visit the SageMaker with MLflow product detail page.

For more information, visit the SageMaker Developer Guide and send feedback to AWS re:Post for SageMaker or through your usual AWS support contacts.

Veliswa

Video annotator: building video classifiers using vision-language models and active learning

Post Syndicated from Netflix Technology Blog original https://netflixtechblog.com/video-annotator-building-video-classifiers-using-vision-language-models-and-active-learning-8ebdda0b2db4

Video annotator: a framework for efficiently building video classifiers using vision-language models and active learning

Amir Ziai, Aneesh Vartakavi, Kelli Griggs, Eugene Lok, Yvonne Jukes, Alex Alonso, Vi Iyengar, Anna Pulido

Introduction

Problem

High-quality and consistent annotations are fundamental to the successful development of robust machine learning models. Conventional techniques for training machine learning classifiers are resource intensive. They involve a cycle where domain experts annotate a dataset, which is then transferred to data scientists to train models, review outcomes, and make changes. This labeling process tends to be time-consuming and inefficient, sometimes halting after a few annotation cycles.

Implications

Consequently, less effort is invested in annotating high-quality datasets compared to iterating on complex models and algorithmic methods to improve performance and fix edge cases. As a result, ML systems grow rapidly in complexity.

Furthermore, constraints on time and resources often result in leveraging third-party annotators rather than domain experts. These annotators perform the labeling task without a deep understanding of the model’s intended deployment or usage, often making consistent labeling of borderline or hard examples, especially in more subjective tasks, a challenge.

This necessitates multiple review rounds with domain experts, leading to unexpected costs and delays. This lengthy cycle can also result in model drift, as it takes longer to fix edge cases and deploy new models, potentially hurting usefulness and stakeholder trust.

Solution

We suggest that more direct involvement of domain experts, using a human-in-the-loop system, can resolve many of these practical challenges. We introduce a novel framework, Video Annotator (VA), which leverages active learning techniques and zero-shot capabilities of large vision-language models to guide users to focus their efforts on progressively harder examples, enhancing the model’s sample efficiency and keeping costs low.

VA seamlessly integrates model building into the data annotation process, facilitating user validation of the model before deployment, therefore helping with building trust and fostering a sense of ownership. VA also supports a continuous annotation process, allowing users to rapidly deploy models, monitor their quality in production, and swiftly fix any edge cases by annotating a few more examples and deploying a new model version.

This self-service architecture empowers users to make improvements without active involvement of data scientists or third-party annotators, allowing for fast iteration.

Video understanding

We design VA to assist in granular video understanding which requires the identification of visuals, concepts, and events within video segments. Video understanding is fundamental for numerous applications such as search and discovery, personalization, and the creation of promotional assets. Our framework allows users to efficiently train machine learning models for video understanding by developing an extensible set of binary video classifiers, which power scalable scoring and retrieval of a vast catalog of content.

Video classification

Video classification is the task of assigning a label to an arbitrary-length video clip, often accompanied by a probability or prediction score, as illustrated in Fig 1.

Fig 1- Functional view of a binary video classifier. A few-second clip from ”Operation Varsity Blues: The College Admissions Scandal” is passed to a binary classifier for detecting the ”establishing shots” label. The classifier outputs a very high score (score is between 0 and 1), indicating that the video clip is very likely an establishing shot. In filmmaking, an establishing shot is a wide shot (i.e. video clip between two consecutive cuts) of a building or a landscape that is intended for establishing the time and location of the scene.

Video understanding via an extensible set of video classifiers

Binary classification allows for independence and flexibility, allowing us to add or improve one model independent of the others. It also has the additional benefit of being easier to understand and build for our users. Combining the predictions of multiple models allows us a deeper understanding of the video content at various levels of granularity, illustrated in Fig 2.

Fig 2- Three video clips and the corresponding binary classifier scores for three video understanding labels. Note that these labels are not mutually exclusive. Video clips are from Operation Varsity Blues: The College Admissions Scandal, 6 Underground, and Leave The World Behind, respectively.

Video Annotator (VA)

In this section, we describe VA’s three-step process for building video classifiers.

Step 1 — search

Users begin by finding an initial set of examples within a large, diverse corpus to bootstrap the annotation process. We leverage text-to-video search to enable this, powered by video and text encoders from a Vision-Language Model to extract embeddings. For example, an annotator working on the establishing shots model may start the process by searching for “wide shots of buildings”, illustrated in Fig 3.

Fig 3- Step 1 — Text-to-video search to bootstrap the annotation process.

Step 2 — active learning

The next stage involves a classic Active Learning loop. VA then builds a lightweight binary classifier over the video embeddings, which is subsequently used to score all clips in the corpus, and presents some examples within feeds for further annotation and refinement, as illustrated in Fig 4.

Fig 4- Step 2 — Active Learning loop. The annotator clicks on build, which initiates classifier training and scoring of all clips in a video corpus. Scored clips are organized in four feeds.

The top-scoring positive and negative feeds display examples with the highest and lowest scores respectively. Our users reported that this provided a valuable indication as to whether the classifier has picked up the correct concepts in the early stages of training and spot cases of bias in the training data that they were able to subsequently fix. We also include a feed of “borderline” examples that the model is not confident about. This feed helps with discovering interesting edge cases and inspires the need for labeling additional concepts. Finally, the random feed consists of randomly selected clips and helps to annotate diverse examples which is important for generalization.

The annotator can label additional clips in any of the feeds and build a new classifier and repeat as many times as desired.

Step 3 — review

The last step simply presents the user with all annotated clips. It’s a good opportunity to spot annotation mistakes and to identify ideas and concepts for further annotation via search in step 1. From this step, users often go back to step 1 or step 2 to refine their annotations.

Experiments

To evaluate VA, we asked three video experts to annotate a diverse set of 56 labels across a video corpus of 500k shots. We compared VA to the performance of a few baseline methods, and observed that VA leads to the creation of higher quality video classifiers. Fig 5 compares VA’s performance to baselines as a function of the number of annotated clips.

Fig 5- Model quality (i.e. Average Precision) as a function of the number of annotated clips for the “establishing shots” label. We observe that all methods outperform the baseline, and that all methods benefit from additional annotated data, albeit to varying degrees.

You can find more details about VA and our experiments in this paper.

Conclusion

We presented Video Annotator (VA), an interactive framework that addresses many challenges associated with conventional techniques for training machine learning classifiers. VA leverages the zero-shot capabilities of large vision-language models and active learning techniques to enhance sample efficiency and reduce costs. It offers a unique approach to annotating, managing, and iterating on video classification datasets, emphasizing the direct involvement of domain experts in a human-in-the-loop system. By enabling these users to rapidly make informed decisions on hard samples during the annotation process, VA increases the system’s overall efficiency. Moreover, it allows for a continuous annotation process, allowing users to swiftly deploy models, monitor their quality in production, and rapidly fix any edge cases.

This self-service architecture empowers domain experts to make improvements without the active involvement of data scientists or third-party annotators, and fosters a sense of ownership, thereby building trust in the system.

We conducted experiments to study the performance of VA, and found that it yields a median 8.3 point improvement in Average Precision relative to the most competitive baseline across a wide-ranging assortment of video understanding tasks. We release a dataset with 153k labels across 56 video understanding tasks annotated by three professional video editors using VA, and also release code to replicate our experiments.


Video annotator: building video classifiers using vision-language models and active learning was originally published in Netflix TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Rethinking Democracy for the Age of AI

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/06/rethinking-democracy-for-the-age-of-ai.html

There is a lot written about technology’s threats to democracy. Polarization. Artificial intelligence. The concentration of wealth and power. I have a more general story: The political and economic systems of governance that were created in the mid-18th century are poorly suited for the 21st century. They don’t align incentives well. And they are being hacked too effectively.

At the same time, the cost of these hacked systems has never been greater, across all human history. We have become too powerful as a species. And our systems cannot keep up with fast-changing disruptive technologies.

We need to create new systems of governance that align incentives and are resilient against hacking … at every scale. From the individual all the way up to the whole of society.

For this, I need you to drop your 20th century either/or thinking. This is not about capitalism versus communism. It’s not about democracy versus autocracy. It’s not even about humans versus AI. It’s something new, something we don’t have a name for yet. And it’s “blue sky” thinking, not even remotely considering what’s feasible today.

Throughout this talk, I want you to think of both democracy and capitalism as information systems. Socio-technical information systems. Protocols for making group decisions. Ones where different players have different incentives. These systems are vulnerable to hacking and need to be secured against those hacks.

We security technologists have a lot of expertise in both secure system design and hacking. That’s why we have something to add to this discussion.

And finally, this is a work in progress. I’m trying to create a framework for viewing governance. So think of this more as a foundation for discussion, rather than a road map to a solution. And I think by writing, and what you’re going to hear is the current draft of my writing—and my thinking. So everything is subject to change without notice.

OK, so let’s go.

We all know about misinformation and how it affects democracy. And how propagandists have used it to advance their agendas. This is an ancient problem, amplified by information technologies. Social media platforms that prioritize engagement. “Filter bubble” segmentation. And technologies for honing persuasive messages.

The problem ultimately stems from the way democracies use information to make policy decisions. Democracy is an information system that leverages collective intelligence to solve political problems. And then to collect feedback as to how well those solutions are working. This is different from autocracies that don’t leverage collective intelligence for political decision making. Or have reliable mechanisms for collecting feedback from their populations.

Those systems of democracy work well, but have no guardrails when fringe ideas become weaponized. That’s what misinformation targets. The historical solution for this was supposed to be representation. This is currently failing in the US, partly because of gerrymandering, safe seats, only two parties, money in politics and our primary system. But the problem is more general.

James Madison wrote about this in 1787, where he made two points. One, that representatives serve to filter popular opinions, limiting extremism. And two, that geographical dispersal makes it hard for those with extreme views to participate. It’s hard to organize. To be fair, these limitations are both good and bad. In any case, current technology—social media—breaks them both.

So this is a question: What does representation look like in a world without either filtering or geographical dispersal? Or, how do we avoid polluting 21st century democracy with prejudice, misinformation and bias. Things that impair both the problem solving and feedback mechanisms.

That’s the real issue. It’s not about misinformation, it’s about the incentive structure that makes misinformation a viable strategy.

This is problem No. 1: Our systems have misaligned incentives. What’s best for the small group often doesn’t match what’s best for the whole. And this is true across all sorts of individuals and group sizes.

Now, historically, we have used misalignment to our advantage. Our current systems of governance leverage conflict to make decisions. The basic idea is that coordination is inefficient and expensive. Individual self-interest leads to local optimizations, which results in optimal group decisions.

But this is also inefficient and expensive. The U.S. spent $14.5 billion on the 2020 presidential, senate and congressional elections. I don’t even know how to calculate the cost in attention. That sounds like a lot of money, but step back and think about how the system works. The economic value of winning those elections are so great because that’s how you impose your own incentive structure on the whole.

More generally, the cost of our market economy is enormous. For example, $780 billion is spent world-wide annually on advertising. Many more billions are wasted on ventures that fail. And that’s just a fraction of the total resources lost in a competitive market environment. And there are other collateral damages, which are spread non-uniformly across people.

We have accepted these costs of capitalism—and democracy—because the inefficiency of central planning was considered to be worse. That might not be true anymore. The costs of conflict have increased. And the costs of coordination have decreased. Corporations demonstrate that large centrally planned economic units can compete in today’s society. Think of Walmart or Amazon. If you compare GDP to market cap, Apple would be the eighth largest country on the planet. Microsoft would be the tenth.

Another effect of these conflict-based systems is that they foster a scarcity mindset. And we have taken this to an extreme. We now think in terms of zero-sum politics. My party wins, your party loses. And winning next time can be more important than governing this time. We think in terms of zero-sum economics. My product’s success depends on my competitors’ failures. We think zero-sum internationally. Arms races and trade wars.

Finally, conflict as a problem-solving tool might not give us good enough answers anymore. The underlying assumption is that if everyone pursues their own self interest, the result will approach everyone’s best interest. That only works for simple problems and requires systemic oppression. We have lots of problems—complex, wicked, global problems—that don’t work that way. We have interacting groups of problems that don’t work that way. We have problems that require more efficient ways of finding optimal solutions.

Note that there are multiple effects of these conflict-based systems. We have bad actors deliberately breaking the rules. And we have selfish actors taking advantage of insufficient rules.

The latter is problem No. 2: What I refer to as “hacking” in my latest book: “A Hacker’s Mind.” Democracy is a socio-technical system. And all socio-technical systems can be hacked. By this I mean that the rules are either incomplete or inconsistent or outdated—they have loopholes. And these can be used to subvert the rules. This is Peter Thiel subverting the Roth IRA to avoid paying taxes on $5 billion in income. This is gerrymandering, the filibuster, and must-pass legislation. Or tax loopholes, financial loopholes, regulatory loopholes.

In today’s society, the rich and powerful are just too good at hacking. And it is becoming increasingly impossible to patch our hacked systems. Because the rich use their power to ensure that the vulnerabilities don’t get patched.

This is bad for society, but it’s basically the optimal strategy in our competitive governance systems. Their zero-sum nature makes hacking an effective, if parasitic, strategy. Hacking isn’t a new problem, but today hacking scales better—and is overwhelming the security systems in place to keep hacking in check. Think about gun regulations, climate change, opioids. And complex systems make this worse. These are all non-linear, tightly coupled, unrepeatable, path-dependent, adaptive, co-evolving systems.

Now, add into this mix the risks that arise from new and dangerous technologies such as the internet or AI or synthetic biology. Or molecular nanotechnology, or nuclear weapons. Here, misaligned incentives and hacking can have catastrophic consequences for society.

This is problem No. 3: Our systems of governance are not suited to our power level. They tend to be rights based, not permissions based. They’re designed to be reactive, because traditionally there was only so much damage a single person could do.

We do have systems for regulating dangerous technologies. Consider automobiles. They are regulated in many ways: drivers licenses + traffic laws + automobile regulations + road design. Compare this to aircrafts. Much more onerous licensing requirements, rules about flights, regulations on aircraft design and testing and a government agency overseeing it all day-to-day. Or pharmaceuticals, which have very complex rules surrounding everything around researching, developing, producing and dispensing. We have all these regulations because this stuff can kill you.

The general term for this kind of thing is the “precautionary principle.” When random new things can be deadly, we prohibit them unless they are specifically allowed.

So what happens when a significant percentage of our jobs are as potentially damaging as a pilot’s? Or even more damaging? When one person can affect everyone through synthetic biology. Or where a corporate decision can directly affect climate. Or something in AI or robotics. Things like the precautionary principle are no longer sufficient. Because breaking the rules can have global effects.

And AI will supercharge hacking. We have created a series of non-interoperable systems that actually interact and AI will be able to figure out how to take advantage of more of those interactions: finding new tax loopholes or finding new ways to evade financial regulations. Creating “micro-legislation” that surreptitiously benefits a particular person or group. And catastrophic risk means this is no longer tenable.

So these are our core problems: misaligned incentives leading to too effective hacking of systems where the costs of getting it wrong can be catastrophic.

Or, to put more words on it: Misaligned incentives encourage local optimization, and that’s not a good proxy for societal optimization. This encourages hacking, which now generates greater harm than at any point in the past because the amount of damage that can result from local optimization is greater than at any point in the past.

OK, let’s get back to the notion of democracy as an information system. It’s not just democracy: Any form of governance is an information system. It’s a process that turns individual beliefs and preferences into group policy decisions. And, it uses feedback mechanisms to determine how well those decisions are working and then makes corrections accordingly.

Historically, there are many ways to do this. We can have a system where no one’s preference matters except the monarch’s or the nobles’ or the landowners’. Sometimes the stronger army gets to decide—or the people with the money.

Or we could tally up everyone’s preferences and do the thing that at least half of the people want. That’s basically the promise of democracy today, at its ideal. Parliamentary systems are better, but only in the margins—and it all feels kind of primitive. Lots of people write about how informationally poor elections are at aggregating individual preferences. It also results in all these misaligned incentives.

I realize that democracy serves different functions. Peaceful transition of power, minimizing harm, equality, fair decision making, better outcomes. I am taking for granted that democracy is good for all those things. I’m focusing on how we implement it.

Modern democracy uses elections to determine who represents citizens in the decision-making process. And all sorts of other ways to collect information about what people think and want, and how well policies are working. These are opinion polls, public comments to rule-making, advocating, lobbying, protesting and so on. And, in reality, it’s been hacked so badly that it does a terrible job of executing on the will of the people, creating further incentives to hack these systems.

To be fair, the democratic republic was the best form of government that mid 18th century technology could invent. Because communications and travel were hard, we needed to choose one of us to go all the way over there and pass laws in our name. It was always a coarse approximation of what we wanted. And our principles, values, conceptions of fairness; our ideas about legitimacy and authority have evolved a lot since the mid 18th century. Even the notion of optimal group outcomes depended on who was considered in the group and who was out.

But democracy is not a static system, it’s an aspirational direction. One that really requires constant improvement. And our democratic systems have not evolved at the same pace that our technologies have. Blocking progress in democracy is itself a hack of democracy.

Today we have much better technology that we can use in the service of democracy. Surely there are better ways to turn individual preferences into group policies. Now that communications and travel are easy. Maybe we should assign representation by age, or profession or randomly by birthday. Maybe we can invent an AI that calculates optimal policy outcomes based on everyone’s preferences.

Whatever we do, we need systems that better align individual and group incentives, at all scales. Systems designed to be resistant to hacking. And resilient to catastrophic risks. Systems that leverage cooperation more and conflict less. And are not zero-sum.

Why can’t we have a game where everybody wins?

This has never been done before. It’s not capitalism, it’s not communism, it’s not socialism. It’s not current democracies or autocracies. It would be unlike anything we’ve ever seen.

Some of this comes down to how trust and cooperation work. When I wrote “Liars and Outliers” in 2012, I wrote about four systems for enabling trust: our innate morals, concern about our reputations, the laws we live under and security technologies that constrain our behavior. I wrote about how the first two are more informal than the last two. And how the last two scale better, and allow for larger and more complex societies. They enable cooperation amongst strangers.

What I didn’t appreciate is how different the first and last two are. Morals and reputation are both old biological systems of trust. They’re person to person, based on human connection and cooperation. Laws—and especially security technologies—are newer systems of trust that force us to cooperate. They’re socio-technical systems. They’re more about confidence and control than they are about trust. And that allows them to scale better. Taxi driver used to be one of the country’s most dangerous professions. Uber changed that through pervasive surveillance. My Uber driver and I don’t know or trust each other, but the technology lets us both be confident that neither of us will cheat or attack each other. Both drivers and passengers compete for star rankings, which align local and global incentives.

In today’s tech-mediated world, we are replacing the rituals and behaviors of cooperation with security mechanisms that enforce compliance. And innate trust in people with compelled trust in processes and institutions. That scales better, but we lose the human connection. It’s also expensive, and becoming even more so as our power grows. We need more security for these systems. And the results are much easier to hack.

But here’s the thing: Our informal human systems of trust are inherently unscalable. So maybe we have to rethink scale.

Our 18th century systems of democracy were the only things that scaled with the technology of the time. Imagine a group of friends deciding where to have dinner. One is kosher, one is a vegetarian. They would never use a winner-take-all ballot to decide where to eat. But that’s a system that scales to large groups of strangers.

Scale matters more broadly in governance as well. We have global systems of political and economic competition. On the other end of the scale, the most common form of governance on the planet is socialism. It’s how families function: people work according to their abilities, and resources are distributed according to their needs.

I think we need governance that is both very large and very small. Our catastrophic technological risks are planetary-scale: climate change, AI, internet, bio-tech. And we have all the local problems inherent in human societies. We have very few problems anymore that are the size of France or Virginia. Some systems of governance work well on a local level but don’t scale to larger groups. But now that we have more technology, we can make other systems of democracy scale.

This runs headlong into historical norms about sovereignty. But that’s already becoming increasingly irrelevant. The modern concept of a nation arose around the same time as the modern concept of democracy. But constituent boundaries are now larger and more fluid, and depend a lot on context. It makes no sense that the decisions about the “drug war”—or climate migration—are delineated by nation. The issues are much larger than that. Right now there is no governance body with the right footprint to regulate Internet platforms like Facebook. Which has more users world-wide than Christianity.

We also need to rethink growth. Growth only equates to progress when the resources necessary to grow are cheap and abundant. Growth is often extractive. And at the expense of something else. Growth is how we fuel our zero-sum systems. If the pie gets bigger, it’s OK that we waste some of the pie in order for it to grow. That doesn’t make sense when resources are scarce and expensive. Growing the pie can end up costing more than the increase in pie size. Sustainability makes more sense. And a metric more suited to the environment we’re in right now.

Finally, agility is also important. Back to systems theory, governance is an attempt to control complex systems with complicated systems. This gets harder as the systems get larger and more complex. And as catastrophic risk raises the costs of getting it wrong.

In recent decades, we have replaced the richness of human interaction with economic models. Models that turn everything into markets. Market fundamentalism scaled better, but the social cost was enormous. A lot of how we think and act isn’t captured by those models. And those complex models turn out to be very hackable. Increasingly so at larger scales.

Lots of people have written about the speed of technology versus the speed of policy. To relate it to this talk: Our human systems of governance need to be compatible with the technologies they’re supposed to govern. If they’re not, eventually the technological systems will replace the governance systems. Think of Twitter as the de facto arbiter of free speech.

This means that governance needs to be agile. And able to quickly react to changing circumstances. Imagine a court saying to Peter Thiel: “Sorry. That’s not how Roth IRAs are supposed to work. Now give us our tax on that $5B.” This is also essential in a technological world: one that is moving at unprecedented speeds, where getting it wrong can be catastrophic and one that is resource constrained. Agile patching is how we maintain security in the face of constant hacking—and also red teaming. In this context, both journalism and civil society are important checks on government.

I want to quickly mention two ideas for democracy, one old and one new. I’m not advocating for either. I’m just trying to open you up to new possibilities. The first is sortition. These are citizen assemblies brought together to study an issue and reach a policy decision. They were popular in ancient Greece and Renaissance Italy, and are increasingly being used today in Europe. The only vestige of this in the U.S. is the jury. But you can also think of trustees of an organization. The second idea is liquid democracy. This is a system where everybody has a proxy that they can transfer to someone else to vote on their behalf. Representatives hold those proxies, and their vote strength is proportional to the number of proxies they have. We have something like this in corporate proxy governance.

Both of these are algorithms for converting individual beliefs and preferences into policy decisions. Both of these are made easier through 21st century technologies. They are both democracies, but in new and different ways. And while they’re not immune to hacking, we can design them from the beginning with security in mind.

This points to technology as a key component of any solution. We know how to use technology to build systems of trust. Both the informal biological kind and the formal compliance kind. We know how to use technology to help align incentives, and to defend against hacking.

We talked about AI hacking; AI can also be used to defend against hacking, finding vulnerabilities in computer code, finding tax loopholes before they become law and uncovering attempts at surreptitious micro-legislation.

Think back to democracy as an information system. Can AI techniques be used to uncover our political preferences and turn them into policy outcomes, get feedback and then iterate? This would be more accurate than polling. And maybe even elections. Can an AI act as our representative? Could it do a better job than a human at voting the preferences of its constituents?

Can we have an AI in our pocket that votes on our behalf, thousands of times a day, based on the preferences it infers we have. Or maybe based on the preferences it infers we would have if we read up on the issues and weren’t swayed by misinformation. It’s just another algorithm for converting individual preferences into policy decisions. And it certainly solves the problem of people not paying attention to politics.

But slow down: This is rapidly devolving into technological solutionism. And we know that doesn’t work.

A general question to ask here is when do we allow algorithms to make decisions for us? Sometimes it’s easy. I’m happy to let my thermostat automatically turn my heat on and off or to let an AI drive a car or optimize the traffic lights in a city. I’m less sure about an AI that sets tax rates, or corporate regulations or foreign policy. Or an AI that tells us that it can’t explain why, but strongly urges us to declare war—right now. Each of these is harder because they are more complex systems: non-local, multi-agent, long-duration and so on. I also want any AI that works on my behalf to be under my control. And not controlled by a large corporate monopoly that allows me to use it.

And learned helplessness is an important consideration. We’re probably OK with no longer needing to know how to drive a car. But we don’t want a system that results in us forgetting how to run a democracy. Outcomes matter here, but so do mechanisms. Any AI system should engage individuals in the process of democracy, not replace them.

So while an AI that does all the hard work of governance might generate better policy outcomes. There is social value in a human-centric political system, even if it is less efficient. And more technologically efficient preference collection might not be better, even if it is more accurate.

Procedure and substance need to work together. There is a role for AI in decision making: moderating discussions, highlighting agreements and disagreements helping people reach consensus. But it is an independent good that we humans remain engaged in—and in charge of—the process of governance.

And that value is critical to making democracy function. Democratic knowledge isn’t something that’s out there to be gathered: It’s dynamic; it gets produced through the social processes of democracy. The term of art is “preference formation.” We’re not just passively aggregating preferences, we create them through learning, deliberation, negotiation and adaptation. Some of these processes are cooperative and some of these are competitive. Both are important. And both are needed to fuel the information system that is democracy.

We’re never going to remove conflict and competition from our political and economic systems. Human disagreement isn’t just a surface feature; it goes all the way down. We have fundamentally different aspirations. We want different ways of life. I talked about optimal policies. Even that notion is contested: optimal for whom, with respect to what, over what time frame? Disagreement is fundamental to democracy. We reach different policy conclusions based on the same information. And it’s the process of making all of this work that makes democracy possible.

So we actually can’t have a game where everybody wins. Our goal has to be to accommodate plurality, to harness conflict and disagreement, and not to eliminate it. While, at the same time, moving from a player-versus-player game to a player-versus-environment game.

There’s a lot missing from this talk. Like what these new political and economic governance systems should look like. Democracy and capitalism are intertwined in complex ways, and I don’t think we can recreate one without also recreating the other. My comments about agility lead to questions about authority and how that interplays with everything else. And how agility can be hacked as well. We haven’t even talked about tribalism in its many forms. In order for democracy to function, people need to care about the welfare of strangers who are not like them. We haven’t talked about rights or responsibilities. What is off limits to democracy is a huge discussion. And Butterin’s trilemma also matters here: that you can’t simultaneously build systems that are secure, distributed, and scalable.

I also haven’t given a moment’s thought to how to get from here to there. Everything I’ve talked about—incentives, hacking, power, complexity—also applies to any transition systems. But I think we need to have unconstrained discussions about what we’re aiming for. If for no other reason than to question our assumptions. And to imagine the possibilities. And while a lot of the AI parts are still science fiction, they’re not far-off science fiction.

I know we can’t clear the board and build a new governance structure from scratch. But maybe we can come up with ideas that we can bring back to reality.

To summarize, the systems of governance we designed at the start of the Industrial Age are ill-suited to the Information Age. Their incentive structures are all wrong. They’re insecure and they’re wasteful. They don’t generate optimal outcomes. At the same time we’re facing catastrophic risks to society due to powerful technologies. And a vastly constrained resource environment. We need to rethink our systems of governance; more cooperation and less competition and at scales that are suited to today’s problems and today’s technologies. With security and precautions built in. What comes after democracy might very well be more democracy, but it will look very different.

This feels like a challenge worthy of our security expertise.

This text is the transcript from a keynote speech delivered during the RSA Conference in San Francisco on April 25, 2023. It was previously published in Cyberscoop. I thought I posted it to my blog and Crypto-Gram last year, but it seems that I didn’t.

Using LLMs to Exploit Vulnerabilities

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/06/using-llms-to-exploit-vulnerabilities.html

Interesting research: “Teams of LLM Agents can Exploit Zero-Day Vulnerabilities.”

Abstract: LLM agents have become increasingly sophisticated, especially in the realm of cybersecurity. Researchers have shown that LLM agents can exploit real-world vulnerabilities when given a description of the vulnerability and toy capture-the-flag problems. However, these agents still perform poorly on real-world vulnerabilities that are unknown to the agent ahead of time (zero-day vulnerabilities).

In this work, we show that teams of LLM agents can exploit real-world, zero-day vulnerabilities. Prior agents struggle with exploring many different vulnerabilities and long-range planning when used alone. To resolve this, we introduce HPTSA, a system of agents with a planning agent that can launch subagents. The planning agent explores the system and determines which subagents to call, resolving long-term planning issues when trying different vulnerabilities. We construct a benchmark of 15 real-world vulnerabilities and show that our team of agents improve over prior work by up to 4.5×.

The LLMs aren’t finding new vulnerabilities. They’re exploiting zero-days—which means they are not trained on them—in new ways. So think about this sort of thing combined with another AI that finds new vulnerabilities in code.

These kinds of developments are important to follow, as they are part of the puzzle of a fully autonomous AI cyberattack agent. I talk about this sort of thing more here.

Rapid7 Infuses Generative AI into the InsightPlatform to Supercharge SecOps and Augment MDR Services

Post Syndicated from Laura Ellis original https://blog.rapid7.com/2024/06/13/rapid7-infuses-generative-ai-into-the-insightplatform-to-supercharge-secops-and-augment-mdr-services/

Rapid7 Infuses Generative AI into the InsightPlatform to Supercharge SecOps and Augment MDR Services

In the ever-evolving landscape of cybersecurity, staying ahead of threats is not just a goal—it’s a necessity. At Rapid7, we are pioneering the infusion of artificial intelligence (AI) into our platform and service offerings, transforming the way security operations centers (SOCs) around the globe operate. We’ve been utilizing AI in our technologies for decades, establishing patented models to better and more efficiently solve customer challenges. Furthering this endeavor, we’re excited to announce we’ve extended the Rapid7 AI Engine to include new Generative AI capabilities being used by our internal SOC teams, transforming the way we deliver our MDR services.

A Thoughtful, Deliberate Approach to AI Model Deployment

At Rapid7, one of our core philosophical beliefs is that vendors – like ourselves – should not lean on customers to tune our models. This belief is showcased by our approach to deploying AI models, with a process that entails initially releasing them to our internal SOC teams to be trained and battle-tested before being released to customers via in-product experiences.

Another core pillar of our AI development principles is that human supervision is essential and can’t be completely removed from the process. We believe wholeheartedly in the efficacy of our models, but the reality is that AI is not immune from making mistakes. At Rapid7, we have the advantage of working in lockstep with one of the world’s leading SOC teams. With a continuous feedback loop in place between our frontline analysts and our AI and data science team, we’re constantly fine-tuning our models, and MDR customers benefit from knowing our teams are validating any AI-generated output for accuracy.

Intelligent Threat Detection and Continuous Alert Triage Validation

The first line of defense in any cybersecurity strategy is the ability to detect threats accurately and efficiently. The Rapid7 AI Engine leverages the massive volume of high-fidelity risk and threat data to enhance alert triage by accurately distinguishing between malicious and benign alerts, ensuring analysts can focus on only the alerts that are truly malicious. The engine has also been extended to include a combination of both traditional machine learning (ML) and Generative AI models to ensure new security alerts are accurately labeled as malicious or benign. This work boosts the signal to noise ratio, thereby enabling Rapid7 analysts to spend more time investigating the security signals that matter to our customers.

Introducing Our AI-Powered SOC Assistant

Generative AI is not just a tool; it’s a game-changer for SOC efficiency. Our AI-native SOC assistant empowers MDR analysts to quickly respond to security threats and proactively mitigate risks on behalf of our customers. Because we fundamentally believe AI should be trained by the knowledge of our teams and vetted processes, our SOC assistant utilizes our vast internal knowledge bases. Sources like the Rapid7 MDR Handbook – a resource amassed over decades of experience cultivated by our elite SOC team – enable the assistant to guide analysts through complex investigations and streamline response workflows, keeping our analysts a step ahead.

Rapid7 is further using generative AI to carefully automate the drafting of security reports for SOC analysts, typically a manual and time-intensive process. With more than 11,000 customers globally, the Rapid7 SOC triages a huge volume of activity each month, with summaries that are critical for keeping customers fully updated on what’s happening in their environment and actions performed on their behalf. While AI is a key tool to streamline report building and delivery, every report that is generated by the Rapid7 AI Engine is augmented and enhanced by our SOC teams, making certain every data point is accurate and actionable. Beyond providing expert guidance, the AI assistant also has the ability to automatically generate incident reports once investigations are closed out, streamlining the process and ensuring we can communicate updates with customers in a timely manner.

An Enabler for Secure AI/ML Application Development

We know we’re not alone in developing Generative AI solutions, and as such we’re also focused on delivering capabilities that allow our customers to implement and adhere to AI/ML development best practices. We continue to expand our support for Generative AI services from major cloud service providers (CSPs), including AWS Bedrock, Azure OpenAI service and GCP Vertex. These services can be continuously audited against best practices outlined in the Rapid7 AI/ML Security Best Practices compliance pack, which includes the mitigations outlined in the OWASP Top 10 for ML and large language models (LLMs). Our continuous auditing process, enriched by InsightCloudSec’s Layered Context, offers a comprehensive view of AI-related cloud risks, ensuring that our customers’ AI-powered assets are secure.

Rapid7 Infuses Generative AI into the InsightPlatform to Supercharge SecOps and Augment MDR Services

The Future of MDR Services is Powered by AI

The integration of Generative AI into the Insight Platform is not just about helping our teams keep pace – it’s about setting the pace. With unparalleled scalability and adaptability, Rapid7 is committed to maintaining a competitive edge in the market, particularly as it relates to leveraging AI to transform security operations. Our focus on operational efficiencies, cost reduction, and improved quality of service is unwavering. We’re not just responding to the changing threat landscape – we’re reshaping it.

The future of MDR services is here, and it’s powered by the Rapid7 AI Engine.

AI and the Indian Election

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/06/ai-and-the-indian-election.html

As India concluded the world’s largest election on June 5, 2024, with over 640 million votes counted, observers could assess how the various parties and factions used artificial intelligence technologies—and what lessons that holds for the rest of the world.

The campaigns made extensive use of AI, including deepfake impersonations of candidates, celebrities and dead politicians. By some estimates, millions of Indian voters viewed deepfakes.

But, despite fears of widespread disinformation, for the most part the campaigns, candidates and activists used AI constructively in the election. They used AI for typical political activities, including mudslinging, but primarily to better connect with voters.

Deepfakes without the deception

Political parties in India spent an estimated US$50 million on authorized AI-generated content for targeted communication with their constituencies this election cycle. And it was largely successful.

Indian political strategists have long recognized the influence of personality and emotion on their constituents, and they started using AI to bolster their messaging. Young and upcoming AI companies like The Indian Deepfaker, which started out serving the entertainment industry, quickly responded to this growing demand for AI-generated campaign material.

In January, Muthuvel Karunanidhi, former chief minister of the southern state of Tamil Nadu for two decades, appeared via video at his party’s youth wing conference. He wore his signature yellow scarf, white shirt, dark glasses and had his familiar stance—head slightly bent sideways. But Karunanidhi died in 2018. His party authorized the deepfake.

In February, the All-India Anna Dravidian Progressive Federation party’s official X account posted an audio clip of Jayaram Jayalalithaa, the iconic superstar of Tamil politics colloquially called “Amma” or “Mother.” Jayalalithaa died in 2016.

Meanwhile, voters received calls from their local representatives to discuss local issues—except the leader on the other end of the phone was an AI impersonation. Bhartiya Janta Party (BJP) workers like Shakti Singh Rathore have been frequenting AI startups to send personalized videos to specific voters about the government benefits they received and asking for their vote over WhatsApp.

Multilingual boost

Deepfakes were not the only manifestation of AI in the Indian elections. Long before the election began, Indian Prime Minister Narendra Modi addressed a tightly packed crowd celebrating links between the state of Tamil Nadu in the south of India and the city of Varanasi in the northern state of Uttar Pradesh. Instructing his audience to put on earphones, Modi proudly announced the launch of his “new AI technology” as his Hindi speech was translated to Tamil in real time.

In a country with 22 official languages and almost 780 unofficial recorded languages, the BJP adopted AI tools to make Modi’s personality accessible to voters in regions where Hindi is not easily understood. Since 2022, Modi and his BJP have been using the AI-powered tool Bhashini, embedded in the NaMo mobile app, to translate Modi’s speeches with voiceovers in Telugu, Tamil, Malayalam, Kannada, Odia, Bengali, Marathi and Punjabi.

As part of their demos, some AI companies circulated their own viral versions of Modi’s famous monthly radio show “Mann Ki Baat,” which loosely translates to “From the Heart,” which they voice cloned to regional languages.

Adversarial uses

Indian political parties doubled down on online trolling, using AI to augment their ongoing meme wars. Early in the election season, the Indian National Congress released a short clip to its 6 million followers on Instagram, taking the title track from a new Hindi music album named “Chor” (thief). The video grafted Modi’s digital likeness onto the lead singer and cloned his voice with reworked lyrics critiquing his close ties to Indian business tycoons.

The BJP retaliated with its own video, on its 7-million-follower Instagram account, featuring a supercut of Modi campaigning on the streets, mixed with clips of his supporters but set to unique music. It was an old patriotic Hindi song sung by famous singer Mahendra Kapoor, who passed away in 2008 but was resurrected with AI voice cloning.

Modi himself quote-tweeted an AI-created video of him dancing—a common meme that alters footage of rapper Lil Yachty on stage—commenting “such creativity in peak poll season is truly a delight.”

In some cases, the violent rhetoric in Modi’s campaign that put Muslims at risk and incited violence was conveyed using generative AI tools, but the harm can be traced back to the hateful rhetoric itself and not necessarily the AI tools used to spread it.

The Indian experience

India is an early adopter, and the country’s experiments with AI serve as an illustration of what the rest of the world can expect in future elections. The technology’s ability to produce nonconsensual deepfakes of anyone can make it harder to tell truth from fiction, but its consensual uses are likely to make democracy more accessible.

The Indian election’s embrace of AI that began with entertainment, political meme wars, emotional appeals to people, resurrected politicians and persuasion through personalized phone calls to voters has opened a pathway for the role of AI in participatory democracy.

The surprise outcome of the election, with the BJP’s failure to win its predicted parliamentary majority, and India’s return to a deeply competitive political system especially highlights the possibility for AI to have a positive role in deliberative democracy and representative governance.

Lessons for the world’s democracies

It’s a goal of any political party or candidate in a democracy to have more targeted touch points with their constituents. The Indian elections have shown a unique attempt at using AI for more individualized communication across linguistically and ethnically diverse constituencies, and making their messages more accessible, especially to rural, low-income populations.

AI and the future of participatory democracy could make constituent communication not just personalized but also a dialogue, so voters can share their demands and experiences directly with their representatives—at speed and scale.

India can be an example of taking its recent fluency in AI-assisted party-to-people communications and moving it beyond politics. The government is already using these platforms to provide government services to citizens in their native languages.

If used safely and ethically, this technology could be an opportunity for a new era in representative governance, especially for the needs and experiences of people in rural areas to reach Parliament.

This essay was written with Vandinika Shukla and previously appeared in The Conversation.

Using AI for Political Polling

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/06/using-ai-for-political-polling.html

Public polling is a critical function of modern political campaigns and movements, but it isn’t what it once was. Recent US election cycles have produced copious postmortems explaining both the successes and the flaws of public polling. There are two main reasons polling fails.

First, nonresponse has skyrocketed. It’s radically harder to reach people than it used to be. Few people fill out surveys that come in the mail anymore. Few people answer their phone when a stranger calls. Pew Research reported that 36% of the people they called in 1997 would talk to them, but only 6% by 2018. Pollsters worldwide have faced similar challenges.

Second, people don’t always tell pollsters what they really think. Some hide their true thoughts because they are embarrassed about them. Others behave as a partisan, telling the pollster what they think their party wants them to say—or what they know the other party doesn’t want to hear.

Despite these frailties, obsessive interest in polling nonetheless consumes our politics. Headlines more likely tout the latest changes in polling numbers than the policy issues at stake in the campaign. This is a tragedy for a democracy. We should treat elections like choices that have consequences for our lives and well-being, not contests to decide who gets which cushy job.

Polling Machines?

AI could change polling. AI can offer the ability to instantaneously survey and summarize the expressed opinions of individuals and groups across the web, understand trends by demographic, and offer extrapolations to new circumstances and policy issues on par with human experts. The politicians of the (near) future won’t anxiously pester their pollsters for information about the results of a survey fielded last week: they’ll just ask a chatbot what people think. This will supercharge our access to realtime, granular information about public opinion, but at the same time it might also exacerbate concerns about the quality of this information.

I know it sounds impossible, but stick with us.

Large language models, the AI foundations behind tools like ChatGPT, are built on top of huge corpuses of data culled from the Internet. These are models trained to recapitulate what millions of real people have written in response to endless topics, contexts, and scenarios. For a decade or more, campaigns have trawled social media, looking for hints and glimmers of how people are reacting to the latest political news. This makes asking questions of an AI chatbot similar in spirit to doing analytics on social media, except that they are generative: you can ask them new questions that no one has ever posted about before, you can generate more data from populations too small to measure robustly, and you can immediately ask clarifying questions of your simulated constituents to better understand their reasoning

Researchers and firms are already using LLMs to simulate polling results. Current techniques are based on the ideas of AI agents. An AI agent is an instance of an AI model that has been conditioned to behave in a certain way. For example, it may be primed to respond as if it is a person with certain demographic characteristics and can access news articles from certain outlets. Researchers have set up populations of thousands of AI agents that respond as if they are individual members of a survey population, like humans on a panel that get called periodically to answer questions.

The big difference between humans and AI agents is that the AI agents always pick up the phone, so to speak, no matter how many times you contact them. A political candidate or strategist can ask an AI agent whether voters will support them if they take position A versus B, or tweaks of those options, like policy A-1 versus A-2. They can ask that question of male voters versus female voters. They can further limit the query to married male voters of retirement age in rural districts of Illinois without college degrees who lost a job during the last recession; the AI will integrate as much context as you ask.

What’s so powerful about this system is that it can generalize to new scenarios and survey topics, and spit out a plausible answer, even if its accuracy is not guaranteed. In many cases, it will anticipate those responses at least as well as a human political expert. And if the results don’t make sense, the human can immediately prompt the AI with a dozen follow-up questions.

Making AI agents better polling subjects

When we ran our own experiments in this kind of AI use case with the earliest versions of the model behind ChatGPT (GPT-3.5), we found that it did a fairly good job at replicating human survey responses. The ChatGPT agents tended to match the responses of their human counterparts fairly well across a variety of survey questions, such as support for abortion and approval of the US Supreme Court. The AI polling results had average responses, and distributions across demographic properties such as age and gender, similar to real human survey panels.

Our major systemic failure happened on a question about US intervention in the Ukraine war.  In our experiments, the AI agents conditioned to be liberal were predominantly opposed to US intervention in Ukraine and likened it to the Iraq war. Conservative AI agents gave hawkish responses supportive of US intervention. This is pretty much what most political experts would have expected of the political equilibrium in US foreign policy at the start of the decade but was exactly wrong in the politics of today.

This mistake has everything to do with timing. The humans were asked the question after Russia’s full-scale invasion in 2022, whereas the AI model was trained using data that only covered events through September 2021. The AI got it wrong because it didn’t know how the politics had changed. The model lacked sufficient context on crucially relevant recent events.

We believe AI agents can overcome these shortcomings. While AI models are dependent on  the data they are trained with, and all the limitations inherent in that, what makes AI agents special is that they can automatically source and incorporate new data at the time they are asked a question. AI models can update the context in which they generate opinions by learning from the same sources that humans do. Each AI agent in a simulated panel can be exposed to the same social and media news sources as humans from that same demographic before they respond to a survey question. This works because AI agents can follow multi-step processes, such as reading a question, querying a defined database of information (such as Google, or the New York Times, or Fox News, or Reddit), and then answering a question.

In this way, AI polling tools can simulate exposing their synthetic survey panel to whatever news is most relevant to a topic and likely to emerge in each AI agent’s own echo chamber. And they can query for other relevant contextual information, such as demographic trends and historical data. Like human pollsters, they can try to refine their expectations on the basis of factors like how expensive homes are in a respondent’s neighborhood, or how many people in that district turned out to vote last cycle.

Likely use cases for AI polling

AI polling will be irresistible to campaigns, and to the media. But research is already revealing when and where this tool will fail. While AI polling will always have limitations in accuracy, that makes them similar to, not different from, traditional polling. Today’s pollsters are challenged to reach sample sizes large enough to measure statistically significant differences between similar populations, and the issues of nonresponse and inauthentic response can make them systematically wrong. Yet for all those shortcomings, both traditional and AI-based polls will still be useful. For all the hand-wringing and consternation over the accuracy of US political polling, national issue surveys still tend to be accurate to within a few percentage points. If you’re running for a town council seat or in a neck-and-neck national election, or just trying to make the right policy decision within a local government, you might care a lot about those small and localized differences. But if you’re looking to track directional changes over time, or differences between demographic groups, or to uncover insights about who responds best to what message, then these imperfect signals are sufficient to help campaigns and policymakers.

Where AI will work best is as an augmentation of more traditional human polls. Over time, AI tools will get better at anticipating human responses, and also at knowing when they will be most wrong or uncertain. They will recognize which issues and human communities are in the most flux, where the model’s training data is liable to steer it in the wrong direction. In those cases, AI models can send up a white flag and indicate that they need to engage human respondents to calibrate to real people’s perspectives. The AI agents can even be programmed to automate this. They can use existing survey tools—with all their limitations and latency—to query for authentic human responses when they need them.

This kind of human-AI polling chimera lands us, funnily enough, not too distant from where survey research is today. Decades of social science research has led to substantial innovations in statistical methodologies for analyzing survey data. Current polling methods already do substantial modeling and projecting to predictively model properties of a general population based on sparse survey samples. Today, humans fill out the surveys and computers fill in the gaps. In the future, it will be the opposite: AI will fill out the survey and, when the AI isn’t sure what box to check, humans will fill the gaps. So if you’re not comfortable with the idea that political leaders will turn to a machine to get intelligence about which candidates and policies you want, then you should have about as many misgivings about the present as you will the future.

And while the AI results could improve quickly, they probably won’t be seen as credible for some time. Directly asking people what they think feels more reliable than asking a computer what people think. We expect these AI-assisted polls will be initially used internally by campaigns, with news organizations relying on more traditional techniques. It will take a major election where AI is right and humans are wrong to change that.

This essay was written with Aaron Berger, Eric Gong, and Nathan Sanders, and previously appeared on the Harvard Kennedy School Ash Center’s website.

LLMs Acting Deceptively

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/06/llms-acting-deceptively.html

New research: “Deception abilities emerged in large language models“:

Abstract: Large language models (LLMs) are currently at the forefront of intertwining AI systems with human communication and everyday life. Thus, aligning them with human values is of great importance. However, given the steady increase in reasoning abilities, future LLMs are under suspicion of becoming able to deceive human operators and utilizing this ability to bypass monitoring efforts. As a prerequisite to this, LLMs need to possess a conceptual understanding of deception strategies. This study reveals that such strategies emerged in state-of-the-art LLMs, but were nonexistent in earlier LLMs. We conduct a series of experiments showing that state-of-the-art LLMs are able to understand and induce false beliefs in other agents, that their performance in complex deception scenarios can be amplified utilizing chain-of-thought reasoning, and that eliciting Machiavellianism in LLMs can trigger misaligned deceptive behavior. GPT-4, for instance, exhibits deceptive behavior in simple test scenarios 99.16% of the time (P < 0.001). In complex second-order deception test scenarios where the aim is to mislead someone who expects to be deceived, GPT-4 resorts to deceptive behavior 71.46% of the time (P < 0.001) when augmented with chain-of-thought reasoning. In sum, revealing hitherto unknown machine behavior in LLMs, our study contributes to the nascent field of machine psychology.

Securing AI Development in the Cloud: Navigating the Risks and Opportunities

Post Syndicated from Rapid7 original https://blog.rapid7.com/2024/06/05/securing-ai-development-in-the-cloud-navigating-the-risks-and-opportunities/

AI-TRiSM – Trust, Risk and Security Management in the Age of AI

Securing AI Development in the Cloud: Navigating the Risks and Opportunities

Co-authored by Lara Sunday and Pojan Shahrivar

As artificial intelligence (AI) and machine learning (ML) technologies continue to advance and proliferate, organizations across industries are investing heavily in these transformative capabilities. According to Gartner, by 2027, spending on AI software will grow to $297.9 billion at a compound annual growth rate of 19.1%. Generative AI (GenAI) software spend will rise from 8% of AI software in 2023 to 35% by 2027.

With the promise of enhanced efficiency, personalization, and innovation, organizations are increasingly turning to cloud environments to develop and deploy these powerful AI and ML technologies. However, this rapid innovation also introduces new security risks and challenges that must be addressed proactively to protect valuable data, intellectual property, and maintain the trust of customers and stakeholders.

Benefits of Cloud Environments for AI Development

Cloud platforms offer unparalleled scalability, allowing organizations to easily scale their computing resources up or down to meet the demanding requirements of training and deploying complex AI models.

“The ability to spin up and down resources on-demand has been a game-changer for our AI development efforts,” says Stuart Millar, Principal AI Engineer at Rapid7. “We can quickly provision the necessary compute power during peak training periods, then scale back down to optimize costs when those resources are no longer needed.”

Cloud environments also provide a cost-effective way to develop AI models, with usage-based pricing models that avoid large upfront investments in hardware and infrastructure. Additionally, major cloud providers offer access to cutting-edge AI hardware and pre-built tools and services, such as Amazon SageMaker, Azure Machine Learning, and Google Cloud AI Platform, which can accelerate development and deployment cycles.

Challenges and Risks of Cloud-Based AI Development

While the cloud offers numerous advantages for AI development, it also introduces unique challenges that organizations must navigate. Limited visibility into complex data flows and model updates can create blind spots for security teams, leaving them unable to effectively monitor for potential threats or anomalies.

In their  AI Threat Landscape Report, HiddenLayer highlighted that 98% of all the companies surveyed identified that elements of their AI models were crucial to their business success, and 77% identified breaches to their AI in the past year. Additionally, multi-cloud and hybrid deployments bring monitoring, governance, and reporting challenges, making it difficult to assess AI/ML risk in context across different cloud environments.

New Attack Vectors and Risk Types

Developing AI in the cloud also exposes organizations to new attack vectors and risk types that traditional security tools may not be equipped to detect or mitigate. Some examples include:

Prompt Injection (LLM01): Imagine a large language model used for generating marketing copy. An attacker could craft a special prompt that tricks the model into generating harmful or offensive content, damaging the company’s brand and reputation.

Training Data Poisoning (LLM03, ML02): Adversaries can tamper with training data to compromise the integrity and reliability of cloud-based AI models. In the case of an AI model used for image recognition in a security surveillance system, poisoned training data containing mislabeled images could cause the model to generate incorrect classifications, potentially missing critical threats.

Model Theft (LLM10, ML05): Unauthorized access to proprietary AI models deployed in the cloud poses risks to intellectual property and competitive advantage. If a competitor were to steal a model trained on a company’s sensitive data, they could potentially replicate its functionality and gain valuable insights.

Supply Chain Vulnerabilities (LLM05, ML06): Compromised libraries, datasets, or services used in cloud AI development pipelines can lead to widespread security breaches. A malicious actor might introduce a vulnerability into a widely used open-source library for AI, which could then be exploited to gain access to AI models deployed by multiple organizations.

Developing Best Practices for Securing AI Development

To address these challenges and risks, organizations need to develop and implement best practices and standards tailored to their specific business needs, striking the right balance between enabling innovation and introducing risk.

While guidelines like NCSC Secure AI System Development and The Open Standard for Responsible AI provide a valuable starting point, organizations must also develop their own customized best practices that align with their unique business requirements, risk appetite, and AI/ML use cases. For instance, a financial institution developing AI models for fraud detection might prioritize best practices around data governance and model explainability to ensure compliance with regulations and maintain transparency in decision-making processes.

Key considerations when developing these best practices include:

Ensuring secure data handling and governance throughout the AI lifecycle

  • Implementing robust access controls and identity management for AI/ML resources
  • Validating and monitoring AI models for potential biases, vulnerabilities, or anomalies
  • Establishing incident response and remediation processes for AI-specific threats
  • Maintaining transparency and explainability to understand and audit AI model behavior

Rapid7’s Approach to Securing AI Development

“At Rapid7, our InsightCloudSec solution offers real-time visibility into AI/ML resources running across major cloud providers, allowing security teams to continuously monitor for potential risks or misconfigurations,” says Aniket Menon, VP, Product Management. “Visibility is the foundation for effective security in any environment, and that’s especially true in the complex world of AI development. Without a clear view into your AI/ML assets and activities, you’re essentially operating blind, leaving your organization vulnerable to a range of threats.”

Here at Rapid7 our AI TRiSM (Trust, Risk, and Security Management) framework empowers our teams. The framework provides us with confidence not only in our operations but also in driving innovation. In their recent blog outlining the company’s AI principles, Laura Ellis and Sabeen Malik shared how Rapid7 tackles and addresses AI challenges. Centering on transparency, fairness, safety, security, privacy, and accountability, these principles are not just guidelines; they are integral to how Rapid7 builds, deploys, and manages AI systems.

Security and compliance are two key InsightCloudSec capabilities. Compliance Packs are out-of-the-box collections of related Insights focused on industry requirements and standards for all of your resources. Compliance packs may focus on security, costs, governance, or combinations of these across a variety of frameworks, e.g., HIPAA, PCI DSS, GDPR, etc.

Last year Rapid7 launched the Rapid7 AI/ML Security Best Practices compliance pack, the pack allows for real-time and continuous visibility into AI/ML resources running across your clouds with support for GenAI services across AWS, Azure and GCP. To empower you to assess this data in the context of your organizational requirements and priorities, you can then automatically prioritize AI/ML-related risk with Layered Context based on exploitability and potential business impact.

You can also leverage Identity Analysis in InsightCloudSec to collect and present the actions executed by a given user or role within a certain time period. These logged actions are collected and analyzed, providing you with a view across your organization of who can access AI/ML resources and automatically rightsize in accordance with the least privilege access (LPA) concept. This enables you to strategically inform your policies moving forward. Native automation allows you to then act on your assessments to alert on compliance drift, remediate AI/ML risk, and enact prevention mechanisms.

Rapid7’s Continued Dedication to AI Innovation

As an inaugural signer of the CISA Secure by Design Pledge, and through our partnership with Queen’s University Belfast Centre for Secure Information Technologies (CSIT), Rapid7 remains dedicated to collaborating with industry leaders and academic institutions to stay ahead of emerging threats and develop cutting-edge solutions for securing AI development.

As the adoption of AI and ML capabilities continues to accelerate, it’s imperative that organizations have the knowledge and tools to make informed decisions and build with confidence. By implementing robust best practices and leveraging advanced security tools like InsightCloudSec, organizations can harness the power of AI while mitigating the associated risks and ensuring their valuable data and intellectual property remain protected.

To learn more about how Rapid7 can help your organization develop and implement best practices for securing AI development, visit our website to request a demo.


Gartner, Forecast Analysis: Artificial Intelligence Software, 2023-2027, Worldwide, Alys Woodward, et al, 07 November 2023

Online Privacy and Overfishing

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/06/online-privacy-and-overfishing.html

Microsoft recently caught state-backed hackers using its generative AI tools to help with their attacks. In the security community, the immediate questions weren’t about how hackers were using the tools (that was utterly predictable), but about how Microsoft figured it out. The natural conclusion was that Microsoft was spying on its AI users, looking for harmful hackers at work.

Some pushed back at characterizing Microsoft’s actions as “spying.” Of course cloud service providers monitor what users are doing. And because we expect Microsoft to be doing something like this, it’s not fair to call it spying.

We see this argument as an example of our shifting collective expectations of privacy. To understand what’s happening, we can learn from an unlikely source: fish.

In the mid-20th century, scientists began noticing that the number of fish in the ocean—so vast as to underlie the phrase “There are plenty of fish in the sea”—had started declining rapidly due to overfishing. They had already seen a similar decline in whale populations, when the post-WWII whaling industry nearly drove many species extinct. In whaling and later in commercial fishing, new technology made it easier to find and catch marine creatures in ever greater numbers. Ecologists, specifically those working in fisheries management, began studying how and when certain fish populations had gone into serious decline.

One scientist, Daniel Pauly, realized that researchers studying fish populations were making a major error when trying to determine acceptable catch size. It wasn’t that scientists didn’t recognize the declining fish populations. It was just that they didn’t realize how significant the decline was. Pauly noted that each generation of scientists had a different baseline to which they compared the current statistics, and that each generation’s baseline was lower than that of the previous one.

What seems normal to us in the security community is whatever was commonplace at the beginning of our careers.

Pauly called this “shifting baseline syndrome” in a 1995 paper. The baseline most scientists used was the one that was normal when they began their research careers. By that measure, each subsequent decline wasn’t significant, but the cumulative decline was devastating. Each generation of researchers came of age in a new ecological and technological environment, inadvertently masking an exponential decline.

Pauly’s insights came too late to help those managing some fisheries. The ocean suffered catastrophes such as the complete collapse of the Northwest Atlantic cod population in the 1990s.

Internet surveillance, and the resultant loss of privacy, is following the same trajectory. Just as certain fish populations in the world’s oceans have fallen 80 percent, from previously having fallen 80 percent, from previously having fallen 80 percent (ad infinitum), our expectations of privacy have similarly fallen precipitously. The pervasive nature of modern technology makes surveillance easier than ever before, while each successive generation of the public is accustomed to the privacy status quo of their youth. What seems normal to us in the security community is whatever was commonplace at the beginning of our careers.

Historically, people controlled their computers, and software was standalone. The always-connected cloud-deployment model of software and services flipped the script. Most apps and services are designed to be always-online, feeding usage information back to the company. A consequence of this modern deployment model is that everyone—cynical tech folks and even ordinary users—expects that what you do with modern tech isn’t private. But that’s because the baseline has shifted.

AI chatbots are the latest incarnation of this phenomenon: They produce output in response to your input, but behind the scenes there’s a complex cloud-based system keeping track of that input—both to improve the service and to sell you ads.

Shifting baselines are at the heart of our collective loss of privacy. The U.S. Supreme Court has long held that our right to privacy depends on whether we have a reasonable expectation of privacy. But expectation is a slippery thing: It’s subject to shifting baselines.

The question remains: What now? Fisheries scientists, armed with knowledge of shifting-baseline syndrome, now look at the big picture. They no longer consider relative measures, such as comparing this decade with the last decade. Instead, they take a holistic, ecosystem-wide perspective to see what a healthy marine ecosystem and thus sustainable catch should look like. They then turn these scientifically derived sustainable-catch figures into limits to be codified by regulators.

In privacy and security, we need to do the same. Instead of comparing to a shifting baseline, we need to step back and look at what a healthy technological ecosystem would look like: one that respects people’s privacy rights while also allowing companies to recoup costs for services they provide. Ultimately, as with fisheries, we need to take a big-picture perspective and be aware of shifting baselines. A scientifically informed and democratic regulatory process is required to preserve a heritage—whether it be the ocean or the Internet—for the next generation.

This essay was written with Barath Raghavan, and previously appeared in IEEE Spectrum.

AI Will Increase the Quantity—and Quality—of Phishing Scams

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/06/ai-will-increase-the-quantity-and-quality-of-phishing-scams.html

A piece I coauthored with Fredrik Heiding and Arun Vishwanath in the Harvard Business Review:

Summary. Gen AI tools are rapidly making these emails more advanced, harder to spot, and significantly more dangerous. Recent research showed that 60% of participants fell victim to artificial intelligence (AI)-automated phishing, which is comparable to the success rates of non-AI-phishing messages created by human experts. Companies need to: 1) understand the asymmetrical capabilities of AI-enhanced phishing, 2) determine the company or division’s phishing threat severity level, and 3) confirm their current phishing awareness routines.

Here’s the full text.

How AI Will Change Democracy

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/05/how-ai-will-change-democracy.html

I don’t think it’s an exaggeration to predict that artificial intelligence will affect every aspect of our society. Not by doing new things. But mostly by doing things that are already being done by humans, perfectly competently.

Replacing humans with AIs isn’t necessarily interesting. But when an AI takes over a human task, the task changes.

In particular, there are potential changes over four dimensions: Speed, scale, scope and sophistication. The problem with AIs trading stocks isn’t that they’re better than humans—it’s that they’re faster. But computers are better at chess and Go because they use more sophisticated strategies than humans. We’re worried about AI-controlled social media accounts because they operate on a superhuman scale.

It gets interesting when changes in degree can become changes in kind. High-speed trading is fundamentally different than regular human trading. AIs have invented fundamentally new strategies in the game of Go. Millions of AI-controlled social media accounts could fundamentally change the nature of propaganda.

It’s these sorts of changes and how AI will affect democracy that I want to talk about.

To start, I want to list some of AI’s core competences. First, it is really good as a summarizer. Second, AI is good at explaining things, teaching with infinite patience. Third, and related, AI can persuade. Propaganda is an offshoot of this. Fourth, AI is fundamentally a prediction technology. Predictions about whether turning left or right will get you to your destination faster. Predictions about whether a tumor is cancerous might improve medical diagnoses. Predictions about which word is likely to come next can help compose an email. Fifth, AI can assess. Assessing requires outside context and criteria. AI is less good at assessing, but it’s getting better. Sixth, AI can decide. A decision is a prediction plus an assessment. We are already using AI to make all sorts of decisions.

How these competences translate to actual useful AI systems depends a lot on the details. We don’t know how far AI will go in replicating or replacing human cognitive functions. Or how soon that will happen. In constrained environments it can be easy. AIs already play chess and Go better than humans. Unconstrained environments are harder. There are still significant challenges to fully AI-piloted automobiles. The technologist Jaron Lanier has a nice quote, that AI does best when “human activities have been done many times before, but not in exactly the same way.”

In this talk, I am going to be largely optimistic about the technology. I’m not going to dwell on the details of how the AI systems might work. Much of what I am talking about is still in the future. Science fiction, but not unrealistic science fiction.

Where I am going to be less optimistic—and more realistic—is about the social implications of the technology. Again, I am less interested in how AI will substitute for humans. I’m looking more at the second-order effects of those substitutions: How the underlying systems will change because of changes in speed, scale, scope and sophistication. My goal is to imagine the possibilities. So that we might be prepared for their eventuality.

And as I go through the possibilities, keep in mind a few questions: Will the change distribute or consolidate power? Will it make people more or less personally involved in democracy? What needs to happen before people will trust AI in this context? What could go wrong if a bad actor subverted the AI in this context? And what can we do, as security technologists, to help?

I am thinking about democracy very broadly. Not just representations, or elections. Democracy as a system for distributing decisions evenly across a population. It’s a way of converting individual preferences into group decisions. And that includes bureaucratic decisions.

To that end, I want to discuss five different areas where AI will affect democracy: Politics, lawmaking, administration, the legal system and, finally, citizens themselves.

I: AI-assisted politicians

I’ve already said that AIs are good at persuasion. Politicians will make use of that. Pretty much everyone talks about AI propaganda. Politicians will make use of that, too. But let’s talk about how this might go well.

In the past, candidates would write books and give speeches to connect with voters. In the future, candidates will also use personalized chatbots to directly engage with voters on a variety of issues. AI can also help fundraise. I don’t have to explain the persuasive power of individually crafted appeals. AI can conduct polls. There’s some really interesting work into having large language models assume different personas and answer questions from their points of view. Unlike people, AIs are always available, will answer thousands of questions without getting tired or bored and are more reliable. This won’t replace polls, but it can augment them. AI can assist human campaign managers by coordinating campaign workers, creating talking points, doing media outreach and assisting get-out-the-vote efforts. These are all things that humans already do. So there’s no real news there.

The changes are largely in scale. AIs can engage with voters, conduct polls and fundraise at a scale that humans cannot—for all sizes of elections. They can also assist in lobbying strategies. AIs could also potentially develop more sophisticated campaign and political strategies than humans can. I expect an arms race as politicians start using these sorts of tools. And we don’t know if the tools will favor one political ideology over another.

More interestingly, future politicians will largely be AI-driven. I don’t mean that AI will replace humans as politicians. Absent a major cultural shift—and some serious changes in the law—that won’t happen. But as AI starts to look and feel more human, our human politicians will start to look and feel more like AI. I think we will be OK with it, because it’s a path we’ve been walking down for a long time. Any major politician today is just the public face of a complex socio-technical system. When the president makes a speech, we all know that they didn’t write it. When a legislator sends out a campaign email, we know that they didn’t write that either—even if they signed it. And when we get a holiday card from any of these people, we know that it was signed by an autopen. Those things are so much a part of politics today that we don’t even think about it. In the future, we’ll accept that almost all communications from our leaders will be written by AI. We’ll accept that they use AI tools for making political and policy decisions. And for planning their campaigns. And for everything else they do. None of this is necessarily bad. But it does change the nature of politics and politicians—just like television and the internet did.

II: AI-assisted legislators

AIs are already good at summarization. This can be applied to listening to constituents:  summarizing letters, comments and making sense of constituent inputs. Public meetings might be summarized. Here the scale of the problem is already overwhelming, and AI can make a big difference. Beyond summarizing, AI can highlight interesting arguments or detect bulk letter-writing campaigns. They can aid in political negotiating.

AIs can also write laws. In November 2023, Porto Alegre, Brazil became the first city to enact a law that was entirely written by AI. It had to do with water meters. One of the councilmen prompted ChatGPT, and it produced a complete bill. He submitted it to the legislature without telling anyone who wrote it. And the humans passed it without any changes.

A law is just a piece of generated text that a government agrees to adopt. And as with every other profession, policymakers will turn to AI to help them draft and revise text. Also, AI can take human-written laws and figure out what they actually mean. Lots of laws are recursive, referencing paragraphs and words of other laws. AIs are already good at making sense of all that.

This means that AI will be good at finding legal loopholes—or at creating legal loopholes. I wrote about this in my latest book, A Hacker’s Mind. Finding loopholes is similar to finding vulnerabilities in software. There’s also a concept called “micro-legislation.” That’s the smallest unit of law that makes a difference to someone. It could be a word or a punctuation mark. AIs will be good at inserting micro-legislation into larger bills. More positively, AI can help figure out unintended consequences of a policy change—by simulating how the change interacts with all the other laws and with human behavior.

AI can also write more complex law than humans can. Right now, laws tend to be general. With details to be worked out by a government agency. AI can allow legislators to propose, and then vote on, all of those details. That will change the balance of power between the legislative and the executive branches of government. This is less of an issue when the same party controls the executive and the legislative branches. It is a big deal when those branches of government are in the hands of different parties. The worry is that AI will give the most powerful groups more tools for propagating their interests.

AI can write laws that are impossible for humans to understand. There are two kinds of laws: specific laws, like speed limits, and laws that require judgment, like those that address reckless driving. Imagine that we train an AI on lots of street camera footage to recognize reckless driving and that it gets better than humans at identifying the sort of behavior that tends to result in accidents. And because it has real-time access to cameras everywhere, it can spot it … everywhere. The AI won’t be able to explain its criteria: It would be a black-box neural net. But we could pass a law defining reckless driving by what that AI says. It would be a law that no human could ever understand. This could happen in all sorts of areas where judgment is part of defining what is illegal. We could delegate many things to the AI because of speed and scale. Market manipulation. Medical malpractice. False advertising. I don’t know if humans will accept this.

III: AI-assisted bureaucracy

Generative AI is already good at a whole lot of administrative paperwork tasks. It will only get better. I want to focus on a few places where it will make a big difference. It could aid in benefits administration—figuring out who is eligible for what. Humans do this today, but there is often a backlog because there aren’t enough humans. It could audit contracts. It could operate at scale, auditing all human-negotiated government contracts. It could aid in contracts negotiation. The government buys a lot of things and has all sorts of complicated rules. AI could help government contractors navigate those rules.

More generally, it could aid in negotiations of all kinds. Think of it as a strategic adviser. This is no different than a human but could result in more complex negotiations. Human negotiations generally center around only a few issues. Mostly because that’s what humans can keep in mind. AI versus AI negotiations could potentially involve thousands of variables simultaneously. Imagine we are using an AI to aid in some international trade negotiation and it suggests a complex strategy that is beyond human understanding. Will we blindly follow the AI? Will we be more willing to do so once we have some history with its accuracy?

And one last bureaucratic possibility: Could AI come up with better institutional designs than we have today? And would we implement them?

IV: AI-assisted legal system

When referring to an AI-assisted legal system, I mean this very broadly—both lawyering and judging and all the things surrounding those activities.

AIs can be lawyers. Early attempts at having AIs write legal briefs didn’t go well. But this is already changing as the systems get more accurate. Chatbots are now able to properly cite their sources and minimize errors. Future AIs will be much better at writing legalese, drastically reducing the cost of legal counsel. And there’s every indication that it will be able to do much of the routine work that lawyers do. So let’s talk about what this means.

Most obviously, it reduces the cost of legal advice and representation, giving it to people who currently can’t afford it. An AI public defender is going to be a lot better than an overworked not very good human public defender. But if we assume that human-plus-AI beats AI-only, then the rich get the combination, and the poor are stuck with just the AI.

It also will result in more sophisticated legal arguments. AI’s ability to search all of the law for precedents to bolster a case will be transformative.

AI will also change the meaning of a lawsuit. Right now, suing someone acts as a strong social signal because of the cost. If the cost drops to free, that signal will be lost. And orders of magnitude more lawsuits will be filed, which will overwhelm the court system.

Another effect could be gutting the profession. Lawyering is based on apprenticeship. But if most of the apprentice slots are filled by AIs, where do newly minted attorneys go to get training? And then where do the top human lawyers come from? This might not happen. AI-assisted lawyers might result in more human lawyering. We don’t know yet.

AI can help enforce the law. In a sense, this is nothing new. Automated systems already act as law enforcement—think speed trap cameras and Breathalyzers. But AI can take this kind of thing much further, like automatically identifying people who cheat on tax returns, identifying fraud on government service applications and watching all of the traffic cameras and issuing citations.

Again, the AI is performing a task for which we don’t have enough humans. And doing it faster, and at scale. This has the obvious problem of false positives. Which could be hard to contest if the courts believe that the computer is always right. This is a thing today: If a Breathalyzer says you’re drunk, it can be hard to contest the software in court. And also the problem of bias, of course: AI law enforcers may be more and less equitable than their human predecessors.

But most importantly, AI changes our relationship with the law. Everyone commits driving violations all the time. If we had a system of automatic enforcement, the way we all drive would change—significantly. Not everyone wants this future. Lots of people don’t want to fund the IRS, even though catching tax cheats is incredibly profitable for the government. And there are legitimate concerns as to whether this would be applied equitably.

AI can help enforce regulations. We have no shortage of rules and regulations. What we have is a shortage of time, resources and willpower to enforce them, which means that lots of companies know that they can ignore regulations with impunity. AI can change this by decoupling the ability to enforce rules from the resources necessary to do it. This makes enforcement more scalable and efficient. Imagine putting cameras in every slaughterhouse in the country looking for animal welfare violations or fielding an AI in every warehouse camera looking for labor violations. That could create an enormous shift in the balance of power between government and corporations—which means that it will be strongly resisted by corporate power.

AIs can provide expert opinions in court. Imagine an AI trained on millions of traffic accidents, including video footage, telemetry from cars and previous court cases. The AI could provide the court with a reconstruction of the accident along with an assignment of fault. AI could do this in a lot of cases where there aren’t enough human experts to analyze the data—and would do it better, because it would have more experience.

AIs can also perform judging tasks, weighing evidence and making decisions, probably not in actual courtrooms, at least not anytime soon, but in other contexts. There are many areas of government where we don’t have enough adjudicators. Automated adjudication has the potential to offer everyone immediate justice. Maybe the AI does the first level of adjudication and humans handle appeals. Probably the first place we’ll see this is in contracts. Instead of the parties agreeing to binding arbitration to resolve disputes, they’ll agree to binding arbitration by AI. This would significantly decrease cost of arbitration. Which would probably significantly increase the number of disputes.

So, let’s imagine a world where dispute resolution is both cheap and fast. If you and I are business partners, and we have a disagreement, we can get a ruling in minutes. And we can do it as many times as we want—multiple times a day, even. Will we lose the ability to disagree and then resolve our disagreements on our own? Or will this make it easier for us to be in a partnership and trust each other?

V: AI-assisted citizens

AI can help people understand political issues by explaining them. We can imagine both partisan and nonpartisan chatbots. AI can also provide political analysis and commentary. And it can do this at every scale. Including for local elections that simply aren’t important enough to attract human journalists. There is a lot of research going on right now on AI as moderator, facilitator, and consensus builder. Human moderators are still better, but we don’t have enough human moderators. And AI will improve over time. AI can moderate at scale, giving the capability to every decision-making group—or chatroom—or local government meeting.

AI can act as a government watchdog. Right now, much local government effectively happens in secret because there are no local journalists covering public meetings. AI can change that, providing summaries and flagging changes in position.

AIs can help people navigate bureaucracies by filling out forms, applying for services and contesting bureaucratic actions. This would help people get the services they deserve, especially disadvantaged people who have difficulty navigating these systems. Again, this is a task that we don’t have enough qualified humans to perform. It sounds good, but not everyone wants this. Administrative burdens can be deliberate.

Finally, AI can eliminate the need for politicians. This one is further out there, but bear with me. Already there is research showing AI can extrapolate our political preferences. An AI personal assistant trained on and continuously attuned to your political preferences could advise you, including what to support and who to vote for. It could possibly even vote on your behalf or, more interestingly, act as your personal representative.

This is where it gets interesting. Our system of representative democracy empowers elected officials to stand in for our collective preferences. But that has obvious problems. Representatives are necessary because people don’t pay attention to politics. And even if they did, there isn’t enough room in the debate hall for everyone to fit. So we need to pick one of us to pass laws in our name. But that selection process is incredibly inefficient. We have complex policy wants and beliefs and can make complex trade-offs. The space of possible policy outcomes is equally complex. But we can’t directly debate the policies. We can only choose one of two—or maybe a few more—candidates to do that for us. This has been called democracy’s “lossy bottleneck.” AI can change this. We can imagine a personal AI directly participating in policy debates on our behalf along with millions of other personal AIs and coming to a consensus on policy.

More near term, AIs can result in more ballot initiatives. Instead of five or six, there might be five or six hundred, as long as the AI can reliably advise people on how to vote. It’s hard to know whether this is a good thing. I don’t think we want people to become politically passive because the AI is taking care of it. But it could result in more legislation that the majority actually wants.

Where will AI take us?

That’s my list. Again, watch where changes of degree result in changes in kind. The sophistication of AI lawmaking will mean more detailed laws, which will change the balance of power between the executive and the legislative branches. The scale of AI lawyering means that litigation becomes affordable to everyone, which will mean an explosion in the amount of litigation. The speed of AI adjudication means that contract disputes will get resolved much faster, which will change the nature of settlements. The scope of AI enforcement means that some laws will become impossible to evade, which will change how the rich and powerful think about them.

I think this is all coming. The time frame is hazy, but the technology is moving in these directions.

All of these applications need security of one form or another. Can we provide confidentiality, integrity and availability where it is needed? AIs are just computers. As such, they have all the security problems regular computers have—plus the new security risks stemming from AI and the way it is trained, deployed and used. Like everything else in security, it depends on the details.

First, the incentives matter. In some cases, the user of the AI wants it to be both secure and accurate. In some cases, the user of the AI wants to subvert the system. Think about prompt injection attacks. In most cases, the owners of the AIs aren’t the users of the AI. As happened with search engines and social media, surveillance and advertising are likely to become the AI’s business model. And in some cases, what the user of the AI wants is at odds with what society wants.

Second, the risks matter. The cost of getting things wrong depends a lot on the application. If a candidate’s chatbot suggests a ridiculous policy, that’s easily corrected. If an AI is helping someone fill out their immigration paperwork, a mistake can get them deported. We need to understand the rate of AI mistakes versus the rate of human mistakes—and also realize that AI mistakes are viewed differently than human mistakes. There are also different types of mistakes: false positives versus false negatives. But also, AI systems can make different kinds of mistakes than humans do—and that’s important. In every case, the systems need to be able to correct mistakes, especially in the context of democracy.

Many of the applications are in adversarial environments. If two countries are using AI to assist in trade negotiations, they are both going to try to hack each other’s AIs. This will include attacks against the AI models but also conventional attacks against the computers and networks that are running the AIs. They’re going to want to subvert, eavesdrop on or disrupt the other’s AI.

Some AI applications will need to run in secure environments. Large language models work best when they have access to everything, in order to train. That goes against traditional classification rules about compartmentalization.

Fourth, power matters. AI is a technology that fundamentally magnifies power of the humans who use it, but not equally across users or applications. Can we build systems that reduce power imbalances rather than increase them? Think of the privacy versus surveillance debate in the context of AI.

And similarly, equity matters. Human agency matters.

And finally, trust matters. Whether or not to trust an AI is less about the AI and more about the application. Some of these AI applications are individual. Some of these applications are societal. Whether something like “fairness” matters depends on this. And there are many competing definitions of fairness that depend on the details of the system and the application. It’s the same with transparency. The need for it depends on the application and the incentives. Democratic applications are likely to require more transparency than corporate ones and probably AI models that are not owned and run by global tech monopolies.

All of these security issues are bigger than AI or democracy. Like all of our security experience, applying it to these new systems will require some new thinking.

AI will be one of humanity’s most important inventions. That’s probably true. What we don’t know is if this is the moment we are inventing it. Or if today’s systems are yet more over-hyped technologies. But these are security conversations we are going to need to have eventually.

AI is fundamentally a power-enhancing technology. We need to ensure that it distributes power and doesn’t further concentrate it.

AI is coming for democracy. Whether the changes are a net positive or negative depends on us. Let’s help tilt things to the positive.

This essay is adapted from a keynote speech delivered at the RSA Conference in San Francisco on May 7, 2024. It originally appeared in Cyberscoop.

 

Let’s Architect! Learn About Machine Learning on AWS

Post Syndicated from Luca Mezzalira original https://aws.amazon.com/blogs/architecture/lets-architect-learn-about-machine-learning-on-aws/

A data-driven approach empowers businesses to make informed decisions based on accurate predictions and forecasts, leading to improved operational efficiency and resource optimization. Machine learning (ML) systems have the remarkable ability to continuously learn and adapt, improving their performance over time as they are exposed to more data. This self-learning capability ensures that organizations can stay ahead of the curve, responding dynamically to changing market conditions and customer preferences, ultimately driving innovation and enhancing competitiveness.

By leveraging the power of machine learning on AWS, businesses can unlock benefits that enhance efficiency, improve decision-making, and foster growth.

AWS re:Invent 2023 – Zero to machine learning: Jump-start your data-driven journey

In this session, see how organizations with constrained resources (budgets, skill gaps, time) can jump start their data-driven journey with advanced analytics and ML capabilities. Learn AWS Working Backwards best practices to drive forward data-related projects that address tangible business value. Then dive into AWS analytics and AI/ML capabilities that simplify and expedite data pipeline delivery and business value from ML workloads. Hear about low-code no-code (LCNC) AWS services within the context of a complete data pipeline architecture.

Take me to this video

See an architecture to analyze customer churn using AWS services

Figure 1. See an architecture to analyze customer churn using AWS services

Introduction to MLOps engineering on AWS

As artificial intelligence (AI) continues to revolutionize industries, the ability to operationalize and scale ML models has become a critical challenge. This session introduces the concept of MLOps, a discipline that builds upon and extends the widely adopted DevOps practices prevalent in software development. By applying MLOps principles, organizations can streamline the process of building, training, and deploying ML models, ensuring efficient and reliable model lifecycle management. By mastering MLOps, organizations can bridge the gap between AI development and operations, enabling them to unlock the full potential of their ML initiatives.

Take me to this video

MLOps maturity level will help to assess your organization and understand how to reach the next level.

Figure 2. MLOps maturity level will help to assess your organization and understand how to reach the next level.

Behind-the-scenes look at generative AI infrastructure at Amazon

To power generative AI applications while keeping costs under control, AWS designs and builds machine learning accelerators like AWS Trainium and AWS Inferentia. This session introduces purpose-built ML hardware for model training and inference, and shows how Amazon and AWS customers take advantage of those solutions to optimize costs and reduce latency.

You can learn from practical examples showing the impact of those solutions and explanations about how these chips work. ML accelerators are not only beneficial for generative AI workloads; they can also be applied to other use cases, including representation learning, recommender systems, or any scenario with deep neural network models.

Take me to this video

Discover the technology that powers our AI services

Figure 3. Discover the technology that powers our AI services

How our customers are implementing machine learning on AWS

The following resources drill down into the ML infrastructure that’s used to train large models at Pinterest and the experimentation framework built by Booking.com.

The Pinterest video discusses the strategy to create an ML development environment, orchestrate training jobs, ingest data into the training loop, and accelerate the training speed. You can also learn about the advantages derived from containers in the context of ML and how Pinterest decided to set up the entire ML lifecycle, including distributed model training.

The second resource covers how Booking.com accelerated the experimentation process by leveraging Amazon SageMaker for data analysis, model training, and online experimentation. This resulted in shorter development times for their ranking models and increased speed for the data science teams.

Take me to Pinterest video

Take me to Booking.com blog post

Let’s discover how Pinterest is using AWS services for machine learning workloads

Figure 4. Let’s discover how Pinterest is using AWS services for machine learning workloads

SageMaker Immersion Day

Amazon SageMaker Immersion Day helps customers and partners provide end-to-end understanding of building ML use cases. From feature engineering to understanding various built-in algorithms, with a focus on training, tuning, and deploying the ML model in a production-like scenario, this workshop guides you to bring your own model to perform lift-and-shift from on-premises to the Amazon SageMaker platform. It further demonstrates more advanced concepts like model debugging, model monitoring, and AutoML.

Take me to the workshop

Train, tune and deploy your workload using Amazon SageMaker

Figure 5. Train, tune and deploy your workload using Amazon SageMaker

See you next time!

Thanks for reading! With this post, introduced you to the art of possibility on using AWS machine learning services. In the next blog, we will talk about cloud migrations.

To revisit any of our previous posts or explore the entire series, visit the Let’s Architect! page.

The art of possible: Three themes from RSA Conference 2024

Post Syndicated from Anne Grahn original https://aws.amazon.com/blogs/security/the-art-of-possible-three-themes-from-rsa-conference-2024/

San Francisco skyline with Oakland Bay Bridge at sunset, California, USA

RSA Conference 2024 drew 650 speakers, 600 exhibitors, and thousands of security practitioners from across the globe to the Moscone Center in San Francisco, California from May 6 through 9.

The keynote lineup was diverse, with 33 presentations featuring speakers ranging from WarGames actor Matthew Broderick, to public and private-sector luminaries such as Cybersecurity and Infrastructure Security Agency (CISA) Director Jen Easterly, U.S. Secretary of State Antony Blinken, security technologist Bruce Schneier, and cryptography experts Tal Rabin, Whitfield Diffie, and Adi Shamir.

Topics aligned with this year’s conference theme, “The art of possible,” and focused on actions we can take to revolutionize technology through innovation, while fortifying our defenses against an evolving threat landscape.

This post highlights three themes that caught our attention: artificial intelligence (AI) security, the Secure by Design approach to building products and services, and Chief Information Security Officer (CISO) collaboration.

AI security

Organizations in all industries have started building generative AI applications using large language models (LLMs) and other foundation models (FMs) to enhance customer experiences, transform operations, improve employee productivity, and create new revenue channels. So it’s not surprising that AI dominated conversations. Over 100 sessions touched on the topic, and the desire of attendees to understand AI technology and learn how to balance its risks and opportunities was clear.

“Discussions of artificial intelligence often swirl with mysticism regarding how an AI system functions. The reality is far more simple: AI is a type of software system.” — CISA

FMs and the applications built around them are often used with highly sensitive business data such as personal data, compliance data, operational data, and financial information to optimize the model’s output. As we explore the advantages of generative AI, protecting highly sensitive data and investments is a top priority. However, many organizations aren’t paying enough attention to security.

A joint generative AI security report released by Amazon Web Services (AWS) and the IBM Institute for Business Value during the conference found that 82% of business leaders view secure and trustworthy AI as essential for their operations, but only 24% are actively securing generative AI models and embedding security processes in AI development. In fact, nearly 70% say innovation takes precedence over security, despite concerns over threats and vulnerabilities (detailed in Figure 1).

Figure 1: Generative AI adoption concerns

Figure 1: Generative AI adoption concerns, Source: IBM Security

Because data and model weights—the numerical values models learn and adjust as they train—are incredibly valuable, organizations need them to stay protected, secure, and private, whether that means restricting access from an organization’s own administrators, customers, or cloud service provider, or protecting data from vulnerabilities in software running in the organization’s own environment.

There is no silver AI-security bullet, but as the report points out, there are proactive steps you can take to start protecting your organization and leveraging AI technology to improve your security posture:

  1. Establish a governance, risk, and compliance (GRC) foundation. Trust in gen AI starts with new security governance models (Figure 2) that integrate and embed GRC capabilities into your AI initiatives, and include policies, processes, and controls that are aligned with your business objectives.

    Figure 2: Updating governance, risk, and compliance models

    Figure 2: Updating governance, risk, and compliance models, Source: IBM Security

    In the RSA Conference session AI: Law, Policy, and Common Sense Suggestions to Stay Out of Trouble, digital commerce and gaming attorney Behnam Dayanim highlighted ethical, policy, and legal considerations—including AI-specific regulations—as well as governance structures such as the National Institute of Standards and Technology (NIST) AI Risk Management Framework (AI RMF 1.0) that can help maximize a successful implementation and minimize potential risk.

  2. Strengthen your security culture. When we think of securing AI, it’s natural to focus on technical measures that can help protect the business. But organizations are made up of people—not technology. Educating employees at all levels of the organization can help avoid preventable harms such as prompt-based risks and unapproved tool use, and foster a resilient culture of cybersecurity that supports effective risk mitigation, incident detection and response, and continuous collaboration.

    “You’ve got to understand early on that security can’t be effective if you’re running it like a project or a program. You really have to run it as an operational imperative—a core function of the business. That’s when magic can happen.” — Hart Rossman, Global Services Security Vice President at AWS
  3. Engage with partners. Developing and securing AI solutions requires resources and skills that many organizations lack. Partners can provide you with comprehensive security support—whether that’s informing and advising you about generative AI, or augmenting your delivery and support capabilities. This can help make your engineers and your security controls more effective.

    While many organizations purchase security products or solutions with embedded generative AI capabilities, nearly two-thirds, as detailed in Figure 3, report that their generative AI security capabilities come through some type of partner.

    Figure 3: More than 90% of security gen AI capabilities are coming from third-party products or partners

    Figure 3: Most security gen AI capabilities are coming from third-party products or partners, Source: IBM Security

    Tens of thousands of customers are using AWS, for example, to experiment and move transformative generative AI applications into production. AWS provides AI-powered tools and services, a Generative AI Innovation Center program, and an extensive network of AWS partners that have demonstrated expertise delivering machine learning (ML) and generative AI solutions. These resources can support your teams with hands-on help developing solutions mapped to your requirements, and a broader collection of knowledge they can use to help you make the nuanced decisions required for effective security.

View the joint report and AWS generative AI security resources for additional guidance.

Secure by Design

Building secure software was a popular and related focus at the conference. Insecure design is ranked as the number four critical web application security concern on the Open Web Application Security Project (OWASP) Top 10.

The concept known as Secure by Design is gaining importance in the effort to mitigate vulnerabilities early, minimize risks, and recognize security as a core business requirement. Secure by Design builds off of security models such as Zero Trust, and aims to reduce the burden of cybersecurity and break the cycle of constantly creating and applying updates by developing products that are foundationally secure.

More than 60 technology companies—including AWS—signed CISA’s Secure by Design Pledge during RSA Conference as part of a collaborative push to put security first when designing products and services.

The pledge demonstrates a commitment to making measurable progress towards seven goals within a year:

  • Broaden the use of multi-factor authentication (MFA)
  • Reduce default passwords
  • Enable a significant reduction in the prevalence of one or more vulnerability classes
  • Increase the installation of security patches by customers
  • Publish a vulnerability disclosure policy (VDP)
  • Demonstrate transparency in vulnerability reporting
  • Strengthen the ability of customers to gather evidence of cybersecurity intrusions affecting products

“From day one, we have pioneered secure by design and secure by default practices in the cloud, so AWS is designed to be the most secure place for customers to run their workloads. We are committed to continuing to help organizations around the world elevate their security posture, and we look forward to collaborating with CISA and other stakeholders to further grow and promote security by design and default practices.” — Chris Betz, CISO at AWS

The need for security by design applies to AI like any other software system. To protect users and data, we need to build security into ML and AI with a Secure by Design approach that considers these technologies to be part of a larger software system, and weaves security into the AI pipeline.

Since models tend to have very high privileges and access to data, integrating an AI bill of materials (AI/ML BOM) and Cryptography Bill of Materials (CBOM) into BOM processes can help you catalog security-relevant information, and gain visibility into model components and data sources. Additionally, frameworks and standards such as the AI RMF 1.0, the HITRUST AI Assurance Program, and ISO/IEC 42001 can facilitate the incorporation of trustworthiness considerations into the design, development, and use of AI systems.

CISO collaboration

In the RSA Conference keynote session CISO Confidential: What Separates The Best From The Rest, Trellix CEO Bryan Palma and CISO Harold Rivas noted that there are approximately 32,000 global CISOs today—4 times more than 10 years ago. The challenges they face include staffing shortages, liability concerns, and a rapidly evolving threat landscape. According to research conducted by the Information Systems Security Association (ISSA), nearly half of organizations (46%) report that their cybersecurity team is understaffed, and more than 80% of CISOs recently surveyed by Trellix have experienced an increase in cybersecurity threats over the past six months. When asked what would most improve their organizations’ abilities to defend against these threats, their top answer was industry peers sharing insights and best practices.

Building trusted relationships with peers and technology partners can help you gain the knowledge you need to effectively communicate the story of risk to your board of directors, keep up with technology, and build success as a CISO.

AWS CISO Circles provide a forum for cybersecurity executives from organizations of all sizes and industries to share their challenges, insights, and best practices. CISOs come together in locations around the world to discuss the biggest security topics of the moment. With NDAs in place and the Chatham House Rule in effect, security leaders can feel free to speak their minds, ask questions, and get feedback from peers through candid conversations facilitated by AWS Security leaders.

“When it comes to security, community unlocks possibilities. CISO Circles give us an opportunity to deeply lean into CISOs’ concerns, and the topics that resonate with them. Chatham House Rule gives security leaders the confidence they need to speak openly and honestly with each other, and build a global community of knowledge-sharing and support.” — Clarke Rodgers, Director of Enterprise Strategy at AWS

At RSA Conference, CISO Circle attendees discussed the challenges of adopting generative AI. When asked whether CISOs or the business own generative AI risk for the organization, the consensus was that security can help with policies and recommendations, but the business should own the risk and decisions about how and when to use the technology. Some attendees noted that they took initial responsibility for generative AI risk, before transitioning ownership to an advisory board or committee comprised of leaders from their HR, legal, IT, finance, privacy, and compliance and ethics teams over time. Several CISOs expressed the belief that quickly taking ownership of generative AI risk before shepherding it to the right owner gave them a valuable opportunity to earn trust with their boards and executive peers, and to demonstrate business leadership during a time of uncertainty.

Embrace the art of possible

There are many more RSA Conference highlights on a wide range of additional topics, including post-quantum cryptography developments, identity and access management, data perimeters, threat modeling, cybersecurity budgets, and cyber insurance trends. If there’s one key takeaway, it’s that we should never underestimate what is possible from threat actors or defenders. By harnessing AI’s potential while addressing its risks, building foundationally secure products and services, and developing meaningful collaboration, we can collectively strengthen security and establish cyber resilience.

Join us to learn more about cloud security in the age of generative AI at AWS re:Inforce 2024 June 10–12 in Pennsylvania. Register today with the code SECBLOfnakb to receive a limited time $150 USD discount, while supplies last.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Anne Grahn

Anne Grahn

Anne is a Senior Worldwide Security GTM Specialist at AWS, based in Chicago. She has more than a decade of experience in the security industry, and focuses on effectively communicating cybersecurity risk. She maintains a Certified Information Systems Security Professional (CISSP) certification.

Danielle Ruderman

Danielle Ruderman

Danielle is a Senior Manager for the AWS Worldwide Security Specialist Organization, where she leads a team that enables global CISOs and security leaders to better secure their cloud environments. Danielle is passionate about improving security by building company security culture that starts with employee engagement.

AWS Weekly Roundup – LlamaIndex support for Amazon Neptune, force AWS CloudFormation stack deletion, and more (May 27, 2024)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-llamaindex-support-for-amazon-neptune-force-aws-cloudformation-stack-deletion-and-more-may-27-2024/

Last week, Dr. Matt Wood, VP for AI Products at Amazon Web Services (AWS), delivered the keynote at the AWS Summit Los Angeles. Matt and guest speakers shared the latest advancements in generative artificial intelligence (generative AI), developer tooling, and foundational infrastructure, showcasing how they come together to change what’s possible for builders. You can watch the full keynote on YouTube.

AWS Summit LA 2024 keynote

Announcements during the LA Summit included two new Amazon Q courses as part of Amazon’s AI Ready initiative to provide free AI skills training to 2 million people globally by 2025. The courses are part of the Amazon Q learning plan. But that’s not all that happened last week.

Last week’s launches
Here are some launches that got my attention:

LlamaIndex support for Amazon Neptune — You can now build Graph Retrieval Augmented Generation (GraphRAG) applications by combining knowledge graphs stored in Amazon Neptune and LlamaIndex, a popular open source framework for building applications with large language models (LLMs) such as those available in Amazon Bedrock. To learn more, check the LlamaIndex documentation for Amazon Neptune Graph Store.

AWS CloudFormation launches a new parameter called DeletionMode for the DeleteStack API — You can use the AWS CloudFormation DeleteStack API to delete your stacks and stack resources. However, certain stack resources can prevent the DeleteStack API from successfully completing, for example, when you attempt to delete non-empty Amazon Simple Storage Service (Amazon S3) buckets. The DeleteStack API can enter into the DELETE_FAILED state in such scenarios. With this launch, you can now pass FORCE_DELETE_STACK value to the new DeletionMode parameter and delete such stacks. To learn more, check the DeleteStack API documentation.

Mistral Small now available in Amazon Bedrock — The Mistral Small foundation model (FM) from Mistral AI is now generally available in Amazon Bedrock. This a fast-follow to our recent announcements of Mistral 7B and Mixtral 8x7B in March, and Mistral Large in April. Mistral Small, developed by Mistral AI, is a highly efficient large language model (LLM) optimized for high-volume, low-latency language-based tasks. To learn more, check Esra’s post.

New Amazon CloudFront edge location in Cairo, Egypt — The new AWS edge location brings the full suite of benefits provided by Amazon CloudFront, a secure, highly distributed, and scalable content delivery network (CDN) that delivers static and dynamic content, APIs, and live and on-demand video with low latency and high performance. Customers in Egypt can expect up to 30 percent improvement in latency, on average, for data delivered through the new edge location. To learn more about AWS edge locations, visit CloudFront edge locations.

Amazon OpenSearch Service zero-ETL integration with Amazon S3 — This Amazon OpenSearch Service integration offers a new efficient way to query operational logs in Amazon S3 data lakes, eliminating the need to switch between tools to analyze data. You can get started by installing out-of-the-box dashboards for AWS log types such as Amazon VPC Flow Logs, AWS WAF Logs, and Elastic Load Balancing (ELB). To learn more, check out the Amazon OpenSearch Service Integrations page and the Amazon OpenSearch Service Developer Guide.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional news items and a Twitch show that you might find interesting:

AWS Build On Generative AIBuild On Generative AI — Now streaming every Thursday, 2:00 PM US PT on twitch.tv/aws, my colleagues Tiffany and Mike discuss different aspects of generative AI and invite guest speakers to demo their work. Check out show notes and the full list of episodes on community.aws.

Amazon Bedrock Studio bootstrapper script — We’ve heard your feedback! To everyone who struggled setting up the required AWS Identity and Access Management (IAM) roles and permissions to get started with Amazon Bedrock Studio: You can now use the Bedrock Studio bootstrapper script to automate the creation of the permissions boundary, service role, and provisioning role.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS SummitsAWS Summits — It’s AWS Summit season! Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Dubai (May 29), Bangkok (May 30), Stockholm (June 4), Madrid (June 5), and Washington, DC (June 26–27).

AWS re:InforceAWS re:Inforce — Join us for AWS re:Inforce (June 10–12) in Philadelphia, PA. AWS re:Inforce is a learning conference focused on AWS security solutions, cloud security, compliance, and identity. Connect with the AWS teams that build the security tools and meet AWS customers to learn about their security journeys.

AWS Community DaysAWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), New Zealand (August 15), Nigeria (August 24), and New York (August 28).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Antje

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Optimized for low-latency workloads, Mistral Small now available in Amazon Bedrock

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/optimized-for-low-latency-workloads-mistral-small-now-available-in-amazon-bedrock/

Today, I am happy to announce that the Mistral Small foundation model (FM) from Mistral AI is now generally available in Amazon Bedrock. This a fast-follow to our recent announcements of Mistral 7B and Mixtral 8x7B in March, and Mistral Large in April. You can now access four high-performing models from Mistral AI in Amazon Bedrock including Mistral Small, Mistral Large, Mistral 7B, and Mixtral 8x7B, further expanding model choice.

Mistral Small, developed by Mistral AI, is a highly efficient large language model (LLM) optimized for high-volume, low-latency language-based tasks. Mistral Small is perfectly suited for straightforward tasks that can be performed in bulk, such as classification, customer support, or text generation. It provides outstanding performance at a cost-effective price point.

Some key features of Mistral Small you need to know about:

  • Retrieval-Augmented Generation (RAG) specialization – Mistral Small ensures that important information is retained even in long context windows, which can extend up to 32K tokens.
  • Coding proficiency – Mistral Small excels in code generation, review, and commenting, supporting major coding languages.
  • Multilingual capability – Mistral Small delivers top-tier performance in French, German, Spanish, and Italian, in addition to English. It also supports dozens of other languages.

Getting started with Mistral Small
I first need access to the model to get started with Mistral Small. I go to the Amazon Bedrock console, choose Model access, and then choose Manage model access. I expand the Mistral AI section, choose Mistral Small, and then choose Save changes.

I now have model access to Mistral Small, and I can start using it in Amazon Bedrock. I refresh the Base models table to view the current status.

I use the following template to build a prompt for the model to get sub-optimal outputs:

<s>[INST] Instruction [/INST]

Note that <s> is a special token for beginning of string (BOS) while [INST] and [/INST] are regular strings.

I try the following prompt to see a classification example:

Prompt:

<s>[INST] Classify the following email to determine if it is spam or not. Only respond with the exact words "Spam" or "Not spam".
🎉 Limited time offer! Buy 2, Get 1 Free! Offer ends today! Don't miss out the CHANCE TO SAVE, please click on the link below: https://bit.ly/buy-2-get-1-free [/INST]

Mistral 7B, Mixtral 8x7B, and Mistral Large can all correctly classify this email as “Spam.” Mistral Small is also able to classify this accurately, just as the larger models can. I also try several similar tasks, such as generating a Bash script from a text prompt and generating a recipe to prepare yoghurt, and get good results. For this reason, Mistral Small is the most cost-effective and efficient option of the Mistral AI models in Amazon Bedrock for such tasks.

Mistral Small excels in multilingual tasks. Along with English, it delivers superior performance in French, German, Spanish, and Italian. I test the model’s understanding of the German language by asking it to give me two sentences about sustainability:

Prompt:

<s>[INST] Geben Sie mir zwei Sätze zum Thema Nachhaltigkeit. [/INST]

Output:

1. Nachhaltigkeit bedeutet, die Bedürfnisse der heutigen Generation zu befriedigen, ohne die Möglichkeiten künftiger Generationen zu gefährden.
2. Ein nachhaltiger Lebensstil umfasst den bewussten Umgang mit Ressourcen, wie Wasser, Energie und Rohstoffen, sowie den Schutz von Umwelt und Klima.
Diese Sätze sollten Ihnen einen guten Überblick über das Thema Nachhaltigkeit geben.

Programmatically interact with Mistral Small
I can use AWS Command Line Interface (AWS CLI) and AWS Software Development Kit (SDK) to programmatically interact with Mistral Small using Amazon Bedrock APIs. I use the following code in Python, which interacts with Amazon Bedrock Runtime APIs with AWS SDK, asking, “What is the color of the sky?”:

import argparse
import boto3
from botocore.exceptions import ClientError
import json

accept = "application/json"
content_type = "application/json"

def invoke_model(model_id, input_data, region, streaming): 
  client = boto3.client('bedrock-runtime', region_name=region)
  try:
    if streaming:
      response = client.invoke_model_with_response_stream(body=input_data, modelId=model_id, accept=accept, contentType=content_type)
    else:
      response = client.invoke_model(body=input_data, modelId=model_id, accept=accept, contentType=content_type)
    status_code = response['ResponseMetadata']['HTTPStatusCode']
    print(json.loads(response.get('body').read()))
  except ClientError as e:
    print(e)

if __name__ == "__main__":
  parser = argparse.ArgumentParser(description="Bedrock Testing Tool")
  parser.add_argument("--prompt", type=str, help="prompt to use", default="Hello")
  parser.add_argument("--max-tokens", type=int, default=64)
  parser.add_argument("--streaming", choices=["true", "false"], help="whether to stream or not", default="false")
  args = parser.parse_args()
  streaming = False
  if args.streaming == "true":
    streaming = True
  input_data = json.dumps({
    "prompt": f"<s>[INST]{args.prompt}[/INST]",
    "max_tokens": args.max_tokens
  })
  invoke_model(model_id="mistral.mistral-small-2402-v1:0", input_data=input_data, region="us-east-1", streaming=streaming)

I get the following output:

{'outputs': [{'text': ' The color of the sky can vary depending on the time of day, weather,', 'stop_reason': 'length'}]}

Now available
The Mistral Small model is now available in Amazon Bedrock in the US East (N. Virginia) Region.

To learn more, visit the Mistral AI in Amazon Bedrock product page. For pricing details, review the Amazon Bedrock pricing page.

To get started with Mistral Small in Amazon Bedrock, visit the Amazon Bedrock console and Amazon Bedrock User Guide.

— Esra

Personal AI Assistants and Privacy

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2024/05/personal-ai-assistants-and-privacy.html

Microsoft is trying to create a personal digital assistant:

At a Build conference event on Monday, Microsoft revealed a new AI-powered feature called “Recall” for Copilot+ PCs that will allow Windows 11 users to search and retrieve their past activities on their PC. To make it work, Recall records everything users do on their PC, including activities in apps, communications in live meetings, and websites visited for research. Despite encryption and local storage, the new feature raises privacy concerns for certain Windows users.

I wrote about this AI trust problem last year:

One of the promises of generative AI is a personal digital assistant. Acting as your advocate with others, and as a butler with you. This requires an intimacy greater than your search engine, email provider, cloud storage system, or phone. You’re going to want it with you 24/7, constantly training on everything you do. You will want it to know everything about you, so it can most effectively work on your behalf.

And it will help you in many ways. It will notice your moods and know what to suggest. It will anticipate your needs and work to satisfy them. It will be your therapist, life coach, and relationship counselor.

You will default to thinking of it as a friend. You will speak to it in natural language, and it will respond in kind. If it is a robot, it will look humanoid—­or at least like an animal. It will interact with the whole of your existence, just like another person would.

[…]

And you will want to trust it. It will use your mannerisms and cultural references. It will have a convincing voice, a confident tone, and an authoritative manner. Its personality will be optimized to exactly what you like and respond to.

It will act trustworthy, but it will not be trustworthy. We won’t know how they are trained. We won’t know their secret instructions. We won’t know their biases, either accidental or deliberate.

We do know that they are built at enormous expense, mostly in secret, by profit-maximizing corporations for their own benefit.

[…]

All of this is a long-winded way of saying that we need trustworthy AI. AI whose behavior, limitations, and training are understood. AI whose biases are understood, and corrected for. AI whose goals are understood. That won’t secretly betray your trust to someone else.

The market will not provide this on its own. Corporations are profit maximizers, at the expense of society. And the incentives of surveillance capitalism are just too much to resist.

We are going to need some sort of public AI to counterbalance all of these corporate AIs.

EDITED TO ADD (5/24): Lots of comments about Microsoft Recall and security:

This:

Because Recall is “default allow” (it relies on a list of things not to record) … it’s going to vacuum up huge volumes and heretofore unknown types of data, most of which are ephemeral today. The “we can’t avoid saving passwords if they’re not masked” warning Microsoft included is only the tip of that iceberg. There’s an ocean of data that the security ecosystem assumes is “out of reach” because it’s either never stored, or it’s encrypted in transit. All of that goes out the window if the endpoint is just going to…turn around and write it to disk. (And local encryption at rest won’t help much here if the data is queryable in the user’s own authentication context!)

This:

The fact that Microsoft’s new Recall thing won’t capture DRM content means the engineers do understand the risk of logging everything. They just chose to preference the interests of corporates and money over people, deliberately.

This:

Microsoft Recall is going to make post-breach impact analysis impossible. Right now IR processes can establish a timeline of data stewardship to identify what information may have been available to an attacker based on the level of access they obtained. It’s not trivial work, but IR folks can do it. Once a system with Recall is compromised, all data that has touched that system is potentially compromised too, and the ML indirection makes it near impossible to confidently identify a blast radius.

This:

You may be in a position where leaders in your company are hot to turn on Microsoft Copilot Recall. Your best counterargument isn’t threat actors stealing company data. It’s that opposing counsel will request the recall data and demand it not be disabled as part of e-discovery proceedings.

AI Trust Risk and Security Management: Why Tackle Them Now?

Post Syndicated from Laura Ellis original https://blog.rapid7.com/2024/05/15/ai-trust-risk-and-security-management-why-tackle-them-now/

AI Trust Risk and Security Management: Why Tackle Them Now?

Co-authored by Sabeen Malik and Laura Ellis

In the evolving world of artificial intelligence (AI), keeping our customers secure and maintaining their trust is our top priority. As AI technologies integrate more deeply into our daily operations and services, they bring a set of unique challenges that demand a robust management strategy:

  1. The Black Box Dilemma: AI models pose significant challenges in terms of transparency and predictability. This opaque nature can complicate efforts to diagnose and rectify issues, making predictability and reliability hard to achieve.
  2. Model Fragility: AI’s performance is closely tied to the data it processes. Over time, subtle changes in data input—known as data drift—can degrade an AI system’s accuracy, necessitating constant monitoring and adjustments.
  3. Easy Access, Big Responsibility: The democratization of AI through cloud services means that powerful AI tools are just a few clicks away for developers. This ease of access underscores the need for rigorous security measures to prevent misuse and effectively manage vulnerabilities.
  4. Staying Ahead of the Curve: With AI regulation still in its formative stages, proactive development of self-regulatory frameworks like ours helps inform our future AI regulatory compliance frameworks; but most importantly, it builds trust among our customers. When thinking about AI’s promises and challenges, we know that trust is earned. But that trust is also is of concern for global policymakers, and that is why we are looking forward to engaging with NIST on discussions related to the AI Risk Management, Cyber Security, and Privacy frameworks. It’s  also why we were an inaugural signer of the CISA Secure by Design Pledge to demonstrate to government stakeholders and customers our commitment to building things and understanding the stakes at large.

Our TRiSM (Trust, Risk, and Security Management) framework isn’t merely a component of our operations—it’s a foundational strategy that guides us in navigating the intricate landscape of AI with confidence and security.

How We Approach AI Security at Rapid7

Rapid7 leverages the best available technology to protect our customers’ attack surfaces. Our mission drives us to keep abreast of the latest AI advancements to deliver optimal value to customers while effectively managing the inherent risks of the technology.

Innovation and scientific excellence are key aspects of our AI strategy. We strive for continuous improvement, leveraging the latest technological innovations and scientific research. By engaging with thought leaders and adopting best practices, we aim to stay at the forefront of AI technology, ensuring our solutions are not only effective but also pioneering and thoughtful.

Our AI principles center on transparency, fairness, safety, security, privacy, and accountability. These principles are not just guidelines; they are integral to how we build, deploy, and manage our AI systems. Accountability is a cornerstone of our strategy, and we hold ourselves responsible for the proper functioning of our AI systems so we can ensure they respect and embody our principles throughout their lifecycle. This includes ongoing oversight, regular audits, and adjustments as needed based on feedback and evolving standards.

We have leveraged a number of AI risk management frameworks to inform our approach.  Most notably, we have adopted the NIST AI Risk Management Framework and the Open Standard for Responsible AI. These frameworks help us comprehensively assess and manage AI risks, from the early stages of development through deployment and ongoing use. The NIST framework provides a thorough methodology for lifecycle risk management, while the Open Standard offers practical tools for evaluation and ensures that our AI systems are user-centric and responsible.

We are committed to ensuring that our AI deployments are not only technologically advanced but also adhere to the highest standards of security and ethical responsibility.

AI Integration in Action: Making It Work Day-to-Day

We take a practical approach to adhere to our AI TRiSM framework by integrating it into the daily operations of our existing technologies and processes, ensuring that AI enhances rather than complicates our security posture:

  1. Clear Rules: We have developed and implemented detailed enterprise-wide policies and operational procedures that govern the deployment and use of AI technologies. These guidelines ensure consistency and compliance across all departments and initiatives.
  2. Transparency Matters: We leverage our own tooling to gain visibility into our cloud security posture for AI.  We leverage InsightCloudSec solutions to provide comprehensive visibility into our AI deployments across various environments. This visibility is crucial for our security strategy, encapsulated by the philosophy, “You can’t protect what you can’t see.” It allows us to monitor, evaluate, and adjust our AI resources proactively.
  3. Throughout the Development Lifecycle: We integrate rigorous AI evaluations at every phase of our software development lifecycle. From the initial development stages to production and through regular post-deployment assessments, our framework ensures that AI systems are safe, effective, and aligned with our ethical standards.
  4. Smart Governance: By embedding AI-specific governance protocols into our existing code and cloud configuration management systems, we maintain strict control over all AI-related activities. This integration ensures that our AI initiatives comply with established best practices and regulatory requirements.
  5. Empowering Our Team: We recognize the critical need for advanced AI skills in today’s tech landscape. To address this, we offer training programs and collaborative opportunities, which not only foster innovation but also ensure adherence to best practices. This approach empowers our teams to innovate confidently within a secure and supportive environment.

Integrating AI into our core processes enhances our operational security and underscores our commitment to ethical innovation. At Rapid7, we are dedicated to leading responsibly in the AI space, ensuring that our technological advancements positively contribute to our customers, company, and society.

Our AI TRiSM framework is not merely a set of policies—it’s a proactive, strategic approach to securely and ethically harnessing new technologies. As we continue to innovate and push the boundaries of what’s possible with AI, we stay focused on setting a high bar for standards of responsible and secure AI usage, ensuring that our customers always receive the best technology solutions. Learn more here.