Introducing Amazon CloudFront KeyValueStore: A low-latency datastore for CloudFront Functions

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/introducing-amazon-cloudfront-keyvaluestore-a-low-latency-datastore-for-cloudfront-functions/

Amazon CloudFront allows you to securely deliver static and dynamic content with low latency and high transfer speeds. With CloudFront Functions, you can perform latency-sensitive customizations for millions of requests per second. For example, you can use CloudFront Functions to modify headers, normalize cache keys, rewrite URLs, or authorize requests.

Today, we are introducing CloudFront KeyValueStore, a secure global low-latency key value datastore that allows read access from within CloudFront Functions, enabling advanced customizable logic at the CloudFront edge locations.

Previously, you had to embed configuration data inside the function code, for example, the data that determines whether a URL should be redirected and which URL to redirect the viewer to. When configuration data is embedded in the function code, every small change in configuration requires a code change and a redeployment of the function. Updating and deploying code for every new lookup entry introduces the risk of making inadvertent changes to the code. Also, the maximum function size is 10 KB, which makes it difficult for many use cases to fit all the data within the code.

With CloudFront KeyValueStore, you can now update the data associated with a function and the function code independently from each other. This simplifies function code and makes it easy to update data without the need to deploy code changes.

Let’s see how this works in practice.

Creating a CloudFront key value store
In the CloudFront console, I choose Functions from the navigation pane. In the KeyValueStores tab, I choose Create KeyValueStore.

Here, I have the option to import key value pairs from a JSON file in an Amazon Simple Storage Service (Amazon S3) bucket. I am not doing that now because I want to start with no keys. I enter a name and description and complete the creation of the key value store.

Console screenshot.

When the key value store has been created, I choose Edit in the Key value pairs section and then Add pair. I type hello for the key and Hello World for the value and save the changes. I can add more keys and values, but one key is enough for now.

Console screenshot.

When I update a key value store, changes are propagated to all CloudFront edge locations in a few seconds so that the data can be read with low latency by the functions that are associated with the key value store. Let’s see how that works.

Using CloudFront KeyValueStore from CloudFront Functions
In the CloudFront console, I choose Functions in the navigation pane and then Create function. I type a name for the function, select the cloudfront-js-2.0 runtime, and complete the creation of the function. Then, I use the new option to associate the key value store with this function.

Console screenshot.

I copy the key value store ID from the console to use it in the following function code:

import cf from 'cloudfront';

const kvsId = '<KEY_VALUE_STORE_ID>';

// This fails if the key value store is not associated with the function
const kvsHandle = cf.kvs(kvsId);

async function handler(event) {
    // Use the first part of the pathname as key, for example http(s)://domain/<key>/something/else
    const key = event.request.uri.split('/')[1];
    let value = "Not found"; // Default value
    try {
        value = await kvsHandle.get(key);
    } catch (err) {
        console.log(`Kvs key lookup failed for ${key}: ${err}`);
    }
    const response = {
        statusCode: 200,
        statusDescription: 'OK',
        body: {
            encoding: 'text',
            data: `Key: ${key} Value: ${value}\n`
        }
    };
    return response;
}

This function uses the first segment of the request path as the key and responds with the name of the key and its value.
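To make the key derivation concrete, here is a minimal sketch of the expression the handler uses to pick the key. The helper name firstPathSegment is hypothetical and not part of the CloudFront Functions API; it only wraps the split call so that it can be tried outside CloudFront.

```javascript
// Hypothetical helper (not part of the CloudFront API): returns the first
// segment of a request URI, using the same expression as the handler.
function firstPathSegment(uri) {
    // '/hello/world'.split('/') gives ['', 'hello', 'world'],
    // so index 1 holds the first segment after the leading slash.
    return uri.split('/')[1];
}

console.log(firstPathSegment('/hello'));        // hello
console.log(firstPathSegment('/hello/world'));  // hello
console.log(firstPathSegment('/'));             // (empty string)
```

For the root path the key is an empty string. Assuming no such key exists in the store, the lookup fails and the catch block leaves the default value in place.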

I save the changes and publish the function. In the Publish tab of the function, I associate the function with a CloudFront distribution that I created before. I use the Viewer Request event type and Default (*) cache behavior to intercept all requests to the distribution.

In the console, I go back to the list of functions and wait for the function to be deployed. Then, I use curl from the command line to download content from the distribution and test the result of the function.

First, I try with a couple of paths that invoke the function and look up the key I created before (hello):

curl https://distribution-domain.cloudfront.net/hello
Key: hello Value: Hello World

curl https://distribution-domain.cloudfront.net/hello/world
Key: hello Value: Hello World

It works! Then, I try with a different path to see that the default value I use in the code is returned when the key is not found.

curl https://distribution-domain.cloudfront.net/hi
Key: hi Value: Not found
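The three responses above can be reproduced locally with a small sketch. It assumes an in-memory stand-in for the key value store: createStubKvs and lookupBody are hypothetical names, and the stub only imitates the behavior the handler relies on, namely that get rejects when the key is missing.

```javascript
// Hypothetical in-memory stand-in for the handle returned by cf.kvs().
// Assumption: like the real handle, get() rejects when the key is missing.
function createStubKvs(pairs) {
    const store = new Map(Object.entries(pairs));
    return {
        async get(key) {
            if (!store.has(key)) {
                throw new Error(`Key not found: ${key}`);
            }
            return store.get(key);
        }
    };
}

// Same lookup logic as the handler, with the store passed in as a
// parameter so that it can run outside CloudFront.
async function lookupBody(kvsHandle, uri) {
    const key = uri.split('/')[1];
    let value = 'Not found'; // Default value
    try {
        value = await kvsHandle.get(key);
    } catch (err) {
        // Missing key: keep the default value
    }
    return `Key: ${key} Value: ${value}\n`;
}

const stubKvs = createStubKvs({ hello: 'Hello World' });

async function main() {
    for (const uri of ['/hello', '/hello/world', '/hi']) {
        process.stdout.write(await lookupBody(stubKvs, uri));
    }
}

main();
```

Passing the store in as a parameter is a design choice for this sketch only; in the deployed function the handle comes from cf.kvs and is created once, outside the handler.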

Now that this first example works, let’s try something more advanced and useful.

Rewriting the URL using configuration data in CloudFront KeyValueStore
Let’s build a function that uses the URL path of the HTTP request to look up, in a key value store, the custom path that CloudFront should use for the actual request. This function can help manage the multiple services that are part of a website.

For example, I want to update the blog platform I use for my website. The old blog has origin path /blog-v1 while the new blog has origin path /blog-v2.

Architectural diagram.

At first, I am still using the old blog. In the CloudFront console, I add the blog key to the key value store with value blog-v1.

Then, I create the following function and associate it with the distribution using the Viewer Request event type and the Default (*) cache behavior to intercept all requests to the distribution.

import cf from 'cloudfront';

const kvsId = "<KEY_VALUE_STORE_ID>";

// This fails if the key value store is not associated with the function
const kvsHandle = cf.kvs(kvsId);

async function handler(event) {
    const request = event.request;
    // Use the first segment of the pathname as key
    // For example http(s)://domain/<key>/something/else
    const pathSegments = request.uri.split('/');
    const key = pathSegments[1];
    try {
        // Replace the first segment of the pathname with the value of the key
        // For example http(s)://domain/<value>/something/else
        pathSegments[1] = await kvsHandle.get(key);
        const newUri = pathSegments.join('/');
        console.log(`${request.uri} -> ${newUri}`);
        request.uri = newUri;
    } catch (err) {
        // No change to the pathname if the key is not found
        console.log(`${request.uri} | ${err}`);
    }
    return request;
}

Now, when I type blog at the beginning of the URL path, the request will actually go to the blog-v1 path. CloudFront will make the HTTP request to the old blog because blog-v1 is the origin path used by the old blog.

For example, if I type https://distribution-domain.cloudfront.net/blog/index.html in a browser, I see the old blog (V1).

Browser screenshot showing blog V1.

In the console, I update the blog key with value blog-v2. I access the same URL after a few seconds, and now I reach the new blog (V2).

Browser screenshot showing blog V2.

As you can see, the public URL is the same, but the content has changed. More generally, this function assumes that URLs do not change between the two blog versions.
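The switch from V1 to V2 can be traced with a synchronous sketch of the rewrite step. The function name rewriteUri is hypothetical, and a plain Map stands in for the key value store (an assumption for local experimentation; the real function awaits kvsHandle.get and treats a rejected lookup as "key not found").

```javascript
// Hypothetical pure function mirroring the rewrite step of the handler.
// `routes` is a plain Map standing in for the key value store.
function rewriteUri(uri, routes) {
    const pathSegments = uri.split('/');
    const key = pathSegments[1];
    if (!routes.has(key)) {
        return uri; // No change to the pathname if the key is not found
    }
    pathSegments[1] = routes.get(key);
    return pathSegments.join('/');
}

const routes = new Map([['blog', 'blog-v1']]);
console.log(rewriteUri('/blog/index.html', routes));  // /blog-v1/index.html

// Updating the value switches the origin path; the public URL stays the same.
routes.set('blog', 'blog-v2');
console.log(rewriteUri('/blog/index.html', routes));  // /blog-v2/index.html

// Paths whose first segment is not a key pass through unchanged.
console.log(rewriteUri('/about/team.html', routes));  // /about/team.html
```

Only the first segment is replaced, which is why the rest of the path has to be valid on both blog versions.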

I can now add more keys for the different services that are part of my website (blog, support, help, commerce, and so on) and set their values to use the correct URL path for each of them. When I add a new version for one of them (for example, I migrate to a new commerce platform), I can configure a new origin and update the corresponding key to use the new origin path.

This is just an example of the flexibility you get when you separate configuration data from code. If you are already using CloudFront Functions, you can simplify your code by using CloudFront KeyValueStore.

Things to know
CloudFront KeyValueStore is available today in all edge locations globally. With CloudFront KeyValueStore, you pay only for what you use based on the read/write operations from the public API and the read operations from within CloudFront Functions. For more information, see CloudFront pricing.

You can manage a key value store using the AWS Management Console, AWS Command Line Interface (AWS CLI), and AWS SDKs. AWS CloudFormation support is coming soon. The maximum size of a key value store is 5 MB, and you can associate a single key value store with each function. The maximum size of a key is 512 bytes, and the maximum size of a value is 1 KB. When creating a key value store, you can import key/value data from a source file on Amazon S3 with this JSON structure:

{
  "data":[
    {
      "key":"key1",
      "value":"val1"
    },
    {
      "key":"key2",
      "value":"val2"
    }
  ]
}

Importing key/value data at creation can help automate the setup of a new environment (such as test or dev) and easily replicate the configuration from one environment to another (such as preproduction to production).
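As a minimal sketch of that automation, the following helper builds the import structure from a plain object and rejects entries that exceed the per-key and per-value limits quoted above. The name buildImportDocument is hypothetical, and the sketch assumes that 1 KB means 1,024 bytes and that sizes are measured on the UTF-8 encoding. It does not check the 5 MB total, because the post does not say how that total is measured.

```javascript
// Limits quoted in the post. Assumption: 1 KB means 1,024 bytes and sizes
// are measured on the UTF-8 encoding of each string.
const MAX_KEY_BYTES = 512;
const MAX_VALUE_BYTES = 1024;

const utf8Length = (text) => new TextEncoder().encode(text).length;

// Hypothetical helper: converts a plain object of string values into the
// import structure, rejecting entries over the per-key or per-value limit.
function buildImportDocument(pairs) {
    const data = Object.entries(pairs).map(([key, value]) => {
        if (utf8Length(key) > MAX_KEY_BYTES) {
            throw new RangeError(`Key exceeds ${MAX_KEY_BYTES} bytes: ${key}`);
        }
        if (utf8Length(value) > MAX_VALUE_BYTES) {
            throw new RangeError(`Value for "${key}" exceeds ${MAX_VALUE_BYTES} bytes`);
        }
        return { key, value };
    });
    return { data };
}

const doc = buildImportDocument({ blog: 'blog-v1', support: 'support-v1' });
console.log(JSON.stringify(doc, null, 2));
```

The printed JSON can then be uploaded to an S3 bucket and selected as the import source when creating the key value store.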

Simplify the way you add custom logic at the edge using CloudFront KeyValueStore.

Danilo

Your DevOps and Developer Productivity guide to re:Invent 2023

Post Syndicated from Anubhav Rao original https://aws.amazon.com/blogs/devops/your-devops-and-developer-productivity-guide-to-reinvent-2023/

ICYMI – AWS re:Invent is less than a week away! We can’t wait to join thousands of builders in person and virtually for another exciting event. Still need to save your spot? You can register here.

With so much planned for the DevOps and Developer Productivity (DOP) track at re:Invent, we’re highlighting the most exciting sessions for technology leaders and developers in this post. Sessions span intermediate (200) through expert (400) levels of content in a mix of interactive chalk talks, hands-on workshops, and lecture-style breakout sessions.

You will experience the future of efficient development at the DevOps and Developer Productivity track and get a chance to talk to AWS experts about exciting services, tools, and new AI capabilities that optimize and automate your software development lifecycle. Attendees will leave re:Invent with the latest strategies to accelerate development, use generative AI to improve developer productivity, and focus on high-value work and innovation.

How to reserve a seat in the sessions

Reserved seating is available for registered attendees to secure seats in the sessions of their choice. Reserve a seat by signing in to the attendee portal and navigating to Event, then Sessions.

Do not miss the Innovation Talk led by Vice President of AWS Generative Builders, Adam Seligman. In DOP225-INT Build without limits: The next-generation developer experience at AWS, Adam will provide updates on the latest developer tools and services, including generative AI-powered capabilities, low-code abstractions, cloud development, and operations. He’ll also welcome special guests to lead demos of key developer services and showcase how they integrate to increase productivity and innovation.

DevOps and Developer Productivity breakout sessions

What are breakout sessions?

AWS re:Invent breakout sessions are lecture-style and 60 minutes long. These sessions are delivered by AWS experts and typically reserve 10–15 minutes for Q&A at the end. Breakout sessions are recorded and made available on-demand after the event.

Level 200 — Intermediate

DOP201 | Best practices for Amazon CodeWhisperer
Generative AI can create new content and ideas, including conversations, stories, images, videos, and music. Learning how to interact with generative AI effectively and proficiently is a skill worth developing. Join this session to learn about best practices for engaging with Amazon CodeWhisperer, which uses an underlying foundation model to radically improve developer productivity by generating code suggestions in real time.

DOP202 | Realizing the developer productivity benefits of Amazon CodeWhisperer
Developers spend a significant amount of their time writing undifferentiated code. Amazon CodeWhisperer radically improves productivity by generating code suggestions in real time to alleviate this burden. In this session, learn how CodeWhisperer can “write” much of this undifferentiated code, allowing developers to focus on business logic and accelerate the pace of their innovation.

DOP205 | Accelerate development with Amazon CodeCatalyst
In this session, explore the newest features in Amazon CodeCatalyst. Learn firsthand how these practical additions to CodeCatalyst can simplify application delivery, improve team collaboration, and speed up the software development lifecycle from concept to deployment.

DOP206 | AWS infrastructure as code: A year in review
AWS provides services that help with the creation, deployment, and maintenance of application infrastructure in a programmatic, descriptive, and declarative way. These services help provide rigor, clarity, and reliability to application development. Join this session to learn about the new features and improvements for AWS infrastructure as code with AWS CloudFormation and AWS Cloud Development Kit (AWS CDK) and how they can benefit your team.

DOP207 | Build and run it: Streamline DevOps with machine learning on AWS
While organizations have improved how they deliver and operate software, development teams still run into issues when performing manual code reviews, looking for hard-to-find defects, and uncovering security-related problems. Developers have to keep up with multiple programming languages and frameworks, and their productivity can be impaired when they have to search online for code snippets. Additionally, they require expertise in observability to successfully operate the applications they build. In this session, learn how companies like Fidelity Investments use machine learning–powered tools like Amazon CodeWhisperer and Amazon DevOps Guru to boost application availability and write software faster and more reliably.

DOP208 | Continuous integration and delivery for AWS
AWS provides one place where you can plan work, collaborate on code, build, test, and deploy applications with continuous integration/continuous delivery (CI/CD) tools. In this session, learn about how to create end-to-end CI/CD pipelines using infrastructure as code on AWS.

DOP209 | Governance and security with infrastructure as code
In this session, learn how to use AWS CloudFormation and the AWS CDK to deploy cloud applications in regulated environments while enforcing security controls. Find out how to catch issues early with cdk-nag, validate your pipelines with cfn-guard, and protect your accounts from unintended changes with CloudFormation hooks.

DOP210 | Scale your application development with Amazon CodeCatalyst
Amazon CodeCatalyst brings together everything you need to build, deploy, and collaborate on software into one integrated software development service. In this session, discover the ways that CodeCatalyst helps developers and teams build and ship code faster while spending more time doing the work they love.

DOP211 | Boost developer productivity with Amazon CodeWhisperer
Generative AI is transforming the way that developers work. Writing code is already getting disrupted by tools like Amazon CodeWhisperer, which enhances developer productivity by providing real-time code completions based on natural language prompts. In this session, get insights into how to evaluate and measure productivity with the adoption of generative AI–powered tools. Learn from the AWS Disaster Recovery team, which uses CodeWhisperer to solve complex engineering problems by gaining efficiency through longer productivity cycles and increasing velocity to market for ongoing fixes. Hear how integrating tools like CodeWhisperer into your workflows can boost productivity.

DOP212 | New AWS generative AI features and tools for developers
Explore how generative AI coding tools are changing the way developers and companies build software. Generative AI–powered tools are boosting developer and business productivity by automating tasks, improving communication and collaboration, and providing insights that can inform better decision-making. In this session, see the newest AWS tools and features that make it easier for builders to solve problems with minimal technical expertise and that help technical teams boost productivity. Walk through how organizations like FINRA are exploring generative AI and beginning their journey using these tools to accelerate their pace of innovation.

DOP220 | Simplify building applications with AWS SDKs
AWS SDKs play a vital role in using AWS services in your organization’s applications and services. In this session, learn about the current state and the future of AWS SDKs. Explore how they can simplify your developer experience and unlock new capabilities. Discover how SDKs are evolving, providing a consistent experience in multiple languages and empowering you to do more with high-level abstractions to make it easier to build on AWS. Learn how AWS SDKs are built using open source tools like Smithy, and how you can use these tools to build your own SDKs to serve your customers’ needs.

DevOps and Developer Productivity chalk talks

What are chalk talks?

Chalk talks are highly interactive sessions with a small audience. Experts lead you through problems and solutions on a digital whiteboard as the discussion unfolds. Each begins with a short lecture (10–15 minutes) delivered by an AWS expert, followed by a 45- or 50-minute Q&A session with the audience.

Level 300 — Advanced

DOP306 | Streamline DevSecOps with a complete software development service
Security is not just for application code: the automated software supply chains that build modern software can also be exploited by attackers. In this chalk talk, learn how you can use Amazon CodeCatalyst to incorporate security tests into every aspect of your software development lifecycle while maintaining a great developer experience. Discover how CodeCatalyst’s flexible actions-based CI/CD workflows streamline the process of adapting to security threats.

DOP309-R | AI for DevOps: Modernizing your DevOps operations with AWS
As more organizations move to microservices architectures to scale their businesses, applications have become increasingly distributed, requiring even greater visibility. IT operations professionals and developers need more automated practices to maintain application availability and reduce the time and effort required to detect, debug, and resolve operational issues. In this chalk talk, discover how you can use AWS services, including Amazon CodeWhisperer, Amazon CodeGuru, and Amazon DevOps Guru, to start using AI for DevOps solutions to detect, diagnose, and remedy anomalous application behavior.

DOP310-R | Better together: GitHub Actions, Amazon CodeCatalyst, or AWS CodeBuild
Learn how combining GitHub Actions with Amazon CodeCatalyst or AWS CodeBuild can maximize development efficiency. In this chalk talk, learn about the tradeoffs of using GitHub Actions runners hosted on Amazon EC2 or Amazon ECS with GitHub Actions hosted on CodeCatalyst or CodeBuild. Explore integration with other AWS services to enhance workflow automation. Join this talk to learn how GitHub Actions on AWS can take your development processes to the next level.

DOP311 | Building infrastructure as code with AWS CloudFormation
AWS CloudFormation helps you manage your AWS infrastructure as code, increasing automation and supporting infrastructure-as-code best practices. In this chalk talk, learn the fundamentals of CloudFormation, including templates, stacks, change sets, and stack dependencies. See a demo of how to describe your AWS infrastructure in a template format and provision resources in an automated, repeatable way.

DOP312 | Creating custom constructs with AWS CDK
Join this chalk talk to get answers to your questions about creating, publishing, and sharing your AWS CDK constructs publicly and privately. Learn about construct levels, how to test your constructs, how to discover and use constructs in your AWS CDK projects, and explore Construct Hub.

DOP313-R | Multi-account and multi-Region deployments at scale
Many AWS customers are implementing multi-account strategies to more easily manage their cloud infrastructure and improve their security and compliance postures. In this chalk talk, learn about various options for deploying resources into multiple accounts and AWS Regions using AWS developer tools, including AWS CodePipeline, AWS CodeDeploy, and Amazon CodeCatalyst.

DOP314 | Simplifying cloud infrastructure creation with the AWS CDK
The AWS Cloud Development Kit (AWS CDK) is an open source software development framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation. In this chalk talk, get an introduction to the AWS CDK and see a demo of how it can simplify infrastructure creation. Through code examples and diagrams, see how the AWS CDK lets you use familiar programming languages for declarative infrastructure definition. Also learn how it provides higher-level abstractions and constructs over native CloudFormation.

DOP317 | Applying Amazon’s DevOps culture to your team
In this chalk talk, learn how Amazon helps its developers rapidly release and iterate software while maintaining industry-leading standards on security, reliability, and performance. Learn about the culture of two-pizza teams and how to maintain a culture of DevOps in a large enterprise. Also, discover how you can help build such a culture at your own organization.

DOP318 | Testing for resilience with AWS Fault Injection Simulator
As cloud-based systems grow in scale and complexity, there is increased need to test distributed systems for resiliency. AWS Fault Injection Simulator (FIS) allows you to stress test your applications to understand failure modes and build more resilient services. Through code examples and diagrams, see how to set up and run fault injection experiments on AWS. By the end of this session, understand how FIS helps identify weaknesses and validate improvements to build more resilient cloud-based systems.

DOP319-R | Zero-downtime deployment strategies
AWS services support a wealth of deployment options to meet your needs, ranging from in-place updates to blue/green deployment to continuous configuration with feature flags. In this chalk talk, hear about multiple options for deploying changes to Amazon EC2, Amazon ECS, and AWS Lambda compute platforms using AWS CodeDeploy, AWS AppConfig, AWS CloudFormation, AWS Cloud Development Kit (AWS CDK), and Amazon CodeCatalyst.

DOP320 | Build a path to production with Amazon CodeCatalyst blueprints
Amazon CodeCatalyst uses blueprints to configure your software projects in the service. Blueprints instruct CodeCatalyst on how to set up a code repository with working sample code, define cloud infrastructure, and run pre-configured CI/CD workflows for your project. In this session, learn how blueprints in CodeCatalyst can give developers a compliant software service they’ll want to use on AWS.

DOP321-R | Code faster with Amazon CodeWhisperer
Traditionally, building applications requires developers to spend a lot of time manually writing code and trying to learn and keep up with new frameworks, SDKs, and libraries. In the last three years, AI models have grown exponentially in complexity and sophistication, enabling the creation of tools like Amazon CodeWhisperer that can generate code suggestions in real time based on a natural language description of the task. In this session, learn how CodeWhisperer can accelerate and enhance your software development with code generation, reference tracking, security scans, and more.

DOP324 | Accelerating application development with AWS client-side tools
Did you know AWS has more than just services? There are dozens of AWS client-side tools and libraries designed to make developing quality applications easier. In this chalk talk, explore some of the tools available in your development workspace. Learn more about command line tooling (AWS CLI), libraries (AWS SDK), IDE integrations, and application frameworks that can accelerate your AWS application development. The audience helps set the agenda so there’s sure to be something for every builder.

DevOps and Developer Productivity workshops

What are workshops?

Workshops are two-hour interactive learning sessions where you work in small group teams to solve problems using AWS services. Each workshop starts with a short lecture (10–15 minutes) by the main speaker, and the rest of the time is spent working as a group.

Level 300 — Advanced

DOP301 | Boost your application availability with AIOps on AWS
As applications become increasingly distributed and complex, developers and IT operations teams can benefit from more automated practices to maintain application availability and reduce the time and effort spent detecting, debugging, and resolving operational issues manually. In this workshop, learn how AWS AIOps solutions can help you make the shift toward more automation and proactive mechanisms so your IT team can innovate faster. The workshop includes use cases spanning multiple AWS services such as AWS Lambda, Amazon DynamoDB, Amazon API Gateway, Amazon RDS, and Amazon EKS. Learn how you can reduce MTTR and quickly identify issues within your AWS infrastructure. You must bring your laptop to participate.

DOP302 | Build software faster with Amazon CodeCatalyst
In this workshop, learn about creating continuous integration and continuous delivery (CI/CD) pipelines using Amazon CodeCatalyst. CodeCatalyst is a unified software development service on AWS that brings together everything teams need to plan, code, build, test, and deploy applications with CI/CD tools. You can utilize AWS services and integrate AWS resources into your projects by connecting your AWS accounts. With all of the stages of an application’s lifecycle in one tool, you can deliver quality software quickly and confidently. You must bring your laptop to participate.

DOP303-R | Continuous integration and delivery on AWS
In this workshop, learn to create end-to-end continuous integration and continuous delivery (CI/CD) pipelines using AWS Cloud Development Kit (AWS CDK). Review the fundamental concepts of continuous integration, continuous deployment, and continuous delivery. Then, using TypeScript/Python, define an AWS CodePipeline, AWS CodeBuild, and AWS CodeCommit workflow. You must bring your laptop to participate.

DOP304 | Develop AWS CDK resources to deploy your applications on AWS
In this workshop, learn how to build and deploy applications using infrastructure as code with AWS Cloud Development Kit (AWS CDK). Create resources using AWS CDK and learn maintenance and operations tips. In addition, get an introduction to building your own constructs. You must bring your laptop to participate.

DOP305 | Develop AWS CloudFormation templates to manage your infrastructure
In this workshop, learn how to develop and test AWS CloudFormation templates. Create CloudFormation templates to deploy and manage resources and learn about CloudFormation language features that allow you to reuse and extend templates for many scenarios. Explore testing tools that can help you validate your CloudFormation templates, including cfn-lint and CloudFormation Guard. You must bring your laptop to participate.

DOP307-R | Hands-on with Amazon CodeWhisperer
In this workshop, learn how to build applications faster and more securely with Amazon CodeWhisperer. The workshop begins with several examples highlighting how CodeWhisperer incorporates your comments and existing code to produce results. Then dive into a series of challenges designed to improve your productivity using multiple languages and frameworks. You must bring your laptop to participate.

DOP308 | Enforcing development standards with Amazon CodeCatalyst
In this workshop, learn how Amazon CodeCatalyst can accelerate the application development lifecycle within your organization. Discover how your cloud center of excellence (CCoE) can provide standardized code and workflows to help teams get started quickly and securely. In addition, learn how to update projects as organization standards evolve. You must bring your laptop to participate.

Level 400 — Expert

DOP401 | Get better at building AWS CDK constructs
In this workshop, dive deep into how to design AWS CDK constructs, which are reusable and shareable cloud components that help you meet your organization’s security, compliance, and governance requirements. Learn how to build, test, and share constructs representing a single AWS resource, as well as how to create higher-level abstractions that include built-in defaults and allow you to provision multiple AWS resources. You must bring your laptop to participate.

DevOps and Developer Productivity builders’ sessions

What are builders’ sessions?

These 60-minute group sessions are led by an AWS expert and provide an interactive learning experience for building on AWS. Builders’ sessions are designed to create a hands-on experience where questions are encouraged.

Level 300 — Advanced

DOP322-R | Accelerate data science coding with Amazon CodeWhisperer
Generative AI removes the heavy lifting that developers experience today by writing much of the undifferentiated code, allowing them to build faster. Helping developers code faster could be one of the most powerful uses of generative AI that we will see in the coming years, and this framework can also be applied to data science projects. In this builders’ session, explore how Amazon CodeWhisperer accelerates the completion of data science coding tasks with extensions for JupyterLab and Amazon SageMaker. Learn how to build data processing pipelines and machine learning models with the help of CodeWhisperer and accelerate data science experiments in Python. You must bring your laptop to participate.

Level 400 — Expert

DOP402-R | Manage dev environments at scale with Amazon CodeCatalyst
Amazon CodeCatalyst Dev Environments are cloud-based environments that you can use to quickly work on the code stored in the source repositories of your project. They are automatically created with pre-installed dependencies and language-specific packages so you can work on a new or existing project right away. In this session, learn how to create secure, reproducible, and consistent environments for VS Code, AWS Cloud9, and JetBrains IDEs. You must bring your laptop to participate.

DOP403-R | Hands-on with Amazon CodeCatalyst: Automating security in CI/CD pipelines
In this session, learn how to build a CI/CD pipeline with Amazon CodeCatalyst and add the necessary steps to secure your pipeline. Learn how to perform tasks such as secret scanning, software composition analysis (SCA), static application security testing (SAST), and generating a software bill of materials (SBOM). You must bring your laptop to participate.

DevOps and Developer Productivity lightning talks

What are lightning talks?

Lightning talks are short, 20-minute demos led from a stage.

DOP221 | Amazon CodeCatalyst in real time: Deploying to production in minutes
In this follow-up demonstration to DOP210, see how you can use an Amazon CodeCatalyst blueprint to build a production-ready application that is set up for long-term success. See in real time how to create a project using a CodeCatalyst Dev Environment and deploy it to production using a CodeCatalyst workflow.

DevOps and Developer Productivity code talks

What are code talks?

Code talks are 60-minute, highly interactive discussions featuring live coding. Attendees are encouraged to dig in and ask questions about the speaker’s approach.

DOP203 | The future of development on AWS This code talk includes a live demo and an open discussion about how builders can use the latest AWS developer tools and generative AI to build production-ready applications in minutes. Starting at an Amazon CodeCatalyst blueprint and using integrated AWS productivity and security capabilities, see a glimpse of what the future holds for developing on AWS.

DOP204 | Tips and tricks for coding with Amazon CodeWhisperer Generative AI tools that can generate code suggestions, such as Amazon CodeWhisperer, are growing rapidly in popularity. Join this code talk to learn how CodeWhisperer can accelerate and enhance your software development with code generation, reference tracking, security scans, and more. Learn best practices for prompt engineering, and get tips and tricks that can help you be more productive when building applications.

Want to stay connected?

Get the latest updates for DevOps and Developer Productivity by following us on Twitter and visiting the AWS DevOps Blog.

Build scalable and serverless RAG workflows with a vector engine for Amazon OpenSearch Serverless and Amazon Bedrock Claude models

Post Syndicated from Fraser Sequeira original https://aws.amazon.com/blogs/big-data/build-scalable-and-serverless-rag-workflows-with-a-vector-engine-for-amazon-opensearch-serverless-and-amazon-bedrock-claude-models/

In pursuit of a more efficient and customer-centric support system, organizations are deploying cutting-edge generative AI applications. These applications are designed to excel in four critical areas: multi-lingual support, sentiment analysis, personally identifiable information (PII) detection, and conversational search capabilities. Customers worldwide can now engage with the applications in their preferred language, and the applications can gauge their emotional state, mask sensitive personal information, and provide context-aware responses. This holistic approach not only enhances the customer experience but also offers efficiency gains, ensures data privacy compliance, and drives customer retention and sales growth.

Generative AI applications are poised to transform the customer support landscape, offering versatile solutions that integrate seamlessly with organizations’ operations. By combining the power of multi-lingual support, sentiment analysis, PII detection, and conversational search, these applications promise to be a game-changer. They empower organizations to deliver personalized, efficient, and secure support services while ultimately driving customer satisfaction, cost savings, data privacy compliance, and revenue growth.

Amazon Bedrock and foundation models like Anthropic Claude are poised to enable a new wave of AI adoption by powering more natural conversational experiences. However, a key challenge that has emerged is tailoring these general purpose models to generate valuable and accurate responses based on extensive, domain-specific datasets. This is where the Retrieval Augmented Generation (RAG) technique plays a crucial role.

RAG allows you to retrieve relevant data from databases or document repositories to provide helpful context to large language models (LLMs). This additional context helps the models generate more specific, high-quality responses tuned to your domain.

In this post, we demonstrate building a serverless RAG workflow by combining the vector engine for Amazon OpenSearch Serverless with an LLM like Anthropic Claude hosted by Amazon Bedrock. This combination provides a scalable way to enable advanced natural language capabilities in your applications, including the following:

  • Multi-lingual support – The solution uses the ability of LLMs like Anthropic Claude to understand and respond to queries in multiple languages without any additional training needed. This provides true multi-lingual capabilities out of the box, unlike traditional machine learning (ML) systems that need training data in each language.
  • Sentiment analysis – This solution enables you to detect positive, negative, or neutral sentiment in text inputs like customer reviews, social media posts, or surveys. LLMs can provide explanations for the inferred sentiment, describing which parts of the text contributed to a positive or negative classification. This explainability helps build trust in the model’s predictions. Potential use cases could include analyzing product reviews to identify pain points or opportunities, monitoring social media for brand sentiment, or gathering feedback from customer surveys.
  • PII detection and redaction – The Claude LLM can be accurately prompted to identify various types of PII like names, addresses, Social Security numbers, and credit card numbers and replace it with placeholders or generic values while maintaining readability of the surrounding text. This enables compliance with regulations like GDPR and prevents sensitive customer data from being exposed. This also helps automate the labor-intensive process of PII redaction and reduces risk of exposed customer data across various use cases, such as the following:
    • Processing customer support tickets and automatically redacting any PII before routing to agents.
    • Scanning internal company documents and emails to flag any accidental exposure of customer PII.
    • Anonymizing datasets containing PII before using the data for analytics or ML, or sharing the data with third parties.

Through careful prompt engineering, you can accomplish the aforementioned use cases with a single LLM. The key is crafting prompt templates that clearly articulate the desired task to the model. Prompting allows us to tap into the vast knowledge already present within the LLM for advanced natural language processing (NLP) tasks, while tailoring its capabilities to our particular needs. Well-designed prompts unlock the power and potential of the model.
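As a rough illustration of what such prompt templates could look like, the following sketch keeps one template per task and wraps the result in the Human/Assistant turn format that Claude text-completion models expect. The template wording, the `PROMPT_TEMPLATES` dictionary, and the `render_prompt` helper are hypothetical and are not taken from the solution's code.

```python
# Hypothetical prompt templates; the wording is illustrative only.
PROMPT_TEMPLATES = {
    "translate": "Translate the following text to {language}:\n\n{text}",
    "sentiment": (
        "Classify the sentiment of the following text as positive, negative, "
        "or neutral, and briefly explain which phrases led to that label:\n\n{text}"
    ),
    "redact_pii": (
        "Replace any names, addresses, Social Security numbers, and credit card "
        "numbers in the following text with placeholders such as [NAME], and "
        "leave the rest of the text unchanged:\n\n{text}"
    ),
}


def render_prompt(task: str, **fields: str) -> str:
    """Fill the template for `task` and wrap it in Human/Assistant turns."""
    instruction = PROMPT_TEMPLATES[task].format(**fields)
    return f"\n\nHuman: {instruction}\n\nAssistant:"


print(render_prompt("sentiment", text="The checkout flow was quick and painless."))
```

Keeping the templates in one place means a single model can serve all three tasks, with only the instruction text changing between requests.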

With the vector database capabilities of Amazon OpenSearch Serverless, you can store vector embeddings of documents, allowing ultra-fast, semantic (rather than keyword) similarity searches to find the most relevant passages to augment prompts.

Read on to learn how to build your own RAG solution using an OpenSearch Serverless vector database and Amazon Bedrock.

Solution overview

The following architecture diagram provides a scalable and fully managed RAG-based workflow for a wide range of generative AI applications, such as language translation, sentiment analysis, PII data detection and redaction, and conversational AI. This pre-built solution operates in two distinct stages. The initial stage involves generating vector embeddings from unstructured documents and saving these embeddings within an OpenSearch Serverless vectorized database index. In the second stage, user queries are forwarded to the Amazon Bedrock Claude model along with the vectorized context to deliver more precise and relevant responses.

In the following sections, we discuss the two core functions of the architecture in more detail:

  • Index domain data
  • Query an LLM with enhanced context

Index domain data

In this section, we discuss the details of the data indexing phase.

Generate embeddings with Amazon Titan

We used the Amazon Titan Embeddings model to generate vector embeddings. With 1,536 dimensions, the embeddings capture semantic nuances in meaning and relationships. The model is available via the Amazon Bedrock serverless experience; you can access it using a single API and without managing any infrastructure. The following code illustrates generating embeddings using a Boto3 client.

import json

import boto3

bedrock_client = boto3.client('bedrock-runtime')

# Generate embeddings with the Amazon Titan Embeddings model
response = bedrock_client.invoke_model(
    body=json.dumps({"inputText": 'Hello World'}),
    modelId='amazon.titan-embed-text-v1',
    accept='application/json',
    contentType='application/json'
)
# The response body is a stream; read and parse it to get the embedding vector
result = json.loads(response['body'].read())
embeddings = result.get('embedding')
print(f'Embeddings -> {embeddings}')

Store embeddings in an OpenSearch Serverless vector collection

OpenSearch Serverless offers a vector engine to store embeddings. As your indexing and querying needs fluctuate based on workload, OpenSearch Serverless automatically scales up and down based on demand. You no longer have to predict capacity or manage infrastructure sizing.
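To give a sense of what a vector index definition might look like, here is a minimal sketch of an index body with a `knn_vector` field sized for the 1,536-dimension Titan embeddings. The field names mirror the ones used in the search query in this post (`embedding`, `text`, `doc_type`); the HNSW method, engine, and space type are assumptions chosen for illustration, not settings confirmed by the solution.

```python
import json

# Output size of the Titan embeddings model, as stated in this post.
EMBEDDING_DIMENSIONS = 1536

# Assumed index layout; the method settings below are illustrative choices.
vector_index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": EMBEDDING_DIMENSIONS,
                "method": {
                    "name": "hnsw",
                    "engine": "nmslib",
                    "space_type": "cosinesimil",
                },
            },
            "text": {"type": "text"},
            "doc_type": {"type": "keyword"},
        }
    },
}

print(json.dumps(vector_index_body, indent=2))
```

A client library such as opensearch-py could send a body like this when creating the index; the dimension must match the length of the vectors you store.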

With OpenSearch Serverless, you don’t provision clusters. Instead, you define capacity in the form of OpenSearch Compute Units (OCUs). OpenSearch Serverless scales up to the maximum number of OCUs defined. You’re charged for a minimum of 4 OCUs, which can be shared across multiple collections sharing the same AWS Key Management Service (AWS KMS) key.

The following screenshot illustrates how to configure capacity limits on the OpenSearch Serverless console.

Query an LLM with domain data

In this section, we discuss the details of the querying phase.

Generate query embeddings

When a user queries for data, we first generate an embedding of the query with Amazon Titan embeddings. OpenSearch Serverless vector collections employ an Approximate Nearest Neighbors (A-NN) algorithm to find document embeddings closest to the query embeddings. The A-NN algorithm uses cosine similarity to measure the closeness between the embedded user query and the indexed data. OpenSearch Serverless then returns the documents whose embeddings have the smallest distance, and therefore the highest similarity, to the user’s query embedding. The following code illustrates our vector search query:

vector_query = {
    "size": 5,
    "query": {"knn": {"embedding": {"vector": embedded_search, "k": 2}}},
    "_source": False,
    "fields": ["text", "doc_type"]
}
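To make the similarity ranking concrete, the following sketch computes cosine similarity by hand and returns the closest documents. Note that this is an exact, brute-force search over toy data, not the approximate algorithm that OpenSearch Serverless runs; the helper names and the three-dimensional embeddings are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_documents(query_vector, documents, k=2):
    """Exact top-k search; an A-NN index approximates this ranking at scale."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, doc["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Toy 3-dimensional embeddings, invented for this example.
docs = [
    {"text": "refund policy", "embedding": [0.9, 0.1, 0.0]},
    {"text": "shipping times", "embedding": [0.1, 0.9, 0.0]},
    {"text": "return an item", "embedding": [0.8, 0.2, 0.1]},
]
for doc in nearest_documents([1.0, 0.0, 0.0], docs, k=2):
    print(doc["text"])
```

Comparing every document against the query costs time proportional to the size of the collection, which is why vector engines rely on approximate indexes for large datasets.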

Query Anthropic Claude models on Amazon Bedrock

OpenSearch Serverless finds relevant documents for a given query by matching embedded vectors. We enhance the prompt with this context and then query the LLM. In this example, we use the AWS SDK for Python (Boto3) to invoke models on Amazon Bedrock. The AWS SDK provides APIs such as invoke_model and invoke_model_with_response_stream to interact with foundation models on Amazon Bedrock.

The following code invokes our LLM:

import json

import boto3

bedrock_client = boto3.client('bedrock-runtime')

# model_id could be 'anthropic.claude-v2', 'anthropic.claude-v1', or 'anthropic.claude-instant-v1'
# prompt is the request body (a dict) containing the context-enhanced prompt
response = bedrock_client.invoke_model_with_response_stream(
    body=json.dumps(prompt),
    modelId=model_id,
    accept='application/json',
    contentType='application/json'
)
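The two steps around this call, building a context-enhanced request body and reading the streamed reply, could be sketched as follows. This assumes the Claude text-completion request format (a `prompt` with Human/Assistant turns plus `max_tokens_to_sample`) and assumes each streamed event carries a JSON payload with a `completion` field under `chunk` and `bytes`. The helper names, the prompt wording, and the simulated events are hypothetical, so the sketch runs without AWS credentials.

```python
import json


def build_claude_body(question, context_passages, max_tokens=300):
    """Build a text-completion request body that includes the retrieved context."""
    context = "\n\n".join(context_passages)
    prompt = (
        "\n\nHuman: Use the following context to answer the question.\n\n"
        f"<context>\n{context}\n</context>\n\n"
        f"Question: {question}\n\nAssistant:"
    )
    return {"prompt": prompt, "max_tokens_to_sample": max_tokens}


def collect_completion(events):
    """Concatenate the text chunks of a streamed response.

    Assumes each event looks like {"chunk": {"bytes": b'{"completion": "..."}'}}.
    """
    parts = []
    for event in events:
        chunk = event.get("chunk")
        if chunk:
            parts.append(json.loads(chunk["bytes"]).get("completion", ""))
    return "".join(parts)


# Simulated stream standing in for response['body'] from the call above.
fake_events = [
    {"chunk": {"bytes": json.dumps({"completion": "Returns are accepted "}).encode()}},
    {"chunk": {"bytes": json.dumps({"completion": "within 30 days."}).encode()}},
]
body = build_claude_body("What is the return window?", ["Returns: 30 days."])
print(body["prompt"])
print(collect_completion(fake_events))
```

In a real deployment you would forward each chunk to the client as it arrives rather than joining them at the end; joining is used here only to keep the example short.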

Prerequisites

Before you deploy the solution, review the prerequisites.

Deploy the solution

The code sample along with the deployment steps are available in the GitHub repository. The following screenshot illustrates deploying the solution using AWS CloudShell.

Test the solution

The solution provides some sample data for indexing, as shown in the following screenshot. You can also index custom text. Initial indexing of documents may take some time because OpenSearch Serverless has to create a new vector index and then index documents. Subsequent requests are faster. To delete the vector index and start over, choose Reset.

The following screenshot illustrates how you can query your domain data in multiple languages after it’s indexed. You could also try out sentiment analysis or PII data detection and redaction on custom text. The response is streamed over Amazon API Gateway WebSockets.

Clean up

To clean up your resources, delete the following AWS CloudFormation stacks via the AWS CloudFormation console:

  • LlmsWithServerlessRagStack
  • ApiGwLlmsLambda

Conclusion

In this post, we provided an end-to-end serverless solution for RAG-based generative AI applications. This not only offers you a cost-effective option, particularly in the face of GPU cost and hardware availability challenges, but also simplifies the development process and reduces operational costs.

Stay up to date with the latest advancements in generative AI and start building on AWS. If you’re seeking assistance on how to begin, check out the Generative AI Innovation Center.


About the authors

Fraser Sequeira is a Startups Solutions Architect with AWS based in Mumbai, India. In his role at AWS, Fraser works closely with startups to design and build cloud-native solutions on AWS, with a focus on analytics and streaming workloads. With over 10 years of experience in cloud computing, Fraser has deep expertise in big data, real-time analytics, and building event-driven architecture on AWS. He enjoys staying on top of the latest technology innovations from AWS and sharing his learnings with customers. He spends his free time tinkering with new open source technologies.

Kenneth Walsh is a New York-based Sr. Solutions Architect whose focus is AWS Marketplace. Kenneth is passionate about cloud computing and loves being a trusted advisor for his customers. When he’s not working with customers on their journey to the cloud, he enjoys cooking, audiobooks, movies, and spending time with his family and dog.

Max Winter is a Principal Solutions Architect for AWS Financial Services clients. He works with ISV customers to design solutions that allow them to leverage the power of AWS services to automate and optimize their business. In his free time, he loves hiking and biking with his family, music and theater, digital photography, 3D modeling, and imparting a love of science and reading to his two nearly-teenagers.

Manjula Nagineni is a Senior Solutions Architect with AWS based in New York. She works with major financial service institutions, architecting and modernizing their large-scale applications while adopting AWS Cloud services. She is passionate about designing big data workloads cloud-natively. She has over 20 years of IT experience in software development, analytics, and architecture across multiple domains such as finance, retail, and telecom.

What Is Hybrid Cloud?

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/confused-about-the-hybrid-cloud-youre-not-alone/

Editor’s note: This post has been updated since it was originally published in 2017.

The term hybrid cloud has been around for a while—we originally published this explainer in 2017. But time hasn’t necessarily made things clearer. Maybe you hear folks talk about your company’s hybrid cloud approach, but what does that really mean? If you’re confused about the hybrid cloud, you’re not alone. 

Hybrid cloud is a computing approach that uses both private and public cloud resources with some kind of orchestration between them. The term has been applied to a wide variety of IT solutions, so it’s no wonder the concept breeds confusion. 

In this post, we’ll explain what a hybrid cloud is, how it can benefit your business, and how to choose a cloud storage provider for your hybrid cloud strategy.

What Is the Hybrid Cloud?

A hybrid cloud is an infrastructure approach that uses both private and public resources. Let’s first break down those key terms:

  • Public cloud: When you use a public cloud, you are storing your data in another company’s internet-accessible data center. A public cloud service allows anybody to sign up for an account, and share data center resources with other customers or tenants. Instead of worrying about the costs and complexity of operating an on-premises data center, a cloud storage user only needs to pay for the cloud storage they need.
  • Private cloud: In contrast, a private cloud is specifically designed for a single tenant. Think of a private cloud as a permanently reserved private dining room at a restaurant: no other customer can use that space. As a result, private cloud services can be more expensive than public clouds. Traditionally, private clouds lived on on-premises infrastructure, meaning they were built and maintained on company property. Now, private clouds can be maintained and managed on-premises by an organization or by a third party in a data center. The key defining factor is that the cloud is dedicated to a single tenant or organization.

Those terms are important to know to understand the hybrid cloud architecture approach. Hybrid clouds are defined by a combined management approach, which means there is some type of orchestration between the private and public environments that allows workloads and data to move between them in a flexible way as demands, needs, and costs change. This gives you flexibility when it comes to data deployment and usage.  

In other words, if you have some IT resources on-premises that you are replicating or sharing with an external vendor—congratulations, you have a hybrid cloud!

Hybrid cloud refers to a computing architecture that is made up of both private cloud resources and public cloud resources with some kind of orchestration between them.

Hybrid Cloud Examples

Here are a few examples of how a hybrid cloud can be used:

  1. As an active archive: You might establish a protocol that says all accounting files that have not been changed in the last year, for example, are automatically moved off-premises to cloud storage archive to save cost and reduce the amount of storage needed on-site. You can still access the files; they are just no longer stored on your local systems. 
  2. To meet compliance requirements: Let’s say some of your data is subject to strict data privacy requirements, but other data you manage isn’t as closely protected. You could keep highly regulated data on premises in a private cloud and the rest of your data in a public cloud. 
  3. To scale capacity: If you’re in an industry that experiences seasonal or frequent spikes like retail or ecommerce, these spikes can be handled by a public cloud which provides the elasticity to deal with times when your data needs exceed your on-premises capacity.
  4. For digital transformation: A hybrid cloud lets you adopt cloud resources in a phased approach as you expand your cloud presence.
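As a rough sketch of the active archive rule in the first example, the following code selects files that have not been modified within the last year. The `files_to_archive` helper and the file names are hypothetical, and actually moving the selected files to cloud storage is left out.

```python
import os
import tempfile
import time
from pathlib import Path

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def files_to_archive(root, max_age_seconds=ONE_YEAR_SECONDS, now=None):
    """Return files under `root` whose last modification is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


# Demonstrate on a temporary folder with one stale and one recent file.
with tempfile.TemporaryDirectory() as folder:
    old_file = Path(folder, "ledger-2021.csv")
    new_file = Path(folder, "ledger-current.csv")
    old_file.write_text("archived entries")
    new_file.write_text("recent entries")
    two_years_ago = time.time() - 2 * ONE_YEAR_SECONDS
    os.utime(old_file, (two_years_ago, two_years_ago))  # backdate the modification time
    print([path.name for path in files_to_archive(folder)])
```

A real policy would typically upload each selected file to the cloud provider and verify the copy before removing it from local storage.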

Hybrid Cloud vs. Multi-cloud: What’s the Diff?

You wouldn’t be the first person to think that the terms multi-cloud and hybrid cloud appear similar. Both of these approaches involve using multiple clouds. However, multi-cloud combines two or more clouds of the same type (for example, two or more public clouds), whereas hybrid cloud approaches combine a private cloud with a public cloud. One cloud approach is not necessarily better than the other; they simply serve different use cases.

For example, let’s say you’ve already invested in significant on-premises IT infrastructure, but you want to take advantage of the scalability of the cloud. A hybrid cloud solution may be a good fit for you. 

Alternatively, a multi-cloud approach may work best for you if you are already in the cloud and want to mitigate the risk of a single cloud provider having outages or issues. 

Hybrid Cloud Benefits

A hybrid cloud approach allows you to take advantage of the best elements of both private and public clouds. The primary benefits are flexibility, scalability, and cost savings.

Benefit 1: Flexibility and Scalability

One of the top benefits of the hybrid cloud is its flexibility. Managing IT infrastructure on-premises can be time consuming and expensive, and adding capacity requires advance planning, procurement, and upfront investment.

The public cloud is readily accessible and able to provide IT resources whenever needed on short notice. For example, the term “cloud bursting” refers to the on-demand and temporary use of the public cloud when demand exceeds resources available in the private cloud. A private cloud, on the other hand, provides the absolute fastest access speeds since it is generally located on-premises. (But cloud providers are catching up fast, for what it’s worth.) For data that is needed with the absolute lowest levels of latency, it may make sense for the organization to use a private cloud for current projects and store an active archive in a less expensive, public cloud.

Benefit 2: Cost Savings

Within the hybrid cloud framework, the public cloud segment offers cost-effective IT resources, eliminating the need for upfront capital expenses and associated labor costs. IT professionals gain the flexibility to optimize configurations, choose the most suitable service provider, and determine the optimal location for each workload. This strategic approach reduces costs by aligning resources with specific tasks. Furthermore, the ability to easily scale, redeploy, or downsize services enhances efficiency, curbing unnecessary expenses and contributing to overall cost savings.

Comparing Private vs. Hybrid Cloud Storage Costs

To understand the difference in storage costs between a purely on-premises solution and a hybrid cloud solution, we’ll present two scenarios. For each scenario, we’ll use data storage amounts of 100TB, 1PB, and 2PB. Each table uses the same format; all we’ve done is change how the data is distributed: private (on-premises) or public (off-premises). We are using the costs for our own Backblaze B2 Cloud Storage in this example. The math can be adapted for any set of numbers you wish to use.

Scenario 1: 100% of data in on-premises storage

                                        Data Stored
Data stored on-premises (100%)          100TB      1,000TB    2,000TB

On-premises cost range                  Monthly Cost
Low ($12/TB/month)                      $1,200     $12,000    $24,000
High ($20/TB/month)                     $2,000     $20,000    $40,000

Scenario 2: 20% of data in on-premises storage, 80% in public cloud storage (Backblaze B2)

                                        Data Stored
Data stored on-premises (20%)           20TB       200TB      400TB
Data stored in the cloud (80%)          80TB       800TB      1,600TB

On-premises cost range                  Monthly Cost
Low ($12/TB/month)                      $240       $2,400     $4,800
High ($20/TB/month)                     $400       $4,000     $8,000

Public cloud cost range                 Monthly Cost
Low ($6/TB/month, Backblaze B2)         $480       $4,800     $9,600
High ($20/TB/month)                     $1,600     $16,000    $32,000

On-premises + public cloud cost range   Monthly Cost
Low                                     $720       $7,200     $14,400
High                                    $2,000     $20,000    $40,000

As you can see, using a hybrid cloud solution and storing 80% of the data in the cloud with a provider like Backblaze B2 can result in significant savings over storing only on-premises.
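If you want to adapt the math to your own numbers, the arithmetic behind both scenarios can be sketched as follows. The `monthly_cost` helper is hypothetical; the rates are the low-end figures from the scenarios ($12/TB/month on-premises, $6/TB/month in the cloud).

```python
def monthly_cost(total_tb, on_prem_percent, on_prem_rate, cloud_rate):
    """Monthly storage cost in dollars for a given on-premises/cloud split."""
    on_prem_tb = total_tb * on_prem_percent / 100
    cloud_tb = total_tb - on_prem_tb
    return on_prem_tb * on_prem_rate + cloud_tb * cloud_rate


for total_tb in (100, 1000, 2000):
    all_on_prem = monthly_cost(total_tb, 100, on_prem_rate=12, cloud_rate=6)
    hybrid = monthly_cost(total_tb, 20, on_prem_rate=12, cloud_rate=6)
    print(f"{total_tb:>5}TB: on-premises ${all_on_prem:,.0f}, hybrid ${hybrid:,.0f}")
```

Swapping in the high-end rates, or a different on-premises share, shows how sensitive the savings are to each assumption.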

Choosing a Cloud Storage Provider for Your Hybrid Cloud

Okay, so you understand the benefits of using a hybrid cloud approach, what next? Determining the right mix of cloud services may be intimidating because there are so many public cloud options available. Fortunately, there are a few decision factors you can use to simplify setting up your hybrid cloud solution. Here’s what to think about when choosing a public cloud storage provider:

  • Ease of use: Avoiding a steep learning curve can save you hours of work effort in managing your cloud deployments. By contrast, overly complicated pricing tiers or bells and whistles you don’t need can slow you down.
  • Data security controls: Compare how each cloud provider facilitates proper data controls. For example, take a look at features like authentication, Object Lock, and encryption.
  • Data egress fees: Some cloud providers charge additional fees for data egress (i.e., removing data from the cloud). These fees can make it more expensive to switch between providers. In addition to fees, check the data speeds offered by the provider.
  • Interoperability: Flexibility and interoperability are key reasons to use cloud services. Before signing up for a service, understand the provider’s integration ecosystem. A lack of needed integrations may place a greater burden on your team to keep the service running effectively.
  • Storage tiers: Some providers offer different storage tiers where you sacrifice access for lower costs. While the promise of inexpensive cold storage can be attractive, evaluate whether you can afford to wait hours or days to retrieve your data.
  • Pricing transparency: Pay careful attention to the cloud provider’s pricing model and tier options. Consider building a spreadsheet to compare a shortlist of cloud providers’ pricing models.

When Hybrid Cloud Might Not Always Be the Right Fit

The hybrid cloud may not always be the optimal solution, particularly for smaller organizations with limited IT budgets that might find a purely public cloud approach more cost-effective. The substantial setup and operational costs of private servers could be prohibitive.

A thorough understanding of workloads is crucial to effectively tailor the hybrid cloud, ensuring the right blend of private, public, and traditional IT resources for each application and maximizing the benefits of the hybrid cloud architecture.

So, Should You Go Hybrid?

Big picture, anything that helps you respond to IT demands quickly, easily, and affordably is a win. With a hybrid cloud, you can avoid some big up-front capital expenses for in-house IT infrastructure, making your CFO happy. Being able to quickly spin up IT resources as they’re needed will appeal to the CTO and VP of operations.

So, given all that, we’ve arrived at the bottom line and the question is, should you or your organization embrace hybrid cloud infrastructure? According to Flexera’s 2023 State of the Cloud report, 72% of enterprises utilize a hybrid cloud strategy. That indicates that the benefits of the hybrid cloud appeal to a broad range of companies.

If an organization approaches implementing a hybrid cloud solution with thoughtful planning and a structured approach, a hybrid cloud can deliver on-demand flexibility, empower legacy systems and applications with new capabilities, and become a catalyst for digital transformation. The result can be an elastic and responsive infrastructure that has the ability to quickly adapt to changing demands of the business.

As data management professionals increasingly recognize the advantages of the hybrid cloud, we can expect more and more of them to embrace it as an essential part of their IT strategy.

Tell Us What You’re Doing With the Hybrid Cloud

Are you currently embracing the hybrid cloud, or are you still uncertain or hanging back because you’re satisfied with how things are currently? We’d love to hear your comments below on how you’re approaching your cloud architecture decisions.

FAQs About Hybrid Cloud

What exactly is a hybrid cloud?

Hybrid cloud is a computing approach that uses both private and public cloud resources with some kind of orchestration between them.

What is the difference between hybrid and multi-cloud?

Multi-cloud combines two or more clouds of the same type (for example, two or more public clouds), whereas hybrid cloud approaches combine a private cloud with a public cloud. One cloud approach is not necessarily better than the other; they simply serve different use cases.

What is a hybrid cloud architecture?

Hybrid cloud architecture is any kind of IT architecture that combines both the public and private clouds. Many organizations use this term to describe specific software products that provide solutions which combine the two types of clouds.

What are hybrid clouds used for?

Organizations will often use hybrid clouds to create redundancy and scalability for their computing workload. A hybrid cloud is a great way for a company to have extra fallback options to continue offering services even when they have higher than usual levels of traffic, and it can also help companies scale up their services over time as they need to offer more options.

The post What Is Hybrid Cloud? appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

RFC 9498: The GNU Name System

Post Syndicated from corbet original https://lwn.net/Articles/952122/

The GNU Name System has now been formalized as RFC 9498.

GNS addresses long-standing security and privacy issues in the
ubiquitous Domain Name System (DNS). Previous attempts to secure
DNS (DNSSEC) fail to address critical security issues such as
end-to-end security, query privacy, censorship, and centralization
of root zone governance. After 40 years of patching, it is time for
a new beginning.

MikroTik CRS310-8G+2S+IN Review 8-port 2.5GbE and 2-port 10GbE Switch

Post Syndicated from Rohit Kumar original https://www.servethehome.com/mikrotik-crs310-8g-2s-in-review-8-port-2-5gbe-and-2-port-10gbe-switch/

We have longed for the MikroTik CRS310-8G+2S+IN switch for years. Now we review this Marvell-powered 2.5GbE switch with MikroTik management.

The post MikroTik CRS310-8G+2S+IN Review 8-port 2.5GbE and 2-port 10GbE Switch appeared first on ServeTheHome.

AWS Security Profile: Chris Betz, CISO of AWS

Post Syndicated from Chris Betz original https://aws.amazon.com/blogs/security/aws-security-profile-chris-betz-ciso-of-aws/

In the AWS Security Profile series, we feature the people who work in Amazon Web Services (AWS) Security and help keep our customers safe and secure. This interview is with Chris Betz, Chief Information Security Officer (CISO), who began his role as CISO of AWS in August of 2023.


How did you get started in security? What prompted you to pursue this field?

I’ve always had a passion for technology, and for keeping people out of harm’s way. When I found computer science and security in the Air Force, this world opened up to me that let me help others, be a part of building amazing solutions, and engage my competitive spirit. Security has the challenges of the ultimate chess game, though with real and impactful consequences. I want to build reliable, massively scaled systems that protect people from malicious actors. This is really hard to do and a huge challenge I undertake every day. It’s an amazing team effort that brings together the smartest folks that I know, competing with threat actors.

What are you most excited about in your new role?

One of the most exciting things about my role is that I get to work with some of the smartest people in the field of security, people who inspire, challenge, and teach me something new every day. It’s exhilarating to work together to make a significant difference in the lives of people all around the world, who trust us at AWS to keep their information secure. Security is constantly changing, so we get to learn, adapt, and get better every single day. I get to spend my time helping to build a team and culture that customers can depend on, and I’m constantly impressed and amazed at the caliber of the folks I get to work with here.

How does being a former customer influence your role as AWS CISO?

I was previously the CISO at Capital One and was an AWS customer. As a former customer, I know exactly what it’s like to be a customer who relies on a partner for significant parts of their security. There needs to be a lot of trust, a lot of partnership across the shared responsibility model, and consistent focus on what’s being done to keep sensitive data secure. Every moment that I’m here at AWS, I’m reminded about things from the customer perspective and how I can minimize complexity, and help customers leverage the “super powers” that the cloud provides for CISOs who need to defend the breadth of their digital estate. I know how important it is to earn and keep customer trust, just like the trust I needed when I was in their shoes. This mindset influences me to learn as much as I can, never be satisfied with “good enough,” and grab every opportunity I can to meet and talk with customers about their security.

What’s been the most dramatic change you’ve seen in the security industry recently?

This is pretty easy to answer: artificial intelligence (AI). This is a really exciting time. AI is dominating the news and is on the mind of every security professional, everywhere. We’re witnessing something very big happening, much like when the internet came into existence and we saw how the world dramatically changed because of it. Every single sector was impacted, and AI has the same potential. Many customers use AWS machine learning (ML) and AI services to help improve signal-to-noise ratio, take over common tasks to free up valuable time to dig deeper into complex cases, and analyze massive amounts of threat intelligence to determine the right action in less time. The combination of Data + Compute power + AI is a huge advantage for cloud companies.

AI and ML have been a focus for Amazon for more than 25 years, and we get to build on an amazing foundation. And it’s exciting to take advantage of and adapt to the recent big changes and the impact this is having on the world. At AWS, we’re focused on choice and broadening access to generative AI and foundation models at every layer of the ML stack, including infrastructure (chips), developer tools, and AI services. What a great time to be in security!

What’s the most challenging part of being a CISO?

Maintaining a culture of security involves each person, each team, and each leader. That’s easy to say, but the challenge is making it tangible—making sure that each person sees that, even though their title doesn’t have “security” in it, they are still an integral part of security. We often say, “If you have access, you have responsibility.” We work hard to limit that access. And CISOs must constantly work to build and maintain a culture of security and help every single person who has access to data understand that security is an important part of their job.

What’s your short- and long-term vision for AWS Security?

Customers trust AWS to protect their data so they can innovate and grow quickly, so in that sense, our vision is for security to be a growth lever for our customers, not added friction. Cybersecurity is key to unlocking innovation, so managing risk and aligning the security posture of AWS with our business objectives will continue for the immediate future and long term. For our customers, my vision is to continue helping them understand that investing in security helps them move faster and take the right risks—the kind of risks they need to remain competitive and innovative. When customers view security as a business accelerator, they achieve new technical capabilities and operational excellence. Strong security is the ultimate business enabler.

If you could give one piece of advice to all CISOs, what would it be?

Nail Zero Trust. Zero Trust is the path to the strongest, most effective security, and getting back to the core concepts is important. While Zero Trust is a different journey for every organization, it’s a natural evolution of cybersecurity and defense in depth in particular. No matter what’s driving organizations toward Zero Trust—policy considerations or the growing patchwork of data protection and privacy regulations—Zero Trust meaningfully improves security outcomes through an iterative process. When companies get this right, they can quickly identify and investigate threats and take action to contain or disrupt unwanted activity.

What are you most proud of in your career?

I’m proud to have worked, and to still be working, with such talented, capable, and intelligent security professionals who care deeply about security and are passionate about making the world a safer place. Being among the world’s top security experts makes me grateful and humbled by all the amazing opportunities I’ve had to work alongside them, solving problems together and helping to create a legacy that makes security better.

 
Chris Betz

Chris is CISO at AWS. He oversees security teams and leads the development and implementation of security policies, with the aim of managing risk and aligning the company’s security posture with business objectives. Chris joined Amazon in August 2023, after holding CISO and security leadership roles at leading companies. He lives in Northern Virginia with his family.

Lisa Maher

Lisa Maher joined AWS in February 2022 and leads AWS Security Thought Leadership PR. Before joining AWS, she led crisis communications for clients experiencing data security incidents at two global PR firms. Lisa, a former journalist, is a graduate of Texas A&M School of Law, where she specialized in Cybersecurity Law & Policy, Risk Management & Compliance.

When Maximum Effort Doesn’t Equate to Maximum Results

Post Syndicated from Rapid7 original https://blog.rapid7.com/2023/11/21/when-maximum-effort-doesnt-equate-to-maximum-results/

It’s no secret that security teams are feeling beleaguered as a result of the barrage of data, events, and alerts generated by their security tools, to say nothing of the increased budget scrutiny and constrained staff resources that continue to plague cybersecurity practitioners.

The trick is finding the right balance between how much internal teams have to accomplish themselves versus how much they can cede to managed security service providers (MSSPs).

Historically, success in security operations (SecOps) was measured by how quickly teams could react to incoming threats; but the sheer number of alerts that require humans-in-the-loop to determine the accuracy and severity of security events makes it nearly impossible for teams to keep up. Additionally, the number of tools deployed in a given organization today – to say nothing of the complexity required to make those tools work in concert – means reacting alone won’t get the job done anyway.

Unfortunately, many MSSPs don’t do enough to relieve customers of noisy alerts without expensive consulting agreements, which puts the burden to evaluate and remediate incidents back on already strapped in-house teams.

Traditional approaches have the added disadvantages of being too siloed, too slow, too antiquated for cloud environments, and too convoluted to demonstrate their value. Analysts at a leading research firm predict that within the next 12-18 months, 33% of organizations that currently have internal security functions will attempt and fail to build an effective internal SecOps because of resource constraints such as lack of budget, expertise, and staffing. Analysts further expect that within the next 12-18 months, 90% of internal SecOps will outsource at least 50% of their operational workloads – which makes choosing an MSSP you trust of paramount importance.

MSSPs enable organizations to maximize resilience while minimizing complexity and optimizing staff resources. The best solutions in the market will drive greater efficiency and consolidation by unifying vulnerability management and managed detection and response (MDR) into a single, cohesive security service built by practitioners for practitioners. They will offer 24x7x365 services that “follow the sun” (meaning no one service center is responsible for 100% of support calls; the work is distributed in certified centers of excellence around the world) so that top-notch support is readily available where and when you need it. Complete coverage and end-to-end detection and response services mean you can feel confident that your teams are always ready for what comes next.

But it’s important to choose an MSSP that eschews a one-size-fits-all approach. Rather, look for a partner that is dynamic and flexible enough to meet the particular risk profile and business priorities of your organization, one adaptable enough to conform to changes in evolving threats and attack vectors.

Partnering with the right MSSP also allows you to optimize your SecOps for today’s distributed environments, built for the speed and scale of the cloud. Operating in the cloud means you can integrate hundreds of services with the thousands of devices connecting to them seamlessly and in real time; it also means you must protect and secure a sprawling surface with a multitude of potential entry points that threat actors can exploit.

To meet the challenge, choose an MSSP that offers complete coverage from a single, end-to-end solution so that you’re not left responding to an overabundance of events, alerts, and false positives or trying to protect an attack surface too big to contain.

Look for providers that deliver unlimited data, unlimited incident response, and unlimited intelligence so that when a forensic analysis is performed, their detailed remediation and mitigation recommendations make sure you can improve your resilience against future threats. And in the unfortunate event that a breach becomes a full-scope incident-response engagement, you want a partner that will work with you round-the-clock on the forensic investigation and deliver answers that will remove attackers from your environment as quickly as possible – without charging additional consulting fees.

Partnering with a proven MSSP will also boost your visibility across all services and devices to anticipate the most imminent risks, prevent attacks earlier, and respond to events faster. Additionally, an engagement that includes threat exposure manageability at scale through unified endpoint-to-cloud coverage can identify and respond to threats anywhere while breaking down functional and geographic silos that stall efficiency and reduce collaboration.

Critical functions like threat hunting and patch management can be automated across many tools and processes to reduce reliance on manual work. Machine learning and artificial intelligence models can be paired with internal threat telemetry data and chatbots to triage events, increase staff productivity, or produce threat reports that support more targeted and prioritized threat management across the enterprise.

Best of all, the successful use of AI and automation can help reduce the number of tools operating in your environment, which in turn decreases the complexity and cost of security operations.

It’s time to gain the edge over attackers and keep up with the fluid, ever-expanding threat landscape by eliminating threats wherever they emerge and proactively preventing breaches earlier in the kill chain. Partnering with a trusted MSSP will enable you to manage your threat exposure precisely and comprehensively, improve your signal-to-noise ratio, demonstrate tangible ROI from your security investments, and continually advance your security posture.

Learn more about the best criteria to use when reviewing the capabilities of potential MSSP partners.

[$] Trust in and maintenance of filesystems

Post Syndicated from corbet original https://lwn.net/Articles/951846/

The Linux kernel supports a wide variety of filesystems, many of which are
no longer in heavy use — or, perhaps, any use at all. The kernel code
implementing the less-popular filesystems tends to be relatively unpopular
as well, receiving little in the way of maintenance. Keeping old
filesystems alive does place a burden on kernel developers, though, so it
is not surprising that there is pressure to remove the least popular ones.
At the 2023 Kernel Maintainers Summit, the developers talked about these
filesystems and what can be done about them.

Ekstrand: NVK reaches Vulkan 1.0 conformance

Post Syndicated from corbet original https://lwn.net/Articles/952089/

Faith Ekstrand has announced
that the NVK Vulkan driver for NVIDIA “Turing” GPUs has been certified as
being fully compliant with the Vulkan 1.0 API.

Practically, it means that we can pass the entire Vulkan
conformance test suite. From the Khronos perspective, it means that
NVK now meets the bar required to claim to support the Vulkan API
officially. (There are some legal implications to this which matter
to the Mesa project, but most users don’t care about them.) From
the perspective of users, it means the driver should pretty much
work on Turing and later GPUs.

Security updates for Tuesday

Post Syndicated from corbet original https://lwn.net/Articles/952088/

Security updates have been issued by Debian (activemq, strongswan, and wordpress), Mageia (u-boot), SUSE (avahi, frr, libreoffice, nghttp2, openssl, openssl1, postgresql, postgresql15, postgresql16, python-Twisted, ucode-intel, and xen), and Ubuntu (avahi, hibagent, nodejs, strongswan, tang, and webkit2gtk).

Workers AI Update: Hello Mistral 7B

Post Syndicated from Jesse Kipp original http://blog.cloudflare.com/workers-ai-update-hello-mistral-7b/


Today we’re excited to announce that we’ve added the Mistral-7B-v0.1-instruct to Workers AI. Mistral 7B is a 7.3 billion parameter language model with a number of unique advantages. With some help from the founders of Mistral AI, we’ll look at some of the highlights of the Mistral 7B model, and use the opportunity to dive deeper into “attention” and its variations such as multi-query attention and grouped-query attention.

Mistral 7B tl;dr:

Mistral 7B is a 7.3 billion parameter model that puts up impressive numbers on benchmarks. The model:

  • Outperforms Llama 2 13B on all benchmarks,
  • Outperforms Llama 1 34B on many benchmarks,
  • Approaches CodeLlama 7B performance on code, while remaining good at English tasks, and
  • In the chat fine-tuned version we’ve deployed, outperforms Llama 2 13B chat in the benchmarks provided by Mistral.

Here’s an example of using streaming with the REST API:

curl -X POST \
"https://api.cloudflare.com/client/v4/accounts/{account-id}/ai/run/@cf/mistral/mistral-7b-instruct-v0.1" \
-H "Authorization: Bearer {api-token}" \
-H "Content-Type: application/json" \
-d '{ "prompt": "What is grouped query attention", "stream": true }'

API Response: { response: “Grouped query attention is a technique used in natural language processing  (NLP) and machine learning to improve the performance of models…” }

And here’s an example using a Worker script:

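import { Ai } from '@cloudflare/ai';

export default {
    async fetch(request, env) {
        // env.AI is the Workers AI binding configured for this Worker
        const ai = new Ai(env.AI);
        // With stream: true, ai.run() resolves to a readable stream of server-sent events
        const stream = await ai.run('@cf/mistral/mistral-7b-instruct-v0.1', {
            prompt: 'What is grouped query attention',
            stream: true
        });
        // Pass the stream through as the response body rather than JSON-encoding it
        return new Response(stream, { headers: { 'content-type': 'text/event-stream' } });
    }
}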
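
On the client side, the streamed body has to be reassembled into text. The sketch below is only an illustration: it assumes each event arrives as a `data:` line carrying a JSON object with a `response` field, and that the stream ends with a `data: [DONE]` line. Neither detail is stated in this post, and `collectResponseText` is a hypothetical helper name, so check the Workers AI documentation before relying on it.

```javascript
// Hypothetical helper: join the text fragments found in a chunk of
// server-sent events. The event shape is an assumption, not taken from the post.
function collectResponseText(sseText) {
  let text = "";
  for (const line of sseText.split("\n")) {
    // Only "data:" lines carry a payload; blank separator lines are skipped.
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    // Assumed end-of-stream sentinel.
    if (payload === "[DONE]") break;
    text += JSON.parse(payload).response ?? "";
  }
  return text;
}

// Made-up sample events in the assumed format.
const sample =
  'data: {"response":"Grouped"}\n\n' +
  'data: {"response":" query"}\n\n' +
  'data: [DONE]\n\n';
console.log(collectResponseText(sample));
```

A real client would read the body incrementally and buffer partial lines; this version works on a complete string to keep the parsing step visible.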

Mistral takes advantage of grouped-query attention for faster inference. This recently developed technique improves the speed of inference without compromising output quality. For 7 billion parameter models, we can generate close to 4x as many tokens per second with Mistral as we can with Llama, thanks to grouped-query attention.

You don’t need any information beyond this to start using Mistral-7B; you can test it out today at ai.cloudflare.com. To learn more about attention and grouped-query attention, read on!

So what is “attention” anyway?

The basic mechanism of attention, specifically “Scaled Dot-Product Attention” as introduced in the landmark paper Attention Is All You Need, is fairly simple:

We call our particular attention “Scaled Dot-Product Attention”. The input consists of queries and keys of dimension d_k, and values of dimension d_v. We compute the dot products of the query with all the keys, divide each by sqrt(d_k) and apply a softmax function to obtain the weights on the values.

More concretely, this looks like this:

Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V

(source: Attention Is All You Need)

In simpler terms, this allows models to focus on important parts of the input. Imagine you are reading a sentence and trying to understand it. Scaled dot product attention enables you to pay more attention to certain words based on their relevance. It works by calculating the similarity between each word (K) in the sentence and a query (Q). Then, it scales the similarity scores by dividing them by the square root of the dimension of the query. This scaling helps to avoid very small or very large values. Finally, using these scaled similarity scores, we can determine how much attention or importance each word should receive. This attention mechanism helps models identify crucial information (V) and improve their understanding and translation capabilities.
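
The description above can be sketched in a few lines of JavaScript. This is a minimal illustration for a single query vector, not how a production model computes attention; the function names and the toy vectors are made up for the example.

```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Softmax with the maximum subtracted first for numerical stability.
function softmax(scores) {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const total = exps.reduce((sum, e) => sum + e, 0);
  return exps.map((e) => e / total);
}

// Scaled dot-product attention for one query.
// keys: array of d_k-dimensional vectors; values: array of d_v-dimensional vectors.
function attention(query, keys, values) {
  const scale = Math.sqrt(query.length); // sqrt(d_k)
  const weights = softmax(keys.map((k) => dot(query, k) / scale));
  // Output is the weighted sum of the value vectors.
  const output = new Array(values[0].length).fill(0);
  weights.forEach((w, i) => {
    values[i].forEach((v, j) => {
      output[j] += w * v;
    });
  });
  return { weights, output };
}

// Toy input: the query matches the first key more closely,
// so the first value vector receives the larger weight.
const result = attention([1, 0], [[1, 0], [0, 1]], [[10, 0], [0, 10]]);
console.log(result.weights, result.output);
```

Note that nothing here is learned: the weights depend only on the inputs, which is the point the next paragraph makes.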

Easy, right? To get from this simple mechanism to an AI that can write a “Seinfeld episode in which Jerry learns the bubble sort algorithm,” we’ll need to make it more complex. In fact, everything we’ve just covered doesn’t even have any learned parameters — constant values learned during model training that customize the output of the attention block!

Attention blocks in the style of Attention Is All You Need add three main types of complexity:

Learned parameters

Learned parameters refer to values or weights that are adjusted during the training process of a model to improve its performance. These parameters are used to control the flow of information or attention within the model, allowing it to focus on the most relevant parts of the input data. In simpler terms, learned parameters are like adjustable knobs on a machine that can be turned to optimize its operation.

Vertical stacking – layered attention blocks

Vertical layered stacking is a way to stack multiple attention mechanisms on top of each other, with each layer building on the output of the previous layer. This allows the model to focus on different parts of the input data at different levels of abstraction, which can lead to better performance on certain tasks.

Horizontal stacking – aka Multi-Head Attention

The figure from the paper displays the full multi-head attention module. Multiple attention operations are carried out in parallel, with the Q-K-V input for each generated by a unique linear projection of the same input data (defined by a unique set of learned parameters). These parallel attention blocks are referred to as “attention heads”. The weighted-sum outputs of all attention heads are concatenated into a single vector and passed through another parameterized linear transformation to get the final output.

[Figure: the multi-head attention module, from Attention Is All You Need]

This mechanism allows a model to focus on different parts of the input data concurrently. Imagine you are trying to understand a complex piece of information, like a sentence or a paragraph. In order to understand it, you need to pay attention to different parts of it at the same time. For example, you might need to pay attention to the subject of the sentence, the verb, and the object, all simultaneously, in order to understand the meaning of the sentence. Multi-headed attention works similarly. It allows a model to pay attention to different parts of the input data at the same time, by using multiple “heads” of attention. Each head of attention focuses on a different aspect of the input data, and the outputs of all the heads are combined to produce the final output of the model.

Styles of attention

There are three common arrangements of attention blocks used by large language models developed in recent years: multi-head attention, grouped-query attention and multi-query attention. They differ in the number of K and V vectors relative to the number of query vectors. Multi-head attention uses the same number of K and V vectors as Q vectors, denoted by “N” in the table below. Multi-query attention uses only a single K and V vector. Grouped-query attention, the type used in the Mistral 7B model, divides the Q vectors evenly into groups containing “G” vectors each, then uses a single K and V vector for each group for a total of N divided by G sets of K and V vectors. The table below summarizes the differences, and we’ll dive into the implications further on.

 

Attention style                  Number of Key/Value Blocks   Quality   Memory Usage
Multi-head attention (MHA)       N                            Best      Most
Grouped-query attention (GQA)    N / G                        Better    Less
Multi-query attention (MQA)      1                            Good      Least

Summary of attention styles
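
To make the N, N / G, and 1 entries concrete, here is a small sketch of how query heads could be assigned to K/V sets under each style. The head count of 8 and the helper names are hypothetical, chosen only for illustration.

```javascript
// Number of K/V sets needed when every `groupSize` query heads share one set.
function kvSetCount(numQueryHeads, groupSize) {
  if (numQueryHeads % groupSize !== 0) {
    throw new Error("query heads must divide evenly into groups");
  }
  return numQueryHeads / groupSize;
}

// Index of the K/V set that a given query head reads from.
function kvSetForQueryHead(queryHead, groupSize) {
  return Math.floor(queryHead / groupSize);
}

const numQueryHeads = 8; // hypothetical N
const styles = [
  ["MHA", 1],             // G = 1: every query head has its own K/V set
  ["GQA", 4],             // G = 4: four query heads share one K/V set
  ["MQA", numQueryHeads], // G = N: all query heads share a single K/V set
];
for (const [name, groupSize] of styles) {
  const mapping = Array.from({ length: numQueryHeads }, (_, q) =>
    kvSetForQueryHead(q, groupSize)
  );
  console.log(name, "K/V sets:", kvSetCount(numQueryHeads, groupSize), "mapping:", mapping.join(" "));
}
```

Seen this way, multi-head and multi-query attention are the two extremes of grouped-query attention, with G = 1 and G = N respectively.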

And this diagram helps illustrate the difference between the three styles:

[Diagram: multi-head, grouped-query, and multi-query attention compared]

Multi-Query Attention

Multi-query attention was described in 2019 in the paper from Google: Fast Transformer Decoding: One Write-Head is All You Need. The idea is that instead of creating separate K and V entries for every Q vector in the attention mechanism, as in multi-head attention above, only a single K and V vector is used for the entire set of Q vectors. Thus the name, multiple queries combined into a single attention mechanism. In the paper, this was benchmarked on a translation task and showed performance equal to multi-head attention on the benchmark task.

Originally the idea was to reduce the total size of memory that is accessed when performing inference for the model. Since then, as generalized models have emerged and grown in number of parameters, GPU memory has often become the bottleneck. This plays to the strength of multi-query attention, as it requires the least accelerator memory of the three types of attention. However, as models grew in size and generality, the performance of multi-query attention fell relative to multi-head attention.

Grouped-Query Attention

The newest of the bunch, and the one used by Mistral, is grouped-query attention, as described in the paper GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints that was published on arxiv.org in May 2023. Grouped-query attention combines the best of both worlds: the quality of multi-headed attention with the speed and low memory usage of multi-query attention. Instead of either a single set of K and V vectors or one set for every Q vector, a fixed ratio of one set of K and V vectors for every group of G Q vectors is used, reducing memory usage but retaining high performance on many tasks.
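
The memory difference is easy to estimate with some back-of-the-envelope arithmetic on the cached K and V vectors. The sketch below uses made-up model dimensions (32 layers, 32 query heads, 128-dimensional heads, 2-byte values, G = 4); they are assumptions for illustration, not figures from this post.

```javascript
// Bytes of K/V cache needed per generated token.
// The factor of 2 accounts for one K vector and one V vector per K/V set per layer.
function kvCacheBytesPerToken({ layers, kvSets, headDim, bytesPerValue }) {
  return 2 * layers * kvSets * headDim * bytesPerValue;
}

// Hypothetical model shape, for illustration only.
const model = { layers: 32, headDim: 128, bytesPerValue: 2 };
const queryHeads = 32; // N
const groupSize = 4;   // G

const cacheSizes = {
  MHA: kvCacheBytesPerToken({ ...model, kvSets: queryHeads }),             // N sets
  GQA: kvCacheBytesPerToken({ ...model, kvSets: queryHeads / groupSize }), // N / G sets
  MQA: kvCacheBytesPerToken({ ...model, kvSets: 1 }),                      // 1 set
};

for (const [style, bytes] of Object.entries(cacheSizes)) {
  console.log(style, bytes / 1024, "KiB per token");
}
```

With these numbers the cache shrinks by a factor of G when moving from multi-head to grouped-query attention, and by a factor of N when moving to multi-query attention.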

Often choosing a model for a production task is not just about picking the best model available because we must consider tradeoffs between performance, memory usage, batch size, and available hardware (or cloud costs). Understanding these three styles of attention can help guide those decisions and understand when we might choose a particular model given our circumstances.

Enter Mistral — try it today

Being one of the first large language models to leverage grouped-query attention and combining it with sliding window attention, Mistral seems to have hit the goldilocks zone: it’s low-latency, high-throughput, and it performs really well on benchmarks even when compared to bigger models (13B). All this is to say that it packs a punch for its size, and we couldn’t be more excited to make it available to all developers today, via Workers AI.

Head over to our developer docs to get started, and if you need help, want to give feedback, or want to share what you’re building just pop into our Developer Discord!

The Workers AI team is also expanding and hiring; check our jobs page for open roles if you’re passionate about AI engineering and want to help us build and evolve our global, serverless GPU-powered inference platform.

Email Security Flaw Found in the Wild

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/11/email-security-flaw-found-in-the-wild.html

Google’s Threat Analysis Group announced a zero-day against the Zimbra Collaboration email server that has been used against governments around the world.

TAG has observed four different groups exploiting the same bug to steal email data, user credentials, and authentication tokens. Most of this activity occurred after the initial fix became public on Github. To ensure protection against these types of exploits, TAG urges users and organizations to keep software fully up-to-date and apply security updates as soon as they become available.

The vulnerability was discovered in June. It has been patched.

Evolving our online courses to help more people be computing educators

Post Syndicated from Sway Grantham original https://www.raspberrypi.org/blog/free-online-courses-computing-education-updates-2023/

Since launching our free online courses about computing on the edX platform back in August, we’ve been training course facilitators and analysing the needs of educators around the world. We want every course participant to have a great experience learning with us — read on to find out what we’re doing right now and into 2024 to ensure this.

Online courses for all adults who support young people

Educators of all kinds are key for supporting children and young people to engage with computing technology and develop digital skills. You might be a professional teacher, or a parent, volunteer, youth worker, librarian… there are so many roles in which people share knowledge with young learners.

That’s why our online courses are designed to support any kind of educator to:

  • Understand the full breadth of topics within computing
  • Discover how to introduce computing to young people in clear and exciting ways that are grounded in the latest research

We are constantly improving our online courses based on your feedback, the latest education research, and the insights our team members gain through supporting you on your course learning journeys. Three principles guide these improvements: accessibility, scalability, and sustainability. 

Making our courses more relevant and accessible

Our online courses are used by people who live around the world and bring various knowledge and experiences. Some participants are classroom teachers, others have computing experience from their job and want to volunteer at a kids’ coding club, and some may be parents who want to support their children. It’s important to us that our courses are relevant and accessible to all kinds of adult learners. 

We’re currently working to: 

  • Simplify the English in the courses for participants who speak it as a second language
  • Adapt the course activities for specific settings where participants help young people learn so that e.g. teachers see how the activities work in the classroom, and volunteers who run coding clubs see how they work in club sessions
  • Ensure our course facilitators have experience in a range of different settings including coding clubs, and in a variety of different contexts around the world

Making our courses useful for more groups of people

When we think about the scalability of our courses, we think about how to best support as many educators around the world as possible. If we can make the jobs of all educators easier, whatever their setting is like, then we are making the right choices.

We’re currently working to: 

  • Talk with the global network of educators we’re a part of to better understand what works for them so we can reflect that in the courses
  • Include a wider range of examples for settings beyond the classroom in the courses
  • Adapt our courses so they are relevant to participants with various needs while sustaining the high quality of the overall learning experience

Making the learning from our courses sustainable

The educators who take our courses work to achieve amazing things, and this means they are often busy. Taking the time to complete one of our courses is a commitment, and we want to make sure it is rewarded. The learning you get from participating in an online course should continue to benefit you far beyond the time you spend completing it. This is what we mean by sustainability.

We’re currently working to: 

  • Lay out clear learning pathways so you can build on the knowledge you gain in one course in the next course
  • Offer course resources that are easy to access after you’ve completed the course
  • Explore ways to build communities around our courses where you can share successes and learning outcomes with your fellow participants

Learn with us, and help us design better courses for you

Our work to improve the accessibility, scalability, and sustainability of our courses will continue into 2024, and these three principles will likely be part of our online training strategy for the following year too. 

If you’d like to support young people in your life to learn about computing and digital technologies, take one of our free courses now and learn something new. We have twenty courses available right now and they are totally free.

We are also looking for adult testers for new course content. So if you’re any kind of educator and would like to test upcoming online course content and share your feedback and experiences, please send us a message with the subject ‘Educator training’. 

The post Evolving our online courses to help more people be computing educators appeared first on Raspberry Pi Foundation.
