Agents for Amazon Bedrock is now available with improved control of orchestration and visibility into reasoning

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/agents-for-amazon-bedrock-is-now-available-with-improved-control-of-orchestration-and-visibility-into-reasoning/

Back in July, we introduced Agents for Amazon Bedrock in preview. Today, Agents for Amazon Bedrock is generally available.

Agents for Amazon Bedrock helps you accelerate generative artificial intelligence (AI) application development by orchestrating multistep tasks. Agents uses the reasoning capability of foundation models (FMs) to break down user-requested tasks into multiple steps. They use the developer-provided instruction to create an orchestration plan and then carry out the plan by invoking company APIs and accessing knowledge bases using Retrieval Augmented Generation (RAG) to provide a final response to the end user. If you’re curious how this works, check out my previous posts on agents that include a primer on advanced reasoning and a primer on RAG.

Starting today, Agents for Amazon Bedrock also comes with enhanced capabilities that include improved control of the orchestration and better visibility into the chain of thought reasoning.

Behind the scenes, Agents for Amazon Bedrock automates the prompt engineering and orchestration of user-requested tasks, such as managing retail orders or processing insurance claims. An agent automatically builds the orchestration prompt and, if connected to knowledge bases, augments it with your company-specific information and invokes APIs to provide responses to the user in natural language.

As a developer, you can use the new trace capability to follow the reasoning that’s used as the plan is carried out. You can view the intermediate steps in the orchestration process and use this information to troubleshoot issues.

You can also access and modify the prompt that the agent automatically creates so you can further enhance the end-user experience. You can update this automatically created prompt (or prompt template) to help the FM enhance the orchestration and responses, giving you more control over the orchestration.

Let me show you how to view the reasoning steps and how to modify the prompt.

View reasoning steps
Traces gives you visibility into the agent’s reasoning, known as the chain of thought (CoT). You can use the CoT trace to see how the agent performs tasks step by step. The CoT prompt is based on a reasoning technique called ReAct (synergizing reasoning and acting). Check out the primer on advanced reasoning in my previous blog post to learn more about ReAct and the specific prompt structure.

To get started, navigate to the Amazon Bedrock console and select the working draft of an existing agent. Then, select the Test button and enter a sample user request. In the agent’s response, select Show trace.

Agents for Amazon Bedrock

The CoT trace shows the agent’s reasoning step-by-step. Open each step to see the CoT details.

Agents for Amazon Bedrock

The enhanced visibility helps you understand the rationale used by the agent to complete the task. As a developer, you can use this information to refine the prompts, instructions, and action descriptions to adjust the agent’s actions and responses when iteratively testing and improving the user experience.

Modify agent-created prompts
The agent automatically creates a prompt template from the provided instructions. You can update the preprocessing of user inputs, the orchestration plan, and the postprocessing of the FM response.

To get started, navigate to the Amazon Bedrock console and select the working draft of an existing agent. Then, select the Edit button next to Advanced prompts.

Agents for Amazon Bedrock

Here, you have access to four different types of templates. Preprocessing templates define how an agent
contextualizes and categorizes user inputs. The orchestration template equips an agent with short-term memory, a list of available actions and knowledge bases along with their descriptions, as well as few-shot examples of how to break down the problem and use these actions and knowledge in different sequences or combinations. Knowledge base response generation templates define how knowledge bases will be used and summarized in the response. Postprocessing templates define how an agent will format and present a final response to the end user. You can either keep using the template defaults or edit and override the template defaults.

Things to know
Here are a few best practices and important things to know when you’re working with Agents for Amazon Bedrock.

Agents perform best when you allow them to focus on a specific task. The clearer the objective (instructions) and the more focused the available set of actions (APIs), the easier it will be for the FM to reason and identify the right steps. If you need agents to cover various tasks, consider creating separate, individual agents.

Here are a few additional guidelines:

  • Number of APIs – Use three to five APIs with a couple of input parameters in your agents.
  • API design – Follow general best practices for designing APIs, such as ensuring idempotency.
  • API call validations – Follow best practices of API design by employing exhaustive validation for all API calls. This is particularly important because large language models (LLMs) may generate hallucinated inputs and outputs, and these validations prove helpful during such occurrences.

Availability and pricing
Agents for Amazon Bedrock are available today in AWS Regions US East (N. Virginia) and US West (Oregon). You will be charged for the inference calls (InvokeModel API) made by agents. The InvokeAgent API is not charged separately. Amazon Bedrock Pricing has all the details.

Learn more

— Antje

Customize models in Amazon Bedrock with your own data using fine-tuning and continued pre-training

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/customize-models-in-amazon-bedrock-with-your-own-data-using-fine-tuning-and-continued-pre-training/

Today, I’m excited to share that you can now privately and securely customize foundation models (FMs) with your own data in Amazon Bedrock to build applications that are specific to your domain, organization, and use case. With custom models, you can create unique user experiences that reflect your company’s style, voice, and services.

With fine-tuning, you can increase model accuracy by providing your own task-specific labeled training dataset and further specialize your FMs. With continued pre-training, you can train models using your own unlabeled data in a secure and managed environment with customer managed keys. Continued pre-training helps models become more domain-specific by accumulating more robust knowledge and adaptability—beyond their original training.

Let me give you a quick tour of both model customization options. You can create fine-tuning and continued pre-training jobs using the Amazon Bedrock console or APIs. In the console, navigate to Amazon Bedrock, then select Custom models.

Amazon Bedrock - Custom Models

Fine-tune Meta Llama 2, Cohere Command Light, and Amazon Titan FMs
Amazon Bedrock now supports fine-tuning for Meta Llama 2, Cohere Command Light, as well as Amazon Titan models. To create a fine-tuning job in the console, choose Customize model, then choose Create Fine-tuning job.

Amazon Bedrock - Custom Models

Here’s a quick demo using the AWS SDK for Python (Boto3). Let’s fine-tune Cohere Command Light to summarize dialogs. For demo purposes, I’m using the public dialogsum dataset, but this could be your own company-specific data.

To prepare for fine-tuning on Amazon Bedrock, I converted the dataset into JSON Lines format and uploaded it to Amazon S3. Each JSON line needs to have both a prompt and a completion field. You can specify up to 10,000 training data records, but you may already see model performance improvements with a few hundred examples.

{"completion": "Mr. Smith's getting a check-up, and Doctor Haw...", "prompt": Summarize the following conversation.\n\n#Pers..."}
{"completion": "Mrs Parker takes Ricky for his vaccines. Dr. P...", "prompt": "Summarize the following conversation.\n\n#Pers..."}
{"completion": "#Person1#'s looking for a set of keys and asks...", "prompt": "Summarize the following conversation.\n\n#Pers..."} 

I redacted the prompt and completion fields for brevity.

You can list available foundation models that support fine-tuning with the following command:

import boto3 
bedrock = boto3.client(service_name="bedrock")
bedrock_runtime = boto3.client(service_name="bedrock-runtime")

for model in bedrock.list_foundation_models(
    byCustomizationType="FINE_TUNING")["modelSummaries"]:
    for key, value in model.items():
        print(key, ":", value)
    print("-----\n")

Next, I create a model customization job. I specify the Cohere Command Light model ID that supports fine-tuning, set customization type to FINE_TUNING, and point to the Amazon S3 location of the training data. If needed, you can also adjust the hyperparameters for fine-tuning.

# Select the foundation model you want to customize
base_model_id = "cohere.command-light-text-v14:7:4k"

bedrock.create_model_customization_job(
    customizationType="FINE_TUNING",
    jobName=job_name,
    customModelName=model_name,
    roleArn=role,
    baseModelIdentifier=base_model_id,
    hyperParameters = {
        "epochCount": "1",
        "batchSize": "8",
        "learningRate": "0.00001",
    },
    trainingDataConfig={"s3Uri": "s3://path/to/train-summarization.jsonl"},
    outputDataConfig={"s3Uri": "s3://path/to/output"},
)

# Check for the job status
status = bedrock.get_model_customization_job(jobIdentifier=job_name)["status"]

Once the job is complete, you receive a unique model ID for your custom model. Your fine-tuned model is stored securely by Amazon Bedrock. To test and deploy your model, you need to purchase Provisioned Throughput.

Let’s see the results. I select one example from the dataset and ask the base model before fine-tuning, as well as the custom model after fine-tuning, to summarize the following dialog:

prompt = """Summarize the following conversation.\\n\\n
#Person1#: Hello. My name is John Sandals, and I've got a reservation.\\n
#Person2#: May I see some identification, sir, please?\\n
#Person1#: Sure. Here you are.\\n
#Person2#: Thank you so much. Have you got a credit card, Mr. Sandals?\\n
#Person1#: I sure do. How about American Express?\\n
#Person2#: Unfortunately, at the present time we take only MasterCard or VISA.\\n
#Person1#: No American Express? Okay, here's my VISA.\\n
#Person2#: Thank you, sir. You'll be in room 507, nonsmoking, with a queen-size bed. Do you approve, sir?\\n
#Person1#: Yeah, that'll be fine.\\n
#Person2#: That's great. This is your key, sir. If you need anything at all, anytime, just dial zero.\\n\\n
Summary: """

Use the Amazon Bedrock InvokeModel API to query the models.

body = {
    "prompt": prompt,
    "temperature": 0.5,
    "p": 0.9,
    "max_tokens": 512,
}

response = bedrock_runtime.invoke_model(
	# Use on-demand inference model ID for response before fine-tuning
    # modelId="cohere.command-light-text-v14",
	# Use ARN of your deployed custom model for response after fine-tuning
	modelId=provisioned_custom_model_arn,
    modelId=base_model_id, 
    body=json.dumps(body)
)

Here’s the base model response before fine-tuning:

#Person2# helps John Sandals with his reservation. John gives his credit card information and #Person2# confirms that they take only MasterCard and VISA. John will be in room 507 and #Person2# will be his host if he needs anything.

Here’s the response after fine-tuning, shorter and more to the point:

John Sandals has a reservation and checks in at a hotel. #Person2# takes his credit card and gives him a key.

Continued pre-training for Amazon Titan Text (preview)
Continued pre-training on Amazon Bedrock is available today in public preview for Amazon Titan Text models, including Titan Text Express and Titan Text Lite. To create a continued pre-training job in the console, choose Customize model, then choose Create Continued Pre-training job.

Amazon Bedrock - Custom Models

Here’s a quick demo again using boto3. Let’s assume you work at an investment company and want to continue pre-training the model with financial and analyst reports to make it more knowledgeable about financial industry terminology. For demo purposes, I selected a collection of Amazon shareholder letters as my training data.

To prepare for continued pre-training, I converted the dataset into JSON Lines format again and uploaded it to Amazon S3. Because I’m working with unlabeled data, each JSON line only needs to have the prompt field. You can specify up to 100,000 training data records and usually see positive effects after providing at least 1 billion tokens.

{"input": "Dear shareholders: As I sit down to..."}
{"input": "Over the last several months, we to..."}
{"input": "work came from optimizing the conne..."}
{"input": "of the Amazon shopping experience f..."}

I redacted the input fields for brevity.

Then, create a model customization job with customization type CONTINUED_PRE_TRAINING that points to the data. If needed, you can also adjust the hyperparameters for continued pre-training.

# Select the foundation model you want to customize
base_model_id = "amazon.titan-text-express-v1"

bedrock.create_model_customization_job(
    customizationType="CONTINUED_PRE_TRAINING",
    jobName=job_name,
    customModelName=model_name,
    roleArn=role,
    baseModelIdentifier=base_model_id,
    hyperParameters = {
        "epochCount": "10",
        "batchSize": "8",
        "learningRate": "0.00001",
    },
    trainingDataConfig={"s3Uri": "s3://path/to/train-continued-pretraining.jsonl"},
    outputDataConfig={"s3Uri": "s3://path/to/output"},
)

Once the job is complete, you receive another unique model ID. Your customized model is securely stored again by Amazon Bedrock. As with fine-tuning, you need to purchase Provisioned Throughput to test and deploy your model.

Things to know
Here are a couple of important things to know:

Data privacy and network security – With Amazon Bedrock, you are in control of your data, and all your inputs and customizations remain private to your AWS account. Your data, such as prompts, completions, custom models, and data used for fine-tuning or continued pre-training, is not used for service improvement and is never shared with third-party model providers. Your data remains in the AWS Region where the API call is processed. All data is encrypted in transit and at rest. You can use AWS PrivateLink to create a private connection between your VPC and Amazon Bedrock.

Billing – Amazon Bedrock charges for model customization, storage, and inference. Model customization is charged per tokens processed. This is the number of tokens in the training dataset multiplied by the number of training epochs. An epoch is one full pass through the training data during customization. Model storage is charged per month, per model. Inference is charged hourly per model unit using provisioned throughput. For detailed pricing information, see Amazon Bedrock Pricing.

Custom models and provisioned throughput – Amazon Bedrock allows you to run inference on custom models by purchasing provisioned throughput. This guarantees a consistent level of throughput in exchange for a term commitment. You specify the number of model units needed to meet your application’s performance needs. For evaluating custom models initially, you can purchase provisioned throughput hourly with no long-term commitment. With no commitment, a quota of one model unit is available per provisioned throughput. You can create up to two provisioned throughputs per account.

Availability
Fine-tuning support on Meta Llama 2, Cohere Command Light, and Amazon Titan Text FMs is available today in AWS Regions US East (N. Virginia) and US West (Oregon). Continued pre-training is available today in public preview in AWS Regions US East (N. Virginia) and US West (Oregon). To learn more, visit the Amazon Bedrock Developer Experience web page and check out the User Guide.

Customize FMs with Amazon Bedrock today!

— Antje

Knowledge Bases now delivers fully managed RAG experience in Amazon Bedrock

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/knowledge-bases-now-delivers-fully-managed-rag-experience-in-amazon-bedrock/

Back in September, we introduced Knowledge Bases for Amazon Bedrock in preview. Starting today, Knowledge Bases for Amazon Bedrock is generally available.

With a knowledge base, you can securely connect foundation models (FMs) in Amazon Bedrock to your company data for Retrieval Augmented Generation (RAG). Access to additional data helps the model generate more relevant, context-specific, and accurate responses without continuously retraining the FM. All information retrieved from knowledge bases comes with source attribution to improve transparency and minimize hallucinations. If you’re curious how this works, check out my previous post that includes a primer on RAG.

With today’s launch, Knowledge Bases gives you a fully managed RAG experience and the easiest way to get started with RAG in Amazon Bedrock. Knowledge Bases now manages the initial vector store setup, handles the embedding and querying, and provides source attribution and short-term memory needed for production RAG applications. If needed, you can also customize the RAG workflows to meet specific use case requirements or integrate RAG with other generative artificial intelligence (AI) tools and applications.

Fully managed RAG experience
Knowledge Bases for Amazon Bedrock manages the end-to-end RAG workflow for you. You specify the location of your data, select an embedding model to convert the data into vector embeddings, and have Amazon Bedrock create a vector store in your account to store the vector data. When you select this option (available only in the console), Amazon Bedrock creates a vector index in Amazon OpenSearch Serverless in your account, removing the need to manage anything yourself.

Knowledge bases for Amazon Bedrock

Vector embeddings include the numeric representations of text data within your documents. Each embedding aims to capture the semantic or contextual meaning of the data. Amazon Bedrock takes care of creating, storing, managing, and updating your embeddings in the vector store, and it ensures your data is always in sync with your vector store.

Amazon Bedrock now also supports two new APIs for RAG that handle the embedding and querying and provide the source attribution and short-term memory needed for production RAG applications.

With the new RetrieveAndGenerate API, you can directly retrieve relevant information from your knowledge bases and have Amazon Bedrock generate a response from the results by specifying a FM in your API call. Let me show you how this works.

Use the RetrieveAndGenerate API
To give it a try, navigate to the Amazon Bedrock console, create and select a knowledge base, then select Test knowledge base. For this demo, I created a knowledge base that has access to a PDF of Generative AI on AWS. I choose Select Model to specify a FM.

Knowledge Bases for Amazon Bedrock

Then, I ask, “What is Amazon Bedrock?”

Knowledge Bases for Amazon Bedrock

Behind the scenes, Amazon Bedrock converts the queries into embeddings, queries the knowledge base, and then augments the FM prompt with the search results as context information and returns the FM-generated response to my question. For multi-turn conversations, Knowledge Bases manages the short-term memory of the conversation to provide more contextual results.

Here’s a quick demo of how to use the APIs with the AWS SDK for Python (Boto3).

def retrieveAndGenerate(input, kbId):
    return bedrock_agent_runtime.retrieve_and_generate(
        input={
            'text': input
        },
        retrieveAndGenerateConfiguration={
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kbId,
                'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-instant-v1'
                }
            }
        )

response = retrieveAndGenerate("What is Amazon Bedrock?", "AES9P3MT9T")["output"]["text"]

The output of the RetrieveAndGenerate API includes the generated response, the source attribution, and the retrieved text chunks. In my demo, the API response looks like this (with some of the output redacted for brevity):


{ ... 
    'output': {'text': 'Amazon Bedrock is a managed service from AWS that ...'}, 
    'citations': 
        [{'generatedResponsePart': 
             {'textResponsePart': 
                 {'text': 'Amazon Bedrock is ...', 'span': {'start': 0, 'end': 241}}
             }, 
	      'retrievedReferences': 
			[{'content':
                 {'text': 'All AWS-managed service API activity...'}, 
				 'location': {'type': 'S3', 's3Location': {'uri': 's3://data-generative-ai-on-aws/gaia.pdf'}}}, 
		     {'content': 
			      {'text': 'Changing a portion of the image using ...'}, 
				  'location': {'type': 'S3', 's3Location': {'uri': 's3://data-generative-ai-on-aws/gaia.pdf'}}}, ...]
        ...}]
}

The generated response looks like this:

Amazon Bedrock is a managed service that offers a serverless experience for generative AI through a simple API. It provides access to foundation models from Amazon and third parties for tasks like text generation, image generation, and building conversational agents. Data processed through Amazon Bedrock remains private and encrypted.

Customize RAG workflows
If you want to process the retrieved text chunks further, see the relevance scores of the retrievals, or develop your own orchestration for text generation, you can use the new Retrieve API. This API converts user queries into embeddings, searches the knowledge base, and returns the relevant results, giving you more control to build custom workflows on top of the semantic search results.

Use the Retrieve API
In the Amazon Bedrock console, I toggle the switch to disable Generate responses.

Knowledge Bases for Amazon Bedrock

Then, I ask again, “What is Amazon Bedrock?” This time, the output shows me the retrieval results with links to the source documents where the text chunks came from.

Knowledge Bases for Amazon Bedrock

Here’s how to use the Retrieve API with boto3.

import boto3

bedrock_agent_runtime = boto3.client(
    service_name = "bedrock-agent-runtime"
)

def retrieve(query, kbId, numberOfResults=5):
    return bedrock_agent_runtime.retrieve(
        retrievalQuery= {
            'text': query
        },
        knowledgeBaseId=kbId,
        retrievalConfiguration= {
            'vectorSearchConfiguration': {
                'numberOfResults': numberOfResults
            }
        }
    )

response = retrieve("What is Amazon Bedrock?", "AES9P3MT9T")["retrievalResults"]

The output of the Retrieve API includes the retrieved text chunks, the location type and URI of the source data, and the scores of the retrievals. The score helps to determine chunks that match more closely with the query.

In my demo, the API response looks like this (with some of the output redacted for brevity):

[{'content': {'text': 'Changing a portion of the image using ...'},
  'location': {'type': 'S3',
   's3Location': {'uri': 's3://data-generative-ai-on-aws/gaia.pdf'}},
  'score': 0.7329834},
 {'content': {'text': 'back to the user in natural language. For ...'},
  'location': {'type': 'S3',
   's3Location': {'uri': 's3://data-generative-ai-on-aws/gaia.pdf'}},
  'score': 0.7331088},
...]
		 

To further customize your RAG workflows, you can define a custom chunking strategy and select a custom vector store.

Custom chunking strategy – To enable effective retrieval from your data, a common practice is to first split the documents into manageable chunks. This enhances the model’s capacity to comprehend and process information more effectively, leading to improved relevant retrievals and generation of coherent responses. Knowledge Bases for Amazon Bedrock manages the chunking of your documents.

When you configure the data source for your knowledge base, you can now define a chunking strategy. Default chunking splits data into chunks of up to 200 tokens and is optimized for question-answer tasks. Use default chunking when you are not sure of the optimal chunk size for your data.

You also have the option to specify a custom chunk size and overlap with fixed-size chunking. Use fixed-size chunking if you know the optimal chunk size and overlap for your data (based on file attributes, accuracy testing, and so on). An overlap between chunks in the recommended range of 0–20 percent can help improve accuracy. Higher overlap can lead to decreased relevancy scores.

If you select to create one embedding per document, Knowledge Bases keeps each file as a single chunk. Use this option if you don’t want Amazon Bedrock to chunk your data, for example, if you want to chunk your data offline using an algorithm that is specific to your use case. Common use cases include code documentation.

Custom vector store – You can also select a custom vector store. The available vector database options include vector engine for Amazon OpenSearch Serverless, Pinecone, and Redis Enterprise Cloud. To use a custom vector store, you must create a new, empty vector database from the list of supported options and provide the vector database index name as well as index field and metadata field mappings. This vector database will need to be for exclusive use with Amazon Bedrock.

Knowledge Bases for Amazon Bedrock

Integrate RAG with other generative AI tools and applications
If you want to build an AI assistant that can perform multistep tasks and access company data sources to generate more relevant and context-aware responses, you can integrate Knowledge Bases with Agents for Amazon Bedrock. You can also use the Knowledge Bases retrieval plugin for LangChain to integrate RAG workflows into your generative AI applications.

Availability
Knowledge bases for Amazon Bedrock is available today in AWS Regions US East (N. Virginia) and US West (Oregon).

Learn more

— Antje

Updates to Layered Context Enable Teams to Quickly Understand Which Risk Signals Are Most Pressing

Post Syndicated from Pauline Logan original https://blog.rapid7.com/2023/11/28/updates-to-layered-context-enable-teams-to-quickly-understand-which-risk-signals-are-most-pressing/

Updates to Layered Context Enable Teams to Quickly Understand Which Risk Signals Are Most Pressing

Layered Context introduced a consolidated view of all security risks insightCloudSec collects from the various layers of a cloud environment. This enabled our customers to go from visibility into individual security risks on a resource, to understanding all of the risks that impacted that resource and the overall risk of that resource.

For example: let’s take a cloud resource that has a port left open to the public.

With this level of detail it is pretty challenging to identify the risk level, because we don’t know enough about the resource in question, or even if it was supposed to be opened to the public or not. It’s not that this isn’t risky, we just need to know more to evaluate just how risky it is. As we add more context, we start to get a clearer picture: the environment the resource is running in, if it is connected to a business critical application, does it have any known vulnerabilities, are there identities with elevated permissions associated with the resource, etc.

Updates to Layered Context Enable Teams to Quickly Understand Which Risk Signals Are Most Pressing

By layering together all of this context, customers are able to effectively understand the actual risk associated with each and every one of their resources – in real-time. This is of course helpful information to have in one consolidated view, but even still it can be difficult to sift through potentially thousands of resources and prioritize the work that needs to be done to secure each one. To that end, we are excited to introduce a new risk score in Layered Context, which analyzes all the signals and context we know about a given cloud resource and automatically assigns a score and a severity, making it easy for our customers to understand the riskiest resources they should focus on.

Prioritizing Risk By Focusing on Toxic Combinations

Much like Layered Context itself, the new risk score combines a variety of risk signals, assigning a higher risk score to resources that suffer from toxic combinations, or multiple risk vectors that compound to present an increased likelihood or impact of compromise.

The risk score takes into account:

  • Business Criticality, with an understanding of what applications the resource is associated with such as a crown-jewel or revenue generating app
  • Public Accessibility, both from a network perspective as well as via user permissions (more on that in a second)
  • Potential Attack Paths, to understand how a bad actor could move laterally across your inter-connected environment
  • Identity-related risk, including excessive and/or unused permissions and privileges
  • Misconfigurations, including whether or not the resource is in compliance with organizational standards
  • Threats to factor in any malicious behavior that has been detected
  • And of course, Vulnerabilities, using Rapid7’s Active Risk model which consumes data on exploitability and active exploitation in the wild

By identifying these toxic combinations, we can ensure the riskiest resources are given the highest priority. Each resource is assigned a score and a severity, making it easy for our customers to see where the riskiest resources exist in their environment and where to focus.

A Clear Understanding of How We Calculate Risk

Alongside our risk score, we are  introducing a new view to breakdown all of the reasons why a resource has been scored accordingly. This will give an overview of the most important information our customers need to know that clearly summarizes the factors that influenced the risk scoring. Reducing the time required to understand why a resource is risky, meaning security teams can focus on remediating the risks.

Updates to Layered Context Enable Teams to Quickly Understand Which Risk Signals Are Most Pressing

A Bit More on How we Determine Public Accessibility

As mentioned previously, the basis of much of our risk calculation in cloud resources stems from a simple question: “is this resource publicly accessible?” This is a critical detail in determining relative risk, but can be very difficult to ascertain given the complex and ephemeral nature of cloud environments. To address this, we’ve invested significant time and effort to ensure we’re assessing public accessibility as accurately as possible but also explaining why we’ve determined it that way, so it’s much easier to take remediation action. This determination can easily be viewed on a per resource basis from the Layered Context page.

We have lots of exciting releases coming up in the next few months, alongside Risk scoring we are also extending our Attack Path Analysis feature to show the Blast Radius of an Attack with improved topology visualizations.  This will give our customers not only the visibility into how an attacker could exploit a given resource but also the potential for lateral movement between interconnected resources. Additionally, we’ll be updating the way we validate and show proof of public accessibility. Should a resource be publicly accessible, you will be able to easily view the proof details which will show exactly which combination of configurations is resulting in the resource being publicly accessible.

The new risk scoring capabilities in Layered Context will be on display at AWS Re:Invent next week. Be sure to stop by booth #1270 to see it in action!

Join the preview for new memory-optimized, AWS Graviton4-powered Amazon EC2 instances (R8g)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/join-the-preview-for-new-memory-optimized-aws-graviton4-powered-amazon-ec2-instances-r8g/

We are opening up a preview of the next generation of Amazon Elastic Compute Cloud (Amazon EC2) instances. Equipped with brand-new Graviton4 processors, the new R8g instances will deliver better price performance than any existing memory-optimized instance. The R8g instances are suitable for your most demanding memory-intensive workloads: big data analytics, high-performance databases, in-memory caches and so forth.

Graviton history
Let’s take a quick look back in time and recap the evolution of the Graviton processors:

November 2018 – The Graviton processor made its debut in the A1 instances, optimized for both performance and cost, and delivering cost reductions of up to 45% for scale-out workloads.

December 2019 – The Graviton2 processor debuted with the announcement of M6g, M6gd, C6g, C6gd, R6g, and R6gd instances with up to 40% better price performance than equivalent non-Graviton instances. The second-generation processor delivered up to 7x performance of the first one, including twice the floating point performance.

November 2021 – The Graviton3 processor made its debut with the announcement of the compute-optimized C7g instances. In addition to up to 25% better compute performance, this generation of processors once again doubled floating point and cryptographic performance when compared to the previous generation.

November 2022 – The Graviton 3E processor was announced, for use in the Hpc7g and C7gn instances, with up to 35% higher vector instruction processing performance than the Graviton3.

Today, every one of the top 100 Amazon Elastic Compute Cloud (EC2) customers makes use of Graviton, choosing between more than 150 Graviton-powered instances.

New Graviton4
I’m happy to be able to tell you about the latest in our series of innovative custom chip designs, the energy-efficient AWS Graviton4 processor.

96 Neoverse V2 cores, 2 MB of L2 cache per core, and 12 DDR5-5600 channels work together to make the Graviton4 up to 40% faster for databases, 30% faster for web applications, and 45% faster for large Java applications than the Graviton3.

Graviton4 processors also support all of the security features from the previous generations, and includes some important new ones including encrypted high-speed hardware interfaces and Branch Target Identification (BTI).

R8g instance sizes
The 8th generation R8g instances will be available in multiple sizes with up to triple the number of vCPUs and triple the amount of memory of the 7th generation (R7g) of memory-optimized, Graviton3-powered instances.

Join the preview
R8g instances with Graviton4 processors

Jeff;

Announcing the new Amazon S3 Express One Zone high performance storage class

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-amazon-s3-express-one-zone-high-performance-storage-class/

The new Amazon S3 Express One Zone storage class is designed to deliver up to 10x better performance than the S3 Standard storage class while handling hundreds of thousands of requests per second with consistent single-digit millisecond latency, making it a great fit for your most frequently accessed data and your most demanding applications. Objects are stored and replicated on purpose built hardware within a single AWS Availability Zone, allowing you to co-locate storage and compute (Amazon EC2, Amazon ECS, and Amazon EKS) resources to further reduce latency.

Amazon S3 Express One Zone
With very low latency between compute and storage, the Amazon S3 Express One Zone storage class can help to deliver a significant reduction in runtime for data-intensive applications, especially those that use hundreds or thousands of parallel compute nodes to process large amounts of data for AI/ML training, financial modeling, media processing, real-time ad placement, high performance computing, and so forth. These applications typically keep the data around for a relatively short period of time, but access it very frequently during that time.

This new storage class can handle objects of any size, but is especially awesome for smaller objects. This is because for smaller objects the time to first byte is very close to the time for last byte. In all storage systems, larger objects take longer to stream because there is more data to download during the transfer, and therefore the storage latency has less impact on the total time to read the object. As a result, smaller objects receive an outsized benefit from lower storage latency compared to large objects. Because of S3 Express One Zone’s consistent very low latency, small objects can be read up to 10x faster compared to S3 Standard.

The extremely low latency provided by Amazon S3 Express One Zone, combined with request costs that are 50% lower than for the S3 Standard storage class, means that your Spot and On-Demand compute resources are used more efficiently and can be shut down earlier, leading to an overall reduction in processing costs.

Each Amazon S3 Express One Zone directory bucket exists in a single Availability Zone that you choose, and can be accessed using the usual set of S3 API functions: CreateBucket, PutObject, GetObject, ListObjectsV2, and so forth. The buckets also support a carefully chosen set of S3 features including byte-range fetches, multi-part upload, multi-part copy, presigned URLs, and Access Analyzer for S3. You can upload objects directly, write code that uses CopyObject, or use S3 Batch Operations,

In order to reduce latency and to make this storage class as efficient & scalable as possible, we are introducing a new bucket type, a new authentication model, and a bucket naming convention:

New bucket type – The new directory buckets are specific to this storage class, and support hundreds of thousands of requests per second. They have a hierarchical namespace and store object key names in a directory-like manner. The path delimiter must be “/“, and any prefixes that you supply to ListObjectsV2 must end with a delimiter. Also, list operations return results without first sorting them, so you cannot do a “start after” retrieval.

New authentication model – The new CreateSession function returns a session token that grants access to a specific bucket for five minutes. You must include this token in the requests that you make to other S3 API functions that operate on the bucket or the objects in it, with the exception of CopyObject, which requires IAM credentials. The newest versions of the AWS SDKs handle session creation automatically.

Bucket naming – Directory bucket names must be unique within their AWS Region, and must specify an Availability Zone ID in a specially formed suffix. If my base bucket name is jbarr and it exists in Availability Zone use1-az5 (Availability Zone 5 in the US East (N. Virginia) Region) the name that I supply to CreateBucket would be jbarr--use1-az5--x-s3. Although the bucket exists within a specific Availability Zone, it is accessible from the other zones in the region, and there are no data transfer charges for requests from compute resources in one Availability Zone to directory buckets in another one in the same region.

Amazon S3 Express One Zone in action
Let’s put this new storage class to use. I will focus on the command line, but AWS Management Console and API access are also available.

My EC2 instance is running in my us-east-1f Availability Zone. I use jq to map this value to an Availability Zone Id:

$ aws ec2 describe-availability-zones --output json | \
  jq -r  '.AvailabilityZones[] | select(.ZoneName == "us-east-1f") | .ZoneId'
use1-az5

I create a bucket configuration (s3express-bucket-config.json) and include the Id:

{
        "Location" :
        {
                "Type" : "AvailabilityZone",
                "Name" : "use1-az5"
        },
        "Bucket":
        {
                "DataRedundancy" : "SingleAvailabilityZone",
                "Type"           : "Directory"
        }
}

After installing the newest version of the AWS Command Line Interface (AWS CLI), I create my directory bucket:

$ aws s3api create-bucket --bucket jbarr--use1-az5--x-s3 \
  --create-bucket-configuration file://s3express-bucket-config.json \
  --region us-east-1
-------------------------------------------------------------------------------------------
|                                       CreateBucket                                      |
+----------+------------------------------------------------------------------------------+
|  Location|  https://jbarr--use1-az5--x-s3.s3express-use1-az5.us-east-1.amazonaws.com/   |
+----------+------------------------------------------------------------------------------+

Then I can use the directory bucket as the destination for other CLI commands as usual (the second aws is the directory where I unzipped the AWS CLI):

$ aws s3 sync aws s3://jbarr--use1-az5--x-s3

When I list the directory bucket’s contents, I see that the StorageClass is EXPRESS_ONEZONE:

$ aws s3api list-objects-v2 --bucket jbarr--use1-az5--x-s3 --output json | \
  jq -r '.Contents[] | {Key: .Key, StorageClass: .StorageClass}'
...
{
  "Key": "install",
  "StorageClass": "EXPRESS_ONEZONE"
}
...

The Management Console for S3 shows General purpose buckets and Directory buckets on separate tabs:

I can import the contents of an existing bucket (or a prefixed subset of the contents) into a directory bucket using the Import button, as seen above. I select a source bucket, click Import, and enter the parameters that will be used to generate an inventory of the source bucket and to create and an S3 Batch Operations job.

The job is created and begins to execute:

Things to know
Here are some important things to know about this new S3 storage class:

Regions – Amazon S3 Express One Zone is available in the US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Stockholm) Regions, with plans to expand to others over time.

Other AWS Services – You can use Amazon S3 Express One Zone with other AWS services including Amazon SageMaker Model Training, Amazon Athena, Amazon EMR, and AWS Glue Data Catalog to accelerate your machine learning and analytics workloads. You can also use Mountpoint for Amazon S3 to process your S3 objects in file-oriented fashion.

Pricing – Pricing, like the other S3 storage classes, is on a pay-as-you-go basis. You pay $0.16/GB/month in the US East (N. Virginia) Region, with a one-hour minimum billing time for each object, and additional charges for certain request types. You pay an additional per-GB fee for the portion of any request that exceeds 512 KB. For more information, see the Amazon S3 Pricing page.

Durability – In the unlikely case of the loss or damage to all or part of an AWS Availability Zone, data in a One Zone storage class may be lost. For example, events like fire and water damage could result in data loss. Apart from these types of events, our One Zone storage classes use similar engineering designs as our Regional storage classes to protect objects from independent disk, host, and rack-level failures, and each are designed to deliver 99.999999999% data durability.

SLA – Amazon S3 Express One Zone is designed to deliver 99.95% availability with an availability SLA of 99.9%; for information see the Amazon S3 Service Level Agreement page.

This new storage class is available now and you can start using it today!

Learn more
Amazon S3 Express One Zone

Jeff;

Security updates for Tuesday

Post Syndicated from corbet original https://lwn.net/Articles/953099/

Security updates have been issued by Debian (cryptojs, fastdds, mediawiki, and minizip), Fedora (chromium, kubernetes, and thunderbird), Mageia (lilypond, mariadb, and packages), Red Hat (firefox, linux-firmware, and thunderbird), SUSE (compat-openssl098, gstreamer-plugins-bad, squashfs, squid, thunderbird, vim, and xerces-c), and Ubuntu (libtommath, linux-intel-iotg, linux-intel-iotg-5.15, linux-oracle, perl, and python3.8, python3.10, python3.11).

Security at multiple layers for web-administered apps

Post Syndicated from Guy Morton original https://aws.amazon.com/blogs/security/security-at-multiple-layers-for-web-administered-apps/

In this post, I will show you how to apply security at multiple layers of a web application hosted on AWS.

Apply security at all layers is a design principle of the Security pillar of the AWS Well-Architected Framework. It encourages you to apply security at the network edge, virtual private cloud (VPC), load balancer, compute instance (or service), operating system, application, and code.

Many popular web apps are designed with a single layer of security: the login page. Behind that login page is an in-built administration interface that is directly exposed to the internet. Admin interfaces for these apps typically have simple login mechanisms and often lack multi-factor authentication (MFA) support, which can make them an attractive target for threat actors.

The in-built admin interface can also be problematic if you want to horizontally scale across multiple servers. The admin interface is available on every server that runs the app, so it creates a large attack surface. Because the admin interface updates the software on its own server, you must synchronize updates across a fleet of instances.

Multi-layered security is about identifying (or creating) isolation boundaries around the parts of your architecture and minimizing what is permitted to cross each boundary. Adding more layers to your architecture gives you the opportunity to introduce additional controls at each layer, creating more boundaries where security controls can be enforced.

In the example app scenario in this post, you have the opportunity to add many additional layers of security.

Example of multi-layered security

This post demonstrates how you can use the Run Web-Administered Apps on AWS sample project to help address these challenges, by implementing a horizontally-scalable architecture with multi-layered security. The project builds and configures many different AWS services, each designed to help provide security at different layers.

By running this solution, you can produce a segmented architecture that separates the two functions of these apps into an unprivileged public-facing view and an admin view. This design limits access to the web app’s admin functions while creating a fleet of unprivileged instances to serve the app at scale.

Figure 1 summarizes how the different services in this solution work to help provide security at the following layers:

  1. At the network edge
  2. Within the VPC
  3. At the load balancer
  4. On the compute instances
  5. Within the operating system
Figure 1: Logical flow diagram to apply security at multiple layers

Figure 1: Logical flow diagram to apply security at multiple layers

Deep dive on a multi-layered architecture

The following diagram shows the solution architecture deployed by Run Web-Administered Apps on AWS. The figure shows how the services deployed in this solution are deployed in different AWS Regions, and how requests flow from the application user through the different service layers.

Figure 2: Multi-layered architecture

Figure 2: Multi-layered architecture

This post will dive deeper into each of the architecture’s layers to see how security is added at each layer. But before we talk about the technology, let’s consider how infrastructure is built and managed — by people.

Perimeter 0 – Security at the people layer

Security starts with the people in your team and your organization’s operational practices. How your “people layer” builds and manages your infrastructure contributes significantly to your security posture.

A design principle of the Security pillar of the Well-Architected Framework is to automate security best practices. This helps in two ways: it reduces the effort required by people over time, and it helps prevent resources from being in inconsistent or misconfigured states. When people use manual processes to complete tasks, misconfigurations and missed steps are common.

The simplest way to automate security while reducing human effort is to adopt services that AWS manages for you, such as Amazon Relational Database Service (Amazon RDS). With Amazon RDS, AWS is responsible for the operating system and database software patching, and provides tools to make it simple for you to back up and restore your data.

You can automate and integrate key security functions by using managed AWS security services, such as Amazon GuardDuty, AWS Config, Amazon Inspector, and AWS Security Hub. These services provide network monitoring, configuration management, and detection of software vulnerabilities and unintended network exposure. As your cloud environments grow in scale and complexity, automated security monitoring is critical.

Infrastructure as code (IaC) is a best practice that you can follow to automate the creation of infrastructure. By using IaC to define, configure, and deploy the AWS resources that you use, you reduce the likelihood of human error when building AWS infrastructure.

Adopting IaC can help you improve your security posture because it applies the rigor of application code development to infrastructure provisioning. Storing your infrastructure definition in a source control system (such as AWS CodeCommit) creates an auditable artifact. With version control, you can track changes made to it over time as your architecture evolves.

You can add automated testing to your IaC project to help ensure that your infrastructure is aligned with your organization’s security policies. If you ever need to recover from a disaster, you can redeploy the entire architecture from your IaC project.

Another people-layer discipline is to apply the principle of least privilege. AWS Identity and Access Management (IAM) is a flexible and fine-grained permissions system that you can use to grant the smallest set of actions that your solution needs. You can use IAM to control access for both humans and machines, and we use it in this project to grant the compute instances the least privileges required.

You can also adopt other IAM best practices such as using temporary credentials instead of long-lived ones (such as access keys), and regularly reviewing and removing unused users, roles, permissions, policies, and credentials.

Perimeter 1 – network protections

The internet is public and therefore untrusted, so you must proactively address the risks from threat actors and network-level attacks.

To reduce the risk of distributed denial of service (DDoS) attacks, this solution uses AWS Shield for managed protection at the network edge. AWS Shield Standard is automatically enabled for all AWS customers at no additional cost and is designed to provide protection from common network and transport layer DDoS attacks. For higher levels of protection against attacks that target your applications, subscribe to AWS Shield Advanced.

Amazon Route 53 resolves the hostnames that the solution uses and maps the hostnames as aliases to an Amazon CloudFront distribution. Route 53 is a robust and highly available globally distributed DNS service that inspects requests to protect against DNS-specific attack types, such as DNS amplification attacks.

Perimeter 2 – request processing

CloudFront also operates at the AWS network edge and caches, transforms, and forwards inbound requests to the relevant origin services across the low-latency AWS global network. The risk of DDoS attempts overwhelming your application servers is further reduced by caching web requests in CloudFront.

The solution configures CloudFront to add a shared secret to the origin request within a custom header. A CloudFront function copies the originating user’s IP to another custom header. These headers get checked when the request arrives at the load balancer.

AWS WAF, a web application firewall, blocks known bad traffic, including cross-site scripting (XSS) and SQL injection events that come into CloudFront. This project uses AWS Managed Rules, but you can add your own rules, as well. To restrict frontend access to permitted IP CIDR blocks, this project configures an IP restriction rule on the web application firewall.

Perimeter 3 – the VPC

After CloudFront and AWS WAF check the request, CloudFront forwards it to the compute services inside an Amazon Virtual Private Cloud (Amazon VPC). VPCs are logically isolated networks within your AWS account that you can use to control the network traffic that is allowed in and out. This project configures its VPC to use a private IPv4 CIDR block that cannot be directly routed to or from the internet, creating a network perimeter around your resources on AWS.

The Amazon Elastic Compute Cloud (Amazon EC2) instances are hosted in private subnets within the VPC that have no inbound route from the internet. Using a NAT gateway, instances can make necessary outbound requests. This design hosts the database instances in isolated subnets that don’t have inbound or outbound internet access. Amazon RDS is a managed service, so AWS manages patching of the server and database software.

The solution accesses AWS Secrets Manager by using an interface VPC endpoint. VPC endpoints use AWS PrivateLink to connect your VPC to AWS services as if they were in your VPC. In this way, resources in the VPC can communicate with Secrets Manager without traversing the internet.

The project configures VPC Flow Logs as part of the VPC setup. VPC flow logs capture information about the IP traffic going to and from network interfaces in your VPC. GuardDuty analyzes these logs and uses threat intelligence data to identify unexpected, potentially unauthorized, and malicious activity within your AWS environment.

Although using VPCs and subnets to segment parts of your application is a common strategy, there are other ways that you can achieve partitioning for application components:

  • You can use separate VPCs to restrict access to a database, and use VPC peering to route traffic between them.
  • You can use a multi-account strategy so that different security and compliance controls are applied in different accounts to create strong logical boundaries between parts of a system. You can route network requests between accounts by using services such as AWS Transit Gateway, and control them using AWS Network Firewall.

There are always trade-offs between complexity, convenience, and security, so the right level of isolation between components depends on your requirements.

Perimeter 4 – the load balancer

After the request is sent to the VPC, an Application Load Balancer (ALB) processes it. The ALB distributes requests to the underlying EC2 instances. The ALB uses TLS version 1.2 to encrypt incoming connections with an AWS Certificate Manager (ACM) certificate.

Public access to the load balancer isn’t allowed. A security group applied to the ALB only allows inbound traffic on port 443 from the CloudFront IP range. This is achieved by specifying the Region-specific AWS-managed CloudFront prefix list as the source in the security group rule.

The ALB uses rules to decide whether to forward the request to the target instances or reject the traffic. As an additional layer of security, it uses the custom headers that the CloudFront distribution added to make sure that the request is from CloudFront. In another rule, the ALB uses the originating user’s IP to decide which target group of Amazon EC2 instances should handle the request. In this way, you can direct admin users to instances that are configured to allow admin tasks.

If a request doesn’t match a valid rule, the ALB returns a 404 response to the user.

Perimeter 5 – compute instance network security

A security group creates an isolation boundary around the EC2 instances. The only traffic that reaches the instance is the traffic that the security group rules allow. In this solution, only the ALB is allowed to make inbound connections to the EC2 instances.

A common practice is for customers to also open ports, or to set up and manage bastion hosts to provide remote access to their compute instances. The risk in this approach is that the ports could be left open to the whole internet, exposing the instances to vulnerabilities in the remote access protocol. With remote work on the rise, there is an increased risk for the creation of these overly permissive inbound rules.

Using AWS Systems Manager Session Manager, you can remove the need for bastion hosts or open ports by creating secure temporary connections to your EC2 instances using the installed SSM agent. As with every software package that you install, you should check that the SSM agent aligns with your security and compliance requirements. To review the source code to the SSM agent, see amazon-ssm-agent GitHub repo.

The compute layer of this solution consists of two separate Amazon EC2 Auto Scaling groups of EC2 instances. One group handles requests from administrators, while the other handles requests from unprivileged users. This creates another isolation boundary by keeping the functions separate while also helping to protect the system from a failure in one component causing the whole system to fail. Each Amazon EC2 Auto Scaling group spans multiple Availability Zones (AZs), providing resilience in the event of an outage in an AZ.

By using managed database services, you can reduce the risk that database server instances haven’t been proactively patched for security updates. Managed infrastructure helps reduce the risk of security issues that result from the underlying operating system not receiving security patches in a timely manner and the risk of downtime from hardware failures.

Perimeter 6 – compute instance operating system

When instances are first launched, the operating system must be secure, and the instances must be updated as required when new security patches are released. We recommend that you create immutable servers that you build and harden by using a tool such as EC2 Image Builder. Instead of patching running instances in place, replace them when an updated Amazon Machine Image (AMI) is created. This approach works in our example scenario because the application code (which changes over time) is stored on Amazon Elastic File System (Amazon EFS), so when you replace the instances with a new AMI, you don’t need to update them with data that has changed after the initial deployment.

Another way that the solution helps improve security on your instances at the operating system is to use EC2 instance profiles to allow them to assume IAM roles. IAM roles grant temporary credentials to applications running on EC2, instead of using hard-coded credentials stored on the instance. Access to other AWS resources is provided using these temporary credentials.

The IAM roles have least privilege policies attached that grant permission to mount the EFS file system and access AWS Systems Manager. If a database secret exists in Secrets Manager, the IAM role is granted permission to access it.

Perimeter 7 – at the file system

Both Amazon EC2 Auto Scaling groups of EC2 instances share access to Amazon EFS, which hosts the files that the application uses. IAM authorization applies IAM file system policies to control the instance’s access to the file system. This creates another isolation boundary that helps prevent the non-admin instances from modifying the application’s files.

The admin group’s instances have the file system mounted in read-write mode. This is necessary so that the application can update itself, install add-ons, upload content, or make configuration changes. On the unprivileged instances, the file system is mounted in read-only mode. This means that these instances can’t make changes to the application code or configuration files.

The unprivileged instances have local file caching enabled. This caches files from the EFS file system on the local Amazon Elastic Block Store (Amazon EBS) volume to help improve scalability and performance.

Perimeter 8 – web server configuration

This solution applies different web server configurations to the instances running in each Amazon EC2 Auto Scaling group. This creates a further isolation boundary at the web server layer.

The admin instances use the default configuration for the application that permits access to the admin interface. Non-admin, public-facing instances block admin routes, such as wp-login.php, and will return a 403 Forbidden response. This creates an additional layer of protection for those routes.

Perimeter 9 – database security

The database layer is within two additional isolation boundaries. The solution uses Amazon RDS, with database instances deployed in isolated subnets. Isolated subnets have no inbound or outbound internet access and can only be reached through other network interfaces within the VPC. The RDS security group further isolates the database instances by only allowing inbound traffic from the EC2 instances on the database server port.

By using IAM authentication for the database access, you can add an additional layer of security by configuring the non-admin instances with less privileged database user credentials.

Perimeter 10 – Security at the application code layer

To apply security at the application code level, you should establish good practices around installing updates as they become available. Most applications have email lists that you can subscribe to that will notify you when updates become available.

You should evaluate the quality of an application before you adopt it. The following are some metrics to consider:

  • Number of developers who are actively working on it
  • Frequency of updates to it
  • How quickly the developers respond with patches when bugs are reported

Other steps that you can take

Use AWS Verified Access to help secure application access for human users. With Verified Access, you can add another user authentication stage, to help ensure that only verified users can access an application’s administrative functions.

Amazon GuardDuty is a threat detection service that continuously monitors your AWS accounts and workloads for malicious activity and delivers detailed security findings for visibility and remediation. It can detect communication with known malicious domains and IP addresses and identify anomalous behavior. GuardDuty Malware Protection helps you detect the potential presence of malware by scanning the EBS volumes that are attached to your EC2 instances.

Amazon Inspector is an automated vulnerability management service that automatically discovers the Amazon EC2 instances that are running and scans them for software vulnerabilities and unintended network exposure. To help ensure that your web server instances are updated when security patches are available, use AWS Systems Manager Patch Manager.

Deploy the sample project

We wrote the Run Web-Administered Apps on AWS project by using the AWS Cloud Development Kit (AWS CDK). With the AWS CDK, you can use the expressive power of familiar programming languages to define your application resources and accelerate development. The AWS CDK has support for multiple languages, including TypeScript, Python, .NET, Java, and Go.

This project uses Python. To deploy it, you need to have a working version of Python 3 on your computer. For instructions on how to install the AWS CDK, see Get Started with AWS CDK.

Configure the project

To enable this project to deploy multiple different web projects, you must do the configuration in the parameters.properties file. Two variables identify the configuration blocks: app (which identifies the web application to deploy) and env (which identifies whether the deployment is to a dev or test environment, or to production).

When you deploy the stacks, you specify the app and env variables as CDK context variables so that you can select between different configurations at deploy time. If you don’t specify a context, a [default] stanza in the parameters.properties file specifies the default app name and environment that will be deployed.

To name other stanzas, combine valid app and env values by using the format <app>-<env>. For each stanza, you can specify its own Regions, accounts, instance types, instance counts, hostnames, and more. For example, if you want to support three different WordPress deployments, you might specify the app name as wp, and for env, you might want devtest, and prod, giving you three stanzas: wp-devwp-test, and wp-prod.

The project includes sample configuration items that are annotated with comments that explain their function.

Use CDK bootstrapping

Before you can use the AWS CDK to deploy stacks into your account, you need to use CDK bootstrapping to provision resources in each AWS environment (account and Region combination) that you plan to use. For this project, you need to bootstrap both the US East (N. Virginia) Region (us-east-1)  and the home Region in which you plan to host your application.

Create a hosted zone in the target account

You need to have a hosted zone in Route 53 to allow the creation of DNS records and certificates. You must manually create the hosted zone by using the AWS Management Console. You can delegate a domain that you control to Route 53 and use it with this project. You can also register a domain through Route 53 if you don’t currently have one.

Run the project

Clone the project to your local machine and navigate to the project root. To create the Python virtual environment (venv) and install the dependencies, follow the steps in the Generic CDK instructions.

To create and configure the parameters.properties file

Copy the parameters-template.properties file (in the root folder of the project) to a file called parameters.properties and save it in the root folder. Open it with a text editor and then do the following:

If you want to restrict public access to your site, change 192.0.2.0/24 to the IP range that you want to allow. By providing a comma-separated list of allowedIps, you can add multiple allowed CIDR blocks.

If you don’t want to restrict public access, set allowedIps=* instead.

If you have forked this project into your own private repository, you can commit the parameters.properties file to your repo. To do that, comment out the parameters.properties  line in the .gitignore file.

To install the custom resource helper

The solution uses an AWS CloudFormation custom resource for cross-Region configuration management. To install the needed Python package, run the following command in the custom_resource directory:

cd custom_resource
pip install crhelper -t .

To learn more about CloudFormation custom resource creation, see AWS CloudFormation custom resource creation with Python, AWS Lambda, and crhelper.

To configure the database layer

Before you deploy the stacks, decide whether you want to include a data layer as part of the deployment. The dbConfig parameter determines what will happen, as follows:

  • If dbConfig is left empty — no database will be created and no database credentials will be available in your compute stacks
  • If dbConfig is set to instance — you will get a new Amazon RDS instance
  • If dbConfig is set to cluster — you will get an Amazon Aurora cluster
  • If dbConfig is set to none — if you previously created a database in this stack, the database will be deleted

If you specify either instance or cluster, you should also configure the following database parameters to match your requirements:

  • dbEngine — set the database engine to either mysql or postgres
  • dbSnapshot — specify the named snapshot for your database
  • dbSecret — if you are using an existing database, specify the Amazon Resource Name (ARN) of the secret where the database credentials and DNS endpoint are located
  • dbMajorVersion — set the major version of the engine that you have chosen; leave blank to get the default version
  • dbFullVersion — set the minor version of the engine that you have chosen; leave blank to get the default version
  • dbInstanceType — set the instance type that you want (note that these vary by service); don’t prefix with db. because the CDK will automatically prepend it
  • dbClusterSize — if you request a cluster, set this parameter to determine how many Amazon Aurora replicas are created

You can choose between mysql or postgres for the database engine. Other settings that you can choose are determined by that choice.

You will need to use an Amazon Machine Image (AMI) that has the CLI preinstalled, such as Amazon Linux 2, or install the AWS Command Line Interface (AWS CLI) yourself with a user data command. If instead of creating a new, empty database, you want to create one from a snapshot, supply the snapshot name by using the dbSnapshot parameter.

To create the database secret

AWS automatically creates and stores the RDS instance or Aurora cluster credentials in a Secrets Manager secret when you create a new instance or cluster. You make these credentials available to the compute stack through the db_secret_command variable, which contains a single-line bash command that returns the JSON from the AWS CLI command aws secretsmanager get-secret-value. You can interpolate this variable into your user data commands as follows:

SECRET=$({db_secret_command})
USERNAME=`echo $SECRET | jq -r '.username'`
PASSWORD=`echo $SECRET | jq -r '.password'`
DBNAME=`echo $SECRET | jq -r '.dbname'`
HOST=`echo $SECRET | jq -r '.host'`

If you create a database from a snapshot, make sure that your Secrets Manager secret and Amazon RDS snapshot are in the target Region. If you supply the secret for an existing database, make sure that the secret contains at least the following four key-value pairs (replace the <placeholder values> with your values):

{
    "password":"<your-password>",
    "dbname":"<your-database-name>",
    "host":"<your-hostname>",
    "username":"<your-username>"
}

The name for the secret must match the app value followed by the env value (both in title case), followed by DatabaseSecret, so for app=wp and env=dev, your secret name should be WpDevDatabaseSecret.

To deploy the stacks

The following commands deploy the stacks defined in the CDK app. To deploy them individually, use the specific stack names (these will vary according to the info that you supplied previously), as shown in the following.

cdk deploy wp-dev-network-stack -c app=wp -c env=dev
cdk deploy wp-dev-database-stack -c app=wp -c env=dev
cdk deploy wp-dev-compute-stack -c app=wp -c env=dev
cdk deploy wp-dev-cdn-stack -c app=wp -c env=dev

To create a database stack, deploy the network and database stacks first.

cdk deploy wp-dev-network-stack -c app=wp -c env=dev
cdk deploy wp-dev-database-stack -c app=wp -c env=dev

You can then initiate the deployment of the compute stack.

cdk deploy wp-dev-compute-stack -c app=wp -c env=dev

After the compute stack deploys, you can deploy the stack that creates the CloudFront distribution.

cdk deploy wp-dev-cdn-stack -c env=dev

This deploys the CloudFront infrastructure to the US East (N. Virginia) Region (us-east-1). CloudFront is a global AWS service, which means that you must create it in this Region. The other stacks are deployed to the Region that you specified in your configuration stanza.

To test the results

If your stacks deploy successfully, your site appears at one of the following URLs:

  • subdomain.hostedZone (if you specified a value for the subdomain) — for example, www.example.com
  • appName-env.hostedZone (if you didn’t specify a value for the subdomain) — for example, wp-dev.example.com.

If you connect through the IP address that you configured in the adminIps configuration, you should be connected to the admin instance for your site. Because the admin instance can modify the file system, you should use it to do your administrative tasks.

Users who connect to your site from an IP that isn’t in your allowedIps list will be connected to your fleet instances and won’t be able to alter the file system (for example, they won’t be able to install plugins or upload media).

If you need to redeploy the same app-env combination, manually remove the parameter store items and the replicated secret that you created in us-east-1. You should also delete the cdk.context.json file because it caches values that you will be replacing.

One project, multiple configurations

You can modify the configuration file in this project to deploy different applications to different environments using the same project. Each app can have different configurations for dev, test, or production environments.

Using this mechanism, you can deploy sites for test and production into different accounts or even different Regions. The solution uses CDK context variables as command-line switches to select different configuration stanzas from the configuration file.

CDK projects allow for multiple deployments to coexist in one account by using unique names for the deployed stacks, based on their configuration.

Check the configuration file into your source control repo so that you track changes made to it over time.

Got a different web app that you want to deploy? Create a new configuration by copying and pasting one of the examples and then modify the build commands as needed for your use case.

Conclusion

In this post, you learned how to build an architecture on AWS that implements multi-layered security. You can use different AWS services to provide protections to your application at different stages of the request lifecycle.

You can learn more about the services used in this sample project by building it in your own account. It’s a great way to explore how the different services work and the full features that are available. By understanding how these AWS services work, you will be ready to use them to add security, at multiple layers, in your own architectures.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Guy Morton

Guy Morton

Guy is a Senior Solutions Architect at AWS. He enjoys bringing his decades of experience as a full stack developer, architect, and people manager to helping customers build and scale their applications securely in the AWS Cloud. Guy has a passion for automation in all its forms, and is also an occasional songwriter and musician who performs under the pseudonym Whtsqr.

Better debugging for Cloudflare Workers, now with breakpoints

Post Syndicated from Adam Murray original http://blog.cloudflare.com/debugging-cloudflare-workers/


Better debugging for Cloudflare Workers, now with breakpoints

As developers, we’ve all experienced times when our code doesn’t work like we expect it to. Whatever the root cause is, being able to quickly dive in, diagnose the problem, and ship a fix is invaluable.

If you’re developing with Cloudflare Workers, we provide many tools to help you debug your applications; from your local environment all the way into production. This additional insight helps save you time and resources and provides visibility into how your application actually works — which can help you optimize and refactor code even before it goes live.

In this post, we’ll explore some of the tools we currently offer, and do a deep dive into one specific area — breakpoint debugging — looking at not only how to use it, but how we recently implemented it in our runtime, workerd.

Available Debugging Tools

Logs

console.log. It might be the simplest tool for a developer to debug, but don’t underestimate it. Built into the Cloudflare runtime is node-like logging, which provides detailed, color-coded logs. Locally, you can view these logs in a terminal window, and they will look like this:

Outside local development, once your Worker is deployed, console.log statements are visible via the Real-time Logs interface in the Cloudflare Dashboard or via the Workers CLI tool, Wrangler, using the wrangler tail command. Each log that comes through wrangler tail is structured JSON, and the command has options to filter and search incoming logs to make results as relevant as possible.

If you’d like to send these logs to third-parties for processing and storage, you can leverage Workers Trace Events Logpush which supports a variety of destinations.

DevTools

In addition to logging, you can also leverage our implementation of Chrome’s DevTools to do things like view and debug network requests, take memory heap snapshots, and monitor CPU usage.

This interactive tool provides even further insight and information about your Cloudflare Workers, and can be started from within Wrangler by running wrangler dev and pressing [d] once the dev server is spun up. It can also be accessed by the editor that is built into the Cloudflare Dashboard or the Workers Playground.

Breakpoints

Breakpoints allow developers to stop code execution at specific points (lines) to evaluate what is happening. This is great for situations where you might have a race condition or times when you don’t know exactly what is happening, and your code isn’t behaving as expected. Breakpoints allow you to walk through your code line by line to see how it behaves.

You can get started with breakpoint debugging from within the Wrangler CLI by running wrangler dev and pressing [d] to open up a DevTools debugger session. If you prefer to debug via your IDE, we support VSCode and WebStorm.

Setting up VSCode
To set up VSCode to debug Cloudflare Workers with breakpoints, you’ll need to create a new .vscode/launch.json file with the following content:

{
  "configurations": [
    {
  "name": "Wrangler",
  "type": "node",
  "request": "attach",
  "port": 9229,
  "cwd": "/",
  "resolveSourceMapLocations": null,
  "attachExistingChildren": false,
  "autoAttachChildProcesses": false
    }
  ]
}

Once you’ve created this configuration in launch.json, open your project in VSCode. Open a new terminal window from VSCode, and run npx wrangler dev to start a local dev server.

At the top of the Run & Debug panel, you should see an option to select a configuration. Choose Wrangler, and select the play icon. You should see Wrangler: Remote Process [0] show up in the Call Stack panel on the left. Go back to a .js or .ts file in your project and add at least one breakpoint.

Open your browser and go to the Worker’s local URL (default http://127.0.0.1:8787). The breakpoint should be hit, and you should see details about your code at the specified line.

Setting up WebStorm
To set up WebStorm with breakpoint debugging, create a new “Attach to Node.js/Chrome” Debug Configuration, setting the port to 9229:

Run npx wrangler dev to start a local dev server, then start the Debug Configuration:

Add a breakpoint, then open your browser and go to the Worker’s local URL (default http://127.0.0.1:8787). The breakpoint should be hit, and you should see details about your code at the specified line.

How we enabled breakpoint debugging via workerd

Both workerd and Cloudflare Workers embed V8 to run workers code written in JavaScript and WASM. V8 is a component of the world’s most widely used web browser today, Google Chrome, and it is also widely used embedded into open source projects like Node.js.

The Google Chrome team has created a set of web developer tools, Chrome DevTools, that are built directly into the browser. These provide a wide range of features for inspecting, debugging, editing, and optimizing web pages. Chrome DevTools are exposed through a UI in Chrome that talks to the components of the browser, such as V8, using the Chrome DevTools Protocol (CDP). The protocol uses JSON-RPC transmitted over a websocket to exchange messages and notifications between clients, like the DevTools UI, and the components of Chrome. Within Chrome Devtools protocols are domains (DOM, Debugger, Media) that group related commands by functionality that can be implemented by different components in Chrome.

V8 supports the following CDP domains:

  • Runtime
  • Debugger
  • Profiler
  • HeapProfiler

These domains are available to all projects that embed V8, including workerd, so long as the embedding application is able to route messages between a DevTools client and V8. DevTools clients use the Debugger domain to implement debugging functionality. The Debugger domain exposes all the commands to debug an application, such as setting breakpoints. It also sends debugger events, like hitting a breakpoint, up to DevTools clients, so they can present the state of the script in a debugger UI.

While workerd has supported CDP since its first release, support for the Debugger domain is new. The Debugger domain differs from the other domains exposed by V8 because it requires the ability to suspend the execution of a script whilst it is being debugged. This presents a complication for introducing breakpoint debugging in workerd, because workerd runs each Worker in a V8 isolate in which there is just a single thread that receives incoming requests and runs the scripts associated with them.

Why is this a problem? Workerd uses an event-driven programming model and its single thread is responsible for both responding to incoming requests and for running JavaScript / WASM code. In practice, this is implemented via an event loop that sits at the bottom of the call stack that sends and receives network messages and calls event handlers that run JavaScript code. The thread needs to fall back into the event loop after running event handlers to be able to process network messages. However, the V8 API for handling breakpoints expects execution to be suspended within a method implemented by the embedder that is called from V8 when a breakpoint is hit. This method is called from the event handler that is running JavaScript in V8. Unfortunately, this prevents the workerd thread from falling back into the event loop and processing any incoming network events, including all CDP commands relating to debugging. So if a client asks to resume execution by sending a CDP command, it cannot be relayed to the executing thread because it is unable to fall into the eventloop whilst in a breakpoint.

We solved this event processing problem by adding an I/O thread to workerd. The I/O thread handles sending and receiving CDP messages, because the thread executing JavaScript can be suspended due to hitting a breakpoint or a JavaScript `debugger` statement. The I/O thread wakes the JavaScript thread when CDP commands arrive and also handles sending responses back to the CDP client. Conceptually, this was not difficult, but it required some careful synchronization to avoid dropped messages.

Use the Source

When debugging, JavaScript developers expect to see their source code in the debugger. For this to work, the embedded V8 needs to be able to locate sources. It is common for JavaScript code to be generated either by combining and minifying multiple JavaScript sources, or by transpiling to JavaScript from another language, such as TypeScript, Dart, CoffeeScript, or Elm. To render the source code in the debugger in its original form, the embedded V8 needs to know 1) where the source code came from and 2) how any given line of JavaScript visible to V8 maps back to the original sources before any transformation of the original sources was applied. The standard solution to this problem is to embed a source map into the JavaScript code that JavaScript engine runs. The embedding is performed through a special comment in the JavaScript running in the JavaScript engine:

//# sourceMappingURL=generated.js.map

This source map’s URL is resolved relative to the source URL. This can be set when instantiating a source file with the V8 API, or via another special comment:

//# sourceURL=file:///absolute/path/to/generated.js

An example source map looks something like this:

{
  "version": 3,
  "sources": ["../src/index.ts"],
  "sourcesContent": ["interface Env { ... }\n\nexport default ..."],
  "mappings": ";AAIA,IAAO,mBAA8B;AAAA,EACjC,MAAM,MAAM,SAAS,KAAK,KAAK;...",
  "names": []
}

Each of the relative paths in sources are resolved relative to the source map’s fully-qualified URL. When DevTools connects to V8 and enables the Debugger domain, V8 will send information on all parsed scripts including the source map’s fully-qualified URL. In our example, this would be file:///absolute/path/to/generated.js.map. DevTools needs to fetch this URL along with source URLs to perform source mapping. Unfortunately, our patched version of DevTools is hosted at https://devtools.devprod.cloudflare.dev/, and browsers prohibit fetching file:// URLs from non-file:// origins for security reasons. However, we need to use file:// URLs so IDEs like Visual Studio Code can match up source files from source maps to files on disk. To get around this, we used Wrangler’s inspector proxy to rewrite the CDP script parsed messages sent by V8 to use a different protocol if the User-Agent of the inspector WebSocket handshake is a browser.

Now that we can set breakpoints and fetch source maps, DevTools works as normal. When a user tries to set a breakpoint in an original source file, DevTools will use the map’s mappings to find the location in the generated JavaScript file and set a breakpoint there. This is the opposite problem to source mapping error stack traces. When V8 hits this JavaScript breakpoint, DevTools will pause on the location in the original source file. Stepping through the source file requires mapping the stepped-over segment to generated code, sending the step-over command to V8, then mapping back the new paused location in generated code to the original source file.

(sequence diagram showing process of setting, hitting, and stepping over breakpoints) (mermaid URL)

Future Work

Both the Visual Studio Code and WebStorm configurations for breakpoint debugging require attaching to an existing dev server. It would be great if your IDE could launch the dev server too, and automatically attach to it.

When you debug a Node.js program in Visual Studio Code or WebStorm, an additional --require hook is added to the NODE_OPTIONS environment variable. This hook registers the process’s inspector URL with the editor over a well-known socket. This means if your Node.js process spawns another Node.js child process, your editor will debug that child process too. This is how Visual Studio Code’s JavaScript Debug Terminal works, and is how editors can debug Node.js processes started by npm scripts.

Our plan is to detect this --require hook, and register workerd child processes started by Wrangler and Miniflare. This will mean you can debug npm launch tasks, without having to worry about starting the dev server and then attaching to it.

Start debugging!

All the debugging tools listed above are ready to be used today. Logs and DevTools can be accessed either by logging into the Cloudflare dashboard or by downloading Wrangler, the command-line tool for the Cloudflare Developer Platform. Breakpoint debugging and Node-style logging is built into the latest version of Wrangler, and can be accessed by running npx wrangler@latest dev in a terminal window. Let us know what you think in the #wrangler channel on the Cloudflare Developers Discord, and please open a GitHub issue if you hit any unexpected behavior.

What’s Up, Home? – Monitor your ad blocker with Zabbix

Post Syndicated from Janne Pikkarainen original https://blog.zabbix.com/whats-up-home-monitor-your-ad-blocker-with-zabbix/26912/

Can you monitor your ad blocker with Zabbix? Of course, you can!

API defines it all

My home Asus router is running on Asuswrt-Merlin firmware, and with that, I have AdGuard Home ad blocker.

As AdGuard Home has an API, monitoring it with Zabbix is trivial.

Communicate with the API

Communicating with AdGuard Home API is easy: pass it Authorisation: Basic XXXXXXXXXXXX header, where XXXXXXXXXX is just a Base64 hash of your AdGuard username and password. You can generate that Base64 snippet with for example

echo -n "myuser:mypassword" | base64

Next, in Zabbix, create a new HTTP Agent type item, and point it to your AdGuard Home instance.

Create some items

You’ll get the info back as JSON, so next you can create some dependent items and start monitoring. I only added

  • Total number of DNS requests
  • Blocked # of DNS requests
  • Redirects to safe search
  • Parental advisory stuff
  • Average request processing time

For the dependent items, you’ll then just do some JSONPath processing.

Add triggers

Next, I added a few triggers to alert me if AdGuard starts to run slower than usual.

Add service

Finally, I added AdGuard as a new business service, so I’ll get an SLA for it.

And that’s it! From now on I’ll know more about how well my home router ad-blocker is working. (Well, it also has a Skynet firewall which probably filters stuff before AdGuard Home, but that’s another story….)

This post was originally published on the author’s page.

The post What’s Up, Home? – Monitor your ad blocker with Zabbix appeared first on Zabbix Blog.

Improve performance of workloads containing repetitive scan filters with multidimensional data layout sort keys in Amazon Redshift

Post Syndicated from Yanzhu Ji original https://aws.amazon.com/blogs/big-data/improve-performance-of-workloads-containing-repetitive-scan-filters-with-multidimensional-data-layout-sort-keys-in-amazon-redshift/

Amazon Redshift, the most widely used cloud data warehouse, has evolved significantly to meet the performance requirements of the most demanding workloads. This post covers one such new feature—the multidimensional data layout sort key.

Amazon Redshift now improves your query performance by supporting multidimensional data layout sort keys, which is a new type of sort key that sorts a table’s data by filter predicates instead of physical columns of the table. Multidimensional data layout sort keys will significantly improve the performance of table scans, especially when your query workload contains repetitive scan filters.

Amazon Redshift already provides the capability of automatic table optimization (ATO), which automatically optimizes the design of tables by applying sort and distribution keys without the need for administrator intervention. In this post, we introduce multidimensional data layout sort keys as an additional capability offered by ATO and fortified by Amazon Redshift’s sort key advisor algorithm.

Multidimensional data layout sort keys

When you define a table with the AUTO sort key, Amazon Redshift ATO will analyze your query history and automatically select either a single-column sort key or multidimensional data layout sort key for your table, based on which option is better for your workload. When multidimensional data layout is selected, Amazon Redshift will construct a multidimensional sort function that co-locates rows that are typically accessed by the same queries, and the sort function is subsequently used during query runs to skip data blocks and even skip scanning the individual predicate columns.

Consider the following user query, which is a dominant query pattern in the user’s workload:

SELECT season, sum(metric2) AS "__measure__0"
FROM titles
WHERE lower(subregion) like '%United States%'
GROUP BY 1
ORDER BY 1;

Amazon Redshift stores data for each column in 1 MB disk blocks and stores the minimum and maximum values in each block as part of the table’s metadata. If a query uses a range-restricted predicate, Amazon Redshift can use the minimum and maximum values to rapidly skip over large numbers of blocks during table scans. However, this query’s filter on the subregion column can’t be used to determine which blocks to skip based on minimum and maximum values, and as a result, Amazon Redshift scans all rows from the titles table:

SELECT table_name, input_rows, step_attribute
FROM sys_query_detail
WHERE query_id = 123456789;

When the user’s query was run with titles using a single-column sort key on subregion, the result of the preceding query is as follows:

  table_name | input_rows | step_attribute
-------------+------------+---------------
  titles     | 2164081640 | 
(1 rows)

This shows that the table scan read 2,164,081,640 rows.

To improve scans on the titles table, Amazon Redshift might automatically decide to use a multidimensional data layout sort key. All rows that satisfy the lower(subregion) like '%United States%' predicate would be co-located to a dedicated region of the table, and therefore Amazon Redshift will only scan data blocks that satisfy the predicate.

When the user’s query is run with titles using a multidimensional data layout sort key that includes lower(subregion) like '%United States%' as a predicate, the result of the sys_query_detail query is as follows:

  table_name | input_rows | step_attribute
-------------+------------+---------------
  titles     | 152324046  | multi-dimensional
(1 rows)

This shows that the table scan read 152,324,046 rows, which is only 7% of the original, and it used the multidimensional data layout sort key.

Note that this example uses a single query to showcase the multidimensional data layout feature, but Amazon Redshift will consider all the queries running against the table and can create multiple regions to satisfy the most commonly run predicates.

Let’s take another example, with more complex predicates and multiple queries this time.

Imagine having a table items (cost int, available int, demand int) with four rows as shown in the following example.

#id cost available demand
1 4 3 3
2 2 23 6
3 5 4 5
4 1 1 2

Your dominant workload consists of two queries:

  • 70% queries pattern:
    select * from items where cost > 3 and available < demand

  • 20% queries pattern:
    select avg(cost) from items where available < demand

With traditional sorting techniques, you might choose to sort the table over the cost column, such that the evaluation of cost > 3 will benefit from the sort. So, the items table after sorting using a single cost column will look like the following.

#id cost available demand
Region #1, with cost <= 3
Region #2, with cost > 3
#id cost available demand
4 1 1 2
2 2 23 6
1 4 3 3
3 5 4 5

By using this traditional sort, we can immediately exclude the top two (blue) rows with ID 4 and ID 2, because they don’t satisfy cost > 3.

On the other hand, with a multidimensional data layout sort key, the table will be sorted based on a combination of the two commonly occurring predicates in the user’s workload, which are cost > 3 and available < demand. As a result, the table’s rows are sorted into four regions.

#id cost available demand
Region #1, with cost <= 3 and available < demand
Region #2, with cost <= 3 and available >= demand
Region #3, with cost > 3 and available < demand
Region #4, with cost > 3 and available >= demand
#id cost available demand
4 1 1 2
2 2 23 6
3 5 4 5
1 4 3 3

This concept is even more powerful when applied to entire blocks instead of single rows, when applied to complex predicates that use operators not suitable for traditional sorting techniques (such as like), and when applied to more than two predicates.

System tables

The following Amazon Redshift system tables will show users if multidimensional data layouts are used on their tables and queries:

  • To determine if a particular table is using a multidimensional data layout sort key, you can check whether sortkey1 in svv_table_info is equal to AUTO(SORTKEY(padb_internal_mddl_key_col)).
  • To determine if a particular query uses multidimensional data layout to accelerate table scans, you can check step_attribute in the sys_query_detail view. The value will be equal to multi-dimensional if the table’s multidimensional data layout sort key was used during the scan.

Performance benchmarks

We performed internal benchmark testing for multiple workloads with repetitive scan filters and see that introducing multidimensional data layout sort keys produced the following results:

  • A 74% total runtime reduction compared to having no sort key.
  • A 40% total runtime reduction compared to having the best single-column sort key on each table.
  • A 80% reduction in total rows read from tables compared to having no sort key.
  • A 47% reduction in total rows read from tables compared to having the best single-column sort key on each table.

Feature comparison

With the introduction of multidimensional data layout sort keys, your tables can now be sorted by expressions based off of the commonly occurring filter predicates in your workload. The following table provides a feature comparison for Amazon Redshift against two competitors.

Feature Amazon Redshift Competitor A Competitor B
Support for sorting on columns Yes Yes Yes
Support for sorting by expression Yes Yes No
Automatic column selection for sorting Yes No Yes
Automatic expressions selection for sorting Yes No No
Automatic selection between columns sorting or expressions sorting Yes No No
Automatic use of sorting properties for expressions during scans Yes No No

Considerations

Keep in mind the following when using a multidimensional data layout:

  • Multidimensional data layout is enabled when you set your table as SORTKEY AUTO.
  • Amazon Redshift Advisor will automatically choose either a single-column sort key or multidimensional data layout for the table by analyzing your historical workload.
  • Amazon Redshift ATO adjusts the multidimensional data layout sorting results based on the manner in which ongoing queries interact with the workload.
  • Amazon Redshift ATO maintains multidimensional data layout sort keys the same way as it currently does for existing sort keys. Refer to Working with automatic table optimization for more details on ATO.
  • Multidimensional data layout sort keys will work with both provisioned clusters and serverless workgroups.
  • Multidimensional data layout sort keys will work with your existing data as long as the AUTO SORTKEY is enabled on your table and a workload with repetitive scan filters is detected. The table will be reorganized based on the results of multi-dimensional sort function.
  • To disable multidimensional data layout sort keys for a table, use alter table: ALTER TABLE table_name ALTER SORTKEY NONE. This disables the AUTO sort key feature on the table.
  • Multidimensional data layout sort keys are preserved when restoring or migrating your provisioned cluster to a serverless cluster or vice versa.

Conclusion

In this post, we showed that multidimensional data layout sort keys can significantly improve query runtime performance for workloads where dominant queries have repetitive scan filters.

To create a preview cluster from the Amazon Redshift console, navigate to the Clusters page and choose Create preview cluster. You can create a cluster in the US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), Europe (Ireland), and Europe (Stockholm) Regions and test your workloads.

We would love to hear your feedback on this new feature and look forward to your comments on this post.


About the authors

Yanzhu Ji is a Product Manager in the Amazon Redshift team. She has experience in product vision and strategy in industry-leading data products and platforms. She has outstanding skill in building substantial software products using web development, system design, database, and distributed programming techniques. In her personal life, Yanzhu likes painting, photography, and playing tennis.

Milind Oke is a Data Warehouse Specialist Solutions Architect based out of New York. He has been building data warehouse solutions for over 15 years and specializes in Amazon Redshift.

Jialin Ding is an Applied Scientist in the Learned Systems Group, specializing in applying machine learning and optimization techniques to improve the performance of data systems such as Amazon Redshift.

Reserve quantum computers, get guidance and cutting-edge capabilities with Amazon Braket Direct

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/reserve-quantum-computers-get-expertise-and-cutting-edge-capabilities-with-amazon-braket-direct/

Today, we are announcing the availability of Braket Direct, a new Amazon Braket program that helps quantum researchers dive deeper into quantum computing. This program lets you get dedicated, private access to the full capacity of various quantum processing units (QPUs) without any queues or wait times, connect with quantum computing specialists to receive expert guidance for your workloads, and get early access to features and devices with limited availability to conduct cutting-edge research on today’s noisy quantum devices.

Since its launch in 2020, Amazon Braket has democratized access to quantum computing by offering on-demand access to various QPUs using shared, public availability windows, where you only pay for the duration of your reservation.

You can now use Braket Direct to reserve the entire dedicated machine for a period of time on IonQ Aria, QuEra Aquila, and Rigetti Aspen-M-3 devices for running your most complex, long-running, time-sensitive workloads, or conducting live events such as training workshops and hackathons, where you pay only for what you reserve.

To further your research, you can now engage directly with Braket’s experts through free office hours or one-on-one, hands-on reservation prep sessions. For deeper research collaborations, you can connect with specialists from quantum hardware providers such as IonQ, Oxford Quantum Circuits, QuEra, Rigetti, or Amazon Quantum Solutions Lab, our dedicated professional services team.

Finally, to truly push the boundaries, you can gain access to experimental capabilities that have limited or reduced availability starting with IonQ’s highest fidelity, 30-qubit Forte device.

Braket Direct expands on our commitment to accelerate research and innovation in quantum computing without requiring any upfront fees or long-term commitments.

Getting started with Braket Direct
To get started, go to the Amazon Braket console and choose Braket Direct in the left pane. You can see new features such as quantum hardware reservation, expert advice and get access to next-generation quantum hardware and features.

1. Request a quantum hardware reservation
To create a reservation, choose Reserve device and select the Device that you would like to reserve. Provide your contact information, including your name and email address, any details about the workload that you would like to execute using your reservation, such as desired reservation length, relevant constraints, and desired schedule.

Braket Direct assures that you have the full capacity of the QPU during your reservation and the predictability that your workloads will execute when your reservation begins.

If you are interested in connecting with a Braket expert for a one-on-one reservation prep session after your reservation is confirmed, you can select that option at no additional cost.

Choose Submit to complete your reservation request. A Braket team member will email you within 2–3 business days, pending request verification. To make the most of your reservation, you can choose to pre-create your tasks and jobs prior to a reservation to maximize use of the time.

To learn more about your quantum tasks and hybrid jobs to execute in a device reservation, see Get started with Braket Direct in the AWS documentation.

2. Get support from quantum computing experts
You can get in touch with quantum experts and get advice about your workload. With Braket office hours, Braket experts can help you go from ideation to execution faster at no additional cost. Explore your device to fit your use case, identify options to make best use of Braket for your algorithm, and get recommendations on how to use certain Braket features like Hybrid Jobs, Braket Pulse, or Analog Hamiltonian Simulation.

To book an upcoming Braket office hours slot, choose Sign up and fill out your contact information, workload details, and any desired discussion topics. You will receive a calendar invitation to the next available slot by email.

To take advantage of experts from quantum hardware providers, choose Connect and browse their professional services listings on AWS Marketplace.

The Amazon Quantum Solutions Lab is a collaborative research and professional services team staffed with quantum computing experts who can assist you in more effectively exploring quantum computing, engaging in quantum research, and assessing the current performance of this technology. To contact the Quantum Solutions Lab, select Connect and fill out contact information and use case details. The team will email you with next steps.

3. Access to cutting-edge capabilities
To move your research quicker, you can get early access to innovative new capabilities. With Braket Direct, you can easily request access to cutting-edge capabilities, such as new quantum devices with limited availability, directly in the Braket console. Today, you can get reservation-only access to IonQ’s highest-fidelity Forte QPU. Due to its limited availability, this device is currently only available through Braket Direct reservations.

Now available
Braket Direct is now generally available in all AWS Regions where Amazon Braket is available. To learn more, see the Braket Direct page and pricing page.

Give it a try and send feedback to AWS re:Post for Amazon Braket, Quantum Computing Stack Exchange, or through your usual AWS Support contacts.

Channy

AWS Step Functions Workflow Studio is now available in AWS Application Composer

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-step-functions-workflow-studio-is-now-available-in-aws-application-composer/

Today, we’re announcing that AWS Step Functions Workflow Studio is now available in AWS Application Composer. This new integration brings together the development of workflows and application resources into a unified visual infrastructure as code (IaC) builder.

Now, you can have a seamless transition between authoring workflows with AWS Step Functions Workflow Studio and defining resources with AWS Application Composer. This announcement allows you to create and manage all resources at any stage of your development journey. You can visualize the full application in AWS Application Composer, then zoom into the workflow details with AWS Step Functions Workflow Studio—all within a single interface.

Seamlessly build workflow and modern application
To help you design and build modern applications, we launched AWS Application Composer in March 2023. With AWS Application Composer, you can use a visual builder to compose and configure serverless applications from AWS services backed by deployment-ready IaC.

In various use cases of building modern applications, you may also need to orchestrate microservices, automate mission-critical business processes, create event-driven applications that respond to infrastructure changes, or build machine learning (ML) pipelines. To solve these challenges, you can use AWS Step Functions, a fully managed service that makes it easier to coordinate distributed application components using visual workflows. To simplify workflow development, in 2021 we introduced AWS Step Functions Workflow Studio, a low-code visual tool for rapid workflow prototyping and development across 12,000+ API actions from over 220 AWS services.

While AWS Step Functions Workflow Studio brings simplicity to building workflows, customers that want to deploy workflows using IaC had to manually define their state machine resource and migrate their workflow definitions to the IaC template.

Better together: AWS Step Functions Workflow Studio in AWS Application Composer
With this new integration, you can now design AWS Step Functions workflows in AWS Application Composer using a drag-and-drop interface. This accelerates the path from prototyping to production deployment and iterating on existing workflows.

You can start by composing your modern application with AWS Application Composer. Within the canvas, you can add a workflow by adding an AWS Step Functions state machine resource. This new capability provides you with the ability to visually design and build a workflow with an intuitive interface to connect workflow steps to resources.

How it works
Let me walk you through how you can use AWS Step Functions Workflow Studio in AWS Application Composer. For this demo, let’s say that I need to improve handling e-commerce transactions by building a workflow and integrating with my existing serverless APIs.

First, I navigate to AWS Application Composer. Because I already have an existing project that includes application code and IaC templates from AWS Application Composer, I don’t need to build anything from scratch.

I open the menu and select Project folder to open the files in my local development machine.

Then, I select the path of my local folder, and AWS Application Composer automatically detects the IaC template that I currently have.

Then, AWS Application Composer visualizes the diagram in the canvas. What I really like about using this approach is that AWS Application Composer activates Local sync mode, which automatically syncs and saves any changes in IaC templates into my local project.

Here, I have a simple serverless API running on Amazon API Gateway, which invokes an AWS Lambda function and integrates with Amazon DynamoDB.

Now, I’m ready to make some changes to my serverless API. I configure another route on Amazon API Gateway and add AWS Step Functions state machine to start building my workflow.

When I configure my Step Functions state machine, I can start editing my workflow by selecting Edit in Workflow Studio.

This opens Step Functions Workflow Studio within the AWS Application Composer canvas. I have the same experience as Workflow Studio in the AWS Step Functions console. I can use the canvas to add actions, flows , and patterns into my Step Functions state machine.

I start building my workflow, and here’s the result that I exported using Export PNG image in Workflow Studio.

But here’s where this new capability really helps me as a developer. In the workflow definition, I use various AWS resources, such as AWS Lambda functions and Amazon DynamoDB. If I need to reference the AWS resources I defined in AWS Application Composer, I can use an AWS CloudFormation substitution.

With AWS CloudFormation substitutions, I can add a substitution using an AWS CloudFormation convention, which is a dynamic reference to a value that is provided in the IaC template. I am using a placeholder substitution here so I can map it with an AWS resource in the AWS Application Composer canvas in a later step.

I can also define the AWS CloudFormation substitution for my Amazon DynamoDB table.

At this stage, I’m happy with my workflow. To review the Amazon States Language as my AWS Step Functions state machine definition, I can also open the Code tab. Now I don’t need to manually copy and paste this definition into IaC templates. I only need to save my work and choose Return to Application Composer.

Here, I can see that my AWS Step Functions state machine is updated both in the visual diagram and in the state machine definition section.

If I scroll down, I will find AWS Cloudformation Definition Substitutions for resources that I defined in Workflow Studio. I can manually replace the mapping here, or I can use the canvas.

To use the canvas, I simply drag and drop the respective resources in my Step Functions state machine and in the Application Composer canvas. Here, I connect the Inventory Process task state with a new AWS Lambda function. Also, my Step Functions state machine tasks can reference existing resources.

When I choose Template, the state machine definition is integrated with other AWS Application Composer resources. With this IaC template I can easily deploy using AWS Serverless Application Model Command Line Interface (AWS SAM CLI) or CloudFormation.

Things to know
Here is some additional information for you:

Pricing – The AWS Step Functions Workflow Studio in AWS Application Composer comes at no additional cost.

Availability – This feature is available in all AWS Regions where Application Composer is available.

AWS Step Functions Workflow Studio in AWS Application Composer provides you with an easy-to-use experience to integrate your workflow into modern applications. Get started and learn more about this feature on the AWS Application Composer page.

Happy building!
— Donnie

Amazon CodeCatalyst introduces custom blueprints and a new enterprise tier

Post Syndicated from Irshad Buchh original https://aws.amazon.com/blogs/aws/amazon-codecatalyst-introduces-custom-blueprints-and-a-new-enterprise-tier/

Today, I’m excited to introduce the new Amazon CodeCatalyst enterprise tier and custom blueprints.

Amazon CodeCatalyst enterprise tier is a new pricing tier that offers features like custom blueprints and project lifecycle management. The enterprise tier is $20/user per month, and each enterprise tier space gets 1,500 compute minutes, 160 Dev Environment hours, and 64GB of Dev Environment storage per paying user. You can use custom blueprints to define best practices for your application code, workflows, and infrastructure. You can publish these blueprints to your CodeCatalyst space, utilizing them for project creation or applying standards to existing projects.

Blueprints help you set up projects in minutes so you can get to work on code immediately. With just a few clicks, you can set up project files and configure built-in, fully integrated tools (for example, source repository, issue management, and continuous integration and delivery (CI/CD) pipeline) with best practices for your particular type of project. You can swap in popular tools like GitHub if needed, while maintaining the unified experience. You spend less time on building, integrating, or operating developer tools over the project’s lifetime.

With custom blueprints, you can define various elements of your CodeCatalyst project, like workflow definitions, infrastructure as code (IaC), and application code. When custom blueprints are updated, those changes are reflected in any project using the blueprint as a pull request update. This streamlined process reduces overhead in setting up your projects and ensures that best practices are consistently applied across your projects. As an admin, you can easily view details about which projects are using each blueprint in your CodeCatalyst space, giving you visibility into how standards are being applied across your projects.

Creating a custom blueprint in CodeCatalyst
I open the CodeCatalyst console and then I navigate to my space. On the Settings tab, I choose Blueprints on the left navigation pane, and then I select Create blueprint. At this point, I codify my best practices in this blueprint. When ready, I’ll publish it back to my space so my teams can use it to create projects.

create-blueprint

After I publish my blueprint, I can view and manage it in my CodeCatalyst space in the Settings tab. In the left panel, I choose Blueprints. Then I select Space blueprints.

manage-blueprint

I select Create project > Space blueprints to create a CodeCatalyst project from my custom blueprint.

create-project

Availability

The Amazon CodeCatalyst enterprise tier is available in US West (Oregon) and Europe (Ireland) Regions, but you can deploy to any commercial Region. To learn more, visit the Amazon CodeCatalyst webpage.

Go build!

— Irshad

Amazon ElastiCache Serverless for Redis and Memcached is now available

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/amazon-elasticache-serverless-for-redis-and-memcached-now-generally-available/

Today, we are announcing the availability of Amazon ElastiCache Serverless, a new serverless option that allows customers to create a cache in under a minute and instantly scale capacity based on application traffic patterns. ElastiCache Serverless is compatible with two popular open-source caching solutions, Redis and Memcached.

You can use ElastiCache Serverless to operate a cache for even the most demanding workloads without spending time in capacity planning or requiring caching expertise. ElastiCache Serverless constantly monitors your application’s memory, CPU, and network resource utilization and scales instantly to accommodate changes to the access patterns of workloads it serves. You can create a highly available cache with data automatically replicated across multiple Availability Zones and up to 99.99 percent availability Service Level Agreement (SLA) for all workloads, which saves you time and money.

Customers wanted to get radical simplicity to deploy and operate a cache. ElastiCache Serverless offers a simple endpoint experience abstracting the underlying cluster topology and cache infrastructure. You can reduce application complexity and have more operational excellence without handling reconnects and rediscovering nodes.

With ElastiCache Serverless, there are no upfront costs, and you pay for only the resources you use. You pay for the amount of cache data storage and ElastiCache Processing Units (ECPUs) resources consumed by your applications.

Getting started with Amazon ElastiCache Serverless
To get started, go to the ElastiCache console and choose Redis caches or Memcached caches in the left navigation pane. ElastiCache Serverless supports engine versions of Redis 7.1 or higher and Memcached 1.6 or higher.

For example, in the case of Redis caches, choose Create Redis cache.

You see two deployment options: either Serverless or Design your own cache to create a node-based cache cluster. Choose the Serverless option, the New cache method, and provide a name.

Use the default settings to create a cache in your default VPC, Availability Zones, service-owned encryption key, and security groups. We will automatically set recommended best practices. You don’t have to enter any additional settings.

If you want to customize default settings, you can set your own security groups, or enable automatic backups. You can also set maximum limits for your compute and memory usage to ensure your cache doesn’t grow beyond a certain size. When your cache reaches the memory limit, keys with a time to live (TTL) are evicted according to the least recently used (LRU) logic. When your compute limit is reached, ElastiCache will throttle requests, which will lead to elevated request latencies.

When you create a new serverless cache, you can see the details of settings for connectivity and data protection, including an endpoint and network environment.

Now, you can configure the ElastiCache Serverless endpoint in your application and connect using any Redis client that supports Redis in cluster mode, such as redis-cli.

$ redis-cli -h channy-redis-serverless.elasticache.amazonaws.com --tls -c -p 6379
set x Hello
OK
get x
"Hello"

You can manage the cache using AWS Command Line Interface (AWS CLI) or AWS SDKs. For more information, see Getting started with Amazon ElastiCache for Redis in the AWS documentation.

If you have an existing Redis cluster, you can migrate your data to ElastiCache Serverless by specifying the ElastiCache backups or Amazon S3 location of a backup file in a standard Redis rdb file format when creating your ElastiCache Serverless cache.

For a Memcached cache, you can create and use a new serverless cache in the same way as Redis.

If you use ElastiCache Serverless for Memcached, there are significant benefits of high availability and instant scaling because they are not natively available in the Memcached engine. You no longer have to write custom business logic, manage multiple caches, or use a third-party proxy layer to replicate data to get high availability with Memcached. Now you can get up to 99.99 percent availability SLA and data replication across multiple Availability Zones.

To connect to the Memcached endpoint, run the openssl client and Memcached commands as shown in the following example output:

$ /usr/bin/openssl s_client -connect channy-memcached-serverless.cache.amazonaws.com:11211 -crlf 
set a 0 0 5
hello
STORED
get a
VALUE a 0 5
hello
END

For more information, see Getting started with Amazon ElastiCache Serverless for Memcached in the AWS documentation.

Scaling and performance
ElastiCache Serverless scales without downtime or performance degradation to the application by allowing the cache to scale up and initiating a scale-out in parallel to meet capacity needs just in time.

To show ElastiCache Serverless’ performance we conducted a simple scaling test. We started with a typical Redis workload with an 80/20 ratio between reads and writes with a key size of 512 bytes. Our Redis client was configured to Read From Replica (RFR) using the READONLY Redis command, for optimal read performance. Our goal is to show how fast workloads can scale on ElastiCache Serverless without any impact on latency.

As you can see in the graph above, we were able to double the requests per second (RPS) every 10 minutes up until the test’s target request rate of 1M RPS. During this test, we observed that p50 GET latency remained around 751 microseconds and at all times below 860 microseconds. Similarly, we observed p50 SET latency remained around 1,050 microseconds, not crossing the 1,200 microseconds even during the rapid increase in throughput.

Things to know

  • Upgrading engine version – ElastiCache Serverless transparently applies new features, bug fixes, and security updates, including new minor and patch engine versions on your cache. When a new major version is available, ElastiCache Serverless will send you a notification in the console and an event in Amazon EventBridge. ElastiCache Serverless major version upgrades are designed for no disruption to your application.
  • Performance and monitoring – ElastiCache Serverless publishes a suite of metrics to Amazon CloudWatch, including memory usage (BytesUsedForCache), CPU usage (ElastiCacheProcessingUnits), and cache metrics, including CacheMissRate, CacheHitRate, CacheHits, CacheMisses, and ThrottledRequests. ElastiCache Serverless also publishes Amazon EventBridge events for significant events, including cache creation, deletion, and limit updates. For a full list of available metrics and events, see the documentation.
  • Security and compliance – ElastiCache Serverless caches are accessible from within a VPC. You can access the data plane using AWS Identity and Access Management (IAM). By default, only the AWS account creating the ElastiCache Serverless cache can access it. ElastiCache Serverless encrypts all data at rest and in-transit by transport layer security (TLS) encrypting each connection to ElastiCache Serverless. You can optionally choose to limit access to the cache within your VPCs, subnets, IAM access, and AWS Key Management Service (AWS KMS) key for encryption. ElastiCache Serverless is compliant with PCI-DSS, SOC, and ISO and is HIPAA eligible.

Now available
Amazon ElastiCache Serverless is now available in all commercial AWS Regions, including China. With ElastiCache Serverless, there are no upfront costs, and you pay for only the resources you use. You pay for cached data in GB-hours, ECPUs consumed, and Snapshot storage in GB-months.

To learn more, see the ElastiCache Serverless page and the pricing page. Give it a try, and please send feedback to AWS re:Post for Amazon ElastiCache or through your usual AWS support contacts.

Channy

The collective thoughts of the interwebz