Tag Archives: storage

Security is the top priority for Amazon S3

Post Syndicated from Maddie Bacon original https://aws.amazon.com/blogs/security/security-is-the-top-priority-for-amazon-s3/

Amazon Simple Storage Service (Amazon S3) launched 15 years ago in March 2006, and became the first generally available service from Amazon Web Services (AWS). AWS marked the fifteenth anniversary with AWS Pi Week—a week of in-depth streams and live events. During AWS Pi Week, AWS leaders and experts reviewed the history of AWS and Amazon S3, and some of the key decisions involved in building and evolving S3.

As part of this celebration, Werner Vogels, VP and CTO for Amazon.com, and Eric Brandwine, VP and Distinguished Engineer with AWS Security, had a conversation about the role of security in Amazon S3 and all AWS services. They touched on why customers come to AWS, and how AWS services grow with customers by providing built-in security that can progress to protections that are more complex, based on each customer’s specific needs. They also touched on how, starting with Amazon S3 over 15 years ago and continuing to this day, security is the top priority at AWS, and how nothing can proceed at AWS without security that customers can rely on.

“In security, there are constantly challenging tradeoffs,” Eric says. “The path that we’ve taken at AWS is that our services are usable, but secure by default.”

To learn more about how AWS helps secure its customers’ systems and information through a culture of security first, watch the video, and be sure to check out AWS Pi Week 2021: The Birth of the AWS Cloud.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.


Maddie Bacon

Maddie (she/her) is a technical writer for AWS Security with a passion for creating meaningful, inclusive content. She previously worked as a security reporter and editor at TechTarget and has a BA in Mathematics. In her spare time, she enjoys reading, traveling, and all things Harry Potter.

File Access Auditing Is Now Available for Amazon FSx for Windows File Server

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/file-access-auditing-is-now-available-for-amazon-fsx-for-windows-file-server/

Amazon FSx for Windows File Server provides fully managed file storage that is accessible over the industry-standard Server Message Block (SMB) protocol. It is built on Windows Server and offers a rich set of enterprise storage capabilities with the scalability, reliability, and low cost that you have come to expect from AWS.

In addition to key features such as user quotas, end-user file restore, and Microsoft Active Directory integration, the team has now added support for the auditing of end-user access on files, folders, and file shares using Windows event logs.

Introducing File Access Auditing
File access auditing allows you to send logs to a rich set of other AWS services so that you can query, process, and store your logs. By using file access auditing, enterprise storage administrators and compliance auditors can meet security and compliance requirements while eliminating the need to manage storage as logs grow over time. File access auditing will be particularly important to regulated customers such as those in the financial services and healthcare industries.

You can choose a destination for publishing audit events in the Windows event log format. The destination options are logging to Amazon CloudWatch Logs or streaming to Amazon Kinesis Data Firehose. From there, you can view and query logs in CloudWatch Logs, archive logs to Amazon Simple Storage Service (Amazon S3), or use AWS Partner solutions, such as Splunk and Datadog, to monitor your logs.

You can also set up Lambda functions that are triggered by new audit events. For example, you can configure AWS Lambda and Amazon CloudWatch alarms to send a notification to data security personnel when unauthorized access occurs.

Using File Access Auditing on a New File System
To enable file access auditing on a new file system, I head over to the Amazon FSx console and choose Create file system. On the Select file system type page, I choose Amazon FSx for Windows File Server, and then configure other settings for the file system. To use the auditing feature, Throughput capacity must be at least 32 MB/s, as shown here:

Screenshot of creating a file system

In Auditing, I see that File access auditing is turned on by default. In Advanced, for Choose an event log destination, I can change the destination for publishing user access events. I choose CloudWatch Logs and then choose a CloudWatch Logs log group in my account.

Screenshot of the Auditing options

After my file system has been created, I launch a new Amazon Elastic Compute Cloud (Amazon EC2) Instance and join it to my Active directory. When the instance is available, I connect to it using a remote desktop client. I open File Explorer and follow the documentation to map my new file system.

Screenshot of the file system once mapped

I open the file system in Windows Explorer and then right-click and select Properties. I choose Security, Advanced, and Auditing and then choose Add to add a new auditing entry. On the page for the auditing entry, in Principal, I click Select a principal. This is who I will be auditing. I choose Everyone. Next, for Type, I select the type of auditing I want (Success/Fail/All). Under Basic permissions, I select Full control for the permissions I want to audit for.

Screenshot of auditing options on a file share

Now that auditing is set up, I create some folders and create and modify some files. All this activity is now being audited, and the logs are being sent to CloudWatch Logs.

Screenshot of a file share, where some files and folders have been created

In the CloudWatch Logs Insights console, I can start to query the audit logs. Below you can see how I ran a simple query that finds all the logs associated with a specific file.

Screenshot of AWS CloudWatch Logs Insights

Continued Momentum
File access auditing is one of many features the team has launched in recent years, including: Self-Managed Directories, Native Multi-AZ File Systems, Support for SQL Server, Fine-Grained File Restoration, On-Premises Access, a Remote Management CLI, Data Deduplication, Programmatic File Share Configuration, Enforcement of In-Transit Encryption, Storage Size and Throughput Capacity Scaling, and Storage Quotas.

File access auditing is free on Amazon FSx for Windows File Server. Standard pricing applies for the use of Amazon CloudWatch Logs, Amazon Kinesis Data Firehose, any downstream AWS services such as Amazon Redshift, S3, or AWS Lambda, and any AWS Partner solutions like Splunk and Datadog.

Available Today
File access auditing is available today for all new file systems in all AWS Regions where Amazon FSx for Windows File Server is available. Check our documentation for more details.

— Martin

Forwarding emails automatically based on content with Amazon Simple Email Service

Post Syndicated from Murat Balkan original https://aws.amazon.com/blogs/messaging-and-targeting/forwarding-emails-automatically-based-on-content-with-amazon-simple-email-service/


Email is one of the most popular channels consumers use to interact with support organizations. In its most basic form, consumers will send their email to a catch-all email address where it is further dispatched to the correct support group. Often, this requires a person to inspect content manually. Some IT organizations even have a dedicated support group that handles triaging the incoming emails before assigning them to specialized support teams. Triaging each email can be challenging, and delays in email routing and support processes can reduce customer satisfaction. By utilizing Amazon Simple Email Service’s deep integration with Amazon S3, AWS Lambda, and other AWS services, the task of categorizing and routing emails is automated. This automation results in increased operational efficiencies and reduced costs.

This blog post shows you how a serverless application will receive emails with Amazon SES and deliver them to an Amazon S3 bucket. The application uses Amazon Comprehend to identify the dominant language from the message body.  It then looks it up in an Amazon DynamoDB table to find the support group’s email address specializing in the email subject. As the last step, it forwards the email via Amazon SES to its destination. Archiving incoming emails to Amazon S3 also enables further processing or auditing.


By completing the steps in this post, you will create a system that uses the architecture illustrated in the following image:

Architecture showing how to forward emails by content using Amazon SES

The flow of events starts when a customer sends an email to the generic support email address like [email protected]. This email is listened to by Amazon SES via a recipient rule. As per the rule, incoming messages are written to a specified Amazon S3 bucket with a given prefix.

This bucket and prefix are configured with S3 Events to trigger a Lambda function on object creation events. The Lambda function reads the email object, parses the contents, and sends them to Amazon Comprehend for language detection.

Amazon DynamoDB looks up the detected language code from an Amazon DynamoDB table, which includes the mappings between language codes and support group email addresses for these languages. One support group could answer English emails, while another support group answers French emails. The Lambda function determines the destination address and re-sends the same email address by performing an email forward operation. Suppose the lookup does not return any destination address, or the language was not be detected. In that case, the email is forwarded to a catch-all email address specified during the application deployment.

In this example, Amazon SES hosts the destination email addresses used for forwarding, but this is not a requirement. External email servers will also receive the forwarded emails.


To use Amazon SES for receiving email messages, you need to verify a domain that you own. Refer to the documentation to verify your domain with Amazon SES console. If you do not have a domain name, you will register one from Amazon Route 53.

Deploying the Sample Application

Clone this GitHub repository to your local machine and install and configure AWS SAM with a test AWS Identity and Access Management (IAM) user.

You will use AWS SAM to deploy the remaining parts of this serverless architecture.

The AWS SAM template creates the following resources:

  • An Amazon DynamoDB mapping table (language-lookup) contains information about language codes and associates them with destination email addresses.
  • An AWS Lambda function (BlogEmailForwarder) that reads the email content parses it, detects the language, looks up the forwarding destination email address, and sends it.
  • An Amazon S3 bucket, which will store the incoming emails.
  • IAM roles and policies.

To start the AWS SAM deployment, navigate to the root directory of the repository you downloaded and where the template.yaml AWS SAM template resides. AWS SAM also requires you to specify an Amazon Simple Storage Service (Amazon S3) bucket to hold the deployment artifacts. If you haven’t already created a bucket for this purpose, create one now. You will refer to the documentation to learn how to create an Amazon S3 bucket. The bucket should have read and write access by an AWS Identity and Access Management (IAM) user.

At the command line, enter the following command to package the application:

sam package --template template.yaml --output-template-file output_template.yaml --s3-bucket BUCKET_NAME_HERE

In the preceding command, replace BUCKET_NAME_HERE with the name of the Amazon S3 bucket that should hold the deployment artifacts.

AWS SAM packages the application and copies it into this Amazon S3 bucket.

When the AWS SAM package command finishes running, enter the following command to deploy the package:

sam deploy --template-file output_template.yaml --stack-name blogstack --capabilities CAPABILITY_IAM --parameter-overrides [email protected] YOUR_DOMAIN_NAME_HERE [email protected] YOUR_DOMAIN_NAME_HERE

In the preceding command, change the YOUR_DOMAIN_NAME_HERE with the domain name you validated with Amazon SES. This domain also applies to other commands and configurations that will be introduced later.

This example uses “blogstack” as the stack name, you will change this to any other name you want. When you run this command, AWS SAM shows the progress of the deployment.

Configure the Sample Application

Now that you have deployed the application, you will configure it.

Configuring Receipt Rules

To deliver incoming messages to Amazon S3 bucket, you need to create a Rule Set and a Receipt rule under it.

Note: This blog uses Amazon SES console to create the rule sets. To create the rule sets with AWS CloudFormation, refer to the documentation.

  1. Navigate to the Amazon SES console. From the left navigation choose Rule Sets.
  2. Choose Create a Receipt Rule button at the right pane.
  3. Add [email protected]YOUR_DOMAIN_NAME_HERE as the first recipient addresses by entering it into the text box and choosing Add Recipient.



Choose the Next Step button to move on to the next step.

  1. On the Actions page, select S3 from the Add action drop-down to reveal S3 action’s details. Select the S3 bucket that was created by the AWS SAM template. It is in the format of your_stack_name-inboxbucket-randomstring. You will find the exact name in the outputs section of the AWS SAM deployment under the key name InboxBucket or by visiting the AWS CloudFormation console. Set the Object key prefix to info/. This tells Amazon SES to add this prefix to all messages destined to this recipient address. This way, you will re-use the same bucket for different recipients.

Choose the Next Step button to move on to the next step.

In the Rule Details page, give this rule a name at the Rule name field. This example uses the name info-recipient-rule. Leave the rest of the fields with their default values.

Choose the Next Step button to move on to the next step.

  1. Review your settings on the Review page and finalize rule creation by choosing Create Rule

  1. In this example, you will be hosting the destination email addresses in Amazon SES rather than forwarding the messages to an external email server. This way, you will be able to see the forwarded messages in your Amazon S3 bucket under different prefixes. To host the destination email addresses, you need to create different rules under the default rule set. Create three additional rules for [email protected]YOUR_DOMAIN_NAME_HERE , [email protected] YOUR_DOMAIN_NAME_HERE and [email protected]YOUR_DOMAIN_NAME_HERE email addresses by repeating the steps 2 to 5. For Amazon S3 prefixes, use catchall/, english/, and french/ respectively.


Configuring Amazon DynamoDB Table

To configure the Amazon DynamoDB table that is used by the sample application

  1. Navigate to Amazon DynamoDB console and reach the tables view. Inspect the table created by the AWS SAM application.

language-lookup table is the table where languages and their support group mappings are kept. You need to create an item for each language, and an item that will hold the default destination email address that will be used in case no language match is found. Amazon Comprehend supports more than 60 different languages. You will visit the documentation for the supported languages and add their language codes to this lookup table to enhance this application.

  1. To start inserting items, choose the language-lookup table to open table overview page.
  2. Select the Items tab and choose the Create item From the dropdown, select Text. Add the following JSON content and choose Save to create your first mapping object. While adding the following object, replace Destination attribute’s value with an email address you own. The email messages will be forwarded to that address.


  “language”: “en”,

  “destination”: “[email protected]_DOMAIN_NAME_HERE”


Lastly, create an item for French language support.


  “language”: “fr”,

  “destination”: “[email protected]_DOMAIN_NAME_HERE”



Now that the application is deployed and configured, you will test it.

  1. Use your favorite email client to send the following email to the domain name [email protected] email address.

Subject: I need help


Hello, I’d like to return the shoes I bought from your online store. How can I do this?

After the email is sent, navigate to the Amazon S3 console to inspect the contents of the Amazon S3 bucket that is backing the Amazon SES Rule Sets. You will also see the AWS Lambda logs from the Amazon CloudWatch console to confirm that the Lambda function is triggered and run successfully. You should receive an email with the same content at the address you defined for the English language.

  1. Next, send another email with the same content, this time in French language.

Subject: j’ai besoin d’aide


Bonjour, je souhaite retourner les chaussures que j’ai achetées dans votre boutique en ligne. Comment puis-je faire ceci?


Suppose a message is not matched to a language in the lookup table. In that case, the Lambda function will forward it to the catchall email address that you provided during the AWS SAM deployment.

You will inspect the new email objects under english/, french/ and catchall/ prefixes to observe the forwarding behavior.

Continue experimenting with the sample application by sending different email contents to [email protected] YOUR_DOMAIN_NAME_HERE address or adding other language codes and email address combinations into the mapping table. You will find the available languages and their codes in the documentation. When adding a new language support, don’t forget to associate a new email address and Amazon S3 bucket prefix by defining a new rule.


To clean up the resources you used in your account,

  1. Navigate to the Amazon S3 console and delete the inbox bucket’s contents. You will find the name of this bucket in the outputs section of the AWS SAM deployment under the key name InboxBucket or by visiting the AWS CloudFormation console.
  2. Navigate to AWS CloudFormation console and delete the stack named “blogstack”.
  3. After the stack is deleted, remove the domain from Amazon SES. To do this, navigate to the Amazon SES Console and choose Domains from the left navigation. Select the domain you want to remove and choose Remove button to remove it from Amazon SES.
  4. From the Amazon SES Console, navigate to the Rule Sets from the left navigation. On the Active Rule Set section, choose View Active Rule Set button and delete all the rules you have created, by selecting the rule and choosing Action, Delete.
  5. On the Rule Sets page choose Disable Active Rule Set button to disable listening for incoming email messages.
  6. On the Rule Sets page, Inactive Rule Sets section, delete the only rule set, by selecting the rule set and choosing Action, Delete.
  7. Navigate to CloudWatch console and from the left navigation choose Logs, Log groups. Find the log group that belongs to the BlogEmailForwarderFunction resource and delete it by selecting it and choosing Actions, Delete log group(s).
  8. You will also delete the Amazon S3 bucket you used for packaging and deploying the AWS SAM application.



This solution shows how to use Amazon SES to classify email messages by the dominant content language and forward them to respective support groups. You will use the same techniques to implement similar scenarios. You will forward emails based on custom key entities, like product codes, or you will remove PII information from emails before forwarding with Amazon Comprehend.

With its native integrations with AWS services, Amazon SES allows you to enhance your email applications with different AWS Cloud capabilities easily.

To learn more about email forwarding with Amazon SES, you will visit documentation and AWS blogs.

Netflix Drive

Post Syndicated from Netflix Technology Blog original https://netflixtechblog.com/netflix-drive-a607538c3055

A file and folder interface for Netflix Cloud Services

Written by Vikram Krishnamurthy, Kishore Kasi, Abhishek Kapatkar, Tejas Chopra, Prudhviraj Karumanchi, Kelsey Francis, Shailesh Birari

In this post, we are introducing Netflix Drive, a Cloud drive for media assets and providing a high level overview of some of its features and interfaces. We intend this to be a first post in a series of posts covering Netflix Drive. In the future posts, we will do an architectural deep dive into the several components of Netflix Drive.

Netflix, and particularly Studio applications (and Studio in the Cloud) produce petabytes of data backed by billions of media assets. Several artists and workflows that may be globally distributed, work on different projects, and each of these projects produce content that forms a part of the large corpus of assets.

Here is an example of globally distributed production where several artists and workflows work in conjunction to create and share assets for one or many projects.

Fig 1: Globally distributed production with artists working on different assets from different parts of the world

There are workflows in which these artists may want to view a subset of these assets from this large dataset, for example, pertaining to a specific project. These artists may want to create personal workspaces and work on generating intermediate assets. To support such use cases, access control at the user workspace and project workspace granularity is extremely important for presenting a globally consistent view of pertinent data to these artists.

Netflix Drive aims to solve this problem of exposing different namespaces and attaching appropriate access control to help build a scalable, performant, globally distributed platform for storing and retrieving pertinent assets.

Netflix Drive is envisioned to be a Cloud Drive for Studio and Media applications and lends itself to be a generic paved path solution for all content in Netflix.

It exposes a file/folder interface for applications to save their data and an API interface for control operations. Netflix Drive relies on a data store that will be the persistent storage layer for assets, and a metadata store which will provide a relevant mapping from the file system hierarchy to the data store entities. The major pieces, as shown in Fig. 2, are the file system interface, the API interface, and the metadata and data stores. We will delve into these in the following sections.

Fig 2: Netflix Drive components

File interface for Netflix Drive

Creative applications such as Nuke, Maya, Adobe Photoshop store and retrieve content using files and folders. Netflix Drive relies on FUSE (File System In User Space) to provide POSIX files and folders interface to such applications. A FUSE based POSIX interface provides feature customization elasticity, deployment configuration flexibility as well as a standard and seamless file/folder interface. A similar user space abstraction is available for Windows (WinFSP) and MacOS (MacFUSE)

The operations that originate from user, application and system actions on files and folders translate to a well defined set of function and system calls which are forwarded by the Linux Virtual File System Layer (or a pass-through/filter driver in Windows) to the FUSE layer in user space. The resulting metadata and data operations will be implemented by appropriate metadata and data adapters in Netflix Drive.

Fig 3: POSIX interface of Netflix Drive

The POSIX files and folders interface for Netflix Drive is designed as a layered system with the FUSE implementation hooks forming the top layer. This layer will provide entry points for all of the relevant VFS calls that will be implemented. Netflix Drive contains an abstraction layer below FUSE which allows different metadata and data stores to be plugged into the architecture by having their corresponding adapters implement the interface. We will discuss more about the layered architecture in the section below.

API Interface for Netflix Drive

Along with exposing a file interface which will be a hub of all abstractions, Netflix Drive also exposes API and Polled Task interfaces to allow applications and workflow tools to trigger control operations in Netflix Drive.

For example, applications can explicitly use REST endpoints to publish files stored in Netflix Drive to cloud, and later use a REST endpoint to retrieve a subset of the published files from cloud. The API interface can also be used to track the transfers of large files and allows other applications to be built on top of Netflix Drive.

Fig 4: Control interface of Netflix Drive

The Polled Task interface allows studio and media workflow orchestrators to post or dispatch tasks to Netflix Drive instances on disparate workstations or containers. This allows Netflix Drive to be bootstrapped with an empty namespace when the workstation comes up and dynamically project a specific set of assets relevant to the artists’ work sessions or workflow stages. Further these assets can be projected into a namespace of the artist’s or application’s choosing.

Alternatively, workstations/containers can be launched with the assets of interest prefetched at startup. These allow artists and applications to obtain a workstation which already contains relevant files and optionally add and delete asset trees during the work session. For example, artists perform transformative work on files, and use Netflix Drive to store/fetch intermediate results as well as the final copy which can be transformed back into a media asset.

Bootstrapping Netflix Drive

Given the two different modes in which applications can interact with Netflix Drive, now let us discuss how Netflix Drive is bootstrapped.

On startup, Netflix Drive expects a manifest that contains information about the data store, metadata store, and credentials (tied to a user login) to form an instance of namespace hierarchy. A Netflix Drive mount point may contain multiple Netflix Drive namespaces.

A dynamic instance allows Netflix Drive to show a user-selected and user-accessible subset of data from a large corpus of assets. A user instance allows it to act like a Cloud Drive, where users can work on content which is automatically synced in the background periodically to Cloud. On restart on a new machine, the same files and folders will be prefetched from the cloud. We will cover the different namespaces of Netflix Drive in more detail in a subsequent blog post.

Here is an example of a typical bootstrap manifest file.

This image shows a bootstrap manifest json which highlights how Netflix Drive can work with different metadata stores (such as Redis, CockroachDB), and data stores (such as Ceph, S3) and tie them together to provide persistence layer for assets
A sample manifest file.

The manifest is a persistent artifact which renders a user workstation its Netflix Drive personality. It survives instance failures and is able to recreate the same stateful interface on any newly deployed instance.

Metadata and Data Store Abstractions

In order to allow a variety of different metadata stores and data stores to be easily plugged into the architecture, Netflix Drive exposes abstract interfaces for both metadata and data stores. Here is a high level diagram explaining the different layers of abstractions in Netflix Drive

Fig 5: Layered architecture of Netflix Drive

Metadata Store Characteristics

Each file in Netflix Drive would have one or many corresponding metadata nodes, corresponding to different versions of the file. The file system hierarchy would be modeled as a tree in the metadata store where the root node is the top level folder for the application.

Each metadata node will contain several attributes, such as checksum of the file, location of the data, user permissions to access data, file metadata such as size, modification time, etc. A metadata node may also provide support for extended attributes which can be used to model ACLs, symbolic links, or other expressive file system constructs.

Metadata Store may also expose the concept of workspaces, where each user/application can have several workspaces, and can share workspaces with other users/applications. These are higher level constructs that are very useful to Studio applications.

Data Store Characteristics

Netflix Drive relies on a data store that allows streaming bytes into files/objects persisted on the storage media. The data store should expose APIs that allow Netflix Drive to perform I/O operations. The transfer mechanism for transport of bytes is a function of the data store.

In the first manifestation, Netflix Drive is using an object store (such as Amazon S3) as a data store. In order to expose file store-like properties, there were some changes needed in the object store. Each file can be stored as one or more objects. For Studio applications, file sizes may exceed the maximum object size for Cloud Storage, and so, the data store service should have the ability to store multiple parts of a file as separate objects. It is the responsibility of the data store service to tie these objects to a single file and inform the metadata store of the single unique Id for these several object parts. This Data store internally implements the chunking of file into several parts, encrypting of the content, and life cycle management of the data.

Multi-tiered architecture

Netflix Drive allows multiple data stores to be a part of the same installation via its bootstrap manifest.

Fig 6: Multiple data stores of Netflix Drive

Some studio applications such as encoding and transcoding have different I/O characteristics than a typical cloud drive.

Most of the data produced by these applications is ephemeral in nature, and is read often initially. The final encoded copy needs to be persisted and the ephemeral data can be deleted. To serve such applications, Netflix Drive can persist the ephemeral data in storage tiers which are closer to the application that allow lower read latencies and better economies for read request, since cloud storage reads incur an egress cost. Finally, once the encoded copy is prepared, this copy can be persisted by Netflix Drive to a persistent storage tier in the cloud. A single data store may also choose to archive some subset of content stored in cheaper alternatives.


Studio applications require strict adherence to security models where only users or applications with specific permissions should be allowed to access specific assets. Security is one of the cornerstones of Netflix Drive design. Netflix Drive dynamic namespace design allows an artist or workflow to access only a small subset of the assets based on the workspace information and access control and is one of the benefits of using Netflix Drive in Studio workflows. Netflix Drive encapsulates the authentication and authorization models in its metadata store. These are translated into POSIX ACLs in Netflix Drive. In the future, Netflix Drive can allow more expressive ACLs by leveraging extended attributes associated with Metadata nodes corresponding to an asset.

Netflix Drive is currently being used by several Studio teams as the paved path solution for working with assets and is integrated with several media suite applications. As of today, Netflix Drive can be installed on CentOS, MacOS and Windows. In the future blog posts, we will cover implementation details, learnings, performance analysis of Netflix Drive, and some of the applications and workflows built on top of Netflix Drive.

If you are passionate about building Storage and Infrastructure solutions for Netflix Data Platform, we are always looking for talented engineers and managers. Please check out our job listings

Netflix Drive was originally published in Netflix TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

Automating AWS service logs table creation and querying them with Amazon Athena

Post Syndicated from Michael Hamilton original https://aws.amazon.com/blogs/big-data/automating-aws-service-logs-table-creation-and-querying-them-with-amazon-athena/

I was working with a customer who was just getting started using AWS, and they wanted to understand how to query their AWS service logs that were being delivered to Amazon Simple Storage Service (Amazon S3). I introduced them to Amazon Athena, a serverless, interactive query service that allows you to easily analyze data in Amazon S3 and other sources. Together, we used Athena to query service logs, and were able to create tables for AWS CloudTrail logs, Amazon S3 access logs, and VPC flow logs. As I was walking the customer through the documentation and creating tables and partitions for each service log in Athena, I thought there had to be an easier and faster way to allow customers to query their logs in Amazon S3, which is the focus of this post.

This post demonstrates how to use AWS CloudFormation to automatically create AWS service log tables, partitions, and example queries in Athena. We also use the SQL query editor in Athena to query the AWS service log tables that AWS CloudFormation created.

Athena best practices

This solution is appropriate for ad hoc use and queries the raw log files. These raw files can range from compressed JSON to uncompressed text formats, depending on how they were configured to be sent to Amazon S3. If you need to query over hundreds of GBs or TBs of data per day in Amazon S3, performing ETL on your raw files and transforming them to a columnar file format like Apache Parquet can lead to increased performance and cost savings. You can save on your Amazon S3 storage costs by using snappy compression for Parquet files stored in Amazon S3. To learn more about Athena best practices, see Top 10 Performance Tuning Tips for Amazon Athena.

Table partition strategies

There are a few important considerations when deciding how to define your table partitions. Mainly you should ask: what types of queries will I be writing against my data in Amazon S3? Do I only need to query data for that day and for a single account, or do I need to query across months of data and multiple accounts? In this post, we talk about how to query across a single, partitioned account.

By partitioning data, you can restrict the amount of data scanned per query, thereby improving performance and reducing cost. When creating a table schema in Athena, you set the location of where the files reside in Amazon S3, and you can also define how the table is partitioned. The location is a bucket path that leads to the desired files. If you query a partitioned table and specify the partition in the WHERE clause, Athena scans the data only for that partition. For more information, see Table Location in Amazon S3 and Partitioning Data. You can then define partitions in Athena that map to the data residing in Amazon S3.

Let’s look at an example to see how defining a location and partitioning our table can improve performance and reduce costs. In the following tree diagram, we’ve outlined what the bucket path may look like as logs are delivered to your S3 bucket, starting from the bucket name and going all the way down to the day.

In the following tree diagram, we’ve outlined what the bucket path may look like as logs are delivered to your S3 bucket

Outlined in red is where we set the location for our table schema, and Athena then scans everything after the CloudTrail folder. We then outlined our partitions in blue. This is where we can specify the granularity of our queries. In this case, we partition our table down to the day, which is very granular because we can tell Athena exactly where to look for our data. This is also the most performant and cost-effective option because it results in scanning only the required data and nothing else.

If you have to query multiple accounts and Regions, you should back off the location to AWSLogs and then create a non-partitioned CloudTrail table. This allows you to write queries across all your accounts and Regions, but the trade-off is that your queries take much longer and are more expensive due to Athena having to scan all the data that comes after AWSLogs every query. However, querying multiple accounts is beyond the scope of this post.


Before you get started, you should have the following prerequisites:

  • Service logs already being delivered to Amazon S3
  • An AWS account with access to your service logs

Deploying the automated solution in your AWS account

The following steps walk you through deploying a CloudFormation template that creates saved queries for you to run (Create Table, Create Partition, and example queries for each service log).

  1. Choose Launch Stack:

  1. Choose Next.
  2. For Stack name, enter a name for your stack.

You don’t need to have every AWS service log that the template asks for. If you don’t have CloudFront logs for example, you can leave the PathParameter as is. If you need CloudFront logs in the future, you can simply update the Create Table statement with the correct Amazon S3 location in Athena.

  1. For each service log table you want to create, follow the steps below:
  • Replace <_BUCKET_NAME> with the name of your S3 bucket that holds each AWS service log. You can use the same bucket name if it’s used to hold more than one type of service log.
  • Replace <Prefix> with your own folder prefix in Amazon S3. If you don’t have a prefix, make sure to remove it from the path parameters.
  • Replace <ACCOUNT-ID> and <REGION> with desired account and region.

Choose Next.

  1. Choose Next.
  2. Enter any tags you wish to assign to the stack.
  3. Choose Next.
  4. Verify parameters are correct and choose Create stack at the bottom.

Verify the stack has been created successfully. The stack takes about 1 minute to create the resources.

Querying your tables

You’re now ready to start querying your service logs.

  1. On the Athena console, on the Saved queries tab, search for the service log you want to interact with.

On the Athena console, on the Saved queries tab, search for the service log you want to interact with.

  1. Choose Create Table – CloudTrail Logs to run the SQL statement in the Athena query editor.

Make sure the location for Amazon S3 is correct in your SQL statement and verify you have the correct database selected.

  1. Choose Run query or press Tab+Enter to run the query.

Choose Run query or press Tab+Enter to run the query.

The table cloudtrail_logs is created in the selected database. You can repeat this process to create other service log tables.

For partitioned tables like cloudtrail_logs, you must add partitions to your table before querying.

  1. On the Saved queries tab, choose Create Partition – CloudTrail.
  2. Update the Region, year, month, and day you want to partition. Choose Run query or press Tab+Enter to run the query.

Choose Run query or press Tab+Enter to run the query.

After you run the query, you have successfully added a partition to your cloudtrail_logs table. Let’s look at some of the example queries we can run now.

  1. On the Saved queries tab, choose Query – CloudTrail Logs.

This is a base template included to begin querying your CloudTrail logs.

  1. Highlight the query and choose Run query.

You can see the base query template uses the WHERE clause to leverage partitions that have been loaded.

You can see the base query template uses the WHERE clause to leverage partitions that have been loaded.

Let’s say we have a spike in API calls from AWS Lambda and we want to see the users that the calls were coming from in a specific time range as well as the count for each user. Our query looks like the following code:

SELECT useridentity.sessioncontext.sessionissuer.username as "User",
       count(eventname) as "Lambda API Calls"
FROM cloudtrail_logs
WHERE eventsource = 'lambda.amazonaws.com'
       AND eventtime BETWEEN '2020-11-24T18:00:00Z' AND '2020-11-24T21:00:00Z' 
group by useridentity.sessioncontext.sessionissuer.username
order by count(eventname) desc

Or if we wanted to check our S3 Access Logs to make sure only authorized users are accessing certain prefixes:

FROM s3_access_logs
WHERE key='prefix/images/example.jpg'
        AND requester != 'arn:aws:iam::accountid:user/username'

Cost of solution and cleaning up

Deploying the CloudFormation template doesn’t cost anything. You’re only charged for the amount of data scanned by Athena. Remember to use the best practices we discussed earlier when querying your data in Amazon S3. For more pricing information, see Amazon Athena pricing and Amazon S3 pricing.

To clean up the resources that were created, delete the CloudFormation stack you created earlier. This also deletes the saved queries in Athena.


In this post, we discussed how we can use AWS CloudFormation to easily create AWS service log tables, partitions, and starter queries in Athena by entering bucket paths as parameters. We used CloudTrail and Amazon S3 access logs as examples, but you can replicate these steps for other service logs that you may need to query by visiting the Saved queries tab in Athena. Feel free to check out the video as well, where I go over how we store logs in Amazon S3 and then give a quick demo on how to deploy the solution.

For more information about service logs, see Easily query AWS service logs using Amazon Athena.

About the Author

Michael Hamilton is a Solutions Architect at Amazon Web Services and is based out of Charlotte, NC. He has a focus in analytics and enjoys helping customers solve their unique use cases. When he’s not working, he loves going hiking with his wife, kids, and a 2-year-old German shepherd.

AWS PrivateLink for Amazon S3 is Now Generally Available

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/aws-privatelink-for-amazon-s3-now-available/

At AWS re:Invent, we pre-announced that AWS PrivateLink for Amazon S3 was coming soon, and soon has arrived — this new feature is now generally available. AWS PrivateLink provides private connectivity between Amazon Simple Storage Service (S3) and on-premises resources using private IPs from your virtual network.

Way back in 2015, S3 was the first service to add a VPC endpoint; these endpoints provide a secure connection to S3 that does not require a gateway or NAT instances. Our customers welcomed this new flexibility but also told us they needed to access S3 from on-premises applications privately over secure connections provided by AWS Direct Connect or AWS VPN.

Our customers are very resourceful and by setting up proxy servers with private IP addresses in their Amazon Virtual Private Clouds and using gateway endpoints for S3, they found a way to solve this problem. While this solution works, proxy servers typically constrain performance, add additional points of failure, and increase operational complexity.

We looked at how we could solve this problem for our customers without these drawbacks and PrivateLink for S3 is the result.

With this feature you can now access S3 directly as a private endpoint within your secure, virtual network using a new interface VPC endpoint in your Virtual Private Cloud. This extends the functionality of existing gateway endpoints by enabling you to access S3 using private IP addresses. API requests and HTTPS requests to S3 from your on-premises applications are automatically directed through interface endpoints, which connect to S3 securely and privately through PrivateLink.

Interface endpoints simplify your network architecture when connecting to S3 from on-premises applications by eliminating the need to configure firewall rules or an internet gateway. You can also gain additional visibility into network traffic with the ability to capture and monitor flow logs in your VPC. Additionally, you can set security groups and access control policies on your interface endpoints.

Available Now
PrivateLink for S3 is available in all AWS Regions. AWS PrivateLink is available at a low per-GB charge for data processed and a low hourly charge for interface VPC endpoints. We hope you enjoy using this new feature and look forward to receiving your feedback. To learn more, check out the PrivateLink for S3 documentation.

Try out AWS PrivateLink for Amazon S3 today, and happy storing.

— Martin

Ingesting Jira data into Amazon S3

Post Syndicated from Vishwa Gupta original https://aws.amazon.com/blogs/big-data/ingesting-jira-data-into-amazon-s3/

Consolidating data from a work management tool like Jira and integrating this data with other data sources like ServiceNow, GitHub, Jenkins, and Time Entry Systems enables end-to-end visibility of different aspects of the software development lifecycle and helps keep your projects on schedule and within budget.

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, performance, security, and data availability. Many of our customers choose to build their data lakes on Amazon S3. They find the flexible, pay-as-you-go, cloud model ideal when dealing with vast amounts of heterogeneous data.

This post discusses some of the use cases for ingesting Jira data into an Amazon S3 data lake, the ingestion data flow, and a conceptional approach to ingesting data. We also provide the relevant Python code.

Use cases

Business use cases for Jira data ingestion range from proactive project monitoring, detection and resolution of project effort and cost variances, and non-compliance of SDLC processes. In this section, we provide an inclusive but not exhaustive list of use cases and their benefits.

Cognitive project monitoring

Cognitive project monitoring use cases include the following:

  • Automated analytics – You can prevent and reduce project schedule and budget variances by proactively monitoring metrics by combining data from Jira, GitHub, Jenkins, and Time Entry Systems.
  • Automated status reporting – You can use Amazon SageMaker machine learning (ML) models to derive prescriptive metrics by looking at data across various sources. This could reduce a project manager’s time spent stitching data and generating reports, and provide a holistic view of project-tracking metrics.

Automated project compliance and governance

You can analyze user behavior to detect potentially suspicious patterns by building a baseline of user activity. You create this based on primary data from HR (such as role, location, and work hours) and IT infrastructure (such as an assigned asset’s IP address).

Possible business outcomes include the following:

  • Proactively identify user IDs and passwords being shared with other users
  • Detect insider threats, such as abnormal login times, unauthorized access to Jira, and incorrect access permissions in Jira, GitHub, Jenkins, or Time Entry Systems
  • Identify compromised accounts based on frequent logins from unassigned assets or unusual successive authentications
  • Identify theft of corporate IPs based on unusual printing volume, printing project-related documents, and emailing organization-related documents and code to external accounts

Accelerated migration from Jira to another project management product

You can also use AWS Glue and its Data Catalog metadata to map between two products. This could increase your data migration.

Overview of solution

One of the most common approaches to ingest data from Jira into AWS is to create a Python module, which is used in AWS Glue or AWS Lambda. The following diagram shows the high-level approach for an end-to-end solution. In this solution, Glue Development Endpoint and respective SageMaker Jupyter Notebook instance are used to create the Jira Python module to facilitate Jupyter notebook experience, interactive testing and debugging capability. The scope of this post is limited to the following steps:

  • Setting up access for Jira
  • Using the Python model with AWS Lambda or AWS Glue
  • Incrementally pulling changed data from Jira with JQL (Jira Query Language)
  • Ingesting data to the AWS serverless data lake

Ingesting data from Jira into Amazon S3

The Jira server exposes data using REST APIs and open authorization (OAuth) authentication methods. It uses a three-legged OAuth approach (also called the OAuth dance) to acquire access to the resources served by the APIs. For more information about the following steps, see OAuth for REST APIs.

Generating an RSA public/private key pair

Consumer key and consumer secret details are required for interacting with API endpoints. You store the details inside encrypted SSM parameters.

To use macOS or Linux, run the following OpenSSL commands in the terminal (anywhere in the file system):

openssl genrsa -out jira_privatekey.pem 1024
openssl req -newkey rsa:1024 -x509 -key jira_privatekey.pem -out jira_publickey.cer -days 365
openssl pkcs8 -topk8 -nocrypt -in jira_privatekey.pem -out jira_privatekey.pcks8
openssl x509 -pubkey -noout -in jira_publickey.cer > jira_publickey.pem

To use Windows, download OpenSSL and run it using the path to the bin folder. Create a new environment variable named OPENSSL_CONF and the value "path_to"\openssl.cnf. Run the command as admin:

"path_to_openssl"\bin\openssl genrsa -out jira_privatekey.pem 1024
"path_to_openssl"\bin\openssl req -newkey rsa:1024 -x509 -key jira_privatekey.pem -out jira_publickey.cer -days 365
"path_to_openssl"\bin\openssl pkcs8 -topk8 -nocrypt -in jira_privatekey.pem -out jira_privatekey.pcks8
"path_to_openssl"\bin\openssl x509 -pubkey -noout -in jira_publickey.cer > jira_publickey.pem

Configuring a REST API-based consumer in Jira

For full instructions on configuring your REST API-based consumer, see Step 2: Configure your client application as an OAuth consumer in OAuth for REST APIs. Be sure to complete the following steps:

  1. In the Link applications section, select Create incoming link.
  2. For Public key, enter the public key you created earlier.

Performing the OAuth dance

In this step, you go through the process of getting the access token from the resource so the consumer can access the resource.

  1. Create the following parameters in the AWS Systems Manager Parameter Store:
    1. jira_access_private_key – Stores the private key in AWS Systems Manager as a parameter.
    2. jira_access_urls – Stores URLs to access Jira. These URLs are constructed based on display URLs defined in JIRA by adding the additional tags:
      • request_token_urlhttps://jiratoawss3.atlassian.net/plugins/servlet/oauth/request-token
      • access_token_urlhttps://jiratoawss3.atlassian.net/plugins/servlet/oauth/access-token
      • authorize_urlhttps://jiratoawss3.atlassian.net/plugins/servlet/oauth/authorize
      • data_urlhttps://jiratoawss3.atlassian.net/rest/api/2/search
    3. jira_access_secrets – Stores secrets to access Jira. Initially, only two values are present in this SSM parameter; it’s updated later with access_token. You need the following two parameters to start:
      1. consumer_key
      2. consumer_secret
  2. Download the notebook file and upload it to SageMaker notebook instance of AWS Glue Development Endpoint.:
    1. Below are the steps to set-up AWS Glue Development Endpoint
      1. In the AWS Glue console, choose Dev endpoints. Choose Add endpoint.
      2. Specify an endpoint name, such as demo-endpoint.
      3. Choose an IAM role with permissions similar to the IAM role that you use to run AWS Glue ETL jobs. For more information, see Create an IAM Role for AWS Glue. Choose Next.
      4. In Networking, leave Skip networking information selected, and choose Next.
      5. In SSH Public Key, enter a public key generated by an SSH key generator program, such as ssh-keygen (do not use an Amazon EC2 key pair). The generated public key will be imported into your development endpoint. Save the corresponding private key to later connect to the development endpoint using SSH. Choose Next. For more information, see ssh-keygen in Wikipedia.b. Once status of AWS Glue Development Endpoint is ready follow steps to set-up SageMaker notebooks with in your development endpoint.
    2. Once status of AWS Glue Development Endpoint is ready follow
    3. Once status shows Ready, open the notebook and upload the downloaded notebook file.

You now run the following cells.

  1. Install and import dependent modules with the following code:
    !pip install tlslite
    !pip install oauth2
    import urllib
    import oauth2 as oauth
    from tlslite.utils import keyfactory
    import json
    import sys
    import os
    import base64
    import boto3
    from boto3.dynamodb.conditions import Key, Attr
    import datetime
    import logging
    import pprint
    import time
    from pytz import timezone
    logger = logging.getLogger()

  2. Create an SSM client to connect parameters defined in Systems Manager (update the Region if it’s different than us-east-1):
    ssm = boto3.client("ssm", region_name='us-east-1')

  3. Define the signature class to sign the Jira REST API requests:
    class SignatureMethod_RSA_SHA1(oauth.SignatureMethod):
        name = 'RSA-SHA1'
        def signing_base(self, request, consumer, token):
            if not hasattr(request, 'normalized_url') or request.normalized_url is None:
                raise ValueError("Base URL for request is not set.")
            sig = (
            key = '%s&' % oauth.escape(consumer.secret)
            if token:
                key += oauth.escape(token.secret)
            raw = '&'.join(sig)
            return key, raw
        def sign(self, request, consumer, token):
            key, raw = self.signing_base(request, consumer, token)
            # SSM support to fetch private key
            ssm_param = ssm.get_parameter(Name='jira_access_private_key', WithDecryption=True)
            jira_private_key_str = ssm_param['Parameter']['Value']
            privateKeyString = jira_private_key_str.strip()
            privatekey = keyfactory.parsePrivateKey(privateKeyString)
            # Used encode() to convert to bytes
            signature = privatekey.hashAndSign(raw.encode())
            return base64.b64encode(signature) 

  4. Get the consumer_key and consumer_secret from the SSM parameter that you defined earlier:
    jira_secrets = json.loads(ssm.get_parameter(Name='jira_access_secrets_1', WithDecryption=True)['Parameter']['Value'])
    consumer_key = jira_secrets["consumer_key"]
    consumer_secret = jira_secrets["consumer_secret"]

  5. Define the URLs for request_token_url, access_token_url, and authorize_url:
    request_token_url = 'input_here'
    access_token_url = 'input_here'
    authorize_url = 'input_here'

    These URLs are defined while setting up the Rest API endpoint in Jira. It includes the following components:

    • request_token_url – https://jiratoawss3.atlassian.net/plugins/servlet/oauth/request-token
    • access_token_url – https://jiratoawss3.atlassian.net/plugins/servlet/oauth/access-token
    • authorize_url – https://jiratoawss3.atlassian.net/plugins/servlet/oauth/authorize
  6. Generate your request token:
    # Create Consumer using consumer_key and consumer_secret
    consumer = oauth.Consumer(consumer_key, consumer_secret)
    # Use Consumer to create oauth client
    client = oauth.Client(consumer)
    # Add Signature Method to the client
    # Get response from request token URL using the client
    resp, content = client.request(request_token_url, "POST")
    # Convert the content received from previous step into a Dictionary
    request_token = dict(urllib.parse.parse_qsl(content))
    # request token has two components oauth_token and oauth_token_secret

You only need to do this one time every five years (the default setting in Jira).

The following is an example request token value:


The following are example values after the URL parse and converted to dict:

  • {b'oauth_token': b'oFUFV5cqOuoWycnaCXYrkcioHuRw2TbV ',
  • b'oauth_token_secret': b'CzhMoEsozCV3xFZ179YQoLzRu4DYQHlR'}
  1. Manually approve the request token by opening the following user in a browser:
    authorize_url + '?oauth_token=' + request_token[b'oauth_token'].decode()

An example value of the final authorized user is:

  • https://jiratoawss3.atlassian.net/plugins/servlet/oauth/authorize?oauth_token=wYLlIxmcsnZTHgTy2ZpUmBakqzmqSbww.

When you go to the URL in your output, you see the following screenshot.

  1. Use an approved request token to generate an access token:
    # Create an oauth token using components of request token
    token = oauth.Token(request_token[b'oauth_token'], request_token[b'oauth_token_secret'])
    # Use Consumer and token to create oauth client
    client = oauth.Client(consumer, token)
    # Add Signature Method to the client
    # Get response from access token URL using the client
    access_token_resp, access_token_content = client.request(access_token_url, "POST")

An example access token is:


The following are example access token values after URL parse and converted to dict:

  • {b'oauth_token': b'Ym3UDrs1iYnLUZ1t0TkT1PinfJNN3RLj',
  • b'oauth_token_secret': b'FYQfGjLLhbCJg3DXZFaKsE6wsURVfebN',
  • b'oauth_expires_in': b'157680000',
  • b'oauth_session_handle': b'BulouCOypjssDS3GzeY7Ldi30h0ERWDo',
  • b'oauth_authorization_expires_in': b'160272000'}

oauth_authorization_expires_in states when the token expires (in seconds), which is generally 5 years.

  1. Update the access_token key in the SSM parameter jira_access_secrets with the value for access_token_content.

This access token is valid for 5 years (the expires_in key of access_token_content states when the token expires in seconds). Rotating the access key depends on your organization’s security policy and is out of scope for this post.

Using the access token, querying data from Jira, and storing data in Amazon S3

The following are the important points in this step:

  • The Jira REST API returns 50 records at a time, but gives a total record count, which is used to paginate through the result set.
  • The Jira REST API endpoint needs to be updated with JQL filters. JQL allows you to only pick changed records.
  • Data returned from the REST API endpoint is serialized to Amazon S3. The Python code batches the records from Jira pages and commits after every four pages (which is configurable) have been fetched from Jira.
  • JQL is appended to the data_url (defined earlier). When pulling data from Jira, it’s good practice to do a one-time bulk load and get incremental loads by maintaining the last data pull date in an Amazon DynamoDB The following screenshot shows an example of tracking dates in DynamoDB by projects. Because it’s an hourly batch, last_ingest_date is rounded up to the hour.
  • The key attributes of JQL used for constructing JQL and pulling data from Jira are:
    • project – Loop through the projects from DynamoDB and pull data for one project at a time.
    • updated – Last update date for Jira story or task. Data pull from Jira is based on this date.
      • For bulk loads, updated is less than or equal to the batch run date, rounded up to the hour.
      • For incremental loads, updated is greater than or equal to last_ingest_date from DynamoDB, and less than or equal to the batch run date, rounded up to the hour.
      • To use date in JQL, we need to parse the date before we use it.
    • startAt – Jira generally paginates the results every 50 records. This attribute is used to loop through the complete data. For example, if a project has 500 records and page size is 50 records, this attribute is incremented by page size in every iteration, and it takes 10 iterations to get complete data.
    • maxResults – This is the page size set up in Jira (maximum number of records Jira returns in every API call).
  • Use the provided notebook to perform the OAuth dance. The sample code pulls data from Jira based on the approach you specified earlier. The purpose of this code is to accelerate implementing data ingestion from Jira.

Cleaning up

To avoid incurring future charges, delete the resources set up as part of this post:

  • AWS Glue Development Endpoint
  • DynamoDB table
  • Systems Manager parameters
  • S3 bucket

Next steps

To extend the usability scope of Jira data in S3 buckets, you can crawl the location to create AWS Glue Data Catalog database tables. Registering the locations with AWS Lake Formation helps simplify permission management and allows you to implement fine-grained access control. You can also use Amazon Athena, Amazon Redshift, Amazon SageMaker, and Amazon QuickSight for data analysis, ML, and reporting services.


This post aims to simplify and accelerate the steps to ingest Jira data into Amazon S3. The solution includes Jira configuration, performing the three-legged OAuth dance, JQL-based attributes for data selection, and Python-based data extraction into Amazon S3.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Glue forum.

About the Authors

Vishwa Gupta is a Data and ML Engineer with AWS Professional Services Intelligence Practice. He helps customers implement big data and analytics platform and solutions. Outside of work, he enjoys spending time with family, traveling, and playing badminton.




Sreeram Thoom is a Data Architect at Amazon Web Services.






New – Amazon S3 Replication Adds Support for Multiple Destination Buckets

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/new-amazon-s3-replication-adds-support-for-multiple-destination-buckets/

Amazon Simple Storage Service (S3) supports many types of replication, including S3 Same-Region Replication (SRR), which launched in 2019 and S3 Cross-Region Replication (CRR), which has been around since 2015. Today, we are happy to announce S3 Replication support for multiple destination buckets. S3 Replication now gives you the ability to replicate data from one source bucket to multiple destination buckets. With S3 Replication (multi-destination) you can replicate data in the same AWS Regions using S3 SRR or across different AWS Regions by using S3 CRR, or a combination of both.

Before this launch, if you needed to have multiple copies of your data in different S3 buckets, you had to build your own S3 replication service by monitoring S3 events, identifying created objects, and using AWS Lambda functions to copy objects to each destination bucket.

This launch removes the need for you to develop your own solutions to replicate the data across multiple destinations. You can use the flexibility of S3 Replication (multi-destination) to store multiple copies of your data in different storage classes, with different encryption types, or across different accounts depending on its intended use. Additionally, when replicating to multiple destinations, you can use CloudWatch metrics to track replication progress for each region pair.

S3 Replication (multi-destination) is an extension to S3 Replication, and it supports all existing S3 Replication features like Replication Time Control (RTC) and delete marker replication. If you need a predictable replication time backed by a Service Level Agreement, you can use RTC to replicate objects in less than 15 minutes.

How to Get Started With S3 Replication (multi-destination)
In order to get S3 Replication working, all the buckets involved in the replication (source and destinations) must have bucket versioning enabled.

To setup S3 Replication (multi-destination), you need to define replication rules. You can create a new rule in the bucket Management page, under Replication Rules.

Screenshot of adding a rule

When creating a new replication rule, one very important step is to set up permissions for replication, as S3 will need to replicate objects on your behalf. To do that, you can follow the instructions available in the S3 documentation page.

To create the replication rule, just follow the steps in the console. You can specify to which objects of the bucket this rule applies, the destination bucket, if you want to change the storage class of the replicated objects and many other preferences for your replicated objects.

Screenshot configuring the replication rule

One thing to have in mind when activating a rule is that the replication will start for all new objects added to the bucket from that moment. Objects uploaded to the bucket before the rule was created need to be copied using one time operations like S3 batch operations or S3 copy.

If you want to monitor the progress of your replication using CloudWatch metrics, don’t forget to click the Replication metrics and notifications checkbox.

Screenshot of configuring replication rules metrics

Now that we support multiple destinations for replication, rule priorities are used when there are two or more rules with the same destination. When that happens, the rule with the highest priority will be applied. For the same destination bucket, a lower priority rule will not be applied when the replication configuration has two or more rules with overlapping scope. If there are two or more rules with the same scope and different destinations, both rules will be applied.

You can see a summary of all your rules in the Replication rules listing under the bucket Management page.

Screenshot of replication rules listing

Monitoring Replication
When you have all the rules configured, you can start uploading objects to the source bucket and monitor how they get replicated in all the different destinations.

To know the replication status of an object in the source bucket, you can see the Replication status in the object Details. The status types are:

  • COMPLETED: The replication was successful in all the destinations.
  • PENDING: The replication is still in progress.
  • FAILED: The replication failed to replicate in at least one of the destinations. When there is a failure in replication, the only way to fix it is by uploading the object again.

screenshot of object metadata

For replicated objects, you will see the REPLICA status under the Replication status.

You can also use CloudWatch metrics to monitor the replication. First, you need to enable metrics for each of the rules. And then in the bucket Metrics, you can choose which rules you want to see the metrics of and see the charts for each of them; the metrics are also available in the CloudWatch console.

Screenshot of replication metrics

S3 Replication (multi-destination) is available today in all AWS Regions. To get started, you can use the AWS Management Console, SDKs, S3 API, or AWS CloudFormation to create replication rules from one source bucket to multiple destination buckets.

Pricing for S3 Replication (multi-destination) applies for each rule. For pricing information, please visit the Amazon S3 pricing page.

For more information about this new feature visit the S3 Replication page.



New – Amazon EBS gp3 Volume Lets You Provision Performance Apart From Capacity

Post Syndicated from Harunobu Kameda original https://aws.amazon.com/blogs/aws/new-amazon-ebs-gp3-volume-lets-you-provision-performance-separate-from-capacity-and-offers-20-lower-price/

Amazon Elastic Block Store (EBS) is an easy-to-use, high-performance block storage service designed for use with Amazon EC2 instances for both throughput and transaction-intensive workloads of all sizes. Using existing general purpose solid state drive (SSD) gp2 volumes, performance scales with storage capacity. By provisioning larger storage volume sizes, you can improve application input / output operations per second (IOPS) and throughput.

However some applications, such as MySQL, Cassandra, and Hadoop clusters, require high performance but not high storage capacity. Customers want to meet the performance requirements of these types of applications without paying for more storage volumes than they need.

Today I would like to tell you about gp3, a new type of SSD EBS volume that lets you provision performance independent of storage capacity, and offers a 20% lower price than existing gp2 volume types.

New gp3 Volume Type

With EBS, customers can choose from multiple volume types based on the unique needs of their applications. We introduced general purpose SSD gp2 volumes in 2014 to offer SSD performance at a very low price. gp2 provides an easy and cost-effective way to meet the performance and throughput requirements of many applications our customers use such as virtual desktops, medium-sized databases such as SQLServer and OracleDB, and development and testing environments.

That said, some customers need higher performance. Because the basic idea behind gp2 is that the larger the capacity, the faster the IOPS, customers may end up provisioning more storage capacity than desired. Even though gp2 offers a low price point, customers end up paying for storage they don’t need.

The new gp3 is the 7th variation of EBS volume types. It lets customers independently increase IOPS and throughput without having to provision additional block storage capacity, paying only for the resources they need.

gp3 is designed to provide predictable 3,000 IOPS baseline performance and 125 MiB/s regardless of volume size. It is ideal for applications that require high performance at a low cost such as MySQL, Cassandra, virtual desktops and Hadoop analytics. Customers looking for higher performance can scale up to 16,000 IOPS and 1,000 MiB/s for an additional fee. The top performance of gp3 is 4 times faster than max throughput of gp2 volumes.

How to Switch From gp2 to gp3

If you’re currently using gp2, you can easily migrate your EBS volumes to gp3 using Amazon EBS Elastic Volumes, an existing feature of Amazon EBS. Elastic Volumes allows you to modify the volume type, IOPS, and throughput of your existing EBS volumes without interrupting your Amazon EC2 instances. Also, when you create a new Amazon EBS volume, Amazon EC2 instance, or Amazon Machine Image (AMI), you can choose the gp3 volume type. New AWS customers receive 30GiB of gp3 storage with the baseline performance at no charge for 12 months.

Available Today

The gp3 volume type is available for all AWS Regions. You can access the AWS Management Console to launch your first gp3 volume.

For more information, see Amazon Elastic Block Store and get started with gp3 today.

– Kame


New – Amazon EC2 R5b Instances Provide 3x Higher EBS Performance

Post Syndicated from Harunobu Kameda original https://aws.amazon.com/blogs/aws/new-amazon-ec2-r5b-instances-providing-3x-higher-ebs-performance/

In July 2018, we announced memory-optimized R5 instances for the Amazon Elastic Compute Cloud (Amazon EC2). R5 instances are designed for memory-intensive applications such as high-performance databases, distributed web scale in-memory caches, in-memory databases, real time big data analytics, and other enterprise applications.

R5 instances offer two different block storage options. R5d instances offer up to 3.6TB of NMVe instance storage for applications that need access to high-speed, low latency local storage. In addition, all R5b instances work with Amazon Elastic Block Store. Amazon EBS is an easy-to-use, high-performance and highly available block storage service designed for use with Amazon EC2 for both throughput- and transaction-intensive workloads at any scale. A broad range of workloads, such as relational and non-relational databases, enterprise applications, containerized applications, big data analytics engines, file systems, and media workflows are widely deployed on Amazon EBS.

Today, we are happy to announce the availability of R5b, a new addition to the R5 instance family. The new R5b instance is powered by the AWS Nitro System to provide the best network-attached storage performance available on EC2. This new instance offers up to 60Gbps of EBS bandwidth and 260,000 I/O operations per second (IOPS).

Amazon EC2 R5b Instance
Many customers use R5 instances with EBS for large relational database workloads such as commerce platforms, ERP systems, and health record systems, and they rely on EBS to provide scalable, durable, and high availability block storage. These instances provide sufficient storage performance for many use cases, but some customers require higher EBS performance on EC2.

R5 instances provide bandwidth up to 19Gbps and maximum EBS performance of 80K IOPS, while the new R5b instances support bandwidth up to 60Gbps and EBS performance of 260K IOPS, providing 3x higher EBS-Optimized performance compared to R5 instances, enabling customers to lift and shift large relational databases applications to AWS. R5b and R5 vCPU to memory ratio and network performance are the same.

Instance Name vCPUs Memory EBS Optimized Bandwidth (Mbps) EBS Optimized [email protected] (IO/s)
r5b.large 2 16 GiB Up to 10,000 Up to 43,333
r5b.xlarge 4 32 GiB Up to 10,000 Up to 43,333
r5b.2xlarge 8 64 GiB Up to 10,000 Up to 43,333
r5b.4xlarge 16 128 GiB 10,000 43,333
r5b.8xlarge 32 256 GiB 20,000 86,667
r5b.12xlarge 48 384 GiB 30,000 130,000
r5b.16xlarge 64 512 GiB 40,000 173,333
r5b.24xlarge 96 768 GiB 60,000 260,000
r5b.metal 96 768 GiB 60,000 260,000

Customers operating storage performance sensitive workloads can migrate from R5 to R5b to consolidate their existing workloads into fewer or smaller instances. This can reduce the cost of both infrastructure and licensed commercial software working on those instances. R5b instances are supported by Amazon RDS for Oracle and Amazon RDS for SQL Server, simplifying the migration path for large commercial database applications and improving storage performance for current RDS customers by up to 3x.

All Nitro compatible AMIs support R5b instances, and the EBS-backed HVM AMI must have NVMe 1.0e and ENA drivers installed at R5b instance launch. R5b supports io1, io2 Block Express (in preview), gp2, gp3, sc1, st1 and standard volumes. R5b does not support io2 volumes and io1 volumes that have multi-attach enabled, which are coming soon.

Available Today

R5b instances are available in the following regions: US West (Oregon), Asia Pacific (Tokyo), US East (N. Virginia), US East (Ohio), Asia Pacific (Singapore), and Europe (Frankfurt). RDS on r5b is available in US East (Ohio), Asia Pacific (Singapore), and Europe (Frankfurt), and support in other regions is coming soon.

Learn more about EC2 R5 instances and get started with Amazon EC2 today.

– Kame;

re:Invent 2020 Liveblog: Andy Jassy Keynote

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/reinvent-2020-liveblog-andy-jassy-keynote/

I’m always ready to try something new! This year, I am going to liveblog Andy Jassy‘s AWS re:Invent keynote address, which takes place from 8 a.m. to 11 a.m. on Tuesday, December 1 (PST). I’ll be updating this post every couple of minutes as I watch Andy’s address from the comfort of my home office. Stay tuned!




Introducing Amazon S3 Storage Lens – Organization-wide Visibility Into Object Storage

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/s3-storage-lens/

When starting out in the cloud, a customer’s storage requirements might consist of a handful of S3 buckets, but as they grow, migrate more applications and realize the power of the cloud, things can become more complicated. A customer may have tens or even hundreds of accounts and have multiple S3 buckets across numerous AWS Regions. Customers managing these sorts of environments have told us that they find it difficult to understand how storage is used across their organization, optimize their costs, and improve security posture.

Drawing from more than 14 years of experience helping customers optimize their storage, the S3 team has built a new feature called Amazon S3 Storage Lens. This is the first cloud storage analytics solution to give you organization-wide visibility into object storage, with point-in-time metrics and trend lines as well as actionable recommendations. All these things combined will help you discover anomalies, identify cost efficiencies and apply data protection best practices.

With S3 Storage Lens , you can understand, analyze, and optimize storage with 29+ usage and activity metrics and interactive dashboards to aggregate data for your entire organization, specific accounts, regions, buckets, or prefixes. All of this data is accessible in the S3 Management Console or as raw data in an S3 bucket.

Every Customer Gets a Default Dashboard

S3 Storage Lens includes an interactive dashboard which you can find in the S3 console. The dashboard gives you the ability to perform filtering and drill-down into your metrics to really understand how your storage is being used. The metrics are organized into categories like data protection and cost efficiency, to allow you to easily find relevant metrics.

For ease of use all customers receive a default dashboard. If you are like many customers, this maybe the only dashboard that you need, but if you want to, you can make changes. For example, you could configure the dashboard to export the data daily to an S3 bucket for analysis with another tool (Amazon QuickSight, Amazon Athena, Amazon Redshift, etc.) or you could upgrade to receive advanced metrics and recommendations.

Creating a Dashboard
You can also create your own dashboards from scratch, to do this I head over to the S3 console and click on the Dashboards menu item inside the Storage Lens section. Secondly, I click the Create dashboard button.

Screenshot of the console

I give my dashboard the name s3-lens-demo and select a home Region. The home Region is where the metrics data for your dashboard will be stored. I choose to enable the dashboard, meaning that it will be updated daily with new metrics.

A dashboard can analyze storage across accounts, Regions, buckets, and prefix. I choose to include buckets from all accounts in my organization and across all regions in the Dashboard scope section.

S3 Storage Lens has two tiers: Free Metrics, which is free of charge, automatically available for all S3 customers and contains 15 usage related metrics; and Advanced metrics and recommendations, which has an additional charge, but includes all 29 usage and activity metrics with 15-month data retention, and contextual recommendations. For this demo, I select Advanced metrics and recommendations.

Screenshot of Management Console

Finally, I can configure the dashboard metrics to be exported daily to a specific S3 bucket. The metrics can be exported to either CSV or Apache Parquet format for further analysis outside of the console.

An alert pops up to tell me that my dashboard has been created, but it can take up to 48 hours to generate my initial metrics.

What does a Dashboard Show?

Once my dashboard has been created, I can start to explore the data. I can filter by Accounts, Regions, Storage classes, Buckets, and Prefixes at the top of the dashboard.

The next section is a snapshot of the metrics such as the Total storage and Object count, and I can see a trendline that shows the trend on each metric over the last 30 days and a percentage change. The number in the % change column shows by default the Day/day percentage change, but I can select to compare by Week/week or Month/month.

I can toggle between different Metric groups by selecting either Summary, Cost efficiency, Data protection, or Activity.

There are some metrics here that are pretty typical like total storage and object counts, and you can already receive these in a few places in the S3 console and in Amazon CloudWatch – but in S3 Storage Lens you can receive these metrics in aggregate across your organization or account, or at the prefix level, which was not previously possible.

There are some other metrics you might not expect, like metrics that pertain to S3 feature utilization. For example we can break out the % of objects that are using encryption, or the number of objects that are non-current versions. These metrics help you understand how your storage is configured, and allows you to identify discrepancies, and then drill in for details.

The dashboard provides contextual recommendations alongside your metrics to indicate actions you can take based on the metric, for example ways to improve cost efficiency, or apply data protection best practices. Any recommendations are shown in the Recommendation column. A few days ago I took the screenshot below which shows a recommendation on one of my dashboards that suggests I should check my buckets’ default encryption configuration.

The dashboard trends and distribution section allows me to compare two metrics over time in more detail. Here I have selected Total storage as my Primary metric and Object Count as my Secondary metric.

These two metrics are now plotted on a graph, and I can select a date range to view the trend over time.

The dashboard also shows me those two metrics and how they are distributed across Storage class and Regions.

I can click on any value in this graph and Drill down to filter the entire dashboard on that value, or select Anayze by to navigate to a new dashboard view for that dimension.

The last section of the dashboard allows me to perform a Top N analysis of a metric over a date range, where N is between 1 and 25. In the example below, I have selected the top 3 items in descending order for the Total storage metric.

I can then see the top three accounts (note: there are only two accounts in my organization) and the Total storage metric for each account.

It also shows the top 3 regions for the Total storage metric, and I can see that 51.15% of my data is stored in US East (N. Virginia)

Lastly, the dashboard contains information about the top 3 buckets and prefixes and the associated trends.

As I have shown, S3 Storage Lens delivers more than 29 individual metrics on S3 storage usage and activity for all accounts in your organization. These metrics are available in the S3 console to visualize storage usage and activity trends in a dashboard, with contextual recommendations that make it easy to take immediate action. In addition to the dashboard in the S3 console, you can export metrics in CSV or Parquet format to an S3 bucket of your choice for further analysis with other tools including Amazon QuickSight, Amazon Athena, or Amazon Redshift to name a few.

Video Walkthrough

If you would like a more indepth look at S3 Storage Lens the team have recorded the following video to explain how this new feature works.

Available Now

S3 Storage Lens is available in all commercial AWS Regions. You can use S3 Storage Lens with the Amazon S3 API, CLI, or in the S3 Console. For pricing information, regarding S3 Storage Lens advanced metrics and recommendations, check out the Amazon S3 pricing page. If you’d like to dive a little deeper, then you should check out the documentation or the S3 Storage Lens webpage.

Happy Storing

— Martin


S3 Intelligent-Tiering Adds Archive Access Tiers

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/s3-intelligent-tiering-adds-archive-access-tiers/

We launched S3 Intelligent-Tiering two years ago, which added the capability to take advantage of S3 without needing to have a deep understanding of your data access patterns. Today we are launching two new optimizations for S3 Intelligent-Tiering that will automatically archive objects that are rarely accessed. These new optimizations will reduce the amount of […]

Handling data erasure requests in your data lake with Amazon S3 Find and Forget

Post Syndicated from Chris Deigan original https://aws.amazon.com/blogs/big-data/handling-data-erasure-requests-in-your-data-lake-with-amazon-s3-find-and-forget/

Data lakes are a popular choice for organizations to store data around their business activities. Best practice design of data lakes impose that data is immutable once stored, but new regulations such as the European General Data Protection Regulation (GDPR), California Consumer Privacy Act (CCPA), and others have created new obligations that operators now need to be able to erase private data from their data lake when requested.

When asked to erase an individual’s private data, as a data lake operator you have to find all the objects in your Amazon Simple Storage Service (Amazon S3) buckets that contain data relating to that individual. This can be complex because data lakes contain many S3 objects (each of which may contain multiple rows), as shown in the following diagram. You often can’t predict which objects contain data relating to an individual, so you need to check each object. For example, if the user mary34 asks to be removed, you need to check each object to determine if it contains data relating to mary34. This is the first challenge operators face: identifying which objects contain data of interest.

After you identify objects containing data of interest, you face a second challenge: you need to retrieve the object from the S3 bucket, remove relevant rows from the file, put a new version of the object into S3, and make sure you delete any older versions.

Locating and removing data manually can be time-consuming and prone to mistakes, considering the large number of objects typically in data lakes.

Amazon S3 Find and Forget solves these challenges with ready-to-use automations. It allows you to remove records from data lakes of any size that are in AWS Glue Data Catalog. The solution includes a web user interface that you can use and an API that you can use to integrate with your own applications.

Solution overview

Amazon S3 Find and Forget enables you to find and delete records automatically in data lakes on Amazon S3. Using the solution, you can:

  • Define which tables from your AWS Glue Data Catalog contain data you want to erase
  • Manage a queue of identifiers (such as unique customer identifiers) to erase
  • Erase rows from your data lake matching the queued record identifiers
  • Access a log of all actions taken by the solution

You can use Amazon S3 Find and Forget to work with data lakes stored on Amazon S3 in a supported file format.

The solution is developed and distributed as open-source software that you deploy and run inside your own AWS account. When deploying this solution, you only pay for the AWS services consumed to run it. We recommend reviewing the Cost Estimate guide and creating Amazon CloudWatch Billing Alarms to monitor charges before deploying the solution in your own account.

When you handle requests to remove data, you add the identifiers through the web interface or API to a Deletion Queue. The identifiers remain in the queue until you start a Deletion Job. The Deletion Job processes the queue and removes matching rows from objects in your data lake.

Where your requirements allow it, batching deletions can provide significant cost savings by minimizing the number of times the data lake needs to be re-scanned and processed. For example, you could start a Deletion Job once a week to process all requests received in the preceding week.

Solution demonstration

This section provides a demonstration of using Amazon S3 Find and Forget’s main features. To deploy the solution in your own account, refer to the User Guide.

For this demonstration, I have prepared in advance:

The first step is to deploy the solution using AWS CloudFormation by following the instructions in the User Guide. The CloudFormation stack can take 20-30 minutes to deploy depending on the options chosen when deploying.

Once deployed, I visit the web user interface by going to the address in the WebUIUrl CloudFormation stack output. Using a temporary password emailed to the address I provided in my CloudFormation parameters, I login and set a password for future use. I then see a dashboard with some base metrics for my Amazon S3 Find and Forget deployment:

I now need to create a Data Mapper so that Amazon S3 Find and Forget can find my data lake. To do this, I select Data Mappers, then Create Data Mapper:

On this screen, I give my Data Mapper a name, choose the AWS Glue database and table in my account that I want to operate on, and the columns that I want my deletions to match. In this demonstration, I’m using a copy of the Amazon Customer Reviews Dataset that I copied to my own S3 bucket. I’ll be using the customer_id column to remove data. In the dataset, this field contains a unique identifier for each customer who has created a product review.

I then specify the IAM role to be used when modifying the objects in S3. I also choose whether I want the old S3 object versions to be deleted for me. I can turn this off if I want to implement my own strategy to manage deleting old object versions, such as by using S3 lifecycle policies.

After choosing Create Data Mapper the Data Mapper is created, and I am prompted to grant permissions for S3 Find and Forget to operate in my bucket. In the Data Mapper list, I select my new Data Mapper, then choose Generate Access Policies. The interface displays a sample bucket policy that I copy and paste into the bucket policy for my S3 bucket in the AWS Management Console.

With the Data Mapper set up, I’m now able to add the customers who have requested to have their data deleted to the Deletion Queue. Using their Customer IDs, I go to the Deletion Queue section and select Add Match to the Deletion Queue.

I’ve chosen to delete from all the available Data Mappers, but I can also choose specific ones. Once I’ve added my matches, I can see a list of them on Deletion Queue page:

I can now run a deletion job that will cause the matches to be deleted from the data lake. To do this, I select Deletion Jobs then Start a Deletion Job.

After a few minutes the Deletion Job completes, and I can see metrics collected during the job including that the job took just over two-and-a-half minutes:

There is an Export to JSON option that includes all the metrics shown, more granular information about the Deletion Job, and which S3 objects were modified.

At this point the Deletion Queue is empty, and ready for me to use for future requests.

Solution design

This section includes a brief introduction to how the solution works. More comprehensive design documentation is available in the Amazon S3 Find and Forget GitHub repository.

The following diagram illustrates the architecture of this solution.

Amazon S3 Find and Forget uses AWS Serverless services to optimize for cost and scalability. The user interface and API are built using Amazon S3, Amazon Cognito, AWS Lambda, Amazon DynamoDB, and Amazon API Gateway, which automatically scale down when not in use so that there is no expensive baseline cost just for having the solution installed. These AWS services are always available and scale in concert with when the solution is used with a pay-for-what-you-use price model.

The Deletion Job workflow is coordinated using AWS Step Functions, Lambda, and Amazon Simple Queue Service (Amazon SQS). The solution uses Step Functions for high-level coordination and state tracking in the workflow, Lambda functions for discrete computation tasks, and Amazon SQS to store queues of repetitive work.

A deletion job has two phases: Find and Forget. In the Find phase, the solution uses Amazon Athena to scan the data lake for objects containing rows matching the identifiers in the deletion queue. For this to work at scale, we built a query planner Lambda function that uses the partition list in the AWS Glue Data Catalog for each data mapper to run an Athena query on each partition, returning the path to S3 objects that contain matches with the identifiers in the Deletion Queue. The object keys are then added to an SQS queue that we refer to as the Object Deletion Queue.

In the Forget phase, deletion workers are started as a service running on AWS Fargate. These workers process each object in the Object Deletion Queue by downloading the objects from the S3 bucket into memory, deleting the rows that contain matched identifiers, then putting a new version of the object to the S3 bucket using the same key. By default, older versions of the object are then deleted from the S3 bucket to make the deletion irreversible. You can alternatively disable this feature to implement your own strategy for deleting old object versions, such as by using an S3 Lifecycle policy.

Note that during the Forget phase, affected S3 objects are replaced at the time they are processed and are subject to the Amazon S3 data consistency model. We recommend that you avoid running a Deletion Job in parallel to a workload that reads from the data lake unless it has been designed to handle temporary inconsistencies between objects.

When the object deletion queue is empty, the Forget phase is complete and a final status is determined for the Deletion Job based on whether any errors occurred (for example, due to missing permissions for S3 objects).

Logs are generated for all actions throughout the Deletion Job, which you can use for reporting or troubleshooting. These are stored in DynamoDB, along with other persistent data including the Data Mappers and Deletion Queue.


In this post, we introduced the Amazon S3 Find and Forget solution, which assists data lake operators to handle data erasure requests they may receive pursuant to regulations such as GDPR, CCPA, and others. We then described features of the solution and how to use it for a basic use case.

You can get started today by deploying the solution from the GitHub repository, where you can also find more documentation of how the solution works, its features, and limits. We are continuing to develop the solution and welcome you to send feedback, feature requests, or questions through GitHub Issues.


About the Authors

Chris Deigan is an AWS Solution Engineer in London, UK. Chris works with AWS Solution Architects to create standardized tools, code samples, demonstrations, and quick starts.




Matteo Figus is an AWS Solution Engineer based in the UK. Matteo works with the AWS Solution Architects to create standardized tools, code samples, demonstrations and quickstarts. He is passionate about open-source software and in his spare time he likes to cook and play the piano.




Nick Lee is an AWS Solution Engineer based in the UK. Nick works with the AWS Solution Architects to create standardized tools, code samples, demonstrations and quickstarts. In his spare time he enjoys playing football and squash, and binge-watching TV shows.




Adir Sharabi is a Solutions Architect with Amazon Web Services. He works with AWS customers to help them architect secure, resilient, scalable and high performance applications in the cloud. He is also passionate about Data and helping customers to get the most out of it.




Cristina Fuia is a Specialist Solutions Architect for Analytics at AWS. She works with customers across EMEA helping them to solve complex problems, design and build data architectures so that they can get business value from analyzing their data.


Analyzing Amazon S3 server access logs using Amazon ES

Post Syndicated from Mahesh Goyal original https://aws.amazon.com/blogs/big-data/analyzing-amazon-s3-server-access-logs-using-amazon-es/

When you use Amazon Simple Storage Service (Amazon S3) to store corporate data and host websites, you need additional logging to monitor access to your data and the performance of your application. An effective logging solution enhances security and improves the detection of security incidents. With the advent of increased data storage needs, you can rely on Amazon S3 for a range of use cases and simultaneously looking for ways to analyze your logs to ensure compliance, perform the audit, and discover risks.

Amazon S3 lets you monitor the traffic using the server access logging feature. With server access logging, you can capture and monitor the traffic to your S3 bucket at any time, with detailed information about the source of the request. The logs are stored in the S3 bucket you own in the same Region. This addresses the security and compliance requirements of most organizations. The logs are critical for establishing baselines, analyzing access patterns, and identifying trends. For example, the logs could answer a financial organization’s question about how many requests are made to a bucket and who is making what type of access requests to the objects.

You can discover insights from server access logs through several different methods. One common option is by using Amazon Athena or Amazon Redshift Spectrum and query the log files stored in Amazon S3. However, this solution poses high latency with an exponential growth in volume. It requires further integration with Amazon QuickSight to add visualization capabilities.

You can address this by using Amazon Elasticsearch Service (Amazon ES). Amazon ES is a managed service that makes it easier to deploy, operate, and scale Elasticsearch clusters in the AWS Cloud. Elasticsearch is a popular open-source search and analytics engine for use cases such as log analytics, real-time application monitoring, and clickstream analysis. The service provides support for open-source Elasticsearch APIs, managed Kibana, and integration with other AWS services such as Amazon S3 and Amazon Kinesis for loading streaming data into Amazon ES.

This post walks you through automating ingestion of server access logs from Amazon S3 into Amazon ES using AWS Lambda and visualizing the data in Kibana.

Architecture overview

Server access logging is enabled on source buckets, and logs are delivered to access log bucket. The access log bucket is configured to send an event to the Lambda function when a log file is created. On an event trigger, the Lambda function reads the file, processes the access log, and sends it to Amazon ES. When the logs are available, you can use Kibana to create interactive visuals and analyze the logs over a time period.

When designing a log analytics solution for high-frequency incoming data, you should consider buffering layers to avoid instability in the system. Buffering helps you streamline processes for unpredictable incoming log data. For such use cases, you can take advantage of managed services like Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, and Amazon Managed Streaming for Apache Kafka (Amazon MSK).

Streaming services buffer data before delivering it to Amazon ES. This helps you avoid overwhelming your cluster with spiky ingestion events. Kinesis Data Firehose can reliably load data into Amazon ES. Kinesis Data Firehose lets you choose a buffer size of 1–100 MiBs and a buffer interval of 60–900 seconds when Amazon ES is selected as the destination. Kinesis Data Firehose also scales automatically to match the throughput of your data and requires no ongoing administration. For more information, see Ingest streaming data into Amazon Elasticsearch Service within the privacy of your VPC with Amazon Kinesis Data Firehose.

The following diagram illustrates the solution architecture.


Before creating resources in AWS CloudFormation, you must enable server access logging on the source bucket. Open the S3 bucket properties and look for Amazon S3 access and delivery bucket. See the following screenshot.

You also need an AWS Identity and Access Management (IAM) user with sufficient permissions to interact with the AWS Management Console and related AWS services. The user must have access to create IAM roles and policies via the CloudFormation template.

Setting up the resources with AWS CloudFormation

First, deploy the CloudFormation template to create the core components of the architecture. AWS CloudFormation automates the deployment of technology and infrastructure in a safe and repeatable manner across multiple Regions and multiple accounts with the least amount of effort and time.

  1. Sign in to the console and choose the Region of the bucket storing the access log. For this post, I use us-east-1.
  2. Launch the stack:
  3. Choose Next.
  4. For Stack name, enter a name.
  5. On the Parameters page, enter the following parameters:
    1. VPC Configuration – Select any VPC that has at least two private subnets. The template deploys the Amazon ES service domain and Lambda within the VPC.
    2. Private subnets – Select two private subnets of the VPC. The route tables associated with subnets must have a NAT gateway configuration and VPC endpoint for Amazon S3 to privately connect the bucket from Lambda.
    3. Access log S3 bucket – Enter the S3 bucket where access logs are delivered. The template configures event notification on the bucket to trigger the Lambda function.
    4. Amazon ES domain name – Specify the Amazon ES domain name to be deployed through the template.
  6. Choose Next.
  7. On the next page, choose Next.
  8. Acknowledge resource creation under Capabilities and transforms and choose Create.

The stack takes about 10–15 minutes to complete. The CloudFormation stack does the following:

  • Creates an Amazon ES domain with fine-grained access control enabled on it. Fine-grained access control is configured with a primary user in the internal user database.
  • Creates IAM role for the Lambda function with required permission to read from S3 bucket and write to Amazon ES.
  • Creates Lambda within the same VPC of Amazon ES elastic network interfaces (ENI). Amazon ES places an ENI in the VPC for each of your data nodes. The communication from Lambda to the Amazon ES domain is via this ENI.
  • Configures file create event notification on Access log S3 bucket to trigger the Lambda function. The function code segments are discussed in detail in this GitHub project.

You must make several considerations before you proceed with a production-grade deployment. For this post, I use one primary shard with no replicas. As a best practice, we recommend deploying your domain into three Availability Zones with at least two replicas. This configuration lets Amazon ES distribute replica shards to different Availability Zones than their corresponding primary shards and improves the availability of your domain. For more information about sizing your Amazon ES, see Get started with Amazon Elasticsearch Service: T-shirt-size your domain.

We recommend setting the shard count based on your estimated index size, using 50 GB as a maximum target shard size. You should also define an index template to set the primary and replica shard counts before index creation. For more information about best practices, see Best practices for configuring your Amazon Elasticsearch Service domain.

For high-frequency incoming data, you can rotate indexes either per day or per week depending on the size of data being generated. You can use Index State Management to define custom management policies to automate routine tasks and apply them to indexes and index patterns.

Creating the Kibana user

With Amazon ES, you can configure fine-grained users to control access to your data. Fine-grained access control adds multiple capabilities to give you tighter control over your data. This feature includes the ability to use roles to define granular permissions for indexes, documents, or fields and to extend Kibana with read-only views and secure multi-tenant support. For more information on granular access control, see Fine-Grained Access Control in Amazon Elasticsearch Service.

For this post, you create a fine-grained role for Kibana access and map it to a user.

  1. Navigate to Kibana and enter the primary user credentials:
    1. User nameadminuser01
    2. Password[email protected]

To access Kibana, you must have access to the VPC. For more information about accessing Kibana, see Controlling Access to Kibana.

  1. Choose Security, Roles.
  2. For Role name, enter kibana_only_role.
  3. For Cluster-wide permissions, choose cluster_composite_ops_ro.
  4. For Index patterns, enter access-log and kibana.
  5. For Permissions: Action Groups, choose read, delete, index, and manage.
  6. Choose Save Role Definition.
  7. Choose Security, Internal User Database, and Create a New User.
  8. For Open Distro Security Roles, choose Kibana_only_role (created earlier).
  9. Choose Submit.

The user kibanauser01 now has full access to Kibana and access-logs indexes. You can log in to Kibana with this user and create the visuals and dashboards.

Building dashboards

You can use Kibana to build interactive visuals and analyze the trends and combine the visuals for different use cases in a dashboard. For example, you may want to see the number of requests made to the buckets in the last two days.

  1. Log in to Kibana using kibanauser01.
  2. Create an index pattern and set the time range
  3. On the Visualize section of your Kibana dashboard, add a new visualization.
  4. Choose Vertical Bar.

You can select any time range and visual based on your requirements.

  1. Choose the index pattern and then configure your graph options.
  2. In the Metrics pane, expand Y-Axis.
  3. For Aggregation, choose Count.
  4. For Custom Label, enter Request Count.
  5. Expand the X-Axis
  6. For Aggregation, choose Terms.
  7. For Field, choose bucket.
  8. For Order By, choose metric: Request Count.
  9. Choose Apply changes.
  10. Choose Add sub-bucket and expand the Split Series
  11. For Sub Aggregation, choose Date Histogram.
  12. For Field, choose requestdatetime.
  13. For Interval, choose Daily.
  14. Apply the changes by choosing the play icon at the top of the page.

You should see the visual on the right side, similar to the following screenshot.

You can combine graphs of different use cases into a dashboard. I have built some example graphs for general use cases like the number of operations per bucket, user action breakdown for buckets, HTTPS status rate, top users, and tabular formatted error details. See the following screenshots.

Cleaning up

Delete all the resources deployed through the CloudFormation template to avoid any unintended costs.

  1. Disable the access log on source bucket.
  2. On to the CloudFormation console, identify the stacks appropriately, and delete


This post detailed a solution to visualize and monitor Amazon S3 access logs using Amazon ES to ensure compliance, perform security audits, and discover risks and patterns at scale with minimal latency. To learn about best practices of Amazon ES, see Amazon Elasticsearch Service Best Practices. To learn how to analyze and create a dashboard of data stored in Amazon ES, see the AWS Security Blog.

About the Authors

Mahesh Goyal is a Data Architect in Big Data at AWS. He works with customers in their journey to the cloud with a focus on big data and data warehouses. In his spare time, Mahesh likes to listen to music and explore new food places with his family.





New – AWS Fargate for Amazon EKS now supports Amazon EFS

Post Syndicated from Harunobu Kameda original https://aws.amazon.com/blogs/aws/new-aws-fargate-for-amazon-eks-now-supports-amazon-efs/

AWS Fargate is a serverless compute engine for containers available with both Amazon Elastic Kubernetes Service (EKS) and Amazon Elastic Container Service (ECS). With Fargate, developers are able to focus on building applications, eliminating the need to manage the infrastructure related undifferentiated heavy lifting.

Developers specify resources for each Kubernetes pod, and are charged only for provisioned compute resource. When using Fargate, each EKS pod runs in its own kernel runtime environment and CPU, memory, storage, and network resources are never shared with other pods, providing workload isolation and increased security.

Containers are ephemeral in nature. They are dynamically scaled in and out, and their saved state or data is cleared on exit. We’ve had many requirements from our customers about data persistence and shared storage of containerized applications since launching EKS support for Fargate in 2019, and announced Amazon Elastic File System(EFS) support for Fargate on ECS in April 2020. Now many customers are operating stateful workloads on it, and others have requested support for EFS with Fargate when used with EKS. Today we are happy to announce this EFS support.

EFS provides a simple, scalable, and fully managed shared file system for use with AWS cloud services, and can also help Kubernetes applications be highly available because all data written to EFS is written to multiple AWS Availability Zones. EFS is built for on-demand petabyte growth without application interruption, and it automatically grows and shrinks as files are added and removed, eliminating the need to provision and manage capacity to accommodate growth. EFS Access Points is also ideal for security sensitive workloads as it can encrypt data in the file system and data in transit.

Kubernetes supports “Container Storage Interface (CSI)” which is a standard for exposing block and file storage systems to containerized workloads. The EFS CSI driver makes it simple to configure elastic file storage for Kubernetes clusters, and before this update customers could to use EFS via Amazon EC2 worker nodes connected to a cluster. Now customers can also configure their pods running on Fargate to access an EFS file system using standard Kubernetes APIs. With this update, customers can run stateful workloads that require highly available file systems as well as workloads that require access to shared storage. Using the EFS CSI driver, all data in transit is encrypted by default.

We released a generally available version of the Amazon EFS CSI driver for EKS in July 2020. The Amazon EFS CSI driver makes it easy to configure elastic file storage for both EKS and self-managed Kubernetes clusters running on AWS using standard Kubernetes interfaces. If a Kubernetes pod is terminated and relaunched, the CSI driver reconnects the EFS file system, even if the pod is relaunched in a different AWS Availability Zone.​ When using standard EC2 worker nodes, the EFS CSI driver needs to be deployed as a set of pods and DaemonSets. With this new update, for Fargate this step is not required and you do not need to install the EFS CSI driver, as it is installed in the Fargate stack and support for EFS is provided out of the box. Customers can use EFS with Fargate for EKS without spending the time and resources to install and update the CSI driver.

How to configure the Fargate/EKS and EFS integration?

You need to use three Kubernetes settings to mount EFS to Farfgate on EKS. Those are StorageClass, PersistentVolume (PV), and PersistentVolumeClaim (PVC). Configuring StorageClass and PVs are steps that an administrator (or similar) would do to make EFS file systems available for application developers. PVCs are used to allocate PVs from the pool of existing PVs as needed to deploy applications.

The StorageClass object provides a way for a Kubernetes administrator to register a specific storage type (e.g. EFS or EBS) and configuration (e.g. throughput, backup policy). Once a StorageClass is defined the PV object is used to create actual storage volumes inside that class. StorageClass and PV are the Kubernetes mechanisms that allow actual storage subsystems to be abstracted and decoupled from the way they are consumed by Kubernetes users. For example, while a Kubernetes administrator needs to know how exactly to configure a specific storage configuration from a particular storage service, Kubernetes users do not because they only see their volumes within abstract classes of storage.

The last step is the binding: Kubernetes users requests access to said volumes via the PVC object and related API. These volumes can be created dynamically when the user requests them via the PVC or they need to be statically pre-created by an administrator for later consumption by a Kubernetes user. The current implementation of the EFS CSI driver requires the volumes to be statically pre-created for the PVC binding to work.

If you are new to Kubernetes persistent volumes and want to know more about how they work, please refer to this page in the Kubernetes documentation that has all the details.

Let’s see this in action. First, you need to create your own EFS file system in the same AWS Region. If you are not familiar with EFS this EFS getting start guide is a good resource you can start with.

Once you create an EFS file system, you get your file system ID. You can configure the mount settings using a Kubernetes StorageClass and PersistentVolume. Here is an example of the YAML files:

CSIDriver Object

apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
  name: efs.csi.aws.com
  attachRequired: false

For now you need to add the EFS CSIDriver object shown above to your cluster so Kubernetes can discover the driver that Fargate automatically installs. In the future, this manifest will be added by default to EKS clusters.

Storage Class

kind: StorageClass
apiVersion: storage.k8s.io/v1
  name: efs-sc
provisioner: efs.csi.aws.com


apiVersion: v1
kind: PersistentVolume
  name: efs-pv
    storage: 5Gi
  volumeMode: Filesystem
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
    driver: efs.csi.aws.com
    volumeHandle: <EFS filesystem ID>

The volumeHandle is returned by the EFS service when you create a file system, and you need to use it to configure the CSI driver to create the PV. You can obtain EFS filesystem ID from the AWS management console or the below command by AWS CLI.

aws efs describe-file-systems --query "FileSystems[*].FileSystemId" --output text

Now that you have created a PV by applying the manifest above, you configure Kurbenetes pods to access the EFS file system by including a PersistentVolumeClaim in the pod manifest. These are two manifest examples that do that:


apiVersion: v1
kind: PersistentVolumeClaim
  name: efs-claim
    - ReadWriteMany
  storageClassName: efs-sc
      storage: 5Gi

Pod manifest

apiVersion: v1
kind: Pod
  name: app1
  - name: app1 image: busybox command: ["/bin/sh"] args: ["-c", "while true; do echo $(date -u) >> /data/out1.txt; sleep 5; done"]
    - name: persistent-storage
      mountPath: /data
  - name: persistent-storage
      claimName: efs-claim

Available Today

Today, this feature update is available for newly created EKS clusters with Kubernetes version 1.17, and we are planning to roll out support for this feature with additional Kubernetes versions on EKS in the coming weeks. This update is available in all AWS regions where Fargate with EKS is available. You can check our latest documentation for more detail.

– Kame;

New – High-Performance HDD Storage for Amazon FSx for Lustre File Systems

Post Syndicated from Harunobu Kameda original https://aws.amazon.com/blogs/aws/new-high-performance-hdd-storage-for-amazon-fsx-for-lustre-file-systems/

Many workloads, such as genome analysis, training of machine learning models, High Performance Computing (HPC), and analytics applications depend on multiple compute instances accessing the same set of data. For these workloads, clusters of compute instances are commonly connected to a high-performance shared file system. Amazon FSx for Lustre makes it easy and cost-effective to launch and run the world’s most popular high-performance shared file system. And today we’re announcing new HDD storage options for FSx for Lustre that reduce storage costs by up to 80% for throughput-intensive workloads that don’t require the sub-millisecond latencies of SSD storage.

Customers can achieve up to tens of gigabytes of throughput per second while lowering their storage costs for workloads where throughput is the dominant performance attribute. Video rendering and financial simulations are two examples of these throughput-intensive workloads.

This announcement includes two new HDD-based storage options which are optimized for reading and writing sequential file data. One offers 12 MB/sec of baseline throughput per TiB of storage and the other offers 40 MB/sec of baseline throughput per TiB of storage, and both allow you to burst to six times those throughput levels. To increase performance for frequently accessed files, you can also provision an SSD cache that is automatically sized to 20% of your HDD file system storage capacity. On file systems that are provisioned with an SSD cache, files read from the cache are served with sub-millisecond latencies.

The new FSx file systems are comprised of multiple HDD-based storage servers and a single SSD-based metadata server. The SSD storage on the metadata servers ensures that all metadata operations, which represent the majority of file system operations, are delivered with sub-millisecond latencies.

HDD performance increases with storage capacity making it easy to scale out your storage solution without encountering file system bottlenecks. Here’s a summary of the performance specifications for both the new HDD storage options and the existing SSD storage options.

Quick Guide

Traditionally, operating and scaling high performance file systems was costly and time consuming. Now with just a few clicks anyone can use FSx for Lustre for any compute workload. Launching the HDD-based file system is easy. Simply open the management console and click the Create file system button.

Chose FSx for Lustre and click Next.

FSx for Lustre offers two deployment types – Persistent and Scratch. HDD storage is available on persistent mode which is designed for longer-term storage and workloads. On persistent file systems, data is replicated and file servers are replaced if they fail whereas the scratch type are ideal for temporary storage and shorter-term processing of data. On scratch file systems, data is not replicated and does not persist if a file server fails. You can can find more detail on the difference between the two deployment options in this blog article.

Once you choose HDD as the Storage Type, you can select 12 or 40 MB/s per TiB for the Throughput per unit of storage. You can also add the SSD cache to accelerate file access by choosing “Read-only SSD cache” as Drive Cache Type.

You can also create a file system by CLI.

fsx create-file-system \
--storage-capacity <capacity> --storage-type HDD \
--file-system-type LUSTRE \
--subnet-ids subnet-<your vpc subnet id>85b2c0ce --lustre-configuration \
DeploymentType=PERSISTENT_1,PerUnitStorageThroughput=<12 or 40>\,DriveCacheType=<NONE or READ>

For PerUnitStorageThroughput=12, acceptable values of storage capacity are multiples of 6000.
For PerUnitStorageThroughput=40, acceptable values of storage capacity are multiples of 1800.

Available Today

The new HDD storage options are available for all AWS regions where Amazon FSx for Lustre is available. Please visit our web site for more details.

–  Kame;


New – Using Amazon GuardDuty to Protect Your S3 Buckets

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-using-amazon-guardduty-to-protect-your-s3-buckets/

As we anticipated in this post, the anomaly and threat detection for Amazon Simple Storage Service (S3) activities that was previously available in Amazon Macie has now been enhanced and reduced in cost by over 80% as part of Amazon GuardDuty. This expands GuardDuty threat detection coverage beyond workloads and AWS accounts to also help you protect your data stored in S3.

This new capability enables GuardDuty to continuously monitor and profile S3 data access events (usually referred to data plane operations) and S3 configurations (control plane APIs) to detect suspicious activities such as requests coming from an unusual geo-location, disabling of preventative controls such as S3 block public access, or API call patterns consistent with an attempt to discover misconfigured bucket permissions. To detect possibly malicious behavior, GuardDuty uses a combination of anomaly detection, machine learning, and continuously updated threat intelligence. For your reference, here’s the full list of GuardDuty S3 threat detections.

When threats are detected, GuardDuty produces detailed security findings to the console and to Amazon EventBridge, making alerts actionable and easy to integrate into existing event management and workflow systems, or trigger automated remediation actions using AWS Lambda. You can optionally deliver findings to an S3 bucket to aggregate findings from multiple regions, and to integrate with third party security analysis tools.

If you are not using GuardDuty yet, S3 protection will be on by default when you enable the service. If you are using GuardDuty, you can simply enable this new capability with one-click in the GuardDuty console or through the API. For simplicity, and to optimize your costs, GuardDuty has now been integrated directly with S3. In this way, you don’t need to manually enable or configure S3 data event logging in AWS CloudTrail to take advantage of this new capability. GuardDuty also intelligently processes only the data events that can be used to generate threat detections, significantly reducing the number of events processed and lowering your costs.

If you are part of a centralized security team that manages GuardDuty across your entire organization, you can manage all accounts from a single account using the integration with AWS Organizations.

Enabling S3 Protection for an AWS Account
I already have GuardDuty enabled for my AWS account in this region. Now, I want to add threat detection for my S3 buckets. In the GuardDuty console, I select S3 Protection and then Enable. That’s it. To be more protected, I repeat this process for all regions enabled in my account.

After a few minutes, I start seeing new findings related to my S3 buckets. I can select each finding to get more information on the possible threat, including details on the source actor and the target action.

After a few days, I select the Usage section of the console to monitor the estimated monthly costs of GuardDuty in my account, including the new S3 protection. I can also find which are the S3 buckets contributing more to the costs. Well, it turns out I didn’t have lots of traffic on my buckets recently.

Enabling S3 Protection for an AWS Organization
To simplify management of multiple accounts, GuardDuty uses its integration with AWS Organizations to allow you to delegate an account to be the administrator for GuardDuty for the whole organization.

Now, the delegated administrator can enable GuardDuty for all accounts in the organization in a region with one click. You can also set Auto-enable to ON to automatically include new accounts in the organization. If you prefer, you can add accounts by invitation. You can then go to the S3 Protection page under Settings to enable S3 protection for their entire organization.

When selecting Auto-enable, the delegated administrator can also choose to enable S3 protection automatically for new member accounts.

Available Now
As always, with Amazon GuardDuty, you only pay for the quantity of logs and events processed to detect threats. This includes API control plane events captured in CloudTrail, network flow captured in VPC Flow Logs, DNS request and response logs, and with S3 protection enabled, S3 data plane events. These sources are ingested by GuardDuty through internal integrations when you enable the service, so you don’t need to configure any of these sources directly. The service continually optimizes logs and events processed to reduce your cost, and displays your usage split by source in the console. If configured in multi-account, usage is also split by account.

There is a 30-day free trial for the new S3 threat detection capabilities. This applies as well to accounts that already have GuardDuty enabled, and add the new S3 protection capability. During the trial, the estimated cost based on your S3 data event volume is calculated in the GuardDuty console Usage tab. In this way, while you evaluate these new capabilities at no cost, you can understand what would be your monthly spend.

GuardDuty for S3 protection is available in all regions where GuardDuty is offered. For regional availability, please see the AWS Region Table. To learn more, please see the documentation.


Field Notes: Building a Disaster Recovery site on AWS for your Azure Workload

Post Syndicated from Ebrahim (EB) Khiyami original https://aws.amazon.com/blogs/architecture/field-notes-building-a-disaster-recovery-site-on-aws-for-your-azure-workload/

Customers running workloads on other clouds, like Azure or GCP, can increase resilience and meet compliance requirements by using AWS as their disaster recovery site. CloudEndure Disaster Recovery provides an easy cross-cloud solution for replicating and recovering workloads from other cloud providers to AWS. It automatically converts your source machines so that they boot and run natively on AWS.

In this blog post, I show you how to use CloudEndure Disaster Recovery to build a DR site on AWS if your primary workload is on Azure. I will build connectivity between Azure and AWS according to CloudEndure requirements, install CloudEndure Agent, failover a server from Azure to AWS, and finally return to normal operations by failing back from AWS to Azure.

When using CloudEndure Disaster Recovery, your source location can be AWS (for Region to Region replication within AWS), VCenter, or ‘Other Infrastructure’. Other Infrastructure includes physical servers, virtual machines on private cloud, or virtual machines on public cloud providers like Azure or GCP. Using CloudEndure Disaster Recovery, you can achieve a Recovery Point Objectives (RPO) of 1 second, and a Recovery Time Objective (RTO) of 5-20 minutes (depending on the operating system used).

The failover process to AWS is the same regardless of source infrastructure. It starts by setting up your CloudEndure Disaster Recovery project, installing CloudEndure Agent and completing the initial replication from source (Azure) to target (AWS), and then performing a failover to target (AWS). However, the failback process (discussed later), will vary depending on your source infrastructure. During failback, CloudEndure DR reverses the replication to the opposite direction, in our case from AWS to Azure, so that the data written on your recovered servers during the disaster can be restored to the original source environment (Azure).

The following architectural diagram depicts the solution I build in this post.

Architectural Diagram

Figure 1 – Architectural Diagram

For the purpose of this post, I’m using a sample VM that runs Windows Server 2019 datacenter on Azure in East US 2 Region. I choose US-East-1 as a recovery target Region in AWS. The process is identical for any number of servers and any other supported operating systems.

I perform the following steps:

  1. Establish connectivity between Azure and AWS
  2. Complete the networking requirements for CloudEndure Disaster Recovery
  3. Register for a CloudEndure DR account, create a DR project and get an agent installation token.
  4. Install CloudEndure Agent on Azure VM
  5. Configure Blueprint for target server in the CloudEndure DR User Console
  6. Perform failover from Azure to AWS
  7. Perform failback from AWS to Azure
  8. Return to normal operations

Let’s review the details of each step.

1-      Establish connectivity between Azure and AWS

CloudEndure supports connectivity using public internet, VPN, or DirectConnect. For this demo, I will create a VPN connection between the two sites. Creating a VPN connectivity between Azure and AWS is no different than a normal Site-to-Site VPN connection setup between AWS and on-premises. I have to do the configurations in parallel on both sides so that both ends of the connection will be synced.

Assuming you have created a Virtual Network and a Virtual Private Cloud (VPC) on Azure and AWS sides respectively, following are the high-level steps to create a VPN connection between the two.  For more details review, “How do I create a secure connection between my office network and Amazon Virtual Private Cloud?”, and “Create a Site-to-Site connection in the Azure portal.”

On Azure

1-1-           Create Virtual Network Gateway with static routing

1-2-           Take a note of the public IP of the gateway created. We will use it in the next step.


1-3-           Create a Virtual Private Gateway (VPG). Select Amazon default ASN for ASN.

Virtual Private Gateway

Figure 2 – Create a Virtual Private Gateway

1-4             Attach the VPG created to the VPC you plan to use to host your DR.

Attach a Virtual Private Gateway

Figure 3 – Attach a Virtual Private Gateway

1-5-           Create a Customer Gateway (CGW). Select Static for Routing and provide the public IP address for the Virtual Network Gateway on Azure side noted in step 1-2. For Static IP Address enter the network address of your primary site on Azure.

Create Customer Gateway

Figure 4 – Create Customer Gateway

1-6-           Create Site-to-Site VPN Connection. Select VGW and customer gateway created in the preceding step. Select Static routing and provide the IP address for VNet network on Azure.

Create VPN Connection

Figure 5 – Create VPN Connection

1-7-           Download the Configuration Template

VPN Configuration Template

Figure 6 – VPN Configuration Template

1-8-           We must locate the Pre-Shared Key, and the Virtual Private Gateway IP address in the configuration file.  Review the file carefully to understand its content. See my example in the following image.

Figure 7- VPN Configuration - Pre-shared key

Figure 7- VPN Configuration – Pre-shared key


Figure 8 - VPN Configuration VPG IP

Figure 8 – VPN Configuration VPG IP

Let’s go back to the Azure Portal

1-9-           Create a Local Network Gateway. Provide the Outside IP address noted in step 1-8. In the Address space field provide the DR VPC network address on AWS side.

Figure 9 - Local Network Gateway

Figure 9 – Local Network Gateway

1-10-           Create a connection. You must provide the pre-shared key noted from configuration file in step 1-8.

Figure 10 - Azure VPN Connection

Figure 10 – Azure VPN Connection


1-11-           Finally, remember to modify the routing table of the subnet on AWS side so that it routes VPN traffic to VGW.

Figure 11 - AWS Routing Table

Figure 11 – AWS Routing Table

The VPN connection we created on AWS side will have 2 tunnels. I recommend you setup both tunnels for high availability. You must follow the same steps to configure the second tunnel.

At this point, you have a VPN connection between Azure and AWS. In the next step we must add the necessary ports needed for CloudEndure management and replication between Azure and AWS.

2-       Complete the networking requirements for CloudEndure

Adjust the security settings on both sides to allow the ports necessary for CloudEndure DR replication and authentication.

Communication over TCP Port 443:

  • Between the Source Machines and the CloudEndure Service Manager.
  • Between the Staging Area and the CloudEndure Service Manager.

Communication over TCP Port 1500:

  • Between the Source Machines and the Staging Area

3-       Register for a CloudEndure DR account, create a DR project, and get an agent installation token

Before using CloudEndure DR you need to subscribe to the product on AWS Marketplace. The subscription process is as follows:

The first step in using CloudEndure is creating a project. You have two types of projects:

  1. Migration
  2. Disaster Recovery

Select Disaster Recovery.

Figure 12- CloudEndure Project

Figure 12- CloudEndure Project

After you create a project you must setup source and destination environments, replication settings and retrieve an authentication token.  You use the token to install CloudEndure Agent on source environment servers, in this case Azure VM. You do that by providing credentials (Access key and Secret access key) for an Identity and Access Management (IAM) user in your AWS account that has the necessary permissions to perform CloudEndure related API. The details are listed in the CloudEndure IAM policy.

Figure 13 - Project Credentials

Figure 13 – Project Credentials

Next step is to set your disaster recovery source and target. Source can be AWS Region, if you want to perform a Region to Region disaster recovery within AWS. The source is Other Infrastructure, if your source environment is a physical datacenter, virtual environment or any other public cloud.

For Azure, I choose Other Infrastructure. You also must provide the subnet in which the replication server on AWS will be launched in the staging area and the security group it will use. I keep the instance type for the replication server as default.

Figure 14 - Replication Settings

Figure 14 – Replication Settings

After you complete the setup in the replication setting screen you are presented with a How to Add Machines page that will give you the installation instructions for CloudEndure Agent and the installation token. Take a note of these details and get ready to switch to the Azure portal.

Figure 15 - CloudEndure How to Add Machines

Figure 15 – CloudEndure How to Add Machines

4-       Install CloudEndure Agent

My source environment is a VM that runs Windows 2019 datacenter and has two disks of size 8 and 15 GB. As shown in the preceding screenshot, I download the agent installer for Windows and then RDP to the VM. I run the command to install the agent. This is the output from my screen:

Figure 16- CloudEndure Agent Installation

Figure 16- CloudEndure Agent Installation

After installation has completed successfully, CloudEndure will start creating the replication server in the staging area using configurations you set up in the Blueprint (more about Blueprint in the next step). It will authenticate the replication server with the CloudEndure Service Manager.

  • Download the replication software
  • Create staging disks
  • Pair CloudEndure Agent on the source VM with the replication server,
  • Establish communication between the CloudEndure Agent and the replication server

The replication traffic is encrypted in transit using AES 256 bit, and is carried on TCP port 1500.  Make sure you have this port open on Azure side in the egress direction to AWS. CloudEndure then will start to initiate replication and when it finishes, the source VM will show in the dashboard with Continuous Data Protection CDP state in the Data Replication Progress column. The VM at this point is ready to be tested for disaster recovery.

Figure 17- Initial Replication Completed

Figure 17- Initial Replication Completed

5-       Set up the Blueprint

While I’m waiting for the initial data sync to complete and enter Continuous Data Protection state, I choose Machines in the CloudEndure dashboard to go to the Blueprint page.

The Blueprint is a set of instructions on how to launch a target machine. You can define Blueprint properties such as instance type, subnet, security group, IP address and others.

I select my target VM to run on a machine type that’s a copy of source. This means CloudEndure will match an instance type on AWS that provides at least the compute and memory properties on the Azure VM. I also select a subnet that I had already created in my disaster recovery VPC and choose a new IP to be assigned to the target server from that subnet.

Figure 18- CloudEndure Blueprint

Figure 18- CloudEndure Blueprint

6-       Perform Failover from Azure to AWS

The overall process for failing over starts with the replication. As we saw in the previous step, the replication starts with initial sync and then it enters the Continuous data protection state.

Multiple Point in Time (PIT) snapshots will then be created. Before you perform an actual failover, I recommend that you launch your target server in Test mode first.

You do that by selecting the machine and then select Launch Target Machine> Test Mode. You then must select the snapshot you want to recover from. CloudEndure DR by default allows you select from 60 snapshots per disk per month. I select Latest and then Recover.

After the Test machine has been created successfully, the Disaster Recovery Lifecycle column will show “Tested Recently” with a green bar to the left of the machine name. This means the machine is now ready to be failed over.

Figure 19- CloudEndure DR Snapshots

Figure 19- CloudEndure DR Snapshots

Figure 20- VM Ready for Failover

Figure 20- VM Ready for Failover

Steps for failover

  • Launch the machine in Recovery mode. This is similar to launching machines in Test Mode. Select the latest snapshot.
  • Perform the actual Failover. In this step you direct your users’ traffic from Azure to AWS. The steps here vary depending on the tool you use for directing traffic.
  • CloudEndure will create the final failed-over machine using the configurations you provided in Blueprint in step 5. At this point, the Disaster Recovery Lifecycle column will show “Failed Over”.
Figure 21 - Target Machine Created

Figure 21 – Target Machine Created

7-       Perform Failback from Azure to AWS

Whether you perform the disaster recovery failover as part of an annual DR testing exercise, or as part of an actual disaster, you may want to failback to the source environment after the outage/disaster is over. This functionality is called failback.

CloudEndure DR performs failback by resetting your project and reversing the replication direction from AWS to Azure. All data that was written to your recovery machine (on AWS) during the disaster, will be copied back to machines in your original source infrastructure (Azure).

The actual steps for failback depend on the source infrastructure. In case your source environment is AWS or VMWare, the steps are fully orchestrated. If your source infrastructure is physical or other public cloud providers, a Failback client will be necessary.

Steps for Failback

  • Prepare the project for failback by selecting the Prepare for Failback option in the Project Actions menu in the User Console.
Figure 22- Prepare Project for Failback

Figure 22- Prepare Project for Failback


Figure 22- Prepare Project for Failback

The project will show “Preparing for failback to original Source” next to the project type. After a few minutes, the Data Replication Progress column will show “Pair the CloudEndure Agent with the Replication Server”. This state will change only after you complete the installation of the failback client on Azure side in the next step.

Figure 23- Prepare for Failback

Figure 23- Prepare for Failback

Download the Failback Client from Replication Settings. The failback client is an ubuntu ISO image that acts as a standalone replication server. You must install it on the source VM and boot from it so that it starts to initiate the connection necessary to start the replication in the reversed direction, from AWS to Azure.

Figure 24- Download Failback Client

Figure 24- Download Failback Client

Prepare Failback client on Azure

Azure currently supports creating virtual machines from VHD. In order to use CloudEndure failback client, you need first to convert it from ISO to VHD format. The detailed steps to do this are outside the scope of this post, but, at a high-level, this is what you do:

  • Deploy the failback ISO image to a VMWare workstation.
  • Once the installation is complete, power off the VM and copy the resulted VMDK file
  • Use the following PowerShell script to convert the VMDK file into VHD. (You must change file locations to reflect your actual environments)
1.  Import-Module 'C:\Program Files\Microsoft Virtual Machine Converter\MvmcCmdlet.psd1<br />
2.  ConvertTo-MvmcVirtualHardDisk -SourceLiteralPath 'C:\Users\Administrator\Documents\Virtual Machines\failbackdvd\failbackdvd.vmdk' -VhdType FixedHardDisk -VhdFormat Vhd -DestinationLiteralPath C:\DR


Figure 25- Converting Failback Client

Figure 25- Converting Failback Client

Figure 26- Failback Client Converted

Figure 26- Failback Client Converted

  • Upload the resulted VHD file to your Storage account on Azure using the steps in Create an Azure Storage Account
  • Create an image using the VHD file you uploaded
  • Create a new VM using the image you created. This will act as the replication server. Later I will use it to restore the machine data from AWS to Azure after the replication completes. Make sure you attached the same disks to it as in the source machine on AWS described in step 4.
  • When I start the newly created VM, I see the following screen indicating that CloudEndure failback client has been loaded and is waiting to for CloudEndure authentication token to authenticate with CloudEndure Console. Details on where to retrieve your authentication token from CloudEndure Console is available in Installing the Agents.
Figure 27- Failback Client Starts

Figure 27- Failback Client Starts

I enter the authentication token and then the ID of the machine I want to replicate. Details to check a machine ID in CloudEndure Console are available in the Machines Page. The replication server will authenticate with CloudEndure Console, and will connect to the source machine on AWS on port TCP 1500 to initiate the replication. Make sure you have a security group to allow to allow this connectivity on Azure and AWS side.

Figure 28- Failback Client Authenticated

Figure 28- Failback Client Authenticated

I wait for the replication to complete and enter Continuous Data Protection state on the CloudEndure Console. This indicates the replication direction has been reversed successfully.

Figure 29- Replication Reversed Successfully

Figure 29- Replication Reversed Successfully

I go back to CloudEndure and follow the same steps as in step 6 to launch a Test and then a Recovery instance. One thing to consider here is when launching Test/Recovery instance, the failback client will reboot as part of the launch process. Make sure to manually detach the volume that contains the failback client.

The CloudEndure Console will now show “Pair with CloudEndure Agent”. This status will change only when you return your project to normal operation in the next step.

8-       Return to normal operation

At this point, we failed over from Azure to AWS as part of a disaster recovery exercise or an actual disaster. We then performed a failback from AWS to Azure by reversing the replication direction so that all data written during the outage/disaster on AWS side will be replicated back to Azure.

The last step in the process is to return back to normal operations. This will again reverse the replication from the original primary location in Azure to the original recovery target in AWS. You do so by selecting Project Actions> Return to Normal Operation.

Figure 30- Return to Normal Operation Completes

Figure 30- Return to Normal Operation Completes

The failback has now completed and your replication has been reversed back to its original direction from Azure to AWS.

Clean Up

By the end of this exercise, please make sure you delete the resources you created on both AWS and Azure sides, so you don’t incur charges in the future.


In this blog post, we built a demo environment to demonstrate CloudEndure DR functionality between Azure as a DR source site, and AWS as a DR target. I showed you how to create a VPN tunnel between the sites, start replication, failover from Azure to AWS and then failback from AWS to Azure (using CloudEndure failback client). Finally, I showed you how to return your project to its normal operation by reverting the replication to its original direction that we started with from Azure to AWS. More details can be found in CloudEndure Disaster Recovery. 

If you have questions, feel free to ask in the comments. I look forward to hearing from you.

Field Notes provides hands-on technical guidance from AWS Solutions Architects, consultants, and technical account managers, based on their experiences in the field solving real-world business problems for customers.