Backblaze Terraform Provider Changes the Game for Avisi

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/backblaze-terraform-provider-changes-the-game-for-avisi/

Backblaze + Avisi Apps

Recently, we announced that Backblaze published a Backblaze B2 Cloud Storage provider to the Terraform registry to support developers in their infrastructure as code (IaC) efforts. With the Backblaze Terraform provider, you can provision and manage B2 Cloud Storage resources directly from a Terraform configuration file.

Today’s post grew from a comment in our GitHub repository from Gert-Jan van de Streek, Co-founder of Avisi, a Netherlands-based software development company. That comment sparked a conversation that turned into a bigger story. We spoke with Gert-Jan to find out how the Avisi team practices IaC and uses the Backblaze Terraform provider to increase efficiency, accuracy, and speed throughout the DevOps lifecycle. We hope it will be useful for other developers considering IaC for their operations.

What Is Infrastructure as Code?

IaC emerged in the late 2000s as a response to the increasing complexity of scaling software development. Rather than provisioning infrastructure via a provider’s user interface, developers can design, implement, and deploy infrastructure for applications using the same tools and best practices they use to write software.

Provisioning Storage for “Apps That Fill Gaps”

The team at Avisi likes to think about software development as a sport. And their long-term vision is just as big and audacious as an Olympic contender’s—to be the best software development company in their country.

Gert-Jan co-founded Avisi in 2000 with two college friends. They specialize in custom project management, process optimization, and ERP software solutions, providing implementation, installation and configuration support, integration and customization, and training and plugin development. They built the company by focusing on security, privacy, and quality, which helped them to take on projects with public utilities, healthcare providers, and organizations like the Dutch Royal Notarial Professional Organization—entities that demand stable, secure, and private production environments.

They bring the same focus to product development, a business line Gert-Jan leads where they create “apps that fill gaps.” He coined the tagline to describe the apps they publish on the Atlassian and monday.com marketplaces. “We know that a lot of stuff is missing from the Atlassian and monday.com tooling because we use it in our everyday life. Our goal in life is to provide that missing functionality—apps to fill gaps,” he explained.

Avisi application platforms - Confluence, Jira, Bitbucket, Monday.com, GitLab
Avisi’s applications fill the gaps in popular project management solutions.

With multiple development environments for each application, managing storage becomes a maintenance problem for sophisticated DevOps teams like Avisi’s. For example, let’s say Gert-Jan has 10 apps to deploy. Each app has test, staging, and production environments, and each has to be deployed in three different regions. That’s 90 individual storage configurations, 90 opportunities to make a mistake, and 90 times the labor it takes to provision one bucket.

Infrastructure in Sophisticated DevOps Environments: An Example

10 apps x 3 environments x 3 regions = 90 storage configurations

Following DevOps best practices means Avisi writes reusable code, eliminating much of the manual labor and room for error. “It was really important for us to have IaC so we’re not clicking around in user interfaces. We need to have stable test, staging, and production environments where we don’t have any surprises,” Gert-Jan explained.

Terraform vs. CloudFormation

Gert-Jan had already been experimenting with Terraform, an open-source IaC tool developed by HashiCorp, when the company decided to move some of their infrastructure from Amazon Web Services (AWS) to Google Cloud Platform (GCP). The Avisi team uses Google apps for business, so the move made configuring access permissions easier.

Of course, Amazon and Google don’t always play nice—CloudFormation, AWS’s proprietary IaC tool, isn’t supported across the platforms. Since Terraform is open-source, it allowed Avisi to implement IaC with GCP and a wide range of third-party integrations like StatusCake, a tool they use for URL monitoring.

Backblaze B2 + Terraform

When Avisi moved some of their infrastructure from AWS to GCP, they also resolved to stand up an additional public cloud provider to serve as off-site storage as part of a 3-2-1 strategy (three copies of data on two different media, with one off-site). Gert-Jan implemented Backblaze B2, citing positive reviews, affordability, and the Backblaze European data center as key decision factors. Many of Avisi’s customers reside in the European Union and are often subject to data residency requirements that stipulate data must remain in specific geographic locations. Backblaze allowed Gert-Jan to achieve a 3-2-1 strategy for customers for whom data residency in the EU is top of mind.

When Backblaze published a provider to the Terraform registry, Avisi started provisioning Backblaze B2 storage buckets using Terraform immediately. “The Backblaze module on Terraform is pure gold,” Gert-Jan said. “It’s about five lines of code that I copy from another project. I configure it, rename a couple variables, and that’s it.”

Real-time Storage Sync With Terraform

Gert-Jan wrote the cloud function that syncs between GCP and Backblaze B2 in Clojure, a functional programming language, running on top of Node.js. Clojure runs on the JVM, and its ClojureScript compiler targets JavaScript, so the language is available in Java environments as well as Node.js and browser environments. That means Avisi can use the same language on the server side and the client side.

The cloud function allowed off-site tiering to be almost instantaneous. Now, every time a file is written, it gets picked up by the cloud function and transferred to Backblaze in real time. “You need to feel comfortable about what you deploy and where you deploy it. Because it is code, the Backblaze Terraform provider does the work for me. I trust that everything is in place,” Gert-Jan said.
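
Avisi’s function is written in Clojure, but the pattern itself is compact. Here is a minimal Python sketch of the same idea, assuming a background Cloud Function triggered when an object is finalized in the GCS bucket and Backblaze B2’s S3-compatible API; the bucket names, endpoint, and environment variables are placeholders, not Avisi’s actual configuration.

import os

import boto3
from google.cloud import storage

# Placeholder endpoint and bucket names -- substitute your own.
B2_ENDPOINT = "https://s3.eu-central-003.backblazeb2.com"
B2_BUCKET = "example-offsite-backup"

b2 = boto3.client(
    "s3",
    endpoint_url=B2_ENDPOINT,
    aws_access_key_id=os.environ["B2_KEY_ID"],
    aws_secret_access_key=os.environ["B2_APP_KEY"],
)
gcs = storage.Client()


def sync_to_b2(event, context):
    """Background Cloud Function: copy a newly written GCS object to Backblaze B2."""
    data = gcs.bucket(event["bucket"]).blob(event["name"]).download_as_bytes()
    b2.put_object(Bucket=B2_BUCKET, Key=event["name"], Body=data)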

Avisi meeting room
The Avisi team at work.

Easier Lifecycle Rules and Code Reviews

In addition to reducing manual labor and increasing accuracy, the Backblaze Terraform provider makes it much simpler to set lifecycle rules that comply with control frameworks like the General Data Protection Regulation (GDPR) and SOC 2 requirements. Gert-Jan configured one reusable module that meets the regulations and can apply the same configuration to each project. In a SOC 2 audit, or when savvy customers want to know how their data is being handled, he can simply provide the code for the Backblaze B2 configuration as proof that Avisi is retaining and adequately encrypting backups, rather than sending screenshots of various UIs.

Using Backblaze via the Terraform provider also streamlined code reviews. Before the Backblaze Terraform provider, Gert-Jan’s team members had less visibility into the storage setup and struggled with ecosystem naming. “With the Backblaze Terraform provider, my code is fully reviewable, which is a big plus,” he explained.

Simplifying Storage Management

Embracing IaC practices and using the Backblaze Terraform provider specifically means Gert-Jan can focus on growing the business rather than setting up hundreds of storage buckets by hand. He saves about eight hours per environment. Based on the example above, that equates to 720 hours saved all told. “Terraform and the Backblaze module reduced the time I spend on DevOps by 75% to just a couple of hours per app we deploy, so I can take care of the company while I’m at it,” he said.

If you’re interested in stepping up your DevOps game with IaC, set up a bucket in Backblaze B2 for free and start experimenting with the Backblaze Terraform provider.

The post Backblaze Terraform Provider Changes the Game for Avisi appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Amazon Redshift ML Is Now Generally Available – Use SQL to Create Machine Learning Models and Make Predictions from Your Data

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-redshift-ml-is-now-generally-available-use-sql-to-create-machine-learning-models-and-make-predictions-from-your-data/

With Amazon Redshift, you can use SQL to query and combine exabytes of structured and semi-structured data across your data warehouse, operational databases, and data lake. Now that AQUA (Advanced Query Accelerator) is generally available, you can improve the performance of your queries by up to 10 times with no additional costs and no code changes. In fact, Amazon Redshift provides up to three times better price/performance than other cloud data warehouses.

But what if you want to go a step further and process this data to train machine learning (ML) models and use these models to generate insights from data in your warehouse? For example, to implement use cases such as forecasting revenue, predicting customer churn, and detecting anomalies? In the past, you would need to export the training data from Amazon Redshift to an Amazon Simple Storage Service (Amazon S3) bucket, and then configure and start a machine learning training process (for example, using Amazon SageMaker). This process required many different skills and usually more than one person to complete. Can we make it easier?
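
For comparison, the “before” workflow looked roughly like the following sketch: after unloading the training data from Amazon Redshift to S3, you would configure and start a SageMaker Autopilot job yourself, for example with boto3. All names, paths, and the target column here are placeholders.

import boto3

sagemaker = boto3.client("sagemaker")

# Assumes the training data was already UNLOADed from Redshift to S3 as CSV.
sagemaker.create_auto_ml_job(
    AutoMLJobName="marketing-autopilot-job",
    InputDataConfig=[
        {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": "s3://my-bucket/training-data/",
                }
            },
            "TargetAttributeName": "outcome",  # the column to predict
        }
    ],
    OutputDataConfig={"S3OutputPath": "s3://my-bucket/autopilot-output/"},
    RoleArn="arn:aws:iam::123412341234:role/SageMakerExecutionRole",
)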

Today, Amazon Redshift ML is generally available to help you create, train, and deploy machine learning models directly from your Amazon Redshift cluster. To create a machine learning model, you use a simple SQL query to specify the data you want to use to train your model, and the output value you want to predict. For example, to create a model that predicts the success rate for your marketing activities, you define your inputs by selecting the columns (in one or more tables) that include customer profiles and results from previous marketing campaigns, and the output column you want to predict. In this example, the output column could be one that shows whether a customer has shown interest in a campaign.

After you run the SQL command to create the model, Redshift ML securely exports the specified data from Amazon Redshift to your S3 bucket and calls Amazon SageMaker Autopilot to prepare the data (pre-processing and feature engineering), select the appropriate pre-built algorithm, and apply the algorithm for model training. You can optionally specify the algorithm to use, for example XGBoost.

Architectural diagram.

Redshift ML handles all of the interactions between Amazon Redshift, S3, and SageMaker, including all the steps involved in training and compilation. When the model has been trained, Redshift ML uses Amazon SageMaker Neo to optimize the model for deployment and makes it available as a SQL function. You can use the SQL function to apply the machine learning model to your data in queries, reports, and dashboards.

Redshift ML now includes many new features that were not available during the preview, including Amazon Virtual Private Cloud (VPC) support. For example:

Architectural diagram.

  • You can also create SQL functions that use existing SageMaker endpoints to make predictions (remote inference). In this case, Redshift ML batches calls to the endpoint to speed up processing.

Before looking into how to use these new capabilities in practice, let’s see the difference between Redshift ML and similar features in AWS databases and analytics services.

ML Feature: Amazon Redshift ML
Data: Data warehouse, federated relational databases, and S3 data lake (with Redshift Spectrum)
Training from SQL: Yes, using Amazon SageMaker Autopilot
Predictions using SQL functions: Yes, a model can be imported and executed inside the Amazon Redshift cluster, or invoked using a SageMaker endpoint.

ML Feature: Amazon Aurora ML
Data: Relational database (compatible with MySQL or PostgreSQL)
Training from SQL: No
Predictions using SQL functions: Yes, using a SageMaker endpoint. A native integration with Amazon Comprehend for sentiment analysis is also available.

ML Feature: Amazon Athena ML
Data: S3 data lake; other data sources can be used through Athena Federated Query
Training from SQL: No
Predictions using SQL functions: Yes, using a SageMaker endpoint.

Building a Machine Learning Model with Redshift ML
Let’s build a model that predicts if customers will accept or decline a marketing offer.

To manage the interactions with S3 and SageMaker, Redshift ML needs permissions to access those resources. I create an AWS Identity and Access Management (IAM) role as described in the documentation. I use RedshiftML for the role name. Note that the trust policy of the role allows both Amazon Redshift and SageMaker to assume the role to interact with other AWS services.

From the Amazon Redshift console, I create a cluster. In the cluster permissions, I associate the RedshiftML IAM role. When the cluster is available, I load the same dataset used in this super interesting blog post that my colleague Julien wrote when SageMaker Autopilot was announced.

The file I am using (bank-additional-full.csv) is in CSV format. Each line describes a direct marketing activity with a customer. The last column (y) describes the outcome of the activity (if the customer subscribed to a service that was marketed to them).

Here are the first few lines of the file. The first line contains the headers.

age,job,marital,education,default,housing,loan,contact,month,day_of_week,duration,campaign,pdays,previous,poutcome,emp.var.rate,cons.price.idx,cons.conf.idx,euribor3m,nr.employed,y
56,housemaid,married,basic.4y,no,no,no,telephone,may,mon,261,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
57,services,married,high.school,unknown,no,no,telephone,may,mon,149,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
37,services,married,high.school,no,yes,no,telephone,may,mon,226,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
40,admin.,married,basic.6y,no,no,no,telephone,may,mon,151,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no

I store the file in one of my S3 buckets. The S3 bucket is used to unload data and store SageMaker training artifacts.

Then, using the Amazon Redshift query editor in the console, I create a table to load the data.

CREATE TABLE direct_marketing (
	age DECIMAL NOT NULL, 
	job VARCHAR NOT NULL, 
	marital VARCHAR NOT NULL, 
	education VARCHAR NOT NULL, 
	credit_default VARCHAR NOT NULL, 
	housing VARCHAR NOT NULL, 
	loan VARCHAR NOT NULL, 
	contact VARCHAR NOT NULL, 
	month VARCHAR NOT NULL, 
	day_of_week VARCHAR NOT NULL, 
	duration DECIMAL NOT NULL, 
	campaign DECIMAL NOT NULL, 
	pdays DECIMAL NOT NULL, 
	previous DECIMAL NOT NULL, 
	poutcome VARCHAR NOT NULL, 
	emp_var_rate DECIMAL NOT NULL, 
	cons_price_idx DECIMAL NOT NULL, 
	cons_conf_idx DECIMAL NOT NULL, 
	euribor3m DECIMAL NOT NULL, 
	nr_employed DECIMAL NOT NULL, 
	y BOOLEAN NOT NULL
);

I load the data into the table using the COPY command. I can use the same IAM role I created earlier (RedshiftML) because I am using the same S3 bucket to import and export the data.

COPY direct_marketing 
FROM 's3://my-bucket/direct_marketing/bank-additional-full.csv' 
DELIMITER ',' IGNOREHEADER 1
IAM_ROLE 'arn:aws:iam::123412341234:role/RedshiftML'
REGION 'us-east-1';

Now, I create the model straight from the SQL interface using the new CREATE MODEL statement:

CREATE MODEL direct_marketing
FROM direct_marketing
TARGET y
FUNCTION predict_direct_marketing
IAM_ROLE 'arn:aws:iam::123412341234:role/RedshiftML'
SETTINGS (
  S3_BUCKET 'my-bucket'
);

In this SQL command, I specify the parameters required to create the model:

  • FROM – I select all the rows in the direct_marketing table, but I can replace the name of the table with a nested query (see example below).
  • TARGET – This is the column that I want to predict (in this case, y).
  • FUNCTION – The name of the SQL function to make predictions.
  • IAM_ROLE – The IAM role assumed by Amazon Redshift and SageMaker to create, train, and deploy the model.
  • S3_BUCKET – The S3 bucket where the training data is temporarily stored, and where model artifacts are stored if you choose to retain a copy of them.

Here I am using a simple syntax for the CREATE MODEL statement. For more advanced users, other options are available, such as:

  • MODEL_TYPE – To use a specific model type for training, such as XGBoost or multilayer perceptron (MLP). If I don’t specify this parameter, SageMaker Autopilot selects the appropriate model class to use.
  • PROBLEM_TYPE – To define the type of problem to solve: regression, binary classification, or multiclass classification. If I don’t specify this parameter, the problem type is discovered during training, based on my data.
  • OBJECTIVE – The objective metric used to measure the quality of the model. This metric is optimized during training to provide the best estimate from data. If I don’t specify a metric, the default behavior is to use mean squared error (MSE) for regression, the F1 score for binary classification, and accuracy for multiclass classification. Other available options are F1Macro (to apply F1 scoring to multiclass classification) and area under the curve (AUC). More information on objective metrics is available in the SageMaker documentation.

Depending on the complexity of the model and the amount of data, it can take some time for the model to be available. I use the SHOW MODEL command to see when it is available:

SHOW MODEL direct_marketing

When I execute this command using the query editor in the console, I get the following output:

Console screenshot.

As expected, the model is currently in the TRAINING state.

When I created this model, I selected all the columns in the table as input parameters. I wonder what happens if I create a model that uses fewer input parameters? I am in the cloud and I am not slowed down by limited resources, so I create another model using a subset of the columns in the table:

CREATE MODEL simple_direct_marketing
FROM (
        SELECT age, job, marital, education, housing, contact, month, day_of_week, y
 	  FROM direct_marketing
)
TARGET y
FUNCTION predict_simple_direct_marketing
IAM_ROLE 'arn:aws:iam::123412341234:role/RedshiftML'
SETTINGS (
  S3_BUCKET 'my-bucket'
);

After some time, my first model is ready, and I get this output from SHOW MODEL. The actual output in the console spans multiple pages; I merged the results here to make them easier to follow:

Console screenshot.

From the output, I see that the model has been correctly recognized as BinaryClassification, and F1 has been selected as the objective. The F1 score is a metric that considers both precision and recall. It ranges from 0 (lowest possible score) to 1 (perfect precision and recall). The final score for the model (validation:f1) is 0.79. In this table I also find the name of the SQL function (predict_direct_marketing) that has been created for the model, its parameters and their types, and an estimate of the training cost.

When the second model is ready, I compare the F1 scores. The F1 score of the second model is lower (0.66) than the first one. However, with fewer parameters the SQL function is easier to apply to new data. As is often the case with machine learning, I have to find the right balance between complexity and usability.

Using Redshift ML to Make Predictions
Now that the two models are ready, I can make predictions using SQL functions. Using the first model, I check how many false positives (wrong positive predictions) and false negatives (wrong negative predictions) I get when applying the model on the same data used for training:

SELECT predict_direct_marketing, y, COUNT(*)
  FROM (SELECT predict_direct_marketing(
                   age, job, marital, education, credit_default, housing,
                   loan, contact, month, day_of_week, duration, campaign,
                   pdays, previous, poutcome, emp_var_rate, cons_price_idx,
                   cons_conf_idx, euribor3m, nr_employed), y
          FROM direct_marketing)
 GROUP BY predict_direct_marketing, y;

The result of the query shows that the model is better at predicting negative rather than positive outcomes. In fact, although there are many more true negatives than true positives, there are also many more false positives than false negatives. I added some comments in green and red to the following screenshot to clarify the meaning of the results.

Console screenshot.

Using the second model, I see how many customers might be interested in a marketing campaign. Ideally, I should run this query on new customer data, not the same data I used for training.

SELECT COUNT(*)
  FROM direct_marketing
 WHERE predict_simple_direct_marketing(
           age, job, marital, education, housing,
           contact, month, day_of_week) = true;

Wow, looking at the results, there are more than 7,000 prospects!

Console screenshot.
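
The prediction functions are not limited to the query editor: because they are ordinary SQL functions, an application can call them through the Redshift Data API. The following is a hedged boto3 sketch; the cluster identifier, database, and user are placeholders.

import time

import boto3

client = boto3.client("redshift-data")

# Placeholder cluster, database, and user -- replace with your own.
response = client.execute_statement(
    ClusterIdentifier="my-redshift-cluster",
    Database="dev",
    DbUser="awsuser",
    Sql="""
        SELECT COUNT(*)
          FROM direct_marketing
         WHERE predict_simple_direct_marketing(
                   age, job, marital, education, housing,
                   contact, month, day_of_week) = true;
    """,
)

# The Data API is asynchronous: poll until the statement finishes, then fetch the result.
statement_id = response["Id"]
while client.describe_statement(Id=statement_id)["Status"] not in ("FINISHED", "FAILED", "ABORTED"):
    time.sleep(1)

result = client.get_statement_result(Id=statement_id)
print(result["Records"][0][0]["longValue"])  # number of likely prospects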

Availability and Pricing
Redshift ML is available today in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), US West (N. California), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (Paris), Europe (Stockholm), Asia Pacific (Hong Kong), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney), and South America (São Paulo). For more information, see the AWS Regional Services list.

With Redshift ML, you pay only for what you use. When training a new model, you pay for the Amazon SageMaker Autopilot and S3 resources used by Redshift ML. When making predictions, there is no additional cost for models imported into your Amazon Redshift cluster, as in the example I used in this post.

Redshift ML also allows you to use existing Amazon SageMaker endpoints for inference. In that case, the usual SageMaker pricing for real-time inference applies. Here you can find a few tips on how to control your costs with Redshift ML.

To learn more, you can see this blog post from when Redshift ML was announced in preview and the documentation.

Start getting better insights from your data with Redshift ML.

Danilo

Getting Started with Amazon ECS Anywhere – Now Generally Available

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/getting-started-with-amazon-ecs-anywhere-now-generally-available/

Since Amazon Elastic Container Service (Amazon ECS) was launched in 2014, AWS has released other options for running Amazon ECS tasks outside of an AWS Region, such as AWS Wavelength, an offering for mobile edge devices, and AWS Outposts, a service that extends AWS infrastructure to customers’ environments using hardware owned and fully managed by AWS.

But some customers have applications that need to run on premises due to regulatory, latency, and data residency requirements or the desire to leverage existing infrastructure investments. In these cases, customers have to install, operate, and manage separate container orchestration software and need to use disparate tooling across their AWS and on-premises environments. Customers asked us for a way to manage their on-premises containers without this added complexity and cost.

Following Jeff’s preannouncement last year, I am happy to announce the general availability of Amazon ECS Anywhere, a new capability in Amazon ECS that enables customers to easily run and manage container-based applications on premises, including virtual machines (VMs), bare metal servers, and other customer-managed infrastructure.

With ECS Anywhere, you can run and manage containers on any customer-managed infrastructure using the same cloud-based, fully managed, and highly scalable container orchestration service you use in AWS today. By installing simple agents, you no longer need to prepare, run, update, or maintain your own container orchestrators on premises, making it easier to manage your hybrid environment and leverage the cloud for your infrastructure.

ECS Anywhere provides consistent tooling and APIs for all container-based applications and the same Amazon ECS experience for cluster management, workload scheduling, and monitoring both in the cloud and on customer-managed infrastructure. You can now enjoy the benefits of reduced cost and complexity by running container workloads, such as data processing, at edge locations on your own hardware to keep latency low, and in the cloud, all with a single, consistent container orchestrator.

Amazon ECS Anywhere – Getting Started
To get started with ECS Anywhere, register your on-premises servers or VMs (also referred to as External instances) in the ECS cluster. The AWS Systems Manager Agent, Amazon ECS container agent, and Docker must be installed on these external instances. Your external instances require an IAM role that permits them to communicate with AWS APIs. For more information, see Required IAM permissions in the ECS Developer Guide.

To create a cluster for ECS Anywhere, on the Create Cluster page in the ECS console, choose the Networking Only template. This option is for use with either AWS Fargate or external instance capacity. We recommend that you use the AWS Region that is geographically closest to the on-premises servers you want to register.

This creates an empty cluster to register external instances. On the ECS Instances tab, choose Register External Instances to get activation codes and an installation script.

On the Step 1: External instances activation details page, in Activation key duration (in days), enter the number of days the activation key should remain active. The activation key can be used for up to 1,000 activations. In Number of instances, enter the number of external instances you want to register to your cluster. In Instance role, enter the IAM role to associate with your external instances.

Choose Next step to get a registration command.

On the Step 2: Register external instances page, copy the registration command. Run this command on the external instances you want to register to your cluster.

Paste and run the registration command on your on-premises servers or VMs. Each external instance is then registered as an AWS Systems Manager managed instance, which is in turn registered to your Amazon ECS cluster.
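
If you prefer to script this step instead of using the console, the activation created behind the scenes is an AWS Systems Manager hybrid activation. The following is a hedged boto3 sketch; the role name is a placeholder and must match the instance role described above.

import boto3

ssm = boto3.client("ssm")

# Placeholder role name -- the role must allow the instance to talk to SSM and ECS.
activation = ssm.create_activation(
    IamRole="ecsAnywhereRole",
    RegistrationLimit=10,  # how many external instances can register with this activation
    Description="ECS Anywhere external instances",
)

print("Activation ID:  ", activation["ActivationId"])
print("Activation code:", activation["ActivationCode"])
# Pass these two values to the ECS Anywhere install script on each server or VM.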

Both x86_64 and ARM64 CPU architectures are supported. The following is a list of supported operating systems:

  • CentOS 7, CentOS 8
  • RHEL 7
  • Fedora 32, Fedora 33
  • openSUSE Tumbleweed
  • Ubuntu 18, Ubuntu 20
  • Debian 9, Debian 10
  • SUSE Enterprise Server 15

When the ECS agent has started and completed the registration, your external instance will appear on the ECS Instances tab.

You can also add your external instances to an existing cluster. In this case, you see Amazon EC2 instances and external instances together; the external instances are prefixed with mi-*.

Now that the external instances are registered to your cluster, you are ready to create a task definition. Amazon ECS provides the requiresCompatibilities parameter to validate that the task definition is compatible with the EXTERNAL launch type when creating your service or running your standalone task. The following is an example task definition:

{
	"requiresCompatibilities": [
		"EXTERNAL"
	],
	"containerDefinitions": [{
		"name": "nginx",
		"image": "public.ecr.aws/nginx/nginx:latest",
		"memory": 256,
		"cpu": 256,
		"essential": true,
		"portMappings": [{
			"containerPort": 80,
			"hostPort": 8080,
			"protocol": "tcp"
		}]
	}],
	"networkMode": "bridge",
	"family": "nginx"
}

You can create a task definition in the ECS console. In Task Definition, choose Create new task definition. For Launch type, choose EXTERNAL and then configure the task and container definitions to use external instances.

On the Tasks tab, choose Run new task. On the Run Task page, for Cluster, choose the cluster to run your task definition on. In Number of tasks, enter the number of copies of that task to run with the EXTERNAL launch type.

Or, on the Services tab, choose Create. Configure service lets you specify how many copies of your task definition to run and maintain in a cluster. To run your task on the registered external instances, for Launch type, choose EXTERNAL. When you choose this launch type, load balancers, tag propagation, and service discovery integration are not supported.

The tasks you run on your external instances must use the bridge, host, or none network modes. The awsvpc network mode isn’t supported. For more information about each network mode, see Choosing a network mode in the Amazon ECS Best Practices Guide.

Now you can run your tasks and associate a mix of EXTERNAL, FARGATE, and EC2 capacity provider types with the same ECS service and specify how you would like your tasks to be split across them.
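
The console is not the only option: the same task can be started programmatically with the SDK. The following is a hedged boto3 sketch that uses the nginx task definition above; the cluster name is a placeholder.

import boto3

ecs = boto3.client("ecs")

# Placeholder cluster name -- use the cluster your external instances registered to.
response = ecs.run_task(
    cluster="my-ecs-anywhere-cluster",
    launchType="EXTERNAL",  # run on registered external instances
    taskDefinition="nginx",
    count=1,
)

for task in response["tasks"]:
    print(task["taskArn"], task["lastStatus"])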

Things to Know
Here are a couple of things to keep in mind:

Connectivity: In the event of loss of network connectivity between the ECS agent running on the on-premises servers and the ECS control plane in the AWS Region, existing ECS tasks will continue to run as usual. If tasks still have connectivity with other AWS services, they will continue to communicate with them for as long as the task role credentials are active. If a task launched as part of a service crashes or exits on its own, ECS will be unable to replace it until connectivity is restored.

Monitoring: With ECS Anywhere, you can get Amazon CloudWatch metrics for your clusters and services, use the CloudWatch Logs driver (awslogs) to get your containers’ logs, and access the ECS CloudWatch event stream to monitor your clusters’ events.

Networking: ECS external instances are optimized for running applications that generate outbound traffic or process data. If your application requires inbound traffic, such as a web service, you will need to employ a workaround to place these workloads behind a load balancer until the feature is supported natively. For more information, see Networking with ECS Anywhere.

Data Security: To help customers maintain data security, ECS Anywhere only sends back to the AWS Region metadata related to the state of the tasks or the state of the containers (whether they are running or not running, performance counters, and so on). This communication is authenticated and encrypted in transit through Transport Layer Security (TLS).

ECS Anywhere Partners
ECS Anywhere integrates with a variety of partners who help customers take advantage of the feature and provide additional functionality. Here are some of the blog posts that our partners wrote to share their experiences and offerings. (I am updating this article with links as they are published.)

Now Available
Amazon ECS Anywhere is now available in all commercial AWS Regions where ECS is supported, except the AWS China Regions. With ECS Anywhere, there are no minimum fees or upfront commitments. You pay per instance hour for each managed ECS Anywhere on-premises instance. The ECS Anywhere free tier includes 2,200 instance hours per month for six months per account, across all Regions. For more information, see the pricing page.

To learn more, see ECS Anywhere in the Amazon ECS Developer Guide. Please send feedback to the AWS forum for Amazon ECS or through your usual AWS Support contacts.

Get started with Amazon ECS Anywhere today.

Channy

Update: Watch a cool demo of ECS Anywhere operating a Raspberry Pi cluster in a home office, and read its deep-dive blog post.

Introducing Amazon Kinesis Data Analytics Studio – Quickly Interact with Streaming Data Using SQL, Python, or Scala

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/introducing-amazon-kinesis-data-analytics-studio-quickly-interact-with-streaming-data-using-sql-python-or-scala/

The best way to get timely insights and react quickly to new information you receive from your business and your applications is to analyze streaming data. This is data that must usually be processed sequentially and incrementally on a record-by-record basis or over sliding time windows, and can be used for a variety of analytics including correlations, aggregations, filtering, and sampling.

To make it easier to analyze streaming data, today we are pleased to introduce Amazon Kinesis Data Analytics Studio.

Now, from the Amazon Kinesis console you can select a Kinesis data stream and with a single click start a Kinesis Data Analytics Studio notebook powered by Apache Zeppelin and Apache Flink to interactively analyze data in the stream. Similarly, you can select a cluster in the Amazon Managed Streaming for Apache Kafka console to start a notebook to analyze data in Apache Kafka streams. You can also start a notebook from the Kinesis Data Analytics Studio console and connect to custom sources.

Architectural diagram.

In the notebook, you can interact with streaming data and get results in seconds using SQL queries and Python or Scala programs. When you are satisfied with your results, with a few clicks you can promote your code to a production stream processing application that runs reliably at scale with no additional development effort.

For new projects, we recommend that you use the new Kinesis Data Analytics Studio over Kinesis Data Analytics for SQL Applications. Kinesis Data Analytics Studio combines ease of use with advanced analytical capabilities, which makes it possible to build sophisticated stream processing applications in minutes. Let’s see how that works in practice.

Using Kinesis Data Analytics Studio to Analyze Streaming Data
I want to get a better understanding of the data sent by some sensors to a Kinesis data stream.

To simulate the workload, I use this random_data_generator.py Python script. You don’t need to know Python to use Kinesis Data Analytics Studio. In fact, I am going to use SQL in the following steps. Also, you can avoid any coding and use the Amazon Kinesis Data Generator user interface (UI) to send test data to Kinesis Data Streams or Kinesis Data Firehose. I am using a Python script to have finer control over the data that is being sent.

import datetime
import json
import random
import boto3

STREAM_NAME = "my-input-stream"


def get_random_data():
    current_temperature = round(10 + random.random() * 170, 2)
    if current_temperature > 160:
        status = "ERROR"
    elif current_temperature > 140 or random.randrange(1, 100) > 80:
        status = random.choice(["WARNING","ERROR"])
    else:
        status = "OK"
    return {
        'sensor_id': random.randrange(1, 100),
        'current_temperature': current_temperature,
        'status': status,
        'event_time': datetime.datetime.now().isoformat()
    }


def send_data(stream_name, kinesis_client):
    while True:
        data = get_random_data()
        partition_key = str(data["sensor_id"])
        print(data)
        kinesis_client.put_record(
            StreamName=stream_name,
            Data=json.dumps(data),
            PartitionKey=partition_key)


if __name__ == '__main__':
    kinesis_client = boto3.client('kinesis')
    send_data(STREAM_NAME, kinesis_client)

This script sends random records to my Kinesis data stream using JSON syntax. For example:

{'sensor_id': 77, 'current_temperature': 93.11, 'status': 'OK', 'event_time': '2021-05-19T11:20:00.978328'}
{'sensor_id': 47, 'current_temperature': 168.32, 'status': 'ERROR', 'event_time': '2021-05-19T11:20:01.110236'}
{'sensor_id': 9, 'current_temperature': 140.93, 'status': 'WARNING', 'event_time': '2021-05-19T11:20:01.243881'}
{'sensor_id': 27, 'current_temperature': 130.41, 'status': 'OK', 'event_time': '2021-05-19T11:20:01.371191'}

From the Kinesis console, I select a Kinesis data stream (my-input-stream) and choose Process data in real time from the Process drop-down. In this way, the stream is configured as a source for the notebook.

Console screenshot.

Then, in the following dialog box, I create an Apache Flink – Studio notebook.

I enter a name (my-notebook) and a description for the notebook. The AWS Identity and Access Management (IAM) permissions to read from the Kinesis data stream I selected earlier (my-input-stream) are automatically attached to the IAM role assumed by the notebook.

Console screenshot.

I choose Create to open the AWS Glue console and create an empty database. Back in the Kinesis Data Analytics Studio console, I refresh the list and select the new database. It will define the metadata for my sources and destinations. From here, I can also review the default Studio notebook settings. Then, I choose Create Studio notebook.

Console screenshot.

Now that the notebook has been created, I choose Run.

Console screenshot.

When the notebook is running, I choose Open in Apache Zeppelin to get access to the notebook and write code in SQL, Python, or Scala to interact with my streaming data and get insights in real time.

In the notebook, I create a new note and call it Sensors. Then, I create a sensor_data table describing the format of the data in the stream:

%flink.ssql

CREATE TABLE sensor_data (
    sensor_id INTEGER,
    current_temperature DOUBLE,
    status VARCHAR(6),
    event_time TIMESTAMP(3),
    WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
)
PARTITIONED BY (sensor_id)
WITH (
    'connector' = 'kinesis',
    'stream' = 'my-input-stream',
    'aws.region' = 'us-east-1',
    'scan.stream.initpos' = 'LATEST',
    'format' = 'json',
    'json.timestamp-format.standard' = 'ISO-8601'
)

The first line in the previous command tells Apache Zeppelin to provide a stream SQL environment (%flink.ssql) for the Apache Flink interpreter. I can also interact with the streaming data using a batch SQL environment (%flink.bsql), or Python (%flink.pyflink) or Scala (%flink) code.

The first part of the CREATE TABLE statement is familiar to anyone who has used SQL with a database. A table is created to store the sensor data in the stream. The WATERMARK option is used to measure progress in the event time, as described in the Event Time and Watermarks section of the Apache Flink documentation.

The second part of the CREATE TABLE statement describes the connector used to receive data in the table (for example, kinesis or kafka), the name of the stream, the AWS Region, the overall data format of the stream (such as json or csv), and the syntax used for timestamps (in this case, ISO 8601). I can also choose the starting position for processing the stream; here I am using LATEST to read the most recent data first.

When the table is ready, I find it in the AWS Glue Data Catalog database I selected when I created the notebook:

Console screenshot.

Now I can run SQL queries on the sensor_data table and use sliding or tumbling windows to get a better understanding of what is happening with my sensors.

For an overview of the data in the stream, I start with a simple SELECT to get all the content of the sensor_data table:

%flink.ssql(type=update)

SELECT * FROM sensor_data;

This time the first line of the command has a parameter (type=update) so that the output of the SELECT, which is more than one row, is continuously updated when new data arrives.

On the terminal of my laptop, I start the random_data_generator.py script:

$ python3 random_data_generator.py

At first I see a table that contains the data as it comes. To get a better understanding, I select a bar graph view. Then, I group the results by status to see their average current_temperature, as shown here:

Notebook screenshot.

As expected by the way I am generating these results, I have different average temperatures depending on the status (OK, WARNING, or ERROR). The higher the temperature, the greater the probability that something is not working correctly with my sensors.

I can run the aggregated query explicitly using a SQL syntax. This time, I want the result computed on a sliding window of 1 minute with results updated every 10 seconds. To do so, I am using the HOP function in the GROUP BY section of the SELECT statement. To add the time to the output of the select, I use the HOP_ROWTIME function. For more information, see how group window aggregations work in the Apache Flink documentation.

%flink.ssql(type=update)

SELECT sensor_data.status,
       COUNT(*) AS num,
       AVG(sensor_data.current_temperature) AS avg_current_temperature,
       HOP_ROWTIME(event_time, INTERVAL '10' second, INTERVAL '1' minute) as hop_time
  FROM sensor_data
 GROUP BY HOP(event_time, INTERVAL '10' second, INTERVAL '1' minute), sensor_data.status;

This time, I look at the results in table format:

Notebook screenshot.

To send the result of the query to a destination stream, I create a table and connect the table to the stream. First, I need to give permissions to the notebook to write into the stream.

In the Kinesis Data Analytics Studio console, I select my-notebook. Then, in the Studio notebooks details section, I choose Edit IAM permissions. Here, I can configure the sources and destinations used by the notebook and the IAM role permissions are updated automatically.

Console screenshot.

In the Included destinations in IAM policy section, I choose the destination and select my-output-stream. I save changes and wait for the notebook to be updated. I am now ready to use the destination stream.

In the notebook, I create a sensor_state table connected to my-output-stream.

%flink.ssql

CREATE TABLE sensor_state (
    status VARCHAR(6),
    num INTEGER,
    avg_current_temperature DOUBLE,
    hop_time TIMESTAMP(3)
)
WITH (
'connector' = 'kinesis',
'stream' = 'my-output-stream',
'aws.region' = 'us-east-1',
'scan.stream.initpos' = 'LATEST',
'format' = 'json',
'json.timestamp-format.standard' = 'ISO-8601');

I now use this INSERT INTO statement to continuously insert the result of the select into the sensor_state table.

%flink.ssql(type=update)

INSERT INTO sensor_state
SELECT sensor_data.status,
    COUNT(*) AS num,
    AVG(sensor_data.current_temperature) AS avg_current_temperature,
    HOP_ROWTIME(event_time, INTERVAL '10' second, INTERVAL '1' minute) as hop_time
FROM sensor_data
GROUP BY HOP(event_time, INTERVAL '10' second, INTERVAL '1' minute), sensor_data.status;

The data is also sent to the destination Kinesis data stream (my-output-stream) so that it can be used by other applications. For example, the data in the destination stream can be used to update a real-time dashboard, or to monitor the behavior of my sensors after a software update.
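
As a quick illustration of that last point, a small consumer can read the aggregated results straight from my-output-stream. The following is a hedged boto3 sketch that assumes the single-shard stream used in this example.

import json
import time

import boto3

kinesis = boto3.client("kinesis")
STREAM_NAME = "my-output-stream"

# Assumes a single shard, as in this example; list_shards returns all of them.
shard_id = kinesis.list_shards(StreamName=STREAM_NAME)["Shards"][0]["ShardId"]
iterator = kinesis.get_shard_iterator(
    StreamName=STREAM_NAME,
    ShardId=shard_id,
    ShardIteratorType="LATEST",
)["ShardIterator"]

while True:
    batch = kinesis.get_records(ShardIterator=iterator, Limit=100)
    for record in batch["Records"]:
        print(json.loads(record["Data"]))  # aggregated sensor state per status
    iterator = batch["NextShardIterator"]
    time.sleep(1)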

I am satisfied with the result. I want to deploy this query and its output as a Kinesis Analytics application. To do so, I need to provide an S3 location to store the application executable.

In the configuration section of the console, I edit the Deploy as application configuration settings. There, I choose a destination bucket in the same region and save changes.

Console screenshot.

I wait for the notebook to be ready after the update. Then, I create a SensorsApp note in my notebook and copy the statements that I want to execute as part of the application. The tables have already been created, so I just copy the INSERT INTO statement above.

From the menu at the top right of my notebook, I choose Build SensorsApp and export to Amazon S3 and confirm the application name.

Notebook screenshot.

When the export is ready, I choose Deploy SensorsApp as Kinesis Analytics application in the same menu. After that, I fine-tune the configuration of the application. I set parallelism to 1 because I have only one shard in my input Kinesis data stream and not a lot of traffic. Then, I run the application, without having to write any code.

From the Kinesis Data Analytics applications console, I choose Open Apache Flink dashboard to get more information about the execution of my application.

Apache Flink console screenshot.

Availability and Pricing
You can use Amazon Kinesis Data Analytics Studio today in all AWS Regions where Kinesis Data Analytics is generally available. For more information, see the AWS Regional Services List.

In Kinesis Data Analytics Studio, we run the open-source versions of Apache Zeppelin and Apache Flink, and we contribute changes upstream. For example, we have contributed bug fixes for Apache Zeppelin, and we have contributed to AWS connectors for Apache Flink, such as those for Kinesis Data Streams and Kinesis Data Firehose. Also, we are working with the Apache Flink community to contribute availability improvements, including automatic classification of errors at runtime to understand whether errors are in user code or in application infrastructure.

With Kinesis Data Analytics Studio, you pay based on the average number of Kinesis Processing Units (KPU) per hour, including those used by your running notebooks. One KPU comprises 1 vCPU of compute, 4 GB of memory, and associated networking. You also pay for running application storage and durable application storage. For more information, see the Kinesis Data Analytics pricing page.

Start using Kinesis Data Analytics Studio today to get better insights from your streaming data.

Danilo

Vulhub – Pre-Built Vulnerable Docker Environments For Learning To Hack

Post Syndicated from original https://www.darknet.org.uk/2021/05/vulhub-pre-built-vulnerable-docker-environments-for-learning-to-hack/?utm_source=rss&utm_medium=social&utm_campaign=darknetfeed

Vulhub – Pre-Built Vulnerable Docker Environments For Learning To Hack

Vulhub is an open-source collection of pre-built vulnerable docker environments for learning to hack. No pre-existing knowledge of docker is required, just execute two simple commands and you have a vulnerable environment.

Features of Vulhub Pre-Built Vulnerable Docker Environments For Learning To Hack

Vulhub contains many frameworks, databases, applications, programming languages, and more, such as:

  • Drupal
  • ffmpeg
  • CouchDB
  • ActiveMQ
  • Glassfish
  • Joomla
  • JBoss
  • Kibana
  • Laravel
  • Rails
  • Python
  • Tomcat

And many, many more.

Read the rest of Vulhub – Pre-Built Vulnerable Docker Environments For Learning To Hack now! Only available at Darknet.

Anchor | Cloud Engineering Services 2022-05-27 07:02:18

Post Syndicated from Gerald Bachlmayr original https://www.anchor.com.au/blog/2022/05/death-by-nodevops/

The CEO of ‘Waterfall & Silo’ walks into the meeting room and asks his three internal advisors: How are we progressing with our enterprise transformation towards DevOps, business agility and simplification? 

The well-prepared advisors, who had read at least a book and a half about organisational transformation and also watched a considerable number of YouTube videos, confidently reply: We are nearly there. We only need to get one more team on board. We have the first CI/CD pipelines established, and the containers are already up and running.

Unfortunately the advisors overlooked some details.

Two weeks later, the CEO asks the same question, and this time the response is: We only need to get two more teams on board, agree on some common tooling, the delivery methodology and relaunch our community of practice.

A month later, an executive decision is made to go back to the previous processes, tooling and perceived ‘customer focus’.

Two years later, the business closes its doors whilst other competitors achieve record revenues.

What has gone wrong, and why does this happen so often?

To answer this question, let’s have a look… 

Why do you need to transform your business?

Without transforming your business, you will run the risk of falling behind because you are potentially: 

  1. Dealing with the drag of outdated processes and ways of working. Therefore your organisation cannot react swiftly to new business opportunities and changing market trends.
  2. Wasting a lot of time and money on Undifferentiated heavy lifting (UHL). These are tasks that don’t differentiate your business from others but can be easily done better, faster and cheaper by someone else, for example, providing cloud infrastructure. Every minute you spend on UHL distracts you from focusing on your customer.
  3. Not focusing enough on what your customers need. If you don’t have sufficient data insights or experiment with new customer features, you will probably mainly focus on your competition. That makes you a follower. Customer-focused organisations will figure out earlier what works for them and what doesn’t. They will take the lead. 

How do you get started?

The biggest enablers for your transformation are the people in your business. If they work together in a collaborative way, they can leverage synergies and coach each other. This will ultimately motivate them. Delivering customer value is like a team sport: it is not the team with the best player that wins, but the team with the best strategy and overall performance.

How do we get there?

Establishing top-performing DevOps teams

Moving towards cross-functional DevOps teams, also called squads, helps to reduce manual hand-offs and waiting times in your delivery. It is also a very scalable model that is used by many modern organisations that have a good customer experience at their forefront. This applies to a variety of industries, from financial services to retail and professional services. Squad members have different skills and work together towards a shared outcome. A top-performing squad that understands the business goals will not only figure out how to deliver effectively but also how to simplify the solution and reduce Undifferentiated Heavy Lifting. A mature DevOps team will always try out new ways to solve problems. The experimental aspect is crucial for continuous improvement, and it keeps the team excited. Regular feedback in the form of metrics and retrospectives will make it easier for the team to know that they are on the right track.

Understand your customer needs and value chain

There are different methodologies to identify customer needs. Amazon has the “working backwards from the customer” methodology to come up with new ideas, and Google has the “design sprint” methodology. Identifying your actual opportunities and understanding the landscape you are operating in are big challenges. It is easy to get lost in detail and head in the wrong direction. Getting the strategy right is only one aspect of the bigger picture. You also need to get the execution right, experiment with new approaches and establish strong feedback loops between execution and strategy. 

This brings us to the next point that describes how we link those two aspects.

A bidirectional governance approach

DevOps teams operate autonomously and figure out how to best work together within their scope. They do not necessarily know what capabilities are required across the business. Hence you will need a governing working group that has complete visibility of this. That way, you can leverage synergies organisation-wide and not just within a squad. It is important that this working group gets feedback from the individual squads who are closer to specific business domains. One size does not fit all, and for some edge cases, you might need different technologies or delivery approaches. A bidirectional feedback loop will make sure you can improve customer focus and execution across the business.

Key takeaways

Establishing a mature DevOps model is a journey, and it may take some time. Each organisation and industry deals with different challenges, and therefore the journey does not always look the same. It is important to continuously tweak the approach and measure progress to make sure the customer focus can improve.

But if you don’t start the DevOps journey, you could turn into another ‘Waterfall & Silo’.

The post appeared first on Anchor | Cloud Engineering Services.

How to implement a hybrid PKI solution on AWS

Post Syndicated from Max Farnga original https://aws.amazon.com/blogs/security/how-to-implement-a-hybrid-pki-solution-on-aws/

As customers migrate workloads into Amazon Web Services (AWS), they may be running a combination of on-premises and cloud infrastructure. When certificates are issued to this infrastructure, having a common root of trust in the certificate hierarchy allows for consistency and interoperability of the Public Key Infrastructure (PKI) solution.

In this blog post, I am going to show how you can plan and deploy a PKI that enables certificates to be issued across a hybrid (cloud and on-premises) environment with a common root. This solution uses Windows Server Certificate Authority (Windows CA), also known as Active Directory Certificate Services (AD CS), to distribute and manage X.509 certificates for Active Directory users, domain controllers, routers, workstations, web servers, mobile and other devices, and AWS Certificate Manager Private Certificate Authority (ACM PCA) to manage certificates for AWS services, including API Gateway, CloudFront, Elastic Load Balancers, and other workloads.

The Windows CA also integrates with AWS CloudHSM to securely store the private keys that sign the certificates issued by your CAs, and to perform the cryptographic signing operations on the HSM. Figure 1 below shows how ACM PCA and Windows CA can be used together to issue certificates across a hybrid environment.

Figure 1: Hybrid PKI hierarchy

PKI is a framework that enables a safe and trustworthy digital environment through the use of a public and private key encryption mechanism. PKI maintains secure electronic transactions on the internet and in private networks. It also governs the verification, issuance, revocation, and validation of individual systems in a network.

There are two types of PKI: public PKI and private PKI.

This blog post focuses on the implementation of a private PKI, to issue and manage private certificates.

When implementing a PKI, there can be challenges from security, infrastructure, and operations standpoints, especially when dealing with workloads across multiple platforms. These challenges include managing isolated PKIs for individual networks across on-premises and AWS cloud, managing PKI with no Hardware Security Module (HSM) or on-premises HSM, and lack of automation to rapidly scale the PKI servers to meet demand.

Figure 2 shows how an internal PKI can be limited to a single network. In the following example, the root CA, issuing CAs, and certificate revocation list (CRL) distribution point are all in the same network, and issue cryptographic certificates only to users and devices in the same private network.

Figure 2: On-premises PKI hierarchy in a single network

Planning for your PKI system deployment

It’s important to carefully consider your business requirements, encryption use cases, corporate network architecture, and the capabilities of your internal teams. You must also plan for how to manage the confidentiality, integrity, and availability of the cryptographic keys. These considerations should guide the design and implementation of your new PKI system.

In the section below, we outline the key services and components used to design and implement this hybrid PKI solution.

Key services and components for this hybrid PKI solution

Solution overview

This hybrid PKI can be used if you need a new private PKI, or want to upgrade from an existing legacy PKI with a cryptographic service provider (CSP) to a secure PKI with Windows Cryptography Next Generation (CNG). The hybrid PKI design allows you to seamlessly manage cryptographic keys throughout the IT infrastructure of your organization, from on-premises to multiple AWS networks.

Figure 3: Hybrid PKI solution architecture

The solution architecture is depicted in Figure 3. The solution uses an offline root CA that can be operated on-premises or in an Amazon VPC, while the subordinate Windows CAs run on EC2 instances and are integrated with CloudHSM for key management and storage. To insulate the PKI from external access, the CloudHSM cluster is deployed in protected subnets, the EC2 instances are deployed in private subnets, and the host VPC has site-to-site network connectivity to the on-premises network. The Amazon EC2 volumes are encrypted with AWS KMS customer managed keys. Users and devices connect and enroll to the PKI interface through a Network Load Balancer.

This solution also includes a subordinate ACM private CA to issue certificates that will be installed on AWS services that are integrated with ACM. For example, ELB, CloudFront, and API Gateway. This is so that the certificates users see are always presented from your organization’s internal PKI.
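
Once the subordinate ACM private CA is in place, ACM-integrated services obtain their certificates through a normal ACM request that references the private CA. The following is a hedged boto3 sketch; the domain name and CA ARN are placeholders.

import boto3

acm = boto3.client("acm")

# Placeholder domain and CA ARN -- replace with your subordinate ACM private CA.
response = acm.request_certificate(
    DomainName="internal.example.com",
    CertificateAuthorityArn=(
        "arn:aws:acm-pca:us-east-1:111122223333:"
        "certificate-authority/11111111-2222-3333-4444-555555555555"
    ),
)

# The issued private certificate can then be attached to ELB, CloudFront, or API Gateway.
print(response["CertificateArn"])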

Prerequisites for deploying this hybrid internal PKI in AWS

  • Experience with AWS Cloud, Windows Server, and AD CS is necessary to deploy and configure this solution.
  • An AWS account to deploy the cloud resources.
  • An offline root CA, running on Windows Server 2016 or newer, to sign the CloudHSM cluster and the issuing CAs, including the ACM private CA and the Windows CAs. Here is an AWS Quick-Start article to deploy your root CA in a VPC. We recommend installing the Windows root CA in its own AWS account.
  • A VPC with at least four subnets: two or more public subnets and two or more private subnets, across two or more Availability Zones, with secure firewall rules, such as HTTPS to communicate with your PKI web servers through a load balancer, along with DNS, RDP, and other ports to communicate within your organization’s network. You can use this CloudFormation sample VPC template to help you get started with your PKI VPC provisioning.
  • Site-to-site AWS Direct Connect or VPN connection from your VPC to the on-premises network and other VPCs to securely manage multiple networks.
  • Windows Server 2016 EC2 instances for the subordinate CAs.
  • An Active Directory environment that has access to the VPC that hosts the PKI servers. This is required for a Windows Enterprise CA implementation.

Deploy the solution

The following CloudFormation templates and instructions will help you deploy and configure all the AWS components shown in the preceding architecture diagram (Figure 3). To implement the solution, you’ll deploy a series of CloudFormation templates through the AWS Management Console.

If you’re not familiar with CloudFormation, you can learn about it from Getting started with AWS CloudFormation. The templates for this solution can be deployed with the CloudFormation console, AWS Service Catalog, or a code pipeline.
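If you’d rather script the deployments than click through the console, you can also call CloudFormation from an AWS SDK. The following is a minimal sketch in Python with boto3; the stack name is arbitrary, and the parameter key shown is a placeholder, so use the parameter names declared in the template you deploy.

# Sketch: deploy one of the bundled CloudFormation templates with boto3.
# The parameter key/value pair below is a placeholder; replace it with the
# parameters the template actually declares.
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")

with open("01_PKI_Automated-VPC_Modifications.yaml") as f:
    template_body = f.read()

cfn.create_stack(
    StackName="pki-vpc-modifications",
    TemplateBody=template_body,
    Parameters=[
        {"ParameterKey": "VpcId", "ParameterValue": "vpc-0123456789abcdef0"},
    ],
    # Only needed if the template creates IAM resources.
    Capabilities=["CAPABILITY_NAMED_IAM"],
)

# Block until the stack finishes before deploying the next template.
cfn.get_waiter("stack_create_complete").wait(StackName="pki-vpc-modifications")

The same pattern works for each template in the bundle; only the file name and parameters change.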

Download and review the template bundle

To make it easier to deploy the components of this internal PKI solution, you download and deploy a template bundle. The bundle includes a set of CloudFormation templates, and a PowerShell script to complete the integration between CloudHSM and the Windows CA servers.

To download the template bundle

  1. Download or clone the solution source code repository from AWS GitHub.
  2. Review the descriptions in each template for more instructions.

Deploy the CloudFormation templates

Now that you have the templates downloaded, use the CloudFormation console to deploy them.

To deploy the VPC modification template

Deploy this template into an existing VPC to create the protected subnets to deploy a CloudHSM cluster.

  1. Navigate to the CloudFormation console.
  2. Select the appropriate AWS Region, and then choose Create Stack.
  3. Choose Upload a template file.
  4. Select 01_PKI_Automated-VPC_Modifications.yaml as the CloudFormation stack file, and then choose Next.
  5. On the Specify stack details page, enter a stack name and the parameters. Some parameters have a dropdown list that you can use to select existing values.

    Figure 4: Example of a Specify stack details page

  6. Choose Next, Next, and Create Stack.

To deploy the PKI CDP S3 bucket template

This template creates an S3 bucket for the CRL and AIA distribution point, with initial bucket policies that allow access from the PKI VPC, and PKI users and devices from your on-premises network, based on your input. To grant access to additional AWS accounts, VPCs, and on-premises networks, please refer to the instructions in the template.

  1. Navigate to the CloudFormation console.
  2. Choose Upload a template file.
  3. Select 02_PKI_Automated-Central-PKI_CDP-S3bucket.yaml as the CloudFormation stack file, and then choose Next.
  4. On the Specify stack details page, enter a stack name and the parameters.
  5. Choose Next, Next, and Create Stack.

To deploy the ACM Private CA subordinate template

This step provisions the ACM private CA, which is signed by an existing Windows root CA. Provisioning your private CA with CloudFormation makes it possible to sign the CA with a Windows root CA.

  1. Navigate to the CloudFormation console.
  2. Choose Upload a template file.
  3. Select 03_PKI_Automated-ACMPrivateCA-Provisioning.yaml as the CloudFormation stack file, and then choose Next.
  4. On the Specify stack details page, enter a stack name and the parameters. Some parameters have a dropdown list that you can use to select existing values.
  5. Choose Next, Next, and Create Stack.

Assign and configure certificates

After deploying the preceding templates, use the console to assign certificate renewal permissions to ACM and configure your certificates.

To assign renewal permissions

  1. In the ACM Private CA console, choose Private CAs.
  2. Select your private CA from the list.
  3. Choose the Permissions tab.
  4. Select Authorize ACM to use this CA for renewals.
  5. Choose Save.
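The same renewal permission can also be granted programmatically. Here’s a minimal boto3 sketch; the CA ARN is a placeholder.

# Sketch: authorize ACM to renew certificates issued by the private CA.
import boto3

pca = boto3.client("acm-pca", region_name="us-east-1")

pca.create_permission(
    CertificateAuthorityArn="arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/EXAMPLE",
    Principal="acm.amazonaws.com",
    Actions=["IssueCertificate", "GetCertificate", "ListPermissions"],
)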

To sign private CA certificates with an external CA (console)

  1. In the ACM Private CA console, select your private CA from the list.
  2. From the Actions menu, choose Import CA certificate. The ACM Private CA console returns the certificate signing request (CSR).
  3. Choose Export CSR to a file and save it locally.
  4. Choose Next.
    1. Use your existing Windows root CA.
    2. Copy the CSR to the root CA and sign it.
    3. Export the signed CSR in base64 format.
    4. Export the <RootCA>.crt certificate in base64 format.
  5. On the Upload the certificates page, upload the signed CSR and the RootCA certificates.
  6. Choose Confirm and Import to import the private CA certificate.
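If you prefer to script the export and import steps around the offline signing, the sketch below shows the equivalent boto3 calls. The CA ARN and file names are placeholders; the signing itself still happens on your Windows root CA.

# Sketch: retrieve the private CA's CSR, and, after signing it offline with
# the Windows root CA, import the signed certificate plus the root CA chain.
import boto3

pca = boto3.client("acm-pca", region_name="us-east-1")
ca_arn = "arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/EXAMPLE"

# Export the CSR so the Windows root CA can sign it.
csr = pca.get_certificate_authority_csr(CertificateAuthorityArn=ca_arn)["Csr"]
with open("private-ca.csr", "w") as f:
    f.write(csr)

# After signing offline, import the signed CA certificate and the root chain.
with open("private-ca-signed.crt", "rb") as cert, open("RootCA.crt", "rb") as chain:
    pca.import_certificate_authority_certificate(
        CertificateAuthorityArn=ca_arn,
        Certificate=cert.read(),
        CertificateChain=chain.read(),
    )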

To request a private certificate using the ACM console

Note: Make a note of the IDs of the certificates you configure in this section; you’ll need them when you deploy the HTTPS listener CloudFormation template.

  1. Sign in to the console and open the ACM console.
  2. Choose Request a certificate.
  3. On the Request a certificate page, choose Request a private certificate, and then choose Request a certificate to continue.
  4. On the Select a certificate authority (CA) page, choose Select a CA to view the list of available private CAs.
  5. Choose Next.
  6. On the Add domain names page, enter your domain name. You can use a fully qualified domain name, such as www.example.com, or a bare—also called apex—domain name such as example.com. You can also use an asterisk (*) as a wild card in the leftmost position to include all subdomains in the same root domain. For example, you can use *.example.com to include all subdomains of the root domain example.com.
  7. To add another domain name, choose Add another name to this certificate and enter the name in the text box.
  8. (Optional) On the Add tags page, tag your certificate.
  9. When you finish adding tags, choose Review and request.
  10. If the Review and request page contains the correct information about your request, choose Confirm and request.

Note: You can learn more at Requesting a Private Certificate.
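The console steps above can also be scripted. A minimal boto3 sketch follows; the domain names and CA ARN are placeholders, and the returned certificate ARN is the value to note for the HTTPS listener template.

# Sketch: request a private certificate from the ACM private CA.
import boto3

acm = boto3.client("acm", region_name="us-east-1")

response = acm.request_certificate(
    DomainName="pki.example.internal",
    SubjectAlternativeNames=["*.pki.example.internal"],
    CertificateAuthorityArn="arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/EXAMPLE",
)

# Keep this ARN handy for the HTTPS listener CloudFormation template.
print(response["CertificateArn"])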

To share the private CA with other accounts or with your organization

You can use ACM Private CA to share a single private CA with multiple AWS accounts. To share your private CA with multiple accounts, follow the instructions in How to use AWS RAM to share your ACM Private CA cross-account.
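The linked post covers the full cross-account setup; as a rough illustration, sharing the CA through AWS RAM looks something like the sketch below. The CA ARN and account ID are placeholders.

# Sketch: share the private CA with another AWS account through AWS RAM.
import boto3

ram = boto3.client("ram", region_name="us-east-1")

ram.create_resource_share(
    name="shared-private-ca",
    resourceArns=[
        "arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/EXAMPLE"
    ],
    principals=["444455556666"],     # an account ID, OU ARN, or organization ARN
    allowExternalPrincipals=False,   # keep the share inside your organization
)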

Continue deploying the CloudFormation templates

With the certificates assigned and configured, you can complete the deployment of the CloudFormation templates for this solution.

To deploy the Network Load Balancer template

In this step, you provision a Network Load Balancer.

  1. Navigate to the CloudFormation console.
  2. Choose Upload a template file.
  3. Select 05_PKI_Automated-LoadBalancer-Provisioning.yaml as the CloudFormation stack file, and then choose Next.
  4. On the Specify stack details page, enter a stack name and the parameters. Some parameters are filled in automatically or have a dropdown list that you can use to select existing values.
  5. Choose Next, Next, and Create Stack.

To deploy the HTTPS listener configuration template

The following steps create the HTTPS listener with an initial configuration for the load balancer.

  1. Navigate to the CloudFormation console:
  2. Choose Upload a template file.
  3. Select 06_PKI_Automated-HTTPS-Listener.yaml as the CloudFormation stack file, and then choose Next.
  4. On the Specify stack details page, enter the stack name and the parameters. Some parameters are filled in automatically or have a dropdown list that you can use to select existing values.
  5. Choose Next, Next, and Create Stack.

To deploy the AWS KMS CMK template

In this step, you create an AWS KMS CMK to encrypt the Amazon EBS volumes of the EC2 instances and other resources. This is required for the EC2 instances in this solution.

  1. Open the CloudFormation console.
  2. Choose Upload a template file.
  3. Select 04_PKI_Automated-KMS_CMK-Creation.yaml as the CloudFormation stack file, and then choose Next.
  4. On the Specify stack details page, enter a stack name and the parameters.
  5. Choose Next, Next, and Create Stack.
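The template handles the key creation for you; purely to illustrate what it provisions, here is a minimal boto3 sketch. The alias name is a placeholder, and the template’s actual key policy and settings will differ.

# Sketch: create a customer managed key for EBS encryption and give it an
# alias. No key policy is passed, so the account's default key policy applies.
import boto3

kms = boto3.client("kms", region_name="us-east-1")

key = kms.create_key(
    Description="CMK for PKI EC2 EBS volume encryption",
    KeyUsage="ENCRYPT_DECRYPT",
)
key_id = key["KeyMetadata"]["KeyId"]

kms.create_alias(AliasName="alias/pki-ebs-cmk", TargetKeyId=key_id)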

To deploy the Windows EC2 instances provisioning template

This template provisions a purpose-built Windows EC2 instance within an existing VPC. It provisions an EC2 instance for the Windows CA, encrypts the EBS volume with the AWS KMS CMK, attaches an IAM instance profile, and automatically installs the SSM Agent on the instance.

It also has optional features and flexibility. For example, the template can automatically create a new target group or add the instance to an existing target group. It can also configure listener rules, create Route 53 records, and automatically join an Active Directory domain.

Note: The AWS KMS CMK and the IAM role are required to provision the EC2 instance, while the target group, listener rules, and domain join features are optional.

  1. Navigate to the CloudFormation console.
  2. Choose Upload a template file.
  3. Select 07_PKI_Automated-EC2-Servers-Provisioning.yaml as the CloudFormation stack file, and then choose Next.
  4. On the Specify stack details page, enter the stack name and the parameters. Some parameters are filled in automatically or have a dropdown list that you can use to select existing values.

    Note: The Optional properties section at the end of the parameters list isn’t required if you’re not joining the EC2 instance to an Active Directory domain.

  5. Choose Next, Next, and Create Stack.

Create and initialize a CloudHSM cluster

In this section, you create and configure CloudHSM within the VPC subnets provisioned in previous steps. After the CloudHSM cluster is created and its cluster certificate is signed by the Windows root CA, it is integrated with the EC2 Windows servers provisioned in previous sections.

To create a CloudHSM cluster

  1. Log in to the AWS account, open the console, and navigate to the CloudHSM console.
  2. Choose Create cluster.
  3. In the Cluster configuration section:
    1. Select the VPC you created.
    2. Select the three private subnets you created across the Availability Zones in previous steps.
  4. Choose Next: Review.
  5. Review your cluster configuration, and then choose Create cluster.

To create an HSM

  1. Open the console and go to the CloudHSM cluster you created in the preceding step.
  2. Choose Initialize.
  3. Select an AZ for the HSM that you’re creating, and then choose Create.
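Both of the preceding console flows have API equivalents. Here’s a minimal boto3 sketch that creates the cluster and its first HSM; the subnet IDs and Availability Zone are placeholders for the ones you created earlier.

# Sketch: create a CloudHSM cluster in the subnets created for it, then add
# the first HSM.
import boto3

hsm = boto3.client("cloudhsmv2", region_name="us-east-1")

cluster = hsm.create_cluster(
    HsmType="hsm1.medium",
    SubnetIds=["subnet-aaaa1111", "subnet-bbbb2222", "subnet-cccc3333"],
)
cluster_id = cluster["Cluster"]["ClusterId"]

# The cluster must reach the UNINITIALIZED state before HSMs can be added;
# poll DescribeClusters (or watch the console) before making this call.
hsm.create_hsm(ClusterId=cluster_id, AvailabilityZone="us-east-1a")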

To download and sign a CSR

Before you can initialize the cluster, you must download and sign a CSR generated by the first HSM of the cluster.

  1. Open the CloudHSM console.
  2. Choose Initialize next to the cluster that you created previously.
  3. When the CSR is ready, select Cluster CSR to download it.

    Figure 5: Download CSR

To initialize the cluster

  1. Open the CloudHSM console.
  2. Choose Initialize next to the cluster that you created previously.
  3. On the Download certificate signing request page, choose Next. If Next is not available, choose one of the CSR or certificate links, and then choose Next.
  4. On the Sign certificate signing request (CSR) page, choose Next.
  5. Use your existing Windows root CA.
    1. Copy the CSR to the root CA and sign it.
    2. Export the signed CSR in base64 format.
    3. Also export the <RootCA>.crt certificate in base64 format.
  6. On the Upload the certificates page, upload the signed CSR and the root CA certificates.
  7. Choose Upload and initialize.
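The download-sign-initialize flow can also be scripted. The sketch below fetches the cluster CSR for offline signing with the Windows root CA, then initializes the cluster with the signed certificate and the root CA certificate; the cluster ID and file names are placeholders.

# Sketch: fetch the cluster CSR, then initialize the cluster with the signed
# certificate and trust anchor (both in base64/PEM format).
import boto3

hsm = boto3.client("cloudhsmv2", region_name="us-east-1")
cluster_id = "cluster-EXAMPLE"

# Download the CSR generated by the first HSM in the cluster.
clusters = hsm.describe_clusters(Filters={"clusterIds": [cluster_id]})["Clusters"]
with open("cluster.csr", "w") as f:
    f.write(clusters[0]["Certificates"]["ClusterCsr"])

# After signing the CSR offline with the Windows root CA:
with open("cluster-signed.crt") as signed, open("RootCA.crt") as root:
    hsm.initialize_cluster(
        ClusterId=cluster_id,
        SignedCert=signed.read(),
        TrustAnchor=root.read(),
    )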

Integrate CloudHSM cluster to Windows Server AD CS

In this section, you use a script that provides step-by-step instructions to help you successfully integrate your Windows Server CA with AWS CloudHSM.

To integrate CloudHSM cluster to Windows Server AD CS

Open the script 09_PKI_AWS_CloudHSM-Windows_CA-Integration-Playbook.txt and follow the instructions to complete the CloudHSM integration with the Windows servers.

Install and configure Windows CA with CloudHSM

When the CloudHSM integration is complete, install and configure your Windows Server CA with the CloudHSM key storage provider and select RSA#Cavium Key Storage Provider as your cryptographic provider.

Conclusion

By deploying the hybrid solution in this post, you’ve implemented a PKI to manage security across all workloads in your AWS accounts and in your on-premises network.

With this solution, you can use a private CA to issue Transport Layer Security (TLS) certificates to your Application Load Balancers, Network Load Balancers, CloudFront, and other AWS workloads across multiple accounts and VPCs. The Windows CA lets you enhance your internal security by binding your internal users, digital devices, and applications to appropriate private keys. You can use this solution with TLS, Internet Protocol Security (IPsec), digital signatures, VPNs, wireless network authentication, and more.

Additional resources

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Certificate Manager forum or CloudHSM forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Max Farnga

Max is a Security Transformation Consultant with AWS Professional Services – Security, Risk and Compliance team. He has a diverse technical background in infrastructure, security, and cloud computing. He helps AWS customers implement secure and innovative solutions on the AWS cloud.

Why (and how) GitHub is adopting OpenTelemetry

Post Syndicated from Wolfgang Hennerbichler original https://github.blog/2021-05-26-why-and-how-github-is-adopting-opentelemetry/

Over the years, GitHub engineers have developed many ways to observe how our systems behave. We mostly make use of statsd for metrics, the syslog format for plain text logs and OpenTracing for request traces. While we have somewhat standardized what we emit, we tend to solve the same problems over and over in each new system we develop.

And, while each component serves its individual purpose well, interoperability is a challenge. For example, several pieces of GitHub’s infrastructure use different statsd dialects, which means we have to special-case our telemetry code in different places – a non-trivial amount of work!

Different components can use different vocabularies for similar observability concepts, making investigatory work difficult. For example, there’s quite a bit of complexity around following a GitHub.com request across various telemetry signals, which can include metrics, traces, logs, errors and exceptions. While we emit a lot of telemetry data, it can still be difficult to use in practice.

We needed a solution that would allow us to standardize telemetry usage at GitHub, while making it easy for developers around the organization to instrument their code. The OpenTelemetry project provided us with exactly that!

Introducing OpenTelemetry

OpenTelemetry introduces a common, vendor-neutral format for telemetry signals: OTLP. It also enables telemetry signals to be easily correlated with each other.

Moreover, we can reduce manual work for our engineers by adopting the OpenTelemetry SDKs – their inherent extensibility allows engineers to avoid re-instrumenting their applications if we need to change our telemetry collection backend, and with only one client SDK per language we can easily propagate best practices among codebases.

OpenTelemetry empowers us to build integrated and opinionated solutions for our engineers. Designing for observability can be at the forefront of our application engineers’ minds, because we can make it so rewarding.

What we’re working on, and why

At the moment, we’re focusing on building out excellent support for the OpenTelemetry tracing signal. We believe that tracing your application should be the main entry point to observability, because we live in a distributed-systems world and tracing illuminates this in a way that’s almost magical.

We believe that tracing allows us to naturally and easily add/derive additional signals once it’s in place: many metrics, for example, can be calculated by backends automatically, tracing events can be converted to detailed logs automatically, and exceptions can automatically be reported to our tracking systems.

We’re building opinionated internal helper libraries for OpenTelemetry, so that engineers can add tracing to their systems quickly, rather than spending hours re-inventing the wheel for unsatisfying results. For example, our helper libraries automatically make sure you’re not emitting traces during your tests, to keep your test suites running smoothly:

# in an initializer, or config/application.rb
OpenTelemetry::SDK.configure do |c|
  if Rails.env.test?
    c.add_span_processor(
      # In production, you almost certainly want BatchSpanProcessor!
      OpenTelemetry::SDK::Trace::Export::SimpleSpanProcessor.new(
        OpenTelemetry::SDK::Trace::Export::NoopSpanExporter.new
      )
    )
  end
end

The OpenTelemetry community has started writing libraries that automatically add distributed tracing capabilities to other libraries we rely on – such as the Ruby Postgres adapter, for example. We believe that careful application of auto-instrumentation can yield powerful results that encourage developers to customize tracing to meet their specific needs. Here, in a Rails application that did not previously have any tracing, the following auto-instrumentation was enough for useful, actionable insights:

# in an initializer, or config/application.rb
OpenTelemetry::SDK.configure do |c|
  c.use 'OpenTelemetry::Instrumentation::Rails'
  c.use 'OpenTelemetry::Instrumentation::PG', enable_sql_obfuscation: true
  c.use 'OpenTelemetry::Instrumentation::ActiveJob'
  # This application makes a variety of outbound HTTP calls, with a variety of underlying
  # HTTP client libraries - you may not need this many!
  c.use 'OpenTelemetry::Instrumentation::Faraday'
  c.use 'OpenTelemetry::Instrumentation::Net::HTTP'
  c.use 'OpenTelemetry::Instrumentation::RestClient'
end

This short sample is the standard way to configure the OpenTelemetry SDK—and each c.use line tells the SDK to load and initialize a certain set of auto-instrumentation for us. After that’s happened, we can immediately gain insight into a poorly performing page! This request tracing view below demonstrates a full, crystal-clear trace of database operations from auto-instrumentation alone:

Screenshot of OpenTelemetry tracing to identify poor page performance

We’re building intelligent automatic correlation of the signals that we have today, with OpenTelemetry tracing as the root. By adding trace identifiers into log lines automatically we can connect request traces to logs for example. We have exciting plans about automatically building useful and relevant dashboards, alerts and tools for teams based on these signals. As our OpenTelemetry efforts progress, we’ll shift our focus towards our logging and metrics systems, for the times when tracing isn’t enough. And we’re contributing as much as we can back to the OpenTelemetry project, for everyone to use!

Why should you be excited and how should you get involved?

OpenTelemetry is a CNCF project and it’s been up and running for quite some time. It’s a project with an incredibly wide scope, and one that we believe has real potential to transform how our entire industry approaches one of the most necessary aspects of our work—how to observe and understand our systems. That’s why we’re so excited about it, and why we’re introducing it to our engineers.

Better, more observable distributed systems are possible, and together we have the opportunity to shape them. If it’s interesting to you, we invite you to join the OpenTelemetry community in improving observability for everyone. And if you want to help us make our internal developer experience better, we invite you to join us!

CDK Corner – May 2021

Post Syndicated from Christian Weber original https://aws.amazon.com/blogs/devops/cdk-corner-may-2021/

Social – community engagement

According to Matt Coulter’s tweet, nearly 4000 people signed up for CDK Day to celebrate all things CDK on April 30. As a single-day, two-track event, there was a significant amount of content to learn from while having fun, and interacting with the CDK community.

Eric Johnson, the emcee, keynoted the first session of the morning, presenting “Better together: AWS CDK and AWS SAM.” This keynote announced the public preview of the AWS Serverless Application Model CLI (AWS SAM CLI), which includes support for local development and testing of AWS CDK projects.

To learn more, the blog post announcing the AWS SAM CLI public preview has more detail about the capabilities of the AWS SAM CLI.

If you missed CDK Day, fear not! CDK Day Track 1 and Track 2 are available to watch online.

Great job and round of applause to the sign-language translators, the speakers, the organizers, and the hosts for making the second CDK Day a success! We can’t wait for CDK Day number 3!

Updates to the CDK

AWS CDK v2 developer preview

It’s here! The much-anticipated release of CDK v2’s developer preview is now available!

When using CDK previously, developers in JavaScript and TypeScript have faced challenges with the way that npm handles transitive dependencies: the dependencies that your dependencies rely on. For example, the aws-ec2 package.json file lists dependencies for other CDK construct libraries. If one of these transitive dependencies were updated, all of them would need to be updated, or you would run into dependency tree resolution errors, as seen in this StackOverflow thread.

With v2, all construct modules are now provided in a single package: aws-cdk-lib. All of the dependencies are now pinned to a single version of aws-cdk-lib, making it easier to manage. This also gives you the flexibility of having all CDK construct library modules available without having to run npm install each time you want to use a new construct library.
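For example, in a CDK v2 app every construct library comes from that one dependency. Here’s a minimal sketch in Python, assuming the v2 package layout in which aws-cdk-lib is exposed as the aws_cdk module; the stack and bucket names are arbitrary.

# Sketch: a minimal CDK v2 app that pulls every construct module it needs
# from the single aws-cdk-lib package (pip install aws-cdk-lib constructs).
from aws_cdk import App, Stack, aws_s3 as s3
from constructs import Construct

class StorageStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # One dependency, many modules: s3, ec2, lambda, etc. all ship in aws-cdk-lib.
        s3.Bucket(self, "ArtifactBucket", versioned=True)

app = App()
StorageStack(app, "StorageStack")
app.synth()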

Another change to AWS CDK v2 is the removal of experimental modules. To help promote API stability and comply with semantic versioning, CDK v2 ships only with modules marked as stable.

Experimental modules aren’t going away completely, though. In v1, experimental modules and constructs will be provided together with no change. In v2, experimental modules are distributed and versioned separately from the aws-cdk-lib package, in their own dedicated package and namespace. Once a v2 construct is deemed stable, it is then merged into the aws-cdk-lib package.

The CDK team is still determining the best method of distributing experimental modules and constructs, so stay tuned for more information. Read more about the AWS CDK v2 developer preview in the What’s new blog post.

AWS CDK for Go developer preview

On April 7, the AWS CDK team announced support for Go (golang). From the Go tracking issue on GitHub, nearly 900 members of the CDK community had asked for CDK to support Go, and we’re happy to see it become available! We are looking forward to helping all the golang gophers out there build amazing CDK applications!

To learn more about Go and AWS CDK, read the AWS CDK for Go module API documentation on pkg.go.dev. You can also read the Go bindings for JSII RFC document on GitHub. Want to contribute to the success of Go and CDK? The project tracking board for Go’s General Availability has tasks and items which could use your help.

Construct modules promoted to General Availability

Many new construct modules were promoted to General Availability recently. General Availability indicates a module’s stability, giving confidence to run these modules in production workloads. In April, a total of 15 modules were promoted to stable:

Notable new L2 constructs

In the @aws-cdk/aws-route53 module, name server (NS) records were previously defined with the route53.RecordType enum. In PR#13895, user stijnbrouwers introduces the NS record as its own L2 construct, route53.NSRecord, bringing it into company with other record type L2s such as route53.ARecord. This makes managing NS records consistent with the other record types represented as L2 constructs.

Improving the @aws-cdk/aws-events-targets module, CDK community user hedrall submitted PR#13823. This change brings support for Amazon API Gateway as a target for an Amazon EventBridge event.

@aws-cdk/aws-codepipeline-actions now includes an L2 construct for AWS CodeStar Connections supporting BitBucket and GitHub. This construct lets you create a CDK application that uses AWS CodeStar with a source connection from either provider, thanks to PR#13781 from the CDK Team.

Level ups to existing CDK constructs

Amazon Elastic Inference makes available low-cost GPU-acceleration for deep-learning workloads. PR#13950 now lets you use the service via @aws-cdk/aws-ecs in Amazon Elastic Container Service tasks, from CDK community user upparekh.

In PR#13473, from pgarbe, the @aws-cdk/aws-lambda-nodejs module will now bundle AWS Lambda functions with Docker images sourced from the Amazon Elastic Container Registry (Amazon ECR) Public Registry, instead of DockerHub. Prior to this change, CDK used your DockerHub credentials to pull a Docker image for the Lambda function. If your account is in DockerHub’s free tier, it is throttled whenever it exceeds the API limit within a short time frame set by DockerHub, which can delay your AWS CDK deployment until you are back under DockerHub’s API limit. Moving to the Amazon ECR Public Registry removes the risk of being affected by DockerHub’s API rate limiting. You can read more in this blog post from last year giving customers advice about DockerHub rate limits.

With @aws-cdk/aws-codebuild, you can use concurrent build support to speed up your build process. Sometimes you’ll want to limit the number of builds that run concurrently, whether for cost reduction or reducing the complexity of your build process. PR#14185, authored by gmokki, adds the ability to define a concurrent build limit for an AWS CodeBuild project Stage.

It is common for customers to have applications or resources spanning multiple AWS Regions. If you’re using @aws-cdk/aws-secretsmanager, you can now replicate secrets to multiple Regions, with PR#14266 from the CDK team. Make sure you’re not setting your secret as “test123” for your production databases in multiple Regions!

For users of @aws-cdk/aws-eks, PR#12659 from anguslees lets you pass arguments from bootstrap.sh to avoid the DescribeCluster API call. This will speed up the time it takes nodes to join an EKS cluster.

PR#14250 from the CDK team gives developers using @aws-cdk/aws-ec2 the ability to set fixed IPs when defining NAT gateways. This change will now pre-create Elastic IP address allocations and assign them to the NAT gateway. This can be useful when managing links from an Amazon Virtual Private Cloud (VPC) to an on-premises data center that relies on fixed/static IP addresses.

@aws-cdk/aws-iam now lets you add AWS Identity and Access Management (AWS IAM) users to new or existing groups. For example, you might want to have a user in a specific group for the life of a deployed CDK application. And on stack deletion, revoke that membership. Thanks to PR#13698 from jogold, this is now possible.

Learning – Finds from across the internet

If you work with CDK parameters, you might be curious how parameters derive their names and values. Borislav Hadzhiev released a blog post about setting and using CDK parameters.

Ibrahim Cesar’s wrote an awesome blog post detailing the experience of discovering and working with CDK. It’s an enjoyable read of inspiration and animated gifs.

Twitter user edwin4_ released a tool for CDK automation called RocketCDK. From the project’s GitHub repository, this tool will initialize your CDK app, install your packages, and auto-import them into your stack. Neat! Anything that helps save time is a plus-one.

Community acknowledgments

And finally, congratulations and rounds of applause for these folks who had their first Pull Request merged to the CDK repository!

*These users’ Pull Requests were merged in April.

Thank you for joining us on this update of the CDK corner. See you next time!

Object Lock Roadmap: Veeam Immutability and Data Protection for All

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/object-lock-roadmap-veeam-immutability-and-data-protection-for-all/

Backblaze Object Lock Veeam Data Immutability illustration

We announced that Backblaze earned a Veeam Ready-Object with Immutability qualification in October of 2020, and just yesterday we shared that Object Lock is now available for anyone using the Backblaze S3 Compatible API. After a big announcement, it’s easy to forget about the hard work that went into it. With that in mind, we’ve asked our Senior Java Engineer, Fabian Morgan, who focuses on the Backblaze B2 Cloud Storage product, to explain some of the challenges he found intriguing in the process of developing Object Lock functionality to support both Veeam users and Backblaze B2 customers in general. Read on if you’re interested in using Object Lock (via the Backblaze S3 Compatible API) or File Lock (via the B2 Native API) to protect your data, or if you’re just curious about how we develop new features.

With new reports of ransomware attacks surfacing every day, it’s no surprise that thousands of customers have already started using Object Lock functionality and support for Veeam® immutability via the Backblaze S3 Compatible API since we launched the feature.

We were proud to be the first public cloud storage alternative to Amazon S3 to earn the Veeam Ready-Object with Immutability qualification, but the work started well before that. In this post, I’ll walk you through how we approached development, sifted through discrepancies between AWS documentation and S3 API behavior, solved for problematic retention scenarios, and tested the solutions.

What Are Object Lock and File Lock? Are They the Same Thing?

Object Lock and File Lock both allow you to store objects using a Write Once, Read Many (WORM) model, meaning after it’s written, data cannot be modified or deleted for a defined period of time or indefinitely. Object Lock and File Lock are the same thing, but Object Lock is the terminology used in the S3 Compatible API documentation, and File Lock is the terminology used in the B2 Native API documentation.

illustration of Backblaze Object Lock

How We Developed Object Lock and File Lock

Big picture, we wanted to offer our customers the ability to lock their data, but achieving that functionality for all customers involved a few different development objectives:

  1. First, we wanted to answer the call for immutability support from Veeam + Backblaze B2 customers via the Backblaze S3 Compatible API, but we knew that Veeam was only part of the answer.
  2. We also wanted to offer the ability to lock objects via the S3 Compatible API for non-Veeam customers.
  3. And we wanted to offer the ability to lock files via the B2 Native API.

To avoid overlapping work and achieve priority objectives first, we took a phased approach. Within each phase, we identified tasks that had dependencies and tasks that could be completed in parallel. First, we focused on S3 Compatible API support and the subset of APIs that Veeam used to achieve the Veeam Ready-Object with Immutability qualification. Phase two brought the remainder of the S3 Compatible API as well as File Lock capabilities for the B2 Native API. Phasing development allowed us to be efficient and minimize rework for the B2 Native API after the S3 Compatible API was completed, in keeping with good software principles like code reuse. For organizations that don’t use Veeam, our S3 Compatible API and B2 Native API solutions have been exactly what they needed to lock their files in a cost-effective, easy-to-use way.

AWS Documentation Challenges: Solving for Unexpected Behavior

At the start of the project, we spent a lot of time testing various documented and undocumented scenarios in AWS. For example, the AWS documentation at that point did not specify what happens if you attempt to switch from governance mode to compliance mode and vice versa, so we issued API calls to find out. Moreover, if we saw inconsistencies between the final outputs of the AWS Command Line Interface and the Java SDK library, we would take the raw XML response from the AWS S3 server as the basis for our implementation.

Compliance Mode vs. Governance Mode: What’s the Diff?

In compliance mode, users can extend the retention period, but they cannot shorten it under any circumstances. In governance mode, users can alter the retention period to be shorter or longer, remove it altogether, or even remove the file itself if they have an enhanced application key capability along with the standard read and write capabilities. Without the enhanced application key capability, governance mode behaves similarly to compliance mode.
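To make the difference concrete, here’s a minimal sketch that sets and then shortens a governance-mode retention period through the Backblaze S3 Compatible API with boto3. The endpoint, bucket, key, and credentials are placeholders, and the bucket is assumed to already have Object Lock enabled.

# Sketch: apply a governance-mode retention period to an object in B2 via the
# S3 Compatible API, then shorten it. Shortening only succeeds in governance
# mode, with the bypass flag and an application key that has the capability.
from datetime import datetime, timezone
import boto3

b2 = boto3.client(
    "s3",
    endpoint_url="https://s3.us-west-002.backblazeb2.com",  # your bucket's endpoint
    aws_access_key_id="<applicationKeyId>",
    aws_secret_access_key="<applicationKey>",
)

# Lock the object until a future date.
b2.put_object_retention(
    Bucket="my-bucket",
    Key="backups/veeam-restore-point.vbk",
    Retention={
        "Mode": "GOVERNANCE",
        "RetainUntilDate": datetime(2026, 1, 1, tzinfo=timezone.utc),
    },
)

# Shorten the period; in compliance mode this request would be rejected.
b2.put_object_retention(
    Bucket="my-bucket",
    Key="backups/veeam-restore-point.vbk",
    Retention={
        "Mode": "GOVERNANCE",
        "RetainUntilDate": datetime(2025, 6, 1, tzinfo=timezone.utc),
    },
    BypassGovernanceRetention=True,
)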

Not only did the AWS documentation fail to account for some scenarios, there were instances when the AWS documentation didn’t match the actual system behavior. We utilized an existing AWS S3 service to test the API responses with Postman, an API development platform, and compared them to the documentation. We made the decision to mimic the observed behavior rather than the documented behavior in order to maximize compatibility. We resolved the inconsistencies by making the same feature API invocation against the AWS S3 service and our server, then verifying that our server returned XML similar to the AWS S3 service’s.

Retention Challenges: What If a Customer Wants to Close Their Account?

Our team raised an intriguing question in the development process: What if a customer accidentally sets the retention term far in the future, and then they want to close their account?

Originally, we required customers to delete the buckets and files they created or uploaded before closing their account. If, for example, they enabled Object Lock on any files in compliance mode, and had not yet reached the retention expiration date when they wanted to close their account, they couldn’t delete those files. A good thing for data protection. A bad thing for customers who want to leave (even though we hate to see them go, we still want to make it as easy as possible).

The question spawned a separate project that allowed customers to close their account without deleting files and buckets. After the account was closed, the system would asynchronously delete the files, even if they were under retention, and then delete the associated buckets. However, this led to another problem: If we allow files with retention to be deleted asynchronously in this scenario, how do we ensure that no other files with retention are mistakenly deleted?

The tedious but truthful answer is that we added extensive checks and tests to ensure that the system would only delete files under retention in two scenarios (assuming the retention date had not already expired):

  1. If a customer closed an account.
  2. If the file was retained under governance mode, and the customer had the appropriate application key capability when submitting the delete request.

illustration of a lock and a cloud

Testing, Testing: Out-thinking Threats

Features like Object Lock or File Lock have to be bulletproof. As such, testing different scenarios, like the retention example above and many others, posed the most interesting challenges. One critical example: We had to ensure that we protected locked files such that there was no back door or sequence of API calls that would allow someone to delete a file with Object Lock or File Lock enabled. Not only that, we also had to prevent the metadata of the lock properties from being changed.

We approached this problem like a bank teller approaches counterfeit bill identification. They don’t study the counterfeits; they study the real thing to know the difference. What does that mean for us? There are an infinite number of ways a nefarious actor could try to game the system, just like there are an infinite number of counterfeits out there. Instead of thinking of every possible scenario, we identified the handful of ways a user could delete a file, then solved for how to reject anything outside of those strict parameters.

Looking Back

Developing and testing Object Lock and File Lock was truly a team effort, and making sure we had everything accounted for and covered was an exercise that we all welcomed. We expected challenges along the way, and thanks to our great team members, both on the Engineering team and in Compliance, TechOps, and QA, we were able to meet them. When all was said and done, it felt great to be able to work on a much sought-after feature and deliver even more data protection to our customers.

“The immutability support from Backblaze made the decision to tier our Veeam backups to Backblaze B2 easy. Immutability has given us one more level of protection against the hackers. That’s why that was so important to us and most importantly, to our customers.”
—Gregory Tellone, CEO, Continuity Centers

This post was written in collaboration by Fabian Morgan and Molly Clancy.

The post Object Lock Roadmap: Veeam Immutability and Data Protection for All appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.
