Tag Archives: amf

Avoiding TPM PCR fragility using Secure Boot

Post Syndicated from Matthew Garrett original http://mjg59.dreamwidth.org/48897.html

In measured boot, each component of the boot process is “measured” (ie, hashed and that hash recorded) in a register in the Trusted Platform Module (TPM) built into the system. The TPM has several different registers (Platform Configuration Registers, or PCRs) which are typically used for different purposes – for instance, PCR0 contains measurements of various system firmware components, PCR2 contains any option ROMs, PCR4 contains information about the partition table and the bootloader. The allocation of these is defined by the PC Client working group of the Trusted Computing Group. However, once the boot loader takes over, we’re outside the spec[1].

One important thing to note here is that the TPM doesn’t actually have any ability to directly interfere with the boot process. If you try to boot modified code on a system, the TPM will contain different measurements but boot will still succeed. What the TPM can do is refuse to hand over secrets unless the measurements are correct. This allows for configurations where your disk encryption key can be stored in the TPM and then handed over automatically if the measurements are unaltered. If anybody interferes with your boot process then the measurements will be different, the TPM will refuse to hand over the key, your disk will remain encrypted and whoever’s trying to compromise your machine will be sad.

The problem here is that a lot of things can affect the measurements. Upgrading your bootloader or kernel will do so. At that point, if you reboot, your disk fails to unlock and you become unhappy. To get around this, your update system needs to notice that a new component is about to be installed, generate the new expected hashes and re-seal the secret to the TPM using the new hashes. If there are several different points in the update where this can happen, this can quite easily go wrong. And if it goes wrong, you’re back to being unhappy.

Is there a way to improve this? Surprisingly, the answer is “yes” and the people to thank are Microsoft. Appendix A of a basically entirely unrelated spec defines a mechanism for storing the UEFI Secure Boot policy and used keys in PCR 7 of the TPM. The idea here is that you trust your OS vendor (since otherwise they could just backdoor your system anyway), so anything signed by your OS vendor is acceptable. If someone tries to boot something signed by a different vendor then PCR 7 will be different. If someone disables secure boot, PCR 7 will be different. If you upgrade your bootloader or kernel, PCR 7 will be the same. This simplifies things significantly.

I’ve put together a (not well-tested) patchset for Shim that adds support for including Shim’s measurements in PCR 7. In conjunction with appropriate firmware, it should then be straightforward to seal secrets to PCR 7 and not worry about things breaking over system updates. This makes tying things like disk encryption keys to the TPM much more reasonable.

However, there’s still one pretty major problem, which is that the initramfs (ie, the component responsible for setting up the disk encryption in the first place) isn’t signed and isn’t included in PCR 7[2]. An attacker can simply modify it to stash any TPM-backed secrets or mount the encrypted filesystem and then drop to a root prompt. This, uh, reduces the utility of the entire exercise.

The simplest solution to this that I’ve come up with depends on how Linux implements initramfs files. In its simplest form, an initramfs is just a cpio archive. In its slightly more complicated form, it’s a compressed cpio archive. And in its peak form of evolution, it’s a series of compressed cpio archives concatenated together. As the kernel reads each one in turn, it extracts it over the previous ones. That means that any files in the final archive will overwrite files of the same name in previous archives.
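
As a rough illustration of that layering (the directory and file names here are hypothetical), two compressed cpio archives can be built and concatenated, and the kernel will extract them in order, with files from the second overwriting same-named files from the first:

# build each archive from its own directory tree
(cd base && find . | cpio -o -H newc | gzip) > base.img
(cd override && find . | cpio -o -H newc | gzip) > override.img
# concatenate them in the order the kernel should apply them
cat base.img override.img > initramfs.img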

My proposal is to generate a small initramfs whose sole job is to get secrets from the TPM and stash them in the kernel keyring, and then measure an additional value into PCR 7 in order to ensure that the secrets can’t be obtained again. Later disk encryption setup will then be able to set up dm-crypt using the secret already stored within the kernel. This small initramfs will be built into the signed kernel image, and the bootloader will be responsible for appending it to the end of any user-provided initramfs. This means that the TPM will only grant access to the secrets while trustworthy code is running – once the secret is in the kernel it will only be available for in-kernel use, and once PCR 7 has been modified the TPM won’t give it to anyone else. A similar approach for some kernel command-line arguments (the kernel, module-init-tools and systemd all interpret the kernel command line left-to-right, with later arguments overriding earlier ones) would make it possible to ensure that certain kernel configuration options (such as the iommu) weren’t overridable by an attacker.

There are obviously a few things that have to be done here (standardise how to embed such an initramfs in the kernel image, ensure that luks knows how to use the kernel keyring, teach all relevant bootloaders how to handle these images), but overall this should make it practical to use PCR 7 as a mechanism for supporting TPM-backed disk encryption secrets on Linux without introducing a huge support burden in the process.

[1] The patchset I’ve posted to add measured boot support to Grub uses PCRs 8 and 9 to measure various components during the boot process, but other bootloaders may have different policies.

[2] This is because most Linux systems generate the initramfs locally rather than shipping it pre-built. It may also get rebuilt on various userspace updates, even if the kernel hasn’t changed. Including it in PCR 7 would reintroduce exactly the fragility we’re trying to avoid and defeat the point of all of this.


Manage Kubernetes Clusters on AWS Using Kops

Post Syndicated from Arun Gupta original https://aws.amazon.com/blogs/compute/kubernetes-clusters-aws-kops/

Any containerized application typically consists of multiple containers. There is a container for the application itself, one for the database, possibly another for a web server, and so on. During development, it’s normal to build and test this multi-container application on a single host. This approach works fine during early dev and test cycles, but becomes a single point of failure for production, where the availability of the application is critical. In such cases, the multi-container application is deployed on multiple hosts, and an external tool is needed to manage such a multi-container, multi-host deployment. Container orchestration frameworks provide cluster management, scheduling of containers on different hosts, service discovery and load balancing, crash recovery, and other related functionality. There are multiple options for container orchestration on Amazon Web Services: Amazon ECS, Docker for AWS, and DC/OS.

Another popular option for container orchestration on AWS is Kubernetes. There are multiple ways to run a Kubernetes cluster on AWS. This multi-part blog series provides a brief overview and explains some of these approaches in detail. This first post explains how to create a Kubernetes cluster on AWS using kops.

Kubernetes and Kops overview

Kubernetes is an open source, container orchestration platform. Applications packaged as Docker images can be easily deployed, scaled, and managed in a Kubernetes cluster. Some of the key features of Kubernetes are:

  • Self-healing
    Failed containers are restarted to ensure that the desired state of the application is maintained. If a node in the cluster dies, then its containers are rescheduled on a different node. Containers that do not respond to application-defined health checks are terminated and rescheduled.
  • Horizontal scaling
    The number of containers can be scaled up and down automatically based upon CPU utilization, or manually using a command (see the example after this list).
  • Service discovery and load balancing
    Multiple containers can be grouped together and made discoverable using a DNS name. The service can be load balanced by integrating with the native load balancer provided by the cloud provider.
  • Application upgrades and rollbacks
    Applications can be upgraded to a newer version without an impact to the existing one. If something goes wrong, Kubernetes rolls back the change.
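
As an illustration of the scaling commands (the deployment name my-app is hypothetical), manual and automatic horizontal scaling look like this:

# scale a deployment to a fixed number of replicas
kubectl scale deployment my-app --replicas=5
# or let Kubernetes scale it between 2 and 10 replicas based on CPU utilization
kubectl autoscale deployment my-app --min=2 --max=10 --cpu-percent=80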

Kops, short for Kubernetes Operations, is a set of tools for installing, operating, and deleting Kubernetes clusters in the cloud. A rolling upgrade of an older version of Kubernetes to a new version can also be performed. It also manages the cluster add-ons. After the cluster is created, the usual kubectl CLI can be used to manage resources in the cluster.

Download Kops and Kubectl

There is no need to download the Kubernetes binary distribution for creating a cluster using kops. However, you do need to download the kops CLI. It then takes care of downloading the right Kubernetes binary in the cloud, and provisions the cluster.

The different download options for kops are explained at github.com/kubernetes/kops#installing. On MacOS, the easiest way to install kops is using the brew package manager.

brew update && brew install kops

The version of kops can be verified using the kops version command, which shows:

Version 1.6.1

In addition, download kubectl. This is required to manage the Kubernetes cluster. The latest version of kubectl can be downloaded using the following command:

curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/darwin/amd64/kubectl

Make sure to include the directory where kubectl is downloaded in your PATH.
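
For example, assuming kubectl was downloaded to the current directory, one common approach (the install location is just one option) is:

chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl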

IAM user permission

The IAM user to create the Kubernetes cluster must have the following permissions:

  • AmazonEC2FullAccess
  • AmazonRoute53FullAccess
  • AmazonS3FullAccess
  • IAMFullAccess
  • AmazonVPCFullAccess

Alternatively, a new IAM user may be created and the policies attached as explained at github.com/kubernetes/kops/blob/master/docs/aws.md#setup-iam-user.
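
As a sketch of that setup (the group and user name kops are illustrative), the required policies can be attached with the AWS CLI:

aws iam create-group --group-name kops
aws iam attach-group-policy --group-name kops --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
aws iam attach-group-policy --group-name kops --policy-arn arn:aws:iam::aws:policy/AmazonRoute53FullAccess
aws iam attach-group-policy --group-name kops --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
aws iam attach-group-policy --group-name kops --policy-arn arn:aws:iam::aws:policy/IAMFullAccess
aws iam attach-group-policy --group-name kops --policy-arn arn:aws:iam::aws:policy/AmazonVPCFullAccess
aws iam create-user --user-name kops
aws iam add-user-to-group --user-name kops --group-name kops
aws iam create-access-key --user-name kops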

Create an Amazon S3 bucket for the Kubernetes state store

Kops needs a “state store” to hold configuration information about the cluster: for example, the number of nodes, the instance type of each node, and the Kubernetes version. The state is stored during the initial cluster creation, and any subsequent changes to the cluster are persisted to this store as well. As of publication, Amazon S3 is the only supported storage mechanism. Create an S3 bucket and pass it to the kops CLI during cluster creation.

This post uses the bucket name kubernetes-aws-io. Bucket names must be globally unique, so you have to use a different name. Create an S3 bucket:

aws s3api create-bucket --bucket kubernetes-aws-io

I strongly recommend versioning this bucket in case you ever need to revert or recover a previous version of the cluster. This can be enabled using the AWS CLI as well:

aws s3api put-bucket-versioning --bucket kubernetes-aws-io --versioning-configuration Status=Enabled

For convenience, you can also define KOPS_STATE_STORE environment variable pointing to the S3 bucket. For example:

export KOPS_STATE_STORE=s3://kubernetes-aws-io

This environment variable is then used by the kops CLI.

DNS configuration

As of Kops 1.6.1, a top-level domain or a subdomain is required to create the cluster. This domain allows the worker nodes to discover the master and the master to discover all the etcd servers. This is also needed for kubectl to be able to talk directly with the master.

This domain may be registered with AWS, in which case a Route 53 hosted zone is created for you. Alternatively, this domain may be at a different registrar. In this case, create a Route 53 hosted zone. Specify the name server (NS) records from the created zone as NS records with the domain registrar.

This post uses a kubernetes-aws.io domain registered at a third-party registrar.

Generate a Route 53 hosted zone using the AWS CLI. Download jq to run this command:

ID=$(uuidgen) && \
aws route53 create-hosted-zone \
--name cluster.kubernetes-aws.io \
--caller-reference $ID \
| jq .DelegationSet.NameServers

This shows an output such as the following:

[
"ns-94.awsdns-11.com",
"ns-1962.awsdns-53.co.uk",
"ns-838.awsdns-40.net",
"ns-1107.awsdns-10.org"
]

Create NS records for the domain with your registrar. Different options on how to configure DNS for the cluster are explained at github.com/kubernetes/kops/blob/master/docs/aws.md#configure-dns.

Experimental support to create a gossip-based cluster was added in Kops 1.6.2. This post uses a DNS-based approach, as that is more mature and well tested.

Create the Kubernetes cluster

The Kops CLI can be used to create a highly available cluster, with multiple master nodes spread across multiple Availability Zones. Workers can be spread across multiple zones as well. Some of the tasks that happen behind the scene during cluster creation are:

  • Provisioning EC2 instances
  • Setting up AWS resources such as networks, Auto Scaling groups, IAM users, and security groups
  • Installing Kubernetes

Start the Kubernetes cluster using the following command:

kops create cluster \
--name cluster.kubernetes-aws.io \
--zones us-west-2a \
--state s3://kubernetes-aws-io \
--yes

In this command:

  • --zones
    Defines the zones in which the cluster is going to be created. Multiple comma-separated zones can be specified to span the cluster across multiple zones.
  • --name
    Defines the cluster’s name.
  • --state
    Points to the S3 bucket that is the state store.
  • --yes
    Immediately creates the cluster. Otherwise, only the cluster specification is stored in the state store, and the cloud resources need to be created explicitly using the command kops update cluster --yes. If the cluster needs to be edited, then the kops edit cluster command can be used.

This starts a single master and two worker node Kubernetes cluster. The master is in an Auto Scaling group and the worker nodes are in a separate group. By default, the master node is m3.medium and the worker node is t2.medium. Master and worker nodes are assigned separate IAM roles as well.

Wait for a few minutes for the cluster to be created. The cluster can be verified using the command kops validate cluster --state=s3://kubernetes-aws-io. It shows the following output:

Using cluster from kubectl context: cluster.kubernetes-aws.io

Validating cluster cluster.kubernetes-aws.io

INSTANCE GROUPS
NAME                 ROLE      MACHINETYPE    MIN    MAX    SUBNETS
master-us-west-2a    Master    m3.medium      1      1      us-west-2a
nodes                Node      t2.medium      2      2      us-west-2a

NODE STATUS
NAME                                           ROLE      READY
ip-172-20-38-133.us-west-2.compute.internal    node      True
ip-172-20-38-177.us-west-2.compute.internal    master    True
ip-172-20-46-33.us-west-2.compute.internal     node      True

Your cluster cluster.kubernetes-aws.io is ready

It shows the different instances started for the cluster, and their roles. If multiple cluster states are stored in the same bucket, then --name <NAME> can be used to specify the exact cluster name.
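
For example, to validate the first cluster by name:

kops validate cluster --name cluster.kubernetes-aws.io --state=s3://kubernetes-aws-io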

Check all nodes in the cluster using the command kubectl get nodes:

NAME                                          STATUS         AGE       VERSION
ip-172-20-38-133.us-west-2.compute.internal   Ready,node     14m       v1.6.2
ip-172-20-38-177.us-west-2.compute.internal   Ready,master   15m       v1.6.2
ip-172-20-46-33.us-west-2.compute.internal    Ready,node     14m       v1.6.2

Again, the output shows the internal IP address of each node, its role (master or node), and its age. The key information here is the Kubernetes version running on each node, 1.6.2 in this case.

The kubectl binary that you added to the PATH earlier is automatically configured to manage this cluster. Resources such as pods, replica sets, and services can now be created in the usual way.

Some of the common options that can be used to override the default cluster creation are:

  • --kubernetes-version
    The version of Kubernetes cluster. The exact versions supported are defined at github.com/kubernetes/kops/blob/master/channels/stable.
  • --master-size and --node-size
    Define the instance types of the master and worker nodes.
  • --master-count and --node-count
    Define the number of master and worker nodes. By default, a master is created in each zone specified by --master-zones. Multiple master nodes can be created by specifying a higher number with --master-count or by listing multiple Availability Zones in --master-zones.

A three-master and five-worker node cluster, with master nodes spread across different Availability Zones, can be created using the following command:

kops create cluster \
--name cluster2.kubernetes-aws.io \
--master-zones us-west-2a,us-west-2b,us-west-2c \
--zones us-west-2a,us-west-2b,us-west-2c \
--node-count 5 \
--state s3://kubernetes-aws-io \
--yes

Both clusters share the same state store but have different names. This also requires you to create an additional Amazon Route 53 hosted zone for the new name.

By default, the resources required for the cluster are directly created in the cloud. The --target option can be used to generate the AWS CloudFormation scripts instead. These scripts can then be used by the AWS CLI to create resources at your convenience.
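
A minimal sketch of that workflow (assuming your kops version supports the cloudformation target; the output directory is arbitrary):

kops update cluster \
--name cluster.kubernetes-aws.io \
--state s3://kubernetes-aws-io \
--target=cloudformation \
--out=.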

Get a complete list of options for cluster creation with kops create cluster --help.

More details about the cluster can be seen using the command kubectl cluster-info:

Kubernetes master is running at https://api.cluster.kubernetes-aws.io
KubeDNS is running at https://api.cluster.kubernetes-aws.io/api/v1/proxy/namespaces/kube-system/services/kube-dns

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Check the client and server version using the command kubectl version:

Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:44:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.2", GitCommit:"477efc3cbe6a7effca06bd1452fa356e2201e1ee", GitTreeState:"clean", BuildDate:"2017-04-19T20:22:08Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}

Both client and server version are 1.6 as shown by the Major and Minor attribute values.

Upgrade the Kubernetes cluster

Kops can be used to create a Kubernetes 1.4.x, 1.5.x, or an older version of the 1.6.x cluster using the --kubernetes-version option. The exact versions supported are defined at github.com/kubernetes/kops/blob/master/channels/stable.

Or, you may have used kops to create a cluster a while ago, and now want to upgrade to the latest recommended version of Kubernetes. Kops supports rolling cluster upgrades where the master and worker nodes are upgraded one by one.

As of kops 1.6.1, upgrading a cluster is a three-step process.

First, check and apply the latest recommended Kubernetes update.

kops upgrade cluster \
--name cluster2.kubernetes-aws.io \
--state s3://kubernetes-aws-io \
--yes

The --yes option immediately applies the changes. Omitting the --yes option shows only the changes that would be applied.

Second, update the state store to match the cluster state. This can be done using the following command:

kops update cluster \
--name cluster2.kubernetes-aws.io \
--state s3://kubernetes-aws-io \
--yes

Lastly, perform a rolling update for all cluster nodes using the kops rolling-update command:

kops rolling-update cluster \
--name cluster2.kubernetes-aws.io \
--state s3://kubernetes-aws-io \
--yes

Previewing the changes before updating the cluster can be done using the same command but without specifying the --yes option. This shows the following output:

NAME                 STATUS        NEEDUPDATE    READY    MIN    MAX    NODES
master-us-west-2a    NeedsUpdate   1             0        1      1      1
nodes                NeedsUpdate   2             0        2      2      2

Using --yes updates all nodes in the cluster, first the master and then the workers. There is a 5-minute delay between restarting master nodes, and a 2-minute delay between restarting worker nodes. These values can be altered using the --master-interval and --node-interval options, respectively.

Only the worker nodes may be updated by using the --instance-group nodes option.
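
For example, to roll only the worker instance group (named nodes in the validation output shown earlier):

kops rolling-update cluster \
--name cluster2.kubernetes-aws.io \
--state s3://kubernetes-aws-io \
--instance-group nodes \
--yes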

Delete the Kubernetes cluster

Typically, the Kubernetes cluster is a long-running cluster to serve your applications. After its purpose is served, you may delete it. It is important to delete the cluster using the kops command. This ensures that all resources created by the cluster are appropriately cleaned up.

The command to delete the Kubernetes cluster is:

kops delete cluster --state=s3://kubernetes-aws-io --yes

If multiple clusters have been created, then specify the cluster name as in the following command:

kops delete cluster cluster2.kubernetes-aws.io --state=s3://kubernetes-aws-io --yes

Conclusion

This post explained how to manage a Kubernetes cluster on AWS using kops. The Kubernetes on AWS users page provides a self-published list of companies using Kubernetes on AWS.

Try starting a cluster, creating a few Kubernetes resources, and then tearing it down. Kops on AWS provides a more comprehensive tutorial for setting up Kubernetes clusters. The kops docs are also helpful for understanding the details.

In addition, the Kops team hosts office hours to help you get started, including guidance on your first pull request. You can always join the #kops channel on the Kubernetes Slack to ask questions. If nothing works, then file an issue at github.com/kubernetes/kops/issues.

Future posts in this series will explain other ways of creating and running a Kubernetes cluster on AWS.

— Arun

Building High-Throughput Genomic Batch Workflows on AWS: Batch Layer (Part 3 of 4)

Post Syndicated from Andy Katz original https://aws.amazon.com/blogs/compute/building-high-throughput-genomic-batch-workflows-on-aws-batch-layer-part-3-of-4/

Aaron Friedman is a Healthcare and Life Sciences Partner Solutions Architect at AWS

Angel Pizarro is a Scientific Computing Technical Business Development Manager at AWS

This post is the third in a series on how to build a genomics workflow on AWS. In Part 1, we introduced a general architecture, shown below, and highlighted the three common layers in a batch workflow:

  • Job
  • Batch
  • Workflow

In Part 2, you built a Docker container for each job that needed to run as part of your workflow, and stored them in Amazon ECR.

In Part 3, you tackle the batch layer and build a scalable, elastic, and easily maintainable batch engine using AWS Batch.

AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. It dynamically provisions the optimal quantity and type of compute resources (for example, CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs that you submit. With AWS Batch, you do not need to install and manage your own batch computing software or server clusters, which allows you to focus on analyzing results, such as those of your genomic analysis.

Integrating applications into AWS Batch

If you are new to AWS Batch, we recommend reading Setting Up AWS Batch to ensure that you have the proper permissions and AWS environment.

After you have a working environment, you define several types of resources:

  • IAM roles that provide service permissions
  • A compute environment that launches and terminates compute resources for jobs
  • A custom Amazon Machine Image (AMI)
  • A job queue to submit the units of work and to schedule the appropriate resources within the compute environment to execute those jobs
  • Job definitions that define how to execute an application

After the resources are created, you’ll test the environment and create an AWS Lambda function to send generic jobs to the queue.

This genomics workflow covers the basic steps. For more information, see Getting Started with AWS Batch.

Creating the necessary IAM roles

AWS Batch simplifies batch processing by managing a number of underlying AWS services so that you can focus on your applications. As a result, you create IAM roles that give the service permissions to act on your behalf. In this section, deploy the AWS CloudFormation template included in the GitHub repository and extract the ARNs for later use.

To deploy the stack, go to the top level in the repo with the following command:

aws cloudformation create-stack --template-body file://batch/setup/iam.template.yaml --stack-name iam --capabilities CAPABILITY_NAMED_IAM

You can capture the output from this stack in the Outputs tab in the CloudFormation console.
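
Alternatively, here is a sketch of retrieving the same values with the CLI:

aws cloudformation describe-stacks --stack-name iam --query 'Stacks[0].Outputs'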

Creating the compute environment

In AWS Batch, you will set up a managed compute environment. Managed compute environments automatically launch and terminate compute resources on your behalf, based on the aggregate resources needed by your jobs, such as vCPU and memory, and simple boundaries that you define.

When defining your compute environment, specify the following:

  • Desired instance types in your environment
  • Min and max vCPUs in the environment
  • The Amazon Machine Image (AMI) to use
  • Percentage value for bids on the Spot Market, and the VPC subnets that can be used

AWS Batch then provisions an elastic and heterogeneous pool of Amazon EC2 instances based on the aggregate resource requirements of jobs sitting in the RUNNABLE state. If a mix of CPU and memory-intensive jobs are ready to run, AWS Batch provisions the appropriate ratio and size of CPU and memory-optimized instances within your environment. For this post, you will use the simplest configuration, in which instance types are set to "optimal" allowing AWS Batch to choose from the latest C, M, and R EC2 instance families.

While you could create this compute environment in the console, we provide the following CLI commands. Replace the subnet IDs and key name with your own private subnets and key, and the image-id with the image you will build in the next section.

ACCOUNTID=<your account id>
SERVICEROLE=<from output in CloudFormation template>
IAMFLEETROLE=<from output in CloudFormation template>
JOBROLEARN=<from output in CloudFormation template>
SUBNETS=<comma delimited list of subnets>
SECGROUPS=<your security groups>
SPOTPER=50 # percentage of on demand
IMAGEID=<ami-id corresponding to the one you created>
INSTANCEROLE=<from output in CloudFormation template>
REGISTRY=${ACCOUNTID}.dkr.ecr.us-east-1.amazonaws.com
KEYNAME=<your key name>
MAXCPU=1024 # max vCPUs in compute environment
ENV=myenv

# Creates the compute environment
aws batch create-compute-environment --compute-environment-name genomicsEnv-$ENV --type MANAGED --state ENABLED --service-role ${SERVICEROLE} --compute-resources type=SPOT,minvCpus=0,maxvCpus=$MAXCPU,desiredvCpus=0,instanceTypes=optimal,imageId=$IMAGEID,subnets=$SUBNETS,securityGroupIds=$SECGROUPS,ec2KeyPair=$KEYNAME,instanceRole=$INSTANCEROLE,bidPercentage=$SPOTPER,spotIamFleetRole=$IAMFLEETROLE

Creating the custom AMI for AWS Batch

While you can use default Amazon ECS-optimized AMIs with AWS Batch, you can also provide your own image in managed compute environments. We will use this feature to provision additional scratch EBS storage on each of the instances that AWS Batch launches and also to encrypt both the Docker and scratch EBS volumes.

AWS Batch has the same requirements for your AMI as Amazon ECS. To build the custom image, modify the default Amazon ECS-Optimized Amazon Linux AMI in the following ways:

  • Attach a 1 TB scratch volume to /dev/sdb
  • Encrypt the Docker and new scratch volumes
  • Mount the scratch volume to /docker_scratch by modifying /etc/fstab

The first two tasks can be addressed when you create the custom AMI in the console. Spin up a small t2.micro instance, and proceed through the standard EC2 instance launch.

After your instance has launched, record the IP address and then SSH into the instance. Copy and paste the following code:

sudo yum -y update
sudo parted /dev/xvdb mklabel gpt
sudo parted /dev/xvdb mkpart primary 0% 100%
sudo mkfs -t ext4 /dev/xvdb1
sudo mkdir /docker_scratch
sudo echo -e '/dev/xvdb1\t/docker_scratch\text4\tdefaults\t0\t0' | sudo tee -a /etc/fstab
sudo mount -a

This auto-mounts your scratch volume to /docker_scratch, which is your scratch directory for batch processing. Next, create your new AMI and record the image ID.
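
A sketch of creating the AMI from the running instance with the CLI (the instance ID and image name are hypothetical):

aws ec2 create-image --instance-id i-0123456789abcdef0 --name batch-genomics-scratch-ami --description "ECS-optimized AMI with encrypted 1 TB scratch volume"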

Creating the job queues

AWS Batch job queues are used to coordinate the submission of batch jobs. Your jobs are submitted to job queues, which can be mapped to one or more compute environments. Job queues have priority relative to each other. You can also specify the order in which they consume resources from your compute environments.

In this solution, use two job queues. The first is for high priority jobs, such as alignment or variant calling. Set this with a high priority (1000) and map it back to the previously created compute environment. Next, set up a second job queue for low priority jobs, such as quality statistics generation. To create these job queues, enter the following CLI commands:

aws batch create-job-queue --job-queue-name highPriority-${ENV} --compute-environment-order order=0,computeEnvironment=genomicsEnv-${ENV}  --priority 1000 --state ENABLED
aws batch create-job-queue --job-queue-name lowPriority-${ENV} --compute-environment-order order=0,computeEnvironment=genomicsEnv-${ENV}  --priority 1 --state ENABLED

Creating the job definitions

To run the Isaac aligner container image locally, supply the Amazon S3 locations for the FASTQ input sequences, the reference genome to align to, and the output BAM file. For more information, see tools/isaac/README.md.

The Docker container itself also requires information about a suitable mountable volume so that it can read and write temporary files without running out of space.

Note: In the following example, the FASTQ files as well as the reference files to run are in a publicly available bucket.

FASTQ1=s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz
FASTQ2=s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz
REF=s3://aws-batch-genomics-resources/reference/isaac/
BAM=s3://mybucket/genomic-workflow/test_results/bam/

mkdir ~/scratch

docker run --rm -ti -v ${HOME}/scratch:/scratch $REPO_URI --bam_s3_folder_path $BAM \
--fastq1_s3_path $FASTQ1 \
--fastq2_s3_path $FASTQ2 \
--reference_s3_path $REF \
--working_dir /scratch

Containers running locally can typically use whatever CPU and memory headroom the host has available. In AWS Batch, the CPU and memory requirements are hard limits and are allocated to the container image at runtime.

Isaac is a fairly resource-intensive algorithm, as it creates an uncompressed index of the reference genome in memory to match the query DNA sequences. The large memory space is shared across multiple CPU threads, and Isaac can scale almost linearly with the number of CPU threads given to it as a parameter.

To fit these characteristics, choose an optimal instance size to maximize the number of CPU threads based on a given large memory footprint, and deploy a Docker container that uses all of the instance resources. In this case, we chose a host instance with 80+ GB of memory and 32+ vCPUs. The following code is example JSON that you can pass to the AWS CLI to create a job definition for Isaac.

aws batch register-job-definition --job-definition-name isaac-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/isaac",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":80000,
"vcpus":32,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

You can copy and paste the following code for the other three job definitions:

aws batch register-job-definition --job-definition-name strelka-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/strelka",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":32000,
"vcpus":32,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

aws batch register-job-definition --job-definition-name snpeff-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/snpeff",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":10000,
"vcpus":4,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

aws batch register-job-definition --job-definition-name samtoolsStats-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/samtools_stats",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":10000,
"vcpus":4,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

The value for "image" comes from the previous post on creating a Docker image and publishing it to Amazon ECR. You can find the value for jobRoleArn in the output of the CloudFormation template that you deployed earlier. In addition to providing the number of CPU cores and memory required by Isaac, you also give it a storage volume for scratch and staging. The volume comes from the previously defined custom AMI.

Testing the environment

After you have created the Isaac job definition, you can submit the job using the AWS Batch submitJob API action. While the base mappings for Docker run are taken care of in the job definition that you just built, the specific job parameters should be specified in the container overrides section of the API call. Here’s what this would look like in the CLI, using the same parameters as in the bash commands shown earlier:

aws batch submit-job --job-name testisaac --job-queue highPriority-${ENV} --job-definition isaac-${ENV}:1 --container-overrides '{
"command": [
            "--bam_s3_folder_path", "s3://mybucket/genomic-workflow/test_batch/bam/",
            "--fastq1_s3_path", "s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz",
            "--fastq2_s3_path", "s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz",
            "--reference_s3_path", "s3://aws-batch-genomics-resources/reference/isaac/",
            "--working_dir", "/scratch",
            "--cmd_args", " --exome "]
}'

When you execute a submitJob call, jobId is returned. You can then track the progress of your job using the describeJobs API action:

aws batch describe-jobs --jobs <jobId returned from submitJob>
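
To pull out just the job state, you can add a query expression (using the same jobId placeholder):

aws batch describe-jobs --jobs <jobId returned from submitJob> --query 'jobs[0].status'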

You can also track the progress of all of your jobs in the AWS Batch console dashboard.

To see exactly where a RUNNING job is at, use the link in the AWS Batch console to direct you to the appropriate location in CloudWatch logs.

Completing the batch environment setup

To finish, create a Lambda function to submit a generic AWS Batch job.

In the Lambda console, create a Python 2.7 Lambda function named batchSubmitJob. Copy and paste the following code. This is similar to the batch-submit-job-python27 Lambda blueprint. Use the LambdaBatchExecutionRole that you created earlier. For more information about creating functions, see Step 2.1: Create a Hello World Lambda Function.

from __future__ import print_function

import json
import boto3

batch_client = boto3.client('batch')

def lambda_handler(event, context):
    # Log the received event
    print("Received event: " + json.dumps(event, indent=2))
    # Get parameters for the SubmitJob call
    # http://docs.aws.amazon.com/batch/latest/APIReference/API_SubmitJob.html
    job_name = event['jobName']
    job_queue = event['jobQueue']
    job_definition = event['jobDefinition']
    
    # containerOverrides, dependsOn, and parameters are optional
    container_overrides = event['containerOverrides'] if event.get('containerOverrides') else {}
    parameters = event['parameters'] if event.get('parameters') else {}
    depends_on = event['dependsOn'] if event.get('dependsOn') else []
    
    try:
        response = batch_client.submit_job(
            dependsOn=depends_on,
            containerOverrides=container_overrides,
            jobDefinition=job_definition,
            jobName=job_name,
            jobQueue=job_queue,
            parameters=parameters
        )
        
        # Log response from AWS Batch
        print("Response: " + json.dumps(response, indent=2))
        
        # Return the jobId
        event['jobId'] = response['jobId']
        return event
    
    except Exception as e:
        print(e)
        message = 'Error submitting Batch Job'
        print(message)
        raise Exception(message)
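
To exercise the function, here is a sketch of a test invocation from the CLI (the job name, queue, and definition follow the earlier examples with ENV=myenv; with AWS CLI v2 you may also need --cli-binary-format raw-in-base64-out):

aws lambda invoke \
--function-name batchSubmitJob \
--payload '{"jobName": "testisaac", "jobQueue": "highPriority-myenv", "jobDefinition": "isaac-myenv:1"}' \
response.json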

Conclusion

In part 3 of this series, you successfully set up your data processing, or batch, environment in AWS Batch. We also provided a Python script in the corresponding GitHub repo that takes care of all of the above CLI arguments for you, as well as building out the job definitions for all of the jobs in the workflow: Isaac, Strelka, SAMtools, and snpEff. You can check the script’s README for additional documentation.

In Part 4, you’ll cover the workflow layer using AWS Step Functions and AWS Lambda.

Please leave any questions and comments below.

ISP Bombarded With 82,000+ Demands to Reveal Alleged Pirates

Post Syndicated from Andy original https://torrentfreak.com/isp-bombarded-with-82000-demands-to-reveal-alleged-pirates-170513/

It was once a region where people could share files without fear of reprisal, but over the years Scandinavia has become a hotbed of ‘pirate’ prosecutions.

Sweden, in particular, has seen many sites shut down and their operators sentenced, notably those behind The Pirate Bay but also more recent cases such as those against DreamFilm and Swefilmer.

Against this backdrop, members of the public have continued to share files, albeit in decreasing numbers. However, at the same time, copyright trolls have hit countries like Sweden, Finland, and Denmark, hoping to scare alleged file-sharers into cash settlements.

This week regional ISP Telia revealed that the activity has already reached epidemic proportions.

Under the EU IPR Enforcement Directive (IPRED), Internet service providers are required to hand over the personal details of suspected pirates to copyright holders, if local courts deem that appropriate. Telia says it is now being bombarded with such demands.

“Telia must adhere to court decisions. At the same time we have a commitment to respect the privacy of our customers and therefore to be transparent,” the company says.

“While in previous years Telia has normally received less than ten such [disclosure] requests per market, per year, lately the number of requests has increased significantly.”

The scale is huge. The company reports that in Sweden during the past year alone, it has been ordered to hand over the identities of subscribers behind more than 45,000 IP addresses.

In Finland during the same period, court orders covered almost 37,000 IP addresses. Four court orders in Denmark currently require the surrendering of data on “hundreds” of customers.

Telia says that a Danish law firm known as Njord Law is behind many of the demands. The law firm is connected to international copyright trolls operating out of the United States, United Kingdom, and elsewhere.

“A Danish law firm (NJORD Law firm), representing the London-based copyright holder Copyright Management Services Ltd, was recently (2017-01-31) granted a court order forcing Telia Sweden to disclose to the law firm the subscriber identities behind 25,000 IP-addresses,” the company notes.

Copyright Management Services Ltd was incorporated in the UK during October 2014. Its sole director is Patrick Achache, who also operates German-based BitTorrent tracking company MaverickEye. Both are part of the notorious international trolling operation Guardaley.

Copyright Management Services, which is based at the same London address as fellow UK copyright-trolling partner Hatton and Berkeley, filed accounts in June 2016 claiming to be a dormant company. Other than that, it has never filed any financial information.

Copyright Management Services will be legally required to publish more detailed accounts next time around, since the company is now clearly trading, but its role in this operation is far from clear. For its part, Telia hopes the court has done the necessary checking when handing information over to partner firm, Njord Law.

“Telia assumes that the courts perform adequate assessments of the evidence provided by the above law firm, and also that the courts conduct a sufficient assessment of proportionality between copyright and privacy,” the company says.

“Telia does not know what the above law firm intends to do with the large amount of customer data which they are now collecting.”

While that statement from Telia is arguably correct, it doesn’t take a genius to work out where this is going. Every time that these companies can match an IP address to an account holder, they will receive a letter in the mail demanding a cash settlement. Anything that substantially deviates from this outcome would be a very surprising development indeed.

In the meantime, Jon Karlung, the outspoken boss of ISP Bahnhof, has pointed out that if Telia didn’t store customer IP addresses in the first place, it wouldn’t have anything to hand out to copyright trolls.

“Bahnhof does not store this data – and we can’t give out something we do not have. The same logic should apply to Telia,” he said.

Bahnhof says it stores customer data including IP addresses for 24 hours, just long enough to troubleshoot technical issues but nowhere near long enough to be useful to trolls.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Four Men Jailed For Running Pirate Movie Sites

Post Syndicated from Andy original https://torrentfreak.com/four-men-jailed-running-pirate-movie-sites-170510/

In the wake of December 2014 action that closed down The Pirate Bay for weeks, Swedish police turned their focus to one of the country’s top streaming portals, Dreamfilm.se.

The site had been growing in popularity for a while, and together with the now-defunct streaming site Swefilmer, whose admins also went on trial recently, the two sites accounted for up to 25% of online viewing in Sweden.

“After an administrator was detained and interrogated, it has been mutually agreed that dreamfilm.se will be shut down for good,” the site said in a January 2015 statement.

While the site later came back to life under a new name, Swedish police kept up the pressure. In February 2015, several more sites bit the dust including the country’s second largest torrent site Tankefetast, torrent site PirateHub, and streaming portal Tankefetast Play (TFPlay).

Image previously released by Tankafetast

It took more than two years, but recently the key people behind the sites had their day in court. According to IDG, all of the men admitted to being involved in Dreamfilm, but none accepted they had committed any crimes.

Yesterday the Linköping District Court handed down its decision and it’s particularly bad news for those involved. Aged between 21 and 31 years old, the men were sentenced to between six and 10 months in jail and ordered to pay damages of around $147,000 to the film industry.

A 23-year-old man who founded Dreamfilm back in 2012 was handed the harshest sentence of 10 months. He was due to receive a sentence of one year in jail but due to his age at the time of some of the offenses, the Court chose to impose a slightly lower term.

A member of the Pirate Party, who reportedly handled advertising and helped to administer the site, was sentenced to eight months in prison. Two other men who worked in technical roles were told to serve between six and 10 months.

Image published by Dreamfilm after the raid

Anti-piracy outfit Rights Alliance, which as usual was deeply involved in the prosecution, says that the sites were significant players in the pirate landscape.

“The network that included Dream Movie, Tankafetast, TF Play and Piratehub was one of Europe’s leading players for illegal file sharing and streaming. The coordination of the network was carried out by two of the convicted,” the group said.

“This case is an example of how organized commercial piracy used Sweden as a base and target for its operations. They are well organized and earn a lot of money and the risks are considered small and punishments low in Sweden,” lawyer Henrik Pontén said.

While lenient sentences are now clearly off the agenda, the convicted men still have a chance to appeal. It is not yet clear whether they will do so. In the meantime the Dreamfilm.se domain will be seized until the District Court decision becomes final.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Intel’s remote AMT vulnerability

Post Syndicated from Matthew Garrett original http://mjg59.dreamwidth.org/48429.html

Intel just announced a vulnerability in their Active Management Technology stack. Here’s what we know so far.

Background

Intel chipsets for some years have included a Management Engine, a small microprocessor that runs independently of the main CPU and operating system. Various pieces of software run on the ME, ranging from code to handle media DRM to an implementation of a TPM. AMT is another piece of software running on the ME, albeit one that takes advantage of a wide range of ME features.

Active Management Technology

AMT is intended to provide IT departments with a means to manage client systems. When AMT is enabled, any packets sent to the machine’s wired network port on port 16992 will be redirected to the ME and passed on to AMT – the OS never sees these packets. AMT provides a web UI that allows you to do things like reboot a machine, provide remote install media or even (if the OS is configured appropriately) get a remote console. Access to AMT requires a password – the implication of this vulnerability is that that password can be bypassed.

Remote management

AMT has two types of remote console: emulated serial and full graphical. The emulated serial console requires only that the operating system run a console on that serial port, while the graphical environment requires drivers on the OS side. However, an attacker who enables emulated serial support may be able to use that to configure grub to enable serial console. Remote graphical console seems to be problematic under Linux but some people claim to have it working, so an attacker would be able to interact with your graphical console as if you were physically present. Yes, this is terrifying.

Remote media

AMT supports providing an ISO remotely. In older versions of AMT (before 11.0) this was in the form of an emulated IDE controller. In 11.0 and later, this takes the form of an emulated USB device. The nice thing about the latter is that any image provided that way will probably be automounted if there’s a logged in user, which probably means it’s possible to use a malformed filesystem to get arbitrary code execution in the kernel. Fun!

The other part of the remote media is that systems will happily boot off it. An attacker can reboot a system into their own OS and examine drive contents at their leisure. This doesn’t let them bypass disk encryption in a straightforward way[1], so you should probably enable that.

How bad is this

That depends. Unless you’ve explicitly enabled AMT at any point, you’re probably fine. The drivers that allow local users to provision the system would require administrative rights to install, so as long as you don’t have them installed then the only local users who can do anything are the ones who are admins anyway. If you do have it enabled, though…

How do I know if I have it enabled?

Yeah this is way more annoying than it should be. First of all, does your system even support AMT? AMT requires a few things:

1) A supported CPU
2) A supported chipset
3) Supported network hardware
4) The ME firmware to contain the AMT firmware

Merely having a “vPRO” CPU and chipset isn’t sufficient – your system vendor also needs to have licensed the AMT code. Under Linux, if lspci doesn’t show a communication controller with “MEI” in the description, AMT isn’t running and you’re safe. If it does show an MEI controller, that still doesn’t mean you’re vulnerable – AMT may still not be provisioned. If you reboot you should see a brief firmware splash mentioning the ME. Hitting ctrl+p at this point should get you into a menu which should let you disable AMT.
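
For example, one quick way to check from a Linux shell (exact output varies by distribution and hardware):

lspci | grep -i 'communication controller'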

What do we not know?

We have zero information about the vulnerability, other than that it allows unauthenticated access to AMT. One big thing that’s not clear at the moment is whether this affects all AMT setups, setups that are in Small Business Mode, or setups that are in Enterprise Mode. If the latter, the impact on individual end-users will be basically zero – Enterprise Mode involves a bunch of effort to configure and nobody’s doing that for their home systems. If it affects all systems, or just systems in Small Business Mode, things are likely to be worse.

What should I do?

Make sure AMT is disabled. If it’s your own computer, you should then have nothing else to worry about. If you’re a Windows admin with untrusted users, you should also disable or uninstall LMS by following these instructions.

Does this mean every Intel system built since 2008 can be taken over by hackers?

No. Most Intel systems don’t ship with AMT. Most Intel systems with AMT don’t have it turned on.

Does this allow persistent compromise of the system?

Not in any novel way. An attacker could disable Secure Boot and install a backdoored bootloader, just as they could with physical access.

But isn’t the ME a giant backdoor with arbitrary access to RAM?

Yes, but there’s no indication that this vulnerability allows execution of arbitrary code on the ME – it looks like it’s just (ha ha) an authentication bypass for AMT.

Is this a big deal anyway?

Yes. Fixing this requires a system firmware update in order to provide new ME firmware (including an updated copy of the AMT code). Many of the affected machines are no longer receiving firmware updates from their manufacturers, and so will probably never get a fix. Anyone who ever enables AMT on one of these devices will be vulnerable. That’s ignoring the fact that firmware updates are rarely flagged as security critical (they don’t generally come via Windows update), so even when updates are made available, users probably won’t know about them or install them.

Avoiding this kind of thing in future

Users ought to have full control over what’s running on their systems, including the ME. If a vendor is no longer providing updates then it should at least be possible for a sufficiently desperate user to pay someone else to do a firmware build with the appropriate fixes. Leaving firmware updates at the whims of hardware manufacturers who will only support systems for a fraction of their useful lifespan is inevitably going to end badly.

How certain are you about any of this?

Not hugely – the quality of public documentation on AMT isn’t wonderful, and while I’ve spent some time playing with it (and related technologies) I’m not an expert. If anything above seems inaccurate, let me know and I’ll fix it.

[1] Eh well. They could reboot into their own OS, modify your initramfs (because that’s not signed even if you’re using UEFI Secure Boot) such that it writes a copy of your disk passphrase to /boot before unlocking it, wait for you to type in your passphrase, reboot again and gain access. Sealing the encryption key to the TPM would avoid this.


Pirate Site Operators Caught By Money Trail, Landmark Trial Hears

Post Syndicated from Andy original https://torrentfreak.com/pirate-site-operators-caught-by-money-trail-landmark-trial-hears-170411/

Founded half a decade ago, Swefilmer grew to become Sweden’s most popular movie and TV show streaming site. At one stage, Swefilmer and fellow streaming site Dreamfilm were said to account for 25% of all web TV viewing in Sweden.

In 2015, local man Ola Johansson took to the Internet to reveal that he’d been raided by the police under suspicion of being involved in running the site. In March 2016, a Turkish national was arrested in Germany on a secret European arrest warrant.

After a couple of false starts, one last June and another this January, the case finally got underway yesterday in Sweden.

The pair stand accused of the unlawful distribution of around 1,400 movies, owned by a dozen studios including Warner, Disney and Fox. Investigators tested 67 of the titles and ten had been made available online before their DVD release.

Anti-piracy group Rights Alliance claims that the site generated a lot of money from advertising without paying for the appropriate licenses. On the table are potential convictions for copyright infringement and money laundering.

Follow the money

In common with so many file-sharing related cases, it’s clear that the men in this case were tracked down from traces left online. Those included IP address evidence and money trails from both advertising revenues and site donations.

According to Sveriges Radio who were in court yesterday, police were able to trace two IP addresses used to operate Swefilmer back to Turkey.

In an effort to trace the bank account used by the site to hold funds, the prosecutor then sought assistance from Turkish authorities. After obtaining the name of the 26-year-old, the prosecutor was then able to link that with advertising revenue generated by the site.

Swefilmer also had a PayPal account used to receive donations and payments for VIP memberships. That account was targeted by an investigator from Rights Alliance who donated money via the same method. That allowed the group to launch an investigation with the payment processor.

The PayPal inquiry appears to have been quite fruitful. The receipt from the donation revealed the account name and from there PayPal apparently gave up the email and bank account details connected to the account. These were linked to the 26-year-old by the prosecutor.

Advertising

The site’s connections with its advertisers also proved useful to investigators. The prosecution claimed that Swefilmer received its first payment in 2013 and its last in 2015. The money generated, some $1.5m (14m kronor), was deposited by a Stockholm-based ad company into a bank account operated by the 26-year-old.

The court heard that while the CEO of the advertising company had been questioned in connection with the case, he is not suspected of crimes.

Connecting the site’s operators

While the exact mechanism is unclear, investigators from Rights Alliance managed to find an IP address used by the 22-year-old. This IP was then traced back to his parents’ home in Kungsbacka, Sweden. The same IP address was used to access the man’s Facebook page.

In court, the prosecution read out chat conversations between both men. They revealed that the men knew each other only through chat and that the younger man believed the older was from Russia.

The prosecution’s case is that the 26-year-old was the ring-leader and that his colleague was a minor player. With that in mind, the latter is required to pay back around $4,000, which is the money he earned from the site.

For the older man, the situation is much more serious. The prosecution is seeking all of the money the site made from advertising, a cool $1.5m.

The case was initially set to go ahead last year but was postponed pending a ruling from the European Court of Justice. Last September, the Court determined that it was illegal to link to copyrighted material if profit was being made.

Claes Kennedy, the lawyer for the 22-year-old, insists that his client did nothing wrong. His actions took place before the ECJ’s ruling so should be determined legal, he says.

The case continues.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Scaling Your Desktop Application Streams with Amazon AppStream 2.0

Post Syndicated from Bryan Liston original https://aws.amazon.com/blogs/compute/scaling-your-desktop-application-streams-with-amazon-appstream-2-0/

Want to stream desktop applications to a web browser, without rewriting them? Amazon AppStream 2.0 is a fully managed, secure, application streaming service. An easy way to learn what the service does is to try out the end-user experience, at no cost.

In this post, I describe how you can scale your AppStream 2.0 environment, and achieve some cost optimizations. I also add some setup and monitoring tips.

AppStream 2.0 workflow

You import your applications into AppStream 2.0 using an image builder. The image builder allows you to connect to a desktop experience from within the AWS Management Console, and then install and test your apps. Then, create an image that is a snapshot of the image builder.

After you have an image containing your applications, select an instance type and launch a fleet of streaming instances. Each instance in the fleet is used by only one user, so choose the fleet’s instance type to match the performance that your applications need. Finally, attach the fleet to a stack to set up user access. The following diagram shows the role of each resource in the workflow.

Figure 1: Describing an AppStream 2.0 workflow


Setting up AppStream 2.0

To get started, set up an example AppStream 2.0 stack or use the Quick Links on the console. For this example, I named my stack ds-sample, selected a sample image, and chose the stream.standard.medium instance type. You can explore the resources that you set up in the AWS console, or use the describe-stacks and describe-fleets commands as follows:

Figure 2: Describing an AppStream 2.0 stack


Figure 3: Describing an AppStream 2.0 fleet

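For example, with the ds-sample stack and a fleet named ds-sample-fleet (the same sample fleet name used later in this post; substitute the names shown in your own console), the calls look roughly like this:

$ aws appstream describe-stacks --names ds-sample
$ aws appstream describe-fleets --names ds-sample-fleet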

To set up user access to your streaming environment, you can use your existing SAML 2.0 compliant directory. Your users can then use their existing credentials to log in. Alternatively, to quickly test a streaming connection, or to start a streaming session from your own website, you can create a streaming URL. In the console, choose Stacks, Actions, Create URL, or call create-streaming-url as follows:

Figure 4: Creating a streaming URL

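A roughly equivalent CLI call looks like the following sketch (the user ID is an assumption, and --validity sets the URL lifetime in seconds):

$ aws appstream create-streaming-url \
    --stack-name ds-sample \
    --fleet-name ds-sample-fleet \
    --user-id demo-user \
    --validity 3600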

You can paste the streaming URL into a browser, and open any of the displayed applications.


Now that you have a sample environment set up, here are a few tips on scaling.

Scaling and cost optimization for AppStream 2.0

To provide an instant-on streaming connection, the instances in an AppStream 2.0 fleet are always running. You are charged for running instances, and each running instance can serve exactly one user at any time. To optimize your costs, match the number of running instances to the number of users who want to stream apps concurrently. This section walks through three options for doing this:

  • Fleet Auto Scaling
  • Fixed fleets based on a schedule
  • Fleet Auto Scaling with schedules

Fleet Auto Scaling

To dynamically update the number of running instances, you can use Fleet Auto Scaling. This feature allows you to scale the size of the fleet automatically between a minimum and maximum value based on demand. This is useful if you have user demand that changes constantly, and you want to scale your fleet automatically to match this demand. For examples about setting up and managing scaling policies, see Fleet Auto Scaling.
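If you would rather manage the scaling configuration from the CLI than the console, a minimal sketch looks like the following. The policy name and the step-scaling configuration file are placeholders, and the fleet must first be registered as a scalable target; the console steps later in this section create all of this for you automatically.

# Register the fleet with Application Auto Scaling (the role ARN is a placeholder)
$ aws application-autoscaling register-scalable-target \
    --service-namespace appstream \
    --resource-id fleet/ds-sample-fleet \
    --scalable-dimension appstream:fleet:DesiredCapacity \
    --min-capacity 1 \
    --max-capacity 10 \
    --role-arn <application_autoscaling_role_arn>

# Attach a step scaling policy; the JSON file defines the scaling adjustments
$ aws application-autoscaling put-scaling-policy \
    --service-namespace appstream \
    --resource-id fleet/ds-sample-fleet \
    --scalable-dimension appstream:fleet:DesiredCapacity \
    --policy-name scale-out-on-utilization \
    --policy-type StepScaling \
    --step-scaling-policy-configuration file://scale-out-config.json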

You can trigger changes to the fleet through the available Amazon CloudWatch metrics:

  • CapacityUtilization – the percentage of running instances already used.
  • AvailableCapacity – the number of instances that are unused and can receive connections from users.
  • InsufficientCapacityError – an error that is triggered when there is no available running instance to match a user’s request.
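For example, you can pull these metrics with the CloudWatch CLI to see how busy the fleet has been. The AWS/AppStream namespace and the Fleet dimension shown here are assumptions based on the metric names above; adjust the time window to suit.

$ aws cloudwatch get-metric-statistics \
    --namespace AWS/AppStream \
    --metric-name CapacityUtilization \
    --dimensions Name=Fleet,Value=ds-sample-fleet \
    --statistics Average \
    --period 300 \
    --start-time 2017-01-16T00:00:00Z \
    --end-time 2017-01-16T12:00:00Z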

You can create and attach scaling policies using the AWS SDK or AWS Management Console. I find it convenient to set up the policies using the console. Use the following steps:

  1. In the AWS Management Console, open AppStream 2.0.
  2. Choose Fleets, select a fleet, and choose Scaling Policies.
  3. For Minimum capacity and Maximum capacity, enter values for the fleet.

Figure 5: Fleets tab for setting scaling policies


  4. Create scale out and scale in policies by choosing Add Policy in each section.

Figure 6: Adding a scale out policy


Figure 7: Adding a scale in policy


After you create the policies, they are displayed as part of your fleet details.


The scaling policies are triggered by CloudWatch alarms. These alarms are automatically created on your behalf when you create the scaling policies using the console. You can view and modify the alarms via the CloudWatch console.

Figure 8: CloudWatch alarms for triggering fleet scaling


Fixed fleets based on a schedule

An alternative option to optimize costs and respond to predictable demand is to fix the number of running instances based on the time of day or day of the week. This is useful if you have a fixed number of users signing in at different times of the day, in scenarios such as training classes, call center shifts, or school computer labs. You can easily set the number of running instances using the AppStream 2.0 update-fleet command. Update the Desired value for the compute capacity of your fleet. The number of Running instances changes to match the Desired value that you set, as follows:

Figure 9: Updating desired capacity for your fleet

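For example, a minimal CLI call that sets the desired capacity to three instances (the fleet name matches the sample fleet used later in this post) is:

$ aws appstream update-fleet \
    --name ds-sample-fleet \
    --compute-capacity DesiredInstances=3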

Set up a Lambda function to update your fleet size automatically. Follow the example below to set up your own functions. If you haven’t used Lambda before, see Step 2: Create a HelloWorld Lambda Function and Explore the Console.

To create a function to change the fleet size

  1. In the Lambda console, choose Create a Lambda function.
  2. Choose the Blank Function blueprint. This gives you an empty blueprint to which you can add your code.
  3. Skip the trigger section for now. Later on, you can add a trigger based on time, or any other input.
  4. In the Configure function section:
    1. Provide a name and description.
    2. For Runtime, choose Node.js 4.3.
    3. Under Lambda function handler and role, choose Create a custom role.
    4. In the IAM wizard, enter a role name, for example Lambda-AppStream-Admin. Leave the defaults as is.
    5. After the IAM role is created, attach an AppStream 2.0 managed policy “AmazonAppStreamFullAccess” to the role. For more information, see Working with Managed Policies. This allows Lambda to call the AppStream 2.0 API on your behalf. You can edit and attach your own IAM policy, to limit access to only actions you would like to permit. To learn more, see Controlling Access to Amazon AppStream 2.0.
    6. Leave the default values for the rest of the fields, and choose Next, Create function.
  5. To change the AppStream 2.0 fleet size, choose Code and add some sample code, as follows:
    'use strict';
    
    /**
    This AppStream2 Update-Fleet blueprint sets up a schedule for a streaming fleet
    **/
    
    const AWS = require('aws-sdk');
    const appstream = new AWS.AppStream();
    const fleetParams = {
      Name: 'ds-sample-fleet', /* required */
      ComputeCapacity: {
        DesiredInstances: 1 /* required */
    
      }
    };
    
    exports.handler = (event, context, callback) => {
        console.log('Received event:', JSON.stringify(event, null, 2));
    
        var resource = event.resources[0];
        var increase = resource.includes('weekday-9am-increase-capacity')
    
        try {
            if (increase) {
                fleetParams.ComputeCapacity.DesiredInstances = 3
            } else {
                fleetParams.ComputeCapacity.DesiredInstances = 1
            }
            appstream.updateFleet(fleetParams, (error, data) => {
                if (error) {
                    console.log(error, error.stack);
                    return callback(error);
                }
                console.log(data);
                return callback(null, data);
            });
        } catch (error) {
            console.log('Caught Error: ', error);
            callback(error);
        }
    };

  6. Test the code. Choose Test and use the “Hello World” test template. The first time you do this, choose Save and Test. Create a test input that mimics the scheduled CloudWatch Events rule to trigger the scaling update; a CLI sketch of such an invocation follows these steps.


  7. You see output text showing the result of the update-fleet call. You can also use the CLI to check the effect of executing the Lambda function.
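To drive the same test from the CLI, invoke the function with a payload shaped like the scheduled CloudWatch Events rule. The function name, region, and account ID below are placeholders:

$ aws lambda invoke \
    --function-name <your_update_fleet_function> \
    --payload '{"resources": ["arn:aws:events:us-west-2:123456789012:rule/weekday-9am-increase-capacity"]}' \
    out.json
# With AWS CLI v2, also pass --cli-binary-format raw-in-base64-out
$ cat out.json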

Next, to set up a time-based schedule, set a trigger for invoking the Lambda function.

To set a trigger for the Lambda function

  1. Choose Triggers, Add trigger.
  2. Choose CloudWatch Events – Schedule.
  3. Enter a rule name, such as “weekday-9am-increase-capacity”, and a description. For Schedule expression, choose cron. You can edit the value for the cron later.
  4. After the trigger is created, open the event weekday-9am-increase-capacity.
  5. In the CloudWatch console, edit the event details. CloudWatch Events schedules are evaluated in UTC, so to scale out the fleet at 9 AM Pacific Standard Time on a weekday, set the cron expression to: 00 17 ? * MON-FRI *. Adjust the hour if you are in a different time zone.
  6. You can also add another event that triggers at the end of a weekday.


This setup now triggers scale-out and scale-in automatically, based on the time schedule that you set.
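If you prefer to create the schedule from the CLI instead of the console, a minimal sketch looks like this. The rule names are assumptions, and you still need to add the Lambda function as a target for each rule (aws events put-targets) and grant CloudWatch Events permission to invoke it (aws lambda add-permission).

# Scale out at 9 AM Pacific Standard Time (17:00 UTC) on weekdays
$ aws events put-rule \
    --name weekday-9am-increase-capacity \
    --schedule-expression "cron(00 17 ? * MON-FRI *)"

# Scale back in at 6 PM Pacific Standard Time (02:00 UTC, so the days shift to TUE-SAT)
$ aws events put-rule \
    --name weekday-6pm-decrease-capacity \
    --schedule-expression "cron(00 02 ? * TUE-SAT *)"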

Fleet Auto Scaling with schedules

You can choose to combine both the fleet scaling and time-based schedule approaches to manage more complex scenarios. This is useful to manage the number of running instances based on business and non-business hours, and still respond to changes in demand. You could programmatically change the minimum and maximum sizes for your fleet based on time of day or day of week, and apply the default scale-out or scale-in policies. This allows you to respond to predictable minimum demand based on a schedule.

For example, at the start of a work day, you might expect a certain number of users to request streaming connections at one time. You wouldn’t want to wait for the fleet to scale out and meet this requirement. However, during the course of the day, you might expect the demand to scale in or out, and would want to match the fleet size to this demand.

To achieve this, set up the scaling polices via the console, and create a Lambda function to trigger changes to the minimum, maximum, and desired capacity for your fleet based on a schedule. Replace the code for the Lambda function that you created earlier with the following code:

'use strict';

/**
This AppStream2 Update-Fleet function sets up a schedule for a streaming fleet
**/

const AWS = require('aws-sdk');
const appstream = new AWS.AppStream();
const applicationAutoScaling = new AWS.ApplicationAutoScaling();

const fleetParams = {
  Name: 'ds-sample-fleet', /* required */
  ComputeCapacity: {
    DesiredInstances: 1 /* required */
  }
};

var scalingParams = {
  ResourceId: 'fleet/ds-sample-fleet', /* required - fleet name*/
  ScalableDimension: 'appstream:fleet:DesiredCapacity', /* required */
  ServiceNamespace: 'appstream', /* required */
  MaxCapacity: 10,
  MinCapacity: 1,
  RoleARN: 'arn:aws:iam::659382443255:role/service-role/ApplicationAutoScalingForAmazonAppStreamAccess'
};

exports.handler = (event, context, callback) => {
    
    console.log('Received this event now:', JSON.stringify(event, null, 2));
    
    var resource = event.resources[0];
    var increase = resource.includes('weekday-9am-increase-capacity')

    try {
        if (increase) {
            //usage during business hours - start at capacity of 10 and scale
            //if required. This implies at least 10 users can connect instantly. 
            //More users can connect as the scaling policy triggers addition of
            //more instances. Maximum cap is 20 instances - fleet will not scale
            //beyond 20. This is the cap for number of users.
            fleetParams.ComputeCapacity.DesiredInstances = 10
            scalingParams.MinCapacity = 10
            scalingParams.MaxCapacity = 20
        } else {
            //usage during non-business hours - start at capacity of 1 and scale
            //if required. This implies only 1 user can connect instantly. 
            //More users can connect as the scaling policy triggers addition of
            //more instances. 
            fleetParams.ComputeCapacity.DesiredInstances = 1
            scalingParams.MinCapacity = 1
            scalingParams.MaxCapacity = 10
        }
        
        //Update minimum and maximum capacity used by the scaling policies
        applicationAutoScaling.registerScalableTarget(scalingParams, (error, data) => {
             if (error) console.log(error, error.stack); 
             else console.log(data);                     
            });
            
        //Update the desired capacity for the fleet. This sets 
        //the number of running instances to desired number of instances
        appstream.updateFleet(fleetParams, (error, data) => {
            if (error) {
                console.log(error, error.stack);
                return callback(error);
            }

            console.log(data);
            return callback(null, data);
        });
            
    } catch (error) {
        console.log('Caught Error: ', error);
        callback(error);
    }
};

Note: To successfully execute this code, you need to add IAM policies to the role used by the Lambda function. The policies allow Lambda to call the Application Auto Scaling service on your behalf.

Figure 10: Inline policies for using Application Auto Scaling with Lambda

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:PassRole"
      ],
      "Resource": "*"
    }
  ]
}

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "application-autoscaling:*"
      ],
      "Resource": "*"
    }
  ]
}
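One way to attach these as inline policies to the role created earlier (the policy names and file names below are placeholders) is with the IAM CLI:

$ aws iam put-role-policy \
    --role-name Lambda-AppStream-Admin \
    --policy-name allow-pass-role \
    --policy-document file://pass-role.json

$ aws iam put-role-policy \
    --role-name Lambda-AppStream-Admin \
    --policy-name allow-application-autoscaling \
    --policy-document file://application-autoscaling.json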

Monitoring usage

After you have set up scaling for your fleet, you can use CloudWatch metrics with AppStream 2.0, and create a dashboard for monitoring. This helps optimize your scaling policies over time based on the amount of usage that you see.

For example, if you were very conservative with your initial set up and over-provisioned resources, you might see long periods of low fleet utilization. On the other hand, if you set the fleet size too low, you would see high utilization or errors from insufficient capacity, which would block users’ connections. You can view CloudWatch metrics for up to 15 months, and drive adjustments to your fleet scaling policy.

Figure 11: Dashboard with custom Amazon CloudWatch metrics


Summary

These are just a few ideas for scaling AppStream 2.0 and optimizing your costs. Let us know if these are useful, and if you would like to see similar posts. If you have comments about the service, please post your feedback on the AWS forum for AppStream 2.0.

Landmark Movie Streaming Trial Gets Underway in Sweden

Post Syndicated from Andy original https://torrentfreak.com/landmark-movie-streaming-trial-gets-underway-in-sweden-170116/

Founded half a decade ago, Swefilmer grew to become Sweden’s most popular movie and TV show streaming site. Together with Dreamfilm, another site operating in the same niche, Swefilmer is said to have accounted for 25% of all web TV viewing in Sweden.

In the summer of 2015, local man Ola Johansson revealed that he’d been raided by the police under suspicion of being involved in running the site. In March 2015, a Turkish national was arrested in Germany on a secret European arrest warrant. The now 26-year-old was accused of receiving donations from users and setting up Swefilmer’s deals with advertisers.

In a subsequent indictment filed at the Varberg District Court, the men were accused of copyright infringement offenses relating to the unlawful distribution of more than 1,400 movies. However, just hours after the trial got underway last June, it was suspended, when a lawyer for one of the men asked to wait for an important EU copyright case to run its course.

That case, between Dutch blog GeenStijl.nl and Playboy, had seen a Dutch court ask the EU Court of Justice to rule whether unauthorized links to copyrighted content could be seen as a ‘communication to the public’ under Article 3(1) of the Copyright Directive, and whether those links facilitated copyright infringement.

Last September, the European Court of Justice ruled that it is usually acceptable to link to copyrighted content without permission when people are not aware content is infringing and when they do so on a non-profit basis. In commercial cases, the rules are much more strict.

The Swefilmer site

In light of that ruling, the pair return to the Varberg District Court today, accused of making more than $1.5m from their activities between November 2013 and June 2015.

While Swedish prosecutions against sites such as The Pirate Bay have made global headlines, the case against Swefilmer is the first of its kind against a stream-links portal. Prosecutor Anna Ginner and Rights Alliance lawyer Henrik Pontén believe they have the evidence needed to take down the pair.

“Swefilmer is a typical example of how a piracy operation looks today: fully commercial, well organized and great efforts expended to conceal itself. This applies particularly to the principal of the site,” Pontén told IDG.

According to Ginner, the pair ran an extensive operation and generated revenues from a number of advertising companies. They did not act alone but the duo were the ones that were identified by, among other things, their IP addresses.

The 26-year-old, who was arrested in Germany, was allegedly the money man who dealt with the advertisers. In addition to copyright infringement offenses, he stands accused of money laundering.

According to IDG, he will plead not guilty. His lawyer gave no hints why but suggested the reasons will become evident during the trial.

The younger man, who previously self-identified as Ola Johansson, is accused of being the day-to-day operator of the site, which included uploading movies to other sites that Swefilmer linked to. He is said to have received a modest sum for his work, around $3,800.

“I think what’s interesting for the Swedish court is that this case has such clear elements of organized crime compared to what we have seen before,” Anna Ginner concludes.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Hollywood Lawsuit Expands Pirate Bay & ExtraTorrent Web Blockade

Post Syndicated from Andy original https://torrentfreak.com/hollywood-lawsuit-expands-pirate-bay-extratorrent-web-blockade-170113/

In an expansion of their site-blocking campaign, several Hollywood studios under the banner of the MPA applied to a court in Norway during 2015 to have seven ‘pirate’ sites blocked at the ISP level.

Warner Bros, Paramount, Twentieth Century Fox, Universal, Sony, Disney, Columbia and several local industry groups argued that The Pirate Bay, ExtraTorrent, Viooz, PrimeWire, Swefilmer, DreamFilm and Movie4K, infringe their copyrights.

Local ISPs including Telenor, TeliaSonera, NextGenTel and Altibox were named as defendants in the case, which was handled by the Oslo District Court. In September 2015 the Court handed down its ruling, ordering the ISPs to block the seven sites listed in the lawsuit.

While many observers believed the studios would be back immediately with more blocking requests, things went quiet for more than a year. Only recently did they return with fresh requests against more ISPs.

The Motion Picture Association filed a new lawsuit against seven Internet service providers, named by Tek.no as Eidsiva Bredbånd, Lynet Internett, Breiband.no, Enivest, Neas Bredbånd, Tussa IKT and Opennet Norge.

Like before, the MPA and the studios wanted the ISPs to restrict access to a number of ‘pirate’ sites including The Pirate Bay, Extratorrent, Viooz, SweFilmer, DreamFilm and Movie4K, which were all named in the original complaint. Additional sites named in the new complaint include Watch Series, Putlocker, TUBE+, Couch Tuner, Watch32, Project Free TV and Watch Free.

It now appears that the MPA found a sympathetic ear at the Oslo District Court, which has just issued a ruling in the studios’ favor. Local media says the ruling is considerably shorter than the one handed down in 2015, which seems to indicate that the process has now been streamlined.

According to Tek, the Court made its decision based on the fact that the sites in question published links to copyrighted material already available online. The Court determined that the sites published the content to a “new public”, noting that they had not changed their modus operandi since the original ruling.

The ISPs were ordered to implement DNS blockades covering the domains currently in use. As illustrated on a number of previous occasions, including in Norway itself, DNS blocks are the weakest form of blocking and are easily circumvented by switching to a different provider, such as Google.

Torgeir Waterhouse of Internet interest group ICT Norway says that while the DNS blocks might only amount to a “speed bump”, it’s more important to make it easier to access legal services.

“We must make it easy to pay, easy to access and easy to develop new services. Ultimately it is this that determines the levels of revenue for producers. Revenue does not occur ‘automagically’ by preventing their access to an illegal service,” he says.

Waterhouse, who has campaigned on several intellectual property issues, also criticized the authorities for holding ISPs responsible for the actions of others.

“The government should make a law that says what they really want, namely that parts of the Internet should not be accessible to the population. They should have the courage to stand up and say that this is what they want, not create the impression that Internet service providers are doing something wrong,” he adds.

In common with the 2015 ruling, the sites detailed in the lawsuit were all ordered to pay court costs. None of the sites’ owners appeared in Court so it’s unlikely that any will pay the $1,800 they each now owe.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

The “cryptsetup initrd root shell” vulnerability

Post Syndicated from corbet original http://lwn.net/Articles/706444/rss

Hector Marco and Ismael Ripoll report
a discouraging vulnerability in many encrypted disk setups: simply running
up too many password failures will eventually result in a root shell.
This vulnerability allows an attacker to obtain a root initramfs shell on
affected systems. The vulnerability is very reliable because it doesn’t
depend on specific systems or configurations. Attackers can copy, modify or
destroy the hard disc as well as set up the network to exfiltrate
data. This vulnerability is especially serious in environments like
libraries, ATMs, airport machines, labs, etc., where the whole boot process
is protected (password in BIOS and GRUB) and we only have a keyboard and/or a
mouse.

The NSA Is Hoarding Vulnerabilities

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2016/08/the_nsa_is_hoar.html

The National Security Agency is lying to us. We know that because data stolen from an NSA server was dumped on the Internet. The agency is hoarding information about security vulnerabilities in the products you use, because it wants to use it to hack others’ computers. Those vulnerabilities aren’t being reported, and aren’t getting fixed, making your computers and networks unsafe.

On August 13, a group calling itself the Shadow Brokers released 300 megabytes of NSA cyberweapon code on the Internet. Near as we experts can tell, the NSA network itself wasn’t hacked; what probably happened was that a “staging server” for NSA cyberweapons — that is, a server the NSA was making use of to mask its surveillance activities — was hacked in 2013.

The NSA inadvertently resecured itself in what was coincidentally the early weeks of the Snowden document release. The people behind the link used casual hacker lingo, and made a weird, implausible proposal involving holding a bitcoin auction for the rest of the data: “!!! Attention government sponsors of cyber warfare and those who profit from it !!!! How much you pay for enemies cyber weapons?”

Still, most people believe the hack was the work of the Russian government and the data release some sort of political message. Perhaps it was a warning that if the US government exposes the Russians as being behind the hack of the Democratic National Committee — or other high-profile data breaches — the Russians will expose NSA exploits in turn.

But what I want to talk about is the data. The sophisticated cyberweapons in the data dump include vulnerabilities and “exploit code” that can be deployed against common Internet security systems. Products targeted include those made by Cisco, Fortinet, TOPSEC, Watchguard, and Juniper — systems that are used by both private and government organizations around the world. Some of these vulnerabilities have been independently discovered and fixed since 2013, and some had remained unknown until now.

All of them are examples of the NSA — despite what it and other representatives of the US government say — prioritizing its ability to conduct surveillance over our security. Here’s one example. Security researcher Mustafa al-Bassam found an attack tool codenamed BENIGNCERTAIN that tricks certain Cisco firewalls into exposing some of their memory, including their authentication passwords. Those passwords can then be used to decrypt virtual private network, or VPN, traffic, completely bypassing the firewalls’ security. Cisco hasn’t sold these firewalls since 2009, but they’re still in use today.

Vulnerabilities like that one could have, and should have, been fixed years ago. And they would have been, if the NSA had made good on its word to alert American companies and organizations when it had identified security holes.

Over the past few years, different parts of the US government have repeatedly assured us that the NSA does not hoard “zero days,” the term used by security experts for vulnerabilities unknown to software vendors. After we learned from the Snowden documents that the NSA purchases zero-day vulnerabilities from cyberweapons arms manufacturers, the Obama administration announced, in early 2014, that the NSA must disclose flaws in common software so they can be patched (unless there is “a clear national security or law enforcement” use).

Later that year, National Security Council cybersecurity coordinator and special adviser to the president on cybersecurity issues Michael Daniel insisted that the US doesn’t stockpile zero-days (except for the same narrow exemption). An official statement from the White House in 2014 said the same thing.

The Shadow Brokers data shows this is not true. The NSA hoards vulnerabilities.

Hoarding zero-day vulnerabilities is a bad idea. It means that we’re all less secure. When Edward Snowden exposed many of the NSA’s surveillance programs, there was considerable discussion about what the agency does with vulnerabilities in common software products that it finds. Inside the US government, the system of figuring out what to do with individual vulnerabilities is called the Vulnerabilities Equities Process (VEP). It’s an inter-agency process, and it’s complicated.

There is a fundamental tension between attack and defense. The NSA can keep the vulnerability secret and use it to attack other networks. In such a case, we are all at risk of someone else finding and using the same vulnerability. Alternatively, the NSA can disclose the vulnerability to the product vendor and see it gets fixed. In this case, we are all secure against whoever might be using the vulnerability, but the NSA can’t use it to attack other systems.

There are probably some overly pedantic word games going on. Last year, the NSA said that it discloses 91 percent of the vulnerabilities it finds. Leaving aside the question of whether that remaining 9 percent represents 1, 10, or 1,000 vulnerabilities, there’s the bigger question of what qualifies in the NSA’s eyes as a “vulnerability.”

Not all vulnerabilities can be turned into exploit code. The NSA loses no attack capabilities by disclosing the vulnerabilities it can’t use, and doing so gets its numbers up; it’s good PR. The vulnerabilities we care about are the ones in the Shadow Brokers data dump. We care about them because those are the ones whose existence leaves us all vulnerable.

Because everyone uses the same software, hardware, and networking protocols, there is no way to simultaneously secure our systems while attacking their systems, whoever “they” are. Either everyone is more secure, or everyone is more vulnerable.

Pretty much uniformly, security experts believe we ought to disclose and fix vulnerabilities. And the NSA continues to say things that appear to reflect that view, too. Recently, the NSA told everyone that it doesn’t rely on zero days — very much, anyway.

Earlier this year at a security conference, Rob Joyce, the head of the NSA’s Tailored Access Operations (TAO) organization — basically the country’s chief hacker — gave a rare public talk, in which he said that credential stealing is a more fruitful method of attack than are zero days: “A lot of people think that nation states are running their operations on zero days, but it’s not that common. For big corporate networks, persistence and focus will get you in without a zero day; there are so many more vectors that are easier, less risky, and more productive.”

The distinction he’s referring to is the one between exploiting a technical hole in software and waiting for a human being to, say, get sloppy with a password.

A phrase you often hear in any discussion of the Vulnerabilities Equities Process is NOBUS, which stands for “nobody but us.” Basically, when the NSA finds a vulnerability, it tries to figure out if it is unique in its ability to find it, or whether someone else could find it, too. If it believes no one else will find the problem, it may decline to make it public. It’s an evaluation prone to both hubris and optimism, and many security experts have cast doubt on the very notion that there is some unique American ability to conduct vulnerability research.

The vulnerabilities in the Shadow Brokers data dump are definitely not NOBUS-level. They are run-of-the-mill vulnerabilities that anyone — another government, cybercriminals, amateur hackers — could discover, as evidenced by the fact that many of them were discovered between 2013, when the data was stolen, and this summer, when it was published. They are vulnerabilities in common systems used by people and companies all over the world.

So what are all these vulnerabilities doing in a secret stash of NSA code that was stolen in 2013? Assuming the Russians were the ones who did the stealing, how many US companies did they hack with these vulnerabilities? This is what the Vulnerabilities Equities Process is designed to prevent, and it has clearly failed.

If there are any vulnerabilities that — according to the standards established by the White House and the NSA — should have been disclosed and fixed, it’s these. That they have not been during the three-plus years that the NSA knew about and exploited them — despite Joyce’s insistence that they’re not very important — demonstrates that the Vulnerabilities Equities Process is badly broken.

We need to fix this. This is exactly the sort of thing a congressional investigation is for. This whole process needs a lot more transparency, oversight, and accountability. It needs guiding principles that prioritize security over surveillance. A good place to start is the recommendations by Ari Schwartz and Rob Knake in their report: these include a clearly defined and more public process, more oversight by Congress and other independent bodies, and a strong bias toward fixing vulnerabilities instead of exploiting them.

And as long as I’m dreaming, we really need to separate our nation’s intelligence-gathering mission from our computer security mission: we should break up the NSA. The agency’s mission should be limited to nation state espionage. Individual investigation should be part of the FBI, cyberwar capabilities should be within US Cyber Command, and critical infrastructure defense should be part of DHS’s mission.

I doubt we’re going to see any congressional investigations this year, but we’re going to have to figure this out eventually. In my 2014 book Data and Goliath, I write that “no matter what cybercriminals do, no matter what other countries do, we in the US need to err on the side of security by fixing almost all the vulnerabilities we find…” Our nation’s cybersecurity is just too important to let the NSA sacrifice it in order to gain a fleeting advantage over a foreign adversary.

This essay previously appeared on Vox.com.

EDITED TO ADD (8/27): The vulnerabilities were seen in the wild within 24 hours, demonstrating how important they were to disclose and patch.

James Bamford thinks this is the work of an insider. I disagree, but he’s right that the TAO catalog was not a Snowden document.

People are looking at the quality of the code. It’s not that good.

How to Remove Single Points of Failure by Using a High-Availability Partition Group in Your AWS CloudHSM Environment

Post Syndicated from Tracy Pierce original https://blogs.aws.amazon.com/security/post/Tx7VU4QS5RCK7Q/How-to-Remove-Single-Points-of-Failure-by-Using-a-High-Availability-Partition-Gr

A hardware security module (HSM) is a hardware device designed with the security of your data and cryptographic key material in mind. It is tamper-resistant hardware that prevents unauthorized users from attempting to pry open the device, plug any extra devices in to access data or keys such as subtokens, or damage the outside housing. If any such interference occurs, the device wipes all information stored so that unauthorized parties do not gain access to your data or cryptographic key material. A high-availability (HA) setup could be beneficial because, with multiple HSMs kept in different data centers and all data synced between them, the loss of one HSM does not mean the loss of your data.

In this post, I will walk you through steps to remove single points of failure in your AWS CloudHSM environment by setting up an HA partition group. Single points of failure occur when a single CloudHSM device fails in a non-HA configuration, which can result in the permanent loss of keys and data. The HA partition group, however, allows for one or more CloudHSM devices to fail, while still keeping your environment operational.

Prerequisites

You will need a few things to build your HA partition group with CloudHSM:

  • 2 CloudHSM devices. AWS offers a free two-week trial. AWS will provision the trial for you and send you the CloudHSM information such as the Elastic Network Interface (ENI) and the private IP address assigned to the CloudHSM device so that you may begin testing. If you have used CloudHSM before, another trial cannot be provisioned, but you can set up production CloudHSM devices on your own. See Provisioning Your HSMs.
  • A client instance from which to access your CloudHSM devices. You can create this manually, or via an AWS CloudFormation template. You can connect to this instance in your public subnet, and then it can communicate with the CloudHSM devices in your private subnets.
  • An HA partition group, which ensures the syncing and load balancing of all CloudHSM devices you have created.

The CloudHSM setup process takes about 30 minutes from beginning to end. By the end of this how-to post, you should be able to set up multiple CloudHSM devices and an HA partition group in AWS with ease. Keep in mind that each production CloudHSM device you provision comes with an up-front fee of $5,000. You are not charged for any CloudHSM devices provisioned for a trial, unless you decide to move them to production when the trial ends.

If you decide to move your provisioned devices to your production environment, you will be billed $5,000 per device. If you decide to stop the trial so as not to be charged, you have up to 24 hours after the trial ends to let AWS Support know of your decision.

Solution overview

How HA works

HA is a feature of the Luna SA 7000 HSM hardware device AWS uses for its CloudHSM service. (Luna SA 7000 HSM is also known as the “SafeNet Network HSM” in more recent SafeNet documentation. Because AWS documentation refers to this hardware as “Luna SA 7000 HSM,” I will use this same product name in this post.) This feature allows more than one CloudHSM device to be placed as members in a load-balanced group setup. By having more than one device on which all cryptographic material is stored, you remove any single points of failure in your environment.

You access your CloudHSM devices in this HA partition group through one logical endpoint, which distributes traffic to the CloudHSM devices that are members of this group in a load-balanced fashion. Even though traffic is balanced between the HA partition group members, any new data or changes in data that occur on any CloudHSM device will be mirrored for continuity to the other members of the HA partition group. A single HA partition group is logically represented by a slot, which is physically composed of multiple partitions distributed across all HA nodes. Traffic is sent through the HA partition group, and then distributed to the partitions that are linked. All partitions are then synced so that data is persistent on each one identically.

The following diagram illustrates the HA partition group functionality.

  1. Application servers send traffic to your HA partition group endpoint.
  2. The HA partition group takes all requests and distributes them evenly between the CloudHSM devices that are members of the HA partition group.
  3. Each CloudHSM device mirrors itself to each other member of the HA partition group to ensure data integrity and availability.

Automatic recovery

If you ever lose data, you want a hands-off, quick recovery. Before autoRecovery was introduced, you could take advantage of the redundancy and performance HA partition groups offer, but you were still required to manually intervene when a group member was lost.

HA partition group members may fail for a number of reasons, including:

  • Loss of power to a CloudHSM device.
  • Loss of network connectivity to a CloudHSM device. If network connectivity is lost, it will be seen as a failed device and recovery attempts will be made.

Recovery of partition group members will only work if the following are true:

  • HA autoRecovery is enabled.
  • There are at least two nodes (CloudHSM devices) in the HA partition group.
  • Connectivity is established at startup.
  • The recover retry limit is not reached (if reached or exceeded, the only option is manual recovery).

HA autoRecovery is not enabled by default and must be explicitly enabled by running the following command, which is found in Enabling Automatic Recovery.

>vtl haAdmin -autoRecovery -retry <count>

When enabling autoRecovery, set the -retry and -interval parameters. The -retry parameter can be a value between 0 and 500 (or -1 for infinite retries) and is the number of times the CloudHSM device will attempt automatic recovery. The -interval parameter is in seconds and can be any value between 60 and 1200; it is the amount of time the CloudHSM device waits between automatic recovery attempts.
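For example, to retry recovery three times with ten minutes between attempts (the values are illustrative; confirm the exact flag syntax against the Luna client documentation for your version):

>vtl haAdmin -autoRecovery -retry 3 -interval 600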

Setting up two Production CloudHSM devices in AWS

Now that I have discussed how HA partition groups work and why they are useful, I will show how to set up your CloudHSM environment and the HA partition group itself. To create an HA partition group environment, you need a minimum of two CloudHSM devices. You can have as many as 16 CloudHSM devices associated with an HA partition group at any given time. These must be associated with the same account and region, but can be spread across multiple Availability Zones, which is the ideal setup for an HA partition group. Automatic recovery is great for larger HA partition groups because it allows the devices to quickly attempt recovery and resync data in the event of a failure, without requiring manual intervention.

Set up the CloudHSM environment

To set up the CloudHSM environment, you must have a few things already in place:

  • An Amazon VPC.
  • At least one public subnet and two private subnets.
  • An Amazon EC2 client instance (m1.small running Amazon Linux x86 64-bit) in the public subnet, with the SafeNet client software already installed. This instance uses the key pair that you specified during creation of the CloudFormation stack. You can find a ready-to-use Amazon Machine Image (AMI) in our Community AMIs. Simply log into the EC2 console, choose Launch Instance, click Community AMIs, and search for CloudHSM. Because we regularly release new AMIs with software updates, searching for CloudHSM will show all available AMIs for a region. Select the AMI with the most recent client version.
  • Two security groups, one for the client instance and one for the CloudHSM devices. The security group for the client instance, which resides in the public subnet, will allow SSH on port 22 from your local network. The security group for the CloudHSM devices, which resides in the private subnet, will allow SSH on port 22 and NTLS on port 1792 from your public subnet. These will both be ingress rules (egress rules allow all traffic).
  • An Elastic IP address for the client instance.
  • An IAM role that delegates AWS resource access to CloudHSM. You can create this role in the IAM console:

    1. Click Roles and then click Create New Role.
    2. Type a name for the role and then click Next Step.
    3. Under AWS Service Roles, click Select next to AWS CloudHSM.
    4. In the Attach Policy step, select AWSCloudHSMRole as the policy. Click Next Step.
    5. Click Create Role.

We have a CloudFormation template available that will set up the CloudHSM environment for you:

  1. Go to the CloudFormation console.
  2. Choose Create Stack. Specify https://cloudhsm.s3.amazonaws.com/cloudhsm-quickstart.json as the Amazon S3 template URL.
  3. On the next two pages, specify parameters such as the Stack name, SSH Key Pair, Tags, and SNS Topic for alerts. You will find SNS Topic under the Advanced arrow. Then, click Create.

When the new stack is in the CREATION_COMPLETE state, you will have the IAM role to be used for provisioning your CloudHSM devices, the private and public subnets, your client instance with Elastic IP (EIP), and the security groups for both the CloudHSM devices and the client instance. The CloudHSM security group will already have its necessary rules in place to permit SSH and NTLS access from your public subnet; however, you still must add the rules to the client instance’s security group to permit SSH access from your allowed IPs. To do this:

  1. In the VPC console, make sure you select the same region as the region in which your HSM VPC resides.
  2. Select the security group in your HSM VPC that will be used for the client instance.
  3. Add an inbound rule that allows TCP traffic on port 22 (SSH) from your local network IP addresses.
  4. On the Inbound tab, from the Create a new rule list, select SSH, and enter the IP address range of the local network from which you will connect to your client instance.
  5. Click Add Rule, and then click Apply Rule Changes.

After adding the IP rules for SSH (port 22) to your client instance’s security group, test the connection by making an SSH connection from your local network to your client instance’s EIP. Make sure to write down all the subnet and role information, because you will need it later.
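If you prefer to script this step, a roughly equivalent CLI call (the security group ID and CIDR range below are placeholders for your own values) is:

$ aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 22 \
    --cidr 203.0.113.0/24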

Create an SSH key pair

The SSH key pair that you will now create will be used by CloudHSM devices to authenticate the manager account when connecting from your client instance. The manager account is simply the user that is permitted to SSH to your CloudHSM devices. Before provisioning the CloudHSM devices, you create the SSH key pair so that you can provide the public key to the CloudHSM during setup. The private key remains on your client instance to complete the authentication process. You can generate the key pair on any computer, as long as you ensure the client instance has the private key copied to it. You can also create the key pair on Linux or Windows. I go over both processes in this section of this post.

In Linux, you will use the ssh-keygen command. By typing just this command into the terminal window, you will receive output similar to the following.

$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/user/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/user/.ssh/id_rsa.
Your public key has been saved in /home/user/.ssh/id_rsa.pub.
The key fingerprint is: df:c4:49:e9:fe:8e:7b:eb:28:d5:1f:72:82:fb:f2:69
The key's randomart image is:
+--[ RSA 2048]----+
|                 |
|             .   |
|            o    |
|           + .   |
|        S   *.   |
|         . =.o.o |
|          ..+ +..|
|          .o Eo .|
|           .OO=. |
+-----------------+
$

In Windows, use PuTTYgen to create your key pair:

  1. Start PuTTYgen. For Type of key to generate, select SSH-2 RSA.
  2. In the Number of bits in a generated key field, specify 2048.
  3. Click Generate.
  4. Move your mouse pointer around in the blank area of the Key section below the progress bar (to generate some randomness) until the progress bar is full.
  5. A private/public key pair has now been generated.
  6. In the Key comment field, type a name for the key pair that you will remember.
  7. Click Save public key and name your file.
  8. Click Save private key and name your file. It is imperative that you do not lose this key, so make sure to store it somewhere safe.
  9. Right-click the text field labeled Public key for pasting into OpenSSH authorized_keys file and choose Select All.
  10. Right-click again in the same text field and choose Copy.

The following screenshot shows what the PuTTYgen output will look like after you have created the key pair.

You must convert the keys created by PuTTYgen to OpenSSH format for use with other clients by using the following command.

ssh-keygen -i -f puttygen_key > openssh_key

The public key will be used to provision the CloudHSM device and the private key will be stored on the client instance to authenticate the SSH sessions. The public SSH key will look something like the following. If it does not, it is not in the correct format and must be converted using the preceding procedure.

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA6bUsFjDSFcPC/BZbIAv8cAR5syJMB GiEqzFOIEHbm0fPkkQ0U6KppzuXvVlc2u7w0mg
PMhnkEfV6j0YBITu0Rs8rNHZFJs CYXpdoPxMMgmCf/FaOiKrb7+1xk21q2VwZyj13GPUsCxQhRW7dNidaaYTf14sbd9A qMUH4UOUjs
27MhO37q8/WjV3wVWpFqexm3f4HPyMLAAEeExT7UziHyoMLJBHDKMN7 1Ok2kV24wwn+t9P/Va/6OR6LyCmyCrFyiNbbCDtQ9JvCj5
RVBla5q4uEkFRl0t6m9 XZg+qT67sDDoystq3XEfNUmDYDL4kq1xPM66KFk3OS5qeIN2kcSnQ==

Whether you are saving the private key on your local computer or moving it to the client instance, you must ensure that the file permissions are correct. You can do this by running the following commands (throughout this post, be sure to replace placeholder content with your own values). The first command sets the necessary permissions; the second command adds the private key to the authentication agent.

$ chmod 600 ~/.ssh/<private_key_file>
$ ssh-add ~/.ssh/<private_key_file>

Set up the AWS CLI tools

Now that you have your SSH key pair ready, you can set up the AWS CLI tools so that you may provision and manage your CloudHSM devices. If you used the CloudFormation template or the CloudHSM AMI to set up your client instance, you already have the CLI installed. You can check this by running cloudhsm version at the command prompt; the resulting output should include "Version": "3.0.5". If you chose to use your own AMI and install the Luna SA software, you can install the CloudHSM CLI tools with the following steps. The current version in use is 3.0.5.

$ wget https://s3.amazonaws.com/cloudhsm-software/CloudHsmCLI.egg
$ sudo easy_install-2.7 -s /usr/local/bin CloudHsmCLI.egg
$ cloudhsm version
{
      "Version": "<version>"
}

You must also set up file and directory ownership for your user on the client instance and the Chrystoki.conf file. The Chrystoki.conf file is the configuration file for the CloudHSM device. By default, CloudHSM devices come ready from the factory for immediate housing of cryptographic keys and performing cryptographic processes on data, but must be configured to connect to your client instances:

  1. On the client instance, set the owner and write permission on the Chrystoki.conf file.
$ sudo chown <owner> /etc/Chrystoki.conf
$ sudo chmod +w /etc/Chrystoki.conf

The <owner> can be either the user or a group the user belongs to (for example, ec2-user).

  2. On the client instance, set the owner of the Luna client directory:
$ sudo chown <owner> -R <luna_client_dir>

The <owner> should be the same as the <owner> of the Chrystoki.conf file. The <luna_client_dir> differs based on the version of the LunaSA client software installed. If these are new setups, use version 5.3 or newer; however, if you have older clients with version 5.1 installed, use version 5.1:

  • Client software version 5.3: /usr/safenet/lunaclient/
  • Client software version 5.1: /usr/lunasa/

You also must configure the AWS CLI tools with AWS credentials to use for the API calls. These can be set by config files, passing the credentials in the commands, or by instance profile. The most secure option, which eliminates the need to hard-code credentials in a config file, is to use an instance profile on your client instance. All CLI commands in this post are performed on a client instance launched with an IAM role that has CloudHSM permissions. If you want to set your credentials in a config file instead, remember that each CLI command should include --profile <profilename>, with <profilename> being the name you assigned in the config file for these credentials. See Configuring the AWS CloudHSM CLI Tools for help with setting up the AWS CLI tools.

You will then set up a persistent SSH tunnel for all CloudHSM devices to communicate with the client instance. This is done by editing the ~/.ssh/config file. Replace <CloudHSM_ip_address> with the private IP of your CloudHSM device, and replace <private_key_file> with the file location of your SSH private key created earlier (for example, /home/user/.ssh/id_rsa).

Host <CloudHSM_ip_address>
User manager
IdentityFile <private_key_file>

Also necessary for the client instance to authenticate with the CloudHSM partitions or partition group are client certificates. Depending on the LunaSA client software you are using, the location of these files can differ. Again, if these are new setups, use version 5.3 or newer; however, if you have older clients with version 5.1 installed, use version 5.1:

  • Linux clients

    • Client software version 5.3: /usr/safenet/lunaclient/cert
    • Client software version 5.1: /usr/lunasa/cert
  • Windows clients

    • Client software version 5.3: %ProgramFiles%\SafeNet\LunaClient\cert
    • Client software version 5.1: %ProgramFiles%\LunaSA\cert

To create the client certificates, you can use the OpenSSL Toolkit or the LunaSA client-side vtl commands. The OpenSSL Toolkit is a program that allows you to manage TLS (Transport Layer Security) and SSL (Secure Sockets Layer) protocols. It is commonly used to create SSL certificates for secure communication between internal network devices. The LunaSA client-side vtl commands are installed on your client instance along with the Luna software. If you used either CloudFormation or the CloudHSM AMI, the vtl commands are already installed for you. If you chose to launch a different AMI, you can download the Luna software. After you download the software, run the command linux/64/install.sh as root on a Linux instance and install the Luna SA option. If you install the software on a Windows instance, run the command windows\64\LunaClient.msi to install the Luna SA option. I show certificate creation in both OpenSSL Toolkit and LunaSA in the following section.

OpenSSL Toolkit

      $ openssl genrsa -out <luna_client_cert_dir>/<client_name>Key.pem 2048
      $ openssl req -new -x509 -days 3650 -key <luna_client_cert_dir>/<client_name>Key.pem -out <client_name>.pem

The <luna_client_cert_dir> is the LunaSA Client certificate directory on the client and the <client_name> can be whatever you choose.

LunaSA

      $ sudo vtl createCert -n <client_name>

The output of the preceding LunaSA command will be similar to the following.

Private Key created and written to:

<luna_client_cert_dir>/<client_name>Key.pem

Certificate created and written to:

<luna_client_cert_dir>/<client_name>.pem

You will need these key file locations later on, so make sure you write them down or save them to a file. One last thing to do at this point is create the client Amazon Resource Name (ARN), which you do by running the following command.

$ cloudhsm create-client --certificate-file <luna_client_cert_dir>/<client_name>.pem

{
      "ClientARN": "<client_arn>",
      "RequestId": "<request_id>"
}

Also write down the client ARN in a safe location because you will need it when registering your client instances to the HA partition group.

Provision your CloudHSM devices

Now for the fun and expensive part. Always remember that for each CloudHSM device you provision to your production environment, there is an upfront fee of $5,000. Because you need more than one CloudHSM device to set up an HA partition group, provisioning two CloudHSM devices to production will cost an upfront fee of $10,000.

If this is your first time trying out CloudHSM, you can have a two-week trial provisioned for you at no cost. The only cost will occur if you decide to keep your CloudHSM devices and move them into production. If you are unsure of the usage in your company, I highly suggest doing a trial first. You can open a support case requesting a trial at any time. You must have a paid support plan to request a CloudHSM trial.

To provision the two CloudHSM devices, SSH into your client instance and run the following CLI command.

$ cloudhsm create-hsm \
--subnet-id <subnet_id> \
--ssh-public-key-file <public_key_file> \
--iam-role-arn <iam_role_arn> \
--syslog-ip <syslog_ip_address>

The response should resemble the following.

{
      "HsmArn": "<hsm_arn>",
      "RequestId": "<request_id>"
}

Make note of each CloudHSM ARN because you will need them to initialize the CloudHSM devices and later add them to the HA partition group.

Initialize the CloudHSM devices

Configuring your CloudHSM devices, or initializing them as the process is formally called, is what allows you to set up the configuration files, certificate files, and passwords on the CloudHSM itself. Because you already have your CloudHSM ARNs from the previous section, you can run the describe-hsm command to get the EniId and the EniIp of the CloudHSM devices. Your results should be similar to the following.

$ cloudhsm describe-hsm --hsm-arn <hsm_arn>
{
     "EniId": "<eni_id>",
     "EniIp": "<eni_ip>",
     "HsmArn": "<hsm_arn>",
     "IamRoleArn": "<iam_role_arn>",    
     "Partitions": [],    
     "RequestId": "<request_id>",    
     "SerialNumber": "<serial_number>",    
     "SoftwareVersion": "5.1.3-1",    
     "SshPublicKey": "<public_key_text>",    
     "Status": "<status>",    
     "SubnetId": "<subnet_id>",    
     "SubscriptionStartDate": "2014-02-05T22:59:38.294Z",    
     "SubscriptionType": "PRODUCTION",    
     "VendorName": "SafeNet Inc."
}

Now that you know the EniId of each CloudHSM, you need to apply the CloudHSM security group to them. This ensures that connections can be made from any instance that has the client security group assigned. When a trial is provisioned for you, or you provision CloudHSM devices yourself, the default security group of the VPC is automatically assigned to the ENI. You must change this to the security group that permits ingress on ports 22 and 1792 from your client instance.

To apply a CloudHSM security group to an EniId:

  1. Go to the EC2 console, and choose Network Interfaces in the left pane.
  2. Select the EniId of the CloudHSM.
  3. From the Actions drop-down list, choose Change Security Groups. Choose the security group for your CloudHSM device, and then click Save.
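
If you prefer to script this change rather than use the console, the same update can be made with the AWS CLI, as in the following sketch (the ENI ID and security group ID are placeholders for your own values).

$ aws ec2 modify-network-interface-attribute \
--network-interface-id <eni_id> \
--groups <CloudHSM_security_group_id>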

To complete the initialization process, you must ensure a persistent SSH connection is in place from your client to the CloudHSM. Remember the ~/.ssh/config file you edited earlier? Now that you have the IP address of the CloudHSM devices and the location of the private SSH key file, go back and fill in that config file’s parameters by using your favorite text editor.
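
For example, an entry for one CloudHSM device might look like the following sketch. The host alias is arbitrary, and the user name assumes the standard LunaSA manager account; replace the placeholders with your own values.

Host CloudHSM1
    HostName <CloudHSM_eni_ip>
    User manager
    IdentityFile ~/.ssh/<private_key_file>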

Now, initialize using the initialize-hsm command with the information you gathered from the provisioning steps. The angle-bracketed values in the following example are placeholders for your own naming and password conventions, and should be replaced with your information during the initialization of the CloudHSM devices.

$ cloudhsm initialize-hsm \
--hsm-arn <CloudHSM_arn> \
--label <label> \
--cloning-domain <cloning_domain> \
--so-password <so_password>

The <label> is a unique name for the CloudHSM device that should be easy to remember. You can also use this name as a descriptive label that tells what the CloudHSM device is for. The <cloning_domain> is a secret used to control cloning of key material from one CloudHSM to another. This can be any unique name that fits your company’s naming conventions. Examples could be exampleproduction or exampledevelopment. If you are going to set up an HA partition group environment, the <cloning_domain> must be the same across all CloudHSMs. The <so_password> is the security officer password for the CloudHSM device, and for ease of remembrance, it should be the same across all devices as well. It is important that you use passwords and cloning domain names that you will remember, because they are unrecoverable, and losing them means losing all data on a CloudHSM device. For your use, we do supply a Password Worksheet if you want to write down your passwords and store the printed page in a secure place.

Configure the client instance

Configuring the client instance is important because it is the secure link between you, your applications, and the CloudHSM devices. The client instance opens a secure channel to the CloudHSM devices and sends all requests over this channel so that the CloudHSM device can perform the cryptographic operations and key storage. Because you already have launched the client instance and mostly configured it, the only step left is to create the Network Trust Link (NTL) between the client instance and the CloudHSM. For this, we will use the LunaSA vtl commands again.

  1. Copy the server certificate from the CloudHSM to the client.
$ scp -i ~/.ssh/<private_key_file> manager@<CloudHSM_ip_address>:server.pem .
  2. Register the CloudHSM certificate with the client.
$ sudo vtl addServer -n <CloudHSM_ip_address> -c server.pem
New server <CloudHSM_ip_address> successfully added to server list.
  3. Copy the client certificate to the CloudHSM.
$ scp -i ~/.ssh/<private_key_file> <client_cert_directory>/<client_name>.pem manager@<CloudHSM_ip_address>:
  4. Connect to the CloudHSM.
$ ssh -i ~/.ssh/<private_key_file> manager@<CloudHSM_ip_address>
lunash:>
  5. Register the client.
lunash:> client register -client <client_id> -hostname <client_name>

The <client_id> and <client_name> should be the same for ease of use, and this should be the same as the name you used when you created your client certificate.

  6. On the CloudHSM, log in with the SO password.
lunash:> hsm login
  7. Create a partition on each CloudHSM (use the same name for ease of remembrance).
lunash:> partition create -partition <name> -password <partition_password> -domain <cloning_domain>

The <partition_password> does not have to be the same as the SO password, and for security purposes, it should be different.

  8. Assign the client to the partition.
lunash:> client assignPartition -client <client_id> -partition <partition_name>
  9. Verify that the partition assignment went correctly.
lunash:> client show -client <client_id>
  10. Log in to the client and verify it has been properly configured.
$ vtl verify
The following Luna SA Slots/Partitions were found:
Slot    Serial #         Label
====    =========        ============
1      <serial_num1>     <partition_name>
2      <serial_num2>     <partition_name>

You should see an entry for each partition created on each CloudHSM device. This step lets you know that the CloudHSM devices and client instance were properly configured.

The partitions created and assigned via the previous steps are for testing purposes only and will not be used in the HA partition group setup. The HA partition group workflow will automatically create a partition on each CloudHSM device for its purposes. At this point, you have created the client and at least two CloudHSM devices. You also have set up and tested for validity the connection between the client instance and the CloudHSM devices. The next step is to ensure fault tolerance by setting up the HA partition group.

Set up the HA partition group for fault tolerance

Now that you have provisioned multiple CloudHSM devices in your account, you will add them to an HA partition group. As I explained earlier in this post, an HA partition group is a virtual partition that represents a group of partitions distributed over many physical CloudHSM devices for HA. Automatic recovery is also a key factor in ensuring HA and data integrity across your HA partition group members. If you followed the previous procedures in this post, setting up the HA partition group should be relatively straightforward.

Create the HA partition group

First, you will create the actual HA partition group itself. Using the CloudHSM CLI on your client instance, run the following command to create the HA partition group and name it per your company’s naming conventions. In the following command, replace <label> with the name you chose.

$ cloudhsm create-hapg --group-label <label>

Register the CloudHSM devices with the HA partition group

Now, add the already initialized CloudHSM devices to the HA partition group. You will need to run the following command for each CloudHSM device you want to add to the HA partition group.

$ cloudhsm add-hsm-to-hapg \
--hsm-arn <CloudHSM_arn> \
--hapg-arn <hapg_arn> \
--cloning-domain <cloning_domain> \
--partition-password <partition_password> \
--so-password <so_password>

You should see output similar to the following after each successful addition to the HA partition group.

{
      "Status": "Addition of CloudHSM <CloudHSM_arn> to HA partition group <hapg_arn> successful"
}

Register the client with the HA partition group

The last step is to register the client with the HA partition group. You will need the client ARN from earlier in the post, and you will use the CloudHSM CLI command register-client-to-hapg to complete this process.

$ cloudhsm register-client-to-hapg \
--client-arn <client_arn> \
--hapg-arn <hapg_arn>
{
      "Status": "Registration of the client <client_arn> to the HA partition group <hapg_arn> successful"
}

Registering the client with the HA partition group does not by itself place the client configuration file and the server certificates on your client instance. You retrieve those by using the get-client-configuration CLI command.

$ cloudhsm get-client-configuration \
--client-arn <client_arn> \
--hapg-arn <hapg_arn> \
--cert-directory <server_cert_location> \
--config-directory /etc/

The configuration file has been copied to /etc/
The server certificate has been copied to <server_cert_location>

The <server_cert_location> will differ depending on the LunaSA client software you are using:

  • Client software version 5.3: /usr/safenet/lunaclient/cert/server
  • Client software version 5.1: /usr/lunasa/cert/server

Lastly, to verify the client configuration, run the following LunaSA vtl command.

$ vtl haAdmin show

In the output, you will see a heading, HA Group and Member Information. Ensure that the number of group members equals the number of CloudHSM devices you added to the HA partition group. If the number does not match what you have provisioned, you might have missed a step in the provisioning process. Going back through the provisioning process usually repairs this. However, if you still encounter issues, opening a support case is the quickest way to get assistance.

Another way to verify the HA partition group setup is to check the /etc/Chrystoki.conf file for output similar to the following.

VirtualToken = {
   VirtualToken00Label = hapg1;
   VirtualToken00SN = 1529127380;
   VirtualToken00Members = 475256026,511541022;
}
HASynchronize = {
   hapg1 = 1;
}
HAConfiguration = {
   reconnAtt = -1;
   AutoReconnectInterval = 60;
   HAOnly = 1;
}

Summary

You have now completed the process of provisioning CloudHSM devices, the client instance for connection, and your HA partition group for fault tolerance. You can begin using an application of your choice to access the CloudHSM devices for key management and encryption. By accessing CloudHSM devices via the HA partition group, you ensure that all traffic is load balanced between all backing CloudHSM devices. The HA partition group will ensure that each CloudHSM has identical information so that it can respond to any request issued.

Now that you have an HA partition group set up with automatic recovery, if a CloudHSM device fails, the device will attempt to recover itself, and all traffic will be rerouted to the remaining CloudHSM devices in the HA partition group so as not to interrupt traffic. After recovery (manual or automatic), all data will be replicated across the CloudHSM devices in the HA partition group to ensure consistency.

If you have questions about any part of this blog post, please post them on the IAM forum.

– Tracy

PulseAudio 9.0 is out

Post Syndicated from corbet original http://lwn.net/Articles/692988/rss

The PulseAudio 9.0 release is out. Changes include improvements to
automatic routing, beamforming support, use of the Linux memfd mechanism for transport, higher
sample-rate support, and more; see the
release notes
for details.

See also: this
article from Arun Raghavan
on how the beamforming feature works.
The basic idea is that if you have a number of microphones (a mic
array) in some known arrangement, it is possible to ‘point’ or steer the
array in a particular direction, so sounds coming from that direction are
made louder, while sounds from other directions are rendered softer
(attenuated).

Streaming Site Operators Face Jail & $1.7m Forfeiture

Post Syndicated from Andy original https://torrentfreak.com/streaming-site-operators-face-jail-1-7m-forfeiture-160626/

Founded half a decade ago, Swefilmer was Sweden’s most popular unauthorized streaming site.

Offering all the latest movies and TV shows, Swefilmer (and another, Dreamfilm) captured up to 25% of all web TV viewing in Sweden according to a 2015 report.

Last summer, however, the noose began to tighten. In July local man Ola Johansson revealed that he’d been raided by the police under suspicion of being involved in running the site.

Meanwhile, police continued the hunt for the site’s primary operator and in March 2016 it was revealed that a Turkish national had been arrested in Germany on a secret European arrest warrant. The 25-year-old is said to be the person who received donations from users and set up Swefilmer’s deals with advertisers.

Both men have now been prosecuted by Swedish authorities. In an indictment filed in the Varberg District Court, both men are accused of copyright infringement connected to the unlawful distribution of more than 1,400 movies.

Additionally, the 25-year-old stands accused of aggravated money laundering offenses related to his handling of Swefilmer’s finances.

The prosecution says that the site generated more than $1.7m between November 2013 and June 2015. More than $1.5m of that amount came from advertising with user donations contributing around $110,000. The state wants the 25-year-old to forfeit the full amount. A $77,000 car and properties worth $233,000 have already been seized.

While both could be sent to prison, the 22-year-old faces less serious charges and will be expected to pay back around $3,600.

The trial, which is expected to go ahead in just over a week, will be the most significant case against a streaming portal in Sweden to date.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Graphical fidelity is ruining video games

Post Syndicated from Eevee original https://eev.ee/blog/2016/06/22/graphical-fidelity-is-ruining-video-games/

I’m almost 30, so I have to start practicing being crotchety.

Okay, maybe not all video games, but something curious has definitely happened here. Please bear with me for a moment.

Discovering Doom

Surprise! This is about Doom again.

Last month, I sat down and played through the first episode of Doom 1 for the first time. Yep, the first time. I’ve mentioned before that I was introduced to Doom a bit late, and mostly via Doom 2. I’m familiar with a decent bit of Doom 1, but I’d never gotten around to actually playing through any of it.

I might be almost unique in playing Doom 1 for the first time decades after it came out, while already being familiar with the series overall. I didn’t experience Doom 1 only in contrast to modern games, but in contrast to later games using the same engine.

It was very interesting to experience Romero’s design sense in one big chunk, rather than sprinkled around as it is in Doom 2. Come to think of it, Doom 1’s first episode is the only contiguous block of official Doom maps to have any serious consistency: it sticks to a single dominant theme and expands gradually in complexity as you play through it. Episodes 2 and 3, as well of most of Doom 2, are dominated by Sandy Petersen’s more haphazard and bizarre style. Episode 4 and Final Doom, if you care to count them, are effectively just map packs.

It was also painfully obvious just how new this kind of game was. I’ve heard Romero stress the importance of contrast in floor height (among other things) so many times, and yet Doom 1 is almost comically flat. There’s the occasional lift or staircase, sure, but the spaces generally feel like they’re focused around a single floor height with the occasional variation. Remember, floor height was a new thing — id had just finished making Wolfenstein 3D, where the floor and ceiling were completely flat and untextured.

The game was also clearly designed for people who had never played this kind of game. There was much more ammo than I could possibly carry; I left multiple shell boxes behind on every map. The levels were almost comically easy, even on UV, and I’m not particularly good at shooters. It was a very stark contrast to when I played partway through The Plutonia Experiment a few years ago and had to rely heavily on quicksaving.

Seeing Doom 1 from a Doom 2 perspective got me thinking about how design sensibilities in shooters have morphed over time. And then I realized something: I haven’t enjoyed an FPS since Quake 2.

Or… hang on. That’s not true. I enjoy Splatoon (except when I lose). I loved the Metroid Prime series. I played Team Fortress 2 for quite a while.

On the other hand, I found Half-Life 2 a little boring, I lost interest in Doom 3 before even reaching Hell, and I bailed on Quake 4 right around the extremely hammy spoiler plot thing. I loved Fallout, but I couldn’t stand Fallout 3. Uncharted is pretty to watch, but looks incredibly tedious to play. I never cared about Halo. I don’t understand the appeal of Counterstrike or Call of Duty.

If I made a collage of screenshots of these two sets of games, you’d probably spot the pattern pretty quickly. It seems I can’t stand games with realistic graphics.

I have a theory about this.

The rise of realism

Quake introduced the world to “true” 3D — an environment made out of arbitrary shapes, not just floors and walls. (I’m sure there were other true-3D games before it, but I challenge you to name one off the top of your head.)

Before Quake, games couldn’t even simulate a two-story building, which ruled out most realistic architecture. Walls that slid sideways were virtually unique to Hexen (and, for some reason, the much earlier Wolfenstein 3D). So level designers built slightly more abstract spaces instead. Consider this iconic room from the very beginning of Doom’s E1M1.

What is this room? This is supposed to be a base of some kind, but who would build this room just to store a single armored vest? Up a flight of stairs, on a dedicated platform, and framed by glowing pillars? This is completely ridiculous.

But nobody thinks like that, and even the people who do, don’t really care too much. It’s a room with a clear design idea and a clear gameplay purpose: to house the green armor. It doesn’t matter that this would never be a real part of a base. The game exists in its own universe, and it establishes early on that these are the rules of that universe. Sometimes a fancy room exists just to give the player a thing.

At the same time, the room still resembles a base. I can take for granted, in the back of my head, that someone deliberately placed this armor here for storage. It’s off the critical path, too, so it doesn’t quite feel like it was left specifically for me to pick up. The world is designed for the player, but it doesn’t feel that way — the environment implies, however vaguely, that other stuff is going on here.


Fast forward twenty years. Graphics and physics technology have vastly improved, to the point that we can now roughly approximate a realistic aesthetic in real-time. A great many games thus strive to do exactly that.

And that… seems like a shame. The better a game emulates reality, the less of a style it has. I can’t even tell Call of Duty and Battlefield apart.

That’s fine, though, right? It’s just an aesthetic thing. It doesn’t really affect the game.

It totally affects the game

Everything looks the same

“Realism” generally means “ludicrous amounts of detail” — even moreso if the environments are already partially-destroyed, which is a fairly common trope I’ll be touching on a lot here.

When everything is highly-detailed, screenshots may look very good, but gameplay suffers because the player can no longer tell what’s important. The tendency for everything to have a thick coating of sepia certainly doesn’t help.

Look at that Call of Duty screenshot again. What in this screenshot is actually important? What here matters to you as a player? As far as I can tell, the only critical objects are:

  • Your current weapon

That’s it. The rocks and grass and billboards and vehicles and Hollywood sign might look very nice (by which I mean, “look like those things look”), but they aren’t important to the game at all. This might as well be a completely empty hallway.

To be fair, I haven’t played the game, so for all I know there’s a compelling reason to collect traffic cones. Otherwise, this screenshot is 100% noise. Everything in it serves only to emphasize that you’re in a realistic environment.

Don’t get me wrong, setting the scene is important, but something has been missed here. Detail catches the eye, and this screenshot is nothing but detail. None of it is relevant. If there were ammo lying around, would you even be able to find it?

Ah, but then, modern realistic games either do away with ammo pickups entirely or make them glow so you can tell they’re there. You know, for the realism.

(Speaking of glowing: something I always found ridiculous was how utterly bland the imp fireballs look in Doom 3 and 4. We have these amazing lighting engines, and the best we can do for a fireball is a solid pale orange circle? How do modern fireballs look less interesting than a Doom 1 fireball sprite?)

Even Fallout 2 bugged me a little with this; the world was full of shelves and containers, but it seemed almost all of them were completely empty. Fallout 1 had tons of loot waiting to be swiped from shelves, but someone must’ve decided that was a little silly and cut down on it in Fallout 2. So then, what’s the point of having so many shelves? They encourage the player to explore, then offer no reward whatsoever most of the time.

Environments are boring and static

Fallout 3 went right off the rails, filling the world with tons of (gray) detail, none of which I could interact with. I was barely finished with the first settlement before I gave up on the game because of how empty it felt. Everywhere was detailed as though it were equally important, but most of it was static decorations. From what I’ve seen, Fallout 4 is even worse.

Our graphical capabilities have improved much faster than our ability to actually simulate all the junk we’re putting on the screen. Hey, there’s a car! Can I get in it? Can I drive it? No, I can only bump into an awkwardly-shaped collision box drawn around it. So what’s the point of having a car, an object that — in the real world — I’m accustomed to being able to use?

And yet… a game that has nothing to do with driving a car doesn’t need you to be able to drive a car. Games are games, not perfect simulations of reality. They have rules, a goal, and a set of things the player is able to do. There’s no reason to make the player able to do everything if it has no bearing on what the game’s about.

This puts “realistic” games in an awkward position. How do they solve it?

One good example that comes to mind is Portal, which was rendered realistically, but managed to develop a style from the limited palette it used in the actual play areas. It didn’t matter that you couldn’t interact with the world in any way other than portaling walls and lifting cubes, because for the vast majority of the game, you only encountered walls and cubes! Even the “behind the scenes” parts at the end were mostly architecture, not objects, and I’m not particularly bothered that I can’t interact with a large rusty pipe.

The standouts were the handful of offices you managed to finagle your way into, which were of course full of files and computers and other desktop detritus. Everything in an office is — necessarily! — something a human can meaningfully interact with, but the most you can do in Portal is drop a coffee cup on the floor. It’s all the more infuriating if you consider that the plot might have been explained by the information in those files or on those computers. Portal 2 was in fact a little worse about this, as you spent much more time outside of the controlled test areas.

I think Left 4 Dead may have also avoided this problem by forcing the players to be moving constantly — you don’t notice that you can’t get in a car if you’re running for your life. The only time the players can really rest is in a safe house, which are generally full of objects the players can pick up and use.

Progression feels linear and prescripted

Ah, but the main draw of Portal is one of my favorite properties of games: you could manipulate the environment itself. It’s the whole point of the game, even. And it seems to be conspicuously missing from many modern “realistic” games, partly because real environments are just static, but also in large part because… of the graphics!

Rendering a very complex scene is hard, so modern map formats do a whole lot of computing stuff ahead of time. (For similar reasons, albeit more primitive ones, vanilla Doom can’t move walls sideways.) Having any of the environment actually move or change is thus harder, so it tends to be reserved for fancy cutscenes when you press the button that lets you progress. And because grandiose environmental changes aren’t very realistic, that button often just opens a door or blows something up.

It feels hamfisted, like someone carefully set it all up just for me. Obviously someone did, but the last thing I want is to be reminded of that. I’m reminded very strongly of Half-Life 2, which felt like one very long corridor punctuated by the occasional overt physics puzzle. Contrast with Doom, where there are buttons all over the place and they just do things without drawing any particular attention to the results. Mystery switches are sometimes a problem, but for better or worse, Doom’s switches always feel like something I’m doing to the game, rather than the game waiting for me to come along so it can do some preordained song and dance.

I miss switches. Real switches, not touchscreens. Big chunky switches that take up half a wall.

It’s not just the switches, though. Several of Romero’s maps from episode 1 are shaped like a “horseshoe”, which more or less means that you can see the exit from the beginning (across some open plaza). More importantly, the enemies at the exit can see you, and will be shooting at you for much of the level.

That gives you choices, even within the limited vocabulary of Doom. Do you risk wasting ammo trying to take them out from a distance, or do you just dodge their shots all throughout the level? It’s up to you! You get to decide how to play the game, naturally, without choosing from a How Do You Want To Play The Game menu. Hell, Doom has entire speedrun categories focused around combat — Tyson for only using the fist and pistol, pacifist for never attacking a monster at all.

You don’t see a lot of that any more. Rendering an entire large area in a polygon-obsessed game is, of course, probably not going to happen — whereas the Doom engine can handle it just fine. I’ll also hazard a guess and say that having too much enemy AI going at once and/or rendering too many highly-detailed enemies at once is too intensive. Or perhaps balancing and testing multiple paths is too complicated.

Or it might be the same tendency I see in modding scenes: the instinct to obsessively control the player’s experience, to come up with a perfectly-crafted gameplay concept and then force the player to go through it exactly as it was conceived. Even Doom 4, from what I can see, has a shocking amount of “oh no the doors are locked, kill all the monsters to unlock them!” nonsense. Why do you feel the need to force the player to shoot the monsters? Isn’t that the whole point of the game? Either the player wants to do it and the railroading is pointless, or the player doesn’t want to do it and you’re making the game actively worse for them!

Something that struck me in Doom’s E1M7 was that, at a certain point, you run back across half the level and there are just straggler monsters all over the place. They all came out of closets when you picked up something, of course, but they also milled around while waiting for you to find them. They weren’t carefully scripted to teleport around you in a fixed pattern when you showed up; they were allowed to behave however they want, following the rules of the game.

Whatever the cause, something has been lost. The entire point of games is that they’re an interactive medium — the player has some input, too.

Exploration is discouraged

I haven’t played through too many recent single-player shooters, but I get the feeling that branching paths (true nonlinearity) and sprawling secrets have become less popular too. I’ve seen a good few people specifically praise Doom 4 for having them, so I assume the status quo is to… not.

That’s particularly sad off the back of Doom episode 1, which has sprawling secrets that often feel like an entire hidden part of the base. In several levels, merely getting outside qualifies as a secret. There are secrets within secrets. There are locked doors inside secrets. It’s great.

And these are real secrets, not three hidden coins in a level and you need to find so many of them to unlock more levels. The rewards are heaps of resources, not a fixed list of easter eggs to collect. Sometimes they’re not telegraphed at all; sometimes you need to do something strange to open them. Doom has a secret you open by walking up to one of two pillars with a heart on it. Doom 2 has a secret you open by run-jumping onto a light fixture, and another you open by “using” a torch and shooting some eyes in the wall.

I miss these, too. Finding one can be a serious advantage, and you can feel genuinely clever for figuring them out, yet at the same time you’re not permanently missing out on anything if you don’t find them all.

I can imagine why these might not be so common any more. If decorating an area is expensive and complicated, you’re not going to want to build large areas off the critical path. In Doom, though, you can make a little closet containing a powerup in about twenty seconds.

More crucially, many of the Doom secrets require the player to notice a detail that’s out of place — and that’s much easier to set up in a simple world like Doom. In a realistic world where every square inch is filled with clutter, how could anyone possibly notice a detail out of place? How can a designer lay any subtle hints at all, when even the core gameplay elements have to glow for anyone to pick them out from background noise?

This might be the biggest drawback to extreme detail: it ultimately teaches the player to ignore the detail, because very little of it is ever worth exploring. After running into enough invisible walls, you’re going to give up on straying from the beaten path.

We wind up with a world where players are trained to look for whatever glows, and completely ignore everything else. At which point… why are we even bothering?

There are no surprises

“Realistic” graphics mean a “realistic” world, and let’s face it, the real world can be a little dull. That’s why we invented video games, right?

Doom has a very clear design vocabulary. Here are some demons. They throw stuff at you; don’t get hit by it. Here are some guns, which you can all hold at once, because those are the rules. Also here’s a glowing floating sphere that gives you a lot of health.

What is a megasphere, anyway? Does it matter? It’s a thing in the game with very clearly-defined rules. It’s good; pick it up.

You can’t do that in a “realistic” game. (Or maybe you can, but we seem to be trying to avoid it.) You can’t just pick up a pair of stereoscopic glasses to inexplicably get night vision for 30 seconds; you need to have some night-vision goggles with batteries and it’s a whole thing. You can’t pick up health kits that heal you; you have to be wearing regenerative power armor and pick up energy cells. Even Doom 4 seems to be uncomfortable leaving brightly flashing keycards lying around — instead you retrieve them from the corpses of people wearing correspondingly-colored armor.

Everything needs an explanation, which vastly reduces the chances of finding anything too surprising or new.

I’m told that Call of Duty is the most popular vidya among the millenials, so I went to look at its weapons:

  • Gun
  • Fast gun
  • Long gun
  • Different gun

How exciting! If you click through each of those gun categories, you can even see the list of unintelligible gun model numbers, which are exactly what gets me excited about a game.

I wonder if those model numbers are real or not. I’m not sure which would be worse.

Get off my lawn

So my problem is that striving for realism is incredibly boring and counter-productive. I don’t even understand the appeal; if I wanted reality, I could look out my window.

“Realism” actively sabotages games. I can judge Doom or Mario or Metroid or whatever as independent universes with their own rules, because that’s what they are. A game that’s trying to mirror reality, I can only compare to reality — and it’ll be a very pale imitation.

It comes down to internal consistency. Doom and Team Fortress 2 and Portal and Splatoon and whatever else are pretty upfront about what they’re offering: you have a gun, you can shoot it, also you can run around and maybe press some buttons if you’re lucky. That’s exactly what you get. It’s right there on the box, even.

Then I load Fallout 3, and it tries to look like the real world, and it does a big song and dance asking me for my stats “in-world”, and it tries to imply I can roam this world and do anything I want and forge my own destiny. Then I get into the game, and it turns out I can pretty much just shoot, pick from dialogue trees, and make the occasional hamfisted moral choice. The gameplay doesn’t live up to what the environment tried to promise. The controls don’t even live up to what the environment tried to promise.

The great irony is that “realism” is harshly limiting, even as it grows ever more expensive and elaborate. I’m reminded of the Fat Man in Fallout 3, the gun that launches “mini nukes”. If that weapon had been in Fallout 1 or 2, I probably wouldn’t think twice about it. But in the attempted “realistic” world of Fallout 3, I have to judge it as though it were trying to be a real thing — because it is! — and that makes it sound completely ridiculous.

(It may sound like I’m picking on Fallout 3 a lot here, but to its credit, it actually had enough stuff going on that it stands out to me. I barely remember anything about Doom 3 or Quake 4, and when I think of Half-Life 2 I mostly imagine indistinct crumbling hallways or a grungy river that never ends.)

I’ve never felt this way about series that ignored realism and went for their own art style. Pikmin 3 looks very nice, but I never once felt that I ought to be able to do anything other than direct Pikmin around. Metroid Prime looks great too and has some “realistic” touches, but it still has a very distinct aesthetic, and it manages to do everything important with a relatively small vocabulary — even plentiful secrets.

I just don’t understand the game industry (and game culture)’s fanatical obsession with realistic graphics. They make games worse. It’s entirely possible to have an art style other than “get a lot of unpaid interns to model photos of rocks”, even for a mind-numbingly bland army man simulator. Please feel free to experiment a little more. I would love to see more weird and abstract worlds that follow their own rules and drag you down the rabbit hole with them.

Making it easier to deploy TPMTOTP on non-EFI systems

Post Syndicated from Matthew Garrett original http://mjg59.dreamwidth.org/41458.html

I’ve been working on TPMTOTP a little this weekend. I merged a pull request that adds command-line argument handling, which includes the ability to choose the set of PCRs you want to seal to without rebuilding the tools, and also lets you print the base32 encoding of the secret rather than the qr code so you can import it into a wider range of devices. More importantly it also adds support for setting the expected PCR values on the command line rather than reading them out of the TPM, so you can now re-seal the secret against new values before rebooting.

I also wrote some new code myself. TPMTOTP is designed to be usable in the initramfs, allowing you to validate system state before typing in your passphrase. Unfortunately the initramfs itself is one of the things that’s measured. So, you end up with something of a chicken and egg problem – TPMTOTP needs access to the secret, and the obvious thing to do is to put the secret in the initramfs. But the secret is sealed against the hash of the initramfs, and so you can’t generate the secret until after the initramfs. Modify the initramfs to insert the secret and you change the hash, so the secret is no longer released. Boo.

On EFI systems you can handle this by sticking the secret in an EFI variable (there’s some special-casing in the code to deal with the additional metadata on the front of things you read out of efivarfs). But that’s not terribly useful if you’re not on an EFI system. Thankfully, there’s a way around this. TPMs have a small quantity of nvram built into them, so we can stick the secret there. If you pass the -n argument to sealdata, that’ll happen. The unseal apps will attempt to pull the secret out of nvram before falling back to looking for a file, so things should just magically work.

I think it’s pretty feature complete now, other than TPM2 support? That’s on my list.


Will Spark Power the Data behind Precision Medicine?

Post Syndicated from Christopher Crosbie original https://blogs.aws.amazon.com/bigdata/post/Tx1GE3J0NATVJ39/Will-Spark-Power-the-Data-behind-Precision-Medicine

Christopher Crosbie is a Healthcare and Life Science Solutions Architect with Amazon Web Services.
This post was co-authored by Ujjwal Ratan, a Solutions Architect with Amazon Web Services.

———————————

“And that’s the promise of precision medicine — delivering the right treatments, at the right time, every time to the right person.” (President Obama, 2015 State of the Union address)

The promise of precision medicine that President Obama envisions with this statement is a far-reaching goal that will require sweeping changes to the ways physicians treat patients, health data is collected, and global collaborative research is performed. Precision medicine typically describes an approach for treating and preventing disease that takes into account a patient’s individual variation in genes, lifestyle, and environment. Achieving this mission relies on the intersection of several technology innovations and a major restructuring of health data to focus on the genetic makeup of an individual.

The healthcare ecosystem has chosen a variety of tools and techniques for working with big data, but one tool that comes up again and again in many of the architectures we design and review is Spark on Amazon EMR.

Spark is already known for being a major player in big data analysis, but it is also uniquely capable of advancing genomics algorithms, given the complex nature of genomics research. This post introduces gene analysis using Spark on EMR and ADAM, for those new to precision medicine.

Development of precision medicine

Data-driven tailored treatments have been commonplace for certain treatments like blood transfusions for a long time. Historically, however, most treatment plans are deeply subjective, due to the many disparate pieces of information that physicians must tie together to make a health plan based on the individual’s specific conditions.

To move past the idiosyncratic nature of most medical treatments, we need to amass properly collected and curated biological data to compare and correlate outcomes and biomarkers across varying patient populations.

The data of precision medicine inherently involves the data representation of large volumes of living, mobile, and irrationally complex humans. The recent blog post How The Healthcare of Tomorrow is Being Delivered Today detailed the ways in which AWS underlies some of the most advanced and innovative big data technologies in precision medicine. Technologies like these will enable the research and analysis of the data structures necessary to create true individualized care.

Genomics data sets require exploration

The study of genomics dates back much further than the Obama era. The field benefits from the results of a prodigious amount of research spanning Gregor Mendel’s pea pods in the 1860s to the Human Genome Project of the 1990s.

As with most areas of science, building knowledge often goes hand in hand with legacy analysis features that turn out to be outdated as the discipline evolves. The generation of scientists following Mendel, for example, significantly altered his calculations due to the wide adoption of the p-value in statistics.

The anachronism for many of the most common genomics algorithms today is the failure to properly optimize for cloud technology. Memory requirements are often limited to a single compute node or expect a POSIX file system. These tools may have been sufficient for the computing task involved in analyzing the genome of a single person. However, a shift to cloud computing will be necessary as we move to the full-scale population studies that will be required to develop novel methods in precision medicine.

Many existing genomics algorithms must be refactored to address the scale at which research is done today. The Broad Institute of MIT and Harvard, a leading genomic research center, recognized this and moved many of the algorithms that could be pulled apart and parallelized into a MapReduce paradigm within a Genome Analysis Toolkit (GATK). Despite the MapReduce migration and many Hadoop-based projects such as BioPig, the bioinformatics community did not fully embrace the Hadoop ecosystem due to the sequential nature of the reads and the overhead associated with splitting the MapReduce tasks.

Precision medicine is also going to rely heavily on referencing public data sets. Generally available, open data sets are often downloaded and copied among many researcher centers. These multiple copies of the same data create an inefficiency for researchers that is addressed through the AWS Public Data Sets program. This program allows researchers to leverage popular genomics data sets like TCGA and ICGC without having to pay for raw storage.

Migrating genetics to Spark on Amazon EMR

The transitioning of genomics algorithms that are popular today to Spark is one path that scientists are taking to capture the distributed processing capabilities of the cloud. However, precision medicine will require an abundance of exploration and new approaches. Many of these are already being built on top of Spark on EMR.

This is because Spark on EMR provides much of the functionality that aligns well with the goals of precision medicine. For instance, using Amazon S3 as an extension of HDFS storage makes it easy to share the results of your analysis with collaborators from all over the world by simply allowing access to an S3 URL. It also provides the ability for researchers to adjust their cluster to the algorithm they are trying to build instead of adjusting their algorithm to the cluster to which they have access. Competition for compute resources with other cluster users is another drawback that can be mitigated with a move toward EMR.
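
For instance, once results land in S3, one way to share them is to hand a collaborator a time-limited URL to a result object instead of copying the data around; in the following sketch, the bucket and key are placeholders.

$ aws s3 presign s3://<your_results_bucket>/analysis/results.parquet --expires-in 604800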

The introduction of Spark has now overcome many of the previous limitations associated with parallelizing the data based on index files that comprise the standard Variant Control Files (VCF) and Binary Alignment Map (BAM) formats used by genomics researchers. The AMPLab at UC Berkeley, through projects like ADAM, demonstrated the value of a Spark infrastructure for genomics data processing.

Moreover, genomics scientists began identifying the need to get away from the undifferentiated heavy lifting of developing custom distributed system code. These developments motivated the Broad to develop the next version of the GATK with an option to run in the cloud on Spark. The upcoming GATK is a major step forward for the scientific community since it will soon be able to incorporate many of the features of EMR, such as on-demand clusters of various instance types and Amazon S3–backed storage.

Although Spark on EMR provides many infrastructure advantages, Spark still speaks the languages that are popular with the research community. Languages like SparkR on EMR make for an easy transition into the cloud. That way, when issues arise, such as needing to unpack gzip-compressed files to make them splittable, familiar R code can be used to make the transformation.

What’s under the hood of ADAM?

ADAM is an in-memory MapReduce system. ADAM is the result of contributions from universities, biotech, and pharma companies. It is entirely open source under the Apache 2 license. ADAM uses Apache Avro and Parquet file formats for data storage to provide a schema-based description of genomic data. This eliminates the dependencies on format-based libraries, which in the past has created incompatibilities.

Apache Parquet is a columnar storage format that is well suited for genomic data. The primary reason is that genomic data consists of a large number of similar data items, which align well with columnar storage. By using Apache Parquet as its storage format, ADAM makes querying and storing genomic data highly efficient. It limits I/O to the data actually needed by loading only the columns that need to be accessed. Because of its columnar nature, storing data in Parquet saves space as a result of better compression ratios.

The columnar Parquet file format enables up to 25% improvement in storage volume compared to compressed genomics file formats like BAM. Moreover, the in-memory caching using Spark and the ability to parallelize data processing over multiple nodes have been known to provide up to 50 times better performance on average on a 100-node cluster.

In addition to using the Parquet format for columnar storage, ADAM makes use of a new schema for genomics data referred to as bdg-formats, a project that provides schemas for describing common genomic data types such as variants, assemblies, and genotypes. It uses Apache Avro as its base framework and as a result, works well with common programming languages and platforms. By using the Apache Avro-based schema as its core data structure, workflows built using ADAM are flexible and easy to maintain using familiar technologies.

ADAM on Amazon EMR

The figure above represents a typical genomic workload designed on AWS using ADAM. File formats like SAM/BAM and VCF are uploaded as objects on Amazon S3.

Amazon EMR provides a way to run Spark compute jobs with ADAM and other applications on an as-needed basis while keeping the genomics data itself stored in separate, cheap object storage.

Your first genomics analysis with ADAM on Amazon EMR

To install ADAM on EMR, launch an EMR cluster from the AWS Management Console. Make sure you select the option to install Spark, as shown in the screen shot below.

While launching the EMR cluster, you should configure the EMR master security groups to allow you access to port 22 so you can SSH to the master node. The master node should be able to communicate to the Internet to download the necessary packages and build ADAM.
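
If you prefer to script the launch instead of using the console, a roughly equivalent AWS CLI call is sketched below; the cluster name, release label, instance type and count, and key pair name are assumptions you should replace with values appropriate for your account.

$ aws emr create-cluster \
--name "adam-spark" \
--release-label emr-4.7.1 \
--applications Name=Spark \
--ec2-attributes KeyName=<your_key_pair> \
--instance-type m3.xlarge \
--instance-count 3 \
--use-default-roles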

ADAM installation steps

During the creation of a cluster, a shell script known as a bootstrap action can automate the installation process of ADAM. In case you want to install ADAM manually, the following instructions walk through what the script does.
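
For example, if you upload such a bootstrap script to S3, you can have EMR run it during cluster creation by adding an option like the following to the create-cluster command shown earlier (the bucket and script name are placeholders).

--bootstrap-actions Path=s3://<your_bucket>/install-adam.sh,Name="Install ADAM"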

After the EMR cluster with Apache Spark is up and running, you can log in to the master node using SSH and install ADAM. ADAM requires Maven to build the packages; after you install Maven, you can clone ADAM from the ADAM GitHub repository.

SSH into the master node of the EMR cluster.

Install Maven by downloading the apache libraries from its website as shown below:

# create a new directory to install maven
mkdir maven
cd maven
# download the maven archive from the apache mirror website
echo "copying the maven zip file from the apache mirror"
wget http://apachemirror.ovidiudan.com/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
echo "unzipping the file"
tar -zxvf apache-maven-3.3.9-bin.tar.gz
# add the extracted Maven binaries to the PATH
echo "exporting MAVEN_HOME"
export PATH=$HOME/maven/apache-maven-3.3.9/bin:$PATH

Before cloning ADAM from GitHub, install Git on the master node of the EMR cluster. This can be done by running the following commands:

# return to the home directory and install git
cd $HOME
sudo yum install git

Clone the ADAM repository from GitHub and begin your build.

# clone the ADAM repository from GitHub
git clone https://github.com/bigdatagenomics/adam.git
cd adam

Finally, begin your build:

export MAVEN_OPTS="-Xmx512m -XX:MaxPermSize=256m"
mvn clean package -DskipTests

On completion, you see a message confirming that ADAM was built successfully.

Analysis of genomic data using ADAM

After installation, ADAM provides a set of functions to transform and analyze genomic data sets. The functions range from actions that count k-mers to conversion operations that convert standard genomics file formats like SAM, BAM, or VCF to ADAM Parquet. A list of ADAM functions can be viewed by invoking adam-submit from the command line.
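
For example, running the wrapper script with no arguments prints the list of available commands (this assumes your working directory is the adam directory where you built the project earlier).

$ ./bin/adam-submit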

For the purposes of this analysis, you use a VCF file from an AWS public data set, specifically the 1000 Genomes Project.

This is a project that aims to build the most detailed map of human genetic variation available. When complete, the 1000 Genomes Project will have genomic data from more than 2,661 people. Amazon S3 hosts the initial pilot data for this project in a public S3 bucket. The latest data set available to everyone hosts data for approximately 1,700 people and is more than 200 TB in size.

From the master node of the EMR cluster, you can connect to the S3 bucket and download the file to the master node. The first step is to copy the file into HDFS so it can be accessed using ADAM. This can be done by running the following commands:

# copy a vcf file from S3 to the master node
$ aws s3 cp s3://1000genomes/phase1/analysis_results/integrated_call_sets/ALL.chr1.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf.gz /home/hadoop/

# unzip the file
$ gunzip ALL.chr1.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf.gz

# copy the file to HDFS
$ hadoop fs -put /home/hadoop/ALL.chr1.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf /user/hadoop/
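
Before ADAM can query the data, the VCF file needs to be converted into ADAM Parquet. A sketch of that conversion using ADAM’s vcf2adam command follows; the output directory name is an assumption chosen to match the path used later in this post, and the exact arguments may differ between ADAM versions.

$ ./bin/adam-submit vcf2adam \
/user/hadoop/ALL.chr1.integrated_phase1_v3.20101123.snps_indels_svs.genotypes.vcf \
/user/hadoop/adamfiles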

The command generates more than 690 ADAM Parquet files. The following screenshot shows a subset of these files.

You can now process the ADAM Parquet files as regular Parquet files in Apache Spark using Scala. ADAM provides its own shell, which is invoked from the command line using adam-shell. In this example, read the ADAM Parquet files as a data frame and print the schema. Then, register the data frame as a temp table and use it to query the genomic data sets.

val gnomeDF = sqlContext.read.parquet("/user/hadoop/adamfiles/")
gnomeDF.printSchema()
gnomeDF.registerTempTable("gnome")
val gnome_data = sqlContext.sql("select count(*) from gnome")
gnome_data.show()

Here’s how to use the ADAM interactive shell to perform a read count and a k-mer count. A k-mer count enumerates all of the substrings of length k in the reads and counts how often each one occurs. This is an important first step toward understanding all of the possible DNA sequences (of length 20) that are contained in this file.

import org.bdgenomics.adam.rdd.ADAMContext._
val reads = sc.loadAlignments("part-r-00000.gz.parquet").cache()
reads.count()
val gnomeDF = sqlContext.read.parquet("part-r-00000.gz.parquet")
gnomeDF.printSchema()
println(reads.first)

/*
The following command cuts reads into _k_-mers, and then counts the number of
occurrences of each _k_-mer
*/
val kmers = reads.adamCountKmers(20).cache()

kmers.count()

Conclusion  

Spark on Amazon EMR provides unique advantages over traditional genomic processing and may become a necessary tool as genomics moves into the scale of population-based studies required for precision medicine. Innovative companies like Human Longevity Inc. have discussed at re:Invent  how they use AWS tools such as Spark on EMR with Amazon Redshift to build a platform that is pushing precision medicine forward.

Of course, simply distributing compute resources will not solve all of the complexities associated with understanding the human condition. There is an old adage that reminds us that nine women cannot make a baby in one month. Often in biology problems, there is a need to wait for one step to complete before the next can begin. This does not necessarily lend itself well to Spark, which benefits from distributing many small tasks at once. There are still many areas, such as sequence assembly, that may not have an easy transition to Spark. However, Spark on Amazon EMR will be a very interesting project to watch on our move toward precision medicine.

If you have questions or suggestions, please leave a comment below.

Special thanks to Angel Pizaro for his help on this post!

———————————–

Related

Extending Seven Bridges Genomics with Amazon Redshift and R

Want to learn more about Big Data or Streaming Data? Check out our Big Data and Streaming data educational pages.