Tag Archives: environment

Kotlin and Groovy JVM Languages with AWS Lambda

Post Syndicated from Juan Villa original https://aws.amazon.com/blogs/compute/kotlin-and-groovy-jvm-languages-with-aws-lambda/


Juan Villa – Partner Solutions Architect

 

When most people hear “Java” they think of Java the programming language. Java is a lot more than a programming language, it also implies a larger ecosystem including the Java Virtual Machine (JVM). Java, the programming language, is just one of the many languages that can be compiled to run on the JVM. Some of the most popular JVM languages, other than Java, are Clojure, Groovy, Scala, Kotlin, JRuby, and Jython (see this link for a list of more JVM languages).

Did you know that you can compile and subsequently run all these languages on AWS Lambda?

AWS Lambda supports the Java 8 runtime, but this does not mean you are limited to the Java language. The Java 8 runtime is capable of running JVM languages such as Kotlin and Groovy once they have been compiled and packaged as a “fat” JAR (a JAR file containing all necessary dependencies and classes bundled in).

In this blog post we’ll work through building AWS Lambda functions in both Kotlin and Groovy programming languages. To compile and package our projects we will use Gradle build tool.

To follow along, please clone the Git repository available at GitHub here. Also, I recommend using an Integrated Development Environment (IDE) such as JetBrain’s IntelliJ IDEA, this is the IDE I used while working on these projects.

Kotlin

Kotlin is a statically-typed JVM language designed and developed by JetBrains (one of our Amazon Partner Network Technology partners) and the open source community. Compared to Java the programming language, Kotlin has additional powerful language features such as: Data Classes, Default Arguments, Extensions, Elvis Operator, and Destructuring Declarations. This is a just a short list of Kotlin’s powerful language features. For a more thorough list of features, and how to use them, refer to the full documentation of the Kotlin language.

Let’s jump right into the code and see what an AWS Lambda function looks like in Kotlin.

package com.aws.blog.jvmlangs.kotlin

import java.io.*
import com.fasterxml.jackson.module.kotlin.*

data class HandlerInput(val who: String)
data class HandlerOutput(val message: String)

class Main {
    val mapper = jacksonObjectMapper()

    fun handler(input: InputStream, output: OutputStream): Unit {
        val inputObj = mapper.readValue<HandlerInput>(input)
        mapper.writeValue(output, HandlerOutput("Hello ${inputObj.who}"))
    }
}

The above example is a very simple Hello World application that accepts as an input a JSON object containing a key called “who” and returns a JSON object containing a key called “message” with a value of “Hello {who}”.

AWS Lambda does not support serializing JSON objects into Kotlin data classes, but don’t worry! AWS Lambda supports passing an input object as a Stream, and also supports an output Stream for returning a result (see this link for more information). Combined with the Input/Output Stream form of the handler function, we are using the Jackson library with a Kotlin extension function to support serialization and deserialization of Kotlin data class types.

To get started with this example, let’s first compile and package the Kotlin project.

git clone https://github.com/awslabs/lambda-kotlin-groovy-example
cd lambda-kotlin-groovy-example/kotlin
./gradlew shadowJar

Once packaged, a JAR file containing all necessary dependencies will be available at “build/libs/ jvmlangs-kotlin-1.0-SNAPSHOT-all.jar”. Now let’s deploy this package to AWS Lambda.

To deploy the lambda function, we will be using the AWS Command Line Interface (CLI). You can find information on how to set up the AWS CLI here. This tool allows you to set up and manage AWS services via the command line.

aws lambda create-function --region us-east-1 --function-name kotlin-hello \
--zip-file fileb://build/libs/jvmlangs-kotlin-1.0-SNAPSHOT-all.jar \
--role arn:aws:iam::<account_id>:role/lambda_basic_execution \
--handler com.aws.blog.jvmlangs.kotlin.Main::handler --runtime java8 \
--timeout 15 --memory-size 128

Once deployed, we can test the function by invoking the lambda function from the CLI.

aws lambda invoke --function-name kotlin-hello --payload '{"who": "AWS Fan"}' output.txt
cat output.txt

If successful, you’ll see an output of “{"message":"Hello AWS Fan"}”.

Groovy

Groovy is an optionally typed JVM language with both dynamic and static typing capabilities. Groovy is currently being supported by the Apache Software Foundation. Like Kotlin, Groovy also packs a lot of powerful features such as: Closures, Dynamic Typing, Collection Literals, String Interpolation, and Elvis Operator. This is just a short list, see the full documentation for a list of features and how to use them.

Once again, let’s jump right into the code.

package com.aws.blog.jvmlangs.groovy

class HandlerInput {
    String who
}
class HandlerOutput {
    String message
}

class Main {
    def handler(HandlerInput input) {
        return new HandlerOutput(message: "Hello ${input.who}")
    }
}

Just like the Kotlin example, we have defined a function that takes a simple JSON object containing a “who” key value and build a response containing a “message” key. Note that in this case we are not using the Input/Output Stream form of the handler function, but rather we are letting AWS Lambda serialize the input JSON object into the type HandlerInput. To accomplish this, AWS Lambda uses the Jackson library and handles the serialization for us.

Let’s go ahead and compile and package this Groovy example.

git clone https://github.com/awslabs/lambda-kotlin-groovy-example
cd lambda-kotlin-groovy-example/groovy
./gradlew shadowJar

Once packaged, a JAR file containing all necessary dependencies will be available at “build/libs/ jvmlangs-groovy-1.0-SNAPSHOT-all.jar”. Now let’s deploy this package to AWS Lambda.

aws lambda create-function --region us-east-1 --function-name groovy-hello \
--zip-file fileb://build/libs/jvmlangs-groovy-1.0-SNAPSHOT-all.jar \
--role arn:aws:iam::<account_id>:role/lambda_basic_execution \
--handler com.aws.blog.jvmlangs.groovy.Main::handler --runtime java8 \
--timeout 15 --memory-size 128

Once deployed, we can test the function by invoking the lambda function from the CLI.

aws lambda invoke --function-name groovy-hello --payload '{"who": "AWS Fan"}' output.txt
cat output.txt

If successful, you’ll see an output of “{"message":"Hello AWS Fan"}”.

Gradle Build Tool

Finally, let’s touch up on how we built the JAR package from the Kotlin and Groovy sources above. To build the JARs we used the Gradle build tool. Gradle builds a project by reading instructions from a file called “build.gradle”. This is a file written in Gradle’s Groovy Domain Specific Langauge (DSL). You can find more information on the gradle build file by looking at their documentation. Let’s take a look at the Gradle build files we used for this post.

For the Kotlin example, this is the build file we used.

buildscript {
    repositories {
        mavenCentral()
        jcenter()
    }
    dependencies {
        classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version"
        classpath "com.github.jengelman.gradle.plugins:shadow:1.2.3"
    }
}

group 'com.aws.blog.jvmlangs.kotlin'
version '1.0-SNAPSHOT'

apply plugin: 'kotlin'
apply plugin: 'com.github.johnrengelman.shadow'

repositories {
    mavenCentral()
}

dependencies {
    compile "org.jetbrains.kotlin:kotlin-stdlib:$kotlin_version"
    compile "com.fasterxml.jackson.module:jackson-module-kotlin:2.8.2"
}

For the Groovy example this is the build file we used.

buildscript {
    repositories {
        jcenter()
    }
    dependencies {
        classpath 'com.github.jengelman.gradle.plugins:shadow:1.2.3'
    }
}

group 'com.aws.blog.jvmlangs.groovy'
version '1.0-SNAPSHOT'

apply plugin: 'groovy'
apply plugin: 'com.github.johnrengelman.shadow'

repositories {
    mavenCentral()
}

dependencies {
    compile 'org.codehaus.groovy:groovy-all:2.3.11'
    testCompile group: 'junit', name: 'junit', version: '4.11'
}

As you can see, the build files for both Kotlin and Groovy files are very similar. For the Kotlin project we define a dependency on the Jackson Kotlin module. Also, for each respective language we include the language supporting libraries (kotlin-stdlib and groovy-all respectively).

In addition, you will notice that we are using a plugin called “shadow”. We use this plugin to package all the project dependencies into one JAR by using the Gradle task “shadowJar”. You can find more information on Shadow in their documentation.

Final Words

Don’t stop here though! Take a look at other JVM languages and get them running on AWS Lambda with the Java 8 runtime. Maybe start with Clojure? or Scala?

Also take a look AWS Lambda Java libraries provided by AWS. They provide interfaces and models to make handling events from event sources easier to handle.

A Raspbian desktop update with some new programming tools

Post Syndicated from Simon Long original https://www.raspberrypi.org/blog/a-raspbian-desktop-update-with-some-new-programming-tools/

Today we’ve released another update to the Raspbian desktop. In addition to the usual small tweaks and bug fixes, the big new changes are the inclusion of an offline version of Scratch 2.0, and of Thonny (a user-friendly IDE for Python which is excellent for beginners). We’ll look at all the changes in this post, but let’s start with the biggest…

Scratch 2.0 for Raspbian

Scratch is one of the most popular pieces of software on Raspberry Pi. This is largely due to the way it makes programming accessible – while it is simple to learn, it covers many of the concepts that are used in more advanced languages. Scratch really does provide a great introduction to programming for all ages.

Raspbian ships with the original version of Scratch, which is now at version 1.4. A few years ago, though, the Scratch team at the MIT Media Lab introduced the new and improved Scratch version 2.0, and ever since we’ve had numerous requests to offer it on the Pi.

There was, however, a problem with this. The original version of Scratch was written in a language called Squeak, which could run on the Pi in a Squeak interpreter. Scratch 2.0, however, was written in Flash, and was designed to run from a remote site in a web browser. While this made Scratch 2.0 a cross-platform application, which you could run without installing any Scratch software, it also meant that you had to be able to run Flash on your computer, and that you needed to be connected to the internet to program in Scratch.

We worked with Adobe to include the Pepper Flash plugin in Raspbian, which enables Flash sites to run in the Chromium browser. This addressed the first of these problems, so the Scratch 2.0 website has been available on Pi for a while. However, it still needed an internet connection to run, which wasn’t ideal in many circumstances. We’ve been working with the Scratch team to get an offline version of Scratch 2.0 running on Pi.

Screenshot of Scratch on Raspbian

The Scratch team had created a website to enable developers to create hardware and software extensions for Scratch 2.0; this provided a version of the Flash code for the Scratch editor which could be modified to run locally rather than over the internet. We combined this with a program called Electron, which effectively wraps up a local web page into a standalone application. We ended up with the Scratch 2.0 application that you can find in the Programming section of the main menu.

Physical computing with Scratch 2.0

We didn’t stop there though. We know that people want to use Scratch for physical computing, and it has always been a bit awkward to access GPIO pins from Scratch. In our Scratch 2.0 application, therefore, there is a custom extension which allows the user to control the Pi’s GPIO pins without difficulty. Simply click on ‘More Blocks’, choose ‘Add an Extension’, and select ‘Pi GPIO’. This loads two new blocks, one to read and one to write the state of a GPIO pin.

Screenshot of new Raspbian iteration of Scratch 2, featuring GPIO pin control blocks.

The Scratch team kindly allowed us to include all the sprites, backdrops, and sounds from the online version of Scratch 2.0. You can also use the Raspberry Pi Camera Module to create new sprites and backgrounds.

This first release works well, although it can be slow for some operations; this is largely unavoidable for Flash code running under Electron. Bear in mind that you will need to have the Pepper Flash plugin installed (which it is by default on standard Raspbian images). As Pepper Flash is only compatible with the processor in the Pi 2.0 and Pi 3, it is unfortunately not possible to run Scratch 2.0 on the Pi Zero or the original models of the Pi.

We hope that this makes Scratch 2.0 a more practical proposition for many users than it has been to date. Do let us know if you hit any problems, though!

Thonny: a more user-friendly IDE for Python

One of the paths from Scratch to ‘real’ programming is through Python. We know that the transition can be awkward, and this isn’t helped by the tools available for learning Python. It’s fair to say that IDLE, the Python IDE, isn’t the most popular piece of software ever written…

Earlier this year, we reviewed every Python IDE that we could find that would run on a Raspberry Pi, in an attempt to see if there was something better out there than IDLE. We wanted to find something that was easier for beginners to use but still useful for experienced Python programmers. We found one program, Thonny, which stood head and shoulders above all the rest. It’s a really user-friendly IDE, which still offers useful professional features like single-stepping of code and inspection of variables.

Screenshot of Thonny IDE in Raspbian

Thonny was created at the University of Tartu in Estonia; we’ve been working with Aivar Annamaa, the lead developer, on getting it into Raspbian. The original version of Thonny works well on the Pi, but because the GUI is written using Python’s default GUI toolkit, Tkinter, the appearance clashes with the rest of the Raspbian desktop, most of which is written using the GTK toolkit. We made some changes to bring things like fonts and graphics into line with the appearance of our other apps, and Aivar very kindly took that work and converted it into a theme package that could be applied to Thonny.

Due to the limitations of working within Tkinter, the result isn’t exactly like a native GTK application, but it’s pretty close. It’s probably good enough for anyone who isn’t a picky UI obsessive like me, anyway! Have a look at the Thonny webpage to see some more details of all the cool features it offers. We hope that having a more usable environment will help to ease the transition from graphical languages like Scratch into ‘proper’ languages like Python.

New icons

Other than these two new packages, this release is mostly bug fixes and small version bumps. One thing you might notice, though, is that we’ve made some tweaks to our custom icon set. We wondered if the icons might look better with slightly thinner outlines. We tried it, and they did: we hope you prefer them too.

Downloading the new image

You can either download a new image from the Downloads page, or you can use apt to update:

sudo apt-get update
sudo apt-get dist-upgrade

To install Scratch 2.0:

sudo apt-get install scratch2

To install Thonny:

sudo apt-get install python3-thonny

One more thing…

Before Christmas, we released an experimental version of the desktop running on Debian for x86-based computers. We were slightly taken aback by how popular it turned out to be! This made us realise that this was something we were going to need to support going forward. We’ve decided we’re going to try to make all new desktop releases for both Pi and x86 from now on.

The version of this we released last year was a live image that could run from a USB stick. Many people asked if we could make it permanently installable, so this version includes an installer. This uses the standard Debian install process, so it ought to work on most machines. I should stress, though, that we haven’t been able to test on every type of hardware, so there may be issues on some computers. Please be sure to back up your hard drive before installing it. Unlike the live image, this will erase and reformat your hard drive, and you will lose anything that is already on it!

You can still boot the image as a live image if you don’t want to install it, and it will create a persistence partition on the USB stick so you can save data. Just select ‘Run with persistence’ from the boot menu. To install, choose either ‘Install’ or ‘Graphical install’ from the same menu. The Debian installer will then walk you through the install process.

You can download the latest x86 image (which includes both Scratch 2.0 and Thonny) from here or here for a torrent file.

One final thing

This version of the desktop is based on Debian Jessie. Some of you will be aware that a new stable version of Debian (called Stretch) was released last week. Rest assured – we have been working on porting everything across to Stretch for some time now, and we will have a Stretch release ready some time over the summer.

The post A Raspbian desktop update with some new programming tools appeared first on Raspberry Pi.

How to Create an AMI Builder with AWS CodeBuild and HashiCorp Packer – Part 2

Post Syndicated from Heitor Lessa original https://aws.amazon.com/blogs/devops/how-to-create-an-ami-builder-with-aws-codebuild-and-hashicorp-packer-part-2/

Written by AWS Solutions Architects Jason Barto and Heitor Lessa

 
In Part 1 of this post, we described how AWS CodeBuild, AWS CodeCommit, and HashiCorp Packer can be used to build an Amazon Machine Image (AMI) from the latest version of Amazon Linux. In this post, we show how to use AWS CodePipeline, AWS CloudFormation, and Amazon CloudWatch Events to continuously ship new AMIs. We use Ansible by Red Hat to harden the OS on the AMIs through a well-known set of security controls outlined by the Center for Internet Security in its CIS Amazon Linux Benchmark.

You’ll find the source code for this post in our GitHub repo.

At the end of this post, we will have the following architecture:

Requirements

 
To follow along, you will need Git and a text editor. Make sure Git is configured to work with AWS CodeCommit, as described in Part 1.

Technologies

 
In addition to the services and products used in Part 1 of this post, we also use these AWS services and third-party software:

AWS CloudFormation gives developers and systems administrators an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion.

Amazon CloudWatch Events enables you to react selectively to events in the cloud and in your applications. Specifically, you can create CloudWatch Events rules that match event patterns, and take actions in response to those patterns.

AWS CodePipeline is a continuous integration and continuous delivery service for fast and reliable application and infrastructure updates. AWS CodePipeline builds, tests, and deploys your code every time there is a code change, based on release process models you define.

Amazon SNS is a fast, flexible, fully managed push notification service that lets you send individual messages or to fan out messages to large numbers of recipients. Amazon SNS makes it simple and cost-effective to send push notifications to mobile device users or email recipients. The service can even send messages to other distributed services.

Ansible is a simple IT automation system that handles configuration management, application deployment, cloud provisioning, ad-hoc task-execution, and multinode orchestration.

Getting Started

 
We use CloudFormation to bootstrap the following infrastructure:

Component Purpose
AWS CodeCommit repository Git repository where the AMI builder code is stored.
S3 bucket Build artifact repository used by AWS CodePipeline and AWS CodeBuild.
AWS CodeBuild project Executes the AWS CodeBuild instructions contained in the build specification file.
AWS CodePipeline pipeline Orchestrates the AMI build process, triggered by new changes in the AWS CodeCommit repository.
SNS topic Notifies subscribed email addresses when an AMI build is complete.
CloudWatch Events rule Defines how the AMI builder should send a custom event to notify an SNS topic.
Region AMI Builder Launch Template
N. Virginia (us-east-1)
Ireland (eu-west-1)

After launching the CloudFormation template linked here, we will have a pipeline in the AWS CodePipeline console. (Failed at this stage simply means we don’t have any data in our newly created AWS CodeCommit Git repository.)

Next, we will clone the newly created AWS CodeCommit repository.

If this is your first time connecting to a AWS CodeCommit repository, please see instructions in our documentation on Setup steps for HTTPS Connections to AWS CodeCommit Repositories.

To clone the AWS CodeCommit repository (console)

  1. From the AWS Management Console, open the AWS CloudFormation console.
  2. Choose the AMI-Builder-Blogpost stack, and then choose Output.
  3. Make a note of the Git repository URL.
  4. Use git to clone the repository.

For example: git clone https://git-codecommit.eu-west-1.amazonaws.com/v1/repos/AMI-Builder_repo

To clone the AWS CodeCommit repository (CLI)

# Retrieve CodeCommit repo URL
git_repo=$(aws cloudformation describe-stacks --query 'Stacks[0].Outputs[?OutputKey==`GitRepository`].OutputValue' --output text --stack-name "AMI-Builder-Blogpost")

# Clone repository locally
git clone ${git_repo}

Bootstrap the Repo with the AMI Builder Structure

 
Now that our infrastructure is ready, download all the files and templates required to build the AMI.

Your local Git repo should have the following structure:

.
├── ami_builder_event.json
├── ansible
├── buildspec.yml
├── cloudformation
├── packer_cis.json

Next, push these changes to AWS CodeCommit, and then let AWS CodePipeline orchestrate the creation of the AMI:

git add .
git commit -m "My first AMI"
git push origin master

AWS CodeBuild Implementation Details

 
While we wait for the AMI to be created, let’s see what’s changed in our AWS CodeBuild buildspec.yml file:

...
phases:
  ...
  build:
    commands:
      ...
      - ./packer build -color=false packer_cis.json | tee build.log
  post_build:
    commands:
      - egrep "${AWS_REGION}\:\sami\-" build.log | cut -d' ' -f2 > ami_id.txt
      # Packer doesn't return non-zero status; we must do that if Packer build failed
      - test -s ami_id.txt || exit 1
      - sed -i.bak "s/<<AMI-ID>>/$(cat ami_id.txt)/g" ami_builder_event.json
      - aws events put-events --entries file://ami_builder_event.json
      ...
artifacts:
  files:
    - ami_builder_event.json
    - build.log
  discard-paths: yes

In the build phase, we capture Packer output into a file named build.log. In the post_build phase, we take the following actions:

  1. Look up the AMI ID created by Packer and save its findings to a temporary file (ami_id.txt).
  2. Forcefully make AWS CodeBuild to fail if the AMI ID (ami_id.txt) is not found. This is required because Packer doesn’t fail if something goes wrong during the AMI creation process. We have to tell AWS CodeBuild to stop by informing it that an error occurred.
  3. If an AMI ID is found, we update the ami_builder_event.json file and then notify CloudWatch Events that the AMI creation process is complete.
  4. CloudWatch Events publishes a message to an SNS topic. Anyone subscribed to the topic will be notified in email that an AMI has been created.

Lastly, the new artifacts phase instructs AWS CodeBuild to upload files built during the build process (ami_builder_event.json and build.log) to the S3 bucket specified in the Outputs section of the CloudFormation template. These artifacts can then be used as an input artifact in any later stage in AWS CodePipeline.

For information about customizing the artifacts sequence of the buildspec.yml, see the Build Specification Reference for AWS CodeBuild.

CloudWatch Events Implementation Details

 
CloudWatch Events allow you to extend the AMI builder to not only send email after the AMI has been created, but to hook up any of the supported targets to react to the AMI builder event. This event publication means you can decouple from Packer actions you might take after AMI completion and plug in other actions, as you see fit.

For more information about targets in CloudWatch Events, see the CloudWatch Events API Reference.

In this case, CloudWatch Events should receive the following event, match it with a rule we created through CloudFormation, and publish a message to SNS so that you can receive an email.

Example CloudWatch custom event

[
        {
            "Source": "com.ami.builder",
            "DetailType": "AmiBuilder",
            "Detail": "{ \"AmiStatus\": \"Created\"}",
            "Resources": [ "ami-12cd5guf" ]
        }
]

Cloudwatch Events rule

{
  "detail-type": [
    "AmiBuilder"
  ],
  "source": [
    "com.ami.builder"
  ],
  "detail": {
    "AmiStatus": [
      "Created"
    ]
  }
}

Example SNS message sent in email

{
    "version": "0",
    "id": "f8bdede0-b9d7...",
    "detail-type": "AmiBuilder",
    "source": "com.ami.builder",
    "account": "<<aws_account_number>>",
    "time": "2017-04-28T17:56:40Z",
    "region": "eu-west-1",
    "resources": ["ami-112cd5guf "],
    "detail": {
        "AmiStatus": "Created"
    }
}

Packer Implementation Details

 
In addition to the build specification file, there are differences between the current version of the HashiCorp Packer template (packer_cis.json) and the one used in Part 1.

Variables

  "variables": {
    "vpc": "{{env `BUILD_VPC_ID`}}",
    "subnet": "{{env `BUILD_SUBNET_ID`}}",
         “ami_name”: “Prod-CIS-Latest-AMZN-{{isotime \”02-Jan-06 03_04_05\”}}”
  },
  • ami_name: Prefixes a name used by Packer to tag resources during the Builders sequence.
  • vpc and subnet: Environment variables defined by the CloudFormation stack parameters.

We no longer assume a default VPC is present and instead use the VPC and subnet specified in the CloudFormation parameters. CloudFormation configures the AWS CodeBuild project to use these values as environment variables. They are made available throughout the build process.

That allows for more flexibility should you need to change which VPC and subnet will be used by Packer to launch temporary resources.

Builders

  "builders": [{
    ...
    "ami_name": “{{user `ami_name`| clean_ami_name}}”,
    "tags": {
      "Name": “{{user `ami_name`}}”,
    },
    "run_tags": {
      "Name": “{{user `ami_name`}}",
    },
    "run_volume_tags": {
      "Name": “{{user `ami_name`}}",
    },
    "snapshot_tags": {
      "Name": “{{user `ami_name`}}",
    },
    ...
    "vpc_id": "{{user `vpc` }}",
    "subnet_id": "{{user `subnet` }}"
  }],

We now have new properties (*_tag) and a new function (clean_ami_name) and launch temporary resources in a VPC and subnet specified in the environment variables. AMI names can only contain a certain set of ASCII characters. If the input in project deviates from the expected characters (for example, includes whitespace or slashes), Packer’s clean_ami_name function will fix it.

For more information, see functions on the HashiCorp Packer website.

Provisioners

  "provisioners": [
    {
        "type": "shell",
        "inline": [
            "sudo pip install ansible"
        ]
    }, 
    {
        "type": "ansible-local",
        "playbook_file": "ansible/playbook.yaml",
        "role_paths": [
            "ansible/roles/common"
        ],
        "playbook_dir": "ansible",
        "galaxy_file": "ansible/requirements.yaml"
    },
    {
      "type": "shell",
      "inline": [
        "rm .ssh/authorized_keys ; sudo rm /root/.ssh/authorized_keys"
      ]
    }

We used shell provisioner to apply OS patches in Part 1. Now, we use shell to install Ansible on the target machine and ansible-local to import, install, and execute Ansible roles to make our target machine conform to our standards.

Packer uses shell to remove temporary keys before it creates an AMI from the target and temporary EC2 instance.

Ansible Implementation Details

 
Ansible provides OS patching through a custom Common role that can be easily customized for other tasks.

CIS Benchmark and Cloudwatch Logs are implemented through two Ansible third-party roles that are defined in ansible/requirements.yaml as seen in the Packer template.

The Ansible provisioner uses Ansible Galaxy to download these roles onto the target machine and execute them as instructed by ansible/playbook.yaml.

For information about how these components are organized, see the Playbook Roles and Include Statements in the Ansible documentation.

The following Ansible playbook (ansible</playbook.yaml) controls the execution order and custom properties:

---
- hosts: localhost
  connection: local
  gather_facts: true    # gather OS info that is made available for tasks/roles
  become: yes           # majority of CIS tasks require root
  vars:
    # CIS Controls whitepaper:  http://bit.ly/2mGAmUc
    # AWS CIS Whitepaper:       http://bit.ly/2m2Ovrh
    cis_level_1_exclusions:
    # 3.4.2 and 3.4.3 effectively blocks access to all ports to the machine
    ## This can break automation; ignoring it as there are stronger mechanisms than that
      - 3.4.2 
      - 3.4.3
    # CloudWatch Logs will be used instead of Rsyslog/Syslog-ng
    ## Same would be true if any other software doesn't support Rsyslog/Syslog-ng mechanisms
      - 4.2.1.4
      - 4.2.2.4
      - 4.2.2.5
    # Autofs is not installed in newer versions, let's ignore
      - 1.1.19
    # Cloudwatch Logs role configuration
    logs:
      - file: /var/log/messages
        group_name: "system_logs"
  roles:
    - common
    - anthcourtney.cis-amazon-linux
    - dharrisio.aws-cloudwatch-logs-agent

Both third-party Ansible roles can be easily configured through variables (vars). We use Ansible playbook variables to exclude CIS controls that don’t apply to our case and to instruct the CloudWatch Logs agent to stream the /var/log/messages log file to CloudWatch Logs.

If you need to add more OS or application logs, you can easily duplicate the playbook and make changes. The CloudWatch Logs agent will ship configured log messages to CloudWatch Logs.

For more information about parameters you can use to further customize third-party roles, download Ansible roles for the Cloudwatch Logs Agent and CIS Amazon Linux from the Galaxy website.

Committing Changes

 
Now that Ansible and CloudWatch Events are configured as a part of the build process, commiting any changes to the AWS CodeComit Git Repository will triger a new AMI build process that can be followed through the AWS CodePipeline console.

When the build is complete, an email will be sent to the email address you provided as a part of the CloudFormation stack deployment. The email serves as notification that an AMI has been built and is ready for use.

Summary

 
We used AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, Packer, and Ansible to build a pipeline that continuously builds new, hardened CIS AMIs. We used Amazon SNS so that email addresses subscribed to a SNS topic are notified upon completion of the AMI build.

By treating our AMI creation process as code, we can iterate and track changes over time. In this way, it’s no different from a software development workflow. With that in mind, software patches, OS configuration, and logs that need to be shipped to a central location are only a git commit away.

Next Steps

 
Here are some ideas to extend this AMI builder:

  • Hook up a Lambda function in Cloudwatch Events to update EC2 Auto Scaling configuration upon completion of the AMI build.
  • Use AWS CodePipeline parallel steps to build multiple Packer images.
  • Add a commit ID as a tag for the AMI you created.
  • Create a scheduled Lambda function through Cloudwatch Events to clean up old AMIs based on timestamp (name or additional tag).
  • Implement Windows support for the AMI builder.
  • Create a cross-account or cross-region AMI build.

Cloudwatch Events allow the AMI builder to decouple AMI configuration and creation so that you can easily add your own logic using targets (AWS Lambda, Amazon SQS, Amazon SNS) to add events or recycle EC2 instances with the new AMI.

If you have questions or other feedback, feel free to leave it in the comments or contribute to the AMI Builder repo on GitHub.

[$] Memory use in CPython and MicroPython

Post Syndicated from jake original https://lwn.net/Articles/725508/rss

At PyCon 2017, Kavya Joshi looked
at some of the differences between the Python reference implementation
(known as “CPython”) and
that of MicroPython. In particular,
she described the differences in memory use and handling between the two.
Those differences are
part of
what allows MicroPython to run on the severely memory-constrained
microcontrollers it targets—an environment that could never support CPython.

MPAA & RIAA Demand Tough Copyright Standards in NAFTA Negotiations

Post Syndicated from Andy original https://torrentfreak.com/mpaa-riaa-demand-tough-copyright-standards-in-nafta-negotiations-170621/

The North American Free Trade Agreement (NAFTA) between the United States, Canada, and Mexico was negotiated more than 25 years ago. With a quarter of a decade of developments to contend with, the United States wants to modernize.

“While our economy and U.S. businesses have changed considerably over that period, NAFTA has not,” the government says.

With this in mind, the US requested comments from interested parties seeking direction for negotiation points. With those comments now in, groups like the MPAA and RIAA have been making their positions known. It’s no surprise that intellectual property enforcement is high on the agenda.

“Copyright is the lifeblood of the U.S. motion picture and television industry. As such, MPAA places high priority on securing strong protection and enforcement disciplines in the intellectual property chapters of trade agreements,” the MPAA writes in its submission.

“Strong IPR protection and enforcement are critical trade priorities for the music industry. With IPR, we can create good jobs, make significant contributions to U.S. economic growth and security, invest in artists and their creativity, and drive technological innovation,” the RIAA notes.

While both groups have numerous demands, it’s clear that each seeks an environment where not only infringers can be held liable, but also Internet platforms and services.

For the RIAA, there is a big focus on the so-called ‘Value Gap’, a phenomenon found on user-uploaded content sites like YouTube that are able to offer infringing content while avoiding liability due to Section 512 of the DMCA.

“Today, user-uploaded content services, which have developed sophisticated on-demand music platforms, use this as a shield to avoid licensing music on fair terms like other digital services, claiming they are not legally responsible for the music they distribute on their site,” the RIAA writes.

“Services such as Apple Music, TIDAL, Amazon, and Spotify are forced to compete with services that claim they are not liable for the music they distribute.”

But if sites like YouTube are exercising their rights while acting legally under current US law, how can partners Canada and Mexico do any better? For the RIAA, that can be achieved by holding them to standards envisioned by the group when the DMCA was passed, not how things have panned out since.

Demanding that negotiators “protect the original intent” of safe harbor, the RIAA asks that a “high-level and high-standard service provider liability provision” is pursued. This, the music group says, should only be available to “passive intermediaries without requisite knowledge of the infringement on their platforms, and inapplicable to services actively engaged in communicating to the public.”

In other words, make sure that YouTube and similar sites won’t enjoy the same level of safe harbor protection as they do today.

The RIAA also requires any negotiated safe harbor provisions in NAFTA to be flexible in the event that the DMCA is tightened up in response to the ongoing safe harbor rules study.

In any event, NAFTA should not “support interpretations that no longer reflect today’s digital economy and threaten the future of legitimate and sustainable digital trade,” the RIAA states.

For the MPAA, Section 512 is also perceived as a problem. While noting that the original intent was to foster a system of shared responsibility between copyright owners and service providers, the MPAA says courts have subsequently let copyright holders down. Like the RIAA, the MPAA also suggests that Canada and Mexico can be held to higher standards.

“We recommend a new approach to this important trade policy provision by moving to high-level language that establishes intermediary liability and appropriate limitations on liability. This would be fully consistent with U.S. law and avoid the same misinterpretations by policymakers and courts overseas,” the MPAA writes.

“In so doing, a modernized NAFTA would be consistent with Trade Promotion Authority’s negotiating objective of ‘ensuring that standards of protection and enforcement keep pace with technological developments’.”

The MPAA also has some specific problems with Mexico, including unauthorized camcording. The Hollywood group says that 85 illicit audio and video recordings of films were linked to Mexican theaters in 2016. However, recording is not currently a criminal offense in Mexico.

Another issue for the MPAA is that criminal sanctions for commercial scale infringement are only available if the infringement is for profit.

“This has hampered enforcement against the above-discussed camcording problem but also against online infringement, such as peer-to-peer piracy, that may be on a scale that is immensely harmful to U.S. rightsholders but nonetheless occur without profit by the infringer,” the MPAA writes.

“The modernized NAFTA like other U.S. bilateral free trade agreements must provide for criminal sanctions against commercial scale infringements without proof of profit motive.”

Also of interest are the MPAA’s complaints against Mexico’s telecoms laws. Unlike in the US and many countries in Europe, Mexico’s ISPs are forbidden to hand out their customers’ personal details to rights holders looking to sue. This, the MPAA says, needs to change.

The submissions from the RIAA and MPAA can be found here and here (pdf)

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Building Loosely Coupled, Scalable, C# Applications with Amazon SQS and Amazon SNS

Post Syndicated from Tara Van Unen original https://aws.amazon.com/blogs/compute/building-loosely-coupled-scalable-c-applications-with-amazon-sqs-and-amazon-sns/

 
Stephen Liedig, Solutions Architect

 

One of the many challenges professional software architects and developers face is how to make cloud-native applications scalable, fault-tolerant, and highly available.

Fundamental to your project success is understanding the importance of making systems highly cohesive and loosely coupled. That means considering the multi-dimensional facets of system coupling to support the distributed nature of the applications that you are building for the cloud.

By that, I mean addressing not only the application-level coupling (managing incoming and outgoing dependencies), but also considering the impacts of of platform, spatial, and temporal coupling of your systems. Platform coupling relates to the interoperability, or lack thereof, of heterogeneous systems components. Spatial coupling deals with managing components at a network topology level or protocol level. Temporal, or runtime coupling, refers to the ability of a component within your system to do any kind of meaningful work while it is performing a synchronous, blocking operation.

The AWS messaging services, Amazon SQS and Amazon SNS, help you deal with these forms of coupling by providing mechanisms for:

  • Reliable, durable, and fault-tolerant delivery of messages between application components
  • Logical decomposition of systems and increased autonomy of components
  • Creating unidirectional, non-blocking operations, temporarily decoupling system components at runtime
  • Decreasing the dependencies that components have on each other through standard communication and network channels

Following on the recent topic, Building Scalable Applications and Microservices: Adding Messaging to Your Toolbox, in this post, I look at some of the ways you can introduce SQS and SNS into your architectures to decouple your components, and show how you can implement them using C#.

Walkthrough

To illustrate some of these concepts, consider a web application that processes customer orders. As good architects and developers, you have followed best practices and made your application scalable and highly available. Your solution included implementing load balancing, dynamic scaling across multiple Availability Zones, and persisting orders in a Multi-AZ Amazon RDS database instance, as in the following diagram.


In this example, the application is responsible for handling and persisting the order data, as well as dealing with increases in traffic for popular items.

One potential point of vulnerability in the order processing workflow is in saving the order in the database. The business expects that every order has been persisted into the database. However, any potential deadlock, race condition, or network issue could cause the persistence of the order to fail. Then, the order is lost with no recourse to restore the order.

With good logging capability, you may be able to identify when an error occurred and which customer’s order failed. This wouldn’t allow you to “restore” the transaction, and by that stage, your customer is no longer your customer.

As illustrated in the following diagram, introducing an SQS queue helps improve your ordering application. Using the queue isolates the processing logic into its own component and runs it in a separate process from the web application. This, in turn, allows the system to be more resilient to spikes in traffic, while allowing work to be performed only as fast as necessary in order to manage costs.


In addition, you now have a mechanism for persisting orders as messages (with the queue acting as a temporary database), and have moved the scope of your transaction with your database further down the stack. In the event of an application exception or transaction failure, this ensures that the order processing can be retired or redirected to the Amazon SQS Dead Letter Queue (DLQ), for re-processing at a later stage. (See the recent post, Using Amazon SQS Dead-Letter Queues to Control Message Failure, for more information on dead-letter queues.)

Scaling the order processing nodes

This change allows you now to scale the web application frontend independently from the processing nodes. The frontend application can continue to scale based on metrics such as CPU usage, or the number of requests hitting the load balancer. Processing nodes can scale based on the number of orders in the queue. Here is an example of scale-in and scale-out alarms that you would associate with the scaling policy.

Scale-out Alarm

aws cloudwatch put-metric-alarm --alarm-name AddCapacityToCustomerOrderQueue --metric-name ApproximateNumberOfMessagesVisible --namespace "AWS/SQS" 
--statistic Average --period 300 --threshold 3 --comparison-operator GreaterThanOrEqualToThreshold --dimensions Name=QueueName,Value=customer-orders
--evaluation-periods 2 --alarm-actions <arn of the scale-out autoscaling policy>

Scale-in Alarm

aws cloudwatch put-metric-alarm --alarm-name RemoveCapacityFromCustomerOrderQueue --metric-name ApproximateNumberOfMessagesVisible --namespace "AWS/SQS" 
 --statistic Average --period 300 --threshold 1 --comparison-operator LessThanOrEqualToThreshold --dimensions Name=QueueName,Value=customer-orders
 --evaluation-periods 2 --alarm-actions <arn of the scale-in autoscaling policy>

In the above example, use the ApproximateNumberOfMessagesVisible metric to discover the queue length and drive the scaling policy of the Auto Scaling group. Another useful metric is ApproximateAgeOfOldestMessage, when applications have time-sensitive messages and developers need to ensure that messages are processed within a specific time period.

Scaling the order processing implementation

On top of scaling at an infrastructure level using Auto Scaling, make sure to take advantage of the processing power of your Amazon EC2 instances by using as many of the available threads as possible. There are several ways to implement this. In this post, we build a Windows service that uses the BackgroundWorker class to process the messages from the queue.

Here’s a closer look at the implementation. In the first section of the consuming application, use a loop to continually poll the queue for new messages, and construct a ReceiveMessageRequest variable.

public static void PollQueue()
{
    while (_running)
    {
        Task<ReceiveMessageResponse> receiveMessageResponse;

        // Pull messages off the queue
        using (var sqs = new AmazonSQSClient())
        {
            const int maxMessages = 10;  // 1-10

            //Receiving a message
            var receiveMessageRequest = new ReceiveMessageRequest
            {
                // Get URL from Configuration
                QueueUrl = _queueUrl, 
                // The maximum number of messages to return. 
                // Fewer messages might be returned. 
                MaxNumberOfMessages = maxMessages, 
                // A list of attributes that need to be returned with message.
                AttributeNames = new List<string> { "All" },
                // Enable long polling. 
                // Time to wait for message to arrive on queue.
                WaitTimeSeconds = 5 
            };

            receiveMessageResponse = sqs.ReceiveMessageAsync(receiveMessageRequest);
        }

The WaitTimeSeconds property of the ReceiveMessageRequest specifies the duration (in seconds) that the call waits for a message to arrive in the queue before returning a response to the calling application. There are a few benefits to using long polling:

  • It reduces the number of empty responses by allowing SQS to wait until a message is available in the queue before sending a response.
  • It eliminates false empty responses by querying all (rather than a limited number) of the servers.
  • It returns messages as soon any message becomes available.

For more information, see Amazon SQS Long Polling.

After you have returned messages from the queue, you can start to process them by looping through each message in the response and invoking a new BackgroundWorker thread.

// Process messages
if (receiveMessageResponse.Result.Messages != null)
{
    foreach (var message in receiveMessageResponse.Result.Messages)
    {
        Console.WriteLine("Received SQS message, starting worker thread");

        // Create background worker to process message
        BackgroundWorker worker = new BackgroundWorker();
        worker.DoWork += (obj, e) => ProcessMessage(message);
        worker.RunWorkerAsync();
    }
}
else
{
    Console.WriteLine("No messages on queue");
}

The event handler, ProcessMessage, is where you implement business logic for processing orders. It is important to have a good understanding of how long a typical transaction takes so you can set a message VisibilityTimeout that is long enough to complete your operation. If order processing takes longer than the specified timeout period, the message becomes visible on the queue. Other nodes may pick it and process the same order twice, leading to unintended consequences.

Handling Duplicate Messages

In order to manage duplicate messages, seek to make your processing application idempotent. In mathematics, idempotent describes a function that produces the same result if it is applied to itself:

f(x) = f(f(x))

No matter how many times you process the same message, the end result is the same (definition from Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions, Hohpe and Wolf, 2004).

There are several strategies you could apply to achieve this:

  • Create messages that have inherent idempotent characteristics. That is, they are non-transactional in nature and are unique at a specified point in time. Rather than saying “place new order for Customer A,” which adds a duplicate order to the customer, use “place order <orderid> on <timestamp> for Customer A,” which creates a single order no matter how often it is persisted.
  • Deliver your messages via an Amazon SQS FIFO queue, which provides the benefits of message sequencing, but also mechanisms for content-based deduplication. You can deduplicate using the MessageDeduplicationId property on the SendMessage request or by enabling content-based deduplication on the queue, which generates a hash for MessageDeduplicationId, based on the content of the message, not the attributes.
var sendMessageRequest = new SendMessageRequest
{
    QueueUrl = _queueUrl,
    MessageBody = JsonConvert.SerializeObject(order),
    MessageGroupId = Guid.NewGuid().ToString("N"),
    MessageDeduplicationId = Guid.NewGuid().ToString("N")
};
  • If using SQS FIFO queues is not an option, keep a message log of all messages attributes processed for a specified period of time, as an alternative to message deduplication on the receiving end. Verifying the existence of the message in the log before processing the message adds additional computational overhead to your processing. This can be minimized through low latency persistence solutions such as Amazon DynamoDB. Bear in mind that this solution is dependent on the successful, distributed transaction of the message and the message log.

Handling exceptions

Because of the distributed nature of SQS queues, it does not automatically delete the message. Therefore, you must explicitly delete the message from the queue after processing it, using the message ReceiptHandle property (see the following code example).

However, if at any stage you have an exception, avoid handling it as you normally would. The intention is to make sure that the message ends back on the queue, so that you can gracefully deal with intermittent failures. Instead, log the exception to capture diagnostic information, and swallow it.

By not explicitly deleting the message from the queue, you can take advantage of the VisibilityTimeout behavior described earlier. Gracefully handle the message processing failure and make the unprocessed message available to other nodes to process.

In the event that subsequent retries fail, SQS automatically moves the message to the configured DLQ after the configured number of receives has been reached. You can further investigate why the order process failed. Most importantly, the order has not been lost, and your customer is still your customer.

private static void ProcessMessage(Message message)
{
    using (var sqs = new AmazonSQSClient())
    {
        try
        {
            Console.WriteLine("Processing message id: {0}", message.MessageId);

            // Implement messaging processing here
            // Ensure no downstream resource contention (parallel processing)
            // <your order processing logic in here…>
            Console.WriteLine("{0} Thread {1}: {2}", DateTime.Now.ToString("s"), Thread.CurrentThread.ManagedThreadId, message.MessageId);
            
            // Delete the message off the queue. 
            // Receipt handle is the identifier you must provide 
            // when deleting the message.
            var deleteRequest = new DeleteMessageRequest(_queueName, message.ReceiptHandle);
            sqs.DeleteMessageAsync(deleteRequest);
            Console.WriteLine("Processed message id: {0}", message.MessageId);

        }
        catch (Exception ex)
        {
            // Do nothing.
            // Swallow exception, message will return to the queue when 
            // visibility timeout has been exceeded.
            Console.WriteLine("Could not process message due to error. Exception: {0}", ex.Message);
        }
    }
}

Using SQS to adapt to changing business requirements

One of the benefits of introducing a message queue is that you can accommodate new business requirements without dramatically affecting your application.

If, for example, the business decided that all orders placed over $5000 are to be handled as a priority, you could introduce a new “priority order” queue. The way the orders are processed does not change. The only significant change to the processing application is to ensure that messages from the “priority order” queue are processed before the “standard order” queue.

The following diagram shows how this logic could be isolated in an “order dispatcher,” whose only purpose is to route order messages to the appropriate queue based on whether the order exceeds $5000. Nothing on the web application or the processing nodes changes other than the target queue to which the order is sent. The rates at which orders are processed can be achieved by modifying the poll rates and scalability settings that I have already discussed.

Extending the design pattern with Amazon SNS

Amazon SNS supports reliable publish-subscribe (pub-sub) scenarios and push notifications to known endpoints across a wide variety of protocols. It eliminates the need to periodically check or poll for new information and updates. SNS supports:

  • Reliable storage of messages for immediate or delayed processing
  • Publish / subscribe – direct, broadcast, targeted “push” messaging
  • Multiple subscriber protocols
  • Amazon SQS, HTTP, HTTPS, email, SMS, mobile push, AWS Lambda

With these capabilities, you can provide parallel asynchronous processing of orders in the system and extend it to support any number of different business use cases without affecting the production environment. This is commonly referred to as a “fanout” scenario.

Rather than your web application pushing orders to a queue for processing, send a notification via SNS. The SNS messages are sent to a topic and then replicated and pushed to multiple SQS queues and Lambda functions for processing.

As the diagram above shows, you have the development team consuming “live” data as they work on the next version of the processing application, or potentially using the messages to troubleshoot issues in production.

Marketing is consuming all order information, via a Lambda function that has subscribed to the SNS topic, inserting the records into an Amazon Redshift warehouse for analysis.

All of this, of course, is happening without affecting your order processing application.

Summary

While I haven’t dived deep into the specifics of each service, I have discussed how these services can be applied at an architectural level to build loosely coupled systems that facilitate multiple business use cases. I’ve also shown you how to use infrastructure and application-level scaling techniques, so you can get the most out of your EC2 instances.

One of the many benefits of using these managed services is how quickly and easily you can implement powerful messaging capabilities in your systems, and lower the capital and operational costs of managing your own messaging middleware.

Using Amazon SQS and Amazon SNS together can provide you with a powerful mechanism for decoupling application components. This should be part of design considerations as you architect for the cloud.

For more information, see the Amazon SQS Developer Guide and Amazon SNS Developer Guide. You’ll find tutorials on all the concepts covered in this post, and more. To can get started using the AWS console or SDK of your choice visit:

Happy messaging!

New – Managed Device Authentication for Amazon WorkSpaces

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-managed-device-authentication-for-amazon-workspaces/

Amazon WorkSpaces allows you to access a virtual desktop in the cloud from the web and from a wide variety of desktop and mobile devices. This flexibility makes WorkSpaces ideal for environments where users have the ability to use their existing devices (often known as BYOD, or Bring Your Own Device). In these environments, organizations sometimes need the ability to manage the devices which can access WorkSpaces. For example, they may have to regulate access based on the client device operating system, version, or patch level in order to help meet compliance or security policy requirements.

Managed Device Authentication
Today we are launching device authentication for WorkSpaces. You can now use digital certificates to manage client access from Apple OSX and Microsoft Windows. You can also choose to allow or block access from iOS, Android, Chrome OS, web, and zero client devices. You can implement policies to control which device types you want to allow and which ones you want to block, with control all the way down to the patch level. Access policies are set for each WorkSpaces directory. After you have set the policies, requests to connect to WorkSpaces from a client device are assessed and either blocked or allowed. In order to make use of this feature, you will need to distribute certificates to your client devices using Microsoft System Center Configuration Manager or a mobile device management (MDM) tool.

Here’s how you set your access control options from the WorkSpaces Console:

Here’s what happens if a client is not authorized to connect:

 

Available Today
This feature is now available in all Regions where WorkSpaces is available.

Jeff;

 

BPI Breaks Record After Sending 310 Million Google Takedowns

Post Syndicated from Andy original https://torrentfreak.com/bpi-breaks-record-after-sending-310-million-google-takedowns-170619/

A little over a year ago during March 2016, music industry group BPI reached an important milestone. After years of sending takedown notices to Google, the group burst through the 200 million URL barrier.

The fact that it took BPI several years to reach its 200 million milestone made the surpassing of the quarter billion milestone a few months later even more remarkable. In October 2016, the group sent its 250 millionth takedown to Google, a figure that nearly doubled when accounting for notices sent to Microsoft’s Bing.

But despite the volumes, the battle hadn’t been won, let alone the war. The BPI’s takedown machine continued to run at a remarkable rate, churning out millions more notices per week.

As a result, yet another new milestone was reached this month when the BPI smashed through the 300 million URL barrier. Then, days later, a further 10 million were added, with the latter couple of million added during the time it took to put this piece together.

BPI takedown notices, as reported by Google

While demanding that Google places greater emphasis on its de-ranking of ‘pirate’ sites, the BPI has called again and again for a “notice and stay down” regime, to ensure that content taken down by the search engine doesn’t simply reappear under a new URL. It’s a position BPI maintains today.

“The battle would be a whole lot easier if intermediaries played fair,” a BPI spokesperson informs TF.

“They need to take more proactive responsibility to reduce infringing content that appears on their platform, and, where we expressly notify infringing content to them, to ensure that they do not only take it down, but also keep it down.”

The long-standing suggestion is that the volume of takedown notices sent would reduce if a “take down, stay down” regime was implemented. The BPI says it’s difficult to present a precise figure but infringing content has a tendency to reappear, both in search engines and on hosting sites.

“Google rejects repeat notices for the same URL. But illegal content reappears as it is re-indexed by Google. As to the sites that actually host the content, the vast majority of notices sent to them could be avoided if they implemented take-down & stay-down,” BPI says.

The fact that the BPI has added 60 million more takedowns since the quarter billion milestone a few months ago is quite remarkable, particularly since there appears to be little slowdown from month to month. However, the numbers have grown so huge that 310 billion now feels a lot like 250 million, with just a few added on top for good measure.

That an extra 60 million takedowns can almost be dismissed as a handful is an indication of just how massive the issue is online. While pirates always welcome an abundance of links to juicy content, it’s no surprise that groups like the BPI are seeking more comprehensive and sustainable solutions.

Previously, it was hoped that the Digital Economy Bill would provide some relief, hopefully via government intervention and the imposition of a search engine Code of Practice. In the event, however, all pressure on search engines was removed from the legislation after a separate voluntary agreement was reached.

All parties agreed that the voluntary code should come into effect two weeks ago on June 1 so it seems likely that some effects should be noticeable in the near future. But the BPI says it’s still early days and there’s more work to be done.

“BPI has been working productively with search engines since the voluntary code was agreed to understand how search engines approach the problem, but also what changes can and have been made and how results can be improved,” the group explains.

“The first stage is to benchmark where we are and to assess the impact of the changes search engines have made so far. This will hopefully be completed soon, then we will have better information of the current picture and from that we hope to work together to continue to improve search for rights owners and consumers.”

With more takedown notices in the pipeline not yet publicly reported by Google, the BPI informs TF that it has now notified the search giant of 315 million links to illegal content.

“That’s an astonishing number. More than 1 in 10 of the entire world’s notices to Google come from BPI. This year alone, one in every three notices sent to Google from BPI is for independent record label repertoire,” BPI concludes.

While it’s clear that groups like BPI have developed systems to cope with the huge numbers of takedown notices required in today’s environment, it’s clear that few rightsholders are happy with the status quo. With that in mind, the fight will continue, until search engines are forced into compromise. Considering the implications, that could only appear on a very distant horizon.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Digital painter rundown

Post Syndicated from Eevee original https://eev.ee/blog/2017/06/17/digital-painter-rundown/

Another patron post! IndustrialRobot asks:

You should totally write about drawing/image manipulation programs! (Inspired by https://eev.ee/blog/2015/05/31/text-editor-rundown/)

This is a little trickier than a text editor comparison — while most text editors are cross-platform, quite a few digital art programs are not. So I’m effectively unable to even try a decent chunk of the offerings. I’m also still a relatively new artist, and image editors are much harder to briefly compare than text editors…

Right, now that your expectations have been suitably lowered:

Krita

I do all of my digital art in Krita. It’s pretty alright.

Okay so Krita grew out of Calligra, which used to be KOffice, which was an office suite designed for KDE (a Linux desktop environment). I bring this up because KDE has a certain… reputation. With KDE, there are at least three completely different ways to do anything, each of those ways has ludicrous amounts of customization and settings, and somehow it still can’t do what you want.

Krita inherits this aesthetic by attempting to do literally everything. It has 17 different brush engines, more than 70 layer blending modes, seven color picker dockers, and an ungodly number of colorspaces. It’s clearly intended primarily for drawing, but it also supports animation and vector layers and a pretty decent spread of raster editing tools. I just right now discovered that it has Photoshop-like “layer styles” (e.g. drop shadow), after a year and a half of using it.

In fairness, Krita manages all of this stuff well enough, and (apparently!) it manages to stay out of your way if you’re not using it. In less fairness, they managed to break erasing with a Wacom tablet pen for three months?

I don’t want to rag on it too hard; it’s an impressive piece of work, and I enjoy using it! The emotion it evokes isn’t so much frustration as… mystified bewilderment.

I once filed a ticket suggesting the addition of a brush size palette — a panel showing a grid of fixed brush sizes that makes it easy to switch between known sizes with a tablet pen (and increases the chances that you’ll be able to get a brush back to the right size again). It’s a prominent feature of Paint Tool SAI and Clip Studio Paint, and while I’ve never used either of those myself, I’ve seen a good few artists swear by it.

The developer response was that I could emulate the behavior by creating brush presets. But that’s flat-out wrong: getting the same effect would require creating a ton of brush presets for every brush I have, plus giving them all distinct icons so the size is obvious at a glance. Even then, it would be much more tedious to use and fill my presets with junk.

And that sort of response is what’s so mysterious to me. I’ve never even been able to use this feature myself, but a year of amateur painting with Krita has convinced me that it would be pretty useful. But a developer didn’t see the use and suggested an incredibly tedious alternative that only half-solves the problem and creates new ones. Meanwhile, of the 28 existing dockable panels, a quarter of them are different ways to choose colors.

What is Krita trying to be, then? What does Krita think it is? Who precisely is the target audience? I have no idea.


Anyway, I enjoy drawing in Krita well enough. It ships with a respectable set of brushes, and there are plenty more floating around. It has canvas rotation, canvas mirroring, perspective guide tools, and other art goodies. It doesn’t colordrop on right click by default, which is arguably a grave sin (it shows a customizable radial menu instead), but that’s easy to rebind. It understands having a background color beneath a bottom transparent layer, which is very nice. You can also toggle any brush between painting and erasing with the press of a button, and that turns out to be very useful.

It doesn’t support infinite canvases, though it does offer a one-click button to extend the canvas in a given direction. I’ve never used it (and didn’t even know what it did until just now), but would totally use an infinite canvas.

I haven’t used the animation support too much, but it’s pretty nice to have. Granted, the only other animation software I’ve used is Aseprite, so I don’t have many points of reference here. It’s a relatively new addition, too, so I assume it’ll improve over time.

The one annoyance I remember with animation was really an interaction with a larger annoyance, which is: working with selections kind of sucks. You can’t drag a selection around with the selection tool; you have to switch to the move tool. That would be fine if you could at least drag the selection ring around with the selection tool, but you can’t do that either; dragging just creates a new selection.

If you want to copy a selection, you have to explicitly copy it to the clipboard and paste it, which creates a new layer. Ctrl-drag with the move tool doesn’t work. So then you have to merge that layer down, which I think is where the problem with animation comes in: a new layer is non-animated by default, meaning it effectively appears in any frame, so simply merging it down with merge it onto every single frame of the layer below. And you won’t even notice until you switch frames or play back the animation. Not ideal.

This is another thing that makes me wonder about Krita’s sense of identity. It has a lot of fancy general-purpose raster editing features that even GIMP is still struggling to implement, like high color depth support and non-destructive filters, yet something as basic as working with selections is clumsy. (In fairness, GIMP is a bit clumsy here too, but it has a consistent notion of “floating selection” that’s easy enough to work with.)

I don’t know how well Krita would work as a general-purpose raster editor; I’ve never tried to use it that way. I can’t think of anything obvious that’s missing. The only real gotcha is that some things you might expect to be tools, like smudge or clone, are just types of brush in Krita.

GIMP

Ah, GIMP — open source’s answer to Photoshop.

It’s very obviously intended for raster editing, and I’m pretty familiar with it after half a lifetime of only using Linux. I even wrote a little Scheme script for it ages ago to automate some simple edits to a couple hundred files, back before I was aware of ImageMagick. I don’t know what to say about it, specifically; it’s fairly powerful and does a wide variety of things.

In fact I’d say it’s almost frustratingly intended for raster editing. I used GIMP in my first attempts at digital painting, before I’d heard of Krita. It was okay, but so much of it felt clunky and awkward. Painting is split between a pencil tool, a paintbrush tool, and an airbrush tool; I don’t really know why. The default brushes are largely uninteresting. Instead of brush presets, there are tool presets that can be saved for any tool; it’s a neat idea, but doesn’t feel like a real substitute for brush presets.

Much of the same functionality as Krita is there, but it’s all somehow more clunky. I’m sure it’s possible to fiddle with the interface to get something friendlier for painting, but I never really figured out how.

And then there’s the surprising stuff that’s missing. There’s no canvas rotation, for example. There’s only one type of brush, and it just stamps the same pattern along a path. I don’t think it’s possible to smear or blend or pick up color while painting. The only way to change the brush size is via the very sensitive slider on the tool options panel, which I remember being a little annoying with a tablet pen. Also, you have to specifically enable tablet support? It’s not difficult or anything, but I have no idea why the default is to ignore tablet pressure and treat it like a regular mouse cursor.

As I mentioned above, there’s also no support for high color depth or non-destructive editing, which is honestly a little embarrassing. Those are the major things Serious Professionals™ have been asking for for ages, and GIMP has been trying to provide them, but it’s taking a very long time. The first signs of GEGL, a new library intended to provide these features, appeared in GIMP 2.6… in 2008. The last major release was in 2012. GIMP has been working on this new plumbing for almost as long as Krita’s entire development history. (To be fair, Krita has also raised almost €90,000 from three Kickstarters to fund its development; I don’t know that GIMP is funded at all.)

I don’t know what’s up with GIMP nowadays. It’s still under active development, but the exact status and roadmap are a little unclear. I still use it for some general-purpose editing, but I don’t see any reason to use it to draw.

I do know that canvas rotation will be in the next release, and there was some experimentation with embedding MyPaint’s brush engine (though when I tried it it was basically unusable), so maybe GIMP is interested in wooing artists? I guess we’ll see.

MyPaint

Ah, MyPaint. I gave it a try once. Once.

It’s a shame, really. It sounds pretty great: specifically built for drawing, has very powerful brushes, supports an infinite canvas, supports canvas rotation, has a simple UI that gets out of your way. Perfect.

Or so it seems. But in MyPaint’s eagerness to shed unnecessary raster editing tools, it forgot a few of the more useful ones. Like selections.

MyPaint has no notion of a selection, nor of copy/paste. If you want to move a head to align better to a body, for example, the sanctioned approach is to duplicate the layer, erase the head from the old layer, erase everything but the head from the new layer, then move the new layer.

I can’t find anything that resembles HSL adjustment, either. I guess the workaround for that is to create H/S/L layers and floodfill them with different colors until you get what you want.

I can’t work seriously without these basic editing tools. I could see myself doodling in MyPaint, but Krita works just as well for doodling as for serious painting, so I’ve never gone back to it.

Drawpile

Drawpile is the modern equivalent to OpenCanvas, I suppose? It lets multiple people draw on the same canvas simultaneously. (I would not recommend it as a general-purpose raster editor.)

It’s a little clunky in places — I sometimes have bugs where keyboard focus gets stuck in the chat, or my tablet cursor becomes invisible — but the collaborative part works surprisingly well. It’s not a brush powerhouse or anything, and I don’t think it allows textured brushes, but it supports tablet pressure and canvas rotation and locked alpha and selections and whatnot.

I’ve used it a couple times, and it’s worked well enough that… well, other people made pretty decent drawings with it? I’m not sure I’ve managed yet. And I wouldn’t use it single-player. Still, it’s fun.

Aseprite

Aseprite is for pixel art so it doesn’t really belong here at all. But it’s very good at that and I like it a lot.

That’s all

I can’t name any other serious contender that exists for Linux.

I’m dimly aware of a thing called “Photo Shop” that’s more intended for photos but functions as a passable painter. More artists seem to swear by Paint Tool SAI and Clip Studio Paint. Also there’s Paint.NET, but I have no idea how well it’s actually suited for painting.

And that’s it! That’s all I’ve got. Krita for drawing, GIMP for editing, Drawpile for collaborative doodling.

Visualize and Monitor Amazon EC2 Events with Amazon CloudWatch Events and Amazon Kinesis Firehose

Post Syndicated from Karan Desai original https://aws.amazon.com/blogs/big-data/visualize-and-monitor-amazon-ec2-events-with-amazon-cloudwatch-events-and-amazon-kinesis-firehose/

Monitoring your AWS environment is important for security, performance, and cost control purposes. For example, by monitoring and analyzing API calls made to your Amazon EC2 instances, you can trace security incidents and gain insights into administrative behaviors and access patterns. The kinds of events you might monitor include console logins, Amazon EBS snapshot creation/deletion/modification, VPC creation/deletion/modification, and instance reboots, etc.

In this post, I show you how to build a near real-time API monitoring solution for EC2 events using Amazon CloudWatch Events and Amazon Kinesis Firehose. Please be sure to have Amazon CloudTrail enabled in your account.

  • CloudWatch Events offers a near real-time stream of system events that describe changes in AWS resources. CloudWatch Events now supports Kinesis Firehose as a target.
  • Kinesis Firehose is a fully managed service for continuously capturing, transforming, and delivering data in minutes to storage and analytics destinations such as Amazon S3, Amazon Kinesis Analytics, Amazon Redshift, and Amazon Elasticsearch Service.

Walkthrough

For this walkthrough, you create a CloudWatch event rule that matches specific EC2 events such as:

  • Starting, stopping, and terminating an instance
  • Creating and deleting VPC route tables
  • Creating and deleting a security group
  • Creating, deleting, and modifying instance volumes and snapshots

Your CloudWatch event target is a Kinesis Firehose delivery stream that delivers this data to an Elasticsearch cluster, where you set up Kibana for visualization. Using this solution, you can easily load and visualize EC2 events in minutes without setting up complicated data pipelines.

Set up the Elasticsearch cluster

Create the Amazon ES domain in the Amazon ES console, or by using the create-elasticsearch-domain command in the AWS CLI.

This example uses the following configuration:

  • Domain Name: esLogSearch
  • Elasticsearch Version: 1
  • Instance Count: 2
  • Instance type:elasticsearch
  • Enable dedicated master: true
  • Enable zone awareness: true
  • Restrict Amazon ES to an IP-based access policy

Other settings are left as the defaults.

Create a Kinesis Firehose delivery stream

In the Kinesis Firehose console, create a new delivery stream with Amazon ES as the destination. For detailed steps, see Create a Kinesis Firehose Delivery Stream to Amazon Elasticsearch Service.

Set up CloudWatch Events

Create a rule, and configure the event source and target. You can choose to configure multiple event sources with several AWS resources, along with options to specify specific or multiple event types.

In the CloudWatch console, choose Events.

For Service Name, choose EC2.

In Event Pattern Preview, choose Edit and copy the pattern below. For this walkthrough, I selected events that are specific to the EC2 API, but you can modify it to include events for any of your AWS resources.

 

{
	"source": [
		"aws.ec2"
	],
	"detail-type": [
		"AWS API Call via CloudTrail"
	],
	"detail": {
		"eventSource": [
			"ec2.amazonaws.com"
		],
		"eventName": [
			"RunInstances",
			"StopInstances",
			"StartInstances",
			"CreateFlowLogs",
			"CreateImage",
			"CreateNatGateway",
			"CreateVpc",
			"DeleteKeyPair",
			"DeleteNatGateway",
			"DeleteRoute",
			"DeleteRouteTable",
"CreateSnapshot",
"DeleteSnapshot",
			"DeleteVpc",
			"DeleteVpcEndpoints",
			"DeleteSecurityGroup",
			"ModifyVolume",
			"ModifyVpcEndpoint",
			"TerminateInstances"
		]
	}
}

The following screenshot shows what your event looks like in the console.

Next, choose Add target and select the delivery stream that you just created.

Set up Kibana on the Elasticsearch cluster

Amazon ES provides a default installation of Kibana with every Amazon ES domain. You can find the Kibana endpoint on your domain dashboard in the Amazon ES console. You can restrict Amazon ES access to an IP-based access policy.

In the Kibana console, for Index name or pattern, type log. This is the name of the Elasticsearch index.

For Time-field name, choose @time.

To view the events, choose Discover.

The following chart demonstrates the API operations and the number of times that they have been triggered in the past 12 hours.

Summary

In this post, you created a continuous, near real-time solution to monitor various EC2 events such as starting and shutting down instances, creating VPCs, etc. Likewise, you can build a continuous monitoring solution for all the API operations that are relevant to your daily AWS operations and resources.

With Kinesis Firehose as a new target for CloudWatch Events, you can retrieve, transform, and load system events to the storage and analytics destination of your choice in minutes, without setting up complicated data pipelines.

If you have any questions or suggestions, please comment below.


Additional Reading

Learn how to build a serverless architecture to analyze Amazon CloudFront access logs using AWS Lambda, Amazon Athena, and Amazon Kinesis Analytics

 

 

 

Mira, tiny robot of joyful delight

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/mira-robot-alonso-martinez/

The staff of Pi Towers are currently melting into puddles while making ‘Aaaawwwwwww’ noises as Mira, the adorable little Pi-controlled robot made by Pixar 3D artist Alonso Martinez, steals their hearts.

Mira the robot playing peek-a-boo

If you want to get updates on Mira’s progress, sign up for the mailing list! http://eepurl.com/bteigD Mira is a desk companion that makes your life better one smile at a time. This project explores human robot interactivity and emotional intelligence. Currently Mira uses face tracking to interact with the users and loves playing the game “peek-a-boo”.

Introducing Mira

Honestly, I can’t type words – I am but a puddle! If I could type at all, I would only produce a stream of affectionate fragments. Imagine walking into a room full of kittens. What you would sound like is what I’d type.

No! I can do this. I’m a professional. I write for a living! I can…

SHE BLINKS OHMYAAAARGH!!!

Mira Alonso Martinez Raspberry Pi

Weebl & Bob meets South Park’s Ike Broflovski in an adorable 3D-printed bundle of ‘Aaawwwww’

Introducing Mira (I promise I can do this)

Right. I’ve had a nap and a drink. I’ve composed myself. I am up for this challenge. As long as I don’t look directly at her, I’ll be fine!

Here I go.

As one of the many über-talented 3D artists at Pixar, Alonso Martinez knows a thing or two about bringing adorable-looking characters to life on screen. However, his work left him wondering:

In movies you see really amazing things happening but you actually can’t interact with them – what would it be like if you could interact with characters?

So with the help of his friends Aaron Nathan and Vijay Sundaram, Alonso set out to bring the concept of animation to the physical world by building a “character” that reacts to her environment. His experiments with robotics started with Gertie, a ball-like robot reminiscent of his time spent animating bouncing balls when he was learning his trade. From there, he moved on to Mira.

Mira Alonso Martinez

Many, many of the views of this Tested YouTube video have come from me. So many.

Mira swivels to follow a person’s face, plays games such as peekaboo, shows surprise when you finger-shoot her, and giggles when you give her a kiss.

Mira’s inner workings

To get Mira to turn her head in three dimensions, Alonso took inspiration from the Microsoft Sidewinder Pro joystick he had as a kid. He purchased one on eBay, took it apart to understand how it works, and replicated its mechanism for Mira’s Raspberry Pi-powered innards.

Mira Alonso Martinez

Alonso used the smallest components he could find so that they would fit inside Mira’s tiny body.

Mira’s axis of 3D-printed parts moves via tiny Power HD DSM44 servos, while a camera and OpenCV handle face-tracking, and a single NeoPixel provides a range of colours to indicate her emotions. As for the blinking eyes? Two OLED screens boasting acrylic domes fit within the few millimeters between all the other moving parts.

More on Mira, including her history and how she works, can be found in this wonderful video released by Tested this week.

Pixar Artist’s 3D-Printed Animated Robots!

We’re gushing with grins and delight at the sight of these adorable animated robots created by artist Alonso Martinez. Sean chats with Alonso to learn how he designed and engineered his family of robots, using processes like 3D printing, mold-making, and silicone casting. They’re amazing!

You can also sign up for Alonso’s newsletter here to stay up-to-date about this little robot. Hopefully one of these newsletters will explain how to buy or build your own Mira, as I for one am desperate to see her adorable little face on my desk every day for the rest of my life.

The post Mira, tiny robot of joyful delight appeared first on Raspberry Pi.

Manage Instances at Scale without SSH Access Using EC2 Run Command

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/manage-instances-at-scale-without-ssh-access-using-ec2-run-command/

The guest post below, written by Ananth Vaidyanathan (Senior Product Manager for EC2 Systems Manager) and Rich Urmston (Senior Director of Cloud Architecture at Pegasystems) shows you how to use EC2 Run Command to manage a large collection of EC2 instances without having to resort to SSH.

Jeff;


Enterprises often have several managed environments and thousands of Amazon EC2 instances. It’s important to manage systems securely, without the headaches of Secure Shell (SSH). Run Command, part of Amazon EC2 Systems Manager, allows you to run remote commands on instances (or groups of instances using tags) in a controlled and auditable manner. It’s been a nice added productivity boost for Pega Cloud operations, which rely daily on Run Command services.

You can control Run Command access through standard IAM roles and policies, define documents to take input parameters, control the S3 bucket used to return command output. You can also share your documents with other AWS accounts, or with the public. All in all, Run Command provides a nice set of remote management features.

Better than SSH
Here’s why Run Command is a better option than SSH and why Pegasystems has adopted it as their primary remote management tool:

Run Command Takes Less Time –  Securely connecting to an instance requires a few steps e.g. jumpboxes to connect to or IP addresses to whitelist etc. With Run Command, cloud ops engineers can invoke commands directly from their laptop, and never have to find keys or even instance IDs. Instead, system security relies on AWS auth, IAM roles and policies.

Run Command Operations are Fully Audited – With SSH, there is no real control over what they can do, nor is there an audit trail. With Run Command, every invoked operation is audited in CloudTrail, including information on the invoking user, instances on which command was run, parameters, and operation status. You have full control and ability to restrict what functions engineers can perform on a system.

Run Command has no SSH keys to Manage – Run Command leverages standard AWS credentials, API keys, and IAM policies. Through integration with a corporate auth system, engineers can interact with systems based on their corporate credentials and identity.

Run Command can Manage Multiple Systems at the Same Time – Simple tasks such as looking at the status of a Linux service or retrieving a log file across a fleet of managed instances is cumbersome using SSH. Run Command allows you to specify a list of instances by IDs or tags, and invokes your command, in parallel, across the specified fleet. This provides great leverage when troubleshooting or managing more than the smallest Pega clusters.

Run Command Makes Automating Complex Tasks Easier – Standardizing operational tasks requires detailed procedure documents or scripts describing the exact commands. Managing or deploying these scripts across the fleet is cumbersome. Run Command documents provide an easy way to encapsulate complex functions, and handle document management and access controls. When combined with AWS Lambda, documents provide a powerful automation platform to handle any complex task.

Example – Restarting a Docker Container
Here is an example of a simple document used to restart a Docker container. It takes one parameter; the name of the Docker container to restart. It uses the AWS-RunShellScript method to invoke the command. The output is collected automatically by the service and returned to the caller. For an example of the latest document schema, see Creating Systems Manager Documents.

{
  "schemaVersion":"1.2",
  "description":"Restart the specified docker container.",
  "parameters":{
    "param":{
      "type":"String",
      "description":"(Required) name of the container to restart.",
      "maxChars":1024
    }
  },
  "runtimeConfig":{
    "aws:runShellScript":{
      "properties":[
        {
          "id":"0.aws:runShellScript",
          "runCommand":[
            "docker restart {{param}}"
          ]
        }
      ]
    }
  }
}

Putting Run Command into practice at Pegasystems
The Pegasystems provisioning system sits on AWS CloudFormation, which is used to deploy and update Pega Cloud resources. Layered on top of it is the Pega Provisioning Engine, a serverless, Lambda-based service that manages a library of CloudFormation templates and Ansible playbooks.

A Configuration Management Database (CMDB) tracks all the configurations details and history of every deployment and update, and lays out its data using a hierarchical directory naming convention. The following diagram shows how the various systems are integrated:

For cloud system management, Pega operations uses a command line version called cuttysh and a graphical version based on the Pega 7 platform, called the Pega Operations Portal. Both tools allow you to browse the CMDB of deployed environments, view configuration settings, and interact with deployed EC2 instances through Run Command.

CLI Walkthrough
Here is a CLI walkthrough for looking into a customer deployment and interacting with instances using Run Command.

Launching the cuttysh tool brings you to the root of the CMDB and a list of the provisioned customers:

% cuttysh
d CUSTA
d CUSTB
d CUSTC
d CUSTD

You interact with the CMDB using standard Linux shell commands, such as cd, ls, cat, and grep. Items prefixed with s are services that have viewable properties. Items prefixed with d are navigable subdirectories in the CMDB hierarchy.

In this example, change directories into customer CUSTB’s portion of the CMDB hierarchy, and then further into a provisioned Pega environment called env1, under the Dev network. The tool displays the artifacts that are provisioned for that environment. These entries map to provisioned CloudFormation templates.

> cd CUSTB
/ROOT/CUSTB/us-east-1 > cd DEV/env1

The ls –l command shows the version of the provisioned resources. These version numbers map back to source control–managed artifacts for the CloudFormation, Ansible, and other components that compose a version of the Pega Cloud.

/ROOT/CUSTB/us-east-1/DEV/env1 > ls -l
s 1.2.5 RDSDatabase 
s 1.2.5 PegaAppTier 
s 7.2.1 Pega7 

Now, use Run Command to interact with the deployed environments. To do this, use the attach command and specify the service with which to interact. In the following example, you attach to the Pega Web Tier. Using the information in the CMDB and instance tags, the CLI finds the corresponding EC2 instances and displays some basic information about them. This deployment has three instances.

/ROOT/CUSTB/us-east-1/DEV/env1 > attach PegaWebTier
 # ID         State  Public Ip    Private Ip  Launch Time
 0 i-0cf0e84 running 52.63.216.42 10.96.15.70 2017-01-16 
 1 i-0043c1d running 53.47.191.22 10.96.15.43 2017-01-16 
 2 i-09b879e running 55.93.118.27 10.96.15.19 2017-01-16 

From here, you can use the run command to invoke Run Command documents. In the following example, you run the docker-ps document against instance 0 (the first one on the list). EC2 executes the command and returns the output to the CLI, which in turn shows it.

/ROOT/CUSTB/us-east-1/DEV/env1 > run 0 docker-ps
. . 
CONTAINER ID IMAGE             CREATED      STATUS        NAMES
2f187cc38c1  pega-7.2         10 weeks ago  Up 8 weeks    pega-web

Using the same command and some of the other documents that have been defined, you can restart a Docker container or even pull back the contents of a file to your local system. When you get a file, Run Command also leaves a copy in an S3 bucket in case you want to pass the link along to a colleague.

/ROOT/CUSTB/us-east-1/DEV/env1 > run 0 docker-restart pega-web
..
pega-web

/ROOT/CUSTB/us-east-1/DEV/env1 > run 0 get-file /var/log/cfn-init-cmd.log
. . . . . 
get-file

Data has been copied locally to: /tmp/get-file/i-0563c9e/data
Data is also available in S3 at: s3://my-bucket/CUSTB/cuttysh/get-file/data

Now, leverage the Run Command ability to do more than one thing at a time. In the following example, you attach to a deployment with three running instances and want to see the uptime for each instance. Using the par (parallel) option for run, the CLI tells Run Command to execute the uptime document on all instances in parallel.

/ROOT/CUSTB/us-east-1/DEV/env1 > run par uptime
 …
Output for: i-006bdc991385c33
 20:39:12 up 15 days, 3:54, 0 users, load average: 0.42, 0.32, 0.30

Output for: i-09390dbff062618
 20:39:12 up 15 days, 3:54, 0 users, load average: 0.08, 0.19, 0.22

Output for: i-08367d0114c94f1
 20:39:12 up 15 days, 3:54, 0 users, load average: 0.36, 0.40, 0.40

Commands are complete.
/ROOT/PEGACLOUD/CUSTB/us-east-1/PROD/prod1 > 

Summary
Run Command improves productivity by giving you faster access to systems and the ability to run operations across a group of instances. Pega Cloud operations has integrated Run Command with other operational tools to provide a clean and secure method for managing systems. This greatly improves operational efficiency, and gives greater control over who can do what in managed deployments. The Pega continual improvement process regularly assesses why operators need access, and turns those operations into new Run Command documents to be added to the library. In fact, their long-term goal is to stop deploying cloud systems with SSH enabled.

If you have any questions or suggestions, please leave a comment for us!

— Ananth and Rich

Weekly roundup: Potpourri

Post Syndicated from Eevee original https://eev.ee/dev/2017/06/12/weekly-roundup-potpourri/

  • potluck: It’s about time I actually started on this! I spent a couple days squabbling with JavaScript frameworks, but in the end I gave up and decided to just build it on my existing LÖVE code. Trying out a new thing is nice, but maybe not when I have a somewhat complicated game in mind that I want to build as soon as possible.

  • fox flux: I made a font! Also drew a ton of player sprites and made my grass much prettier. I’m going to need to start thinking about environmental art soon, and I’m kinda dreading it because I have no idea what I’m doing there.

  • art: I did some (unfinished) modelling and painted some character art (warning: this pic is fine, but the rest of the account is NSFW).

  • blog: I wrote some thoughts about teaching technical subjects.

For the first time in possibly my entire life, I feel like I’m a little ahead of the game! The month isn’t even half over and I’ve already done some obligatory stuff, finished off one languishing task, and made some decent inroads into several other things. Nice.

How to Deploy Local Administrator Password Solution with AWS Microsoft AD

Post Syndicated from Dragos Madarasan original https://aws.amazon.com/blogs/security/how-to-deploy-local-administrator-password-solution-with-aws-microsoft-ad/

Local Administrator Password Solution (LAPS) from Microsoft simplifies password management by allowing organizations to use Active Directory (AD) to store unique passwords for computers. Typically, an organization might reuse the same local administrator password across the computers in an AD domain. However, this approach represents a security risk because it can be exploited during lateral escalation attacks. LAPS solves this problem by creating unique, randomized passwords for the Administrator account on each computer and storing it encrypted in AD.

Deploying LAPS with AWS Microsoft AD requires the following steps:

  1. Install the LAPS binaries on instances joined to your AWS Microsoft AD domain. The binaries add additional client-side extension (CSE) functionality to the Group Policy client.
  2. Extend the AWS Microsoft AD schema. LAPS requires new AD attributes to store an encrypted password and its expiration time.
  3. Configure AD permissions and delegate the ability to retrieve the local administrator password for IT staff in your organization.
  4. Configure Group Policy on instances joined to your AWS Microsoft AD domain to enable LAPS. This configures the Group Policy client to process LAPS settings and uses the binaries installed in Step 1.

The following diagram illustrates the setup that I will be using throughout this post and the associated tasks to set up LAPS. Note that the AWS Directory Service directory is deployed across multiple Availability Zones, and monitoring automatically detects and replaces domain controllers that fail.

Diagram illustrating this blog post's solution

In this blog post, I explain the prerequisites to set up Local Administrator Password Solution, demonstrate the steps involved to update the AD schema on your AWS Microsoft AD domain, show how to delegate permissions to IT staff and configure LAPS via Group Policy, and demonstrate how to retrieve the password using the graphical user interface or with Windows PowerShell.

This post assumes you are familiar with Lightweight Directory Access Protocol Data Interchange Format (LDIF) files and AWS Microsoft AD. If you need more of an introduction to Directory Service and AWS Microsoft AD, see How to Move More Custom Applications to the AWS Cloud with AWS Directory Service, which introduces working with schema changes in AWS Microsoft AD.

Prerequisites

In order to implement LAPS, you must use AWS Directory Service for Microsoft Active Directory (Enterprise Edition), also known as AWS Microsoft AD. Any instance on which you want to configure LAPS must be joined to your AWS Microsoft AD domain. You also need a Management instance on which you install the LAPS management tools.

In this post, I use an AWS Microsoft AD domain called example.com that I have launched in the EU (London) region. To see which the regions in which Directory Service is available, see AWS Regions and Endpoints.

Screenshot showing the AWS Microsoft AD domain example.com used in this blog post

In addition, you must have at least two instances launched in the same region as the AWS Microsoft AD domain. To join the instances to your AWS Microsoft AD domain, you have two options:

  1. Use the Amazon EC2 Systems Manager (SSM) domain join feature. To learn more about how to set up domain join for EC2 instances, see joining a Windows Instance to an AWS Directory Service Domain.
  2. Manually configure the DNS server addresses in the Internet Protocol version 4 (TCP/IPv4) settings of the network card to use the AWS Microsoft AD DNS addresses (172.31.9.64 and 172.31.16.191, for this blog post) and perform a manual domain join.

For the purpose of this post, my two instances are:

  1. A Management instance on which I will install the management tools that I have tagged as Management.
  2. A Web Server instance on which I will be deploying the LAPS binary.

Screenshot showing the two EC2 instances used in this post

Implementing the solution

 

1. Install the LAPS binaries on instances joined to your AWS Microsoft AD domain by using EC2 Run Command

LAPS binaries come in the form of an MSI installer and can be downloaded from the Microsoft Download Center. You can install the LAPS binaries manually, with an automation service such as EC2 Run Command, or with your existing software deployment solution.

For this post, I will deploy the LAPS binaries on my Web Server instance (i-0b7563d0f89d3453a) by using EC2 Run Command:

  1. While signed in to the AWS Management Console, choose EC2. In the Systems Manager Services section of the navigation pane, choose Run Command.
  2. Choose Run a command, and from the Command document list, choose AWS-InstallApplication.
  3. From Target instances, choose the instance on which you want to deploy the LAPS binaries. In my case, I will be selecting the instance tagged as Web Server. If you do not see any instances listed, make sure you have met the prerequisites for Amazon EC2 Systems Manager (SSM) by reviewing the Systems Manager Prerequisites.
  4. For Action, choose Install, and then stipulate the following values:
    • Parameters: /quiet
    • Source: https://download.microsoft.com/download/C/7/A/C7AAD914-A8A6-4904-88A1-29E657445D03/LAPS.x64.msi
    • Source Hash: f63ebbc45e2d080630bd62a195cd225de734131a56bb7b453c84336e37abd766
    • Comment: LAPS deployment

Leave the other options with the default values and choose Run. The AWS Management Console will return a Command ID, which will initially have a status of In Progress. It should take less than 5 minutes to download and install the binaries, after which the Command ID will update its status to Success.

Status showing the binaries have been installed successfully

If the Command ID runs for more than 5 minutes or returns an error, it might indicate a problem with the installer. To troubleshoot, review the steps in Troubleshooting Systems Manager Run Command.

To verify the binaries have been installed successfully, open Control Panel and review the recently installed applications in Programs and Features.

Screenshot of Control Panel that confirms LAPS has been installed successfully

You should see an entry for Local Administrator Password Solution with a version of 6.2.0.0 or newer.

2. Extend the AWS Microsoft AD schema

In the previous section, I used EC2 Run Command to install the LAPS binaries on an EC2 instance. Now, I am ready to extend the schema in an AWS Microsoft AD domain. Extending the schema is a requirement because LAPS relies on new AD attributes to store the encrypted password and its expiration time.

In an on-premises AD environment, you would update the schema by running the Update-AdmPwdADSchema Windows PowerShell cmdlet with schema administrator credentials. Because AWS Microsoft AD is a managed service, I do not have permissions to update the schema directly. Instead, I will update the AD schema from the Directory Service console by importing an LDIF file. If you are unfamiliar with schema updates or LDIF files, see How to Move More Custom Applications to the AWS Cloud with AWS Directory Service.

To make things easier for you, I am providing you with a sample LDIF file that contains the required AD schema changes. Using Notepad or a similar text editor, open the SchemaChanges-0517.ldif file and update the values of dc=example,dc=com with your own AWS Microsoft AD domain and suffix.

After I update the LDIF file with my AWS Microsoft AD details, I import it by using the AWS Management Console:

  1. On the Directory Service console, select from the list of directories in the Microsoft AD directory by choosing its identifier (it will look something like d-534373570ea).
  2. On the Directory details page, choose the Schema extensions tab and choose Upload and update schema.
    Screenshot showing the "Upload and update schema" option
  3. When prompted for the LDIF file that contains the changes, choose the sample LDIF file.
  4. In the background, the LDIF file is validated for errors and a backup of the directory is created for recovery purposes. Updating the schema might take a few minutes and the status will change to Updating Schema. When the process has completed, the status of Completed will be displayed, as shown in the following screenshot.

Screenshot showing the schema updates in progress
When the process has completed, the status of Completed will be displayed, as shown in the following screenshot.

Screenshot showing the process has completed

If the LDIF file contains errors or the schema extension fails, the Directory Service console will generate an error code and additional debug information. To help troubleshoot error messages, see Schema Extension Errors.

The sample LDIF file triggers AWS Microsoft AD to perform the following actions:

  1. Create the ms-Mcs-AdmPwd attribute, which stores the encrypted password.
  2. Create the ms-Mcs-AdmPwdExpirationTime attribute, which stores the time of the password’s expiration.
  3. Add both attributes to the Computer class.

3. Configure AD permissions

In the previous section, I updated the AWS Microsoft AD schema with the required attributes for LAPS. I am now ready to configure the permissions for administrators to retrieve the password and for computer accounts to update their password attribute.

As part of configuring AD permissions, I grant computers the ability to update their own password attribute and specify which security groups have permissions to retrieve the password from AD. As part of this process, I run Windows PowerShell cmdlets that are not installed by default on Windows Server.

Note: To learn more about Windows PowerShell and the concept of a cmdlet (pronounced “command-let”), go to Getting Started with Windows PowerShell.

Before getting started, I need to set up the required tools for LAPS on my Management instance, which must be joined to the AWS Microsoft AD domain. I will be using the same LAPS installer that I downloaded from the Microsoft LAPS website. In my Management instance, I have manually run the installer by clicking the LAPS.x64.msi file. On the Custom Setup page of the installer, under Management Tools, for each option I have selected Install on local hard drive.

Screenshot showing the required management tools

In the preceding screenshot, the features are:

  • The fat client UI – A simple user interface for retrieving the password (I will use it at the end of this post).
  • The Windows PowerShell module – Needed to run the commands in the next sections.
  • The GPO Editor templates – Used to configure Group Policy objects.

The next step is to grant computers in the Computers OU the permission to update their own attributes. While connected to my Management instance, I go to the Start menu and type PowerShell. In the list of results, right-click Windows PowerShell and choose Run as administrator and then Yes when prompted by User Account Control.

In the Windows PowerShell prompt, I type the following command.

Import-module AdmPwd.PS

Set-AdmPwdComputerSelfPermission –OrgUnit “OU=Computers,OU=MyMicrosoftAD,DC=example,DC=com

To grant the administrator group called Admins the permission to retrieve the computer password, I run the following command in the Windows PowerShell prompt I previously started.

Import-module AdmPwd.PS

Set-AdmPwdReadPasswordPermission –OrgUnit “OU=Computers, OU=MyMicrosoftAD,DC=example,DC=com” –AllowedPrincipals “Admins”

4. Configure Group Policy to enable LAPS

In the previous section, I deployed the LAPS management tools on my management instance, granted the computer accounts the permission to self-update their local administrator password attribute, and granted my Admins group permissions to retrieve the password.

Note: The following section addresses the Group Policy Management Console and Group Policy objects. If you are unfamiliar with or wish to learn more about these concepts, go to Get Started Using the GPMC and Group Policy for Beginners.

I am now ready to enable LAPS via Group Policy:

  1. On my Management instance (i-03b2c5d5b1113c7ac), I have installed the Group Policy Management Console (GPMC) by running the following command in Windows PowerShell.
Install-WindowsFeature –Name GPMC
  1. Next, I have opened the GPMC and created a new Group Policy object (GPO) called LAPS GPO.
  2. In the Local Group Policy Editor, I navigate to Computer Configuration > Policies > Administrative Templates > LAPS. I have configured the settings using the values in the following table.

Setting

State

Options

Password Settings

Enabled

Complexity: large letters, small letters, numbers, specials

Do not allow password expiration time longer than required by policy

Enabled

N/A

Enable local admin password management

Enabled

N/A

  1. Next, I need to link the GPO to an organizational unit (OU) in which my machine accounts sit. In your environment, I recommend testing the new settings on a test OU and then deploying the GPO to production OUs.

Note: If you choose to create a new test organizational unit, you must create it in the OU that AWS Microsoft AD delegates to you to manage. For example, if your AWS Microsoft AD directory name were example.com, the test OU path would be example.com/example/Computers/Test.

  1. To test that LAPS works, I need to make sure the computer has received the new policy by forcing a Group Policy update. While connected to the Web Server instance (i-0b7563d0f89d3453a) using Remote Desktop, I open an elevated administrative command prompt and run the following command: gpupdate /force. I can check if the policy is applied by running the command: gpresult /r | findstr LAPS GPO, where LAPS GPO is the name of the GPO created in the second step.
  2. Back on my Management instance, I can then launch the LAPS interface from the Start menu and use it to retrieve the password (as shown in the following screenshot). Alternatively, I can run the Get-ADComputer Windows PowerShell cmdlet to retrieve the password.
Get-ADComputer [YourComputerName] -Properties ms-Mcs-AdmPwd | select name, ms-Mcs-AdmPwd

Screenshot of the LAPS UI, which you can use to retrieve the password

Summary

In this blog post, I demonstrated how you can deploy LAPS with an AWS Microsoft AD directory. I then showed how to install the LAPS binaries by using EC2 Run Command. Using the sample LDIF file I provided, I showed you how to extend the schema, which is a requirement because LAPS relies on new AD attributes to store the encrypted password and its expiration time. Finally, I showed how to complete the LAPS setup by configuring the necessary AD permissions and creating the GPO that starts the LAPS password change.

If you have comments about this post, submit them in the “Comments” section below. If you have questions about or issues implementing this solution, please start a new thread on the Directory Service forum.

– Dragos

How to track that annoying pop-up

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/06/how-to-track-that-annoying-pop-up.html

In a recent update to their Office suite on Windows, Microsoft made a mistake where every hour, for a fraction of a second,  a black window pops up on the screen. This leads many to fear their system has been infected by a virus. I thought I’d document how to track this down.

The short answer is to use Mark Russinovich’s “sysinternals.com” tools. He’s Windows internals guru at Microsoft and has been maintaining a suite of tools that are critical for Windows system maintenance and security. Copy all the tools from “https://live.sysinternals.com“. Also, you can copy with Microsoft Windows Networking (SMB).

Of these tools, what we want is something that looks at “processes”. There are several tools that do this, but focus on processes that are currently running. What we want is something that monitors process creation.

The tool for that is “sysmon.exe”. It can monitor not only process creation, but a large number of other system events that a techy can use to see what the system has been doing, and if you are infected with a virus.

Sysmon has a fairly complicated configuration file, and if you enabled everything, you’d soon be overwhelmed with events. @SwiftOnSecurity has published a configuration file they use in the real world in real environment that cuts down on the noise, and focuses on events that are really important. It enables monitoring of “process creation”, but filters out know good processes that might fill up your logs. You grab the file here. Save it to the same directory to where you saved Sysmon:

https://raw.githubusercontent.com/SwiftOnSecurity/sysmon-config/master/sysmonconfig-export.xml

Once you’ve done it, run the following command to activate the Sysmon monitoring service using this configuration file by running the following command as Administrator. (Right click on the Command Prompt icon and select More/Run as Administrator).

sysmon.exe -accepteula -i sysmonconfig-export.xml

Now sit back and relax until that popup happens again. Right after it does, go into the “Event Viewer” application (click on Windows menu and type “Event Viewer”, or run ‘eventvwr.exe’. Now you have to find where the sysmon events are located, since there are so many things that log events.

The Sysmon events are under the path:

Applications and Services Logs\Microsoft\Windows\Sysmon\operational

When you open that up, you should see the top event is the one we are looking for. Actually, the very top event is launching the process “eventvwr.exe”, but the next one down is our event. It looks like this:

Drilling down into the details, we find the the offending thing causing those annoying popups is “officebackgroundtask.exe” in Office.

We can see it’s started by the “Schedule” service. This means we can go look at it with “autoruns.exe”, another Sysinternals tool that looks at all the things configured to automatically start when you start/login to your computer.

They are pink, which [update] is how autoruns shows they are “unsigned” programs (Microsoft’s programs should, normally, always be signed, so this should be suspicious). I’m assuming the suspicious thing is that they run in the user’s context, rather than system context, creating popup screens.

Autoruns allows you to do a bunch of things. You can click on the [X] box and disable it from running in the future. You can [right-click] in order to upload to Virus Total and check if it’s a known virus.

You can also double-click, to open the Task Scheduler, and see the specific configuration. You can see here that this thing is scheduled to run every hour:

Conclusion

So the conclusions are this.
To solve this particular problem of identifying what’s causing a process to flash a screen occasionally, use sysmon.
To solve generation problems like this, use Sysinternals suite of applications.
I haven’t been, but I am now, using @SwiftOnSecurity’s sysmon configuration just to monitor the security of my computers. I should probably install something to move a copy of the logs off the system.

Some Notes

Some URLs:
Some tweets:

Building High-Throughput Genomics Batch Workflows on AWS: Workflow Layer (Part 4 of 4)

Post Syndicated from Andy Katz original https://aws.amazon.com/blogs/compute/building-high-throughput-genomics-batch-workflows-on-aws-workflow-layer-part-4-of-4/

Aaron Friedman is a Healthcare and Life Sciences Partner Solutions Architect at AWS

Angel Pizarro is a Scientific Computing Technical Business Development Manager at AWS

This post is the fourth in a series on how to build a genomics workflow on AWS. In Part 1, we introduced a general architecture, shown below, and highlighted the three common layers in a batch workflow:

  • Job
  • Batch
  • Workflow

In Part 2, you built a Docker container for each job that needed to run as part of your workflow, and stored them in Amazon ECR.

In Part 3, you tackled the batch layer and built a scalable, elastic, and easily maintainable batch engine using AWS Batch. This solution took care of dynamically scaling your compute resources in response to the number of runnable jobs in your job queue length as well as managed job placement.

In part 4, you build out the workflow layer of your solution using AWS Step Functions and AWS Lambda. You then run an end-to-end genomic analysis―specifically known as exome secondary analysis―for many times at a cost of less than $1 per exome.

Step Functions makes it easy to coordinate the components of your applications using visual workflows. Building applications from individual components that each perform a single function lets you scale and change your workflow quickly. You can use the graphical console to arrange and visualize the components of your application as a series of steps, which simplify building and running multi-step applications. You can change and add steps without writing code, so you can easily evolve your application and innovate faster.

An added benefit of using Step Functions to define your workflows is that the state machines you create are immutable. While you can delete a state machine, you cannot alter it after it is created. For regulated workloads where auditing is important, you can be assured that state machines you used in production cannot be altered.

In this blog post, you will create a Lambda state machine to orchestrate your batch workflow. For more information on how to create a basic state machine, please see this Step Functions tutorial.

All code related to this blog series can be found in the associated GitHub repository here.

Build a state machine building block

To skip the following steps, we have provided an AWS CloudFormation template that can deploy your Step Functions state machine. You can use this in combination with the setup you did in part 3 to quickly set up the environment in which to run your analysis.

The state machine is composed of smaller state machines that submit a job to AWS Batch, and then poll and check its execution.

The steps in this building block state machine are as follows:

  1. A job is submitted.
    Each analytical module/job has its own Lambda function for submission and calls the batchSubmitJob Lambda function that you built in the previous blog post. You will build these specialized Lambda functions in the following section.
  2. The state machine queries the AWS Batch API for the job status.
    This is also a Lambda function.
  3. The job status is checked to see if the job has completed.
    If the job status equals SUCCESS, proceed to log the final job status. If the job status equals FAILED, end the execution of the state machine. In all other cases, wait 30 seconds and go back to Step 2.

Here is the JSON representing this state machine.

{
  "Comment": "A simple example that submits a Job to AWS Batch",
  "StartAt": "SubmitJob",
  "States": {
    "SubmitJob": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>::function:batchSubmitJob",
      "Next": "GetJobStatus"
    },
    "GetJobStatus": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>:function:batchGetJobStatus",
      "Next": "CheckJobStatus",
      "InputPath": "$",
      "ResultPath": "$.status"
    },
    "CheckJobStatus": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.status",
          "StringEquals": "FAILED",
          "End": true
        },
        {
          "Variable": "$.status",
          "StringEquals": "SUCCEEDED",
          "Next": "GetFinalJobStatus"
        }
      ],
      "Default": "Wait30Seconds"
    },
    "Wait30Seconds": {
      "Type": "Wait",
      "Seconds": 30,
      "Next": "GetJobStatus"
    },
    "GetFinalJobStatus": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>:function:batchGetJobStatus",
      "End": true
    }
  }
}

Building the Lambda functions for the state machine

You need two basic Lambda functions for this state machine. The first one submits a job to AWS Batch and the second checks the status of the AWS Batch job that was submitted.

In AWS Step Functions, you specify an input as JSON that is read into your state machine. Each state receives the aggregate of the steps immediately preceding it, and you can specify which components a state passes on to its children. Because you are using Lambda functions to execute tasks, one of the easiest routes to take is to modify the input JSON, represented as a Python dictionary, within the Lambda function and return the entire dictionary back for the next state to consume.

Building the batchSubmitIsaacJob Lambda function

For Step 1 above, you need a Lambda function for each of the steps in your analysis workflow. As you created a generic Lambda function in the previous post to submit a batch job (batchSubmitJob), you can use that function as the basis for the specialized functions you’ll include in this state machine. Here is such a Lambda function for the Isaac aligner.

from __future__ import print_function

import boto3
import json
import traceback

lambda_client = boto3.client('lambda')



def lambda_handler(event, context):
    try:
        # Generate output put
        bam_s3_path = '/'.join([event['resultsS3Path'], event['sampleId'], 'bam/'])

        depends_on = event['dependsOn'] if 'dependsOn' in event else []

        # Generate run command
        command = [
            '--bam_s3_folder_path', bam_s3_path,
            '--fastq1_s3_path', event['fastq1S3Path'],
            '--fastq2_s3_path', event['fastq2S3Path'],
            '--reference_s3_path', event['isaac']['referenceS3Path'],
            '--working_dir', event['workingDir']
        ]

        if 'cmdArgs' in event['isaac']:
            command.extend(['--cmd_args', event['isaac']['cmdArgs']])
        if 'memory' in event['isaac']:
            command.extend(['--memory', event['isaac']['memory']])

        # Submit Payload
        response = lambda_client.invoke(
            FunctionName='batchSubmitJob',
            InvocationType='RequestResponse',
            LogType='Tail',
            Payload=json.dumps(dict(
                dependsOn=depends_on,
                containerOverrides={
                    'command': command,
                },
                jobDefinition=event['isaac']['jobDefinition'],
                jobName='-'.join(['isaac', event['sampleId']]),
                jobQueue=event['isaac']['jobQueue']
            )))

        response_payload = response['Payload'].read()

        # Update event
        event['bamS3Path'] = bam_s3_path
        event['jobId'] = json.loads(response_payload)['jobId']
        
        return event
    except Exception as e:
        traceback.print_exc()
        raise e

In the Lambda console, create a Python 2.7 Lambda function named batchSubmitIsaacJob and paste in the above code. Use the LambdaBatchExecutionRole that you created in the previous post. For more information, see Step 2.1: Create a Hello World Lambda Function.

This Lambda function reads in the inputs passed to the state machine it is part of, formats the data for the batchSubmitJob Lambda function, invokes that Lambda function, and then modifies the event dictionary to pass onto the subsequent states. You can repeat these for each of the other tools, which can be found in the tools//lambda/lambda_function.py script in the GitHub repo.

Building the batchGetJobStatus Lambda function

For Step 2 above, the process queries the AWS Batch DescribeJobs API action with jobId to identify the state that the job is in. You can put this into a Lambda function to integrate it with Step Functions.

In the Lambda console, create a new Python 2.7 function with the LambdaBatchExecutionRole IAM role. Name your function batchGetJobStatus and paste in the following code. This is similar to the batch-get-job-python27 Lambda blueprint.

from __future__ import print_function

import boto3
import json

print('Loading function')

batch_client = boto3.client('batch')

def lambda_handler(event, context):
    # Log the received event
    print("Received event: " + json.dumps(event, indent=2))
    # Get jobId from the event
    job_id = event['jobId']

    try:
        response = batch_client.describe_jobs(
            jobs=[job_id]
        )
        job_status = response['jobs'][0]['status']
        return job_status
    except Exception as e:
        print(e)
        message = 'Error getting Batch Job status'
        print(message)
        raise Exception(message)

Structuring state machine input

You have structured the state machine input so that general file references are included at the top-level of the JSON object, and any job-specific items are contained within a nested JSON object. At a high level, this is what the input structure looks like:

{
        "general_field_1": "value1",
        "general_field_2": "value2",
        "general_field_3": "value3",
        "job1": {},
        "job2": {},
        "job3": {}
}

Building the full state machine

By chaining these state machine components together, you can quickly build flexible workflows that can process genomes in multiple ways. The development of the larger state machine that defines the entire workflow uses four of the above building blocks. You use the Lambda functions that you built in the previous section. Rename each building block submission to match the tool name.

We have provided a CloudFormation template to deploy your state machine and the associated IAM roles. In the CloudFormation console, select Create Stack, choose your template (deploy_state_machine.yaml), and enter in the ARNs for the Lambda functions you created.

Continue through the rest of the steps and deploy your stack. Be sure to check the box next to "I acknowledge that AWS CloudFormation might create IAM resources."

Once the CloudFormation stack is finished deploying, you should see the following image of your state machine.

In short, you first submit a job for Isaac, which is the aligner you are using for the analysis. Next, you use parallel state to split your output from "GetFinalIsaacJobStatus" and send it to both your variant calling step, Strelka, and your QC step, Samtools Stats. These then are run in parallel and you annotate the results from your Strelka step with snpEff.

Putting it all together

Now that you have built all of the components for a genomics secondary analysis workflow, test the entire process.

We have provided sequences from an Illumina sequencer that cover a region of the genome known as the exome. Most of the positions in the genome that we have currently associated with disease or human traits reside in this region, which is 1–2% of the entire genome. The workflow that you have built works for both analyzing an exome, as well as an entire genome.

Additionally, we have provided prebuilt reference genomes for Isaac, located at:

s3://aws-batch-genomics-resources/reference/

If you are interested, we have provided a script that sets up all of that data. To execute that script, run the following command on a large EC2 instance:

make reference REGISTRY=<your-ecr-registry>

Indexing and preparing this reference takes many hours on a large-memory EC2 instance. Be careful about the costs involved and note that the data is available through the prebuilt reference genomes.

Starting the execution

In a previous section, you established a provenance for the JSON that is fed into your state machine. For ease, we have auto-populated the input JSON for you to the state machine. You can also find this in the GitHub repo under workflow/test.input.json:

{
  "fastq1S3Path": "s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz",
  "fastq2S3Path": "s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz",
  "referenceS3Path": "s3://aws-batch-genomics-resources/reference/hg38.fa",
  "resultsS3Path": "s3://<bucket>/genomic-workflow/results",
  "sampleId": "NA12878_states_1",
  "workingDir": "/scratch",
  "isaac": {
    "jobDefinition": "isaac-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/highPriority-myenv",
    "referenceS3Path": "s3://aws-batch-genomics-resources/reference/isaac/"
  },
  "samtoolsStats": {
    "jobDefinition": "samtools_stats-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/lowPriority-myenv"
  },
  "strelka": {
    "jobDefinition": "strelka-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/highPriority-myenv",
    "cmdArgs": " --exome "
  },
  "snpEff": {
    "jobDefinition": "snpeff-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/lowPriority-myenv",
    "cmdArgs": " -t hg38 "
  }
}

You are now at the stage to run your full genomic analysis. Copy the above to a new text file, change paths and ARNs to the ones that you created previously, and save your JSON input as input.states.json.

In the CLI, execute the following command. You need the ARN of the state machine that you created in the previous post:

aws stepfunctions start-execution --state-machine-arn <your-state-machine-arn> --input file://input.states.json

Your analysis has now started. By using Spot Instances with AWS Batch, you can quickly scale out your workflows while concurrently optimizing for cost. While this is not guaranteed, most executions of the workflows presented here should cost under $1 for a full analysis.

Monitoring the execution

The output from the above CLI command gives you the ARN that describes the specific execution. Copy that and navigate to the Step Functions console. Select the state machine that you created previously and paste the ARN into the search bar.

The screen shows information about your specific execution. On the left, you see where your execution currently is in the workflow.

In the following screenshot, you can see that your workflow has successfully completed the alignment job and moved onto the subsequent steps, which are variant calling and generating quality information about your sample.

You can also navigate to the AWS Batch console and see that progress of all of your jobs reflected there as well.

Finally, after your workflow has completed successfully, check out the S3 path to which you wrote all of your files. If you run a ls –recursive command on the S3 results path, specified in the input to your state machine execution, you should see something similar to the following:

2017-05-02 13:46:32 6475144340 genomic-workflow/results/NA12878_run1/bam/sorted.bam
2017-05-02 13:46:34    7552576 genomic-workflow/results/NA12878_run1/bam/sorted.bam.bai
2017-05-02 13:46:32         45 genomic-workflow/results/NA12878_run1/bam/sorted.bam.md5
2017-05-02 13:53:20      68769 genomic-workflow/results/NA12878_run1/stats/bam_stats.dat
2017-05-02 14:05:12        100 genomic-workflow/results/NA12878_run1/vcf/stats/runStats.tsv
2017-05-02 14:05:12        359 genomic-workflow/results/NA12878_run1/vcf/stats/runStats.xml
2017-05-02 14:05:12  507577928 genomic-workflow/results/NA12878_run1/vcf/variants/genome.S1.vcf.gz
2017-05-02 14:05:12     723144 genomic-workflow/results/NA12878_run1/vcf/variants/genome.S1.vcf.gz.tbi
2017-05-02 14:05:12  507577928 genomic-workflow/results/NA12878_run1/vcf/variants/genome.vcf.gz
2017-05-02 14:05:12     723144 genomic-workflow/results/NA12878_run1/vcf/variants/genome.vcf.gz.tbi
2017-05-02 14:05:12   30783484 genomic-workflow/results/NA12878_run1/vcf/variants/variants.vcf.gz
2017-05-02 14:05:12    1566596 genomic-workflow/results/NA12878_run1/vcf/variants/variants.vcf.gz.tbi

Modifications to the workflow

You have now built and run your genomics workflow. While diving deep into modifications to this architecture are beyond the scope of these posts, we wanted to leave you with several suggestions of how you might modify this workflow to satisfy additional business requirements.

  • Job tracking with Amazon DynamoDB
    In many cases, such as if you are offering Genomics-as-a-Service, you might want to track the state of your jobs with DynamoDB to get fine-grained records of how your jobs are running. This way, you can easily identify the cost of individual jobs and workflows that you run.
  • Resuming from failure
    Both AWS Batch and Step Functions natively support job retries and can cover many of the standard cases where a job might be interrupted. There may be cases, however, where your workflow might fail in a way that is unpredictable. In this case, you can use custom error handling with AWS Step Functions to build out a workflow that is even more resilient. Also, you can build in fail states into your state machine to fail at any point, such as if a batch job fails after a certain number of retries.
  • Invoking Step Functions from Amazon API Gateway
    You can use API Gateway to build an API that acts as a "front door" to Step Functions. You can create a POST method that contains the input JSON to feed into the state machine you built. For more information, see the Implementing Serverless Manual Approval Steps in AWS Step Functions and Amazon API Gateway blog post.

Conclusion

While the approach we have demonstrated in this series has been focused on genomics, it is important to note that this can be generalized to nearly any high-throughput batch workload. We hope that you have found the information useful and that it can serve as a jump-start to building your own batch workloads on AWS with native AWS services.

For more information about how AWS can enable your genomics workloads, be sure to check out the AWS Genomics page.

Other posts in this four-part series:

Please leave any questions and comments below.