Tag Archives: zip

Security updates for Friday

Post Syndicated from ris original https://lwn.net/Articles/734606/rss

Security updates have been issued by CentOS (augeas, samba, and samba4), Debian (apache2, bluez, emacs23, and newsbeuter), Fedora (kernel and mingw-LibRaw), openSUSE (apache2 and libzip), Oracle (kernel), SUSE (kernel, spice, and xen), and Ubuntu (emacs24, emacs25, and samba).

Security updates for Wednesday

Post Syndicated from ris original https://lwn.net/Articles/734318/rss

Security updates have been issued by CentOS (emacs), Debian (apache2, gdk-pixbuf, and pyjwt), Fedora (autotrace, converseen, dmtx-utils, drawtiming, emacs, gtatool, imageinfo, ImageMagick, inkscape, jasper, k3d, kxstitch, libwpd, mingw-libzip, perl-Image-SubImageFind, pfstools, php-pecl-imagick, psiconv, q, rawtherapee, ripright, rss-glx, rubygem-rmagick, synfig, synfigstudio, techne, vdr-scraper2vdr, vips, and WindowMaker), Oracle (emacs and kernel), Red Hat (emacs and kernel), Scientific Linux (emacs), SUSE (emacs), and Ubuntu (apache2).

Laser Cookies: a YouTube collaboration

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/laser-cookies/

Lasers! Cookies! Raspberry Pi! We’re buzzing with excitement about sharing our latest YouTube video with you, which comes directly from the kitchen of maker Estefannie Explains It All!

Laser-guarded cookies feat. Estefannie Explains It All

Uploaded by Raspberry Pi on 2017-09-18.

Estefannie Explains It All + Raspberry Pi

When Estefannie visited Pi Towers earlier this year, we introduced her to the Raspberry Pi Digital Curriculum and the free resources on our website. We’d already chatted to her via email about the idea of creating a collab video for the Raspberry Pi channel. Once she’d met members of the Raspberry Pi Foundation team and listened to them wax lyrical about the work we do here, she was even more keen to collaborate with us.

Estefannie on Twitter

Ahhhh!!! I still can’t believe I got to hang out and make stuff at the @Raspberry_Pi towers!! Thank you thank you!!

Estefannie returned to the US filled with inspiration for a video for our channel, and we’re so pleased with how awesome her final result is. The video is a super addition to our Raspberry Pi YouTube channel, it shows what our resources can help you achieve, and it’s great fun. You might also have noticed that the project fits in perfectly with this season’s Pioneers challenge. A win all around!

So yeah, we’re really chuffed about this video, and we hope you all like it too!

Estefannie’s Laser Cookies guide

For those of you wanting to try your hand at building your own Cookie Jar Laser Surveillance Security System, Estefannie has provided a complete guide to talk you through it. Here she goes:

First off, you’ll need:

  • 10 lasers
  • 10 photoresistors
  • 10 capacitors
  • 1 Raspberry Pi Zero W
  • 1 buzzer
  • 1 Raspberry Pi Camera Module
  • 12 ft PVC pipes + 4 corners
  • 1 acrylic panel
  • 1 battery pack
  • 8 zip ties
  • tons of cookies

I used the Raspberry Pi Foundation’s Laser trip wire and the Tweeting Babbage resources to get one laser working and to set up the camera and Twitter API. This took me less than an hour, and it was easy, breezy, beautiful, Raspberry Pi.

I soldered ten lasers in parallel and connected ten photoresistors to their own GPIO pins. I didn’t wire them up in series because of sensitivity reasons and to make debugging easier.

Building the frame took a few tries: I actually started with a wood frame, then tried a clear case, and finally realized the best and cleaner solution would be pipes. All the wires go inside the pipes and come out in a small window on the top to wire up to the Zero W.

Using pipes also made the build cheaper, since they were about $3 for 12 ft. Wiring inside the pipes was tricky, and to finish the circuit, I soldered some of the wires after they were already in the pipes.

I tried glueing the lasers to the frame, but the lasers melted the glue and became decalibrated. Next I tried tape, and then I found picture mounting putty. The putty worked perfectly — it was easy to mold a putty base for the lasers and to calibrate and re-calibrate them if needed. Moreover, the lasers stayed in place no matter how hot they got.

Estefannie Explains It All Raspberry Pi Cookie Jar

Although the lasers were not very strong, I still strained my eyes after long hours of calibrating — hence the sunglasses! Working indoors with lasers, sunglasses, and code was weird. But now I can say I’ve done that…in my kitchen.

Using all the knowledge I have shared, this project should take a couple of hours. The code you need lives on my GitHub!

Estefannie Explains It All Raspberry Pi Cookie Jar

“The cookie recipe is my grandma’s, and I am not allowed to share it.”

Estefannie on YouTube

Estefannie made this video for us as a gift, and we’re so grateful for the time and effort she put into it! If you enjoyed it and would like to also show your gratitude, subscribe to her channel on YouTube and follow her on Instagram and Twitter. And if you make something similar, or build anything with our free resources, make sure to share it with us in the comments below or via our social media channels.

The post Laser Cookies: a YouTube collaboration appeared first on Raspberry Pi.

Using AWS CodePipeline, AWS CodeBuild, and AWS Lambda for Serverless Automated UI Testing

Post Syndicated from Prakash Palanisamy original https://aws.amazon.com/blogs/devops/using-aws-codepipeline-aws-codebuild-and-aws-lambda-for-serverless-automated-ui-testing/

Testing the user interface of a web application is an important part of the development lifecycle. In this post, I’ll explain how to automate UI testing using serverless technologies, including AWS CodePipeline, AWS CodeBuild, and AWS Lambda.

I built a website for UI testing that is hosted in S3. I used Selenium to perform cross-browser UI testing on Chrome, Firefox, and PhantomJS, a headless WebKit browser with Ghost Driver, an implementation of the WebDriver Wire Protocol. I used Python to create test cases for ChromeDriver, FirefoxDriver, or PhatomJSDriver based the browser against which the test is being executed.

Resources referred to in this post, including the AWS CloudFormation template, test and status websites hosted in S3, AWS CodeBuild build specification files, AWS Lambda function, and the Python script that performs the test are available in the serverless-automated-ui-testing GitHub repository.

S3 Hosted Test Website:

AWS CodeBuild supports custom containers so we can use the Selenium/standalone-Firefox and Selenium/standalone-Chrome containers, which include prebuild Firefox and Chrome browsers, respectively. Xvfb performs the graphical operation in virtual memory without any display hardware. It will be installed in the CodeBuild containers during the install phase.

Build Spec for Chrome and Firefox

The build specification for Chrome and Firefox testing includes multiple phases:

  • The environment variables section contains a set of default variables that are overridden while creating the build project or triggering the build.
  • As part of install phase, required packages like Xvfb and Selenium are installed using yum.
  • During the pre_build phase, the test bed is prepared for test execution.
  • During the build phase, the appropriate DISPLAY is set and the tests are executed.
version: 0.2

    BROWSER: "chrome"
    WebURL: "https://sampletestweb.s3-eu-west-1.amazonaws.com/website/index.html"
    ArtifactBucket: "codebuild-demo-artifact-repository"
    MODULES: "mod1"
    ModuleTable: "test-modules"
    StatusTable: "blog-test-status"

      - apt-get update
      - apt-get -y upgrade
      - apt-get install xvfb python python-pip build-essential -y
      - pip install --upgrade pip
      - pip install selenium
      - pip install awscli
      - pip install requests
      - pip install boto3
      - cp xvfb.init /etc/init.d/xvfb
      - chmod +x /etc/init.d/xvfb
      - update-rc.d xvfb defaults
      - service xvfb start
      - export PATH="$PATH:`pwd`/webdrivers"
      - python prepare_test.py
      - export DISPLAY=:5
      - cd tests
      - echo "Executing simple test..."
      - python testsuite.py

Because Ghost Driver runs headless, it can be executed on AWS Lambda. In keeping with a fire-and-forget model, I used CodeBuild to create the PhantomJS Lambda function and trigger the test invocations on Lambda in parallel. This is powerful because many tests can be executed in parallel on Lambda.

Build Spec for PhantomJS

The build specification for PhantomJS testing also includes multiple phases. It is a little different from the preceding example because we are using AWS Lambda for the test execution.

  • The environment variables section contains a set of default variables that are overridden while creating the build project or triggering the build.
  • As part of install phase, the required packages like Selenium and the AWS CLI are installed using yum.
  • During the pre_build phase, the test bed is prepared for test execution.
  • During the build phase, a zip file that will be used to create the PhantomJS Lambda function is created and tests are executed on the Lambda function.
version: 0.2

    BROWSER: "phantomjs"
    WebURL: "https://sampletestweb.s3-eu-west-1.amazonaws.com/website/index.html"
    ArtifactBucket: "codebuild-demo-artifact-repository"
    MODULES: "mod1"
    ModuleTable: "test-modules"
    StatusTable: "blog-test-status"
    LambdaRole: "arn:aws:iam::account-id:role/role-name"

      - apt-get update
      - apt-get -y upgrade
      - apt-get install python python-pip build-essential -y
      - apt-get install zip unzip -y
      - pip install --upgrade pip
      - pip install selenium
      - pip install awscli
      - pip install requests
      - pip install boto3
      - python prepare_test.py
      - cd lambda_function
      - echo "Packaging Lambda Function..."
      - zip -r /tmp/lambda_function.zip ./*
      - func_name=`echo $CODEBUILD_BUILD_ID | awk -F ':' '{print $1}'`-phantomjs
      - echo "Creating Lambda Function..."
      - chmod 777 phantomjs
      - |
         func_list=`aws lambda list-functions | grep FunctionName | awk -F':' '{print $2}' | tr -d ', "'`
         if echo "$func_list" | grep -qw $func_name
             echo "Lambda function already exists."
             aws lambda create-function --function-name $func_name --runtime "python2.7" --role $LambdaRole --handler "testsuite.lambda_handler" --zip-file fileb:///tmp/lambda_function.zip --timeout 150 --memory-size 1024 --environment Variables="{WebURL=$WebURL, StatusTable=$StatusTable}" --tags Name=$func_name
      - export PhantomJSFunction=$func_name
      - cd ../tests/
      - python testsuite.py

The list of test cases and the test modules that belong to each case are stored in an Amazon DynamoDB table. Based on the list of modules passed as an argument to the CodeBuild project, CodeBuild gets the test cases from that table and executes them. The test execution status and results are stored in another Amazon DynamoDB table. It will read the test status from the status table in DynamoDB and display it.

AWS CodeBuild and AWS Lambda perform the test execution as individual tasks. AWS CodePipeline plays an important role here by enabling continuous delivery and parallel execution of tests for optimized testing.

Here’s how to do it:

In AWS CodePipeline, create a pipeline with four stages:

  • Source (AWS CodeCommit)
  • UI testing (AWS Lambda and AWS CodeBuild)
  • Approval (manual approval)
  • Production (AWS Lambda)

Pipeline stages, the actions in each stage, and transitions between stages are shown in the following diagram.

This design implemented in AWS CodePipeline looks like this:

CodePipeline automatically detects a change in the source repository and triggers the execution of the pipeline.

In the UITest stage, there are two parallel actions:

  • DeployTestWebsite invokes a Lambda function to deploy the test website in S3 as an S3 website.
  • DeployStatusPage invokes another Lambda function to deploy in parallel the status website in S3 as an S3 website.

Next, there are three parallel actions that trigger the CodeBuild project:

  • TestOnChrome launches a container to perform the Selenium tests on Chrome.
  • TestOnFirefox launches another container to perform the Selenium tests on Firefox.
  • TestOnPhantomJS creates a Lambda function and invokes individual Lambda functions per test case to execute the test cases in parallel.

You can monitor the status of the test execution on the status website, as shown here:

When the UI testing is completed successfully, the pipeline continues to an Approval stage in which a notification is sent to the configured SNS topic. The designated team member reviews the test status and approves or rejects the deployment. Upon approval, the pipeline continues to the Production stage, where it invokes a Lambda function and deploys the website to a production S3 bucket.

I used a CloudFormation template to set up my continuous delivery pipeline. The automated-ui-testing.yaml template, available from GitHub, sets up a full-featured pipeline.

When I use the template to create my pipeline, I specify the following:

  • AWS CodeCommit repository.
  • SNS topic to send approval notification.
  • S3 bucket name where the artifacts will be stored.

The stack name should follow the rules for S3 bucket naming because it will be part of the S3 bucket name.

When the stack is created successfully, the URLs for the test website and status website appear in the Outputs section, as shown here:


In this post, I showed how you can use AWS CodePipeline, AWS CodeBuild, AWS Lambda, and a manual approval process to create a continuous delivery pipeline for serverless automated UI testing. Websites running on Amazon EC2 instances or AWS Elastic Beanstalk can also be tested using similar approach.

About the author

Prakash Palanisamy is a Solutions Architect for Amazon Web Services. When he is not working on Serverless, DevOps or Alexa, he will be solving problems in Project Euler. He also enjoys watching educational documentaries.

Malicious software libraries found in PyPI

Post Syndicated from ris original https://lwn.net/Articles/733853/rss

An advisory
from the National Security Authority of Slovakia warns that they have found
fake packages in PyPI, posing as well known libraries. “Copies of
several well known Python packages
were published under slightly modified names in the official Python package
repository PyPI (prominent example includes urllib vs. urrlib3, bzip
vs. bzip2, etc.). These packages contain the exact same code as their
upstream package thus their functionality is the same, but the installation
script, setup.py, is modified to include a malicious (but relatively
benign) code.
” The administrators of PyPI were informed and the
fake packages are gone now, however they were available from June 2017 to
September 2017. (Thanks to Paul Wise)

Manage Kubernetes Clusters on AWS Using CoreOS Tectonic

Post Syndicated from Arun Gupta original https://aws.amazon.com/blogs/compute/kubernetes-clusters-aws-coreos-tectonic/

There are multiple ways to run a Kubernetes cluster on Amazon Web Services (AWS). The first post in this series explained how to manage a Kubernetes cluster on AWS using kops. This second post explains how to manage a Kubernetes cluster on AWS using CoreOS Tectonic.

Tectonic overview

Tectonic delivers the most current upstream version of Kubernetes with additional features. It is a commercial offering from CoreOS and adds the following features over the upstream:

  • Installer
    Comes with a graphical installer that installs a highly available Kubernetes cluster. Alternatively, the cluster can be installed using AWS CloudFormation templates or Terraform scripts.
  • Operators
    An operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex stateful applications on behalf of a Kubernetes user. This release includes an etcd operator for rolling upgrades and a Prometheus operator for monitoring capabilities.
  • Console
    A web console provides a full view of applications running in the cluster. It also allows you to deploy applications to the cluster and start the rolling upgrade of the cluster.
  • Monitoring
    Node CPU and memory metrics are powered by the Prometheus operator. The graphs are available in the console. A large set of preconfigured Prometheus alerts are also available.
  • Security
    Tectonic ensures that cluster is always up to date with the most recent patches/fixes. Tectonic clusters also enable role-based access control (RBAC). Different roles can be mapped to an LDAP service.
  • Support
    CoreOS provides commercial support for clusters created using Tectonic.

Tectonic can be installed on AWS using a GUI installer or Terraform scripts. The installer prompts you for the information needed to boot the Kubernetes cluster, such as AWS access and secret key, number of master and worker nodes, and instance size for the master and worker nodes. The cluster can be created after all the options are specified. Alternatively, Terraform assets can be downloaded and the cluster can be created later. This post shows using the installer.

CoreOS License and Pull Secret

Even though Tectonic is a commercial offering, a cluster for up to 10 nodes can be created by creating a free account at Get Tectonic for Kubernetes. After signup, a CoreOS License and Pull Secret files are provided on your CoreOS account page. Download these files as they are needed by the installer to boot the cluster.

IAM user permission

The IAM user to create the Kubernetes cluster must have access to the following services and features:

  • Amazon Route 53
  • Amazon EC2
  • Elastic Load Balancing
  • Amazon S3
  • Amazon VPC
  • Security groups

Use the aws-policy policy to grant the required permissions for the IAM user.

DNS configuration

A subdomain is required to create the cluster, and it must be registered as a public Route 53 hosted zone. The zone is used to host and expose the console web application. It is also used as the static namespace for the Kubernetes API server. This allows kubectl to be able to talk directly with the master.

The domain may be registered using Route 53. Alternatively, a domain may be registered at a third-party registrar. This post uses a kubernetes-aws.io domain registered at a third-party registrar and a tectonic subdomain within it.

Generate a Route 53 hosted zone using the AWS CLI. Download jq to run this command:

ID=$(uuidgen) && \
aws route53 create-hosted-zone \
--name tectonic.kubernetes-aws.io \
--caller-reference $ID \
| jq .DelegationSet.NameServers

The command shows an output such as the following:


Create NS records for the domain with your registrar. Make sure that the NS records can be resolved using a utility like dig web interface. A sample output would look like the following:

The bottom of the screenshot shows NS records configured for the subdomain.

Download and run the Tectonic installer

Download the Tectonic installer (version 1.7.1) and extract it. The latest installer can always be found at coreos.com/tectonic. Start the installer:


Replace $PLATFORM with either darwin or linux. The installer opens your default browser and prompts you to select the cloud provider. Choose Amazon Web Services as the platform. Choose Next Step.

Specify the Access Key ID and Secret Access Key for the IAM role that you created earlier. This allows the installer to create resources required for the Kubernetes cluster. This also gives the installer full access to your AWS account. Alternatively, to protect the integrity of your main AWS credentials, use a temporary session token to generate temporary credentials.

You also need to choose a region in which to install the cluster. For the purpose of this post, I chose a region close to where I live, Northern California. Choose Next Step.

Give your cluster a name. This name is part of the static namespace for the master and the address of the console.

To enable in-place update to the Kubernetes cluster, select the checkbox next to Automated Updates. It also enables update to the etcd and Prometheus operators. This feature may become a default in future releases.

Choose Upload “tectonic-license.txt” and upload the previously downloaded license file.

Choose Upload “config.json” and upload the previously downloaded pull secret file. Choose Next Step.

Let the installer generate a CA certificate and key. In this case, the browser may not recognize this certificate, which I discuss later in the post. Alternatively, you can provide a CA certificate and a key in PEM format issued by an authorized certificate authority. Choose Next Step.

Use the SSH key for the region specified earlier. You also have an option to generate a new key. This allows you to later connect using SSH into the Amazon EC2 instances provisioned by the cluster. Here is the command that can be used to log in:

ssh –i <key> [email protected]<ec2-instance-ip>

Choose Next Step.

Define the number and instance type of master and worker nodes. In this case, create a 6 nodes cluster. Make sure that the worker nodes have enough processing power and memory to run the containers.

An etcd cluster is used as persistent storage for all of Kubernetes API objects. This cluster is required for the Kubernetes cluster to operate. There are three ways to use the etcd cluster as part of the Tectonic installer:

  • (Default) Provision the cluster using EC2 instances. Additional EC2 instances are used in this case.
  • Use an alpha support for cluster provisioning using the etcd operator. The etcd operator is used for automated operations of the etcd master nodes for the cluster itself, in addition to for etcd instances that are created for application usage. The etcd cluster is provisioned within the Tectonic installer.
  • Bring your own pre-provisioned etcd cluster.

Use the first option in this case.

For more information about choosing the appropriate instance type, see the etcd hardware recommendation. Choose Next Step.

Specify the networking options. The installer can create a new public VPC or use a pre-existing public or private VPC. Make sure that the VPC requirements are met for an existing VPC.

Give a DNS name for the cluster. Choose the domain for which the Route 53 hosted zone was configured earlier, such as tectonic.kubernetes-aws.io. Multiple clusters may be created under a single domain. The cluster name and the DNS name would typically match each other.

To select the CIDR range, choose Show Advanced Settings. You can also choose the Availability Zones for the master and worker nodes. By default, the master and worker nodes are spread across multiple Availability Zones in the chosen region. This makes the cluster highly available.

Leave the other values as default. Choose Next Step.

Specify an email address and password to be used as credentials to log in to the console. Choose Next Step.

At any point during the installation, you can choose Save progress. This allows you to save configurations specified in the installer. This configuration file can then be used to restore progress in the installer at a later point.

To start the cluster installation, choose Submit. At another time, you can download the Terraform assets by choosing Manually boot. This allows you to boot the cluster later.

The logs from the Terraform scripts are shown in the installer. When the installation is complete, the console shows that the Terraform scripts were successfully applied, the domain name was resolved successfully, and that the console has started. The domain works successfully if the DNS resolution worked earlier, and it’s the address where the console is accessible.

Choose Download assets to download assets related to your cluster. It contains your generated CA, kubectl configuration file, and the Terraform state. This download is an important step as it allows you to delete the cluster later.

Choose Next Step for the final installation screen. It allows you to access the Tectonic console, gives you instructions about how to configure kubectl to manage this cluster, and finally deploys an application using kubectl.

Choose Go to my Tectonic Console. In our case, it is also accessible at http://cluster.tectonic.kubernetes-aws.io/.

As I mentioned earlier, the browser does not recognize the self-generated CA certificate. Choose Advanced and connect to the console. Enter the login credentials specified earlier in the installer and choose Login.

The Kubernetes upstream and console version are shown under Software Details. Cluster health shows All systems go and it means that the API server and the backend API can be reached.

To view different Kubernetes resources in the cluster choose, the resource in the left navigation bar. For example, all deployments can be seen by choosing Deployments.

By default, resources in the all namespace are shown. Other namespaces may be chosen by clicking on a menu item on the top of the screen. Different administration tasks such as managing the namespaces, getting list of the nodes and RBAC can be configured as well.

Download and run Kubectl

Kubectl is required to manage the Kubernetes cluster. The latest version of kubectl can be downloaded using the following command:

curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/darwin/amd64/kubectl

It can also be conveniently installed using the Homebrew package manager. To find and access a cluster, Kubectl needs a kubeconfig file. By default, this configuration file is at ~/.kube/config. This file is created when a Kubernetes cluster is created from your machine. However, in this case, download this file from the console.

In the console, choose admin, My Account, Download Configuration and follow the steps to download the kubectl configuration file. Move this file to ~/.kube/config. If kubectl has already been used on your machine before, then this file already exists. Make sure to take a backup of that file first.

Now you can run the commands to view the list of deployments:

~ $ kubectl get deployments --all-namespaces
NAMESPACE         NAME                                    DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-system       etcd-operator                           1         1         1            1           43m
kube-system       heapster                                1         1         1            1           40m
kube-system       kube-controller-manager                 3         3         3            3           43m
kube-system       kube-dns                                1         1         1            1           43m
kube-system       kube-scheduler                          3         3         3            3           43m
tectonic-system   container-linux-update-operator         1         1         1            1           40m
tectonic-system   default-http-backend                    1         1         1            1           40m
tectonic-system   kube-state-metrics                      1         1         1            1           40m
tectonic-system   kube-version-operator                   1         1         1            1           40m
tectonic-system   prometheus-operator                     1         1         1            1           40m
tectonic-system   tectonic-channel-operator               1         1         1            1           40m
tectonic-system   tectonic-console                        2         2         2            2           40m
tectonic-system   tectonic-identity                       2         2         2            2           40m
tectonic-system   tectonic-ingress-controller             1         1         1            1           40m
tectonic-system   tectonic-monitoring-auth-alertmanager   1         1         1            1           40m
tectonic-system   tectonic-monitoring-auth-prometheus     1         1         1            1           40m
tectonic-system   tectonic-prometheus-operator            1         1         1            1           40m
tectonic-system   tectonic-stats-emitter                  1         1         1            1           40m

This output is similar to the one shown in the console earlier. Now, this kubectl can be used to manage your resources.

Upgrade the Kubernetes cluster

Tectonic allows the in-place upgrade of the cluster. This is an experimental feature as of this release. The clusters can be updated either automatically, or with manual approval.

To perform the update, choose Administration, Cluster Settings. If an earlier Tectonic installer, version 1.6.2 in this case, is used to install the cluster, then this screen would look like the following:

Choose Check for Updates. If any updates are available, choose Start Upgrade. After the upgrade is completed, the screen is refreshed.

This is an experimental feature in this release and so should only be used on clusters that can be easily replaced. This feature may become a fully supported in a future release. For more information about the upgrade process, see Upgrading Tectonic & Kubernetes.

Delete the Kubernetes cluster

Typically, the Kubernetes cluster is a long-running cluster to serve your applications. After its purpose is served, you may delete it. It is important to delete the cluster as this ensures that all resources created by the cluster are appropriately cleaned up.

The easiest way to delete the cluster is using the assets downloaded in the last step of the installer. Extract the downloaded zip file. This creates a directory like <cluster-name>_TIMESTAMP. In that directory, give the following command to delete the cluster:

TERRAFORM_CONFIG=$(pwd)/.terraformrc terraform destroy --force

This destroys the cluster and all associated resources.

You may have forgotten to download the assets. There is a copy of the assets in the directory tectonic/tectonic-installer/darwin/clusters. In this directory, another directory with the name <cluster-name>_TIMESTAMP contains your assets.


This post explained how to manage Kubernetes clusters using the CoreOS Tectonic graphical installer.  For more details, see Graphical Installer with AWS. If the installation does not succeed, see the helpful Troubleshooting tips. After the cluster is created, see the Tectonic tutorials to learn how to deploy, scale, version, and delete an application.

Future posts in this series will explain other ways of creating and running a Kubernetes cluster on AWS.


Security updates for Wednesday

Post Syndicated from ris original https://lwn.net/Articles/733583/rss

Security updates have been issued by Arch Linux (bluez and linux-hardened), CentOS (bluez and kernel), Debian (bluez, emacs24, tcpdump, and xen), Fedora (kernel and mimedefang), Oracle (bluez and kernel), Red Hat (bluez, flash-plugin, instack-undercloud, kernel, kernel-rt, and openvswitch), Scientific Linux (bluez and kernel), Slackware (emacs and libzip), SUSE (xen), and Ubuntu (bluez and qemu).

Security updates for Wednesday

Post Syndicated from ris original https://lwn.net/Articles/733040/rss

Security updates have been issued by Debian (file, icedove, irssi, ruby2.3, and tcpdump), Fedora (libzip and openjpeg2), openSUSE (clamav-database, icu, libzypp, zypper, and php5), Oracle (389-ds-base), Red Hat (rh-maven33-groovy), SUSE (postgresql94, postgresql96, and python-pycrypto), and Ubuntu (bzr and libgd2).

Implement Continuous Integration and Delivery of Apache Spark Applications using AWS

Post Syndicated from Luis Caro Perez original https://aws.amazon.com/blogs/big-data/implement-continuous-integration-and-delivery-of-apache-spark-applications-using-aws/

When you develop Apache Spark–based applications, you might face some additional challenges when dealing with continuous integration and deployment pipelines, such as the following common issues:

  • Applications must be tested on real clusters using automation tools (live test)
  • Any user or developer must be able to easily deploy and use different versions of both the application and infrastructure to be able to debug, experiment on, and test different functionality.
  • Infrastructure needs to be evaluated and tested along with the application that uses it.

In this post, we walk you through a solution that implements a continuous integration and deployment pipeline supported by AWS services. The pipeline offers the following workflow:

  • Deploy the application to a QA stage after a commit is performed to the source code.
  • Perform a unit test using Spark local mode.
  • Deploy to a dynamically provisioned Amazon EMR cluster and test the Spark application on it
  • Update the application as an AWS Service Catalog product version, allowing a user to deploy any version (commit) of the application on demand.

Solution overview

The following diagram shows the pipeline workflow.

The solution uses AWS CodePipeline, which allows users to orchestrate and automate the build, test, and deploy stages for application source code. The solution consists of a pipeline that contains the following stages:

  • Source: Both the Spark application source code in addition to the AWS CloudFormation template file for deploying the application are committed to version control. In this example, we use AWS CodeCommit. For an example of the application source code, see zip. 
  • Build: In this stage, you use Apache Maven both to generate the application .jar binaries and to execute all of the application unit tests that end with *Spec.scala. In this example, we use AWS CodeBuild, which runs the unit tests given that they are designed to use Spark local mode.
  • QADeploy: In this stage, the .jar file built previously is deployed using the CloudFormation template included with the application source code. All the resources are created in this stage, such as networks, EMR clusters, and so on. 
  • LiveTest: In this stage, you use Apache Maven to execute all the application tests that end with *SpecLive.scala. The tests submit EMR steps to the cluster created as part of the QADeploy step. The tests verify that the steps ran successfully and their results. 
  • LiveTestApproval: This stage is included in case a pipeline administrator approval is required to deploy the application to the next stages. The pipeline pauses in this stage until an administrator manually approves the release. 
  • QACleanup: In this stage, you use an AWS Lambda function to delete the CloudFormation template deployed as part of the QADeploy stage. The function does not affect any resources other than those deployed as part of the QADeploy stage. 
  • DeployProduct: In this stage, you use a Lambda function that creates or updates an AWS Service Catalog product and portfolio. Every time the pipeline releases a change to the application, the AWS Service Catalog product gets a new version, with the commit of the change as the version description. 

Try it out!

Use the provided sample template to get started using this solution. This template creates the pipeline described earlier with all of its stages. It performs an initial commit of the sample Spark application in order to trigger the first release change. To deploy the template, use the following AWS CLI command:

aws cloudformation create-stack  --template-url https://s3.amazonaws.com/aws-bigdata-blog/artifacts/sparkAppDemoForPipeline/emrSparkpipeline.yaml --stack-name emr-spark-pipeline --capabilities CAPABILITY_NAMED_IAM

After the template finishes creating resources, you see the pipeline name on the stack Outputs tab. After that, open the AWS CodePipeline console and select the newly created pipeline.

After a couple of minutes, AWS CodePipeline detects the initial commit applied by the CloudFormation stack and starts the first release.

You can watch how the pipeline goes through the Build, QADeploy, and LiveTest stages until it finally reaches the LiveTestApproval stage.

At this point, you can check the results of the test in the log files of the Build and LiveTest stage jobs on AWS CodeBuild. If you check the CloudFormation console, you see that a new template has been deployed as part of the QADeploy stage.

You can also visit the EMR console and view how the LiveTest stage submitted steps to the EMR cluster.

After performing the review, manually approve the revision on the LiveTestApproval stage by using the AWS CodePipeline console.

After the revision is approved, the pipeline proceeds to use a Lambda function that destroys the resources deployed on the QAdeploy stage. Finally, it creates or updates a product and portfolio in AWS Service Catalog. After the final stage of the pipeline is complete, you can check that the product is created successfully on the AWS Service Catalog console.

You can check the product versions and notice that the first version is the initial commit performed by the CloudFormation template.

You can proceed to share the created portfolio with any users in your AWS account and allow them to deploy any version of the Spark application. You can also perform a commit on the AWS CodeCommit repository. The pipeline is triggered automatically and repeats the pipeline execution to deploy a new version of the product.

To destroy all of the resources created by the stack, make sure all the deployed stacks using AWS Service Catalog or the QAdeploy stage are destroyed. Then, destroy the pipeline template using the following AWS CLI command:


aws cloudformation delete-stack --stack-name emr-spark-pipeline


You can use the sample template and Spark application shared in this post and adapt them for the specific needs of your own application. The pipeline can have as many stages as needed and it can be used to automatically deploy to AWS Service Catalog or a production environment using CloudFormation.

If you have questions or suggestions, please comment below.

Additional Reading

Learn how to implement authorization and auditing on Amazon EMR using Apache Ranger.


About the Authors

Luis Caro is a Big Data Consultant for AWS Professional Services. He works with our customers to provide guidance and technical assistance on big data projects, helping them improving the value of their solutions when using AWS.



Samuel Schmidt is a Big Data Consultant for AWS Professional Services. He works with our customers to provide guidance and technical assistance on big data projects, helping them improving the value of their solutions when using AWS.




Security updates for Friday

Post Syndicated from corbet original https://lwn.net/Articles/732649/rss

Security updates have been issued by CentOS (openssh, poppler, and thunderbird), Debian (graphicsmagick and openexr), Fedora (cacti, dnsdist, exim, groovy18, kernel, libsndfile, mingw-libzip, and taglib), Oracle (openssh), Red Hat (openssh), Scientific Linux (openssh), and SUSE (git and xen).

Security updates for Monday

Post Syndicated from jake original https://lwn.net/Articles/731567/rss

Security updates have been issued by Arch Linux (newsbeuter), Debian (augeas, curl, ioquake3, libxml2, newsbeuter, and strongswan), Fedora (bodhi, chicken, chromium, cryptlib, cups-filters, cyrus-imapd, glibc, mingw-openjpeg2, mingw-postgresql, qpdf, and torbrowser-launcher), Gentoo (bzip2, evilvte, ghostscript-gpl, Ked Password Manager, and rar), Mageia (curl, cvs, fossil, jetty, kernel, kernel-linus, kernel-tmb, libmspack, mariadb, mercurial, potrace, ruby, and taglib), Oracle (kernel), Red Hat (xmlsec1), and Ubuntu (graphite2 and strongswan).

Analyzing AWS Cost and Usage Reports with Looker and Amazon Athena

Post Syndicated from Dillon Morrison original https://aws.amazon.com/blogs/big-data/analyzing-aws-cost-and-usage-reports-with-looker-and-amazon-athena/

This is a guest post by Dillon Morrison at Looker. Looker is, in their own words, “a new kind of analytics platform–letting everyone in your business make better decisions by getting reliable answers from a tool they can use.” 

As the breadth of AWS products and services continues to grow, customers are able to more easily move their technology stack and core infrastructure to AWS. One of the attractive benefits of AWS is the cost savings. Rather than paying upfront capital expenses for large on-premises systems, customers can instead pay variables expenses for on-demand services. To further reduce expenses AWS users can reserve resources for specific periods of time, and automatically scale resources as needed.

The AWS Cost Explorer is great for aggregated reporting. However, conducting analysis on the raw data using the flexibility and power of SQL allows for much richer detail and insight, and can be the better choice for the long term. Thankfully, with the introduction of Amazon Athena, monitoring and managing these costs is now easier than ever.

In the post, I walk through setting up the data pipeline for cost and usage reports, Amazon S3, and Athena, and discuss some of the most common levers for cost savings. I surface tables through Looker, which comes with a host of pre-built data models and dashboards to make analysis of your cost and usage data simple and intuitive.

Analysis with Athena

With Athena, there’s no need to create hundreds of Excel reports, move data around, or deploy clusters to house and process data. Athena uses Apache Hive’s DDL to create tables, and the Presto querying engine to process queries. Analysis can be performed directly on raw data in S3. Conveniently, AWS exports raw cost and usage data directly into a user-specified S3 bucket, making it simple to start querying with Athena quickly. This makes continuous monitoring of costs virtually seamless, since there is no infrastructure to manage. Instead, users can leverage the power of the Athena SQL engine to easily perform ad-hoc analysis and data discovery without needing to set up a data warehouse.

After the data pipeline is established, cost and usage data (the recommended billing data, per AWS documentation) provides a plethora of comprehensive information around usage of AWS services and the associated costs. Whether you need the report segmented by product type, user identity, or region, this report can be cut-and-sliced any number of ways to properly allocate costs for any of your business needs. You can then drill into any specific line item to see even further detail, such as the selected operating system, tenancy, purchase option (on-demand, spot, or reserved), and so on.


By default, the Cost and Usage report exports CSV files, which you can compress using gzip (recommended for performance). There are some additional configuration options for tuning performance further, which are discussed below.


If you want to follow along, you need the following resources:

Enable the cost and usage reports

First, enable the Cost and Usage report. For Time unit, select Hourly. For Include, select Resource IDs. All options are prompted in the report-creation window.

The Cost and Usage report dumps CSV files into the specified S3 bucket. Please note that it can take up to 24 hours for the first file to be delivered after enabling the report.

Configure the S3 bucket and files for Athena querying

In addition to the CSV file, AWS also creates a JSON manifest file for each cost and usage report. Athena requires that all of the files in the S3 bucket are in the same format, so we need to get rid of all these manifest files. If you’re looking to get started with Athena quickly, you can simply go into your S3 bucket and delete the manifest file manually, skip the automation described below, and move on to the next section.

To automate the process of removing the manifest file each time a new report is dumped into S3, which I recommend as you scale, there are a few additional steps. The folks at Concurrency labs wrote a great overview and set of scripts for this, which you can find in their GitHub repo.

These scripts take the data from an input bucket, remove anything unnecessary, and dump it into a new output bucket. We can utilize AWS Lambda to trigger this process whenever new data is dropped into S3, or on a nightly basis, or whatever makes most sense for your use-case, depending on how often you’re querying the data. Please note that enabling the “hourly” report means that data is reported at the hour-level of granularity, not that a new file is generated every hour.

Following these scripts, you’ll notice that we’re adding a date partition field, which isn’t necessary but improves query performance. In addition, converting data from CSV to a columnar format like ORC or Parquet also improves performance. We can automate this process using Lambda whenever new data is dropped in our S3 bucket. Amazon Web Services discusses columnar conversion at length, and provides walkthrough examples, in their documentation.

As a long-term solution, best practice is to use compression, partitioning, and conversion. However, for purposes of this walkthrough, we’re not going to worry about them so we can get up-and-running quicker.

Set up the Athena query engine

In your AWS console, navigate to the Athena service, and click “Get Started”. Follow the tutorial and set up a new database (we’ve called ours “AWS Optimizer” in this example). Don’t worry about configuring your initial table, per the tutorial instructions. We’ll be creating a new table for cost and usage analysis. Once you walked through the tutorial steps, you’ll be able to access the Athena interface, and can begin running Hive DDL statements to create new tables.

One thing that’s important to note, is that the Cost and Usage CSVs also contain the column headers in their first row, meaning that the column headers would be included in the dataset and any queries. For testing and quick set-up, you can remove this line manually from your first few CSV files. Long-term, you’ll want to use a script to programmatically remove this row each time a new file is dropped in S3 (every few hours typically). We’ve drafted up a sample script for ease of reference, which we run on Lambda. We utilize Lambda’s native ability to invoke the script whenever a new object is dropped in S3.

For cost and usage, we recommend using the DDL statement below. Since our data is in CSV format, we don’t need to use a SerDe, we can simply specify the “separatorChar, quoteChar, and escapeChar”, and the structure of the files (“TEXTFILE”). Note that AWS does have an OpenCSV SerDe as well, if you prefer to use that.


identity_LineItemId String,
identity_TimeInterval String,
bill_InvoiceId String,
bill_BillingEntity String,
bill_BillType String,
bill_PayerAccountId String,
bill_BillingPeriodStartDate String,
bill_BillingPeriodEndDate String,
lineItem_UsageAccountId String,
lineItem_LineItemType String,
lineItem_UsageStartDate String,
lineItem_UsageEndDate String,
lineItem_ProductCode String,
lineItem_UsageType String,
lineItem_Operation String,
lineItem_AvailabilityZone String,
lineItem_ResourceId String,
lineItem_UsageAmount String,
lineItem_NormalizationFactor String,
lineItem_NormalizedUsageAmount String,
lineItem_CurrencyCode String,
lineItem_UnblendedRate String,
lineItem_UnblendedCost String,
lineItem_BlendedRate String,
lineItem_BlendedCost String,
lineItem_LineItemDescription String,
lineItem_TaxType String,
product_ProductName String,
product_accountAssistance String,
product_architecturalReview String,
product_architectureSupport String,
product_availability String,
product_bestPractices String,
product_cacheEngine String,
product_caseSeverityresponseTimes String,
product_clockSpeed String,
product_currentGeneration String,
product_customerServiceAndCommunities String,
product_databaseEdition String,
product_databaseEngine String,
product_dedicatedEbsThroughput String,
product_deploymentOption String,
product_description String,
product_durability String,
product_ebsOptimized String,
product_ecu String,
product_endpointType String,
product_engineCode String,
product_enhancedNetworkingSupported String,
product_executionFrequency String,
product_executionLocation String,
product_feeCode String,
product_feeDescription String,
product_freeQueryTypes String,
product_freeTrial String,
product_frequencyMode String,
product_fromLocation String,
product_fromLocationType String,
product_group String,
product_groupDescription String,
product_includedServices String,
product_instanceFamily String,
product_instanceType String,
product_io String,
product_launchSupport String,
product_licenseModel String,
product_location String,
product_locationType String,
product_maxIopsBurstPerformance String,
product_maxIopsvolume String,
product_maxThroughputvolume String,
product_maxVolumeSize String,
product_maximumStorageVolume String,
product_memory String,
product_messageDeliveryFrequency String,
product_messageDeliveryOrder String,
product_minVolumeSize String,
product_minimumStorageVolume String,
product_networkPerformance String,
product_operatingSystem String,
product_operation String,
product_operationsSupport String,
product_physicalProcessor String,
product_preInstalledSw String,
product_proactiveGuidance String,
product_processorArchitecture String,
product_processorFeatures String,
product_productFamily String,
product_programmaticCaseManagement String,
product_provisioned String,
product_queueType String,
product_requestDescription String,
product_requestType String,
product_routingTarget String,
product_routingType String,
product_servicecode String,
product_sku String,
product_softwareType String,
product_storage String,
product_storageClass String,
product_storageMedia String,
product_technicalSupport String,
product_tenancy String,
product_thirdpartySoftwareSupport String,
product_toLocation String,
product_toLocationType String,
product_training String,
product_transferType String,
product_usageFamily String,
product_usagetype String,
product_vcpu String,
product_version String,
product_volumeType String,
product_whoCanOpenCases String,
pricing_LeaseContractLength String,
pricing_OfferingClass String,
pricing_PurchaseOption String,
pricing_publicOnDemandCost String,
pricing_publicOnDemandRate String,
pricing_term String,
pricing_unit String,
reservation_AvailabilityZone String,
reservation_NormalizedUnitsPerReservation String,
reservation_NumberOfReservations String,
reservation_ReservationARN String,
reservation_TotalReservedNormalizedUnits String,
reservation_TotalReservedUnits String,
reservation_UnitsPerReservation String,
resourceTags_userName String,
resourceTags_usercostcategory String  

      ESCAPED BY '\\'

    LOCATION 's3://<<your bucket name>>';

Once you’ve successfully executed the command, you should see a new table named “cost_and_usage” with the below properties. Now we’re ready to start executing queries and running analysis!

Start with Looker and connect to Athena

Setting up Looker is a quick process, and you can try it out for free here (or download from Amazon Marketplace). It takes just a few seconds to connect Looker to your Athena database, and Looker comes with a host of pre-built data models and dashboards to make analysis of your cost and usage data simple and intuitive. After you’re connected, you can use the Looker UI to run whatever analysis you’d like. Looker translates this UI to optimized SQL, so any user can execute and visualize queries for true self-service analytics.

Major cost saving levers

Now that the data pipeline is configured, you can dive into the most popular use cases for cost savings. In this post, I focus on:

  • Purchasing Reserved Instances vs. On-Demand Instances
  • Data transfer costs
  • Allocating costs over users or other Attributes (denoted with resource tags)

On-Demand, Spot, and Reserved Instances

Purchasing Reserved Instances vs On-Demand Instances is arguably going to be the biggest cost lever for heavy AWS users (Reserved Instances run up to 75% cheaper!). AWS offers three options for purchasing instances:

  • On-Demand—Pay as you use.
  • Spot (variable cost)—Bid on spare Amazon EC2 computing capacity.
  • Reserved Instances—Pay for an instance for a specific, allotted period of time.

When purchasing a Reserved Instance, you can also choose to pay all-upfront, partial-upfront, or monthly. The more you pay upfront, the greater the discount.

If your company has been using AWS for some time now, you should have a good sense of your overall instance usage on a per-month or per-day basis. Rather than paying for these instances On-Demand, you should try to forecast the number of instances you’ll need, and reserve them with upfront payments.

The total amount of usage with Reserved Instances versus overall usage with all instances is called your coverage ratio. It’s important not to confuse your coverage ratio with your Reserved Instance utilization. Utilization represents the amount of reserved hours that were actually used. Don’t worry about exceeding capacity, you can still set up Auto Scaling preferences so that more instances get added whenever your coverage or utilization crosses a certain threshold (we often see a target of 80% for both coverage and utilization among savvy customers).

Calculating the reserved costs and coverage can be a bit tricky with the level of granularity provided by the cost and usage report. The following query shows your total cost over the last 6 months, broken out by Reserved Instance vs other instance usage. You can substitute the cost field for usage if you’d prefer. Please note that you should only have data for the time period after the cost and usage report has been enabled (though you can opt for up to 3 months of historical data by contacting your AWS Account Executive). If you’re just getting started, this query will only show a few days.


	DATE_FORMAT(from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate),'%Y-%m') AS "cost_and_usage.usage_start_month",
	COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0) AS "cost_and_usage.total_unblended_cost",
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'RI Line Item') THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_reserved_unblended_cost",
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'RI Line Item') THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0)) / NULLIF((COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0)),0)  AS "cost_and_usage.percent_spend_on_ris",
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'Non RI Line Item') THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_non_reserved_unblended_cost",
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'Non RI Line Item') THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0)) / NULLIF((COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0)),0)  AS "cost_and_usage.percent_spend_on_non_ris"
FROM aws_optimizer.cost_and_usage  AS cost_and_usage

	(((from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) >= ((DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))) AND (from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) < ((DATE_ADD('month', 6, DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))))))

The resulting table should look something like the image below (I’m surfacing tables through Looker, though the same table would result from querying via command line or any other interface).

With a BI tool, you can create dashboards for easy reference and monitoring. New data is dumped into S3 every few hours, so your dashboards can update several times per day.

It’s an iterative process to understand the appropriate number of Reserved Instances needed to meet your business needs. After you’ve properly integrated Reserved Instances into your purchasing patterns, the savings can be significant. If your coverage is consistently below 70%, you should seriously consider adjusting your purchase types and opting for more Reserved instances.

Data transfer costs

One of the great things about AWS data storage is that it’s incredibly cheap. Most charges often come from moving and processing that data. There are several different prices for transferring data, broken out largely by transfers between regions and availability zones. Transfers between regions are the most costly, followed by transfers between Availability Zones. Transfers within the same region and same availability zone are free unless using elastic or public IP addresses, in which case there is a cost. You can find more detailed information in the AWS Pricing Docs. With this in mind, there are several simple strategies for helping reduce costs.

First, since costs increase when transferring data between regions, it’s wise to ensure that as many services as possible reside within the same region. The more you can localize services to one specific region, the lower your costs will be.

Second, you should maximize the data you’re routing directly within AWS services and IP addresses. Transfers out to the open internet are the most costly and least performant mechanisms of data transfers, so it’s best to keep transfers within AWS services.

Lastly, data transfers between private IP addresses are cheaper than between elastic or public IP addresses, so utilizing private IP addresses as much as possible is the most cost-effective strategy.

The following query provides a table depicting the total costs for each AWS product, broken out transfer cost type. Substitute the “lineitem_productcode” field in the query to segment the costs by any other attribute. If you notice any unusually high spikes in cost, you’ll need to dig deeper to understand what’s driving that spike: location, volume, and so on. Drill down into specific costs by including “product_usagetype” and “product_transfertype” in your query to identify the types of transfer costs that are driving up your bill.

	cost_and_usage.lineitem_productcode  AS "cost_and_usage.product_code",
	COALESCE(SUM(cost_and_usage.lineitem_unblendedcost), 0) AS "cost_and_usage.total_unblended_cost",
	COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer')    THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_data_transfer_cost",
	COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer-In')    THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_inbound_data_transfer_cost",
	COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer-Out')    THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0) AS "cost_and_usage.total_outbound_data_transfer_cost"
FROM aws_optimizer.cost_and_usage  AS cost_and_usage

	(((from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) >= ((DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))) AND (from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) < ((DATE_ADD('month', 6, DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))))))

When moving between regions or over the open web, many data transfer costs also include the origin and destination location of the data movement. Using a BI tool with mapping capabilities, you can get a nice visual of data flows. The point at the center of the map is used to represent external data flows over the open internet.

Analysis by tags

AWS provides the option to apply custom tags to individual resources, so you can allocate costs over whatever customized segment makes the most sense for your business. For a SaaS company that hosts software for customers on AWS, maybe you’d want to tag the size of each customer. The following query uses custom tags to display the reserved, data transfer, and total cost for each AWS service, broken out by tag categories, over the last 6 months. You’ll want to substitute the cost_and_usage.resourcetags_customersegment and cost_and_usage.customer_segment with the name of your customer field.


SELECT *, DENSE_RANK() OVER (ORDER BY z___min_rank) as z___pivot_row_rank, RANK() OVER (PARTITION BY z__pivot_col_rank ORDER BY z___min_rank) as z__pivot_col_ordering FROM (
SELECT *, MIN(z___rank) OVER (PARTITION BY "cost_and_usage.product_code") as z___min_rank FROM (
SELECT *, RANK() OVER (ORDER BY CASE WHEN z__pivot_col_rank=1 THEN (CASE WHEN "cost_and_usage.total_unblended_cost" IS NOT NULL THEN 0 ELSE 1 END) ELSE 2 END, CASE WHEN z__pivot_col_rank=1 THEN "cost_and_usage.total_unblended_cost" ELSE NULL END DESC, "cost_and_usage.total_unblended_cost" DESC, z__pivot_col_rank, "cost_and_usage.product_code") AS z___rank FROM (
SELECT *, DENSE_RANK() OVER (ORDER BY CASE WHEN "cost_and_usage.customer_segment" IS NULL THEN 1 ELSE 0 END, "cost_and_usage.customer_segment") AS z__pivot_col_rank FROM (
	cost_and_usage.lineitem_productcode  AS "cost_and_usage.product_code",
	cost_and_usage.resourcetags_customersegment  AS "cost_and_usage.customer_segment",
	COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0) AS "cost_and_usage.total_unblended_cost",
	1.0 * (COALESCE(SUM(CASE WHEN REGEXP_LIKE(cost_and_usage.product_usagetype, 'DataTransfer')    THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0)) / NULLIF((COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0)),0)  AS "cost_and_usage.percent_spend_data_transfers_unblended",
         WHEN cost_and_usage.lineitem_lineitemtype = 'DiscountedUsage' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'RIFee' THEN 'RI Line Item'
         WHEN cost_and_usage.lineitem_lineitemtype = 'Fee' THEN 'RI Line Item'
         ELSE 'Non RI Line Item'
        END = 'Non RI Line Item') THEN cost_and_usage.lineitem_unblendedcost  ELSE NULL END), 0)) / NULLIF((COALESCE(SUM(cost_and_usage.lineitem_unblendedcost ), 0)),0)  AS "cost_and_usage.unblended_percent_spend_on_ris"
FROM aws_optimizer.cost_and_usage_raw  AS cost_and_usage

	(((from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) >= ((DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))) AND (from_iso8601_timestamp(cost_and_usage.lineitem_usagestartdate)) < ((DATE_ADD('month', 6, DATE_ADD('month', -5, DATE_TRUNC('MONTH', CAST(NOW() AS DATE))))))))
GROUP BY 1,2) ww
) bb WHERE z__pivot_col_rank <= 16384
) aa
) xx
) zz
 WHERE z___pivot_row_rank <= 500 OR z__pivot_col_ordering = 1 ORDER BY z___pivot_row_rank

The resulting table in this example looks like the results below. In this example, you can tell that we’re making poor use of Reserved Instances because they represent such a small portion of our overall costs.

Again, using a BI tool to visualize these costs and trends over time makes the analysis much easier to consume and take action on.


Saving costs on your AWS spend is always an iterative, ongoing process. Hopefully with these queries alone, you can start to understand your spending patterns and identify opportunities for savings. However, this is just a peek into the many opportunities available through analysis of the Cost and Usage report. Each company is different, with unique needs and usage patterns. To achieve maximum cost savings, we encourage you to set up an analytics environment that enables your team to explore all potential cuts and slices of your usage data, whenever it’s necessary. Exploring different trends and spikes across regions, services, user types, etc. helps you gain comprehensive understanding of your major cost levers and consistently implement new cost reduction strategies.

Note that all of the queries and analysis provided in this post were generated using the Looker data platform. If you’re already a Looker customer, you can get all of this analysis, additional pre-configured dashboards, and much more using Looker Blocks for AWS.

About the Author

Dillon Morrison leads the Platform Ecosystem at Looker. He enjoys exploring new technologies and architecting the most efficient data solutions for the business needs of his company and their customers. In his spare time, you’ll find Dillon rock climbing in the Bay Area or nose deep in the docs of the latest AWS product release at his favorite cafe (“Arlequin in SF is unbeatable!”).




Security updates for Monday

Post Syndicated from ris original https://lwn.net/Articles/730910/rss

Security updates have been issued by Debian (botan1.10, cvs, firefox-esr, iortcw, libgd2, libgxps, supervisor, and zabbix), Fedora (curl, firefox, git, jackson-databind, libgxps, libsoup, openjpeg2, potrace, python-dbusmock, spatialite-tools, and sqlite), Mageia (cacti, ffmpeg, git, heimdal, jackson-databind, kernel-linus, kernel-tmb, krb5, php-phpmailer, ruby-rubyzip, and supervisor), openSUSE (firefox, librsvg, libsoup, ncurses, and tcmu-runner), Oracle (firefox), Red Hat (java-1.8.0-ibm), Slackware (git, libsoup, mercurial, and subversion), and SUSE (kernel).

Backblaze Cloud Backup 5.0: The Rapid Access Release

Post Syndicated from Yev original https://www.backblaze.com/blog/cloud-backup-5-0-rapid-access/

Announcing Backblaze Cloud Backup 5.0: the Rapid Access Release. We’ve been at the backup game for a long time now, and we continue to focus on providing the best unlimited backup service on the planet. A lot of the features in this release have come from listening to our customers about how they want to use their data. “Rapid Access” quickly became the theme because, well, we’re all acquiring more and more data and want to access it in a myriad of ways.

This release brings a lot of new functionality to Backblaze Computer Backup: faster backups, accelerated file browsing, image preview, individual file download (without creating a “restore”), and file sharing. To top it all off, we’ve refreshed the user interface on our client app. We hope you like it!

Speeding Things Up

New code + new hardware + elbow grease = things are going to move much faster.

Faster Backups

We’ve doubled the number of threads available for backup on both Mac and PC . This gives our service the ability to intelligently detect the right settings for you (based on your computer, capacity, and bandwidth). As always, you can manually set the number of threads — keep in mind that if you have a slow internet connection, adding threads might have the opposite effect and slow you down. On its default settings, our client app will now automatically evaluate what’s best given your environment. We’ve internally tested our service backing up at over 100 Mbps, which means if you have a fast-enough internet connection, you could back up 50 GB in just one hour.

Faster Browsing

We’ve introduced a number of enhancements that increase file browsing speed by 3x. Hidden files are no longer displayed by default, but you can still show them with one click on the restore page. This gives the restore interface a cleaner look, and helps you navigate backup history if you need to roll back time.

Faster Restore Preparation

We take pride in providing a variety of ways for consumers to get their data back. When something has happened to your computer, getting your files back quickly is critical. Both web download restores and Restore by Mail will now be much faster. In some cases up to 10x faster!

Preview — Access — Share

Our system has received a number of enhancements — all intended to give you more access to your data.

Image Preview

If you have a lot of photos, this one’s for you. When you go to the restore page you’ll now be able to click on each individual file that we have backed up, and if it’s an image you’ll see a preview of that file. We hope this helps people figure out which pictures they want to download (this especially helps people with a lot of photos named something along the lines of: 2017-04-20-9783-41241.jpg). Now you can just click on the picture to preview it.


Once you’ve clicked on a file (30MB and smaller), you’ll be able to individually download that file directly in your browser. You’ll no longer need to wait for a single-file restore to be built and zipped up; you’ll be able to download it quickly and easily. This was a highly requested feature and we’re stoked to get it implemented.


We’re leveraging Backblaze B2 Cloud Storage and giving folks the ability to publicly share their files. In order to use this feature, you’ll need to enable Backblaze B2 on your account (if you haven’t already, there’s a simple wizard that will pop up the first time you try to share a file). Files can be shared anywhere in the world via URL. All B2 accounts have 10GB/month of storage and 1GB/day of downloads (equivalent to sharing an iPhone photo 1,000 times per month) for free. You can increase those limits in your B2 Settings. Keep in mind that any file you share will be accessible to anybody with the link. Learn more about File Sharing.

For now, we’ve limited the Preview/Access/Share functionality to files 30MB and smaller, but larger files will be supported in the coming weeks!

Other Goodies

In addition to adding 2FV via ToTP, we’ve also been hard at work on the client. In version 5.0 we’ve touched up the user interface to make it a bit more lively, and we’ve also made the client IPv6 compatible.

Backblaze 5.0 Available: August 10, 2017

We will slowly be auto-updating all users in the coming weeks. To update now:

This version is now the default download on www.backblaze.com.

We hope you enjoy Backblaze Cloud Backup v5.0!

The post Backblaze Cloud Backup 5.0: The Rapid Access Release appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Security updates for Thursday

Post Syndicated from corbet original https://lwn.net/Articles/730474/rss

Security updates have been issued by Debian (firefox-esr), Fedora (cacti, community-mysql, and pspp), Mageia (varnish), openSUSE (mariadb, nasm, pspp, and rubygem-rubyzip), Oracle (evince, freeradius, golang, java-1.7.0-openjdk, log4j, NetworkManager and libnl3, pki-core, qemu-kvm, and X.org), Red Hat (flash-plugin), and Slackware (curl and mozilla).

Updates to GPIO Zero, the physical computing API

Post Syndicated from Ben Nuttall original https://www.raspberrypi.org/blog/gpio-zero-update/

GPIO Zero v1.4 is out now! It comes with a set of new features, including a handy pinout command line tool. To start using this newest version of the API, update your Raspbian OS now:

sudo apt update && sudo apt upgrade

Some of the things we’ve added will make it easier for you try your hand on different programming styles. In doing so you’ll build your coding skills, and will improve as a programmer. As a consequence, you’ll learn to write more complex code, which will enable you to take on advanced electronics builds. And on top of that, you can use the skills you’ll acquire in other computing projects.

GPIO Zero pinout tool

The new pinout tool

Developing GPIO Zero

Nearly two years ago, I started the GPIO Zero project as a simple wrapper around the low-level RPi.GPIO library. I wanted to create a simpler way to control GPIO-connected devices in Python, based on three years’ experience of training teachers, running workshops, and building projects. The idea grew over time, and the more we built for our Python library, the more sophisticated and powerful it became.

One of the great things about Python is that it’s a multi-paradigm programming language. You can write code in a number of different styles, according to your needs. You don’t have to write classes, but you can if you need them. There are functional programming tools available, but beginners get by without them. Importantly, the more advanced features of the language are not a barrier to entry.

Become a more advanced programmer

As a beginner to programming, you usually start by writing procedural programs, in which the flow moves from top to bottom. Then you’ll probably add loops and create your own functions. Your next step might be to start using libraries which introduce new patterns that operate in a different manner to what you’ve written before, for example threaded callbacks (event-driven programming). You might move on to object-oriented programming, extending the functionality of classes provided by other libraries, and starting to write your own classes. Occasionally, you may make use of tools created with functional programming techniques.

Five buttons in different colours

Take control of the buttons in your life

It’s much the same with GPIO Zero: you can start using it very easily, and we’ve made it simple to progress along the learning curve towards more advanced programming techniques. For example, if you want to make a push button control an LED, the easiest way to do this is via procedural programming using a while loop:

from gpiozero import LED, Button

led = LED(17)
button = Button(2)

while True:
    if button.is_pressed:

But another way to achieve the same thing is to use events:

from gpiozero import LED, Button
from signal import pause

led = LED(17)
button = Button(2)

button.when_pressed = led.on
button.when_released = led.off


You could even use a declarative approach, and set the LED’s behaviour in a single line:

from gpiozero import LED, Button
from signal import pause

led = LED(17)
button = Button(2)

led.source = button.values


You will find that using the procedural approach is a great start, but at some point you’ll hit a limit, and will have to try a different approach. The example above can be approach in several programming styles. However, if you’d like to control a wider range of devices or a more complex system, you need to carefully consider which style works best for what you want to achieve. Being able to choose the right programming style for a task is a skill in itself.

Source/values properties

So how does the led.source = button.values thing actually work?

Every GPIO Zero device has a .value property. For example, you can read a button’s state (True or False), and read or set an LED’s state (so led.value = True is the same as led.on()). Since LEDs and buttons operate with the same value set (True and False), you could say led.value = button.value. However, this only sets the LED to match the button once. If you wanted it to always match the button’s state, you’d have to use a while loop. To make things easier, we came up with a way of telling devices they’re connected: we added a .values property to all devices, and a .source to output devices. Now, a loop is no longer necessary, because this will do the job:

led.source = button.values

This is a simple approach to connecting devices using a declarative style of programming. In one single line, we declare that the LED should get its values from the button, i.e. when the button is pressed, the LED should be on. You can even mix the procedural with the declarative style: at one stage of the program, the LED could be set to match the button, while in the next stage it could just be blinking, and finally it might return back to its original state.

These additions are useful for connecting other devices as well. For example, a PWMLED (LED with variable brightness) has a value between 0 and 1, and so does a potentiometer connected via an ADC (analogue-digital converter) such as the MCP3008. The new GPIO Zero update allows you to say led.source = pot.values, and then twist the potentiometer to control the brightness of the LED.

But what if you want to do something more complex, like connect two devices with different value sets or combine multiple inputs?

We provide a set of device source tools, which allow you to process values as they flow from one device to another. They also let you send in artificial values such as random data, and you can even write your own functions to generate values to pass to a device’s source. For example, to control a motor’s speed with a potentiometer, you could use this code:

from gpiozero import Motor, MCP3008
from signal import pause

motor = Motor(20, 21)
pot = MCP3008()

motor.source = pot.values


This works, but it will only drive the motor forwards. If you wanted the potentiometer to drive it forwards and backwards, you’d use the scaled tool to scale its values to a range of -1 to 1:

from gpiozero import Motor, MCP3008
from gpiozero.tools import scaled
from signal import pause

motor = Motor(20, 21)
pot = MCP3008()

motor.source = scaled(pot.values, -1, 1)


And to separately control a robot’s left and right motor speeds with two potentiometers, you could do this:

from gpiozero import Robot, MCP3008
from signal import pause

robot = Robot(left=(2, 3), right=(4, 5))
left = MCP3008(0)
right = MCP3008(1)

robot.source = zip(left.values, right.values)


GPIO Zero and Blue Dot

Martin O’Hanlon created a Python library called Blue Dot which allows you to use your Android device to remotely control things on their Raspberry Pi. The API is very similar to GPIO Zero, and it even incorporates the value/values properties, which means you can hook it up to GPIO devices easily:

from bluedot import BlueDot
from gpiozero import LED
from signal import pause

bd = BlueDot()
led = LED(17)

led.source = bd.values


We even included a couple of Blue Dot examples in our recipes.

Make a series of binary logic gates using source/values

Read more in this source/values tutorial from The MagPi, and on the source/values documentation page.

Remote GPIO control

GPIO Zero supports multiple low-level GPIO libraries. We use RPi.GPIO by default, but you can choose to use RPIO or pigpio instead. The pigpio library supports remote connections, so you can run GPIO Zero on one Raspberry Pi to control the GPIO pins of another, or run code on a PC (running Windows, Mac, or Linux) to remotely control the pins of a Pi on the same network. You can even control two or more Pis at once!

If you’re using Raspbian on a Raspberry Pi (or a PC running our x86 Raspbian OS), you have everything you need to remotely control GPIO. If you’re on a PC running Windows, Mac, or Linux, you just need to install gpiozero and pigpio using pip. See our guide on configuring remote GPIO.

I road-tested the new pin_factory syntax at the Raspberry Jam @ Pi Towers

There are a number of different ways to use remote pins:

  • Set the default pin factory and remote IP address with environment variables:
$ GPIOZERO_PIN_FACTORY=pigpio PIGPIO_ADDR= python3 blink.py
  • Set the default pin factory in your script:
import gpiozero
from gpiozero import LED
from gpiozero.pins.pigpio import PiGPIOFactory

gpiozero.Device.pin_factory = PiGPIOFactory(host='')

led = LED(17)
  • The pin_factory keyword argument allows you to use multiple Pis in the same script:
from gpiozero import LED
from gpiozero.pins.pigpio import PiGPIOFactory

factory2 = PiGPIOFactory(host='')
factory3 = PiGPIOFactory(host='')

local_hat = TrafficHat()
remote_hat2 = TrafficHat(pin_factory=factory2)
remote_hat3 = TrafficHat(pin_factory=factory3)

This is a really powerful feature! For more, read this remote GPIO tutorial in The MagPi, and check out the remote GPIO recipes in our documentation.

GPIO Zero on your PC

GPIO Zero doesn’t have any dependencies, so you can install it on your PC using pip. In addition to the API’s remote GPIO control, you can use its ‘mock’ pin factory on your PC. We originally created the mock pin feature for the GPIO Zero test suite, but we found that it’s really useful to be able to test GPIO Zero code works without running it on real hardware:

>>> from gpiozero import LED
>>> led = LED(22)
>>> led.blink()
>>> led.value
>>> led.value

You can even tell pins to change state (e.g. to simulate a button being pressed) by accessing an object’s pin property:

>>> from gpiozero import LED
>>> led = LED(22)
>>> button = Button(23)
>>> led.source = button.values
>>> led.value
>>> button.pin.drive_low()
>>> led.value

You can also use the pinout command line tool if you set your pin factory to ‘mock’. It gives you a Pi 3 diagram by default, but you can supply a revision code to see information about other Pi models. For example, to use the pinout tool for the original 256MB Model B, just type pinout -r 2.

GPIO Zero documentation and resources

On the API’s website, we provide beginner recipes and advanced recipes, and we have added remote GPIO configuration including PC/Mac/Linux and Pi Zero OTG, and a section of GPIO recipes. There are also new sections on source/values, command-line tools, FAQs, Pi information and library development.

You’ll find plenty of cool projects using GPIO Zero in our learning resources. For example, you could check out the one that introduces physical computing with Python and get stuck in! We even provide a GPIO Zero cheat sheet you can download and print.

There are great GPIO Zero tutorials and projects in The MagPi magazine every month. Moreover, they also publish Simple Electronics with GPIO Zero, a book which collects a series of tutorials useful for building your knowledge of physical computing. And the best thing is, you can download it, and all magazine issues, for free!

Check out the API documentation and read more about what’s new in GPIO Zero on my blog. We have lots planned for the next release. Watch this space.

Get building!

The world of physical computing is at your fingertips! Are you feeling inspired?

If you’ve never tried your hand on physical computing, our Build a robot buggy learning resource is the perfect place to start! It’s your step-by-step guide for building a simple robot controlled with the help of GPIO Zero.

If you have a gee-whizz idea for an electronics project, do share it with us below. And if you’re currently working on a cool build and would like to show us how it’s going, pop a link to it in the comments.

The post Updates to GPIO Zero, the physical computing API appeared first on Raspberry Pi.