
What We’re Thankful For

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/what-were-thankful-for/

All of us at Backblaze hope you have a wonderful Thanksgiving, and that you can enjoy it with family and friends. We asked everyone at Backblaze to express what they are thankful for. Here are their responses.

Fall leaves

What We’re Thankful For

Aside from friends, family, hobbies, health, etc., I’m thankful for my home. It’s not much, but it’s mine, and it allows me to indulge in everything listed above. Or not, if I so choose. And coffee.

— Tony

I’m thankful for my wife Jen, and my other friends. I’m thankful that I like my coworkers and can call them friends too. I’m thankful for my health. I’m thankful that I was born into a middle class family in the US and that I have been very, very lucky because of that.

— Adam

Besides the most important things which are being thankful for my family, my health and my friends, I am very thankful for Backblaze. This is the first job I’ve ever had where I truly feel like I have a great work/life balance. With having 3 kids ages 8, 6 and 4, a husband that works crazy hours and my tennis career on the rise (kidding but I am on 4 teams) it’s really nice to feel like I have balance in my life. So cheers to Backblaze – where a girl can have it all!

— Shelby

I am thankful to work at a high-tech company that recognizes the contributions of engineers in their 40s and 50s.

— Jeannine

I am thankful for the music, the songs I’m singing. Thankful for all the joy they’re bringing. Who can live without it, I ask in all honesty? What would life be? Without a song or a dance what are we? So I say thank you for the music. For giving it to me!

— Yev

I’m thankful that I don’t look anything like the portrait my son draws of me…seriously.

— Natalie

I am thankful to work for a company that puts its people and product ahead of profits.

— James

I am thankful that even in the middle of disasters, turmoil, and violence, there are always people who commit amazing acts of generosity, courage, and kindness that restore my faith in mankind.

— Roderick

The future.

— Ahin

The Future

I am thankful for the current state of modern inexpensive broadband networking that allows me to stay in touch with friends and family that are far away, allows Backblaze to exist and pay my salary so I can live comfortably, and allows me to watch cat videos for free. The internet makes this an amazing time to be alive.

— Brian

Other than being thankful for family & good health, I’m quite thankful that through the years I’ve avoided losing any of my 12+ TB photo archive. 20 years of photoshoots, family photos, and cell phone photos kept safe through changing storage media (floppy drives, flopticals, ZIP, JAZ, DVD-RAM, CD, DVD, and hard drives), not to mention various technology/software solutions. It’s a data minefield out there, especially in the long run with changing media formats.

— Jim

I am thankful for non-profit organizations and their volunteers, such as IMAlive. Possibly the greatest gift you can give someone is empowerment, and an opportunity for them to recognize their own resilience and strength.

— Emily

I am thankful for my loving family, friends who make me laugh, a cool company to work for, talented co-workers who make me a better engineer, and beautiful Fall days in Wisconsin!

— Marjorie

Marjorie Wisconsin

I’m thankful for preschool drawings about thankfulness.

— Adam

I am thankful for new friends and working for a company that allows us to be ourselves.

— Annalisa

I’m thankful for my dog as I always find a reason to smile at him every day. Yes, he still smells from his skunkin’ last week and he tracks mud in my house, but he came from the San Quentin puppy-prisoner program and I’m thankful I found him and that he found me! My vet is thankful as well.

— Terry

I’m thankful that my colleagues are also my friends outside of the office and that the rain season has started in California.

— Aaron

I’m thankful for family, friends, and beer. Mostly for family and friends, but beer is really nice too!

— Ken

There are so many amazing blessings that make up my daily life that I thank God for, so here I go – my basic needs of food, water and shelter, my husband and 2 daughters and the rest of the family (here and abroad) — their love, support, health, and safety, waking up to a new day every day, friends, music, my job, funny things, hugs and more hugs (who does not like hugs?).

— Cecilia

I am thankful to be blessed with a close-knit extended family, and for everything they do for my new, growing family. With a toddler and a second child on the way, it helps having so many extra sets of hands around to help with the kids!

— Zack

I’m thankful for family and friends, the opportunities my parents gave me by moving to the U.S., and that all of us together at Backblaze have built a place to be proud of.

— Gleb

Aside from being thankful for family and friends, I am also thankful I live in a place with such natural beauty. Being so close to mountains and the ocean, and everything in between, is something that I don’t take for granted!

— Sona

I’m thankful for my wonderful wife, family, friends, and co-workers. I’m thankful for having a happy and healthy son, and the chance to watch him grow on a daily basis.

— Ariel

I am thankful for a dog-friendly workplace.

— LeAnn

I’m thankful for my amazing new wife and that she’s as much of a nerd as I am.

— Troy

I am thankful for every reunion with my siblings and families.

— Cecilia

I am thankful for my funny, strong-willed, happy daughter, my awesome husband, my family, and amazing friends. I am also thankful for the USA and all the opportunities that come with living here. Finally, I am thankful for Backblaze, a truly great place to work and for all of my co-workers/friends here.

— Natasha

I am thankful that I do not need to hunt and gather every day to put food on the table, but at the same time I feel that I don’t appreciate the food that sits before me as much as I should. So I use Thanksgiving to think about the people and the animals that put food on my family’s table.

— KC

I am thankful for my cat, Catnip. She’s been with me for 18 years and seen me through so many ups and downs. She’s been by my side through two long-term relationships, several moves, and one marriage. I know we don’t have much time together and feel blessed every day she’s here.

— JC

I am thankful for imperfection and misshapen candies. The imperceptible romance of sunsets through bus windows. The dream that family, friends, co-workers, and strangers are connected by love. I am thankful to my ancestors for enduring so much hardship so that I could be here enjoying Bay Area burritos.

— Damon

Autumn leaves

The post What We’re Thankful For appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Amazon QuickSight Update – Geospatial Visualization, Private VPC Access, and More

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-quicksight-update-geospatial-visualization-private-vpc-access-and-more/

We don’t often recognize or celebrate anniversaries at AWS. With nearly 100 services on our list, we’d be eating cake and drinking champagne several times a week. While that might sound like fun, we’d rather spend our working hours listening to customers and innovating. With that said, Amazon QuickSight has now been generally available for a little over a year and I would like to give you a quick update!

QuickSight in Action
Today, tens of thousands of customers (from startups to enterprises, in industries as varied as transportation, legal, mining, and healthcare) are using QuickSight to analyze and report on their business data.

Here are a couple of examples:

Gemini provides legal evidence procurement for California attorneys who represent injured workers. They have gone from creating custom reports and running one-off queries to creating and sharing dynamic QuickSight dashboards with drill-downs and filtering. QuickSight is used to track the sales pipeline, measure order throughput, and locate bottlenecks in the order processing pipeline.

Jivochat provides a real-time messaging platform to connect visitors to website owners. QuickSight lets them create and share interactive dashboards while also providing access to the underlying datasets. This has allowed them to move beyond the sharing of static spreadsheets, ensuring that everyone is looking at the same data and is empowered to make timely decisions based on current data.

Transfix is a tech-powered freight marketplace that matches loads and increases visibility into logistics for Fortune 500 shippers in retail, food and beverage, manufacturing, and other industries. QuickSight has made analytics accessible to both BI engineers and non-technical business users. They scrutinize key business and operational metrics including shipping routes, carrier efficiency, and process automation.

Looking Back / Looking Ahead
The feedback on QuickSight has been incredibly helpful. Customers tell us that their employees are using QuickSight to connect to their data, perform analytics, and make high-velocity, data-driven decisions, all without setting up or running their own BI infrastructure. We love all of the feedback that we get, and use it to drive our roadmap, leading to the introduction of over 40 new features in just a year. Here’s a summary:

Looking forward, we are watching an interesting trend develop within our customer base. As these customers take a close look at how they analyze and report on data, they are realizing that a serverless approach offers some tangible benefits. They use Amazon Simple Storage Service (S3) as a data lake and query it using a combination of QuickSight and Amazon Athena, giving them agility and flexibility without static infrastructure. They also make great use of QuickSight’s dashboards feature, monitoring business results and operational metrics, then sharing their insights with hundreds of users. You can read Building a Serverless Analytics Solution for Cleaner Cities and review Serverless Big Data Analytics using Amazon Athena and Amazon QuickSight if you are interested in this approach.

New Features and Enhancements
We’re still doing our best to listen and to learn, and to make sure that QuickSight continues to meet your needs. I’m happy to announce that we are making seven big additions today:

Geospatial Visualization – You can now create geospatial visuals on geographical data sets.

Private VPC Access – You can now sign up to access a preview of a new feature that allows you to securely connect to data within VPCs or on-premises, without the need for public endpoints.

Flat Table Support – In addition to pivot tables, you can now use flat tables for tabular reporting. To learn more, read about Using Tabular Reports.

Calculated SPICE Fields – You can now perform run-time calculations on SPICE data as part of your analysis. Read Adding a Calculated Field to an Analysis for more information.

Wide Table Support – You can now use tables with up to 1000 columns.

Other Buckets – You can summarize the long tail of high-cardinality data into buckets, as described in Working with Visual Types in Amazon QuickSight.

HIPAA Compliance – You can now run HIPAA-compliant workloads on QuickSight.

Geospatial Visualization
Everyone seems to want this feature! You can now take data that contains a geographic identifier (country, city, state, or zip code) and create beautiful visualizations with just a few clicks. QuickSight will geocode the identifier that you supply, and can also accept lat/long map coordinates. You can use this feature to visualize sales by state, map stores to shipping destinations, and so forth. Here’s a sample visualization:

To learn more about this feature, read Using Geospatial Charts (Maps), and Adding Geospatial Data.

Private VPC Access Preview
If you have data in AWS (perhaps in Amazon Redshift, Amazon Relational Database Service (RDS), or on EC2) or on-premises in Teradata or SQL Server on servers without public connectivity, this feature is for you. Private VPC Access for QuickSight uses an Elastic Network Interface (ENI) for secure, private communication with data sources in a VPC. It also allows you to use AWS Direct Connect to create a secure, private link with your on-premises resources. Here’s what it looks like:

If you are ready to join the preview, you can sign up today.

Jeff;

 

Visualize AWS CloudTrail Logs Using AWS Glue and Amazon QuickSight

Post Syndicated from Luis Caro Perez original https://aws.amazon.com/blogs/big-data/streamline-aws-cloudtrail-log-visualization-using-aws-glue-and-amazon-quicksight/

Being able to easily visualize AWS CloudTrail logs gives you a better understanding of how your AWS infrastructure is being used. It can also help you audit and review AWS API calls and detect security anomalies inside your AWS account. To do this, you must be able to perform analytics based on your CloudTrail logs.

In this post, I walk through using AWS Glue and AWS Lambda to convert AWS CloudTrail logs from JSON to a query-optimized format dataset in Amazon S3. I then use Amazon Athena and Amazon QuickSight to query and visualize the data.

Solution overview

To process CloudTrail logs, you must implement the following architecture:

CloudTrail delivers log files to a folder structure in an Amazon S3 bucket. To crawl these logs correctly, you modify the file contents and folder structure using an Amazon S3-triggered Lambda function that stores the transformed files in a single folder in an S3 bucket. When the files are in a single folder, AWS Glue scans the data, converts it into Apache Parquet format, and catalogs it to allow for querying and visualization using Amazon Athena and Amazon QuickSight.

Walkthrough

Let’s look at the steps that are required to build the solution.

Set up CloudTrail logs

First, you need to set up a trail that delivers log files to an S3 bucket. To create a trail in CloudTrail, follow the instructions in Creating a Trail.

When you finish, the trail settings page should look like the following screenshot:

In this example, I set up log files to be delivered to the cloudtraillfcaro bucket.
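
If you prefer to script this step rather than use the console, the trail can also be created with the AWS SDK. The following boto3 sketch is not the exact setup used in this post: the trail name is a placeholder, the cloudtraillfcaro bucket is the one used in this example, and the bucket policy that grants CloudTrail write access must already exist.

import boto3

cloudtrail = boto3.client('cloudtrail')

# Create a trail that delivers log files to the example bucket.
# The bucket must already have a policy that allows CloudTrail to write to it.
cloudtrail.create_trail(
    Name='cloudtrail-glue-demo',      # placeholder trail name
    S3BucketName='cloudtraillfcaro',  # bucket used in this walkthrough
    IsMultiRegionTrail=True
)

# Trails do not record events until logging is started explicitly.
cloudtrail.start_logging(Name='cloudtrail-glue-demo')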

Consolidate CloudTrail reports into a single folder using Lambda

AWS CloudTrail delivers log files using the following folder structure inside the configured Amazon S3 bucket:

AWSLogs/ACCOUNTID/CloudTrail/REGION/YEAR/MONTH/DAY/filename.json.gz

Additionally, log files have the following structure:

{
    "Records": [{
        "eventVersion": "1.01",
        "userIdentity": {
            "type": "IAMUser",
            "principalId": "AIDAJDPLRKLG7UEXAMPLE",
            "arn": "arn:aws:iam::123456789012:user/Alice",
            "accountId": "123456789012",
            "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
            "userName": "Alice",
            "sessionContext": {
                "attributes": {
                    "mfaAuthenticated": "false",
                    "creationDate": "2014-03-18T14:29:23Z"
                }
            }
        },
        "eventTime": "2014-03-18T14:30:07Z",
        "eventSource": "cloudtrail.amazonaws.com",
        "eventName": "StartLogging",
        "awsRegion": "us-west-2",
        "sourceIPAddress": "72.21.198.64",
        "userAgent": "signin.amazonaws.com",
        "requestParameters": {
            "name": "Default"
        },
        "responseElements": null,
        "requestID": "cdc73f9d-aea9-11e3-9d5a-835b769c0d9c",
        "eventID": "3074414d-c626-42aa-984b-68ff152d6ab7"
    },
    ... additional entries ...
    ]
}

If AWS Glue crawlers are used to catalog these files as they are written, the following obstacles arise:

  1. AWS Glue identifies a different table for each folder because the folders don’t follow a traditional partition format.
  2. Based on the structure of the file content, AWS Glue identifies the tables as having a single column of type array.
  3. CloudTrail logs have JSON attributes that use uppercase letters. According to the Best Practices When Using Athena with AWS Glue, it is recommended that you convert these to lowercase.

To have AWS Glue catalog all log files in a single table with all the columns describing each event, implement the following Lambda function:

from __future__ import print_function
import json
import urllib
import boto3
import gzip

s3 = boto3.resource('s3')
client = boto3.client('s3')

def convertColumntoLowwerCaps(obj):
    for key in obj.keys():
        new_key = key.lower()
        if new_key != key:
            obj[new_key] = obj[key]
            del obj[key]
    return obj


def lambda_handler(event, context):

    bucket = event['Records'][0]['s3']['bucket']['name']
    key = urllib.unquote_plus(event['Records'][0]['s3']['object']['key'].encode('utf8'))
    print(bucket)
    print(key)
    try:
        newKey = 'flatfiles/' + key.replace("/", "")
        client.download_file(bucket, key, '/tmp/file.json.gz')
        with gzip.open('/tmp/out.json.gz', 'w') as output, gzip.open('/tmp/file.json.gz', 'rb') as file:
            i = 0
            for line in file:
                for record in json.loads(line, object_hook=convertColumntoLowwerCaps)['records']:
                    if i != 0:
                        output.write("\n")
                    output.write(json.dumps(record))
                    i += 1
        client.upload_file('/tmp/out.json.gz', bucket, newKey)
        return "success"
    except Exception as e:
        print(e)
        print('Error processing object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e

The function goes over each element of the records array, changes uppercase letters to lowercase in column names, and writes each element of the array as a single line of a new file. The new file is saved in a flatfiles folder that the function creates at the top level of the S3 bucket, without any subfolders.

The function should have a role containing a policy with at least the following permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::cloudtraillfcaro/*",
                "arn:aws:s3:::cloudtraillfcaro"
            ],
            "Effect": "Allow"
        }
    ]
}

In this example, CloudTrail delivers logs to the cloudtraillfcaro bucket. Make sure that you replace this name with your bucket name in the policy. For more information about how to work with inline policies, see Working with Inline Policies.

After the Lambda function is created, you can set up the following trigger using the Triggers tab on the AWS Lambda console.

Choose Add trigger, and choose S3 as a source of the trigger.

After choosing the source, configure the following settings:

In the trigger, any file that is written to the path for the log files—which in this case is AWSLogs/119582755581/CloudTrail/—is processed. Make sure that the Enable trigger check box is selected and that the bucket and prefix parameters match your use case.
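
If you would rather configure the same trigger programmatically, the notification can be attached to the bucket with boto3, as in the sketch below. The function ARN, account ID, and prefix are placeholders for this example, and the Lambda function must already allow s3.amazonaws.com to invoke it (for example, via the Lambda add_permission call).

import boto3

s3 = boto3.client('s3')

# Invoke the flattening function whenever CloudTrail writes a new log object.
s3.put_bucket_notification_configuration(
    Bucket='cloudtraillfcaro',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [{
            'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:123456789012:function:cloudtrail-flatten',  # placeholder
            'Events': ['s3:ObjectCreated:*'],
            'Filter': {
                'Key': {
                    'FilterRules': [
                        {'Name': 'prefix', 'Value': 'AWSLogs/123456789012/CloudTrail/'},  # placeholder account ID
                        {'Name': 'suffix', 'Value': '.json.gz'}
                    ]
                }
            }
        }]
    }
)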

After you set up the function and receive log files, the bucket (in this case cloudtraillfcaro) should contain the processed files inside the flatfiles folder.

Catalog source data

Once the files are processed by the Lambda function, set up a crawler named cloudtrail to catalog them.

The crawler must point to the flatfiles folder.

All the crawlers and AWS Glue jobs created for this solution must have a role with the AWSGlueServiceRole managed policy and an inline policy with permissions to modify the S3 buckets used on the Lambda function. For more information, see Working with Managed Policies.

The role should look like the following:

In this example, the inline policy named s3perms contains the permissions to modify the S3 buckets.

After you choose the role, you can schedule the crawler to run on demand.

A new database is created, and the crawler is set to use it. In this case, the cloudtrail database is used for all the tables.
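
The same crawler can also be defined with the AWS SDK instead of the console. Here is a minimal boto3 sketch; the role name is a placeholder and must already have the AWSGlueServiceRole managed policy plus the inline S3 permissions described above.

import boto3

glue = boto3.client('glue')

# Catalog the flattened CloudTrail files into the cloudtrail database.
glue.create_crawler(
    Name='cloudtrail',
    Role='AWSGlueServiceRole-cloudtrail',  # placeholder role name
    DatabaseName='cloudtrail',
    Targets={'S3Targets': [{'Path': 's3://cloudtraillfcaro/flatfiles/'}]}
)

# Run the crawler on demand.
glue.start_crawler(Name='cloudtrail')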

After the crawler runs, a single table should be created in the catalog with the following structure:

The table should contain the following columns:

Create and run the AWS Glue job

To convert all the CloudTrail logs to a columnar store in Parquet, set up an AWS Glue job by following these steps.

Upload the following script into a bucket in Amazon S3:

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
import boto3
import time

## @params: [JOB_NAME, TempDir]
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'TempDir'])

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

datasource0 = glueContext.create_dynamic_frame.from_catalog(database = "cloudtrail", table_name = "flatfiles", transformation_ctx = "datasource0")
resolvechoice1 = ResolveChoice.apply(frame = datasource0, choice = "make_struct", transformation_ctx = "resolvechoice1")
relationalized1 = resolvechoice1.relationalize("trail", args["TempDir"]).select("trail")
datasink = glueContext.write_dynamic_frame.from_options(frame = relationalized1, connection_type = "s3", connection_options = {"path": "s3://cloudtraillfcaro/parquettrails"}, format = "parquet", transformation_ctx = "datasink4")
job.commit()

In the example, you load the script as a file named cloudtrailtoparquet.py. Make sure that you modify the script and change the output path ("s3://cloudtraillfcaro/parquettrails") to the destination in which you want to store your results.

After uploading the script, add a new AWS Glue job. Choose a name and role for the job, and choose the option of running the job from An existing script that you provide.

To avoid processing the same data twice, enable the Job bookmark setting in the Advanced properties section of the job properties.

Choose Next twice, and then choose Finish.

If logs are already in the flatfiles folder, you can run the job on demand to generate the first set of results.
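
If you prefer to define and run the job with the AWS SDK instead of the console, a sketch along these lines should work. The script location, temporary directory, and role name are placeholders, and the bookmark argument mirrors the Job bookmark setting mentioned above.

import boto3

glue = boto3.client('glue')

# Define the ETL job that converts the flattened logs to Parquet.
glue.create_job(
    Name='cloudtrailtoparquet',
    Role='AWSGlueServiceRole-cloudtrail',  # placeholder role name
    Command={
        'Name': 'glueetl',
        'ScriptLocation': 's3://cloudtraillfcaro/scripts/cloudtrailtoparquet.py'  # placeholder path
    },
    DefaultArguments={
        '--job-bookmark-option': 'job-bookmark-enable',  # avoid processing the same data twice
        '--TempDir': 's3://cloudtraillfcaro/temp/'  # used by the relationalize transform
    }
)

# Generate the first set of results.
glue.start_job_run(JobName='cloudtrailtoparquet')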

Once the job starts running, wait for it to complete.

When the job is finished, its Run status should be Succeeded. After that, you can verify that the Parquet files are written to the Amazon S3 location.

Catalog results

To be able to query the results with Athena, you can use an AWS Glue crawler to catalog the output of the AWS Glue job.

In this example, the crawler is set to use the same database as the source named cloudtrail.

You can run the crawler using the console. When the crawler finishes running and has processed the Parquet results, a new table should be created in the AWS Glue Data Catalog. In this example, it’s named parquettrails.

The table should have the classification set to parquet.

It should have the same columns as the flatfiles table, with the exception of the struct type columns, which should be relationalized into several columns:

In this example, notice how the requestparameters column, which was a struct in the original table (flatfiles), was transformed to several columns—one for each key value inside it. This is done using a transformation native to AWS Glue called relationalize.

Query results with Athena

After crawling the results, you can query them using Athena. For example, to query what events took place in the time frame between 2017-10-23T12:00:00 and 2017-10-23T13:00:00, use the following SELECT statement:

select *
from cloudtrail.parquettrails
where eventtime > '2017-10-23T12:00:00Z' AND eventtime < '2017-10-23T13:00:00Z'
order by eventtime asc;

Be sure to replace cloudtrail.parquettrails with the names of your database and the table that references the Parquet results. Replace the datetimes with an hour during which your account had activity that has been processed by the AWS Glue job.
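
The same query can also be submitted with the AWS SDK instead of the Athena console. A minimal boto3 sketch, assuming an S3 output location of your own for the query results:

import boto3

athena = boto3.client('athena')

query = """
select *
from cloudtrail.parquettrails
where eventtime > '2017-10-23T12:00:00Z' AND eventtime < '2017-10-23T13:00:00Z'
order by eventtime asc
"""

# Start the query; Athena writes the results to the output location below.
response = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={'Database': 'cloudtrail'},
    ResultConfiguration={'OutputLocation': 's3://cloudtraillfcaro/athena-results/'}  # placeholder path
)
print(response['QueryExecutionId'])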

Visualize results using Amazon QuickSight

Once you can query the data using Athena, you can visualize it using Amazon QuickSight. Before connecting Amazon QuickSight to Athena, be sure to grant QuickSight access to Athena and the associated S3 buckets in your account. For more information, see Managing Amazon QuickSight Permissions to AWS Resources. You can then create a new data set in Amazon QuickSight based on the Athena table that you created.

After setting up permissions, you can create a new analysis in Amazon QuickSight by choosing New analysis.

Then add a new data set.

Choose Athena as the source.

Give the data source a name (in this case, I named it cloudtrail).

Choose the name of the database and the table referencing the Parquet results.

Then choose Visualize.

After that, you should see the following screen:

Now you can create some visualizations. First, search for the sourceipaddress column, and drag it to the AutoGraph section.

You can see a list of the IP addresses that you have used to interact with AWS. To review whether these IP addresses have been used by IAM users, internal AWS services, or roles, use the type value that is inside the useridentity field of the original log files. Thanks to the relationalize transformation, this value is available as the useridentity.type column. After the column is added into the Group/Color box, the visualization should look like the following:

You can now see and distinguish the most used IP addresses and whether they are used by roles, AWS services, or IAM users.

After following all these steps, you can use Amazon QuickSight to add different columns from CloudTrail and perform different types of visualizations. You can build operational dashboards that continuously monitor AWS infrastructure usage and access. You can share those dashboards with others in your organization who might need to see this data.

Summary

In this post, you saw how you can use a simple Lambda function and an AWS Glue script to convert text files into Parquet to improve Athena query performance and data compression. The post also demonstrated how to use AWS Lambda to preprocess files in Amazon S3 and transform them into a format that is recognizable by AWS Glue crawlers.

This example used AWS CloudTrail logs, but you can apply the proposed solution to any set of files that, after preprocessing, can be cataloged by AWS Glue.


Additional Reading

Learn how to Harmonize, Query, and Visualize Data from Various Providers using AWS Glue, Amazon Athena, and Amazon QuickSight.


About the Authors

Luis Caro is a Big Data Consultant for AWS Professional Services. He works with our customers to provide guidance and technical assistance on big data projects, helping them improve the value of their solutions when using AWS.

 

 

 

Security updates for Wednesday

Post Syndicated from ris original https://lwn.net/Articles/738473/rss

Security updates have been issued by Arch Linux (chromium, libzip, and openssl), Debian (chromium-browser, otrs2, slurm-llnl, and tomcat7), Fedora (kernel, libgcrypt, nodejs, php, poppler, qemu, rpm, and wget), openSUSE (chromium), Red Hat (chromium-browser and rhvm-appliance), SUSE (krb5 and qemu), and Ubuntu (openjdk-8).

piwheels: making “pip install” fast

Post Syndicated from Ben Nuttall original https://www.raspberrypi.org/blog/piwheels/

TL;DR pip install numpy used to take ages, and now it’s super fast thanks to piwheels.

The Python Package Index (PyPI) is a package repository for Python modules. Members of the Python community publish software and libraries in it as an easy method of distribution. If you’ve ever used pip install, PyPI is the service that hosts the software you installed. You may have noticed that some installations can take a long time on the Raspberry Pi. That usually happens when modules have been implemented in C and require compilation.

XKCD comic of two people sword-fighting on office chairs while their code is compiling

No more slacking off! pip install numpy takes just a few seconds now \o/

Wheels for Python packages

A general solution to this problem exists: Python wheels are a standard for distributing pre-built versions of packages, saving users from having to build from source. However, when C code is compiled, it’s compiled for a particular architecture, so package maintainers usually publish wheels for 32-bit and 64-bit Windows, macOS, and Linux. Although Raspberry Pi runs Linux, its architecture is ARM, so Linux wheels are not compatible.

A comic of snakes biting their own tails to roll down a sand dune like wheels

What Python wheels are not

Pip works by browsing PyPI for a wheel matching the user’s architecture — and if it doesn’t find one, it falls back to the source distribution (usually a tarball or zip of the source code). Then the user has to build it themselves, which can take a long time, or may require certain dependencies. And if pip can’t find a source distribution, the installation fails.

Developing piwheels

In order to solve this problem, I decided to build wheels of every package on PyPI. I wrote some tooling for automating the process and used a postgres database to monitor the status of builds and log the output. Using a Pi 3 in my living room, I attempted to build wheels of the latest version of all 100 000 packages on PyPI and to host them on a web server on the Pi. This took a total of ten days, and my proof-of-concept seemed to show that it generally worked and was likely to be useful! You could install packages directly from the server, and installations were really fast.

A Raspberry Pi 3 sitting atop a Pi 2 on cloth

This Pi 3 was the piwheels beta server, sitting atop my SSH gateway Pi 2 at home

I proceeded to plan for version 2, which would attempt to build every version of every package — about 750 000 versions in total. I estimated this would take 75 days for one Pi, but I intended to scale it up to use multiple Pis. Our web hosts Mythic Beasts provide dedicated Pi 3 hosting, so I fired up 20 of them to share the load. With some help from Dave Jones, who created an efficient queuing system for the builders, we were able to make this run like clockwork. In under two weeks, it was complete! Read ALL about the first build run drama on my blog.

A list of the mythic beasts cloud Pis

ALL the cloud Pis

Improving piwheels

We analysed the failures, made some tweaks, installed some key dependencies, and ran the build again to raise our success rate from 76% to 83%. We also rebuilt packages for Python 3.5 (the new default in Raspbian Stretch). The wheels we build are tagged ‘armv7l’, but because our Raspbian image targets ARMv6, the binaries are really ARMv6-compatible, so they work on the Pi 3, Pi 2, Pi 1, and Pi Zero. This means the ‘armv6l’-tagged wheels we provide are simply the ‘armv7l’-tagged wheels renamed.

The piwheels monitor interface created by Dave Jones

The wonderful piwheels monitor interface created by Dave

Now, you might be thinking “Why didn’t you just cross-compile?” I really wanted to have full compatibility, and building natively on Pis seemed to be the best way to achieve that. I had easy access to the Pis, and it really didn’t take all that long. Plus, you know, I wanted to eat my own dog food.

You might also be thinking “Why don’t you just apt install python3-numpy?” It’s true that some Python packages are distributed via the Raspbian/Debian archives too. However, if you’re in a virtual environment, or you need a more recent version than the one packaged for Debian, you need pip.

How it works

The piwheels package repository now runs as a service, hosted on a Pi 3 in the Mythic Beasts data centre in London. The pip package in Raspbian Stretch is configured to use piwheels as an additional index, so it falls back to PyPI if we’re missing a package. Just run sudo apt upgrade to get the configuration change. You’ll find that pip installs are much faster now! If you want to use piwheels on Raspbian Jessie, that’s possible too — find the instructions in our FAQs. And now, every time you pip install something, your files come from a web server running on a Raspberry Pi (that capable little machine)!

Try it for yourself in a virtual environment:

sudo apt install virtualenv python3-virtualenv -y
virtualenv -p /usr/bin/python3 testpip
source testpip/bin/activate
pip install numpy

This takes about 20 minutes on a Pi 3, 2.5 hours on a Pi 1, or just a few seconds on either if you use piwheels.

If you’re interested to see the details, try pip install numpy -v for verbose output. You’ll see that pip discovers two indexes to search:

2 location(s) to search for versions of numpy:
  * https://pypi.python.org/simple/numpy/
  * https://www.piwheels.hostedpi.com/simple/numpy/

Then it searches both indexes for available files. From this list of files, it determines the latest version available. Next it looks for a Python version and architecture match, and then opts for a wheel over a source distribution. If a new package or version is released, piwheels will automatically pick it up and add it to the build queue.

A flowchart of how piwheels works

How piwheels works

For users unfamiliar with virtual environments, I should mention that doing this isn’t a requirement — it’s just an easy way of testing installations in a sandbox. Most pip usage will require sudo pip3 install {package}, which installs at a system level.

If you come across any issues with any packages from piwheels, please let us know in a GitHub issue.

Taking piwheels further

We currently provide over 670 000 wheels for more than 96 000 packages, all compiled natively on Raspberry Pi hardware. Moreover, we’ll keep building new packages as they are released.

Note that, at present, we have built wheels for Python 3.4 and 3.5 — we’re planning to add support for Python 3.6 and 2.7. The fact that piwheels is currently missing Python 2 wheels does not affect users: until we rebuild for Python 2, PyPI will be used as normal; it’ll just take longer than installing a Python 3 package for which we have a wheel. But remember, Python 2 end-of-life is less than three years away!

Many thanks to Dave Jones for his contributions to the project, and to Mythic Beasts for providing the excellent hosted Pi service.

Screenshot of the mythic beasts Raspberry Pi 3 server service website

Related/unrelated, check out my poster from the PyCon UK poster session:

A poster about Python and Raspberry Pi

Click to download the PDF!

The post piwheels: making “pip install” fast appeared first on Raspberry Pi.

Backing Up the Modern Enterprise with Backblaze for Business

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/endpoint-backup-solutions/

Endpoint backup diagram

Organizations of all types and sizes need reliable and secure backup. Whether they have as few as 3 or as many as 300,000 computer users, an organization’s computer data is a valuable business asset that needs to be protected.

Modern organizations are changing how they work and where they work, which brings new challenges to making sure that the company’s data assets are not only available, but secure. Larger organizations have IT departments that are prepared to address these needs, but oftentimes in smaller and newer organizations the challenge falls upon office management, who might not be prepared or knowledgeable enough to face a work environment undergoing dramatic changes.

Whether small or large, local or world-wide, for-profit or non-profit, organizations need a backup strategy and solution that matches the new ways of working in the enterprise.

The Enterprise Has Changed, and So Has Data Use

More and more, organizations are working in the cloud. These days organizations can operate just fine without their own file servers, database servers, mail servers, or other IT infrastructure that used to be standard for all but the smallest organization.

The reality is that for most organizations, though, it’s a hybrid work environment, with a combination of cloud-based and PC and Macintosh-based applications. Legacy apps aren’t going away any time soon. They will be with us for a while, with their accompanying data scattered amongst all the desktops, laptops and other endpoints in corporate headquarters, home offices, hotel rooms, and airport waiting areas.

In addition, the modern workforce likely combines regular full-time employees, remote workers, contractors, and sometimes interns, volunteers, and other temporary workers who also use company IT assets.

The Modern Enterprise Brings New Challenges for IT

These changes in how enterprises work present a problem for anyone tasked with making sure that data — no matter who uses it or where it lives — is adequately backed-up. Cloud-based applications, when properly used and managed, can be adequately backed up, provided that users are connected to the internet and data transfers occur regularly — which is not always the case. But what about the data on the laptops, desktops, and devices used by remote employees, contractors, or just employees whose work keeps them on the road?

The organization’s backup solution must address all the needs of the modern organization or enterprise using both cloud and PC and Mac-based applications, and not be constrained by employee or computer location.

A Ten-Point Checklist for the Modern Enterprise for Backing Up

What should the modern enterprise look for when evaluating a backup solution?

1) Easy to deploy to workers’ computers

Whether installed by the computer user or an IT person locally or remotely, the backup solution must be easy to implement quickly with minimal demands on the user or administrator.

2) Fast and unobtrusive client software

Backups should happen in the background by efficient (native) PC and Macintosh software clients that don’t consume valuable processing power or take memory away from applications the user needs.

3) Easy to configure

The backup solutions must be easy to configure for both the user and the IT professional. Ease-of-use means less time to deploy, configure, and manage.

4) Defaults to backing up all valuable data

By default, the solution backs up commonly used files and folders or directories, including desktops. Some backup solutions are difficult and intimidating because they require that the user choose what needs to be backed up, often missing files and folders/directories that contain valuable data.

5) Works automatically in the background

Backups should happen automatically, no matter where the computer is located. The computer user, especially the remote or mobile one, shouldn’t be required to attach cables or drives, or remember to initiate backups. A working solution backs up automatically without requiring action by the user or IT administrator.

6) Data restores are fast and easy

Whether it’s a single file, directory, or an entire system that must be restored, a user or IT sysadmin needs to be able to restore backed up data as quickly as possible. In cases of large restores to remote locations, the ability to send a restore via physical media is a must.

7) No limitations on data

Throttling, caps, and data limits complicate backups and require guesses about how much storage space will be needed.

8) Safe & Secure

Organizations require that their data is secure during all phases of initial upload, storage, and restore.

9) Easy-to-manage

The backup solution needs to provide a clear and simple web management interface for all functions. Designing for ease-of-use leads to efficiency in management and operation.

10) Affordable and transparent pricing

Backup costs should be predictable, understandable, and without surprises.

Two Scenarios for the Modern Enterprise

Enterprises exist in many forms and types, but wanting to meet the above requirements is common across all of them. Below, we take a look at two common scenarios showing how enterprises face these challenges. Three case studies are available that provide more information about how Backblaze customers have succeeded in these environments.

Enterprise Profile 1

The needs of a smaller enterprise differ from those of larger, established organizations. This organization likely doesn’t have anyone who is devoted full-time to IT. The job of on-boarding new employees and getting them set up with a computer likely falls upon an executive assistant or office manager. This person might give new employees a checklist with the software and account information and let users handle setting up the computer themselves.

Organizations in this profile need solutions that are easy to install and require little to no configuration. Backblaze, by default, backs up all user data, which lets the organization be secure in knowing all the data will be backed up to the cloud — including files left on the desktop. Combined with Backblaze’s unlimited data policy, organizations have a truly “set it and forget it” platform.

Customizing Groups To Meet Teams’ Needs

The Groups feature of Backblaze for Business allows an organization to decide whether an individual client’s computer will be Unmanaged (backups and restores under the control of the worker), or Managed, in which an administrator can monitor the status and frequency of backups and handle restores should they become necessary. One group for the entire organization might be adequate at this stage, but the organization has the option to add additional groups as it grows and needs more flexibility and control.

The organization, of course, has the choice of managing and monitoring users using Groups. With Backblaze’s Groups, organizations can set user-based access rules, which allows the administrator to create restores for lost files or entire computers on an employee’s behalf, to centralize billing for all client computers in the organization, and to redeploy a recovered computer or new computer with the backed up data.

Restores

In this scenario, the decision has been made to let each user manage her own backups, including restores, if necessary, of individual files or entire systems. If a restore of a file or system is needed, the restore process is easy enough for the user to handle it by herself.

Case Study 1

Read about how PagerDuty uses Backblaze for Business in a mixed enterprise of cloud and desktop/laptop applications.

PagerDuty Case Study

In a common approach, the employee can retrieve an accidentally deleted file or an earlier version of a document on her own. The Backblaze for Business interface is easy to navigate and was designed with feedback from thousands of customers over the course of a decade.

In the event of a lost, damaged, or stolen laptop, administrators of Managed Groups can initiate the restore, which could be in the form of a download of a restore ZIP file from the web management console, or the overnight shipment of a USB drive directly to the organization or user.

Enterprise Profile 2

This profile is for an organization with a full-time IT staff. When a new worker joins the team, the IT staff is tasked with configuring the computer and delivering it to the new employee.

Backblaze for Business Groups

Case Study 2

Global charitable organization charity: water uses Backblaze for Business to back up workers’ and volunteers’ laptops as they travel to developing countries in their efforts to provide clean and safe drinking water.

charity: water Case Study

This organization can take advantage of additional capabilities in Groups. A Managed Group makes sense in an organization with a geographically dispersed work force as it lets IT ensure that workers’ data is being regularly backed up no matter where they are. Billing can be company-wide or assigned to individual departments or geographical locations. The organization has the choice of how to divide the organization into Groups (location, function, subsidiary, etc.) and whether the Group should be Managed or Unmanaged. Using Managed Groups might be suitable for most of the organization, but there are exceptions in which sensitive data might dictate using an Unmanaged Group, such as could be the case with HR, the executive team, or finance.

Deployment

By Invitation Email, Link, or Domain

Backblaze for Business allows a number of options for deploying the client software to workers’ computers. Client installation is fast and easy on both Windows and Macintosh, so sending email invitations to users or automatically enrolling users by domain or invitation link is a common approach.

By Remote Deployment

IT might choose to remotely and silently deploy Backblaze for Business across specific Groups or the entire organization. An administrator can silently deploy the Backblaze backup client via the command-line, or use common RMM (Remote Monitoring and Management) tools such as Jamf and Munki.

Restores

Case Study 3

Read about how Bright Bear Technology Solutions, an IT Managed Service Provider (MSP), uses the Groups feature of Backblaze for Business to manage customer backups and restores, deploy Backblaze licenses to their customers, and centralize billing for all their client-based backup services.

Bright Bear Case Study

Some organizations are better equipped to manage or assist workers when restores become necessary. Individual users will be pleased to discover they can roll back files to an earlier version if they wish, but IT will likely manage any complete system restore that involves reconfiguring a computer after a repair or requisitioning an entirely new system when needed.

This organization might choose to retain a client’s entire computer backup for archival purposes, using Backblaze B2 as the cloud storage solution. This is another advantage of having a cloud storage provider that combines both endpoint backup and cloud object storage among its services.

The Next Step: Server Backup & Data Archiving with B2 Cloud Storage

As organizations grow, they have increased needs for cloud storage beyond Macintosh and PC data backup. Backblaze’s object cloud storage, Backblaze B2, provides low-cost storage and archiving of records, media, and server data that can grow with the organization’s size and needs.

B2 Cloud Storage is available through the same Backblaze management console as Backblaze Computer Backup. This means that Admins have one console for billing, monitoring, deployment, and role provisioning. B2 is priced at 1/4 the cost of Amazon S3, or $0.005 per month per gigabyte (which equals $5/month per terabyte).

Why Modern Enterprises Chose Backblaze

Backblaze for Business

Businesses and organizations select Backblaze for Business for backup because Backblaze is designed to meet the needs of the modern enterprise. Backblaze customers are part of a platform that has a 10+ year track record of innovation and over 400 petabytes of customer data already under management.

Backblaze’s backup model is proven through head-to-head comparisons to back up data that other backup solutions overlook in their default configurations — including valuable files that are needed after an accidental deletion, theft, or computer failure.

Backblaze is the only enterprise-level backup company that provides TOTP (Time-based One-time Password) via both SMS and Authentication app to all accounts at no incremental charge. At just $50/year/computer, Backblaze is affordable for any size of enterprise.

Modern Enterprises can Meet The Challenge of The Changing Data Environment

With the right backup solution and strategy, the modern enterprise will be prepared to ensure that its data is protected from accident, disaster, or theft, whether that data is in one office or dispersed among many locations and across remote and mobile employees.

Backblaze for Business is an affordable solution that enables organizations to meet the evolving data demands facing the modern enterprise.

The post Backing Up the Modern Enterprise with Backblaze for Business appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Hard Drive Stats for Q3 2017

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/hard-drive-failure-rates-q3-2017/

Q3 2017 Hard Drive Stats

In Q3 2017, Backblaze introduced both 10 TB and 12 TB hard drives into our data centers, we continued to retire 3 TB and 4 TB hard drives to increase storage density, and we added over 59 petabytes of data storage to bring our total storage capacity to 400 petabytes.

In this update, we’ll review the Q3 2017 and lifetime hard drive failure rates for all our drive models in use at the end of Q3. We’ll also check in on our 8 TB enterprise versus consumer hard drive comparison, and look at the storage density changes in our data centers over the past couple of years. Along the way, we’ll share our observations and insights, and as always, you can download the hard drive statistics data we use to create these reports.

Q3 2017 Hard Drive Failure Rates

Since our Q2 2017 report, we added 9,599 new hard drives and retired 6,221 hard drives, for a net add of 3,378 drives and a total of 86,529. These numbers are for those hard drives of which we have 45 or more drives — with one exception that we’ll get to in a minute.

Let’s look at the Q3 statistics that include our first look at the 10 TB and 12 TB hard drives we added in Q3. The chart below is for activity that occurred just in Q3 2017.

Hard Drive Failure Rates for Q3 2017

Observations

  1. The hard drive failure rate for the quarter was 1.84%, our lowest quarterly rate ever. There are several factors that contribute to this, but one that stands out is the average age of the hard drives in use. Only the 4 TB HGST drives (model: HDS5C4040ALE630) have an average age over 4 years — 51.3 months to be precise. The average age of all the other drive models is less than 4 years, with nearly 80% of all of the drives being less than 3 years old.
  2. The 10 TB and 12 TB drive models are new. With a combined 13,000 drive days in operation, they’ve had zero failures. While all of these drives passed through formatting and load testing without incident, it is a little too early to reach any conclusions.

Testing Drives

Normally, we list only those drive models where we have 45 drives or more, as it formerly took 45 drives (currently 60) to fill a Storage Pod. We consider a Storage Pod as a base unit for drive testing. Yet, we listed the 12 TB drives even though we only have 20 of them in operation. What gives? It’s the first step in testing drives.

A Backblaze Vault consists of 20 Storage Pods logically grouped together. Twenty 12 TB drives are deployed in the same drive position in each of the 20 Storage Pods and grouped together into a storage unit we call a “tome.” An incoming file is stored in one tome, and is spread out across the 20 storage pods in the tome for reliability and availability. The remaining 59 tomes, in this case, use 8 TB drives. This allows us to see the performance and reliability of a 12 TB hard drive model in an operational environment without having to buy 1,200 of them to start.

Breaking news: Our first Backblaze Vault filled with 1,200 Seagate 12 TB hard drives (model: ST12000NM0007) went into production on October 20th.

Storage Density Continues to Increase

As noted earlier, we retired 6,221 hard drives in Q3: all of them 3 TB or 4 TB models. The retired drives have been replaced by 8 TB, 10 TB, and 12 TB drive models. This dramatic increase in storage density added 59 petabytes of storage in Q3. The following chart shows that change since the beginning of 2016.

Hard Drive Count by Drive Size

You clearly can see the retirement of the 2 TB and 3 TB drives, each being replaced predominantly by 8 TB drives. You also can see the beginning of the retirement curve for the 4 TB drives that will be replaced most likely by 12 TB drives over the coming months. A subset of the 4 TB drives, about 10,000 or so which were installed in the past year or so, will most likely stay in service for at least the next couple of years.

Lifetime Hard Drive Stats

The table below shows the failure rates for the hard drive models we had in service as of September 30, 2017. This is over the period beginning in April 2013 and ending September 30, 2017. If you are interested in the hard drive failure rates for all the hard drives we’ve used over the years, please refer to our 2016 hard drive review.

Cumulative Hard Drive Failure Rates

Note 1: The “+ / – Change” column reflects the change in the annualized failure rate from the previous quarter. Down is good, up is bad.
Note 2: You can download the data on this chart and the data from the “Hard Drive Failure Rates for Q3 2017” chart shown earlier in this review. The downloaded ZIP file contains one MS Excel spreadsheet.

The annualized failure rate for all of the drive models listed above is 2.07%; this is higher than the 1.97% for the previous quarter. The primary driver behind this was the retirement of all of the HGST 3 TB drives (model: HDS5C3030ALA630) in Q3. Those drives had over 6 million drive days and an annualized failure rate of 0.82% — well below the average for the entire set of drives. Those drives now are gone and no longer part of the results.
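
For readers who want to reproduce these percentages from the downloadable data, the annualized failure rate is derived from failure counts and drive days. Here is a minimal Python sketch, assuming Backblaze’s usual definition of drive days divided by 365 as the denominator; the numbers in the example are illustrative, not taken from the tables above.

def annualized_failure_rate(failures, drive_days):
    """Failures per drive-year of operation, expressed as a percentage."""
    drive_years = drive_days / 365.0
    return 100.0 * failures / drive_years

# Illustrative example: 20 failures over 365,000 drive days is a 2.0% annualized rate.
print(annualized_failure_rate(20, 365000))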

Consumer Versus Enterprise Drives

The comparison of the consumer and enterprise Seagate 8 TB drives continues. Both of the drive models, Enterprise: ST8000NM0055 and Consumer: ST8000DM002, saw their annualized failure rates decrease from the previous quarter. In the case of the enterprise drives, this occurred even though we added 8,350 new drives in Q3. This brings the total number of Seagate 8 TB enterprise drives to 14,404, which have accumulated nearly 1.4 million drive days.

A comparison of the two drive models shows the annualized failure rates being very similar:

  • 8 TB Consumer Drives: 1.1% Annualized Failure Rate
  • 8 TB Enterprise Drives: 1.2% Annualized Failure Rate

Given that the failure rates for the two drive models appear to be similar, are the Seagate 8 TB enterprise drives worth any premium you might have to pay for them? As we have previously documented, the Seagate enterprise drives load data faster and have a number of features such as the PowerChoice™ technology that can be very useful. In addition, enterprise drives typically have a 5-year warranty versus a 2-year warranty for the consumer drives. While drive price and availability are our primary considerations, you may decide other factors are more important.

We will continue to follow these drives, especially as they age over 2 years: the warranty point for the consumer drives.

Join the Drive Stats Webinar on Friday, November 3

We will be doing a deeper dive on this review in a webinar: “Q3 2017 Hard Drive Failure Stats” being held on Friday, November 3rd at 10:00 am Pacific Time. We’ll dig into what’s behind the numbers, including the enterprise vs consumer drive comparison. To sign up for the webinar, you will need to subscribe to the Backblaze BrightTALK channel if you haven’t already done so.

Wrapping Up

Our next drive stats post will be in January, when we’ll review the data for Q4 and all of 2017, and we’ll update our lifetime stats for all of the drives we have ever used. In addition, we’ll get our first real look at the 12 TB drives.

As a reminder, the hard drive data we use is available on our Hard Drive Test Data page. You can download and use this data for free for your own purpose. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone; it is free.

Good luck and let us know if you find anything interesting.

The post Hard Drive Stats for Q3 2017 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Some notes about the Kaspersky affair

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/10/some-notes-about-kaspersky-affair.html

I thought I’d write up some notes about Kaspersky, the Russian anti-virus vendor that many believe has ties to Russian intelligence.

There are two angles to this story. One is whether the accusations are true. The second is the poor way the press has handled the story, with mainstream outlets like the New York Times more intent on pushing government propaganda than informing us what’s going on.

The press

Before we address Kaspersky, we need to talk about how the press covers this.
The mainstream media’s stories have been pure government propaganda, like this one from the New York Times. It garbles the facts of what happened, and relies primarily on anonymous government sources that cannot be held accountable. It’s so messed up that we can’t easily challenge it because we aren’t even sure exactly what it’s claiming.
The Society of Professional Journalists has a name for this abuse of anonymous sources: the “Washington Game”. Journalists can identify this as bad journalism, but big newspapers like The New York Times continue to do it anyway, because how dare anybody criticize them?
For all that I hate the anti-American bias of The Intercept, at least they’ve had stories that de-garble what’s going on, that explain things so that we can challenge them.

Our Government

Our government can’t tell us everything, of course. But at the same time, they need to tell us something, to at least be clear about what their accusations are. These vague insinuations through the media hurt their credibility, not help it. The obvious craptitude is making us in the cybersecurity community come to Kaspersky’s defense, which is not the government’s aim at all.
There are lots of issues involved here, but let’s consider the major one insinuated by the NYTimes story, that Kaspersky was getting “data” files along with copies of suspected malware. This is troublesome if true.
But, as Kaspersky claims today, it’s because they had detected malware within a zip file, and uploaded the entire zip — including the data files within the zip.
This is reasonable. This is indeed how anti-virus generally works. It completely defeats the NYTimes insinuations.
This isn’t to say Kaspersky is telling the truth, of course, but that’s not the point. The point is that we are getting vague propaganda from the government further garbled by the press, making Kaspersky’s clear defense the credible party in the affair.
It’s certainly possible for Kaspersky to write signatures to look for strings like “TS//SI/OC/REL TO USA” that appear in secret US documents, then upload them to Russia. If that’s what our government believes is happening, they need to come out and be explicit about it. They can easily set up honeypots, in the way described in today’s story, to confirm it. However, it seems the government’s description of the honeypots is that Kaspersky only uploaded files that were clearly viruses, not data.

Kaspersky

I believe Kaspersky is guilty: that the company, and Eugene himself, work directly with Russian intelligence.
That’s because on a personal basis, people in government have given me specific, credible stories — the sort of thing they should be making public. And these stories are wholly unrelated to stories that have been made public so far.
You shouldn’t believe me, of course, because I won’t go into details you can challenge. I’m not trying to convince you, I’m just disclosing my point of view.
But there are some public reasons to doubt Kaspersky. For example, when trying to sell to our government, they’ve claimed they can help us against terrorists. The translation of this is that they could help our intelligence services. Well, if they are willing to help our intelligence services against customers who are terrorists, then why wouldn’t they likewise help Russian intelligence services against their adversaries?
Then there is how Russia works. It’s a violent country. Most of the people mentioned in that “Steele Dossier” have died. In the hacker community, hackers are often coerced to help the government. Many have simply gone missing.
Being rich doesn’t make Kaspersky immune from this — it makes him more of a target. Russian intelligence knows he’s getting all sorts of good intelligence, such as malware written by foreign intelligence services. It’s unbelievable they wouldn’t put the screws on him to get this sort of thing.
Russia is our adversary. It’d be foolish of our government to buy anti-virus from Russian companies. Likewise, the Russian government won’t buy such products from American companies.

Conclusion

I have enormous disrespect for mainstream outlets like The New York Times and the way they’ve handled the story. It makes me want to come to Kaspersky’s defense.

I have enormous respect for Kaspersky technology. They do good work.

But I hear stories. I don’t think our government should be trusting Kaspersky at all. For that matter, our government shouldn’t trust any cybersecurity products from Russia, China, Iran, etc.

Automating Security Group Updates with AWS Lambda

Post Syndicated from Ian Scofield original https://aws.amazon.com/blogs/compute/automating-security-group-updates-with-aws-lambda/

Customers often use public endpoints to perform cross-region replication or other application-layer communication to remote regions. But a common problem is how to protect these endpoints. It can be tempting to open up the security groups to the world due to the complexity of keeping security groups in sync across regions with a dynamically changing infrastructure.

Consider a situation where you are running large clusters of instances in different regions that all require internode connectivity. One approach would be to use a VPN tunnel between regions to provide a secure tunnel over which to send your traffic. A good example of this is the Transit VPC Solution, which is a published AWS solution to help customers quickly get up and running. However, this adds additional cost and complexity to your solution due to the newly required additional infrastructure.

Another approach, which I’ll explore in this post, is to restrict access to the nodes by whitelisting the public IP addresses of your hosts in the opposite region. Today, I’ll outline a solution that allows for cross-region security group updates, can handle remote region failures, and supports external actions such as manually terminating instances or adding instances to an existing Auto Scaling group.

Solution overview

The overview of this solution is diagrammed below. Although this post covers limiting access to your instances, you should still implement encryption to protect your data in transit.

If your entire infrastructure is running in a single region, you can reference a security group as the source, allowing your IP addresses to change without any updates required. However, if you’re going across the public internet between regions, for things like application-level traffic or cross-region replication, this is no longer an option: security groups are regional. When you go across regions it can be tempting to drop security to enable this communication.

Although using an Elastic IP address can provide you with a static IP address that you can define as a source for your security groups, this may not always be feasible, especially when automatic scaling is desired.

In this example scenario, you have a distributed database that requires full internode communication for replication. If you place a cluster in us-east-1 and us-west-2, you must provide a secure method of communication between the two. Because the database uses cloud best practices, you can add or remove nodes as the load varies.

To start the process of updating your security groups, you must know when an instance has come online to trigger your workflow. Auto Scaling groups have the concept of lifecycle hooks that enable you to perform custom actions as the group launches or terminates instances.

When Auto Scaling begins to launch or terminate an instance, it puts the instance into a wait state (Pending:Wait or Terminating:Wait). The instance remains in this state while you perform your various actions until either you tell Auto Scaling to Continue, Abandon, or the timeout period ends. A lifecycle hook can trigger a CloudWatch event, publish to an Amazon SNS topic, or send to an Amazon SQS queue. For this example, you use CloudWatch Events to trigger an AWS Lambda function that updates an Amazon DynamoDB table.
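
For illustration, a lifecycle hook like the one in the example event below can also be created directly with boto3. This is a minimal sketch with placeholder group and hook names; the SAM template later in this post creates the equivalent hooks for you.

import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.put_lifecycle_hook(
    LifecycleHookName="hook-launching",          # placeholder name
    AutoScalingGroupName="my-cluster-asg",       # placeholder name
    LifecycleTransition="autoscaling:EC2_INSTANCE_LAUNCHING",
    HeartbeatTimeout=300,    # seconds to wait before the default result applies
    DefaultResult="ABANDON", # what Auto Scaling does if no CONTINUE arrives
)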

Component breakdown

Here’s a quick breakdown of the components involved in this solution:

• Lambda function
• CloudWatch event
• DynamoDB table

Lambda function

The Lambda function automatically updates your security groups in the following way (a minimal boto3 sketch of this flow appears after the list):

1. Determines whether the change was triggered by your Auto Scaling group lifecycle hook or manually invoked for the “true up” functionality, which I discuss later in this post.
2. Describes the instances in the Auto Scaling group and obtains the public IP address for each instance.
3. Updates both the local and remote DynamoDB tables.
4. Compares the list of public IP addresses for both the local and remote clusters with what’s already in the local region security group, and updates that security group.
5. Compares the list of public IP addresses for both the local and remote clusters with what’s already in the remote region security group, and updates that security group.
6. Signals CONTINUE back to the lifecycle hook.
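
The following is a minimal, illustrative boto3 sketch of that flow for the local region only. The table name, security group ID, and port come from hypothetical environment variables, and the real function in the SAM template also updates the remote region, removes stale rules, and handles errors.

import os
import boto3
from botocore.exceptions import ClientError

PORT = int(os.environ.get("CLUSTER_PORT", "9042"))    # hypothetical variable
TABLE = os.environ["LOCAL_TABLE"]                     # hypothetical variable
SECURITY_GROUP = os.environ["LOCAL_SECURITY_GROUP"]   # hypothetical variable

autoscaling = boto3.client("autoscaling")
ec2 = boto3.client("ec2")
dynamodb = boto3.resource("dynamodb")

def public_ips(asg_name):
    # Return the public IP addresses of all instances in the group.
    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[asg_name])["AutoScalingGroups"][0]
    instance_ids = [i["InstanceId"] for i in group["Instances"]]
    if not instance_ids:
        return []
    reservations = ec2.describe_instances(InstanceIds=instance_ids)["Reservations"]
    return [i["PublicIpAddress"]
            for r in reservations for i in r["Instances"]
            if "PublicIpAddress" in i]

def lambda_handler(event, context):
    asg_name = event["detail"]["AutoScalingGroupName"]
    ips = public_ips(asg_name)

    # Record the local group's public IPs in the local DynamoDB table
    # (assumes a table keyed on "region"; the real function writes to the
    # remote region's table as well).
    dynamodb.Table(TABLE).put_item(
        Item={"region": os.environ["AWS_REGION"], "ips": ips})

    # Reconcile the security group: authorize the current IPs.
    # (The real function also revokes rules for IPs that are gone.)
    for ip in ips:
        try:
            ec2.authorize_security_group_ingress(
                GroupId=SECURITY_GROUP, IpProtocol="tcp",
                FromPort=PORT, ToPort=PORT, CidrIp=ip + "/32")
        except ClientError:
            pass  # the rule already exists

    # Signal CONTINUE so Auto Scaling can finish launching or terminating.
    if not event.get("trueup"):
        autoscaling.complete_lifecycle_action(
            LifecycleHookName=event["detail"]["LifecycleHookName"],
            AutoScalingGroupName=asg_name,
            LifecycleActionToken=event["detail"]["LifecycleActionToken"],
            LifecycleActionResult="CONTINUE")
    return {"ips": ips}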

CloudWatch event

The CloudWatch event triggers when an instance passes through either the launching or terminating states. When the Lambda function gets invoked, it receives an event that looks like the following:

{
	"account": "123456789012",
	"region": "us-east-1",
	"detail": {
		"LifecycleHookName": "hook-launching",
		"AutoScalingGroupName": "",
		"LifecycleActionToken": "33965228-086a-4aeb-8c26-f82ed3bef495",
		"LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
		"EC2InstanceId": "i-017425ec54f22f994"
	},
	"detail-type": "EC2 Instance-launch Lifecycle Action",
	"source": "aws.autoscaling",
	"version": "0",
	"time": "2017-05-03T02:20:59Z",
	"id": "cb930cf8-ce8b-4b6c-8011-af17966eb7e2",
	"resources": [
		"arn:aws:autoscaling:us-east-1:123456789012:autoScalingGroup:d3fe9d96-34d0-4c62-b9bb-293a41ba3765:autoScalingGroupName/"
	]
}

DynamoDB table

You use DynamoDB to store lists of remote IP addresses in a local table that is updated by the opposite region as a failsafe source of truth. Although you can describe your Auto Scaling group for the local region, you must maintain a list of IP addresses for the remote region.

To minimize the number of describe calls and prevent an issue in the remote region from blocking your local scaling actions, we keep a list of the remote IP addresses in a local DynamoDB table. Each Lambda function in each region is responsible for updating the public IP addresses of its Auto Scaling group for both the local and remote tables.

As with all the infrastructure in this solution, there is a DynamoDB table in each region, and the two tables mirror each other. For example, the following screenshot shows a sample DynamoDB table. The Lambda function in us-east-1 would update the DynamoDB entry for us-east-1 in the tables in both regions.

Updating a DynamoDB table in both regions allows the local region to gracefully handle issues with the remote region that would otherwise prevent your ability to scale locally. If the remote region becomes inaccessible, you have a copy of the latest configuration in the table that you can use to continue to sync with your security groups. When the remote region comes back online, it pushes its updated public IP addresses to the DynamoDB table, and the remote Lambda function updates the security group to reflect the current status.


Walkthrough

Note: All of the following steps are performed in both regions. The Launch Stack buttons will default to the us-east-1 region.

Here’s a quick overview of the steps involved in this process:

1. An instance is launched or terminated, which triggers an Auto Scaling group lifecycle hook, triggering the Lambda function via CloudWatch Events.
2. The Lambda function retrieves the list of public IP addresses for all instances in the local region Auto Scaling group.
3. The Lambda function updates the local and remote region DynamoDB tables with the public IP addresses just received for the local Auto Scaling group.
4. The Lambda function updates the local region security group with the public IP addresses, removing and adding to ensure that it mirrors what is present for the local and remote Auto Scaling groups.
5. The Lambda function updates the remote region security group with the public IP addresses, removing and adding to ensure that it mirrors what is present for the local and remote Auto Scaling groups.

Prerequisites

To deploy this solution, you need to have Auto Scaling groups, launch configurations, and a base security group in both regions. To expedite this process, this CloudFormation template can be launched in both regions.

Step 1: Launch the AWS SAM template in the first region

To make the deployment process easy, I’ve created an AWS Serverless Application Model (AWS SAM) template, which is a new specification that makes it easier to manage and deploy serverless applications on AWS. This template creates the following resources:

• A Lambda function, to perform the various security group actions
• A DynamoDB table, to track the state of the local and remote Auto Scaling groups
• Auto Scaling group lifecycle hooks for instance launching and terminating
• A CloudWatch event, to track the EC2 Instance-launch Lifecycle Action and EC2 Instance-terminate Lifecycle Action events
• A pointer from the CloudWatch event to the Lambda function, and the necessary permissions

Download the template from here or click to launch.

Upon launching the template, you’ll be presented with a list of parameters that includes the local and remote names for your Auto Scaling groups, the AWS Regions, security group IDs, and DynamoDB table names, as well as where the code for the Lambda function is located. Because this is the first region you’re launching the stack in, fill out all the parameters except for the RemoteTable parameter, as that table hasn’t been created yet (you fill this in later).

Step 2: Test the local region

After the stack has finished launching, you can test the local region. Open the EC2 console and find the Auto Scaling group that was created when launching the prerequisite stack. Change the desired number of instances from 0 to 1.

For both regions, check your security group to verify that the public IP address of the instance created is now in the security group.

Local region:

Remote region:

Now, change the desired number of instances for your group back to 0 and verify that the rules are properly removed.

Local region:

Remote region:

Step 3: Launch in the remote region

When you deploy a Lambda function using CloudFormation, the Lambda zip file needs to reside in the same region you are launching the template. Once you choose your remote region, create an Amazon S3 bucket and upload the Lambda zip file there. Next, go to the remote region and launch the same SAM template as before, but make sure you update the CodeBucket and CodeKey parameters. Also, because this is the second launch, you now have all the values and can fill out all the parameters, specifically the RemoteTable value.


Step 4: Update the local region Lambda environment variable

When you originally launched the template in the local region, you didn’t have the name of the DynamoDB table for the remote region, because you hadn’t created it yet. Now that you have launched the remote template, you can perform a CloudFormation stack update on the initial SAM template. This populates the remote DynamoDB table name into the initial Lambda function’s environment variables.

In the CloudFormation console in the initial region, select the stack. Under Actions, choose Update Stack, and select the SAM template used for both regions. Under Parameters, populate the remote DynamoDB table name, as shown below. Choose Next and let the stack update complete. This updates your Lambda function and completes the setup process.


Step 5: Final testing

You now have everything fully configured and in place to trigger security group changes based on instances being added to or removed from your Auto Scaling groups in both regions. Test this by changing the desired capacity of your group in both regions.

True up functionality
If an instance is manually added or removed from the Auto Scaling group, the lifecycle hooks don’t get triggered. To account for this, the Lambda function supports a “true up” functionality in which the function can be manually invoked. If you paste in the following JSON text for your test event, it kicks off the entire workflow. For added peace of mind, you can also have this function fire via a CloudWatch event with a CRON expression for nearly continuous checking.

{
	"detail": {
		"AutoScalingGroupName": "<your ASG name>"
	},
	"trueup":true
}

Extra credit

Now that all the resources are created in both regions, go back and break down the policy to incorporate resource-level permissions for specific security groups, Auto Scaling groups, and the DynamoDB tables.

Although this post is centered around using public IP addresses for your instances, you could instead use a VPN between regions. In this case, you would still be able to use this solution to scope down the security groups to the cluster instances. However, the code would need to be modified to support private IP addresses.


Conclusion

At this point, you now have a mechanism in place that captures when a new instance is added to or removed from your cluster and updates the security groups in both regions. This ensures that you are locking down your infrastructure securely by allowing access only to other cluster members.

Keep in mind that this architecture (lifecycle hooks, CloudWatch event, Lambda function, and DynamoDB table) requires the infrastructure to be deployed in both regions so that synchronization goes both ways.

Because this Lambda function is modifying security group rules, it’s important to have an audit log of what has been modified and who is modifying them. The out-of-the-box function provides logs in CloudWatch for what IP addresses are being added and removed for which ports. As these are all API calls being made, they are logged in CloudTrail and can be traced back to the IAM role that you created for your lifecycle hooks. This can provide historical data that can be used for troubleshooting or auditing purposes.
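
As a rough illustration of pulling that audit trail programmatically, here is a hedged boto3 sketch that uses CloudTrail’s lookup_events API to list recent security group rule changes; the event names are the standard EC2 API actions.

import boto3

cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")

for event_name in ("AuthorizeSecurityGroupIngress", "RevokeSecurityGroupIngress"):
    events = cloudtrail.lookup_events(
        LookupAttributes=[{"AttributeKey": "EventName",
                           "AttributeValue": event_name}],
        MaxResults=10)["Events"]
    for e in events:
        # Username is the IAM identity (for example, the Lambda role session)
        # that made the change; CloudTrailEvent holds the full JSON record.
        print(e["EventTime"], event_name, e.get("Username", "unknown"))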

Security is paramount at AWS. We want to ensure that customers are protecting access to their resources. This solution helps you keep your security groups in both regions automatically in sync with your Auto Scaling group resources. Let us know if you have any questions or other solutions you’ve come up with!

RIAA Identifies Top YouTube MP3 Rippers and Other Pirate Sites

Post Syndicated from Ernesto original https://torrentfreak.com/riaa-identifies-top-youtube-mp3-rippers-and-other-pirate-sites-171006/

Around the same time as Hollywood’s MPAA, the RIAA has also submitted its overview of “notorious markets” to the Office of the US Trade Representative (USTR).

These submissions help to guide the U.S. Government’s position toward foreign countries when it comes to copyright enforcement.

The RIAA’s overview begins positively, announcing two major successes achieved over the past year.

The first is the shutdown of sites such as Emp3world, AudioCastle, Viperial, Album Kings, and im1music. These sites all used the now-defunct Sharebeast platform, whose operator pleaded guilty to criminal copyright infringement.

Another victory followed a few weeks ago when YouTube-MP3.org shut down its services after being sued by the RIAA.

“The most popular YouTube ripping site, youtube-mp3.org, based in Germany and included in last year’s list of notorious markes [sic], recently shut down in response to a civil action brought by major record labels,” the RIAA writes.

This case also had an effect on similar services. Some stream ripping services that were reported to the USTR last year no longer permit the conversion and download of music videos on YouTube, the RIAA reports. However, they add that the problem is far from over.

“Unfortunately, several other stream-ripping sites have ‘doubled down’ and carry on in this illegal behavior, continuing to make this form of theft a major concern for the music industry,” the music group writes.

“The overall popularity of these sites and the staggering volume of traffic it attracts evidences the enormous damage being inflicted on the U.S. record industry.”

The music industry group is tracking more than 70 of these stream ripping sites and the most popular ones are listed in the overview of notorious markets. These are Mp3juices.cc, Convert2mp3.net, Savefrom.net, Ytmp3.cc, Convertmp3.io, Flvto.biz, and 2conv.com.

Youtube2mp3’s listing

The RIAA notes that many sites use domain privacy services to hide their identities, as well as Cloudflare to obscure the sites’ true hosting locations. This frustrates efforts to take action against these sites, they say.

Popular torrent sites are also highlighted, including The Pirate Bay. These sites regularly change domain names to avoid ISP blockades and domain seizures, and also use Cloudflare to hide their hosting location.

“BitTorrent sites, like many other pirate sites, are increasing [sic] turning to Cloudflare because routing their site through Cloudflare obfuscates the IP address of the actual hosting provider, masking the location of the site.”

Finally, the RIAA reports several emerging threats reported to the Government. Third party app stores, such as DownloadAtoZ.com, reportedly offer a slew of infringing apps. In addition, there’s a boom of Nigerian pirate sites that flood the market with free music.

“The number of such infringing sites with a Nigerian operator stands at over 200. Their primary method of promotion is via Twitter, and most sites make use of the Nigerian operated ISP speedhost247.com,” the report notes.

The full list of RIAA’s “notorious” pirate sites, which also includes several cyberlockers, MP3 search and download sites, as well as unlicensed pay services, can be found below. The full report is available here (pdf).

Stream-Ripping Sites

– Mp3juices.cc
– Convert2mp3.net
– Savefrom.net
– Ytmp3.cc
– Convertmp3.io
– Flvto.biz
– 2conv.com

Search-and-Download Sites

– Newalbumreleases.net
– Rnbxclusive.top
– DNJ.to

BitTorrent Indexing and Tracker Sites

– Thepiratebay.org
– Torrentdownloads.me
– Rarbg.to
– 1337x.to

Cyberlockers

– 4shared.com
– Uploaded.net
– Zippyshare.com
– Rapidgator.net
– Dopefile.pk
– Chomikuj.pl

Unlicensed Pay-for-Download Sites

– Mp3va.com
– Mp3fiesta.com

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Backing Up WordPress

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/backing-up-wordpress/

WordPress cloud backup
WordPress logo

WordPress is the most popular CMS (Content Management System) for websites, with almost 30% of all websites in the world using WordPress. That’s a lot of sites — over 350 million!

In this post we’ll talk about the different approaches to keeping the data on your WordPress website safe.


Stop the Presses! (Or the Internet!)

As we were getting ready to publish this post, we received news from UpdraftPlus, one of the biggest WordPress plugin developers, that they are supporting Backblaze B2 as a storage solution for their backup plugin. They shipped the update (1.13.9) this week. This is great news for Backblaze customers! UpdraftPlus is also offering a 20% discount to Backblaze customers wishing to purchase or upgrade to UpdraftPlus Premium. The complete information is below.

UpdraftPlus joins backup plugin developer XCloner — Backup and Restore in supporting Backblaze B2. A third developer, BlogVault, also announced their intent to support Backblaze B2. Contact your favorite WordPress backup plugin developer and urge them to support Backblaze B2, as well.

Now, back to our post…


Your WordPress website data is on a web server that’s most likely located in a large data center. You might wonder why it is necessary to have a backup of your website if it’s in a data center. Website data can be lost in a number of ways, including mistakes by the website owner (been there), hacking, or even domain ownership dispute (I’ve seen it happen more than once). A website backup also can provide a history of changes you’ve made to the website, which can be useful. As an overall strategy, it’s best to have a backup of any data that you can’t afford to lose for personal or business reasons.

Your web hosting company might provide backup services as part of your hosting plan. If you are using their service, you should know where and how often your data is being backed up. You don’t want to find out too late that your backup plan was not adequate.

Sites on WordPress.com are automatically backed up by VaultPress (Automattic), which also is available for self-hosted WordPress installations. If you don’t want the work or decisions involved in managing the hosting for your WordPress site, WordPress.com will handle it for you. You do, however, give up some customization abilities, such as the option to add plugins of your own choice.

Very large and active websites might consider WordPress VIP by Automattic, or another premium WordPress hosting service such as Pagely.com.

This post is about backing up self-hosted WordPress sites, so we’ll focus on those options.

WordPress Backup

Backup strategies for WordPress can be divided into broad categories depending on 1) what you back up, 2) when you back up, and 3) where the data is backed up.

With server data, such as with a WordPress installation, you should plan to have three copies of the data (the 3-2-1 backup strategy). The first is the active data on the WordPress web server, the second is a backup stored on the web server or downloaded to your local computer, and the third should be in another location, such as the cloud.

We’ll talk about the different approaches to backing up WordPress, but we recommend using a WordPress plugin to handle your backups. A backup plugin can automate the task, optimize your backup storage space, and alert you of problems with your backups or WordPress itself. We’ll cover plugins in more detail, below.

What to Back Up?

The main components of your WordPress installation are:

  • The MySQL database, which holds your posts, pages, comments, and site settings
  • Your active theme, including any customizations you have made to it
  • Any plugins you have installed
  • Media files (images, audio, video) that you have uploaded to the site
  • The WordPress core files themselves

You should decide which of these elements you wish to back up. The database is the top priority, as it contains all your website posts and pages (exclusive of media). Your current theme is important, as it likely contains customizations you’ve made. Following those in priority are any other files you’ve customized or made changes to.

You can choose to back up the WordPress core installation and plugins, if you wish, but these files can be downloaded again if necessary from the source, so you might not wish to include them. You likely have all the media files you use on your website on your local computer (which should be backed up), so it is your choice whether to back these up from the server as well.

If you wish to be able to recreate your entire website easily in case of data loss or disaster, you might choose to back up everything, though on a large website this could be a lot of data.

Generally, you should 1) prioritize any file that you’ve customized that you can’t afford to lose, and 2) decide whether you need a copy of everything in order to get your site back up quickly. These choices will determine your backup method and the amount of storage you need.

A good backup plugin for WordPress enables you to specify which files you wish to back up, and even to create separate backups and schedules for different backup contents. That’s another good reason to use a plugin for backing up WordPress.

When to Back Up?

You can back up manually at any time by using the Export tool in WordPress. This is handy if you wish to do a quick backup of your site or parts of it. Since it is manual, however, it is not a part of a dependable backup plan that should be done regularly. If you wish to use this tool, go to Tools, Export, and select what you wish to back up. The output will be an XML file that uses the WordPress Extended RSS format, also known as WXR. You can create a WXR file that contains all of the information on your site or just portions of the site, such as posts or pages by selecting: All content, Posts, Pages, or Media.
Note: You can use WordPress’s Export tool for sites hosted on WordPress.com, as well.

Export instruction for WordPress

Many of the backup plugins we’ll be discussing later also let you do a manual backup on demand in addition to regularly scheduled or continuous backups.

Note:  Another use of the WordPress Export tool and the WXR file is to transfer or clone your website to another server. Once you have exported the WXR file from the website you wish to transfer from, you can import the WXR file from the Tools, Import menu on the new WordPress destination site. Be aware that there are file size limits depending on the settings on your web server. See the WordPress Codex entry for more information. To make this job easier, you may wish to use one of a number of WordPress plugins designed specifically for this task.

You also can manually back up the WordPress MySQL database using a number of tools or a plugin. The WordPress Codex has good information on this. The WordPress backup plugins discussed below will handle this for you and do it automatically. They also typically include tools for optimizing the database tables, which is just good housekeeping.
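
If you prefer to script a manual database backup rather than rely on a plugin, a minimal sketch is below. It assumes mysqldump is installed on the machine and that you substitute the host, database name, and user from your own wp-config.php; the password should come from the MYSQL_PWD environment variable or a ~/.my.cnf file so it never appears on the command line.

import gzip
import subprocess
from datetime import datetime

# Placeholder connection details; take the real values from wp-config.php.
DB_HOST = "localhost"
DB_NAME = "wordpress"
DB_USER = "wp_user"

outfile = "wordpress-db-{:%Y%m%d-%H%M%S}.sql.gz".format(datetime.now())

# Run mysqldump and compress its output with gzip.
dump = subprocess.run(
    ["mysqldump", "--host", DB_HOST, "--user", DB_USER,
     "--single-transaction", DB_NAME],
    check=True, capture_output=True)

with gzip.open(outfile, "wb") as f:
    f.write(dump.stdout)

print("Wrote " + outfile)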

A dependable backup strategy doesn’t rely on manual backups, which means you should consider using one of the many backup plugins available either free or for purchase. We’ll talk more about them below.

Which Format To Back Up In?

In addition to the WordPress WXR format, plugins and server tools will use various file formats and compression algorithms to store and compress your backup. You may get to choose between zip, tar, tar.gz, tar.bz2, and others. See The Most Common Archive File Formats for more information on these formats.

Select a format that you know you can access and unarchive should you need access to your backup. All of these formats are standard and supported across operating systems, though you might need to download a utility to access the file.
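
As an example, Python’s standard library can open most of these formats, which is a handy fallback if you need to inspect a backup on a machine without a dedicated archive utility. A small sketch with a placeholder filename:

import tarfile
import zipfile

backup = "wordpress-backup.tar.gz"  # placeholder filename

# tarfile handles tar, tar.gz, and tar.bz2 archives; zipfile handles zip.
if tarfile.is_tarfile(backup):
    with tarfile.open(backup, "r:*") as archive:
        archive.extractall("restored-site")
elif zipfile.is_zipfile(backup):
    with zipfile.ZipFile(backup) as archive:
        archive.extractall("restored-site")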

Where To Back Up?

Once you have your data in a suitable format for backup, where do you back it up to?

We want to have multiple copies of our active website data, so we’ll choose more than one destination for our backup data. The backup plugins we’ll discuss below enable you to specify one or more possible destinations for your backup. The possible destinations for your backup include:

  • A backup folder on your web server: This is an OK solution if you also have a copy elsewhere. Depending on your hosting plan, the size of your site, and what you include in the backup, you may or may not have sufficient disk space on the web server. Some backup plugins allow you to configure the plugin to keep only a certain number of recent backups and delete older ones, saving you disk space on the server.
  • Email to you: Because email servers have size limitations, the email option is not the best one to use unless you use it specifically to back up just the database or your main theme files.
  • FTP, SFTP, SCP, WebDAV: These are all widely supported protocols for transferring files over the internet and can be used if you have access credentials to another server or supported storage device that is suitable for storing a backup.
  • Sync service (Dropbox, SugarSync, Google Drive, OneDrive): A sync service is another possible storage location, though it can be a pricier choice depending on the plan you have and how much you wish to store.
  • Cloud storage (Backblaze B2, Amazon S3, Google Cloud, Microsoft Azure, Rackspace): A cloud storage service can be an inexpensive and flexible option with pay-as-you-go pricing for storing backups and other data.

A good website backup strategy would be to have multiple backups of your website data: one in a backup folder on your web hosting server, one downloaded to your local computer, and one in the cloud, such as with Backblaze B2.

If I had to choose just one of these, I would choose backing up to the cloud because it is geographically separated from both your local computer and your web host, it uses fault-tolerant and redundant data storage technologies to protect your data, and it is available from anywhere if you need to restore your site.

Backup Plugins for WordPress

Probably the easiest and most common way to implement a solid backup strategy for WordPress is to use one of the many backup plugins available for WordPress. Fortunately, there are a number of good ones, available free or under “freemium” plans in which you can use the free version and pay for more features and capabilities only if you need them. The premium options can give you more flexibility in configuring backups or have additional options for where you can store the backups.

How to Choose a WordPress Backup Plugin

screenshot of WordPress plugins search

When considering which plugin to use, you should take into account a number of factors in making your choice.

Is the plugin actively maintained and up-to-date? You can determine this from the listing in the WordPress Plugin Repository. You also can look at reviews and support comments to get an idea of user satisfaction and how well issues are resolved.

Does the plugin work with your web hosting provider? Generally, well-supported plugins do, but you might want to check to make sure there are no issues with your hosting provider.

Does it support the cloud service or protocol you wish to use? This can be determined from looking at the listing in the WordPress Plugin Repository or on the developer’s website. Developers often will add support for cloud services or other backup destinations based on user demand, so let the developer know if there is a feature or backup destination you’d like them to add to their plugin.

Other features and options to consider in choosing a backup plugin are:

  • Whether encryption of your backup data is available
  • What are the options for automatically deleting backups from the storage destination?
  • Can you globally exclude files, folders, and specific types of files from the backup?
  • Do the options for scheduling automatic backups meet your needs for frequency?
  • Can you exclude/include specific database tables (a good way to save space in your backup)?

WordPress Backup Plugins Review

Let’s review a few of the top choices for WordPress backup plugins.

UpdraftPlus

UpdraftPlus is one of the most popular backup plugins for WordPress with over one million active installations. It is available in both free and Premium versions.

UpdraftPlus just released support for Backblaze B2 Cloud Storage in their 1.13.9 update on September 25. According to the developer, support for Backblaze B2 was the most frequent request for a new storage option for their plugin. B2 support is available in their Premium plugin and as a stand-alone update to their standard product.

Note: The developers of UpdraftPlus are offering a special 20% discount to Backblaze customers on the purchase of UpdraftPlus Premium by using the coupon code backblaze20. The discount is valid until the end of Friday, October 6th, 2017.

screenshot of Backblaze B2 cloud backup for WordPress in UpdraftPlus

XCloner — Backup and Restore

XCloner — Backup and Restore is a useful open-source plugin with many options for backing up WordPress.

XCloner supports B2 Cloud Storage in their free plugin.

screenshot of XCloner WordPress Backblaze B2 backup settings

BlogVault

BlogVault describes themselves as a “complete WordPress backup solution.” They offer a free trial of their paid WordPress backup subscription service that features real-time backups of changes to your WordPress site, as well as many other features.

BlogVault has announced their intent to support Backblaze B2 Cloud Storage in a future update.

screenshot of BlogValut WordPress Backup settings

BackWPup

BackWPup is a popular and free option for backing up WordPress. It supports a number of options for storing your backup, including the cloud, FTP, email, or your local computer.

screenshot of BackWPup WordPress backup settings

WPBackItUp

WPBackItUp has been around since 2012 and is highly rated. It has both free and paid versions.

screenshot of WPBackItUp WordPress backup settings

VaultPress

VaultPress is part of Automattic’s well-known WordPress product, JetPack. You will need a JetPack subscription plan to use VaultPress. There are different pricing plans with different sets of features.

screenshot of VaultPress backup settings

Backup by Supsystic

Backup by Supsystic supports a number of options for backup destinations, encryption, and scheduling.

screenshot of Backup by Supsystic backup settings

BackupWordPress

BackUpWordPress is an open-source project on GitHub that has a popular and active following and many positive reviews.

screenshot of BackupWordPress WordPress backup settings

BackupBuddy

BackupBuddy, from iThemes, is the old-timer of backup plugins, having been around since 2010. iThemes knows a lot about WordPress, as they develop plugins, themes, utilities, and provide training in WordPress.

BackupBuddy’s backup includes all WordPress files, all files in the WordPress Media library, WordPress themes, and plugins. BackupBuddy generates a downloadable zip file of the entire WordPress website. Remote storage destinations also are supported.

screenshot of BackupBuddy settings

WordPress and the Cloud

Do you use WordPress and back up to the cloud? We’d like to hear about it. We’d also like to hear whether you are interested in using B2 Cloud Storage for storing media files served by WordPress. If you are, we’ll write about it in a future post.

In the meantime, keep your eye out for new plugins supporting Backblaze B2, or better yet, urge them to support B2 if they’re not already.

The Best Backup Strategy is the One You Use

There are other approaches and tools for backing up WordPress that you might use. If you have an approach that works for you, we’d love to hear about it in the comments.

The post Backing Up WordPress appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Using Enhanced Request Authorizers in Amazon API Gateway

Post Syndicated from Stefano Buliani original https://aws.amazon.com/blogs/compute/using-enhanced-request-authorizers-in-amazon-api-gateway/

Recently, AWS introduced a new type of authorizer in Amazon API Gateway, enhanced request authorizers. Previously, custom authorizers received only the bearer token included in the request and the ARN of the API Gateway method being called. Enhanced request authorizers receive all of the headers, query string, and path parameters as well as the request context. This enables you to make more sophisticated authorization decisions based on parameters such as the client IP address, user agent, or a query string parameter alongside the client bearer token.

Enhanced request authorizer configuration

From the API Gateway console, you can declare a new enhanced request authorizer by selecting the Request option as the AWS Lambda event payload:

Create enhanced request authorizer


Just like normal custom authorizers, API Gateway can cache the policy returned by your Lambda function. With enhanced request authorizers, however, you can also specify the values that form the unique key of a policy in the cache. For example, if your authorization decision is based on both the bearer token and the IP address of the client, both values should be part of the unique key in the policy cache. The identity source parameter lets you specify these values as mapping expressions:

  • The bearer token appears in the Authorization header
  • The client IP address is stored in the sourceIp parameter of the request context.

Configure identity sources


Using enhanced request authorizers with Swagger

You can also define enhanced request authorizers in your Swagger (Open API) definitions. In the following example, you can see that all of the options configured in the API Gateway console are available as custom extensions in the API definition. For example, the identitySource field is a comma-separated list of mapping expressions.

securityDefinitions:
  IpAuthorizer:
    type: "apiKey"
    name: "IpAuthorizer"
    in: "header"
    x-amazon-apigateway-authtype: "custom"
    x-amazon-apigateway-authorizer:
      authorizerResultTtlInSeconds: 300
      identitySource: "method.request.header.Authorization, context.identity.sourceIp"
      authorizerUri: "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:XXXXXXXXXX:function:py-ip-authorizer/invocations"
      type: "request"

After you have declared your authorizer in the security definitions section, you can use it in your API methods:

---
swagger: "2.0"
info:
  title: "request-authorizer-demo"
basePath: "/dev"
paths:
  /hello:
    get:
      security:
      - IpAuthorizer: []
...

Enhanced request authorizer Lambda functions

Enhanced request authorizer Lambda functions receive an event object that is similar to proxy integrations. It contains all of the information about a request, excluding the body.

{
    "methodArn": "arn:aws:execute-api:us-east-1:XXXXXXXXXX:xxxxxx/dev/GET/hello",
    "resource": "/hello",
    "requestContext": {
        "resourceId": "xxxx",
        "apiId": "xxxxxxxxx",
        "resourcePath": "/hello",
        "httpMethod": "GET",
        "requestId": "9e04ff18-98a6-11e7-9311-ef19ba18fc8a",
        "path": "/dev/hello",
        "accountId": "XXXXXXXXXXX",
        "identity": {
            "apiKey": "",
            "sourceIp": "58.240.196.186"
        },
        "stage": "dev"
    },
    "queryStringParameters": {},
    "httpMethod": "GET",
    "pathParameters": {},
    "headers": {
        "cache-control": "no-cache",
        "x-amzn-ssl-client-hello": "AQACJAMDAAAAAAAAAAAAAAAAAAAAAAAAAAAA…",
        "Accept-Encoding": "gzip, deflate",
        "X-Forwarded-For": "54.240.196.186, 54.182.214.90",
        "Accept": "*/*",
        "User-Agent": "PostmanRuntime/6.2.5",
        "Authorization": "hello"
    },
    "stageVariables": {},
    "path": "/hello",
    "type": "REQUEST"
}

The following enhanced request authorizer snippet is written in Python and compares the source IP address against a list of valid IP addresses. The comments in the code explain what happens in each step.

...
VALID_IPS = ["58.240.195.186", "201.246.162.38"]

def lambda_handler(event, context):

    # Read the client’s bearer token.
    jwtToken = event["headers"]["Authorization"]
    
    # Read the source IP address for the request from
    # the API Gateway context object.
    clientIp = event["requestContext"]["identity"]["sourceIp"]
    
    # Verify that the client IP address is allowed.
    # If it’s not valid, raise an exception to make sure
    # that API Gateway returns a 401 status code.
    if clientIp not in VALID_IPS:
        raise Exception('Unauthorized')
    
    # Only allow hello users in!
    # Note: validate_jwt is defined in the elided portion of this snippet;
    # it is assumed here to return the user ID for a valid token (or None).
    userId = validate_jwt(jwtToken)
    if not userId:
        raise Exception('Unauthorized')

    # Use the values from the event object to populate the 
    # required parameters in the policy object.
    policy = AuthPolicy(userId, event["requestContext"]["accountId"])
    policy.restApiId = event["requestContext"]["apiId"]
    policy.region = event["methodArn"].split(":")[3]
    policy.stage = event["requestContext"]["stage"]
    
    # Use the scopes from the bearer token to make a 
    # decision on which methods to allow in the API.
    policy.allowMethod(HttpVerb.GET, '/hello')

    # Finally, build the policy.
    authResponse = policy.build()

    return authResponse
...

Conclusion

API Gateway customers build complex APIs, and authorization decisions often go beyond the simple properties in a JWT token. For example, users may be allowed to call the “list cars” endpoint but only with a specific subset of filter parameters. With enhanced request authorizers, you have access to all request parameters. You can centralize all of your application’s access control decisions in a Lambda function, making it easier to manage your application security.

Security updates for Friday

Post Syndicated from ris original https://lwn.net/Articles/734606/rss

Security updates have been issued by CentOS (augeas, samba, and samba4), Debian (apache2, bluez, emacs23, and newsbeuter), Fedora (kernel and mingw-LibRaw), openSUSE (apache2 and libzip), Oracle (kernel), SUSE (kernel, spice, and xen), and Ubuntu (emacs24, emacs25, and samba).

Security updates for Wednesday

Post Syndicated from ris original https://lwn.net/Articles/734318/rss

Security updates have been issued by CentOS (emacs), Debian (apache2, gdk-pixbuf, and pyjwt), Fedora (autotrace, converseen, dmtx-utils, drawtiming, emacs, gtatool, imageinfo, ImageMagick, inkscape, jasper, k3d, kxstitch, libwpd, mingw-libzip, perl-Image-SubImageFind, pfstools, php-pecl-imagick, psiconv, q, rawtherapee, ripright, rss-glx, rubygem-rmagick, synfig, synfigstudio, techne, vdr-scraper2vdr, vips, and WindowMaker), Oracle (emacs and kernel), Red Hat (emacs and kernel), Scientific Linux (emacs), SUSE (emacs), and Ubuntu (apache2).