Tag Archives: chat

Announcing the Winners of the AWS Chatbot Challenge – Conversational, Intelligent Chatbots using Amazon Lex and AWS Lambda

Post Syndicated from Tara Walker original https://aws.amazon.com/blogs/aws/announcing-the-winners-of-the-aws-chatbot-challenge-conversational-intelligent-chatbots-using-amazon-lex-and-aws-lambda/

A couple of months ago on the blog, I announced the AWS Chatbot Challenge in conjunction with Slack. The AWS Chatbot Challenge was an opportunity to build a unique chatbot that helped to solve a problem or that would add value for its prospective users. The mission was to build a conversational, natural language chatbot using Amazon Lex and leverage Lex’s integration with AWS Lambda to execute logic or data processing on the backend.

I know that you all have been anxiously waiting to hear announcements of who were the winners of the AWS Chatbot Challenge as much as I was. Well wait no longer, the winners of the AWS Chatbot Challenge have been decided.

May I have the Envelope Please? (The Trumpets sound)

The winners of the AWS Chatbot Challenge are:

  • First Place: BuildFax Counts by Joe Emison
  • Second Place: Hubsy by Andrew Riess, Andrew Puch, and John Wetzel
  • Third Place: PFMBot by Benny Leong and his team from MoneyLion.
  • Large Organization Winner: ADP Payroll Innovation Bot by Eric Liu, Jiaxing Yan, and Fan Yang

 

Diving into the Winning Chatbot Projects

Let’s take a walkthrough of the details for each of the winning projects to get a view of what made these chatbots distinctive, as well as, learn more about the technologies used to implement the chatbot solution.

 

BuildFax Counts by Joe Emison

The BuildFax Counts bot was created as a real solution for the BuildFax company to decrease the amount the time that sales and marketing teams can get answers on permits or properties with permits meet certain criteria.

BuildFax, a company co-founded by bot developer Joe Emison, has the only national database of building permits, which updates data from approximately half of the United States on a monthly basis. In order to accommodate the many requests that come in from the sales and marketing team regarding permit information, BuildFax has a technical sales support team that fulfills these requests sent to a ticketing system by manually writing SQL queries that run across the shards of the BuildFax databases. Since there are a large number of requests received by the internal sales support team and due to the manual nature of setting up the queries, it may take several days for getting the sales and marketing teams to receive an answer.

The BuildFax Counts chatbot solves this problem by taking the permit inquiry that would normally be sent into a ticket from the sales and marketing team, as input from Slack to the chatbot. Once the inquiry is submitted into Slack, a query executes and the inquiry results are returned immediately.

Joe built this solution by first creating a nightly export of the data in their BuildFax MySQL RDS database to CSV files that are stored in Amazon S3. From the exported CSV files, an Amazon Athena table was created in order to run quick and efficient queries on the data. He then used Amazon Lex to create a bot to handle the common questions and criteria that may be asked by the sales and marketing teams when seeking data from the BuildFax database by modeling the language used from the BuildFax ticketing system. He added several different sample utterances and slot types; both custom and Lex provided, in order to correctly parse every question and criteria combination that could be received from an inquiry.  Using Lambda, Joe created a Javascript Lambda function that receives information from the Lex intent and used it to build a SQL statement that runs against the aforementioned Athena database using the AWS SDK for JavaScript in Node.js library to return inquiry count result and SQL statement used.

The BuildFax Counts bot is used today for the BuildFax sales and marketing team to get back data on inquiries immediately that previously took up to a week to receive results.

Not only is BuildFax Counts bot our 1st place winner and wonderful solution, but its creator, Joe Emison, is a great guy.  Joe has opted to donate his prize; the $5,000 cash, the $2,500 in AWS Credits, and one re:Invent ticket to the Black Girls Code organization. I must say, you rock Joe for helping these kids get access and exposure to technology.

 

Hubsy by Andrew Riess, Andrew Puch, and John Wetzel

Hubsy bot was created to redefine and personalize the way users traditionally manage their HubSpot account. HubSpot is a SaaS system providing marketing, sales, and CRM software. Hubsy allows users of HubSpot to create engagements and log engagements with customers, provide sales teams with deals status, and retrieves client contact information quickly. Hubsy uses Amazon Lex’s conversational interface to execute commands from the HubSpot API so that users can gain insights, store and retrieve data, and manage tasks directly from Facebook, Slack, or Alexa.

In order to implement the Hubsy chatbot, Andrew and the team members used AWS Lambda to create a Lambda function with Node.js to parse the users request and call the HubSpot API, which will fulfill the initial request or return back to the user asking for more information. Terraform was used to automatically setup and update Lambda, CloudWatch logs, as well as, IAM profiles. Amazon Lex was used to build the conversational piece of the bot, which creates the utterances that a person on a sales team would likely say when seeking information from HubSpot. To integrate with Alexa, the Amazon Alexa skill builder was used to create an Alexa skill which was tested on an Echo Dot. Cloudwatch Logs are used to log the Lambda function information to CloudWatch in order to debug different parts of the Lex intents. In order to validate the code before the Terraform deployment, ESLint was additionally used to ensure the code was linted and proper development standards were followed.

 

PFMBot by Benny Leong and his team from MoneyLion

PFMBot, Personal Finance Management Bot,  is a bot to be used with the MoneyLion finance group which offers customers online financial products; loans, credit monitoring, and free credit score service to improve the financial health of their customers. Once a user signs up an account on the MoneyLion app or website, the user has the option to link their bank accounts with the MoneyLion APIs. Once the bank account is linked to the APIs, the user will be able to login to their MoneyLion account and start having a conversation with the PFMBot based on their bank account information.

The PFMBot UI has a web interface built with using Javascript integration. The chatbot was created using Amazon Lex to build utterances based on the possible inquiries about the user’s MoneyLion bank account. PFMBot uses the Lex built-in AMAZON slots and parsed and converted the values from the built-in slots to pass to AWS Lambda. The AWS Lambda functions interacting with Amazon Lex are Java-based Lambda functions which call the MoneyLion Java-based internal APIs running on Spring Boot. These APIs obtain account data and related bank account information from the MoneyLion MySQL Database.

 

ADP Payroll Innovation Bot by Eric Liu, Jiaxing Yan, and Fan Yang

ADP PI (Payroll Innovation) bot is designed to help employees of ADP customers easily review their own payroll details and compare different payroll data by just asking the bot for results. The ADP PI Bot additionally offers issue reporting functionality for employees to report payroll issues and aids HR managers in quickly receiving and organizing any reported payroll issues.

The ADP Payroll Innovation bot is an ecosystem for the ADP payroll consisting of two chatbots, which includes ADP PI Bot for external clients (employees and HR managers), and ADP PI DevOps Bot for internal ADP DevOps team.


The architecture for the ADP PI DevOps bot is different architecture from the ADP PI bot shown above as it is deployed internally to ADP. The ADP PI DevOps bot allows input from both Slack and Alexa. When input comes into Slack, Slack sends the request to Lex for it to process the utterance. Lex then calls the Lambda backend, which obtains ADP data sitting in the ADP VPC running within an Amazon VPC. When input comes in from Alexa, a Lambda function is called that also obtains data from the ADP VPC running on AWS.

The architecture for the ADP PI bot consists of users entering in requests and/or entering issues via Slack. When requests/issues are entered via Slack, the Slack APIs communicate via Amazon API Gateway to AWS Lambda. The Lambda function either writes data into one of the Amazon DynamoDB databases for recording issues and/or sending issues or it sends the request to Lex. When sending issues, DynamoDB integrates with Trello to keep HR Managers abreast of the escalated issues. Once the request data is sent from Lambda to Lex, Lex processes the utterance and calls another Lambda function that integrates with the ADP API and it calls ADP data from within the ADP VPC, which runs on Amazon Virtual Private Cloud (VPC).

Python and Node.js were the chosen languages for the development of the bots.

The ADP PI bot ecosystem has the following functional groupings:

Employee Functionality

  • Summarize Payrolls
  • Compare Payrolls
  • Escalate Issues
  • Evolve PI Bot

HR Manager Functionality

  • Bot Management
  • Audit and Feedback

DevOps Functionality

  • Reduce call volume in service centers (ADP PI Bot).
  • Track issues and generate reports (ADP PI Bot).
  • Monitor jobs for various environment (ADP PI DevOps Bot)
  • View job dashboards (ADP PI DevOps Bot)
  • Query job details (ADP PI DevOps Bot)

 

Summary

Let’s all wish all the winners of the AWS Chatbot Challenge hearty congratulations on their excellent projects.

You can review more details on the winning projects, as well as, all of the submissions to the AWS Chatbot Challenge at: https://awschatbot2017.devpost.com/submissions. If you are curious on the details of Chatbot challenge contest including resources, rules, prizes, and judges, you can review the original challenge website here:  https://awschatbot2017.devpost.com/.

Hopefully, you are just as inspired as I am to build your own chatbot using Lex and Lambda. For more information, take a look at the Amazon Lex developer guide or the AWS AI blog on Building Better Bots Using Amazon Lex (Part 1)

Chat with you soon!

Tara

Security updates for Friday

Post Syndicated from ris original https://lwn.net/Articles/730629/rss

Security updates have been issued by Arch Linux (firefox, flashplugin, lib32-flashplugin, libsoup, and varnish), Debian (freeradius, git, libsoup2.4, pjproject, postgresql-9.1, postgresql-9.4, postgresql-9.6, subversion, and xchat), Fedora (gsoap, irssi, knot-resolver, php-horde-horde, php-horde-Horde-Core, php-horde-Horde-Form, php-horde-Horde-Url, php-horde-kronolith, php-horde-nag, and php-horde-turba), Mageia (perl-XML-LibXML), Oracle (libsoup), Red Hat (firefox and libsoup), SUSE (kernel and libsoup), and Ubuntu (git, kernel, libsoup2.4, linux, linux-aws, linux-gke, linux-raspi2, linux-snapdragon, linux, linux-raspi2, linux-hwe, linux-lts-trusty, linux-lts-xenial, php5, php7.0, and subversion).

Updates to GPIO Zero, the physical computing API

Post Syndicated from Ben Nuttall original https://www.raspberrypi.org/blog/gpio-zero-update/

GPIO Zero v1.4 is out now! It comes with a set of new features, including a handy pinout command line tool. To start using this newest version of the API, update your Raspbian OS now:

sudo apt update && sudo apt upgrade

Some of the things we’ve added will make it easier for you try your hand on different programming styles. In doing so you’ll build your coding skills, and will improve as a programmer. As a consequence, you’ll learn to write more complex code, which will enable you to take on advanced electronics builds. And on top of that, you can use the skills you’ll acquire in other computing projects.

GPIO Zero pinout tool

The new pinout tool

Developing GPIO Zero

Nearly two years ago, I started the GPIO Zero project as a simple wrapper around the low-level RPi.GPIO library. I wanted to create a simpler way to control GPIO-connected devices in Python, based on three years’ experience of training teachers, running workshops, and building projects. The idea grew over time, and the more we built for our Python library, the more sophisticated and powerful it became.

One of the great things about Python is that it’s a multi-paradigm programming language. You can write code in a number of different styles, according to your needs. You don’t have to write classes, but you can if you need them. There are functional programming tools available, but beginners get by without them. Importantly, the more advanced features of the language are not a barrier to entry.

Become a more advanced programmer

As a beginner to programming, you usually start by writing procedural programs, in which the flow moves from top to bottom. Then you’ll probably add loops and create your own functions. Your next step might be to start using libraries which introduce new patterns that operate in a different manner to what you’ve written before, for example threaded callbacks (event-driven programming). You might move on to object-oriented programming, extending the functionality of classes provided by other libraries, and starting to write your own classes. Occasionally, you may make use of tools created with functional programming techniques.

Five buttons in different colours

Take control of the buttons in your life

It’s much the same with GPIO Zero: you can start using it very easily, and we’ve made it simple to progress along the learning curve towards more advanced programming techniques. For example, if you want to make a push button control an LED, the easiest way to do this is via procedural programming using a while loop:

from gpiozero import LED, Button

led = LED(17)
button = Button(2)

while True:
    if button.is_pressed:
        led.on()
    else:
        led.off()

But another way to achieve the same thing is to use events:

from gpiozero import LED, Button
from signal import pause

led = LED(17)
button = Button(2)

button.when_pressed = led.on
button.when_released = led.off

pause()

You could even use a declarative approach, and set the LED’s behaviour in a single line:

from gpiozero import LED, Button
from signal import pause

led = LED(17)
button = Button(2)

led.source = button.values

pause()

You will find that using the procedural approach is a great start, but at some point you’ll hit a limit, and will have to try a different approach. The example above can be approach in several programming styles. However, if you’d like to control a wider range of devices or a more complex system, you need to carefully consider which style works best for what you want to achieve. Being able to choose the right programming style for a task is a skill in itself.

Source/values properties

So how does the led.source = button.values thing actually work?

Every GPIO Zero device has a .value property. For example, you can read a button’s state (True or False), and read or set an LED’s state (so led.value = True is the same as led.on()). Since LEDs and buttons operate with the same value set (True and False), you could say led.value = button.value. However, this only sets the LED to match the button once. If you wanted it to always match the button’s state, you’d have to use a while loop. To make things easier, we came up with a way of telling devices they’re connected: we added a .values property to all devices, and a .source to output devices. Now, a loop is no longer necessary, because this will do the job:

led.source = button.values

This is a simple approach to connecting devices using a declarative style of programming. In one single line, we declare that the LED should get its values from the button, i.e. when the button is pressed, the LED should be on. You can even mix the procedural with the declarative style: at one stage of the program, the LED could be set to match the button, while in the next stage it could just be blinking, and finally it might return back to its original state.

These additions are useful for connecting other devices as well. For example, a PWMLED (LED with variable brightness) has a value between 0 and 1, and so does a potentiometer connected via an ADC (analogue-digital converter) such as the MCP3008. The new GPIO Zero update allows you to say led.source = pot.values, and then twist the potentiometer to control the brightness of the LED.

But what if you want to do something more complex, like connect two devices with different value sets or combine multiple inputs?

We provide a set of device source tools, which allow you to process values as they flow from one device to another. They also let you send in artificial values such as random data, and you can even write your own functions to generate values to pass to a device’s source. For example, to control a motor’s speed with a potentiometer, you could use this code:

from gpiozero import Motor, MCP3008
from signal import pause

motor = Motor(20, 21)
pot = MCP3008()

motor.source = pot.values

pause()

This works, but it will only drive the motor forwards. If you wanted the potentiometer to drive it forwards and backwards, you’d use the scaled tool to scale its values to a range of -1 to 1:

from gpiozero import Motor, MCP3008
from gpiozero.tools import scaled
from signal import pause

motor = Motor(20, 21)
pot = MCP3008()

motor.source = scaled(pot.values, -1, 1)

pause()

And to separately control a robot’s left and right motor speeds with two potentiometers, you could do this:

from gpiozero import Robot, MCP3008
from signal import pause

robot = Robot(left=(2, 3), right=(4, 5))
left = MCP3008(0)
right = MCP3008(1)

robot.source = zip(left.values, right.values)

pause()

GPIO Zero and Blue Dot

Martin O’Hanlon created a Python library called Blue Dot which allows you to use your Android device to remotely control things on their Raspberry Pi. The API is very similar to GPIO Zero, and it even incorporates the value/values properties, which means you can hook it up to GPIO devices easily:

from bluedot import BlueDot
from gpiozero import LED
from signal import pause

bd = BlueDot()
led = LED(17)

led.source = bd.values

pause()

We even included a couple of Blue Dot examples in our recipes.

Make a series of binary logic gates using source/values

Read more in this source/values tutorial from The MagPi, and on the source/values documentation page.

Remote GPIO control

GPIO Zero supports multiple low-level GPIO libraries. We use RPi.GPIO by default, but you can choose to use RPIO or pigpio instead. The pigpio library supports remote connections, so you can run GPIO Zero on one Raspberry Pi to control the GPIO pins of another, or run code on a PC (running Windows, Mac, or Linux) to remotely control the pins of a Pi on the same network. You can even control two or more Pis at once!

If you’re using Raspbian on a Raspberry Pi (or a PC running our x86 Raspbian OS), you have everything you need to remotely control GPIO. If you’re on a PC running Windows, Mac, or Linux, you just need to install gpiozero and pigpio using pip. See our guide on configuring remote GPIO.

I road-tested the new pin_factory syntax at the Raspberry Jam @ Pi Towers

There are a number of different ways to use remote pins:

  • Set the default pin factory and remote IP address with environment variables:
$ GPIOZERO_PIN_FACTORY=pigpio PIGPIO_ADDR=192.168.1.2 python3 blink.py
  • Set the default pin factory in your script:
import gpiozero
from gpiozero import LED
from gpiozero.pins.pigpio import PiGPIOFactory

gpiozero.Device.pin_factory = PiGPIOFactory(host='192.168.1.2')

led = LED(17)
  • The pin_factory keyword argument allows you to use multiple Pis in the same script:
from gpiozero import LED
from gpiozero.pins.pigpio import PiGPIOFactory

factory2 = PiGPIOFactory(host='192.168.1.2')
factory3 = PiGPIOFactory(host='192.168.1.3')

local_hat = TrafficHat()
remote_hat2 = TrafficHat(pin_factory=factory2)
remote_hat3 = TrafficHat(pin_factory=factory3)

This is a really powerful feature! For more, read this remote GPIO tutorial in The MagPi, and check out the remote GPIO recipes in our documentation.

GPIO Zero on your PC

GPIO Zero doesn’t have any dependencies, so you can install it on your PC using pip. In addition to the API’s remote GPIO control, you can use its ‘mock’ pin factory on your PC. We originally created the mock pin feature for the GPIO Zero test suite, but we found that it’s really useful to be able to test GPIO Zero code works without running it on real hardware:

$ GPIOZERO_PIN_FACTORY=mock python3
>>> from gpiozero import LED
>>> led = LED(22)
>>> led.blink()
>>> led.value
True
>>> led.value
False

You can even tell pins to change state (e.g. to simulate a button being pressed) by accessing an object’s pin property:

>>> from gpiozero import LED
>>> led = LED(22)
>>> button = Button(23)
>>> led.source = button.values
>>> led.value
False
>>> button.pin.drive_low()
>>> led.value
True

You can also use the pinout command line tool if you set your pin factory to ‘mock’. It gives you a Pi 3 diagram by default, but you can supply a revision code to see information about other Pi models. For example, to use the pinout tool for the original 256MB Model B, just type pinout -r 2.

GPIO Zero documentation and resources

On the API’s website, we provide beginner recipes and advanced recipes, and we have added remote GPIO configuration including PC/Mac/Linux and Pi Zero OTG, and a section of GPIO recipes. There are also new sections on source/values, command-line tools, FAQs, Pi information and library development.

You’ll find plenty of cool projects using GPIO Zero in our learning resources. For example, you could check out the one that introduces physical computing with Python and get stuck in! We even provide a GPIO Zero cheat sheet you can download and print.

There are great GPIO Zero tutorials and projects in The MagPi magazine every month. Moreover, they also publish Simple Electronics with GPIO Zero, a book which collects a series of tutorials useful for building your knowledge of physical computing. And the best thing is, you can download it, and all magazine issues, for free!

Check out the API documentation and read more about what’s new in GPIO Zero on my blog. We have lots planned for the next release. Watch this space.

Get building!

The world of physical computing is at your fingertips! Are you feeling inspired?

If you’ve never tried your hand on physical computing, our Build a robot buggy learning resource is the perfect place to start! It’s your step-by-step guide for building a simple robot controlled with the help of GPIO Zero.

If you have a gee-whizz idea for an electronics project, do share it with us below. And if you’re currently working on a cool build and would like to show us how it’s going, pop a link to it in the comments.

The post Updates to GPIO Zero, the physical computing API appeared first on Raspberry Pi.

China Says It Will “Severely Strike” Websites Involved in Piracy

Post Syndicated from Andy original https://torrentfreak.com/china-says-it-will-severely-strike-websites-involved-in-piracy-170729/

When it comes to the protection of intellectual property, China is often viewed as one of the world’s leading scofflaws. Everything is copied in the country, from designer watches to cars. Not even major landmarks can escape the replica treatment.

In more recent times, however, there have been signs that China might be at least warming to the idea that IP protection should be given more priority.

For example, every few months authorities announce a new crackdown on Internet piracy, such as the “Jian Wang 2016” program which shuttered 290 piracy websites in the final six months of last year.

Maintaining the same naming convention, this week China’s National Copyright Administration revealed the new “Jian Wang 2017” anti-piracy program. During a meeting in Beijing attended by other state bodies, copyright groups, rights organizations, and representatives from the news media, the administration detailed its latest plans.

The anti-piracy program will focus on protecting the copyrights of the film, television, and news industries in China. Infringing websites, e-commerce and cloud storage services, social networks, plus mobile Internet applications will all be put under the spotlight, with authorities investigating and prosecuting major cases.

The program, which will run for the next four months, has a mission to improve compliance in three key areas.

The first aims to assist the film and TV industries by cracking down on ‘pirate’ websites, the unlawful use of file-sharing software, plus “forum communities and other channels that supply infringing film and television works.”

Also on the cards is a blitz against users of the hugely popular social media and instant messaging app, WeChat.

Released in 2011, WeChat now has more than 930 million users, some of which use the platform to republish news articles without permission from creators. Chinese authorities want to reduce this activity, noting that too many articles are stripped from their sources and reproduced on personal blogs and similar platforms.

The second area for attention is the booming market for pirate apps. Chinese authorities say that cracked app stores and the software they provide are contributing to a huge rise in the unlawful spread of films, TV shows, music, news and other literature. Set-top boxes that utilize such apps will also be targeted in the crackdown.

Finally, there will be a “strengthening of copyright supervision” on large-scale e-commerce platforms that supply audio and video products, eBooks, and other publications. Cloud storage platforms will also be subjected to additional scrutiny, as these are often used to share copyright works without permission.

What kind of effect the program will have on overall copyrighted content availability will remain to be seen, but if previous patterns are maintained, the National Copyright Administration should reveal the results of its blitz in December.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Announcing the Raspberry Jam Big Birthday Weekend 2018

Post Syndicated from Ben Nuttall original https://www.raspberrypi.org/blog/raspberry-jam-big-birthday-weekend-2018/

For the last few years, we have held a big Raspberry Pi community event in Cambridge around Raspberry Pi’s birthday, where people have come together for a huge party with talks, workshops, and more. We want more people to have the chance to join in with our birthday celebrations next year, so we’re going to be coordinating Raspberry Jams all over the world to take place over the Raspberry Jam Big Birthday Weekend, 3–4 March 2018.

Raspberry Pi Big Birthday Weekend 2018. GIF with confetti and bopping JAM balloons

Big Birthday fun!

Whether you’ve run a Raspberry Jam before, or you’d like to start a new Jam in your area, we invite you to join us for our Big Birthday Weekend, wherever you are in the world. This event will be a community-led, synchronised, global mega-Jam in celebration of our sixth birthday and the digital making community! Members of the Raspberry Pi Foundation team will be attending Jams far and wide to celebrate with you during the weekend.

Jams across the world will receive a special digital pack – be sure to register your interest so we can get your pack to you! We’ll also be sending out party kits to registered Jams – more info on this below.

Need help getting started?

First of all, check out the Raspberry Jam page to read all about Jams, and take a look at our recent blog post explaining the support for Jams that we offer.

If there’s no Jam near you yet, the Raspberry Jam Big Birthday Weekend is the perfect opportunity to start one yourself! If you’d like some help getting your Jam off the ground, there are a few places you can get support:

  • The Raspberry Jam Guidebook is full of advice gathered from the amazing people who run Jams in the UK.
  • The Raspberry Jam Slack team is available for Jam organisers to chat, share ideas, and get help from each other. Just email jam [at] raspberrypi.org and ask to be invited.
  • Attend a Jam! Find an upcoming Jam near you, and go along to get an idea of what it’s like.
  • Email us – if you have more queries, you can email jam [at] raspberrypi.org and we’ll do what we can to help.

Raspberry Jam

Get involved

If you’re keen to start a new Jam, there’s no need to wait until March – why not get up and running over the summer? Then you’ll be an expert by the time the Raspberry Jam Big Birthday Weekend comes around. Check out the guidebook, join the Jam Slack, and submit your event to the map when you’re ready.

Like the idea of running a Jam, but don’t want to do it by yourself? Then feel free to email us, and we’ll try and help you find someone to co-organise it.

If you don’t fancy organising a Jam for our Big Birthday Weekend, but would like to celebrate with us, keep an eye on our website for an update early next year. We’ll publish a full list of Jams participating in the festivities so you can find one near you. And if you’ve never attended a Jam before, there’s no need to wait: find one to join on the map here.

Raspberry Jam

Register your interest

If you think you’d like to run a Jam as part of the Big Birthday Weekend, register your interest now, and you’ll be the first to receive updates. Don’t worry if you don’t have the venue or logistics in place yet – this is just to let us know you’re keen, and to give us an idea about how big our party is going to be.

We will contact you in autumn to give you more information, as well as some useful resources. On top of our regular Raspberry Jam branding pack, we’ll provide a special digital Big Birthday Weekend pack to help you celebrate and tell everyone about your Jam!

Then, once you have confirmed you’re taking part, you’ll be able to register your Jam on our website. This will make sure that other people interested in joining the party can find your event. If your Jam is among the first 150 to be registered for a Big Birthday Weekend event, we will send you a free pack of goodies to use on the big day!

Go fill in the form, and we’ll be in touch!

 

PS: We’ll be running a big Cambridge event in the summer on the weekend of 30 June–1 July 2018. Put it in your diary – we’ll say more about it as we get closer to the date.

The post Announcing the Raspberry Jam Big Birthday Weekend 2018 appeared first on Raspberry Pi.

Ring 1.0 released

Post Syndicated from corbet original https://lwn.net/Articles/728728/rss

Savoir-faire Linux has announced
the release of Ring 1.0. “Ring is a free/libre and universal
communication platform that preserves the users’ privacy and freedoms. It
is a GNU package. It runs on multiple platforms; and, it can be used for
texting, calls, and video chats more privately, more securely, and more
reliably.

‘Game of Thrones Season 7 Premiere Pirated 90 Million Times’

Post Syndicated from Ernesto original https://torrentfreak.com/game-of-thrones-season-7-premiere-pirated-90-million-times-170721/

Last Sunday, the long-awaited seventh season of the hit series Game of Thrones aired in dozens of countries worldwide.

The show has broken several piracy records over the years and, thus far, there has been plenty of interest in the latest season as well.

Like every year, the torrent download figures quickly ran into the millions. However, little is known about the traffic that goes to streaming portals, which have outgrown traditional file-sharing sites in recent years.

One of the main problems is that it’s impossible for outsiders to know exactly how many visitors pirate streaming services get. Traffic data for these sites are not public, which makes it difficult to put an exact figure on the number of views one particular video has.

Piracy monitoring firm MUSO hasn’t shied away from this unexplored territory though and has now released some hard numbers.

According to MUSO, the premiere episode of the seventh season of Game of Thrones has been pirated more than 90 million times in only three days. A massive number, which is largely driven by streaming traffic.

Exactly 77,913,032 pirate views came from streaming portals, while public torrent traffic sits in second place with 8,356,382 downloads. Another 4,949,298 downloads are linked to direct download sites, while the remaining 523,109 come from private torrents.

Why other platforms such as Usenet are not covered remains unexplained in the press release, but without these the total is already quite substantial, to say the least.

MUSO reports that most pirate traffic comes from the United States, with 15.1 million unauthorized downloads and streams. The United Kingdom follows in second place with 6.2 million, before Germany, India, and Indonesia, with between 4 and 5 million each.

Andy Chatterley, MUSO’s CEO and Co-Founder, notes that the results may come as a surprise to some industry insiders, describing them as “huge.”

“There is no denying that these figures are huge, so they’re likely to raise more than a few eyebrows in the mainstream industry, but it’s in line with the sort of scale we see across piracy sites and should be looked at objectively.

“What we’re seeing here isn’t just P2P torrent downloads but unauthorized streams and every type of piracy around the premiere. This is the total audience picture, which is usually unreported,” Chatterley adds.

While there is no denying that the numbers are indeed huge, it would probably be better to view them as estimates. MUSO generally sources its data from SimilarWeb, which uses a sample of 200 million ‘devices’ to estimate website traffic. Website visits are then seen as “downloads,” and the sample data is extrapolated into the totals.

This also explains why other types of download traffic, such as Usenet, are not included at all. These are not web-based. Similarly, the data doesn’t appear to cover all countries. Game of Thrones piracy is very substantial in China, for example, but in its previous reports, MUSO didn’t exclude Chinese traffic.

Taking the caveats above into account, MUSO’s data could be a good estimate of the total (web) pirate traffic for the Game of Thrones premiere. This would suggest some pretty high piracy rates in some countries, but we’ve seen stranger things.

Note: TorrentFreak reached out to MUSO for further details on its methodology. The company confirmed that its data is based on traffic to 23,000 of the most-used piracy sites. The data is collected from over 200 million devices, located in over 200 countries. This appears to confirm that it is indeed SimilarWeb data.

Countries with the highest GoT piracy activity, according to MUSO:

United States of America: 15,075,951
United Kingdom: 6,252,903
Germany: 4,897,280
India: 4,335,331
Indonesia: 4,286,927
Philippines: 4,189,030
Canada: 3,182,851
France: 2,881,467
Turkey: 2,802,458
Vietnam: 2,436,149
Australia: 2,241,463
Russian Federation: 2,196,799
Netherlands: 1,881,718
Brazil: 1,796,759
Malaysia: 1,737,005

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Hightail — Empowering Creative Collaboration in the Cloud

Post Syndicated from Ana Visneski original https://aws.amazon.com/blogs/aws/hightail-empowering-creative-collaboration-in-the-cloud/

Hightail – formerly YouSendIt – streamlines how creative work is reviewed, improved, and approved by helping more than 50 million professionals around the world get great content in front of their audiences faster. Since its debut in 2004 as a file sharing company, Hightail shifted its strategic direction to focus on delivering value-added creative collaboTagsration services and boasts a strong lineup of name-brand customers.

In today’s guest post, Hightail’s SVP of Technology Shiva Paranandi tells the company’s migration story, moving petabytes of data from on-premises to the cloud. He highlights their cloud vendor evaluation process and reasons for going all-in on AWS.


Hightail started as a way to help people easily share and store large files, but has since evolved into a creative collaboration tool. We became a place where users could not only control and share their digital assets, but also assemble their creative teams, connect with clients, develop creative workflows, and manage projects from start to finish. We now power collaboration services for major brands such as Lionsgate and Jimmy Kimmel Live!. With a growing list of domestic and international clients, we required more internal focus on product development and serving the users. We found that running our own data centers consumed more time, money, and manpower than we were willing to devote.

We needed an approach that would help us iterate more rapidly to meet customer needs and dramatically improve our time to market. We wanted to reduce data center costs and have the flexibility to scale up quickly in any given region around the globe. Setting up a data center in a new location took so long that it was limiting the pace of growth that we could achieve. In addition, we were tired of buying ahead of our needs, which meant we had storage capacity that we did not even use. We required a storage solution that was both tiered and highly scalable to reduce costs by allowing us to keep infrequently used data in inactive storage while also allowing us to resurface it quickly at the customer’s request. Our main drivers were agility and innovation, and the cloud enables these in a significant way. Given that, we decided to adopt a cloud-first policy that would enable us to spend time and money on initiatives that differentiate our business, instead of putting resources into managing our storage and computing infrastructure.

Comparing AWS Against Cloud Competitors

To kick off the migration, we did our due diligence by evaluating a variety of cloud vendors, including AWS, Google, IBM, and Microsoft. AWS stuck out as the clear winner for us. At one point, we considered combining services from multiple cloud providers to meet our needs, but decided the best route was to use AWS exclusively. When we factored in training, synchronization, support, and system availability along with other migration and management elements, it was just not practical to take a multi-cloud approach. With the best cost savings and an unmatched ecosystem of partner solutions, we did not need anyone else and chose to go all-in on AWS.

By migrating to AWS, we were able to secure the lowest cost-per-gigabyte pricing, gain access to a rich ecosystem, quickly develop in-house talent, and maintain SOC II compliance. The ecosystem was particularly important to us and set AWS apart from its competitors with its expansive list of partners. In fact, all the vendors we depend on for services such as previewing images, encoding videos, and serving up presentations were already a part of the network so we were easily able to leverage our existing investments and expertise. If we went with a different provider, it would have meant moving away from a platform that was already working so well for which was not the desired outcome for us. Also, the amount of talent we were able to build up in house on AWS technologies was astounding. Training our internal team to work with AWS was a simple process using available tools such as AWS conferences, training materials, and support.

Migrating Petabytes of Data

Going with AWS made things easier. In many instances, it gave us better functionality than what we were using in house. We moved multiple petabytes of data from on-premises storage to AWS with ease. AWS gave us great speeds with Direct Connect, so we were able to push all the data in a little more than three months with no user impact. We employed AWS Key Management Service to keep our data secure, which eased our minds through the move. We performed extensive QA testing before flipping users over to ensure low customer impact, using methods such as checksums between our data center and the data that got pushed to AWS.

Our new platform on AWS has greatly improved our user experience. We have seen huge improvement in reliability, performance, and uptime—all critical in our line of business. We are now able to achieve upload and download speeds up to 17 times faster than our previous data centers, and uptime has increased by orders of magnitude. Also, the time it takes us to deploy services to a new region has been cut by more than 90%. It used to take us at least six months to get a new region online, and now we can get a region up and running in less than three weeks. On AWS, we can even replicate data at the bucket level across regions for disaster recovery purposes.

To cut costs, we were successfully able to divide our storage infrastructure into frequently and infrequently accessed data. Tiered storage in Amazon S3 has been a huge advantage, allowing us to optimize our storage costs so we have more to invest in product development. We can now move data from inactive to active tiers instantly to meet customer needs and eliminated the need to overprovision our storage infrastructure. It is refreshing to see services automatically scale up or down during peak load times, and know that we are only paying for what we need.

Overall, we achieved our key strategic goal of focusing more on development and less on infrastructure. Our migration felt seamless, and the progress we were able to share is a true testament to how easy it has been for us to run our workloads on AWS. We attribute part of our successful migration to the dedicated support provided by the AWS team. They were pretty awesome. We had a couple of their technicians available 24/7 via chat, which proved to be essential during this large-scale migration.

-Shiva Paranandi, SVP of Technology at Hightail

Learning More

Learn more about cost-effective tiered data storage with Amazon S3, or dive deeper into our AWS Partner Ecosystem to see which solutions could best serve the needs of your company.

How to Use Batch References in Amazon Cloud Directory to Refer to New Objects in a Batch Request

Post Syndicated from Vineeth Harikumar original https://aws.amazon.com/blogs/security/how-to-use-batch-references-in-amazon-cloud-directory-to-refer-to-new-objects-in-a-batch-request/

In Amazon Cloud Directory, it’s often necessary to add new objects or add relationships between new objects and existing objects to reflect changes in a real-world hierarchy. With Cloud Directory, you can make these changes efficiently by using batch references within batch operations.

Let’s say I want to take an existing child object in a hierarchy, detach it from its parent, and reattach it to another part of the hierarchy. A simple way to do this would be to make a call to get the object’s unique identifier, another call to detach the object from its parent using the unique identifier, and a third call to attach it to a new parent. However, if I use batch references within a batch write operation, I can perform all three of these actions in the same request, greatly simplifying my code and reducing the round trips required to make such changes.

In this post, I demonstrate how to use batch references in a single write request to simplify adding and restructuring a Cloud Directory hierarchy. I have used the AWS SDK for Java for all the sample code in this post, but you can use other language SDKs or the AWS CLI in a similar way.

Using batch references

In my previous post, I demonstrated how to add AnyCompany’s North American warehouses to a global network of warehouses. As time passes and demand grows, AnyCompany launches multiple warehouses in North American cities to fulfill customer orders with continued efficiency. This requires the company to restructure the network to group warehouses in the same region so that the company can apply similar standards to them, such as delivery times, delivery areas, and types of products sold.

For instance, in the NorthAmerica object (see the following diagram), AnyCompany has launched two new warehouses in the Phoenix (PHX) area: PHX_2 and PHX_3. AnyCompany wants to add these new warehouses to the network and regroup them with existing warehouse PHX_1 under the new node, PHX.

The state of the hierarchy before this regrouping is shown in the following diagram, where I added the NorthAmerica warehouses (also represented as NA in the diagram) to the larger network of AnyCompany’s warehouses.

Diagram showing the state of the hierarchy before this post's regrouping

Adding and grouping new warehouses in the NorthAmerica network

I want to add and group the new warehouses with a single request, and using batch references in a batch write lets me do that. A batch reference is just another way of using object references that you are allowed to define arbitrarily. This allows you to chain operations, which means using the return value from one operation in a subsequent operation within the same batch write request

Let’s say I have a batch write request with two batch operations: operation A and operation B. Both batch operations operate on the same object X. In operation A, I use the object X found at /NorthAmerica/Phoenix, and I assign it to a batch reference that I call referencePhoenix. In operation B, I want to modify the same object X, so I use referencePhoenix as the object reference that points to the same unique object X used in operation A. I also will use the same helper method implementation from my previous post for getBatchCreateOperation. To learn more about batch references, see the ObjectReference documentation.

To add and group the new warehouses, I will take advantage of batch references to sequentially:

  1. Detach PHX_1 from the NA node and maintain a reference to PHX_1.
  2. Create a new child node, PHX, and attach it to the NA node.
  3. Create PHX_2 and PHX_3 nodes for the new warehouses.
  4. Link all three nodes—PHX_1 (using the batch reference), PHX_2, and PHX_3—to the PHX node.

The following code example achieves these changes in a single batch by using references. First, the code sets up a createObjectPHX operation to create the PHX parent object and attach it to the parent NorthAmerica object. It then sets up createObjectPHX_2 and createObjectPHX_3 and attaches these new objects to the new PHX object. The code then sets up a detachObject to detach the current PHX_1 object from its parent and assign it to a batch reference. The last operation uses that same batch reference to attach the PHX_1 object to the newly created PHX object. The code example orders these steps sequentially in a batch write operation.

   BatchWriteOperation createObjectPHX = getBatchCreateOperation(
        "PHX",
        directorySchemaARN,
        "/NorthAmerica",
        "Phoenix");
   BatchWriteOperation createObjectPHX_2 = getBatchCreateOperation(
        "PHX_2",
        directorySchemaARN,
        "/NorthAmerica/Phoenix",
        "PHX_2");
   BatchWriteOperation createObjectPHX_3 = getBatchCreateOperation(
        "PHX_3",
        directorySchemaARN,
        "/NorthAmerica/Phoenix",
        "PHX_3");


   BatchDetachObject detachObject = new BatchDetachObject()
        .withBatchReferenceName("referenceToPHX_1")
        .withLinkName("Phoenix")
        .withParentReference(new ObjectReference()
             .withSelector("/NorthAmerica"));

   BatchAttachObject attachObject = new BatchAttachObject()
        .withChildReference(new ObjectReference().withSelector("#referenceToPHX_1"))
        .withLinkName("PHX_1")
        .withParentReference(new ObjectReference()
            .withSelector("/NorthAmerica/Phoenix"));

   BatchWriteOperation detachOperation = new BatchWriteOperation()
       .withDetachObject(detachObject);
   BatchWriteOperation attachOperation = new BatchWriteOperation()
       .withAttachObject(attachObject);


   BatchWriteRequest request = new BatchWriteRequest();
   request.setDirectoryArn(directoryARN);
   request.setOperations(Lists.newArrayList(
       detachOperation,
       createObjectPHX,
       createObjectPHX_2,
       createObjectPHX_3,
       attachOperation));

   client.batchWrite(request);

In the preceding code example, I use the batch reference, referenceToPHX_1, in the same batch write operation because I do not have to know the object identifier of that object. If I couldn’t use such a batch reference, I would have to use separate requests to get the PHX_1 identifier, detach it from the NA node, and then attach it to the new PHX node.

I now have the network configuration I want, as shown in the following diagram. I have used a combination of batch operations with batch references to bring new warehouses into the network and regroup them within the same local group of warehouses.

Diagram showing the desired network configuration

Summary

In this post, I have shown how you can use batch references in a single batch write request to simplify adding and restructuring your existing hierarchies in Cloud Directory. You can use batch references in scenarios where you want to get an object identifier, but don’t want the overhead of using a read operation before a write operation. Instead, you can use a batch reference to refer to an object as part of the intermediate batch operation. To learn more about batch operations, see Batches, BatchWrite, and BatchRead.

If you have comments about this post, submit them in the “Comments” section below. If you have implementation questions, start a new thread on the Directory Service forum.

– Vineeth

New: Server-Side Encryption for Amazon Kinesis Streams

Post Syndicated from Tara Walker original https://aws.amazon.com/blogs/aws/new-server-side-encryption-for-amazon-kinesis-streams/

In this age of smart homes, big data, IoT devices, mobile phones, social networks, chatbots, and game consoles, streaming data scenarios are everywhere. Amazon Kinesis Streams enables you to build custom applications that can capture, process, analyze, and store terabytes of data per hour from thousands of streaming data sources. Since Amazon Kinesis Streams allows applications to process data concurrently from the same Kinesis stream, you can build parallel processing systems. For example, you can emit processed data to Amazon S3, perform complex analytics with Amazon Redshift, and even build robust, serverless streaming solutions using AWS Lambda.

Kinesis Streams enables several streaming use cases for consumers, and now we are making the service more effective for securing your data in motion by adding server-side encryption (SSE) support for Kinesis Streams. With this new Kinesis Streams feature, you can now enhance the security of your data and/or meet any regulatory and compliance requirements for any of your organization’s data streaming needs.
In fact, Kinesis Streams is now one of the AWS Services in Scope for the Payment Card Industry Data Security Standard (PCI DSS) compliance program. PCI DSS is a proprietary information security standard administered by the PCI Security Standards Council founded by key financial institutions. PCI DSS compliance applies to all entities that store, process, or transmit cardholder data and/or sensitive authentication data which includes service providers. You can request the PCI DSS Attestation of Compliance and Responsibility Summary using AWS Artifact. But the good news about compliance with Kinesis Streams doesn’t stop there. Kinesis Streams is now also FedRAMP compliant in AWS GovCloud. FedRAMP stands for Federal Risk and Authorization Management Program and is a U.S. government-wide program that delivers a standard approach to the security assessment, authorization, and continuous monitoring for cloud products and services. You can learn more about FedRAMP compliance with AWS Services here.

Now are you ready to get into the keys? Get it, instead of get into the weeds. Okay a little corny, but it was the best I could do. Coming back to discussing SSE for Kinesis Streams, let me explain the flow of server-side encryption with Kinesis.  Each data record and partition key put into a Kinesis Stream using the PutRecord or PutRecords API is encrypted using an AWS Key Management Service (KMS) master key. With the AWS Key Management Service (KMS) master key, Kinesis Streams uses the 256-bit Advanced Encryption Standard (AES-256 GCM algorithm) to add encryption to the incoming data.

In order to enable server-side encryption with Kinesis Streams for new or existing streams, you can use the Kinesis management console or leverage one of the available AWS SDKs.  Additionally, you can audit the history of your stream encryption, validate the encryption status of a certain stream in the Kinesis Streams console, or check that the PutRecord or GetRecord transactions are encrypted using the AWS CloudTrail service.

 

Walkthrough: Kinesis Streams Server-Side Encryption

Let’s do a quick walkthrough of server-side encryption with Kinesis Streams. First, I’ll go to the Amazon Kinesis console and select the Streams console option.

Once in the Kinesis Streams console, I can add server-side encryption to one of my existing Kinesis streams or opt to create a new Kinesis stream.  For this walkthrough, I’ll opt to quickly create a new Kinesis stream, therefore, I’ll select the Create Kinesis stream button.

I’ll name my stream, KinesisSSE-stream, and allocate one shard for my stream. Remember that the data capacity of your stream is calculated based upon the number of shards specified for the stream.  You can use the Estimate the number of shards you’ll need dropdown within the console or read more calculations to estimate the number of shards in a stream here.  To complete the creation of my stream, now I click the Create Kinesis stream button.

 

With my KinesisSSE-stream created, I will select it in the dashboard and choose the Actions dropdown and select the Details option.


On the Details page of the KinesisSSE-stream, there is now a Server-side encryption section.  In this section, I will select the Edit button.

 

 

Now I can enable server-side encryption for my stream with an AWS KMS master key, by selecting the Enabled radio button. Once selected I can choose which AWS KMS master key to use for the encryption of  data in KinesisSSE-stream. I can either select the KMS master key generated by the Kinesis service, (Default) aws/kinesis, or select one of my own KMS master keys that I have previously generated.  I’ll select the default master key and all that is left is for me to click the Save button.


That’s it!  As you can see from my screenshots below, after only about 20 seconds, server-side encryption was added to my Kinesis stream and now any incoming data into my stream will be encrypted.  One thing to note is server-side encryption only encrypts incoming data after encryption has been enabled. Preexisting data that is in a Kinesis stream prior to server-side encryption being enabled will remain unencrypted.

 

Summary

Kinesis Streams with Server-side encryption using AWS KMS keys makes it easy for you to automatically encrypt the streaming data coming into your  stream. You can start, stop, or update server-side encryption for any Kinesis stream using the AWS management console or the AWS SDK. To learn more about Kinesis Server-Side encryption, AWS Key Management Service, or about Kinesis Streams review the Amazon Kinesis getting started guide, the AWS Key Management Service developer guide, or the Amazon Kinesis product page.

 

Enjoy streaming.

Tara

More on the NSA’s Use of Traffic Shaping

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/07/more_on_the_nsa_2.html

“Traffic shaping” — the practice of tricking data to flow through a particular route on the Internet so it can be more easily surveiled — is an NSA technique that has gotten much less attention than it deserves. It’s a powerful technique that allows an eavesdropper to get access to communications channels it would otherwise not be able to monitor.

There’s a new paper on this technique:

This report describes a novel and more disturbing set of risks. As a technical matter, the NSA does not have to wait for domestic communications to naturally turn up abroad. In fact, the agency has technical methods that can be used to deliberately reroute Internet communications. The NSA uses the term “traffic shaping” to describe any technical means the deliberately reroutes Internet traffic to a location that is better suited, operationally, to surveillance. Since it is hard to intercept Yemen’s international communications from inside Yemen itself, the agency might try to “shape” the traffic so that it passes through communications cables located on friendlier territory. Think of it as diverting part of a river to a location from which it is easier (or more legal) to catch fish.

The NSA has clandestine means of diverting portions of the river of Internet traffic that travels on global communications cables.

Could the NSA use traffic shaping to redirect domestic Internet traffic — ­emails and chat messages sent between Americans, say­ — to foreign soil, where its surveillance can be conducted beyond the purview of Congress and the courts? It is impossible to categorically answer this question, due to the classified nature of many national-security surveillance programs, regulations and even of the legal decisions made by the surveillance courts. Nevertheless, this report explores a legal, technical, and operational landscape that suggests that traffic shaping could be exploited to sidestep legal restrictions imposed by Congress and the surveillance courts.

News article. NSA document detailing the technique with Yemen.

This work builds on previous research that I blogged about here.

The fundamental vulnerability is that routing information isn’t authenticated.

DevOps Cafe Episode 73 – Patrick Chanezon

Post Syndicated from DevOpsCafeAdmin original http://devopscafe.org/show/2017/7/11/devops-cafe-episode-73-patrick-chanezon.html

Changing how developers make change.

Patrick Chanezon chats with John and Damon about trends that are changing how Developers work.


 

  

Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 
Follow Patrick Chanezon on Twitter: @chanezon

Notes:

 

Please tweet or leave comments or questions below and we’ll read them on the show!

A kindly lesson for you non-techies about encryption

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/06/a-kindly-lesson-for-you-non-techies.html

The following tweets need to be debunked:

The answer to John Schindler’s question is:

every expert in cryptography doesn’t know this

Oh, sure, you can find fringe wacko who also knows crypto that agrees with you but all the sane members of the security community will not.

Telegram is not trustworthy because it’s partially closed-source. We can’t see how it works. We don’t know if they’ve made accidental mistakes that can be hacked. We don’t know if they’ve been bribed by the NSA or Russia to put backdoors in their program. In contrast, PGP and Signal are open-source. We can read exactly what the software does. Indeed, thousands of people have been reviewing their software looking for mistakes and backdoors. Being open-source doesn’t automatically make software better, but it does make hiding secret backdoors much harder.

Telegram is not trustworthy because we aren’t certain the crypto is done properly. Signal, and especially PGP, are done properly.

The thing about encryption is that when done properly, it works. Neither the NSA nor the Russians can break properly encrypted content. There’s no such thing as “military grade” encryption that is better than consumer grade. There’s only encryption that nobody can hack vs. encryption that your neighbor’s teenage kid can easily hack. Those scenes in TV/movies about breaking encryption is as realistic as sound in space: good for dramatic presentation, but not how things work in the real world.

In particular, end-to-end encryption works. Sure, in the past, such apps only encrypted as far as the server, so whoever ran the server could read your messages. Modern chat apps, though, are end-to-end: the servers have absolutely no ability to decrypt what’s on them, unless they can get the decryption keys from the phones. But some tasks, like encrypted messages to a group of people, can be hard to do properly.

Thus, in contrast to what John Schindler says, while we techies have doubts about Telegram, we don’t have doubts about Russia authorities having access to Signal and PGP messages.

Snowden hatred has become the anti-vax of crypto. Sure, there’s no particular reason to trust Snowden — people should really stop treating him as some sort of privacy-Jesus. But there’s no particular reason to distrust him, either. His bland statements on crypto are indistinguishable from any other crypto-enthusiast statements. If he’s a Russian pawn, then so too is the bulk of the crypto community.

With all this said, using Signal doesn’t make you perfectly safe. The person you are chatting with could be a secret agent — especially in group chat. There could be cameras/microphones in the room where you are using the app. The Russians can also hack into your phone, and likewise eavesdrop on everything you do with the phone, regardless of which app you use. And they probably have hacked specific people’s phones. On the other hand, if the NSA or Russians were widely hacking phones, we’d detect that this was happening. We haven’t.

Signal is therefore not a guarantee of safety, because nothing is, and if your life depends on it, you can’t trust any simple advice like “use Signal”. But, for the bulk of us, it’s pretty damn secure, and I trust neither the Russians nor the NSA are reading my Signal or PGP messages.

At first blush, this @20committee tweet appears to be non-experts opining on things outside their expertise. But in reality, it’s just obtuse partisanship, where truth and expertise doesn’t matter. Nothing you or I say can change some people’s minds on this matter, no matter how much our expertise gives weight to our words. This post is instead for bystanders, who don’t know enough to judge whether these crazy statements have merit.


Bonus:

So let’s talk about “every crypto expert“. It’s, of course, impossible to speak for every crypto expert. It’s like saying how the consensus among climate scientists is that mankind is warming the globe, while at the same time, ignoring the wide spread disagreement on how much warming that is.

The same is true here. You’ll get a widespread different set of responses from experts about the above tweet. Some, for example, will stress my point at the bottom that hacking the endpoint (the phone) breaks all the apps, and thus justify the above tweet from that point of view. Others will point out that all software has bugs, and it’s quite possible that Signal has some unknown bug that the Russians are exploiting.

So I’m not attempting to speak for what all experts might say here in the general case and what long lecture they can opine about. I am, though, pointing out the basics that virtually everyone agrees on, the consensus of open-source and working crypto.

Court Grants Subpoenas to Unmask ‘TVAddons’ and ‘ZemTV’ Operators

Post Syndicated from Ernesto original https://torrentfreak.com/court-grants-subpoenas-to-unmask-tvaddons-and-zemtv-operators-170621/

Earlier this month we broke the news that third-party Kodi add-on ZemTV and the TVAddons library were being sued in a federal court in Texas.

In a complaint filed by American satellite and broadcast provider Dish Network, both stand accused of copyright infringement, facing up to $150,000 for each offense.

While the allegations are serious, Dish doesn’t know the full identities of the defendants.

To find out more, the company requested a broad range of subpoenas from the court, targeting Amazon, Github, Google, Twitter, Facebook, PayPal, and several hosting providers.

From Dish’s request

This week the court granted the subpoenas, which means that they can be forwarded to the companies in question. Whether that will be enough to identify the people behind ‘TVAddons’ and ‘ZemTV’ remains to be seen, but Dish has cast its net wide.

For example, the subpoena directed at Google covers any type of information that can be used to identify the account holder of [email protected], which is believed to be tied to ZemTV.

The information requested from Google includes IP address logs with session date and timestamps, but also covers “all communications,” including GChat messages from 2014 onwards.

Similarly, Twitter is required to hand over information tied to the accounts of the users “TV Addons” and “shani_08_kodi” as well as other accounts linked to tvaddons.ag and streamingboxes.com. This also applies the various tweets that were sent through the account.

The subpoena specifically mentions “all communications, including ‘tweets’, Twitter sent to or received from each Twitter Account during the time period of February 1, 2014 to present.”

From the Twitter subpoena

Similar subpoenas were granted for the other services, tailored towards the information Dish hopes to find there. For example, the broadcast provider also requests details of each transaction from PayPal, as well as all debits and credits to the accounts.

In some parts, the subpoenas appear to be quite broad. PayPal is asked to reveal information on any account with the credit card statement “Shani,” for example. Similarly, Github is required to hand over information on accounts that are ‘associated’ with the tvaddons.ag domain, which is referenced by many people who are not directly connected to the site.

The service providers in question still have the option to challenge the subpoenas or ask the court for further clarification. A full overview of all the subpoena requests is available here (Exhibit 2 and onwards), including all the relevant details. This also includes several letters to foreign hosting providers.

While Dish still appears to be keen to find out who is behind ‘TVAddons’ and ‘ZemTV,’ not much has been heard from the defendants in question.

ZemTV developer “Shani” shut down his addon soon after the lawsuit was announced, without mentioning it specifically. TVAddons, meanwhile, has been offline for well over a week, without any notice in public about the reason for the prolonged downtime.

The court’s order granting the subpoenas and letters of request is available here (pdf).

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

casync — A tool for distributing file system images

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/casync-a-tool-for-distributing-file-system-images.html

Introducing casync

In the past months I have been working on a new project:
casync. casync takes
inspiration from the popular rsync file
synchronization tool as well as the probably even more popular
git revision control system. It combines the
idea of the rsync algorithm with the idea of git-style
content-addressable file systems, and creates a new system for
efficiently storing and delivering file system images, optimized for
high-frequency update cycles over the Internet. Its current focus is
on delivering IoT, container, VM, application, portable service or OS
images, but I hope to extend it later in a generic fashion to become
useful for backups and home directory synchronization as well (but
more about that later).

The basic technological building blocks casync is built from are
neither new nor particularly innovative (at least not anymore),
however the way casync combines them is different from existing tools,
and that’s what makes it useful for a variety of use-cases that other
tools can’t cover that well.

Why?

I created casync after studying how today’s popular tools store and
deliver file system images. To briefly name a few: Docker has a
layered tarball approach,
OSTree serves the
individual files directly via HTTP and maintains packed deltas to
speed up updates, while other systems operate on the block layer and
place raw squashfs images (or other archival file systems, such as
IS09660) for download on HTTP shares (in the better cases combined
with zsync data).

Neither of these approaches appeared fully convincing to me when used
in high-frequency update cycle systems. In such systems, it is
important to optimize towards a couple of goals:

  1. Most importantly, make updates cheap traffic-wise (for this most tools use image deltas of some form)
  2. Put boundaries on disk space usage on servers (keeping deltas between all version combinations clients might want to run updates between, would suggest keeping an exponentially growing amount of deltas on servers)
  3. Put boundaries on disk space usage on clients
  4. Be friendly to Content Delivery Networks (CDNs), i.e. serve neither too many small nor too many overly large files, and only require the most basic form of HTTP. Provide the repository administrator with high-level knobs to tune the average file size delivered.
  5. Simplicity to use for users, repository administrators and developers

I don’t think any of the tools mentioned above are really good on more
than a small subset of these points.

Specifically: Docker’s layered tarball approach dumps the “delta”
question onto the feet of the image creators: the best way to make
your image downloads minimal is basing your work on an existing image
clients might already have, and inherit its resources, maintaining full
history. Here, revision control (a tool for the developer) is
intermingled with update management (a concept for optimizing
production delivery). As container histories grow individual deltas
are likely to stay small, but on the other hand a brand-new deployment
usually requires downloading the full history onto the deployment
system, even though there’s no use for it there, and likely requires
substantially more disk space and download sizes.

OSTree’s serving of individual files is unfriendly to CDNs (as many
small files in file trees cause an explosion of HTTP GET
requests). To counter that OSTree supports placing pre-calculated
delta images between selected revisions on the delivery servers, which
means a certain amount of revision management, that leaks into the
clients.

Delivering direct squashfs (or other file system) images is almost
beautifully simple, but of course means every update requires a full
download of the newest image, which is both bad for disk usage and
generated traffic. Enhancing it with zsync makes this a much better
option, as it can reduce generated traffic substantially at very
little cost of history/meta-data (no explicit deltas between a large
number of versions need to be prepared server side). On the other hand
server requirements in disk space and functionality (HTTP Range
requests) are minus points for the use-case I am interested in.

(Note: all the mentioned systems have great properties, and it’s not
my intention to badmouth them. They only point I am trying to make is
that for the use case I care about — file system image delivery with
high high frequency update-cycles — each system comes with certain
drawbacks.)

Security & Reproducibility

Besides the issues pointed out above I wasn’t happy with the security
and reproducibility properties of these systems. In today’s world
where security breaches involving hacking and breaking into connected
systems happen every day, an image delivery system that cannot make
strong guarantees regarding data integrity is out of
date. Specifically, the tarball format is famously nondeterministic:
the very same file tree can result in any number of different
valid serializations depending on the tool used, its version and the
underlying OS and file system. Some tar implementations attempt to
correct that by guaranteeing that each file tree maps to exactly
one valid serialization, but such a property is always only specific
to the tool used. I strongly believe that any good update system must
guarantee on every single link of the chain that there’s only one
valid representation of the data to deliver, that can easily be
verified.

What casync Is

So much about the background why I created casync. Now, let’s have a
look what casync actually is like, and what it does. Here’s the brief
technical overview:

Encoding: Let’s take a large linear data stream, split it into
variable-sized chunks (the size of each being a function of the
chunk’s contents), and store these chunks in individual, compressed
files in some directory, each file named after a strong hash value of
its contents, so that the hash value may be used to as key for
retrieving the full chunk data. Let’s call this directory a “chunk
store”. At the same time, generate a “chunk index” file that lists
these chunk hash values plus their respective chunk sizes in a simple
linear array. The chunking algorithm is supposed to create variable,
but similarly sized chunks from the data stream, and do so in a way
that the same data results in the same chunks even if placed at
varying offsets. For more information see this blog
story
.

Decoding: Let’s take the chunk index file, and reassemble the large
linear data stream by concatenating the uncompressed chunks retrieved
from the chunk store, keyed by the listed chunk hash values.

As an extra twist, we introduce a well-defined, reproducible,
random-access serialization format for file trees (think: a more
modern tar), to permit efficient, stable storage of complete file
trees in the system, simply by serializing them and then passing them
into the encoding step explained above.

Finally, let’s put all this on the network: for each image you want to
deliver, generate a chunk index file and place it on an HTTP
server. Do the same with the chunk store, and share it between the
various index files you intend to deliver.

Why bother with all of this? Streams with similar contents will result
in mostly the same chunk files in the chunk store. This means it is
very efficient to store many related versions of a data stream in the
same chunk store, thus minimizing disk usage. Moreover, when
transferring linear data streams chunks already known on the receiving
side can be made use of, thus minimizing network traffic.

Why is this different from rsync or OSTree, or similar tools? Well,
one major difference between casync and those tools is that we
remove file boundaries before chunking things up. This means that
small files are lumped together with their siblings and large files
are chopped into pieces, which permits us to recognize similarities in
files and directories beyond file boundaries, and makes sure our chunk
sizes are pretty evenly distributed, without the file boundaries
affecting them.

The “chunking” algorithm is based on a the buzhash rolling hash
function. SHA256 is used as strong hash function to generate digests
of the chunks. xz is used to compress the individual chunks.

Here’s a diagram, hopefully explaining a bit how the encoding process
works, wasn’t it for my crappy drawing skills:

Diagram

The diagram shows the encoding process from top to bottom. It starts
with a block device or a file tree, which is then serialized and
chunked up into variable sized blocks. The compressed chunks are then
placed in the chunk store, while a chunk index file is written listing
the chunk hashes in order. (The original SVG of this graphic may be
found here.)

Details

Note that casync operates on two different layers, depending on the
use-case of the user:

  1. You may use it on the block layer. In this case the raw block data
    on disk is taken as-is, read directly from the block device, split
    into chunks as described above, compressed, stored and delivered.

  2. You may use it on the file system layer. In this case, the
    file tree serialization format mentioned above comes into play:
    the file tree is serialized depth-first (much like tar would do
    it) and then split into chunks, compressed, stored and delivered.

The fact that it may be used on both the block and file system layer
opens it up for a variety of different use-cases. In the VM and IoT
ecosystems shipping images as block-level serializations is more
common, while in the container and application world file-system-level
serializations are more typically used.

Chunk index files referring to block-layer serializations carry the
.caibx suffix, while chunk index files referring to file system
serializations carry the .caidx suffix. Note that you may also use
casync as direct tar replacement, i.e. without the chunking, just
generating the plain linear file tree serialization. Such files
carry the .catar suffix. Internally .caibx are identical to
.caidx files, the only difference is semantical: .caidx files
describe a .catar file, while .caibx files may describe any other
blob. Finally, chunk stores are directories carrying the .castr
suffix.

Features

Here are a couple of other features casync has:

  1. When downloading a new image you may use casync‘s --seed=
    feature: each block device, file, or directory specified is processed
    using the same chunking logic described above, and is used as
    preferred source when putting together the downloaded image locally,
    avoiding network transfer of it. This of course is useful whenever
    updating an image: simply specify one or more old versions as seed and
    only download the chunks that truly changed since then. Note that
    using seeds requires no history relationship between seed and the new
    image to download. This has major benefits: you can even use it to
    speed up downloads of relatively foreign and unrelated data. For
    example, when downloading a container image built using Ubuntu you can
    use your Fedora host OS tree in /usr as seed, and casync will
    automatically use whatever it can from that tree, for example timezone
    and locale data that tends to be identical between
    distributions. Example: casync extract
    http://example.com/myimage.caibx --seed=/dev/sda1 /dev/sda2
    . This
    will place the block-layer image described by the indicated URL in the
    /dev/sda2 partition, using the existing /dev/sda1 data as seeding
    source. An invocation like this could be typically used by IoT systems
    with an A/B partition setup. Example 2: casync extract
    http://example.com/mycontainer-v3.caidx --seed=/srv/container-v1
    --seed=/srv/container-v2 /src/container-v3
    , is very similar but
    operates on the file system layer, and uses two old container versions
    to seed the new version.

  2. When operating on the file system level, the user has fine-grained
    control on the meta-data included in the serialization. This is
    relevant since different use-cases tend to require a different set of
    saved/restored meta-data. For example, when shipping OS images, file
    access bits/ACLs and ownership matter, while file modification times
    hurt. When doing personal backups OTOH file ownership matters little
    but file modification times are important. Moreover different backing
    file systems support different feature sets, and storing more
    information than necessary might make it impossible to validate a tree
    against an image if the meta-data cannot be replayed in full. Due to
    this, casync provides a set of --with= and --without= parameters
    that allow fine-grained control of the data stored in the file tree
    serialization, including the granularity of modification times and
    more. The precise set of selected meta-data features is also always
    part of the serialization, so that seeding can work correctly and
    automatically.

  3. casync tries to be as accurate as possible when storing file
    system meta-data. This means that besides the usual baseline of file
    meta-data (file ownership and access bits), and more advanced features
    (extended attributes, ACLs, file capabilities) a number of more exotic
    data is stored as well, including Linux
    chattr(1) file attributes, as
    well as FAT file
    attributes

    (you may wonder why the latter? — EFI is FAT, and /efi is part of
    the comprehensive serialization of any host). In the future I intend
    to extend this further, for example storing btrfs sub-volume
    information where available. Note that as described above every single
    type of meta-data may be turned off and on individually, hence if you
    don’t need FAT file bits (and I figure it’s pretty likely you don’t),
    then they won’t be stored.

  4. The user creating .caidx or .caibx files may control the desired
    average chunk length (before compression) freely, using the
    --chunk-size= parameter. Smaller chunks increase the number of
    generated files in the chunk store and increase HTTP GET load on the
    server, but also ensure that sharing between similar images is
    improved, as identical patterns in the images stored are more likely
    to be recognized. By default casync will use a 64K average chunk
    size. Tweaking this can be particularly useful when adapting the
    system to specific CDNs, or when delivering compressed disk images
    such as squashfs (see below).

  5. Emphasis is placed on making all invocations reproducible,
    well-defined and strictly deterministic. As mentioned above this is a
    requirement to reach the intended security guarantees, but is also
    useful for many other use-cases. For example, the casync digest
    command may be used to calculate a hash value identifying a specific
    directory in all desired detail (use --with= and --without to pick
    the desired detail). Moreover the casync mtree command may be used
    to generate a BSD mtree(5) compatible manifest of a directory tree,
    .caidx or .catar file.

  6. The file system serialization format is nicely composable. By this
    I mean that the serialization of a file tree is the concatenation of
    the serializations of all files and file sub-trees located at the
    top of the tree, with zero meta-data references from any of these
    serializations into the others. This property is essential to ensure
    maximum reuse of chunks when similar trees are serialized.

  7. When extracting file trees or disk image files, casync
    will automatically create
    reflinks
    from any specified seeds if the underlying file system supports it
    (such as btrfs, ocfs, and future xfs). After all, instead of
    copying the desired data from the seed, we can just tell the file
    system to link up the relevant blocks. This works both when extracting
    .caidx and .caibx files — the latter of course only when the
    extracted disk image is placed in a regular raw image file on disk,
    rather than directly on a plain block device, as plain block devices
    do not know the concept of reflinks.

  8. Optionally, when extracting file trees, casync can
    create traditional UNIX hard-links for identical files in specified
    seeds (--hardlink=yes). This works on all UNIX file systems, and can
    save substantial amounts of disk space. However, this only works for
    very specific use-cases where disk images are considered read-only
    after extraction, as any changes made to one tree will propagate to
    all other trees sharing the same hard-linked files, as that’s the
    nature of hard-links. In this mode, casync exposes OSTree-like
    behavior, which is built heavily around read-only hard-link trees.

  9. casync tries to be smart when choosing what to include in file
    system images. Implicitly, file systems such as procfs and sysfs are
    excluded from serialization, as they expose API objects, not real
    files. Moreover, the “nodump” (+d)
    chattr(1) flag is honored by
    default, permitting users to mark files to exclude from serialization.

  10. When creating and extracting file trees casync may apply an
    automatic or explicit UID/GID shift. This is particularly useful when
    transferring container image for use with Linux user name-spacing.

  11. In addition to local operation, casync currently supports HTTP,
    HTTPS, FTP and ssh natively for downloading chunk index files and
    chunks (the ssh mode requires installing casync on the remote host,
    though, but an sftp mode not requiring that should be easy to
    add). When creating index files or chunks, only ssh is supported as
    remote back-end.

  12. When operating on block-layer images, you may expose locally or
    remotely stored images as local block devices. Example: casync mkdev
    http://example.com/myimage.caibx
    exposes the disk image described by
    the indicated URL as local block device in /dev, which you then may
    use the usual block device tools on, such as mount or fdisk (only
    read-only though). Chunks are downloaded on access with high priority,
    and at low priority when idle in the background. Note that in this
    mode, casync also plays a role similar to “dm-verity”, as all blocks
    are validated against the strong digests in the chunk index file
    before passing them on to the kernel’s block layer. This feature is
    implemented though Linux’ NBD kernel facility.

  13. Similar, when operating on file-system-layer images, you may mount
    locally or remotely stored images as regular file systems. Example:
    casync mount http://example.com/mytree.caidx /srv/mytree mounts the
    file tree image described by the indicated URL as a local directory
    /srv/mytree. This feature is implemented though Linux’ FUSE kernel
    facility. Note that special care is taken that the images exposed this
    way can be packed up again with casync make and are guaranteed to
    return the bit-by-bit exact same serialization again that it was
    mounted from. No data is lost or changed while passing things through
    FUSE (OK, strictly speaking this is a lie, we do lose ACLs, but that’s
    hopefully just a temporary gap to be fixed soon).

  14. In IoT A/B fixed size partition setups the file systems placed in
    the two partitions are usually much shorter than the partition size,
    in order to keep some room for later, larger updates. casync is able
    to analyze the super-block of a number of common file systems in order
    to determine the actual size of a file system stored on a block
    device, so that writing a file system to such a partition and reading
    it back again will result in reproducible data. Moreover this speeds
    up the seeding process, as there’s little point in seeding the
    white-space after the file system within the partition.

Example Command Lines

Here’s how to use casync, explained with a few examples:

$ casync make foobar.caidx /some/directory

This will create a chunk index file foobar.caidx in the local
directory, and populate the chunk store directory default.castr
located next to it with the chunks of the serialization (you can
change the name for the store directory with --store= if you
like). This command operates on the file-system level. A similar
command operating on the block level:

$ casync make foobar.caibx /dev/sda1

This command creates a chunk index file foobar.caibx in the local
directory describing the current contents of the /dev/sda1 block
device, and populates default.castr in the same way as above. Note
that you may as well read a raw disk image from a file instead of a
block device:

$ casync make foobar.caibx myimage.raw

To reconstruct the original file tree from the .caidx file and
the chunk store of the first command, use:

$ casync extract foobar.caidx /some/other/directory

And similar for the block-layer version:

$ casync extract foobar.caibx /dev/sdb1

or, to extract the block-layer version into a raw disk image:

$ casync extract foobar.caibx myotherimage.raw

The above are the most basic commands, operating on local data
only. Now let’s make this more interesting, and reference remote
resources:

$ casync extract http://example.com/images/foobar.caidx /some/other/directory

This extracts the specified .caidx onto a local directory. This of
course assumes that foobar.caidx was uploaded to the HTTP server in
the first place, along with the chunk store. You can use any command
you like to accomplish that, for example scp or
rsync. Alternatively, you can let casync do this directly when
generating the chunk index:

$ casync make ssh.example.com:images/foobar.caidx /some/directory

This will use ssh to connect to the ssh.example.com server, and then
places the .caidx file and the chunks on it. Note that this mode of
operation is “smart”: this scheme will only upload chunks currently
missing on the server side, and not re-transmit what already is
available.

Note that you can always configure the precise path or URL of the
chunk store via the --store= option. If you do not do that, then the
store path is automatically derived from the path or URL: the last
component of the path or URL is replaced by default.castr.

Of course, when extracting .caidx or .caibx files from remote sources,
using a local seed is advisable:

$ casync extract http://example.com/images/foobar.caidx --seed=/some/exising/directory /some/other/directory

Or on the block layer:

$ casync extract http://example.com/images/foobar.caibx --seed=/dev/sda1 /dev/sdb2

When creating chunk indexes on the file system layer casync will by
default store meta-data as accurately as possible. Let’s create a chunk
index with reduced meta-data:

$ casync make foobar.caidx --with=sec-time --with=symlinks --with=read-only /some/dir

This command will create a chunk index for a file tree serialization
that has three features above the absolute baseline supported: 1s
granularity time-stamps, symbolic links and a single read-only bit. In
this mode, all the other meta-data bits are not stored, including
nanosecond time-stamps, full UNIX permission bits, file ownership or
even ACLs or extended attributes.

Now let’s make a .caidx file available locally as a mounted file
system, without extracting it:

$ casync mount http://example.comf/images/foobar.caidx /mnt/foobar

And similar, let’s make a .caibx file available locally as a block device:

$ casync mkdev http://example.comf/images/foobar.caibx

This will create a block device in /dev and print the used device
node path to STDOUT.

As mentioned, casync is big about reproducibility. Let’s make use of
that to calculate the a digest identifying a very specific version of
a file tree:

$ casync digest .

This digest will include all meta-data bits casync and the underlying
file system know about. Usually, to make this useful you want to
configure exactly what meta-data to include:

$ casync digest --with=unix .

This makes use of the --with=unix shortcut for selecting meta-data
fields. Specifying --with-unix= selects all meta-data that
traditional UNIX file systems support. It is a shortcut for writing out:
--with=16bit-uids --with=permissions --with=sec-time --with=symlinks
--with=device-nodes --with=fifos --with=sockets
.

Note that when calculating digests or creating chunk indexes you may
also use the negative --without= option to remove specific features
but start from the most precise:

$ casync digest --without=flag-immutable

This generates a digest with the most accurate meta-data, but leaves
one feature out: chattr(1)‘s
immutable (+i) file flag.

To list the contents of a .caidx file use a command like the following:

$ casync list http://example.com/images/foobar.caidx

or

$ casync mtree http://example.com/images/foobar.caidx

The former command will generate a brief list of files and
directories, not too different from tar t or ls -al in its
output. The latter command will generate a BSD
mtree(5) compatible
manifest. Note that casync actually stores substantially more file
meta-data than mtree files can express, though.

What casync isn’t

  1. casync is not an attempt to minimize serialization and downloaded
    deltas to the extreme. Instead, the tool is supposed to find a good
    middle ground, that is good on traffic and disk space, but not at the
    price of convenience or requiring explicit revision control. If you
    care about updates that are absolutely minimal, there are binary delta
    systems around that might be an option for you, such as Google’s
    Courgette
    .

  2. casync is not a replacement for rsync, or git or zsync or
    anything like that. They have very different use-cases and
    semantics. For example, rsync permits you to directly synchronize two
    file trees remotely. casync just cannot do that, and it is unlikely
    it every will.

Where next?

casync is supposed to be a generic synchronization tool. Its primary
focus for now is delivery of OS images, but I’d like to make it useful
for a couple other use-cases, too. Specifically:

  1. To make the tool useful for backups, encryption is missing. I have
    pretty concrete plans how to add that. When implemented, the tool
    might become an alternative to restic,
    BorgBackup or
    tarsnap.

  2. Right now, if you want to deploy casync in real-life, you still
    need to validate the downloaded .caidx or .caibx file yourself, for
    example with some gpg signature. It is my intention to integrate with
    gpg in a minimal way so that signing and verifying chunk index files
    is done automatically.

  3. In the longer run, I’d like to build an automatic synchronizer for
    $HOME between systems from this. Each $HOME instance would be
    stored automatically in regular intervals in the cloud using casync,
    and conflicts would be resolved locally.

  4. casync is written in a shared library style, but it is not yet
    built as one. Specifically this means that almost all of casync‘s
    functionality is supposed to be available as C API soon, and
    applications can process casync files on every level. It is my
    intention to make this library useful enough so that it will be easy
    to write a module for GNOME’s gvfs subsystem in order to make remote
    or local .caidx files directly available to applications (as an
    alternative to casync mount). In fact the idea is to make this all
    flexible enough that even the remoting back-ends can be replaced
    easily, for example to replace casync‘s default HTTP/HTTPS back-ends
    built on CURL with GNOME’s own HTTP implementation, in order to share
    cookies, certificates, … There’s also an alternative method to
    integrate with casync in place already: simply invoke casync as a
    sub-process. casync will inform you about a certain set of state
    changes using a mechanism compatible with
    sd_notify(3). In
    future it will also propagate progress data this way and more.

  5. I intend to a add a new seeding back-end that sources chunks from
    the local network. After downloading the new .caidx file off the
    Internet casync would then search for the listed chunks on the local
    network first before retrieving them from the Internet. This should
    speed things up on all installations that have multiple similar
    systems deployed in the same network.

Further plans are listed tersely in the
TODO file.

FAQ:

  1. Is this a systemd project?casync is hosted under the
    github systemd umbrella, and the
    projects share the same coding style. However, the code-bases are
    distinct and without interdependencies, and casync works fine both
    on systemd systems and systems without it.

  2. Is casync portable? — At the moment: no. I only run Linux and
    that’s what I code for. That said, I am open to accepting portability
    patches (unlike for systemd, which doesn’t really make sense on
    non-Linux systems), as long as they don’t interfere too much with the
    way casync works. Specifically this means that I am not too
    enthusiastic about merging portability patches for OSes lacking the
    openat(2) family
    of APIs.

  3. Does casync require reflink-capable file systems to work, such
    as btrfs?
    — No it doesn’t. The reflink magic in casync is
    employed when the file system permits it, and it’s good to have it,
    but it’s not a requirement, and casync will implicitly fall back to
    copying when it isn’t available. Note that casync supports a number
    of file system features on a variety of file systems that aren’t
    available everywhere, for example FAT’s system/hidden file flags or
    xfs‘s projinherit file flag.

  4. Is casync stable? — I just tagged the first, initial
    release. While I have been working on it since quite some time and it
    is quite featureful, this is the first time I advertise it publicly,
    and it hence received very little testing outside of its own test
    suite. I am also not fully ready to commit to the stability of the
    current serialization or chunk index format. I don’t see any breakages
    coming for it though. casync is pretty light on documentation right
    now, and does not even have a man page. I also intend to correct that
    soon.

  5. Are the .caidx/.caibx and .catar file formats open and
    documented?
    casync is Open Source, so if you want to know the
    precise format, have a look at the sources for now. It’s definitely my
    intention to add comprehensive docs for both formats however. Don’t
    forget this is just the initial version right now.

  6. casync is just like $SOMEOTHERTOOL! Why are you reinventing
    the wheel (again)?
    — Well, because casync isn’t “just like” some
    other tool. I am pretty sure I did my homework, and that there is no
    tool just like casync right now. The tools coming closest are probably
    rsync, zsync, tarsnap, restic, but they are quite different beasts
    each.

  7. Why did you invent your own serialization format for file trees?
    Why don’t you just use tar?
    — That’s a good question, and other
    systems — most prominently tarsnap — do that. However, as mentioned
    above tar doesn’t enforce reproducibility. It also doesn’t really do
    random access: if you want to access some specific file you need to
    read every single byte stored before it in the tar archive to find
    it, which is of course very expensive. The serialization casync
    implements places a focus on reproducibility, random access, and
    meta-data control. Much like traditional tar it can still be
    generated and extracted in a stream fashion though.

  8. Does casync save/restore SELinux/SMACK file labels? — At the
    moment not. That’s not because I wouldn’t want it to, but simply
    because I am not a guru of either of these systems, and didn’t want to
    implement something I do not fully grok nor can test. If you look at
    the sources you’ll find that there’s already some definitions in place
    that keep room for them though. I’d be delighted to accept a patch
    implementing this fully.

  9. What about delivering squashfs images? How well does chunking
    work on compressed serializations?
    – That’s a very good point!
    Usually, if you apply the a chunking algorithm to a compressed data
    stream (let’s say a tar.gz file), then changing a single bit at the
    front will propagate into the entire remainder of the file, so that
    minimal changes will explode into major changes. Thankfully this
    doesn’t apply that strictly to squashfs images, as it provides
    random access to files and directories and thus breaks up the
    compression streams in regular intervals to make seeking easy. This
    fact is beneficial for systems employing chunking, such as casync as
    this means single bit changes might affect their vicinity but will not
    explode in an unbounded fashion. In order achieve best results when
    delivering squashfs images through casync the block sizes of
    squashfs and the chunks sizes of casync should be matched up
    (using casync‘s --chunk-size= option). How precisely to choose
    both values is left a research subject for the user, for now.

  10. What does the name casync mean? – It’s a synchronizing
    tool, hence the -sync suffix, following rsync‘s naming. It makes
    use of the content-addressable concept of git hence the ca-
    prefix.

  11. Where can I get this stuff? Is it already packaged? – Check
    out the sources on GitHub. I
    just tagged the first
    version
    . Martin
    Pitt has packaged casync for
    Ubuntu
    . There
    is also an ArchLinux
    package
    . Zbigniew
    Jędrzejewski-Szmek has prepared a Fedora
    RPM
    that hopefully
    will soon be included in the distribution.

Should you care? Is this a tool for you?

Well, that’s up to you really. If you are involved with projects that
need to deliver IoT, VM, container, application or OS images, then
maybe this is a great tool for you — but other options exist, some of
which are linked above.

Note that casync is an Open Source project: if it doesn’t do exactly
what you need, prepare a patch that adds what you need, and we’ll
consider it.

If you are interested in the project and would like to talk about this
in person, I’ll be presenting casync soon at Kinvolk’s Linux
Technologies
Meetup

in Berlin, Germany. You are invited. I also intend to talk about it at
All Systems Go!, also in Berlin.

DevOps Cafe Episode 72 – Kelsey Hightower

Post Syndicated from DevOpsCafeAdmin original http://devopscafe.org/show/2017/6/18/devops-cafe-episode-72-kelsey-hightower.html

You can’t contain(er) Kelsey.

John and Damon chat with Kelsey Hightower (Google) about the future of operations, kubernetes, docker, containers, self-learning, and more!
  

  

Direct download

Follow John Willis on Twitter: @botchagalupe
Follow Damon Edwards on Twitter: @damonedwards 
Follow Kelsey Hightower on Twitter: @kelseyhightower

Notes:

 

Please tweet or leave comments or questions below and we’ll read them on the show!

Digital painter rundown

Post Syndicated from Eevee original https://eev.ee/blog/2017/06/17/digital-painter-rundown/

Another patron post! IndustrialRobot asks:

You should totally write about drawing/image manipulation programs! (Inspired by https://eev.ee/blog/2015/05/31/text-editor-rundown/)

This is a little trickier than a text editor comparison — while most text editors are cross-platform, quite a few digital art programs are not. So I’m effectively unable to even try a decent chunk of the offerings. I’m also still a relatively new artist, and image editors are much harder to briefly compare than text editors…

Right, now that your expectations have been suitably lowered:

Krita

I do all of my digital art in Krita. It’s pretty alright.

Okay so Krita grew out of Calligra, which used to be KOffice, which was an office suite designed for KDE (a Linux desktop environment). I bring this up because KDE has a certain… reputation. With KDE, there are at least three completely different ways to do anything, each of those ways has ludicrous amounts of customization and settings, and somehow it still can’t do what you want.

Krita inherits this aesthetic by attempting to do literally everything. It has 17 different brush engines, more than 70 layer blending modes, seven color picker dockers, and an ungodly number of colorspaces. It’s clearly intended primarily for drawing, but it also supports animation and vector layers and a pretty decent spread of raster editing tools. I just right now discovered that it has Photoshop-like “layer styles” (e.g. drop shadow), after a year and a half of using it.

In fairness, Krita manages all of this stuff well enough, and (apparently!) it manages to stay out of your way if you’re not using it. In less fairness, they managed to break erasing with a Wacom tablet pen for three months?

I don’t want to rag on it too hard; it’s an impressive piece of work, and I enjoy using it! The emotion it evokes isn’t so much frustration as… mystified bewilderment.

I once filed a ticket suggesting the addition of a brush size palette — a panel showing a grid of fixed brush sizes that makes it easy to switch between known sizes with a tablet pen (and increases the chances that you’ll be able to get a brush back to the right size again). It’s a prominent feature of Paint Tool SAI and Clip Studio Paint, and while I’ve never used either of those myself, I’ve seen a good few artists swear by it.

The developer response was that I could emulate the behavior by creating brush presets. But that’s flat-out wrong: getting the same effect would require creating a ton of brush presets for every brush I have, plus giving them all distinct icons so the size is obvious at a glance. Even then, it would be much more tedious to use and fill my presets with junk.

And that sort of response is what’s so mysterious to me. I’ve never even been able to use this feature myself, but a year of amateur painting with Krita has convinced me that it would be pretty useful. But a developer didn’t see the use and suggested an incredibly tedious alternative that only half-solves the problem and creates new ones. Meanwhile, of the 28 existing dockable panels, a quarter of them are different ways to choose colors.

What is Krita trying to be, then? What does Krita think it is? Who precisely is the target audience? I have no idea.


Anyway, I enjoy drawing in Krita well enough. It ships with a respectable set of brushes, and there are plenty more floating around. It has canvas rotation, canvas mirroring, perspective guide tools, and other art goodies. It doesn’t colordrop on right click by default, which is arguably a grave sin (it shows a customizable radial menu instead), but that’s easy to rebind. It understands having a background color beneath a bottom transparent layer, which is very nice. You can also toggle any brush between painting and erasing with the press of a button, and that turns out to be very useful.

It doesn’t support infinite canvases, though it does offer a one-click button to extend the canvas in a given direction. I’ve never used it (and didn’t even know what it did until just now), but would totally use an infinite canvas.

I haven’t used the animation support too much, but it’s pretty nice to have. Granted, the only other animation software I’ve used is Aseprite, so I don’t have many points of reference here. It’s a relatively new addition, too, so I assume it’ll improve over time.

The one annoyance I remember with animation was really an interaction with a larger annoyance, which is: working with selections kind of sucks. You can’t drag a selection around with the selection tool; you have to switch to the move tool. That would be fine if you could at least drag the selection ring around with the selection tool, but you can’t do that either; dragging just creates a new selection.

If you want to copy a selection, you have to explicitly copy it to the clipboard and paste it, which creates a new layer. Ctrl-drag with the move tool doesn’t work. So then you have to merge that layer down, which I think is where the problem with animation comes in: a new layer is non-animated by default, meaning it effectively appears in any frame, so simply merging it down with merge it onto every single frame of the layer below. And you won’t even notice until you switch frames or play back the animation. Not ideal.

This is another thing that makes me wonder about Krita’s sense of identity. It has a lot of fancy general-purpose raster editing features that even GIMP is still struggling to implement, like high color depth support and non-destructive filters, yet something as basic as working with selections is clumsy. (In fairness, GIMP is a bit clumsy here too, but it has a consistent notion of “floating selection” that’s easy enough to work with.)

I don’t know how well Krita would work as a general-purpose raster editor; I’ve never tried to use it that way. I can’t think of anything obvious that’s missing. The only real gotcha is that some things you might expect to be tools, like smudge or clone, are just types of brush in Krita.

GIMP

Ah, GIMP — open source’s answer to Photoshop.

It’s very obviously intended for raster editing, and I’m pretty familiar with it after half a lifetime of only using Linux. I even wrote a little Scheme script for it ages ago to automate some simple edits to a couple hundred files, back before I was aware of ImageMagick. I don’t know what to say about it, specifically; it’s fairly powerful and does a wide variety of things.

In fact I’d say it’s almost frustratingly intended for raster editing. I used GIMP in my first attempts at digital painting, before I’d heard of Krita. It was okay, but so much of it felt clunky and awkward. Painting is split between a pencil tool, a paintbrush tool, and an airbrush tool; I don’t really know why. The default brushes are largely uninteresting. Instead of brush presets, there are tool presets that can be saved for any tool; it’s a neat idea, but doesn’t feel like a real substitute for brush presets.

Much of the same functionality as Krita is there, but it’s all somehow more clunky. I’m sure it’s possible to fiddle with the interface to get something friendlier for painting, but I never really figured out how.

And then there’s the surprising stuff that’s missing. There’s no canvas rotation, for example. There’s only one type of brush, and it just stamps the same pattern along a path. I don’t think it’s possible to smear or blend or pick up color while painting. The only way to change the brush size is via the very sensitive slider on the tool options panel, which I remember being a little annoying with a tablet pen. Also, you have to specifically enable tablet support? It’s not difficult or anything, but I have no idea why the default is to ignore tablet pressure and treat it like a regular mouse cursor.

As I mentioned above, there’s also no support for high color depth or non-destructive editing, which is honestly a little embarrassing. Those are the major things Serious Professionals™ have been asking for for ages, and GIMP has been trying to provide them, but it’s taking a very long time. The first signs of GEGL, a new library intended to provide these features, appeared in GIMP 2.6… in 2008. The last major release was in 2012. GIMP has been working on this new plumbing for almost as long as Krita’s entire development history. (To be fair, Krita has also raised almost €90,000 from three Kickstarters to fund its development; I don’t know that GIMP is funded at all.)

I don’t know what’s up with GIMP nowadays. It’s still under active development, but the exact status and roadmap are a little unclear. I still use it for some general-purpose editing, but I don’t see any reason to use it to draw.

I do know that canvas rotation will be in the next release, and there was some experimentation with embedding MyPaint’s brush engine (though when I tried it it was basically unusable), so maybe GIMP is interested in wooing artists? I guess we’ll see.

MyPaint

Ah, MyPaint. I gave it a try once. Once.

It’s a shame, really. It sounds pretty great: specifically built for drawing, has very powerful brushes, supports an infinite canvas, supports canvas rotation, has a simple UI that gets out of your way. Perfect.

Or so it seems. But in MyPaint’s eagerness to shed unnecessary raster editing tools, it forgot a few of the more useful ones. Like selections.

MyPaint has no notion of a selection, nor of copy/paste. If you want to move a head to align better to a body, for example, the sanctioned approach is to duplicate the layer, erase the head from the old layer, erase everything but the head from the new layer, then move the new layer.

I can’t find anything that resembles HSL adjustment, either. I guess the workaround for that is to create H/S/L layers and floodfill them with different colors until you get what you want.

I can’t work seriously without these basic editing tools. I could see myself doodling in MyPaint, but Krita works just as well for doodling as for serious painting, so I’ve never gone back to it.

Drawpile

Drawpile is the modern equivalent to OpenCanvas, I suppose? It lets multiple people draw on the same canvas simultaneously. (I would not recommend it as a general-purpose raster editor.)

It’s a little clunky in places — I sometimes have bugs where keyboard focus gets stuck in the chat, or my tablet cursor becomes invisible — but the collaborative part works surprisingly well. It’s not a brush powerhouse or anything, and I don’t think it allows textured brushes, but it supports tablet pressure and canvas rotation and locked alpha and selections and whatnot.

I’ve used it a couple times, and it’s worked well enough that… well, other people made pretty decent drawings with it? I’m not sure I’ve managed yet. And I wouldn’t use it single-player. Still, it’s fun.

Aseprite

Aseprite is for pixel art so it doesn’t really belong here at all. But it’s very good at that and I like it a lot.

That’s all

I can’t name any other serious contender that exists for Linux.

I’m dimly aware of a thing called “Photo Shop” that’s more intended for photos but functions as a passable painter. More artists seem to swear by Paint Tool SAI and Clip Studio Paint. Also there’s Paint.NET, but I have no idea how well it’s actually suited for painting.

And that’s it! That’s all I’ve got. Krita for drawing, GIMP for editing, Drawpile for collaborative doodling.

Mira, tiny robot of joyful delight

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/mira-robot-alonso-martinez/

The staff of Pi Towers are currently melting into puddles while making ‘Aaaawwwwwww’ noises as Mira, the adorable little Pi-controlled robot made by Pixar 3D artist Alonso Martinez, steals their hearts.

Mira the robot playing peek-a-boo

If you want to get updates on Mira’s progress, sign up for the mailing list! http://eepurl.com/bteigD Mira is a desk companion that makes your life better one smile at a time. This project explores human robot interactivity and emotional intelligence. Currently Mira uses face tracking to interact with the users and loves playing the game “peek-a-boo”.

Introducing Mira

Honestly, I can’t type words – I am but a puddle! If I could type at all, I would only produce a stream of affectionate fragments. Imagine walking into a room full of kittens. What you would sound like is what I’d type.

No! I can do this. I’m a professional. I write for a living! I can…

SHE BLINKS OHMYAAAARGH!!!

Mira Alonso Martinez Raspberry Pi

Weebl & Bob meets South Park’s Ike Broflovski in an adorable 3D-printed bundle of ‘Aaawwwww’

Introducing Mira (I promise I can do this)

Right. I’ve had a nap and a drink. I’ve composed myself. I am up for this challenge. As long as I don’t look directly at her, I’ll be fine!

Here I go.

As one of the many über-talented 3D artists at Pixar, Alonso Martinez knows a thing or two about bringing adorable-looking characters to life on screen. However, his work left him wondering:

In movies you see really amazing things happening but you actually can’t interact with them – what would it be like if you could interact with characters?

So with the help of his friends Aaron Nathan and Vijay Sundaram, Alonso set out to bring the concept of animation to the physical world by building a “character” that reacts to her environment. His experiments with robotics started with Gertie, a ball-like robot reminiscent of his time spent animating bouncing balls when he was learning his trade. From there, he moved on to Mira.

Mira Alonso Martinez

Many, many of the views of this Tested YouTube video have come from me. So many.

Mira swivels to follow a person’s face, plays games such as peekaboo, shows surprise when you finger-shoot her, and giggles when you give her a kiss.

Mira’s inner workings

To get Mira to turn her head in three dimensions, Alonso took inspiration from the Microsoft Sidewinder Pro joystick he had as a kid. He purchased one on eBay, took it apart to understand how it works, and replicated its mechanism for Mira’s Raspberry Pi-powered innards.

Mira Alonso Martinez

Alonso used the smallest components he could find so that they would fit inside Mira’s tiny body.

Mira’s axis of 3D-printed parts moves via tiny Power HD DSM44 servos, while a camera and OpenCV handle face-tracking, and a single NeoPixel provides a range of colours to indicate her emotions. As for the blinking eyes? Two OLED screens boasting acrylic domes fit within the few millimeters between all the other moving parts.

More on Mira, including her history and how she works, can be found in this wonderful video released by Tested this week.

Pixar Artist’s 3D-Printed Animated Robots!

We’re gushing with grins and delight at the sight of these adorable animated robots created by artist Alonso Martinez. Sean chats with Alonso to learn how he designed and engineered his family of robots, using processes like 3D printing, mold-making, and silicone casting. They’re amazing!

You can also sign up for Alonso’s newsletter here to stay up-to-date about this little robot. Hopefully one of these newsletters will explain how to buy or build your own Mira, as I for one am desperate to see her adorable little face on my desk every day for the rest of my life.

The post Mira, tiny robot of joyful delight appeared first on Raspberry Pi.