Tag Archives: ASA

AWS IoT 1-Click – Use Simple Devices to Trigger Lambda Functions

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-iot-1-click-use-simple-devices-to-trigger-lambda-functions/

We announced a preview of AWS IoT 1-Click at AWS re:Invent 2017 and have been refining it ever since, focusing on simplicity and a clean out-of-box experience. Designed to make IoT available and accessible to a broad audience, AWS IoT 1-Click is now generally available, along with new IoT buttons from AWS and AT&T.

I sat down with the dev team a month or two ago to learn about the service so that I could start thinking about my blog post. During the meeting they gave me a pair of IoT buttons and I started to think about some creative ways to put them to use. Here are a few that I came up with:

Help Request – Earlier this month I spent a very pleasant weekend at the HackTillDawn hackathon in Los Angeles. As the participants were hacking away, they occasionally had questions about AWS, machine learning, Amazon SageMaker, and AWS DeepLens. While we had plenty of AWS Solution Architects on hand (decked out in fashionable & distinctive AWS shirts for easy identification), I imagined an IoT button for each team. Pressing the button would alert the SA crew via SMS and direct them to the proper table.

Camera Control – Tim Bray and I were in the AWS video studio, prepping for the first episode of Tim’s series on AWS Messaging. Minutes before we opened the Twitch stream I realized that we did not have a clean, unobtrusive way to ask the camera operator to switch to a closeup view. Again, I imagined that a couple of IoT buttons would allow us to make the request.

Remote Dog Treat Dispenser – My dog barks every time a stranger opens the gate in front of our house. While it is great to have confirmation that my Ring doorbell is working, I would like to be able to press a button and dispense a treat so that Luna stops barking!

Homes, offices, factories, schools, vehicles, and health care facilities can all benefit from IoT buttons and other simple IoT devices, all managed using AWS IoT 1-Click.

All About AWS IoT 1-Click
As I said earlier, we have been focusing on simplicity and a clean out-of-box experience. Here’s what that means:

Architects can dream up applications for inexpensive, low-powered devices.

Developers don’t need to write any device-level code. They can make use of pre-built actions, which send email or SMS messages, or write their own custom actions using AWS Lambda functions.

Installers don’t have to install certificates or configure cloud endpoints on newly acquired devices, and don’t have to worry about firmware updates.

Administrators can monitor the overall status and health of each device, and can arrange to receive alerts when a device nears the end of its useful life and needs to be replaced, using a single interface that spans device types and manufacturers.

I’ll show you how easy this is in just a moment. But first, let’s talk about the current set of devices that are supported by AWS IoT 1-Click.

Who’s Got the Button?
We’re launching with support for two types of buttons (both pictured above). Both types of buttons are pre-configured with X.509 certificates, communicate to the cloud over secure connections, and are ready to use.

The AWS IoT Enterprise Button communicates via Wi-Fi. It has a 2000-click lifetime, encrypts outbound data using TLS, and can be configured using BLE and our mobile app. It retails for $19.99 (shipping and handling not included) and can be used in the United States, Europe, and Japan.

The AT&T LTE-M Button communicates via the LTE-M cellular network. It has a 1500-click lifetime, and also encrypts outbound data using TLS. The device and the bundled data plan are available at an introductory price of $29.99 (shipping and handling not included), and it can be used in the United States.

We are very interested in working with device manufacturers in order to make even more shapes, sizes, and types of devices (badge readers, asset trackers, motion detectors, and industrial sensors, to name a few) available to our customers. Our team will be happy to tell you about our provisioning tools and our facility for pushing OTA (over the air) updates to large fleets of devices; you can contact them at [email protected].

AWS IoT 1-Click Concepts
I’m eager to show you how to use AWS IoT 1-Click and the buttons, but need to introduce a few concepts first.

Device – A button or other item that can send messages. Each device is uniquely identified by a serial number.

Placement Template – Describes a like-minded collection of devices to be deployed. Specifies the action to be performed and lists the names of custom attributes for each device.

Placement – A device that has been deployed. Referring to placements instead of devices gives you the freedom to replace and upgrade devices with minimal disruption. Each placement can include values for custom attributes such as a location (“Building 8, 3rd Floor, Room 1337”) or a purpose (“Coffee Request Button”).

Action – The AWS Lambda function to invoke when the button is pressed. You can write a function from scratch, or you can make use of a pair of predefined functions that send an email or an SMS message. The actions have access to the attributes; you can, for example, send an SMS message with the text “Urgent need for coffee in Building 8, 3rd Floor, Room 1337.”
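To make the Action concept concrete, here is a minimal sketch of what a custom Lambda action could look like in Python. It assumes the 1-Click event payload exposes the placement attributes under placementInfo.attributes and that the function's role is allowed to publish through Amazon SNS; the attribute names (Building, Floor, Room, phoneNumber) are illustrative, not part of the service.

import boto3

sns = boto3.client("sns")

def lambda_handler(event, context):
    # Placement attributes configured in the AWS IoT 1-Click console
    # (the attribute names below are illustrative).
    attrs = event.get("placementInfo", {}).get("attributes", {})
    message = "Urgent need for coffee in {}, {}, {}".format(
        attrs.get("Building", "unknown building"),
        attrs.get("Floor", "unknown floor"),
        attrs.get("Room", "unknown room"),
    )
    # Send the SMS; the phone number would normally come from an attribute as well.
    sns.publish(PhoneNumber=attrs.get("phoneNumber", "+15555550100"), Message=message)
    return {"status": "sent", "message": message}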

Getting Started with AWS IoT 1-Click
Let’s set up an IoT button using the AWS IoT 1-Click Console:

If I didn’t have any buttons I could click Buy devices to get some. But, I do have some, so I click Claim devices to move ahead. I enter the device ID or claim code for my AT&T button and click Claim (I can enter multiple claim codes or device IDs if I want):

The AWS buttons can be claimed using the console or the mobile app; the first step is to use the mobile app to configure the button to use my Wi-Fi:

Then I scan the barcode on the box and click the button to complete the process of claiming the device. Both of my buttons are now visible in the console:

I am now ready to put them to use. I click on Projects, and then Create a project:

I name and describe my project, and click Next to proceed:

Now I define a device template, along with names and default values for the placement attributes. Here’s how I set up a device template (projects can contain several, but I just need one):

The action has two mandatory parameters (phone number and SMS message) built in; I add three more (Building, Room, and Floor) and click Create project:

I’m almost ready to ask for some coffee! The next step is to associate my buttons with this project by creating a placement for each one. I click Create placements to proceed. I name each placement, select the device to associate with it, and then enter values for the attributes that I established for the project. I can also add additional attributes that are peculiar to this placement:

I can inspect my project and see that everything looks good:

I click on the buttons and the SMS messages appear:

I can monitor device activity in the AWS IoT 1-Click Console:

And also in the Lambda Console:

The Lambda function itself is also accessible, and can be used as-is or customized:

As you can see, this is the code that lets me use {{*}} to include all of the placement attributes in the message and {{Building}} (for example) to include a specific placement attribute.
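The substitution itself is simple string templating. As a rough illustration (not the actual code shipped with the service), it could be implemented along these lines:

import re

def render(template, attributes):
    # {{*}} expands to every placement attribute as "key: value" pairs.
    text = template.replace("{{*}}", ", ".join(f"{k}: {v}" for k, v in attributes.items()))
    # {{Name}} expands to the value of a single attribute, or to an empty string if missing.
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(attributes.get(m.group(1), "")), text)

print(render("Urgent need for coffee in {{Building}}, {{Floor}}, {{Room}}",
             {"Building": "Building 8", "Floor": "3rd Floor", "Room": "Room 1337"}))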

Now Available
I’ve barely scratched the surface of this cool new service and I encourage you to give it a try (or a click) yourself. Buy a button or two, build something cool, and let me know all about it!

Pricing is based on the number of enabled devices in your account, measured monthly and pro-rated for partial months. Devices can be enabled or disabled at any time. See the AWS IoT 1-Click Pricing page for more info.

To learn more, visit the AWS IoT 1-Click home page or read the AWS IoT 1-Click documentation.

Jeff;

 

ECtHR: Turkey restricts the freedom of political debate

Post Syndicated from nellyo original https://nellyo.wordpress.com/2018/05/13/echr_10-3/

The European Court of Human Rights (ECtHR) issued two important judgments in cases brought by two prominent journalists detained in Turkey after the attempted coup of 15 July 2016. In both cases – Mehmet Hasan Altan v. Turkey and Şahin Alpay v. Turkey – the court found a violation of the right to freedom of expression (Article 10 ECHR).

Mehmet Hasan Altan and Şahin Alpay are known for their critical views of the government's policies. The two journalists have been under arrest since the summer of 2016 and, on the basis of articles they wrote and public statements they made, are charged with attempting to change the constitutional order on behalf of a terrorist organisation. Mehmet Hasan Altan was sentenced to life imprisonment.

The ECtHR made clear that the existence of an emergency cannot serve as a pretext for restricting the freedom of political debate, which lies at the core of the concept of a democratic society. Even in a state of emergency, the states party to the Convention must make every effort to safeguard the values of a democratic society, such as pluralism, tolerance and freedom of expression.

The detention and criminal prosecution of the journalists inevitably have a chilling effect on freedom of expression, intimidating civil society and silencing dissenting voices in Turkey.

According to the ECtHR, expressed opinions that do not constitute incitement to violence, are not connected with the commission of terrorist acts and do not encourage violence must not be restricted.

The ECtHR considers that one of the principal characteristics of democracy is the possibility of resolving problems through public debate, and that democracy requires freedom of expression. Criticism of the government, and the publication of information regarded by a country's leaders as endangering national interests, should not give rise to criminal charges of terrorism or of spreading terrorist propaganda.

Violation of Article 10 of the ECHR.

Congratulations to Oracle on MySQL 8.0

Post Syndicated from Michael "Monty" Widenius original http://monty-says.blogspot.com/2018/04/congratulations-to-oracle-on-mysql-80.html

Last week, Oracle announced the general availability of MySQL 8.0. This is good news for database users, as it means Oracle is still developing MySQL.

I decided to celebrate the event by doing a quick test of MySQL 8.0. Here follows a step-by-step description of my first experience with MySQL 8.0.
Note that I did the following without reading the release notes, as I have done with every MySQL / MariaDB release to date; in this case it was not the right thing to do.

I pulled MySQL 8.0 from git@github.com:mysql/mysql-server.git
I was pleasantly surprised that 'cmake . ; make' worked without any compiler warnings! I even checked the used compiler options and noticed that MySQL was compiled with -Wall + several other warning flags. Good job MySQL team!

I did have a little trouble finding the mysqld binary, as Oracle had moved it to 'runtime_output_directory'; unexpected, but no big thing.

Now it was time to install MySQL 8.0.

I did know that MySQL 8.0 has removed mysql_install_db, so I had to use the mysqld binary directly to install the default databases:
(I have specified datadir=/my/data3 in the /tmp/my.cnf file)

> cd runtime_output_directory
> mkdir /my/data3
> ./mysqld --defaults-file=/tmp/my.cnf --install

2018-04-22T12:38:18.332967Z 1 [ERROR] [MY-011011] [Server] Failed to find valid data directory.
2018-04-22T12:38:18.333109Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2018-04-22T12:38:18.333135Z 0 [ERROR] [MY-010119] [Server] Aborting

A quick look at the mysqld --help --verbose output showed that the right command option is --initialize. My bad, let's try again:

> ./mysqld --defaults-file=/tmp/my.cnf --initialize

2018-04-22T12:39:31.910509Z 0 [ERROR] [MY-010457] [Server] --initialize specified but the data directory has files in it. Aborting.
2018-04-22T12:39:31.910578Z 0 [ERROR] [MY-010119] [Server] Aborting

Now I was using the right option, but it still didn't work.
I took a quick look around:

> ls /my/data3/
binlog.index

So even though mysqld noticed that the data3 directory was wrong, it still wrote things into it, despite the fact that I didn't have --log-binlog enabled in the my.cnf file. Strange, but easy to fix:

> rm /my/data3/binlog.index
> ./mysqld --defaults-file=/tmp/my.cnf --initialize

2018-04-22T12:40:45.633637Z 0 [ERROR] [MY-011071] [Server] unknown variable 'max-tmp-tables=100'
2018-04-22T12:40:45.633657Z 0 [Warning] [MY-010952] [Server] The privilege system failed to initialize correctly. If you have upgraded your server, make sure you're executing mysql_upgrade to correct the issue.
2018-04-22T12:40:45.633663Z 0 [ERROR] [MY-010119] [Server] Aborting

The warning about the privilege system confused me a bit, but I ignored it for the time being and removed from my configuration files the variables that MySQL 8.0 no longer supports. I couldn't find a list of the removed variables anywhere, so this was done by trial and error.

> ./mysqld --defaults-file=/tmp/my.cnf

2018-04-22T12:42:56.626583Z 0 [ERROR] [MY-010735] [Server] Can't open the mysql.plugin table. Please run mysql_upgrade to create it.
2018-04-22T12:42:56.827685Z 0 [Warning] [MY-010015] [Repl] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
2018-04-22T12:42:56.838501Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2018-04-22T12:42:56.848375Z 0 [Warning] [MY-010441] [Server] Failed to open optimizer cost constant tables
2018-04-22T12:42:56.848863Z 0 [ERROR] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we're sending the information to the error-log instead: MY-001146 - Table 'mysql.component' doesn't exist
2018-04-22T12:42:56.848916Z 0 [Warning] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we're sending the information to the error-log instead: MY-003543 - The mysql.component table is missing or has an incorrect definition.
...
2018-04-22T12:42:56.854141Z 0 [System] [MY-010931] [Server] /home/my/mysql-8.0/runtime_output_directory/mysqld: ready for connections. Version: '8.0.11' socket: '/tmp/mysql.sock' port: 3306 Source distribution.

I figured out that if there is a single wrong variable in the configuration file, running mysqld --initialize will leave the database in an inconsistent state. NOT GOOD! I am happy I didn't try this in a production system!

Time to start over from the beginning:

> rm -r /my/data3/*
> ./mysqld --defaults-file=/tmp/my.cnf --initialize

2018-04-22T12:44:45.548960Z 5 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: px)NaaSp?6um
2018-04-22T12:44:51.221751Z 0 [System] [MY-013170] [Server] /home/my/mysql-8.0/runtime_output_directory/mysqld (mysqld 8.0.11) initializing of server has completed

Success!

I wonder why the temporary password is so complex; it could have been something that one could easily remember without decreasing security, as it's temporary after all. No big deal, one can always paste it from the logs. (Side note: MariaDB uses socket authentication on many systems and thus doesn't need temporary installation passwords.)

Now lets start the MySQL server for real to do some testing:

> ./mysqld --defaults-file=/tmp/my.cnf

2018-04-22T12:45:43.683484Z 0 [System] [MY-010931] [Server] /home/my/mysql-8.0/runtime_output_directory/mysqld: ready for connections. Version: '8.0.11' socket: '/tmp/mysql.sock' port: 3306 Source distribution.

And then let's start the client:

> ./client/mysql --socket=/tmp/mysql.sock --user=root --password="px)NaaSp?6um"
ERROR 2059 (HY000): Plugin caching_sha2_password could not be loaded: /usr/local/mysql/lib/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory

Apparently MySQL 8.0 doesn’t work with old MySQL / MariaDB clients by default 🙁

I was testing this on a system with MariaDB installed, like most modern Linux systems today, and didn't want to use the MySQL clients or libraries.

I decided to try to fix this by changing the authentication to the native (original) MySQL authentication method.

> mysqld --skip-grant-tables

> ./client/mysql --socket=/tmp/mysql.sock --user=root
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)

Apparently --skip-grant-tables is not good enough anymore. Let's try again with:

> mysqld --skip-grant-tables --default_authentication_plugin=mysql_native_password

> ./client/mysql --socket=/tmp/mysql.sock --user=root mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 8.0.11 Source distribution

Great, we are getting somewhere. Now let's fix "root" to work with the old authentication:

MySQL [mysql]> update mysql.user set plugin="mysql_native_password",authentication_string=password("test") where user="root";
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '("test") where user="root"' at line 1

A quick look in the MySQL 8.0 release notes told me that the PASSWORD() function is removed in 8.0. Why???? I don't know how one is supposed to generate passwords compatible with old installations of MySQL in 8.0. One could of course start an old MySQL or MariaDB version, execute the password() function, and copy the result.
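For what it's worth, the old PASSWORD() result (the mysql_native_password format) is just a double SHA-1 hash, so it can be reproduced outside the server. A small Python sketch, assuming you only need the hash string itself:

import hashlib

def mysql_native_password(plaintext: str) -> str:
    # Same format as the removed PASSWORD() function:
    # '*' followed by the uppercase hex of SHA1(SHA1(password)).
    stage1 = hashlib.sha1(plaintext.encode("utf-8")).digest()
    return "*" + hashlib.sha1(stage1).hexdigest().upper()

print(mysql_native_password("test"))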

I decided to fix this the easy way and use an empty password:

(Update: I later discovered that the right way would have been to use: FLUSH PRIVILEGES; ALTER USER 'root'@'localhost' IDENTIFIED BY 'test'; I however dislike this syntax, as it has the password in clear text, which is easy to grab, and the command can't be used to easily update the mysql.user table. One must also disable the --skip-grant mode to use this.)

MySQL [mysql]> update mysql.user set plugin="mysql_native_password",authentication_string="" where user="root";
Query OK, 1 row affected (0.077 sec)
Rows matched: 1 Changed: 1 Warnings: 0
 
I restarted mysqld:
> mysqld --default_authentication_plugin=mysql_native_password

> ./client/mysql --user=root --password="" mysql
ERROR 1862 (HY000): Your password has expired. To log in you must change it using a client that supports expired passwords.

Ouch, I forgot that. Let's try again:

> mysqld --skip-grant-tables --default_authentication_plugin=mysql_native_password

> ./client/mysql --user=root --password="" mysql
MySQL [mysql]> update mysql.user set password_expired="N" where user="root";

Now restart and test worked:

> ./mysqld --default_authentication_plugin=mysql_native_password

> ./client/mysql --user=root --password="" mysql

Finally I had a working account that I can use to create other users!

When looking at the mysqld --help --verbose output again, I noticed the option:

--initialize-insecure
Create the default database and exit. Create a super user
with empty password.

I decided to check if this would have made things easier:

> rm -r /my/data3/*
> ./mysqld --defaults-file=/tmp/my.cnf --initialize-insecure

2018-04-22T13:18:06.629548Z 5 [Warning] [MY-010453] [Server] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.

Hm. I don't understand the warning, as --initialize-insecure is not an option that one would use more than once, and thus not something one would 'switch off'.

> ./mysqld --defaults-file=/tmp/my.cnf

> ./client/mysql --user=root --password="" mysql
ERROR 2059 (HY000): Plugin caching_sha2_password could not be loaded: /usr/local/mysql/lib/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory

Back to the beginning 🙁

To get things to work with old clients, one has to initialize the database with:
> ./mysqld --defaults-file=/tmp/my.cnf --initialize-insecure --default_authentication_plugin=mysql_native_password

Now I finally had MySQL 8.0 up and running and thought I would take it for a spin by running the "standard" MySQL/MariaDB sql-bench test suite. This was removed in MySQL 5.7, but as I happened to have MariaDB 10.3 installed, I decided to run it from there.

sql-bench is a single-threaded benchmark that measures the "raw" speed of some common operations. It gives you the 'maximum' performance for a single query. It's different from other benchmarks that measure the maximum throughput when you have a lot of users, but sql-bench still tells you a lot about what kind of performance to expect from the database.

I first tried to be clever and create the "test" database that I needed for sql-bench with
> mkdir /my/data3/test

but when I tried to run the benchmark, MySQL 8.0 complained that the test database didn’t exist.

MySQL 8.0 has moved away from the original concept of MySQL, where the user can easily create directories and copy databases into the database directory. This may have serious implications for anyone doing backups of databases and/or trying to restore a backup with normal OS commands.

I created the ‘test’ database with mysqladmin and then tried to run sql-bench:

> ./run-all-tests --user=root

The first run failed in test-ATIS:

Can't execute command 'create table class_of_service (class_code char(2) NOT NULL,rank tinyint(2) NOT NULL,class_description char(80) NOT NULL,PRIMARY KEY (class_code))'
Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'rank tinyint(2) NOT NULL,class_description char(80) NOT NULL,PRIMARY KEY (class_' at line 1

This happened because 'rank' is now a reserved word in MySQL 8.0. It is also reserved in ANSI SQL, but I don't know of any other database that has failed to run test-ATIS before. In the past I have run it against Oracle, PostgreSQL, Mimer, MSSQL, etc. without any problems.

MariaDB also has 'rank' as a keyword in 10.2 and 10.3, but one can still use it as an identifier.

I fixed test-ATIS and then managed to run all tests on MySQL 8.0.

I ran the test with both MySQL 8.0 and MariaDB 10.3 using the InnoDB storage engine, with identical values for all InnoDB variables, table-definition-cache, and table-open-cache. I turned off the performance schema for both databases. All tests were run by a user with an empty password (to keep things comparable, and because it was too complex to generate a password in MySQL 8.0).

The results are as follows.
Results per test in seconds:

Operation         |MariaDB|MySQL-8|
-----------------------------------
ATIS              | 153.00| 228.00|
alter-table       |  92.00| 792.00|
big-tables        | 990.00|2079.00|
connect           | 186.00| 227.00|
create            | 575.00|4465.00|
insert            |4552.00|8458.00|
select            | 333.00| 412.00|
table-elimination |1900.00|3916.00|
wisconsin         | 272.00| 590.00|
-----------------------------------

This is of course just a first view of the performance of MySQL 8.0 in a single user environment. Some reflections about the results:

  • The alter-table test is slower (as expected) in 8.0, as some of the alter tests benefit from the instant ADD COLUMN in MariaDB 10.3.
  • The connect test is also better for MariaDB, as we put a lot of effort into speeding this up in MariaDB 10.2.
  • table-elimination shows an optimization in MariaDB for the Anchor table model, which MySQL doesn't have.
  • CREATE and DROP TABLE are almost 8 times slower in MySQL 8.0 than in MariaDB 10.3. I assume this is the cost of 'atomic DDL'. This may also cause performance problems for any thread using the data dictionary when another thread is creating/dropping tables.
  • When looking at the individual test results, MySQL 8.0 was slower in almost every test, in many cases significantly slower.
  • The only test where MySQL was faster was "update_with_key_prefix". I checked this and noticed that there was a bug in the test: the columns were updated to their original values (which should be instant with any storage engine). This is an old bug that MySQL has found and fixed and that we had not been aware of in the test or in MariaDB.
  • While writing this, I noticed that MySQL 8.0 now uses utf8mb4 as the default character set instead of latin1. This may affect some of the benchmarks slightly (not much, as most tests work with numbers, and Oracle claims that utf8mb4 is only 20% slower than latin1), but it needs to be verified.
  • Oracle claims that MySQL 8.0 is much faster on multi-user benchmarks. The above test indicates that they may have achieved this by sacrificing single-user performance.
  • We need to do many more and different benchmarks to better understand exactly what is going on. Stay tuned!

Short summary of my first run with MySQL 8.0:

  • Using the new caching_sha2_password authentication as the default for new installations is likely to cause a lot of problems for users. No old application will be able to use MySQL 8.0, installed with default options, without moving to MySQL's client libraries. While working on this blog I saw MySQL users complain on IRC that not even MySQL Workbench can authenticate with MySQL 8.0. This is the first time in MySQL's history that such an incompatible change has been made!
  • Atomic DDL is a good thing (we plan to have this in MariaDB 10.4), but it should not have such a drastic impact on performance. I am also a bit skeptical of MySQL 8.0 having just one copy of the data dictionary, as if this gets corrupted you will lose all your data (single point of failure).
  • MySQL 8.0 has several new reserved words and has removed a lot of variables, which makes upgrades hard. Before upgrading to MySQL 8.0 one has to check all one's databases and applications to ensure that there are no conflicts.
  • As my test above shows, if you have a single deprecated variable in your configuration files, the installation of MySQL will abort and can leave the database in an inconsistent state. I did of course do my tests by installing into an empty data directory, but one can assume that some of the problems may also happen when upgrading an old installation.

Conclusions:
In many ways, MySQL 8.0 has caught up with some earlier versions of MariaDB. For instance, in MariaDB 10.0, we introduced roles (four years ago). In MariaDB 10.1, we introduced encrypted redo/undo logs (three years ago). In MariaDB 10.2, we introduced window functions and CTEs (a year ago). However, some catch-up of MariaDB Server 10.2 features still remains for MySQL (such as check constraints, binlog compression, and log-based rollback).

MySQL 8.0 has a few interesting new features (most notably atomic DDL and JSON TABLE functions), but at the same time MySQL has strayed from some of the fundamental cornerstone principles of MySQL:

From the start of the first version of MySQL in 1995, all development has been focused around 3 core principles:

  • Ease of use
  • Performance
  • Stability

With MySQL 8.0, Oracle has sacrificed two of these three.

In addition (as part of ease of use), while I was working on MySQL, we did our best to ensure that the following should hold:

  • Upgrades should be trivial
  • Things should be kept compatible, if possible (don’t remove features/options/functions that are used)
  • Minimize reserved words, don’t remove server variables
  • One should be able to use normal OS commands to create and drop databases, and to copy and move tables around within the same system or between different systems. With 8.0 and the data dictionary, taking backups of specific tables will be hard, even if the server is not running.
  • mysqldump should always be usable for backups and for moving to new releases
  • Old clients and applications should be able to use 'any' MySQL server version unchanged. (Some Oracle client libraries, like C++, by default only support the new X protocol and can thus not be used with older MySQL or any MariaDB version)

We plan to add a data dictionary to MariaDB 10.4 or MariaDB 10.5, but in a way to not sacrifice any of the above principles!

The competition between MySQL and MariaDB is not just about a tactical arms race on features. It’s about design philosophy, or strategic vision, if you will.

This shows in two main ways: our respective view of the Storage Engine structure, and of the top-level direction of the roadmap.

On the Storage Engine side, MySQL is converging on InnoDB, even for clustering and partitioning. In doing so, they are abandoning the advantages of multiple ways of storing data. By contrast, MariaDB sees lots of value in the Storage Engine architecture: MariaDB Server 10.3 will see the general availability of MyRocks (for write-intensive workloads) and Spider (for scalable workloads). On top of that, we have ColumnStore for analytical workloads. One can use the CONNECT engine to join with other databases. The use of different storage engines for different workloads and different hardware is a competitive differentiator, now more than ever.

On the roadmap side, MySQL is carefully steering clear of features that close the gap between MySQL and Oracle. MariaDB has no such constraints. With MariaDB 10.3, we are introducing PL/SQL compatibility (Oracle's stored procedures) and AS OF (built-in system-versioned tables with point-in-time querying). For both of those features, MariaDB is the first open source database to do so. I don't expect Oracle to provide any of the above features in MySQL!

Also on the roadmap side, MySQL is not working with the ecosystem to extend its functionality. In 2017, MariaDB accepted more code contributions in one year than MySQL has during its entire lifetime, and the rate is increasing!

I am sure that the experience I had testing MySQL 8.0 would have been significantly better if MySQL had an open development model, where the community could easily participate in developing and testing MySQL continuously. Most of the confusing error messages and strange behavior would have been found and fixed long before the GA release.

Before upgrading to MySQL 8.0, please read https://dev.mysql.com/doc/refman/8.0/en/upgrading-from-previous-series.html to see what problems you can run into! Don't expect old installations or applications to work out of the box without testing, as a lot of features and options have been removed (the query cache, partitioning of MyISAM tables, etc.)! You probably also have to revise your backup methods, especially if you ever want to restore just a few tables. (With 8.0, I don't know how this can be easily done.)

According to the MySQL 8.0 release notes, one can't use mysqldump to copy a database to MySQL 8.0. One has to first move to a MySQL 5.7 GA version (with mysqldump, as recommended by Oracle) and then to MySQL 8.0 with an in-place upgrade. I assume this means that all old mysqldump backups are useless for MySQL 8.0?

MySQL 8.0 seems to be a one-way street to an unknown future. Up to MySQL 5.7 it has been trivial to move to MariaDB, and one could always move back to MySQL with mysqldump. All MySQL client libraries have worked with MariaDB and all MariaDB client libraries have worked with MySQL. With MySQL 8.0 this has changed, in the wrong direction.

As long as you are using MySQL 5.7 and below you have choices for your future, after MySQL 8.0 you have very little choice. But don’t despair, as MariaDB will always be able to load a mysqldump file and it’s very easy to upgrade your old MySQL installation to MariaDB 🙂

I wish you good luck to try MySQL 8.0 (and also the upcoming MariaDB 10.3)!

Implement continuous integration and delivery of serverless AWS Glue ETL applications using AWS Developer Tools

Post Syndicated from Prasad Alle original https://aws.amazon.com/blogs/big-data/implement-continuous-integration-and-delivery-of-serverless-aws-glue-etl-applications-using-aws-developer-tools/

AWS Glue is an increasingly popular way to develop serverless ETL (extract, transform, and load) applications for big data and data lake workloads. Organizations that transform their ETL applications to cloud-based, serverless ETL architectures need a seamless, end-to-end continuous integration and continuous delivery (CI/CD) pipeline: from source code, to build, to deployment, to product delivery. Having a good CI/CD pipeline can help your organization discover bugs before they reach production and deliver updates more frequently. It can also help developers write quality code and automate the ETL job release management process, mitigate risk, and more.

AWS Glue is a fully managed data catalog and ETL service. It simplifies and automates the difficult and time-consuming tasks of data discovery, conversion, and job scheduling. AWS Glue crawls your data sources and constructs a data catalog using pre-built classifiers for popular data formats and data types, including CSV, Apache Parquet, JSON, and more.

When you are developing ETL applications using AWS Glue, you might come across some of the following CI/CD challenges:

  • Iterative development with unit tests
  • Continuous integration and build
  • Pushing the ETL pipeline to a test environment
  • Pushing the ETL pipeline to a production environment
  • Testing ETL applications using real data (live test)
  • Exploring and validating data

In this post, I walk you through a solution that implements a CI/CD pipeline for serverless AWS Glue ETL applications supported by AWS Developer Tools (including AWS CodePipeline, AWS CodeCommit, and AWS CodeBuild) and AWS CloudFormation.

Solution overview

The following diagram shows the pipeline workflow:

This solution uses AWS CodePipeline, which lets you orchestrate and automate the test and deploy stages for ETL application source code. The solution consists of a pipeline that contains the following stages:

1.) Source Control: In this stage, the AWS Glue ETL job source code and the AWS CloudFormation template file for deploying the ETL jobs are both committed to version control. I chose to use AWS CodeCommit for version control.

To get the ETL job source code and AWS CloudFormation template, download the gluedemoetl.zip file. This solution is developed based on a previous post, Build a Data Lake Foundation with AWS Glue and Amazon S3.

2.) LiveTest: In this stage, all resources—including AWS Glue crawlers, jobs, S3 buckets, roles, and other resources that are required for the solution—are provisioned, deployed, live tested, and cleaned up.

The LiveTest stage includes the following actions:

  • Deploy: In this action, all the resources that are required for this solution (crawlers, jobs, buckets, roles, and so on) are provisioned and deployed using an AWS CloudFormation template.
  • AutomatedLiveTest: In this action, all the AWS Glue crawlers and jobs are executed and data exploration and validation tests are performed. These validation tests include, but are not limited to, record counts in both raw tables and transformed tables in the data lake and any other business validations. I used AWS CodeBuild for this action.
  • LiveTestApproval: This action is included for the cases in which a pipeline administrator approval is required to deploy/promote the ETL applications to the next stage. The pipeline pauses in this action until an administrator manually approves the release.
  • LiveTestCleanup: In this action, all the LiveTest stage resources, including test crawlers, jobs, roles, and so on, are deleted using the AWS CloudFormation template. This action helps minimize cost by ensuring that the test resources exist only for the duration of the AutomatedLiveTest and LiveTestApproval actions.

3.) DeployToProduction: In this stage, all the resources are deployed using the AWS CloudFormation template to the production environment.

Try it out

This code pipeline takes approximately 20 minutes to complete the LiveTest test stage (up to the LiveTest approval stage, in which manual approval is required).

To get started with this solution, choose Launch Stack:

This creates the CI/CD pipeline with all of its stages, as described earlier. It performs an initial commit of the sample AWS Glue ETL job source code to trigger the first release change.

In the AWS CloudFormation console, choose Create. After the template finishes creating resources, you see the pipeline name on the stack Outputs tab.

After that, open the CodePipeline console and select the newly created pipeline. Initially, your pipeline’s CodeCommit stage shows that the source action failed.

Allow a few minutes for your new pipeline to detect the initial commit applied by the CloudFormation stack creation. As soon as the commit is detected, your pipeline starts. You will see the successful stage completion status as soon as the CodeCommit source stage runs.

In the CodeCommit console, choose Code in the navigation pane to view the solution files.

Next, you can watch how the pipeline goes through the LiveTest stage of the deploy and AutomatedLiveTest actions, until it finally reaches the LiveTestApproval action.

At this point, if you check the AWS CloudFormation console, you can see that a new template has been deployed as part of the LiveTest deploy action.

At this point, make sure that the AWS Glue crawlers and the AWS Glue job ran successfully. Also check whether the corresponding databases and external tables have been created in the AWS Glue Data Catalog. Then verify that the data is validated using Amazon Athena, as shown following.

Open the AWS Glue console, and choose Databases in the navigation pane. You will see the following databases in the Data Catalog:

Open the Amazon Athena console, and run the following queries. Verify that the record counts are matching.

SELECT count(*) FROM "nycitytaxi_gluedemocicdtest"."data";
SELECT count(*) FROM "nytaxiparquet_gluedemocicdtest"."datalake";
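The AutomatedLiveTest action automates this kind of record-count validation. As a rough boto3 sketch of such a check (the Athena result location is an assumption; the database and table names come from the queries above):

import time
import boto3

athena = boto3.client("athena")

def count_rows(database, table, output="s3://my-athena-results/gluedemo/"):  # result bucket is an assumption
    # Run a COUNT(*) query in Athena and return the count as an int.
    qid = athena.start_query_execution(
        QueryString=f'SELECT count(*) FROM "{database}"."{table}"',
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output},
    )["QueryExecutionId"]
    while True:
        state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(2)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Query {qid} finished in state {state}")
    rows = athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]
    return int(rows[1]["Data"][0]["VarCharValue"])  # row 0 is the header row

raw = count_rows("nycitytaxi_gluedemocicdtest", "data")
transformed = count_rows("nytaxiparquet_gluedemocicdtest", "datalake")
assert raw == transformed, f"record counts differ: {raw} != {transformed}"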

The following shows the raw data:

The following shows the transformed data:

The pipeline pauses the action until the release is approved. After validating the data, manually approve the revision on the LiveTestApproval action on the CodePipeline console.

Add comments as needed, and choose Approve.

The LiveTestApproval stage now appears as Approved on the console.

After the revision is approved, the pipeline proceeds to use the AWS CloudFormation template to destroy the resources that were deployed in the LiveTest deploy action. This helps reduce cost and ensures a clean test environment on every deployment.

Production deployment is the final stage. In this stage, all the resources—AWS Glue crawlers, AWS Glue jobs, Amazon S3 buckets, roles, and so on—are provisioned and deployed to the production environment using the AWS CloudFormation template.

After successfully running the whole pipeline, feel free to experiment with it by changing the source code stored on AWS CodeCommit. For example, if you modify the AWS Glue ETL job to generate an error, it should make the AutomatedLiveTest action fail. Or if you change the AWS CloudFormation template to make its creation fail, it should affect the LiveTest deploy action. The objective of the pipeline is to ensure that all changes deployed to production work as expected.

Conclusion

In this post, you learned how easy it is to implement CI/CD for serverless AWS Glue ETL solutions with AWS developer tools like AWS CodePipeline and AWS CodeBuild at scale. Implementing such solutions can help you accelerate ETL development and testing at your organization.

If you have questions or suggestions, please comment below.

 


Additional Reading

If you found this post useful, be sure to check out Implement Continuous Integration and Delivery of Apache Spark Applications using AWS and Build a Data Lake Foundation with AWS Glue and Amazon S3.

 


About the Authors

Prasad Alle is a Senior Big Data Consultant with AWS Professional Services. He spends his time leading and building scalable, reliable Big data, Machine learning, Artificial Intelligence and IoT solutions for AWS Enterprise and Strategic customers. His interests extend to various technologies such as Advanced Edge Computing, Machine learning at Edge. In his spare time, he enjoys spending time with his family.

 
Luis Caro is a Big Data Consultant for AWS Professional Services. He works with our customers to provide guidance and technical assistance on big data projects, helping them improve the value of their solutions when using AWS.

 

 

 

[$] Kernel lockdown in 4.17?

Post Syndicated from corbet original https://lwn.net/Articles/750730/rss

The UEFI secure boot mechanism is intended to protect the system against
persistent malware threats — unpleasant bits of software attached to the
operating system or bootloader that will survive a reboot. While Linux
has supported secure boot for some time, proponents have long said that
this support is incomplete in that it is still possible for the root user
to corrupt the system in a number of ways. Patches that attempt to
close this hole have been circulating for years, but they have been
controversial at best. This story may finally come to a close, though, if
Linus Torvalds accepts the “kernel lockdown” patch series during the 4.17
merge window.

Astro Pi upgrades launch today!

Post Syndicated from David Honess original https://www.raspberrypi.org/blog/astro-pi-upgrades-launch/

Before our beloved SpaceDave left the Raspberry Pi Foundation to join the ranks of the European Space Agency (ESA) — and no, we’re still not jealous *ahem* — he kindly drafted us one final blog post about the Astro Pi upgrades heading to the International Space Station today! So here it is. Enjoy!

We are very excited to announce that Astro Pi upgrades are on their way to the International Space Station! Back in September, we blogged about a small payload being launched to the International Space Station to upgrade the capabilities of our Astro Pi units.

Astro Pi Raspberry Pi International Space Station

Sneak peek

For the longest time, the payload was scheduled to be launched on SpaceX CRS 14 in February. However, the launch was delayed to April, which impacted the flight operations we had planned for running Mission Space Lab student experiments.

To avoid this, ESA had the payload transferred to Russian Soyuz MS-08 (54S), which is launching today to carry crew members Oleg Artemyev, Andrew Feustel, and Ricky Arnold to the ISS.

Ricky Arnold on Twitter

L-47 hours.

You can watch coverage of the launch on NASA TV from 4.30pm GMT this afternoon, with the launch scheduled for 5.44pm GMT. Check the NASA TV schedule for updates.

The upgrades

The pictures below show the flight hardware in its final configuration before loading onto the launch vehicle.

Wireless dongle in bag — Astro Pi upgrades

All access

With the wireless dongle, the Astro Pi units can be deployed in ISS locations other than the Columbus module, where they don’t have access to an Ethernet switch.

We are also sending some flexible optical filters. These are made from the same material as the blue square which is shipped with the Raspberry Pi NoIR Camera Module.

Optical filters in bag — Astro Pi upgrades

#bluefilter

So that future Astro Pi code will need to command fewer windows to download earth observation imagery to the ground, we’re also including some 32GB micro SD cards to replace the current 8GB cards.

Micro SD cards in bag — Astro Pi upgrades

More space in space

The items above are enclosed in a large 8″ ziplock bag that has been designated the “AstroPi Kit”.

bag of Astro Pi upgrades

It’s ziplock bags all the way up

Once the Soyuz docks with the ISS, this payload is one of the first which will be unpacked, so that the Astro Pi units can be upgraded and deployed ready to run your experiments!

More Astro Pi

Stay tuned for our next update in April, when student code is set to be run on the Astro Pi units as part of our Mission Space Lab programme. And to find out more about Astro Pi, head to the programme website.

The post Astro Pi upgrades launch today! appeared first on Raspberry Pi.

Tamilrockers Arrests: Police Parade Alleged Movie Pirates on TV

Post Syndicated from Andy original https://torrentfreak.com/tamilrockers-arrests-police-parade-alleged-movie-pirates-on-tv-180315/

Just two years ago around 277 million people used the Internet in India. Today there are estimates as high as 355 million and with a population of more than 1.3 billion, India has plenty of growth yet to come.

Also evident is that in addition to a thirst for hard work, many Internet-enabled Indians have developed a taste for Internet piracy. While the US and Europe were the most likely bases for pirate site operators between 2000 and 2015, India now appears in a growing number of cases, from torrent and streaming platforms to movie release groups.

One site that is clearly Indian-focused is the ever-popular Tamilrockers. The site has laughed in the face of the authorities for a number of years, skipping from domain to domain as efforts to block it descend into a chaotic game of whack-a-mole. Like The Pirate Bay, Tamilrockers has burned through plenty of domains including tamilrockers.in, tamilrockers.ac, tamilrockers.me, tamilrockers.co, tamilrockers.is, tamilrockers.us and tamilrockers.ro.

Now, however, the authorities are claiming a significant victory against the so-far elusive operators of the site. The anti-piracy cell of the Kerala police announced last evening that they’ve arrested five men said to be behind both Tamilrockers and alleged sister site, DVDRockers.

They’re named as alleged Tamilrockers owner ‘Prabhu’, plus ‘Karthi’ and ‘Suresh’ (all aged 24), plus alleged DVD Rockers owner ‘Johnson’ and ‘Jagan’ (elsewhere reported as ‘Maria John’). The men were said to be generating between US$1,500 and US$3,000 each per month. The average salary in India is around $600 per annum.

While details of how the suspects were caught tend to come later in US and European cases, the Indian authorities are more forthright. According to Anti-Piracy Cell Superintendent B.K. Prasanthan, who headed the team that apprehended the men, it was a trail of advertising revenue crumbs that led them to the suspects.

Prasanthan revealed that it was an email, sent by a Haryana-based ad company to an individual who was arrested in 2016 in a similar case, that helped in tracking the members of Tamilrockers.

“This ad company had sent a mail to [the individual], offering to publish ads on the website he was running. In that email, the company happened to mention that they have ties with Tamilrockers. We got the information about Tamilrockers through this ad company,” Prasanthan said.

That information included the bank account details of the suspects.

Given the technical nature of the sites, it’s perhaps no surprise that the suspects are qualified in the IT field. Prasanthan revealed that all had done well.

“All the gang members were technically qualified. It even included MSc and BSc holders in computer science. They used to record movies in pieces from various parts of the world and join [them together]. We are trying to trace more members of the gang including Karthi’s brothers,” Prasanathan said.

All five men were remanded in custody but not before they were paraded in front of the media, footage which later appeared on TV.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Raspberry Pi 3 Model B+ on sale now at $35

Post Syndicated from Eben Upton original https://www.raspberrypi.org/blog/raspberry-pi-3-model-bplus-sale-now-35/

Here’s a long post. We think you’ll find it interesting. If you don’t have time to read it all, we recommend you watch this video, which will fill you in with everything you need, and then head straight to the product page to fill yer boots. (We recommend the video anyway, even if you do have time for a long read. ‘Cos it’s fab.)

A BRAND-NEW PI FOR π DAY

Raspberry Pi 3 Model B+ is now on sale now for $35, featuring: – A 1.4GHz 64-bit quad-core ARM Cortex-A53 CPU – Dual-band 802.11ac wireless LAN and Bluetooth 4.2 – Faster Ethernet (Gigabit Ethernet over USB 2.0) – Power-over-Ethernet support (with separate PoE HAT) – Improved PXE network and USB mass-storage booting – Improved thermal management Alongside a 200MHz increase in peak CPU clock frequency, we have roughly three times the wired and wireless network throughput, and the ability to sustain high performance for much longer periods.

If you’ve been a Raspberry Pi watcher for a while now, you’ll have a bit of a feel for how we update our products. Just over two years ago, we released Raspberry Pi 3 Model B. This was our first 64-bit product, and our first product to feature integrated wireless connectivity. Since then, we’ve sold over nine million Raspberry Pi 3 units (we’ve sold 19 million Raspberry Pis in total), which have been put to work in schools, homes, offices and factories all over the globe.

Those Raspberry Pi watchers will know that we have a history of releasing improved versions of our products a couple of years into their lives. The first example was Raspberry Pi 1 Model B+, which added two additional USB ports, introduced our current form factor, and rolled up a variety of other feedback from the community. Raspberry Pi 2 didn’t get this treatment, of course, as it was superseded after only one year; but it feels like it’s high time that Raspberry Pi 3 received the “plus” treatment.

So, without further ado, Raspberry Pi 3 Model B+ is now on sale for $35 (the same price as the existing Raspberry Pi 3 Model B), featuring:

  • A 1.4GHz 64-bit quad-core ARM Cortex-A53 CPU
  • Dual-band 802.11ac wireless LAN and Bluetooth 4.2
  • Faster Ethernet (Gigabit Ethernet over USB 2.0)
  • Power-over-Ethernet support (with separate PoE HAT)
  • Improved PXE network and USB mass-storage booting
  • Improved thermal management

Alongside a 200MHz increase in peak CPU clock frequency, we have roughly three times the wired and wireless network throughput, and the ability to sustain high performance for much longer periods.

Behold the shiny

Raspberry Pi 3B+ is available to buy today from our network of Approved Resellers.

New features, new chips

Roger Thornton did the design work on this revision of the Raspberry Pi. Here, he and I have a chat about what’s new.

Introducing the Raspberry Pi 3 Model B+


The new product is built around BCM2837B0, an updated version of the 64-bit Broadcom application processor used in Raspberry Pi 3B, which incorporates power integrity optimisations, and a heat spreader (that’s the shiny metal bit you can see in the photos). Together these allow us to reach higher clock frequencies (or to run at lower voltages to reduce power consumption), and to more accurately monitor and control the temperature of the chip.

Dual-band wireless LAN and Bluetooth are provided by the Cypress CYW43455 “combo” chip, connected to a Proant PCB antenna similar to the one used on Raspberry Pi Zero W. Compared to its predecessor, Raspberry Pi 3B+ delivers somewhat better performance in the 2.4GHz band, and far better performance in the 5GHz band, as demonstrated by these iperf results from LibreELEC developer Milhouse.

                          | Tx bandwidth (Mb/s) | Rx bandwidth (Mb/s)
Raspberry Pi 3B           |                35.7 |                35.6
Raspberry Pi 3B+ (2.4GHz) |                46.7 |                46.3
Raspberry Pi 3B+ (5GHz)   |                 102 |                 102

The wireless circuitry is encapsulated under a metal shield, rather fetchingly embossed with our logo. This has allowed us to certify the entire board as a radio module under FCC rules, which in turn will significantly reduce the cost of conformance testing Raspberry Pi-based products.

We’ll be teaching metalwork next.

Previous Raspberry Pi devices have used the LAN951x family of chips, which combine a USB hub and 10/100 Ethernet controller. For Raspberry Pi 3B+, Microchip have supported us with an upgraded version, LAN7515, which supports Gigabit Ethernet. While the USB 2.0 connection to the application processor limits the available bandwidth, we still see roughly a threefold increase in throughput compared to Raspberry Pi 3B. Again, here are some typical iperf results.

                  | Tx bandwidth (Mb/s) | Rx bandwidth (Mb/s)
Raspberry Pi 3B   |                94.1 |                95.5
Raspberry Pi 3B+  |                 315 |                 315

We use a magjack that supports Power over Ethernet (PoE), and bring the relevant signals to a new 4-pin header. We will shortly launch a PoE HAT which can generate the 5V necessary to power the Raspberry Pi from the 48V PoE supply.

There… are… four… pins!

Coming soon to a Raspberry Pi 3B+ near you

Raspberry Pi 3B was our first product to support PXE Ethernet boot. Testing it in the wild shook out a number of compatibility issues with particular switches and traffic environments. Gordon has rolled up fixes for all known issues into the BCM2837B0 boot ROM, and PXE boot is now enabled by default.

Clocking, voltages and thermals

The improved power integrity of the BCM2837B0 package, and the improved regulation accuracy of our new MaxLinear MxL7704 power management IC, have allowed us to tune our clocking and voltage rules for both better peak performance and longer-duration sustained performance.

Below 70°C, we use the improvements to increase the core frequency to 1.4GHz. Above 70°C, we drop to 1.2GHz, and use the improvements to decrease the core voltage, increasing the period of time before we reach our 80°C thermal throttle; the reduction in power consumption is such that many use cases will never reach the throttle. Like a modern smartphone, we treat the thermal mass of the device as a resource, to be spent carefully with the goal of optimising user experience.

This graph, courtesy of Gareth Halfacree, demonstrates that Raspberry Pi 3B+ runs faster and at a lower temperature for the duration of an eight‑minute quad‑core Sysbench CPU test.

Note that Raspberry Pi 3B+ does consume substantially more power than its predecessor. We strongly encourage you to use a high-quality 2.5A power supply, such as the official Raspberry Pi Universal Power Supply.
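If you want to watch the throttling behaviour described above on your own board, a quick way is to log the SoC temperature and the current ARM clock while the CPU is under load. A minimal sketch, assuming the standard Linux sysfs paths exposed on Raspberry Pi OS:

import time

def read_int(path):
    with open(path) as f:
        return int(f.read().strip())

while True:
    # Temperature is reported in millidegrees Celsius, clock frequency in kHz.
    temp_c = read_int("/sys/class/thermal/thermal_zone0/temp") / 1000
    freq_mhz = read_int("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq") / 1000
    print("{:5.1f} C  {:6.0f} MHz".format(temp_c, freq_mhz))
    time.sleep(2)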

FAQs

We’ll keep updating this list over the next couple of days, but here are a few to get you started.

Are you discontinuing earlier Raspberry Pi models?

No. We have a lot of industrial customers who will want to stick with the existing products for the time being. We’ll keep building these models for as long as there’s demand. Raspberry Pi 1B+, Raspberry Pi 2B, and Raspberry Pi 3B will continue to sell for $25, $35, and $35 respectively.

What about Model A+?

Raspberry Pi 1A+ continues to be the $20 entry-level “big” Raspberry Pi for the time being. We are considering the possibility of producing a Raspberry Pi 3A+ in due course.

What about the Compute Module?

CM1, CM3 and CM3L will continue to be available. We may offer versions of CM3 and CM3L with BCM2837B0 in due course, depending on customer demand.

Are you still using VideoCore?

Yes. VideoCore IV 3D is the only publicly-documented 3D graphics core for ARM‑based SoCs, and we want to make Raspberry Pi more open over time, not less.

Credits

A project like this requires a vast amount of focused work from a large team over an extended period. Particular credit is due to Roger Thornton, who designed the board and ran the exhaustive (and exhausting) RF compliance campaign, and to the team at the Sony UK Technology Centre in Pencoed, South Wales. A partial list of others who made major direct contributions to the BCM2837B0 chip program, CYW43455 integration, LAN7515 and MxL7704 developments, and Raspberry Pi 3B+ itself follows:

James Adams, David Armour, Jonathan Bell, Maria Blazquez, Jamie Brogan-Shaw, Mike Buffham, Rob Campling, Cindy Cao, Victor Carmon, KK Chan, Nick Chase, Nigel Cheetham, Scott Clark, Nigel Clift, Dominic Cobley, Peter Coyle, John Cronk, Di Dai, Kurt Dennis, David Doyle, Andrew Edwards, Phil Elwell, John Ferdinand, Doug Freegard, Ian Furlong, Shawn Guo, Philip Harrison, Jason Hicks, Stefan Ho, Andrew Hoare, Gordon Hollingworth, Tuomas Hollman, EikPei Hu, James Hughes, Andy Hulbert, Anand Jain, David John, Prasanna Kerekoppa, Shaik Labeeb, Trevor Latham, Steve Le, David Lee, David Lewsey, Sherman Li, Xizhe Li, Simon Long, Fu Luo Larson, Juan Martinez, Sandhya Menon, Ben Mercer, James Mills, Max Passell, Mark Perry, Eric Phiri, Ashwin Rao, Justin Rees, James Reilly, Matt Rowley, Akshaye Sama, Ian Saturley, Serge Schneider, Manuel Sedlmair, Shawn Shadburn, Veeresh Shivashimper, Graham Smith, Ben Stephens, Mike Stimson, Yuree Tchong, Stuart Thomson, John Wadsworth, Ian Watch, Sarah Williams, Jason Zhu.

If you’re not on this list and think you should be, please let me know, and accept my apologies.

The post Raspberry Pi 3 Model B+ on sale now at $35 appeared first on Raspberry Pi.

New Amazon EC2 Spot pricing model: Simplified purchasing without bidding and fewer interruptions

Post Syndicated from Roshni Pary original https://aws.amazon.com/blogs/compute/new-amazon-ec2-spot-pricing/

Contributed by Deepthi Chelupati and Roshni Pary

Amazon EC2 Spot Instances offer spare compute capacity in the AWS Cloud at steep discounts. Customers—including Yelp, NASA JPL, FINRA, and Autodesk—use Spot Instances to reduce costs and get faster results. Spot Instances provide acceleration, scale, and deep cost savings to big data workloads, containerized applications such as web services, test/dev, and many types of HPC and batch jobs.

At re:Invent 2017, we launched a new pricing model that simplified the Spot purchasing experience. The new model gives you predictable prices that adjust slowly over days and weeks, with typical savings of 70-90% over On-Demand. With the previous pricing model, some of you had to invest time and effort to analyze historical prices to determine your bidding strategy and maximum bid price. Not anymore.

How does the new pricing model work?

You don’t have to bid for Spot Instances in the new pricing model, and you just pay the Spot price that’s in effect for the current hour for the instances that you launch. It’s that simple. Now you can request Spot capacity just like you would request On-Demand capacity, without having to spend time analyzing market prices or setting a maximum bid price.

Previously, Spot Instances were terminated in ascending order of bids, and the Spot price was set to the highest unfulfilled bid. The market prices fluctuated frequently because of this. In the new model, the Spot prices are more predictable, updated less frequently, and are determined by supply and demand for Amazon EC2 spare capacity, not bid prices. You can find the price that’s in effect for the current hour in the EC2 console.
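You can also query current and historical Spot prices from the command line. Here is a sketch using the describe-spot-price-history call; the instance type and Availability Zone are placeholders:

$ aws ec2 describe-spot-price-history --instance-types c4.2xlarge \
    --product-descriptions "Linux/UNIX" --availability-zone us-east-1a \
    --start-time $(date -u +%Y-%m-%dT%H:%M:%S)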

As you can see from the above Spot Instance Pricing History graph (available in the EC2 console under Spot Requests), Spot prices were volatile before the pricing model update. However, after the pricing model update, prices are more predictable and change less frequently.

In the new model, you still have the option to further control costs by submitting a “maximum price” that you are willing to pay in the console when you request Spot Instances:

You can also set your maximum price in EC2 RunInstances or RequestSpotFleet API calls, or in command line requests:

$ aws ec2 run-instances --instance-market-options \
    '{"MarketType":"spot", "SpotOptions": {"MaxPrice": "0.12"}}' \
    --image-id ami-1a2b3c4d --count 1 --instance-type c4.2xlarge

The default maximum price is the On-Demand price and you can continue to set a maximum Spot price of up to 10x the On-Demand price. That means, if you have been running applications on Spot Instances and use the RequestSpotInstances or RequestSpotFleet operations, you can continue to do so. The new Spot pricing model is backward compatible and you do not need to make any changes to your existing applications.

Fewer interruptions

Spot Instances receive a two-minute interruption notice when they are about to be reclaimed, because EC2 needs the capacity back. We have significantly reduced interruptions with the new pricing model. Now instances are not interrupted because of higher competing bids, and you can enjoy longer workload runtimes. The typical frequency of interruption for Spot Instances in the last 30 days was less than 5% on average.

To reduce the impact of interruptions and optimize Spot Instances, diversify and run your application across multiple capacity pools. Each instance family, each instance size, in each Availability Zone, in every Region is a separate Spot pool. You can use the RequestSpotFleet API operation to launch thousands of Spot Instances and diversify resources automatically. To further reduce the impact of interruptions, you can also set up Spot Instances and Spot Fleets to respond to an interruption notice by stopping or hibernating rather than terminating instances when capacity is no longer available.
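Here is a rough sketch of such a diversified request; the AMI ID, subnet IDs, and IAM fleet role ARN are placeholders, and stopping on interruption assumes EBS-backed instances:

$ aws ec2 request-spot-fleet --spot-fleet-request-config '{
    "IamFleetRole": "arn:aws:iam::123456789012:role/aws-ec2-spot-fleet-role",
    "TargetCapacity": 10,
    "AllocationStrategy": "lowestPrice",
    "InstanceInterruptionBehavior": "stop",
    "LaunchSpecifications": [
      {"ImageId": "ami-1a2b3c4d", "InstanceType": "c4.2xlarge", "SubnetId": "subnet-11111111"},
      {"ImageId": "ami-1a2b3c4d", "InstanceType": "m4.2xlarge", "SubnetId": "subnet-22222222"}
    ]
  }'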

Spot Instances are now available in 18 Regions and 51 Availability Zones, and offer 100s of instance options. We have eliminated bidding, simplified the pricing model, and have made it easy to get started with Amazon EC2 Spot Instances for you to take advantage of the largest pool of cost-effective compute capacity in the world. See the Spot Instances detail page for more information and create your Spot Instance here.

Best Practices for Running Apache Kafka on AWS

Post Syndicated from Prasad Alle original https://aws.amazon.com/blogs/big-data/best-practices-for-running-apache-kafka-on-aws/

This post was written in partnership with Intuit to share learnings, best practices, and recommendations for running an Apache Kafka cluster on AWS. Thanks to Vaishak Suresh and his colleagues at Intuit for their contribution and support.

Intuit, in their own words: Intuit, a leading enterprise customer for AWS, is a creator of business and financial management solutions. For more information on how Intuit partners with AWS, see our previous blog post, Real-time Stream Processing Using Apache Spark Streaming and Apache Kafka on AWS.

Apache Kafka is an open-source, distributed streaming platform that enables you to build real-time streaming applications.

The best practices described in this post are based on our experience in running and operating large-scale Kafka clusters on AWS for more than two years. Our intent for this post is to help AWS customers who are currently running Kafka on AWS, and also customers who are considering migrating on-premises Kafka deployments to AWS.

AWS offers Amazon Kinesis Data Streams, a Kafka alternative that is fully managed.

Running your Kafka deployment on Amazon EC2 provides a high performance, scalable solution for ingesting streaming data. AWS offers many different instance types and storage option combinations for Kafka deployments. However, given the number of possible deployment topologies, it’s not always trivial to select the most appropriate strategy suitable for your use case.

In this blog post, we cover the following aspects of running Kafka clusters on AWS:

  • Deployment considerations and patterns
  • Storage options
  • Instance types
  • Networking
  • Upgrades
  • Performance tuning
  • Monitoring
  • Security
  • Backup and restore

Note: While implementing Kafka clusters in a production environment, make sure also to consider factors like your number of messages, message size, monitoring, failure handling, and any operational issues.

Deployment considerations and patterns

In this section, we discuss various deployment options available for Kafka on AWS, along with pros and cons of each option. A successful deployment starts with thoughtful consideration of these options. Considering availability, consistency, and operational overhead of the deployment helps when choosing the right option.

Single AWS Region, Three Availability Zones, All Active

One typical deployment pattern (all active) is in a single AWS Region with three Availability Zones (AZs). One Kafka cluster is deployed in each AZ along with Apache ZooKeeper and Kafka producer and consumer instances as shown in the illustration following.

In this pattern, this is the Kafka cluster deployment:

  • Kafka producers and Kafka cluster are deployed on each AZ.
  • Data is distributed evenly across three Kafka clusters by using Elastic Load Balancer.
  • Kafka consumers aggregate data from all three Kafka clusters.

Kafka cluster failover occurs this way:

  • Mark down all Kafka producers
  • Stop consumers
  • Debug and restack Kafka
  • Restart consumers
  • Restart Kafka producers

Following are the pros and cons of this pattern.

Pros:
  • Highly available
  • Can sustain the failure of two AZs
  • No message loss during failover
  • Simple deployment

Cons:
  • Very high operational overhead:
    • All changes need to be deployed three times, one for each Kafka cluster
    • Maintaining and monitoring three Kafka clusters
    • Maintaining and monitoring three consumer clusters
A restart is required for patching and upgrading brokers in a Kafka cluster. In this approach, a rolling upgrade is done separately for each cluster.

Single Region, Three Availability Zones, Active-Standby

Another typical deployment pattern (active-standby) is in a single AWS Region with a single Kafka cluster and Kafka brokers and Zookeepers distributed across three AZs. Another similar Kafka cluster acts as a standby as shown in the illustration following. You can use Kafka mirroring with MirrorMaker to replicate messages between any two clusters.
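As a sketch of the mirroring piece, MirrorMaker is started with a consumer configuration pointing at the source cluster and a producer configuration pointing at the standby cluster; the file names and topic pattern below are placeholders:

$ kafka-mirror-maker.sh --consumer.config source-cluster-consumer.properties \
    --producer.config standby-cluster-producer.properties --whitelist ".*"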

In this pattern, this is the Kafka cluster deployment:

  • Kafka producers are deployed on all three AZs.
  • Only one Kafka cluster is deployed across three AZs (active).
  • ZooKeeper instances are deployed on each AZ.
  • Brokers are spread evenly across all three AZs.
  • Kafka consumers can be deployed across all three AZs.
  • Standby Kafka producers and a Multi-AZ Kafka cluster are part of the deployment.

Kafka cluster failover occurs this way:

  • Switch traffic to standby Kafka producers cluster and Kafka cluster.
  • Restart consumers to consume from standby Kafka cluster.

Following are the pros and cons of this pattern.

Pros:
  • Less operational overhead when compared to the first option
  • Only one Kafka cluster to manage and consume data from
  • Can handle single AZ failures without activating a standby Kafka cluster

Cons:
  • Added latency due to cross-AZ data transfer among Kafka brokers
  • For Kafka versions before 0.10, replicas for topic partitions have to be assigned so they’re distributed to the brokers on different AZs (rack-awareness)
  • The cluster can become unavailable in case of a network glitch, where ZooKeeper does not see Kafka brokers
  • Possibility of in-transit message loss during failover

Intuit recommends using a single Kafka cluster in one AWS Region, with brokers distributed across three AZs (single Region, three AZs). This approach offers strong fault tolerance, because the failure of a single AZ won’t cause Kafka downtime.
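For the rack-awareness point above, each broker can advertise its Availability Zone as its rack ID. A minimal sketch, assuming Kafka 0.10 or later and a broker configuration file at /etc/kafka/server.properties:

$ echo "broker.rack=us-east-1a" >> /etc/kafka/server.properties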

Storage options

There are two storage options for file storage in Amazon EC2: ephemeral instance store and Amazon Elastic Block Store (Amazon EBS).

Ephemeral storage is local to the Amazon EC2 instance. It can provide high IOPS based on the instance type. On the other hand, Amazon EBS volumes offer higher resiliency and you can configure IOPS based on your storage needs. EBS volumes also offer some distinct advantages in terms of recovery time. Your choice of storage is closely related to the type of workload supported by your Kafka cluster.

Kafka provides built-in fault tolerance by replicating data partitions across a configurable number of instances. If a broker fails, you can recover it by fetching all the data from other brokers in the cluster that host the other replicas. Depending on the size of the data transfer, this can affect the recovery process and network traffic, which in turn affect the cluster’s performance.

The following table contrasts the benefits of using an instance store versus using EBS for storage.

Instance store:
  • Instance storage is recommended for large- and medium-sized Kafka clusters. For a large cluster, read/write traffic is distributed across a high number of brokers, so the loss of a broker has less of an impact. However, for smaller clusters, a quick recovery for the failed node is important; recovering a failed broker takes longer and requires more network traffic in a smaller Kafka cluster.
  • Storage-optimized instances like h1, i3, and d2 are an ideal choice for distributed applications like Kafka.

EBS:
  • The primary advantage of using EBS in a Kafka deployment is that it significantly reduces data-transfer traffic when a broker fails or must be replaced. The replacement broker joins the cluster much faster.
  • Data stored on EBS is persisted in case of an instance failure or termination. The broker’s data stored on an EBS volume remains intact, and you can mount the EBS volume to a new EC2 instance. Most of the replicated data for the replacement broker is already available in the EBS volume and need not be copied over the network from another broker. Only the changes made after the original broker failure need to be transferred across the network. That makes this process much faster.

Intuit chose EBS because of their frequent instance restacking requirements and also other benefits provided by EBS.

Generally, Kafka deployments use a replication factor of three. Because Amazon EBS performs its own replication within the service, Intuit chose a replication factor of two instead of three.
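The replication factor is set per topic at creation time. A sketch using Kafka’s own tooling; the ZooKeeper host and topic name are placeholders:

$ kafka-topics.sh --create --zookeeper zk1.example.com:2181 \
    --replication-factor 2 --partitions 12 --topic clickstream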

Instance types

The choice of instance types is generally driven by the type of storage required for your streaming applications on a Kafka cluster. If your application requires ephemeral storage, h1, i3, and d2 instances are your best option.

Intuit used r3.xlarge instances for their brokers and r3.large for ZooKeeper, with ST1 (throughput optimized HDD) EBS for their Kafka cluster.

Here are sample benchmark numbers from Intuit tests.

Configuration: r3.xlarge brokers with ST1 EBS volumes, 12 brokers, 12 partitions

Aggregate broker throughput: 346.9 MB/s

If you need EBS storage, then AWS has the newer-generation R4 instance. The R4 instance is superior to R3 in many ways:

  • It has a faster processor (Broadwell).
  • EBS is optimized by default.
  • It features networking based on Elastic Network Adapter (ENA), with up to 10 Gbps on smaller sizes.
  • It costs 20 percent less than R3.

Note: It’s always best practice to check for the latest changes in instance types.

Networking

The network plays a very important role in a distributed system like Kafka. A fast and reliable network ensures that nodes can communicate with each other easily. The available network throughput controls the maximum amount of traffic that Kafka can handle. Network throughput, combined with disk storage, is often the governing factor for cluster sizing.

If you expect your cluster to receive high read/write traffic, select an instance type that offers 10-Gb/s performance.

In addition, choose an option that keeps interbroker network traffic on the private subnet, because this approach allows clients to connect to the brokers. Communication between brokers and clients uses the same network interface and port. For more details, see the documentation about IP addressing for EC2 instances.

If you are deploying in more than one AWS Region, you can connect the two VPCs in the two AWS Regions using cross-region VPC peering. However, be aware of the networking costs associated with cross-AZ deployments.

Upgrades

Kafka has a history of not being backward compatible, but its support of backward compatibility is getting better. During a Kafka upgrade, you should keep your producer and consumer clients on a version equal to or lower than the version you are upgrading from. After the upgrade is finished, you can start using a new protocol version and any new features it supports. There are three upgrade approaches available, discussed following.

Rolling or in-place upgrade

In a rolling or in-place upgrade scenario, upgrade one Kafka broker at a time. Take into consideration the recommendations for doing rolling restarts to avoid downtime for end users.

Downtime upgrade

If you can afford the downtime, you can take your entire cluster down, upgrade each Kafka broker, and then restart the cluster.

Blue/green upgrade

Intuit followed the blue/green deployment model for their workloads, as described following.

If you can afford to create a separate Kafka cluster and upgrade it, we highly recommend the blue/green upgrade scenario. In this scenario, we recommend that you keep your clusters up-to-date with the latest Kafka version. For additional details on Kafka version upgrades, see the Kafka upgrade documentation.

The following illustration shows a blue/green upgrade.

In this scenario, the upgrade plan works like this:

  • Create a new Kafka cluster on AWS.
  • Create a new Kafka producers stack to point to the new Kafka cluster.
  • Create topics on the new Kafka cluster.
  • Test the green deployment end to end (sanity check).
  • Using Amazon Route 53, change the new Kafka producers stack on AWS to point to the new green Kafka environment that you have created.

The roll-back plan works like this:

  • Switch Amazon Route 53 to the old Kafka producers stack on AWS to point to the old Kafka environment.

For additional details on blue/green deployment architecture using Kafka, see the re:Invent presentation Leveraging the Cloud with a Blue-Green Deployment Architecture.

Performance tuning

You can tune Kafka performance in multiple dimensions. Following are some best practices for performance tuning.

 These are some general performance tuning techniques:

  • If throughput is less than network capacity, try the following:
    • Add more threads
    • Increase batch size
    • Add more producer instances
    • Add more partitions
  • To improve latency when acks=-1, increase your num.replica.fetchers value.
  • For cross-AZ data transfer, tune your buffer settings for sockets and for OS TCP.
  • Make sure that num.io.threads is greater than the number of disks dedicated to Kafka.
  • Adjust num.network.threads based on the number of producers plus the number of consumers plus the replication factor (a broker configuration sketch follows this list).
  • Your message size affects your network bandwidth. To get higher performance from a Kafka cluster, select an instance type that offers 10 Gb/s performance.
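The broker-side settings mentioned above live in server.properties. The sketch below uses placeholder values and an assumed file location; treat them as a starting point to benchmark against, not as recommendations:

$ cat >> /etc/kafka/server.properties <<'EOF'
num.replica.fetchers=4
num.io.threads=8
num.network.threads=5
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
EOF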

For Java and JVM tuning, try the following:

  • Minimize GC pauses by using the Oracle JDK, which includes the G1 garbage-first collector.
  • Try to keep the Kafka heap size below 4 GB (see the sketch after this list).
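Kafka’s startup scripts read the heap and GC settings from environment variables. A sketch that mirrors the guidance above; the installation path and flag values are assumptions:

$ export KAFKA_HEAP_OPTS="-Xms4g -Xmx4g"
$ export KAFKA_JVM_PERFORMANCE_OPTS="-XX:+UseG1GC -XX:MaxGCPauseMillis=20"
$ /opt/kafka/bin/kafka-server-start.sh /etc/kafka/server.properties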

Monitoring

Knowing whether a Kafka cluster is working correctly in a production environment is critical. Sometimes, just knowing that the cluster is up is enough, but Kafka applications have many moving parts to monitor. In fact, it can easily become confusing to understand what’s important to watch and what you can set aside. Items to monitor range from simple metrics about the overall rate of traffic, to producers, consumers, brokers, controller, ZooKeeper, topics, partitions, messages, and so on.

For monitoring, Intuit used several tools, including New Relic, Wavefront, Amazon CloudWatch, and AWS CloudTrail. Our recommended monitoring approach follows.

For system metrics, we recommend that you monitor:

  • CPU load
  • Network metrics
  • File handle usage
  • Disk space
  • Disk I/O performance
  • Garbage collection
  • ZooKeeper

For producers, we recommend that you monitor:

  • Batch-size-avg
  • Compression-rate-avg
  • Waiting-threads
  • Buffer-available-bytes
  • Record-queue-time-max
  • Record-send-rate
  • Records-per-request-avg

For consumers, we recommend that you monitor:

  • Records-lag-max
  • Bytes-consumed-rate
  • Records-consumed-rate
  • Fetch-rate
  • Fetch-latency-avg
  • Fetch-latency-max

Security

Like most distributed systems, Kafka provides the mechanisms to transfer data with relatively high security across the components involved. Depending on your setup, security might involve different services such as encryption, Kerberos, Transport Layer Security (TLS) certificates, and advanced access control list (ACL) setup in brokers and ZooKeeper. The following tells you more about the Intuit approach. For details on Kafka security not covered in this section, see the Kafka documentation.

Encryption at rest

For EBS-backed EC2 instances, you can enable encryption at rest by using Amazon EBS volumes with encryption enabled. Amazon EBS uses AWS Key Management Service (AWS KMS) for encryption. For more details, see Amazon EBS Encryption in the EBS documentation. For instance store–backed EC2 instances, you can enable encryption at rest by using Amazon EC2 instance store encryption.
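For example, you can create an encrypted data volume for a broker up front; the size, volume type, Availability Zone, and KMS key alias below are placeholders:

$ aws ec2 create-volume --size 500 --volume-type st1 \
    --availability-zone us-east-1a --encrypted --kms-key-id alias/kafka-data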

Encryption in transit

Kafka supports Transport Layer Security (TLS) for encrypting client and internode communications.
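A minimal sketch of the broker-side settings involved; the keystore paths and passwords are placeholders, and client-side properties are needed as well (see the Kafka security documentation for the full procedure):

$ cat >> /etc/kafka/server.properties <<'EOF'
listeners=SSL://0.0.0.0:9093
security.inter.broker.protocol=SSL
ssl.keystore.location=/var/private/ssl/kafka.broker.keystore.jks
ssl.keystore.password=changeit
ssl.truststore.location=/var/private/ssl/kafka.broker.truststore.jks
ssl.truststore.password=changeit
EOF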

Authentication

Authentication of connections to brokers, whether from clients (producers and consumers), from other brokers, or from tools, uses either Secure Sockets Layer (SSL) or Simple Authentication and Security Layer (SASL).

Kafka supports Kerberos authentication. If you already have a Kerberos server, you can add Kafka to your current configuration.

Authorization

In Kafka, authorization is pluggable and integration with external authorization services is supported.

Backup and restore

The type of storage used in your deployment dictates your backup and restore strategy.

The best way to back up a Kafka cluster based on instance storage is to set up a second cluster and replicate messages using MirrorMaker. Kafka’s mirroring feature makes it possible to maintain a replica of an existing Kafka cluster. Depending on your setup and requirements, your backup cluster might be in the same AWS Region as your main cluster or in a different one.

For EBS-based deployments, you can enable automatic snapshots of EBS volumes to back up volumes. You can easily create new EBS volumes from these snapshots to restore. We recommend storing backup files in Amazon S3.
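For example, a broker’s data volume can be snapshotted on demand from the command line; the volume ID is a placeholder:

$ aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
    --description "kafka-broker-1 data volume backup"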

For more information on how to back up in Kafka, see the Kafka documentation.

Conclusion

In this post, we discussed several patterns for running Kafka in the AWS Cloud. AWS also provides Amazon Kinesis Data Streams as a managed alternative: there are no servers to manage or scaling cliffs to worry about, you can scale the size of your streaming pipeline in seconds without downtime, data replication across Availability Zones is automatic, and you get security out of the box. Kinesis Data Streams integrates tightly with a wide variety of AWS services, such as Lambda, Amazon Redshift, and Amazon Elasticsearch Service, and it supports open-source frameworks such as Storm, Spark, and Flink. If you need to bridge the two systems, you can use the Kafka-Kinesis Connector.

If you have questions or suggestions, please comment below.


Additional Reading

If you found this post useful, be sure to check out Implement Serverless Log Analytics Using Amazon Kinesis Analytics and Real-time Clickstream Anomaly Detection with Amazon Kinesis Analytics.


About the Author

Prasad Alle is a Senior Big Data Consultant with AWS Professional Services. He spends his time leading and building scalable, reliable Big data, Machine learning, Artificial Intelligence and IoT solutions for AWS Enterprise and Strategic customers. His interests extend to various technologies such as Advanced Edge Computing, Machine learning at Edge. In his spare time, he enjoys spending time with his family.


Best Practices for Running Apache Cassandra on Amazon EC2

Post Syndicated from Prasad Alle original https://aws.amazon.com/blogs/big-data/best-practices-for-running-apache-cassandra-on-amazon-ec2/

Apache Cassandra is a commonly used, high performance NoSQL database. AWS customers that currently maintain Cassandra on-premises may want to take advantage of the scalability, reliability, security, and economic benefits of running Cassandra on Amazon EC2.

Amazon EC2 and Amazon Elastic Block Store (Amazon EBS) provide secure, resizable compute capacity and storage in the AWS Cloud. When combined, you can deploy Cassandra, allowing you to scale capacity according to your requirements. Given the number of possible deployment topologies, it’s not always trivial to select the most appropriate strategy suitable for your use case.

In this post, we outline three Cassandra deployment options, as well as provide guidance about determining the best practices for your use case in the following areas:

  • Cassandra resource overview
  • Deployment considerations
  • Storage options
  • Networking
  • High availability and resiliency
  • Maintenance
  • Security

Before we jump into best practices for running Cassandra on AWS, we should mention that we have many customers who decided to use DynamoDB instead of managing their own Cassandra cluster. DynamoDB is fully managed, serverless, and provides multi-master cross-region replication, encryption at rest, and managed backup and restore. Integration with AWS Identity and Access Management (IAM) enables DynamoDB customers to implement fine-grained access control for their data security needs.

Several customers who have been using large Cassandra clusters for many years have moved to DynamoDB to eliminate the complications of administering Cassandra clusters and maintaining high availability and durability themselves. Gumgum.com is one customer who migrated to DynamoDB and observed significant savings. For more information, see Moving to Amazon DynamoDB from Hosted Cassandra: A Leap Towards 60% Cost Saving per Year.

AWS provides options, so you’re covered whether you want to run your own NoSQL Cassandra database, or move to a fully managed, serverless DynamoDB database.

Cassandra resource overview

Here’s a short introduction to standard Cassandra resources and how they are implemented with AWS infrastructure. If you’re already familiar with Cassandra or AWS deployments, this can serve as a refresher.

Resource: Cluster
  • Cassandra: A single Cassandra deployment. This typically consists of multiple physical locations, keyspaces, and physical servers.
  • AWS: A logical deployment construct in AWS that maps to an AWS CloudFormation StackSet, which consists of one or many CloudFormation stacks to deploy Cassandra.

Resource: Datacenter
  • Cassandra: A group of nodes configured as a single replication group.
  • AWS: A logical deployment construct in AWS. A datacenter is deployed with a single CloudFormation stack consisting of Amazon EC2 instances, networking, storage, and security resources.

Resource: Rack
  • Cassandra: A collection of servers. A datacenter consists of at least one rack, and Cassandra tries to place the replicas on different racks.
  • AWS: A single Availability Zone.

Resource: Server/node
  • Cassandra: A physical or virtual machine running Cassandra software.
  • AWS: An EC2 instance.

Resource: Token
  • Cassandra: Conceptually, the data managed by a cluster is represented as a ring. The ring is divided into ranges equal to the number of nodes, with each node responsible for one or more ranges of the data. Each node is assigned a token, which is essentially a random number from the range; the token value determines the node’s position in the ring and its range of data.
  • AWS: Managed within Cassandra.

Resource: Virtual node (vnode)
  • Cassandra: Responsible for storing a range of data. Each vnode receives one token in the ring. A cluster (by default) consists of 256 tokens, which are uniformly distributed across all servers in the Cassandra datacenter.
  • AWS: Managed within Cassandra.

Resource: Replication factor
  • Cassandra: The total number of replicas across the cluster.
  • AWS: Managed within Cassandra.

Deployment considerations

One of the many benefits of deploying Cassandra on Amazon EC2 is that you can automate many deployment tasks. In addition, AWS includes services, such as CloudFormation, that allow you to describe and provision all your infrastructure resources in your cloud environment.

We recommend orchestrating each Cassandra ring with one CloudFormation template. If you are deploying in multiple AWS Regions, you can use a CloudFormation StackSet to manage those stacks. All the maintenance actions (scaling, upgrading, and backing up) should be scripted with an AWS SDK. These may live as standalone AWS Lambda functions that can be invoked on demand during maintenance.
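For example, each ring’s stack could be created and later updated from a script; the stack name and template file are placeholders:

$ aws cloudformation create-stack --stack-name cassandra-ring-us-east-1 \
    --template-body file://cassandra-datacenter.yaml --capabilities CAPABILITY_IAM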

You can get started by following the Cassandra Quick Start deployment guide. Keep in mind that this guide does not address the requirements to operate a production deployment and should be used only for learning more about Cassandra.

Deployment patterns

In this section, we discuss various deployment options available for Cassandra in Amazon EC2. A successful deployment starts with thoughtful consideration of these options. Consider the amount of data, network environment, throughput, and availability.

  • Single AWS Region, 3 Availability Zones
  • Active-active, multi-Region
  • Active-standby, multi-Region

Single region, 3 Availability Zones

In this pattern, you deploy the Cassandra cluster in one AWS Region and three Availability Zones. There is only one ring in the cluster. By using EC2 instances in three zones, you ensure that the replicas are distributed uniformly in all zones.

To ensure the even distribution of data across all Availability Zones, we recommend that you distribute the EC2 instances evenly in all three Availability Zones. The number of EC2 instances in the cluster is a multiple of three (the replication factor).

This pattern is suitable in situations where the application is deployed in one Region, or where the deployment must be constrained to a single Region because of data privacy or other legal requirements.

Pros:
●     Highly available, can sustain failure of one Availability Zone.
●     Simple deployment.

Cons:
●     Does not protect in a situation when many of the resources in a Region are experiencing intermittent failure.

Active-active, multi-Region

In this pattern, you deploy two rings in two different Regions and link them. The VPCs in the two Regions are peered so that data can be replicated between two rings.

We recommend that the two rings in the two Regions be identical in nature, having the same number of nodes, instance types, and storage configuration.

This pattern is most suitable when the applications using the Cassandra cluster are deployed in more than one Region.

Pros:
●     No data loss during failover.
●     Highly available; can sustain a situation when many of the resources in a Region are experiencing intermittent failures.
●     Read/write traffic can be localized to the closest Region for the user, for lower latency and higher performance.

Cons:
●     High operational overhead.
●     The second Region effectively doubles the cost.

Active-standby, multi-region

In this pattern, you deploy two rings in two different Regions and link them. The VPCs in the two Regions are peered so that data can be replicated between two rings.

However, the second Region does not receive traffic from the applications. It only functions as a secondary location for disaster recovery reasons. If the primary Region is not available, the second Region receives traffic.

We recommend that the two rings in the two Regions be identical in nature, having the same number of nodes, instance types, and storage configuration.

This pattern is most suitable when the applications using the Cassandra cluster require low recovery point objective (RPO) and recovery time objective (RTO).

Pros:
●     No data loss during failover.
●     Highly available, can sustain failure or partitioning of one whole Region.

Cons:
●     High operational overhead.
●     High latency for writes for eventual consistency.
●     The second Region effectively doubles the cost.

Storage options

In on-premises deployments, Cassandra uses local disks to store data. There are two storage options for EC2 instances: ephemeral instance store and Amazon EBS volumes.

Your choice of storage is closely related to the type of workload supported by the Cassandra cluster. Instance store works best for most general purpose Cassandra deployments. However, in certain read-heavy clusters, Amazon EBS is a better choice.

The choice of instance type is generally driven by the type of storage:

  • If ephemeral storage is required for your application, a storage-optimized (I3) instance is the best option.
  • If your workload requires Amazon EBS, it is best to go with compute-optimized (C5) instances.
  • Burstable instance types (T2) don’t offer good performance for Cassandra deployments.

Instance store

Ephemeral storage is local to the EC2 instance. It can provide high input/output operations per second (IOPS) based on the instance type. An SSD-based instance store can support up to 3.3M IOPS on I3 instances. This high performance makes it an ideal choice for transactional or write-intensive applications such as Cassandra.

In general, instance storage is recommended for transactional, large, and medium-size Cassandra clusters. For a large cluster, read/write traffic is distributed across a higher number of nodes, so the loss of one node has less of an impact. However, for smaller clusters, a quick recovery for the failed node is important.

As an example, for a cluster with 100 nodes, the loss of 1 node means a 3.33% loss of capacity (with a replication factor of 3). Similarly, for a cluster with 10 nodes, the loss of 1 node means a 33% loss of capacity (with a replication factor of 3).

Ephemeral storage vs. Amazon EBS:

  • IOPS (translates to higher query performance)
    • Ephemeral storage: up to 3.3M on I3.
    • Amazon EBS: 80K per instance; 10K per gp2 volume; 32K per io1 volume.
    • Comments: Higher IOPS results in higher query performance on each host. However, Cassandra scales well horizontally, so in general we recommend scaling horizontally first, then scaling vertically to mitigate specific issues. Note: 3.3M IOPS is observed with 100% random reads with a 4-KB block size on Amazon Linux.

  • AWS instance types
    • Ephemeral storage: I3.
    • Amazon EBS: compute optimized, C5.
    • Comments: Being able to choose between different instance types is an advantage in terms of CPU, memory, etc., for horizontal and vertical scaling.

  • Backup/recovery
    • Ephemeral storage: custom.
    • Amazon EBS: basic building blocks are available from AWS.
    • Comments: Amazon EBS offers a distinct advantage here; it takes only a small engineering effort to establish a backup/restore strategy. a) In case of an instance failure, the EBS volumes from the failing instance are attached to a new instance. b) In case of an EBS volume failure, the data is restored by creating a new EBS volume from the last snapshot.

Amazon EBS

EBS volumes offer higher resiliency, and IOPS can be configured based on your storage needs. EBS volumes also offer some distinct advantages in terms of recovery time. They can support up to 32K IOPS per volume and up to 80K IOPS per instance in RAID configuration. They have an annualized failure rate (AFR) of 0.1–0.2%, which makes EBS volumes 20 times more reliable than typical commodity disk drives.

The primary advantage of using Amazon EBS in a Cassandra deployment is that it reduces data-transfer traffic significantly when a node fails or must be replaced. The replacement node joins the cluster much faster. However, Amazon EBS could be more expensive, depending on your data storage needs.

Cassandra has built-in fault tolerance: it replicates data partitions across a configurable number of nodes. It can not only withstand node failures but, if a node fails, it can also recover by copying data from other replicas into a new node. Depending on your application, this could mean copying tens of gigabytes of data. This adds additional delay to the recovery process, increases network traffic, and could possibly impact the performance of the Cassandra cluster during recovery.

Data stored on Amazon EBS is persisted in case of an instance failure or termination. The node’s data stored on an EBS volume remains intact and the EBS volume can be mounted to a new EC2 instance. Most of the replicated data for the replacement node is already available in the EBS volume and won’t need to be copied over the network from another node. Only the changes made after the original node failed need to be transferred across the network. That makes this process much faster.

EBS volumes are snapshotted periodically. So, if a volume fails, a new volume can be created from the last known good snapshot and be attached to a new instance. This is faster than creating a new volume and copying all the data to it.

Most Cassandra deployments use a replication factor of three. However, Amazon EBS does its own replication under the covers for fault tolerance. In practice, EBS volumes are about 20 times more reliable than typical disk drives. So, it is possible to go with a replication factor of two. This not only saves cost, but also enables deployments in a region that has two Availability Zones.

EBS volumes are recommended in case of read-heavy, small clusters (fewer nodes) that require storage of a large amount of data. Keep in mind that the Amazon EBS provisioned IOPS could get expensive. General purpose EBS volumes work best when sized for required performance.

Networking

If your cluster is expected to receive high read/write traffic, select an instance type that offers 10–Gb/s performance. As an example, i3.8xlarge and c5.9xlarge both offer 10–Gb/s networking performance. A smaller instance type in the same family leads to a relatively lower networking throughput.

Cassandra generates a universal unique identifier (UUID) for each node, based on the IP address of the instance. This UUID is used for distributing vnodes on the ring.

In an AWS deployment, an IP address is assigned automatically when an EC2 instance is created, so a replacement instance comes up with a new IP address. With the new IP address, the data distribution changes and the whole ring has to be rebalanced. This is not desirable.

To preserve the assigned IP address, use a secondary elastic network interface with a fixed IP address. Before swapping an EC2 instance with a new one, detach the secondary network interface from the old instance and attach it to the new one. This way, the UUID remains same and there is no change in the way that data is distributed in the cluster.
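A sketch of that swap using the AWS CLI; the attachment, interface, and instance IDs are placeholders:

$ aws ec2 detach-network-interface --attachment-id eni-attach-0123456789abcdef0
$ aws ec2 attach-network-interface --network-interface-id eni-0123456789abcdef0 \
    --instance-id i-0123456789abcdef0 --device-index 1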

If you are deploying in more than one region, you can connect the two VPCs in two regions using cross-region VPC peering.

High availability and resiliency

Cassandra is designed to be fault-tolerant and highly available during multiple node failures. In the patterns described earlier in this post, you deploy Cassandra to three Availability Zones with a replication factor of three. Even though it limits the AWS Region choices to the Regions with three or more Availability Zones, it offers protection for the cases of one-zone failure and network partitioning within a single Region. The multi-Region deployments described earlier in this post protect when many of the resources in a Region are experiencing intermittent failure.

Resiliency is ensured through infrastructure automation. The deployment patterns all require a quick replacement of the failing nodes. In the case of a regionwide failure, when you deploy with the multi-Region option, traffic can be directed to the other active Region while the infrastructure is recovering in the failing Region. In the case of unforeseen data corruption, the standby cluster can be restored with point-in-time backups stored in Amazon S3.

Maintenance

In this section, we look at ways to ensure that your Cassandra cluster is healthy:

  • Scaling
  • Upgrades
  • Backup and restore

Scaling

Cassandra is horizontally scaled by adding more instances to the ring. We recommend doubling the number of nodes in a cluster to scale up in one scale operation. This leaves the data homogeneously distributed across Availability Zones. Similarly, when scaling down, it’s best to halve the number of instances to keep the data homogeneously distributed.

Cassandra is vertically scaled by increasing the compute power of each node. Larger instance types have proportionally bigger memory. Use deployment automation to swap instances for bigger instances without downtime or data loss.

Upgrades

All three types of upgrades (Cassandra, operating system patching, and instance type changes) follow the same rolling upgrade pattern.

In this process, you start with a new EC2 instance and install software and patches on it. Thereafter, remove one node from the ring. For more information, see Cassandra cluster Rolling upgrade. Then, you detach the secondary network interface from one of the EC2 instances in the ring and attach it to the new EC2 instance. Restart the Cassandra service and wait for it to sync. Repeat this process for all nodes in the cluster.
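A sketch of the per-node steps; the host names are placeholders, and the service commands assume a package-based install. nodetool drain flushes memtables and stops the node from accepting new writes before it is taken out of service:

$ ssh old-node 'nodetool drain && sudo service cassandra stop'
# detach the secondary network interface from old-node and attach it to new-node
$ ssh new-node 'sudo service cassandra start && nodetool status'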

Backup and restore

Your backup and restore strategy is dependent on the type of storage used in the deployment. Cassandra supports snapshots and incremental backups. When using instance store, a file-based backup tool works best. Customers use rsync or other third-party products to copy data backups from the instance to long-term storage. For more information, see Backing up and restoring data in the DataStax documentation. This process has to be repeated for all instances in the cluster for a complete backup. These backup files are copied back to new instances to restore. We recommend using S3 to durably store backup files for long-term storage.
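A sketch of a per-node backup for an instance-store deployment; the keyspace name, data path, and S3 bucket are placeholders, and the sync copies the whole keyspace directory including its snapshots:

$ nodetool snapshot my_keyspace
$ aws s3 sync /var/lib/cassandra/data/my_keyspace/ \
    s3://my-cassandra-backups/$(hostname)/$(date +%F)/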

For Amazon EBS based deployments, you can enable automated snapshots of EBS volumes to back up volumes. New EBS volumes can be easily created from these snapshots for restoration.

Security

We recommend that you think about security in all aspects of deployment. The first step is to ensure that the data is encrypted at rest and in transit. The second step is to restrict access to unauthorized users. For more information about security, see the Cassandra documentation.

Encryption at rest

Encryption at rest can be achieved by using EBS volumes with encryption enabled. Amazon EBS uses AWS KMS for encryption. For more information, see Amazon EBS Encryption.

Instance store–based deployments require using an encrypted file system or an AWS partner solution. If you are using DataStax Enterprise, it supports transparent data encryption.

Encryption in transit

Cassandra uses Transport Layer Security (TLS) for client and internode communications.

Authentication

The security mechanism is pluggable, which means that you can easily swap out one authentication method for another. You can also provide your own method of authenticating to Cassandra, such as a Kerberos ticket, or if you want to store passwords in a different location, such as an LDAP directory.

Authorization

The authorizer that’s plugged in by default is org.apache.cassandra.auth.AllowAllAuthorizer. Cassandra also provides a role-based access control (RBAC) capability, which allows you to create roles and assign permissions to these roles.
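A minimal sketch of enabling the built-in authentication and RBAC authorization; the cassandra.yaml path assumes a package-based install, and every node needs the change plus a restart:

$ sudo sed -i 's/^authenticator:.*/authenticator: PasswordAuthenticator/' /etc/cassandra/cassandra.yaml
$ sudo sed -i 's/^authorizer:.*/authorizer: CassandraAuthorizer/' /etc/cassandra/cassandra.yaml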

Conclusion

In this post, we discussed several patterns for running Cassandra in the AWS Cloud and how you can manage Cassandra databases running on Amazon EC2. AWS also provides managed offerings for a number of databases. To learn more, see Purpose-built databases for all your application needs.

If you have questions or suggestions, please comment below.


Additional Reading

If you found this post useful, be sure to check out Analyze Your Data on Amazon DynamoDB with Apache Spark and Analysis of Top-N DynamoDB Objects using Amazon Athena and Amazon QuickSight.


About the Authors

Prasad Alle is a Senior Big Data Consultant with AWS Professional Services. He spends his time leading and building scalable, reliable Big data, Machine learning, Artificial Intelligence and IoT solutions for AWS Enterprise and Strategic customers. His interests extend to various technologies such as Advanced Edge Computing, Machine learning at Edge. In his spare time, he enjoys spending time with his family.


Provanshu Dey is a Senior IoT Consultant with AWS Professional Services. He works on highly scalable and reliable IoT, data and machine learning solutions with our customers. In his spare time, he enjoys spending time with his family and tinkering with electronics & gadgets.


On the Security of Walls

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/02/on_the_security.html

Interesting history of the security of walls:

Dún Aonghasa presents early evidence of the same principles of redundant security measures at work in 13th century castles, 17th century star-shaped artillery fortifications, and even “defense in depth” security architecture promoted today by the National Institute of Standards and Technology, the Nuclear Regulatory Commission, and countless other security organizations world-wide.

Security advances throughout the centuries have been mostly technical adjustments in response to evolving weaponry. Fortification — the art and science of protecting a place by imposing a barrier between you and an enemy — is as ancient as humanity. From the standpoint of theory, however, there is very little about modern network or airport security that could not be learned from a 17th century artillery manual. That should trouble us more than it does.

Fortification depends on walls as a demarcation between attacker and defender. The very first priority action listed in the 2017 National Security Strategy states: “We will secure our borders through the construction of a border wall, the use of multilayered defenses and advanced technology, the employment of additional personnel, and other measures.” The National Security Strategy, as well as the executive order just preceding it, are just formal language to describe the recurrent and popular idea of a grand border wall as a central tool of strategic security. There’s been a lot said about the costs of the wall. But, as the American finger hovers over the Hadrian’s Wall 2.0 button, whether or not a wall will actually improve national security depends a lot on how walls work, but moreso, how they fail.

Lots more at the link.

HackSpace magazine 4: the wearables issue

Post Syndicated from Andrew Gregory original https://www.raspberrypi.org/blog/hackspace-4-wearables/

Big things are afoot in the world of HackSpace magazine! This month we’re running our first special issue, with wearables projects throughout the magazine. Moreover, we’re giving away our first subscription gift free to all 12-month print subscribers. Lastly, and most importantly, we’ve made the cover EXTRA SHINY!

HackSpace magazine issue 4 cover

Prepare your eyeballs — it’s HackSpace magazine issue 4!

Wearables

In this issue, we’re taking an in-depth look at wearable tech. Not Fitbits or Apple Watches — we’re talking stuff you can make yourself, from projects that take a couple of hours to put together, to the huge, inspiring builds that are bringing technology to the runway. If you like wearing clothes and you like using your brain to make things better, then you’ll love this feature.

We’re continuing our obsession with Nixie tubes, with the brilliant Time-To-Go-Clock – Trump edition. This ingenious bit of kit uses obsolete Russian electronics to count down the time until the end of the 45th president’s term in office. However, you can also program it to tell the time left to any predictable event, such as the deadline for your tax return or essay submission, or the date England gets knocked out of the World Cup.


We’re also talking to Dr Lucy Rogers — NASA alumna, Robot Wars judge, and fellow of the Institution of Mechanical Engineers — about the difference between making as a hobby and as a job, and about why we need the Guild of Makers. Plus, issue 4 has a teeny boat, the most beautiful Raspberry Pi cases you’ve ever seen, and it explores the results of what happens when you put a bunch of hardware hackers together in a French chateau — sacré bleu!

Tutorials

As always, we’ve got more how-tos than you can shake a soldering iron at. Fittingly for the current climate here in the UK, there’s a hot water monitor, which shows you how long you have before your morning shower turns cold, and an Internet of Tea project to summon a cuppa from your kettle via the web. Perhaps not so fittingly, there’s also an ESP8266 project for monitoring a solar power station online. Readers in the southern hemisphere, we’ll leave that one for you — we haven’t seen the sun here for months!

And there’s more!

We’re super happy to say that all our 12-month print subscribers have been sent an Adafruit Circuit Playground Express with this new issue:

Adafruit Circuit Playground Express HackSpace

This gadget was developed primarily with wearables in mind and comes with all sorts of in-built functionality, so subscribers can get cracking with their latest wearable project today! If you’re not a 12-month print subscriber, you’ll miss out, so subscribe here to get your magazine and your device,  and let us know what you’ll make.

The post HackSpace magazine 4: the wearables issue appeared first on Raspberry Pi.

Early Challenges: Making Critical Hires

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/early-challenges-making-critical-hires/

row of potential employee hires sitting waiting for an interview

In 2009, Google disclosed that they had 400 recruiters on staff working to hire nearly 10,000 people. Someday, that might be your challenge, but most companies in their early days are looking to hire a handful of people — the right people — each year. Assuming you are closer to startup stage than Google stage, let’s look at who you need to hire, when to hire them, where to find them (and how to help them find you), and how to get them to join your company.

Who Should Be Your First Hires

In later stage companies, the roles in the company have been well fleshed out, don’t change often, and each role can be segmented to focus on a specific area. A large company may have an entire department focused on just cubicle layout; at a smaller company you may not have a single person whose actual job encompasses all of facilities. At Backblaze, our CTO has a passion and knack for facilities and mostly led that charge. Also, the needs of a smaller company are quick to change. One of our first hires was a QA person, Sean, who ended up being 100% focused on data center infrastructure. In the early stage, things can shift quite a bit and you need people that are broadly capable, flexible, and most of all willing to pitch in where needed.

That said, there are times you may need an expert. At a previous company we hired Jon, a PhD in Bayesian statistics, because we needed algorithmic analysis for spam fighting. However, even that person was not only able and willing to do the math, but also code, and to not only focus on Bayesian statistics but explore a plethora of spam fighting options.

When To Hire

If you’ve raised a lot of cash and are willing to burn it with mistakes, you can guess at all the roles you might need and start hiring for them. No judgement: that’s a reasonable strategy if you’re cash-rich and time-poor.

If your cash is limited, try to see what you and your team are already doing and then hire people to take those jobs. It may sound counterintuitive, but if you’re already doing it presumably it needs to be done, you have a good sense of the type of skills required to do it, and you can bring someone on-board and get them up to speed quickly. That then frees you up to focus on tasks that can’t be done by someone else. At Backblaze, I ran marketing internally for years before hiring a VP of Marketing, making it easier for me to know what we needed. Once I was hiring, my primary goal was to find someone I could trust to take that role completely off of me so I could focus solely on my CEO duties.

Where To Find the Right People

Finding great people is always difficult, particularly when the skillsets you’re looking for are highly in-demand by larger companies with lots of cash and cachet. You, however, have one massive advantage: you need to hire 5 people, not 5,000.

People You Worked With

The absolutely best people to hire are ones you’ve worked with before that you already know are good in a work situation. Consider your last job, the one before, and the one before that. A significant number of the people we recruited at Backblaze came from our previous startup MailFrontier. We knew what they could do and how they would fit into the culture, and they knew us and thus could quickly meld into the environment. If you didn’t have a previous job, consider people you went to school with or perhaps individuals with whom you’ve done projects previously.

People You Know

Hiring friends, family, and others can be risky, but should be considered. Sometimes a friend can be a “great buddy,” but is not able to do the job or isn’t a good fit for the organization. Having to let go of someone who is a friend or family member can be rough. Have the conversation up front with them about that possibility, so you have the ability to stay friends if the position doesn’t work out. Having said that, if you get along with someone as a friend, that’s one critical component of succeeding together at work. At Backblaze we’ve hired a number of people successfully that were friends of someone in the organization.

Friends Of People You Know

Your network is likely larger than you imagine. Your employees, investors, advisors, spouses, friends, and other folks all know people who might be a great fit for you. Make sure they know the roles you’re hiring for and ask them if they know anyone that would fit. Search LinkedIn for the titles you’re looking for and see who comes up; if they’re a 2nd degree connection, ask your connection for an introduction.

People You Know About

Sometimes the person you want isn’t someone anyone knows, but you may have read something they wrote, used a product they’ve built, or seen a video of a presentation they gave. Reach out. You may get a great hire: worst case, you’ll let them know they were appreciated, and make them aware of your organization.

Other Places to Find People

There are a million other places to find people, including job sites, community groups, Facebook/Twitter, GitHub, and more. Consider where the people you’re looking for are likely to congregate online and in person.

A Comment on Diversity

Hiring “People You Know” can often result in “Hiring People Like You” with the same workplace experiences, culture, background, and perceptions. Some studies have shown [1, 2, 3, 4] that homogeneous groups deliver faster, while heterogeneous groups are more creative. Also, “Hiring People Like You” often propagates the lack of women and minorities in tech and leadership positions in general. When looking for people you know, keep an eye to not discount people you know who don’t have the same cultural background as you.

Helping People To Find You

Reaching out proactively to people is the most direct way to find someone, but you want potential hires coming to you as well. To do this, they have to a) be aware of you, b) know you have a role they’re interested in, and c) think they would want to work there. Let’s tackle a) and b) first below.

Your Blog

I started writing our blog before we launched the product and talked about anything I found interesting related to our space. For several years now our team has owned the content on the blog and in 2017 over 1.5 million people read it. Each time we have a position open it’s published to the blog. If someone finds reading about backup and storage interesting, perhaps they’d want to dig in deeper from the inside. Many of the people we’ve recruited have mentioned reading the blog as either how they found us or as a factor in why they wanted to work here.
[BTW, this is Gleb’s 200th post on Backblaze’s blog. The first was in 2008. — Editor]

Your Email List

In addition to the emails our blog subscribers receive, we send regular emails to our customers, partners, and prospects. These are largely focused on content we think is directly useful or interesting for them. However, once every few months we include a small mention that we’re hiring, and the positions we’re looking for. Often a small blurb is all you need to capture people’s imaginations whether they might find the jobs interesting or can think of someone that might fit the bill.

Your Social Involvement

Whether it’s Twitter or Facebook, Hacker News or Slashdot, your potential hires are engaging in various communities. Being socially involved helps make people aware of you, reminds them of you when they’re considering a job, and paints a picture of what working with you and your company would be like. Adam was in a Reddit thread where we were discussing our Storage Pods, and that interaction was ultimately part of the reason he left Apple to come to Backblaze.

Convincing People To Join

Once you’ve found someone or they’ve found you, how do you convince them to join? They may be currently employed, have other offers, or have to relocate. Again, while the biggest companies have a number of advantages, you might have more unique advantages than you realize.

Why Should They Join You

Here are a set of items that you may be able to offer which larger organizations might not:

Role: Consider the strengths of the role. Perhaps it will have broader scope? More visibility at the executive level? No micromanagement? Ability to take risks? Option to create their own role?

Compensation: In addition to salary, will their options potentially be worth more since they’re getting in early? Can they trade off salary for more options? Do they get option refreshes?

Benefits: In addition to healthcare, food, and 401(k) plans, are there unique benefits of your company? One company I knew took the entire team for a one-month working retreat abroad each year.

Location: Most people prefer to work close to home. If you’re located outside of the San Francisco Bay Area, you might be at a disadvantage for not being in the heart of tech. But if you find employees close to you, you’ve got a huge advantage. Sometimes it’s micro; even in the Bay Area, a difference of 5 miles can save 20 minutes each way, every day. We located the Backblaze headquarters in San Mateo, a middle ground that made it accessible to those coming from both San Jose and San Francisco. We also chose a downtown location near a train, restaurants, and cafes, all to make commuting easier and more pleasant. Also, are you flexible in letting your employees work remotely? Our systems administrator Elliott is about to embark on a long-term cross-country journey, working from an RV.

Environment: Open office, cubicle, cafe, work-from-home? Loud/quiet? Social or focused? 24×7 or work-life balance? Different environments appeal to different people.

Team: Who will they be working with? A company with 100,000 people might have 100 brilliant ones you’d want to work with, but ultimately we work with our core team. Who will your prospective hires be working with?

Market: Some people are passionate about gaming, others biotech, still others food. The market you’re targeting will get different people excited.

Product: Have an amazing product people love? Highlight that. If you’re lucky, your potential hire is already a fan.

Mission: Curing cancer, making people happy, and other company missions inspire people to strive to be part of the journey. Our mission is to make storing data astonishingly easy and low-cost. If you care about data, information, knowledge, and progress, our mission helps drive all of them.

Culture: I left this for last, but believe it’s the most important. What is the culture of your company? Finding people who want to work in the culture of your organization is critical. If they like the culture, they’ll fit and continue it. We’ve worked hard to build a culture that’s collaborative, friendly, supportive, and open; one in which people like coming to work. For example, the five founders started with (and still have) the same compensation and equity. That started a culture of “we’re all in this together.” Build a culture that will attract the people you want, and convey what the culture is.

Writing The Job Description

Most job descriptions focus on all the requirements the candidate must meet. While those are important to communicate, the job description should first sell the job. Why would the right candidate want it? Then share some of the requirements you think are critical. Remember that people read not just what you say but how you say it. Try to write in a way that conveys what it is like to actually be at the company. Ahin, our VP of Marketing, said the job description itself was one of the things that attracted him to the company.

Orchestrating Interviews

Much can be said about interviewing well. I’m just going to say this: make sure that everyone who is interviewing knows that their job is not only to evaluate the candidate, but also to give them a sense of the culture and sell them on the company. At Backblaze, we often have one person interview core prospects solely for company/culture fit.

Onboarding

Hiring success shouldn’t be defined by finding and hiring the right person, but instead by the right person being successful and happy within the organization. Ensure someone (usually their manager) provides them guidance on what they should be concentrating on during their first day, first week, and thereafter. Giving new employees opportunities and guidance so that they can achieve early wins and feel socially integrated into the company does wonders for bringing people on board smoothly.

In Closing

Our Director of Production Systems, Chris, said to me the other day that he looks for companies where he can work on “interesting problems with nice people.” I’m hoping you’ll find your own version of that and find this post useful in looking for your early and critical hires.

Of course, I’d be remiss if I didn’t say, if you know of anyone looking for a place with “interesting problems with nice people,” Backblaze is hiring. 😉

The post Early Challenges: Making Critical Hires appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Astro Pi celebrates anniversary of ISS Columbus module

Post Syndicated from David Honess original https://www.raspberrypi.org/blog/astro-pi-celebrates-anniversary/

Right now, 400km above the Earth aboard the International Space Station, are two very special Raspberry Pi computers. They were launched into space on 6 December 2015 and are, most assuredly, the farthest-travelled Raspberry Pi computers in existence. Each year they run experiments that school students create in the European Astro Pi Challenge.

Raspberry Astro Pi units on the International Space Station

Left: Astro Pi Vis (Ed); right: Astro Pi IR (Izzy). Image credit: ESA.

The European Columbus module

Today marks the tenth anniversary of the launch of the European Columbus module. The Columbus module is the European Space Agency’s largest single contribution to the ISS, and it supports research in many scientific disciplines, from astrobiology and solar science to metallurgy and psychology. More than 225 experiments have been carried out inside it during the past decade. It’s also home to our Astro Pi computers.

Here’s a video from 7 February 2008, when Space Shuttle Atlantis went skywards carrying the Columbus module in its cargo bay.

Video: STS-122 Launch NASA TV Coverage, showing the 7 February 2008 launch with liftoff at 2:45:30 p.m. ET.

Today, coincidentally, is also the deadline for the European Astro Pi Challenge: Mission Space Lab. Participating teams have until midnight tonight to submit their experiments.

Anniversary celebrations

At 16:30 GMT today there will be a live event on NASA TV for the Columbus module anniversary with NASA flight engineers Joe Acaba and Mark Vande Hei.

Our Astro Pi computers will be joining in the celebrations by displaying a digital birthday candle that the crew can blow out. It works by detecting an increase in humidity when someone blows on it. The video below demonstrates the concept.

Video: AstroPi candle demonstration, uploaded by Effi Edmonton on 17 January 2018.

Do try this at home

The exact Astro Pi code that will run on the ISS today is available for you to download and run on your own Raspberry Pi and Sense HAT. You’ll notice that the program includes code to make it stop automatically when the date changes to 8 February. This is just to save time for the ground control team.

If you have a Raspberry Pi and a Sense HAT, you can use the terminal commands below to download and run the code yourself:

wget http://rpf.io/colbday -O birthday.py   # download the candle program and save it as birthday.py
chmod +x birthday.py                        # make the script executable
./birthday.py                               # run it (requires an attached Sense HAT)

When you see a blank blue screen with the brightness increasing, the Sense HAT is measuring the baseline humidity. It does this every 15 minutes so it can recalibrate to take account of natural changes in background humidity. A humidity increase of 2% is needed to blow out the candle, so if the background humidity changes by more than 2% in 15 minutes, it’s possible to get a false positive. Press Ctrl + C to quit.
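If you’d like to see the core idea before (or instead of) downloading the full program, here is a minimal sketch of the detection logic, assuming a Sense HAT is attached. The 2% threshold and 15-minute recalibration mirror the description above; everything else, including the names and the plain blue “candle” screen, is illustrative rather than the actual flight code.

# A minimal sketch of the candle logic (illustrative, not the flight code):
# take a humidity baseline, refresh it every 15 minutes, and treat a rise
# of 2 percentage points or more as someone blowing on the Sense HAT.
from sense_hat import SenseHat
import time

sense = SenseHat()
BLOW_THRESHOLD = 2.0            # rise in % relative humidity that counts as a blow
RECALIBRATION_PERIOD = 15 * 60  # seconds between fresh baseline readings

def wait_for_blow():
    while True:
        baseline = sense.get_humidity()            # take a fresh baseline reading
        deadline = time.time() + RECALIBRATION_PERIOD
        while time.time() < deadline:
            if sense.get_humidity() - baseline >= BLOW_THRESHOLD:
                return                             # someone blew on the sensor
            time.sleep(0.5)
        # 15 minutes have passed: loop round and recalibrate the baseline

sense.clear(0, 0, 255)   # stand-in for the candle graphic: a plain blue screen
wait_for_blow()
sense.clear()            # "blow out" the candle by switching the LEDs off

The real program also draws a proper candle animation and, as mentioned above, exits automatically once the date rolls over to 8 February.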

Please tweet pictures of your candles to @astro_pi – we might share yours! And if we’re lucky, we might catch a glimpse of the candle on the ISS during the NASA TV event at 16:30 GMT today.

The post Astro Pi celebrates anniversary of ISS Columbus module appeared first on Raspberry Pi.

All-In on Unlimited Backup

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/all-in-on-unlimited-backup/

chips on computer with cloud backup

The cloud backup industry has seen its share of turmoil. BitCasa, Dell DataSafe, Xdrive, and a dozen others have closed up shop. Mozy, Amazon, and Microsoft offered, but later canceled, their unlimited offerings. Recently, CrashPlan for Home customers were notified that their service was being end-of-lifed. And today we’ve heard from Carbonite customers frustrated by this morning’s announcement of a price increase.

We believe that the fundamental goal of a cloud backup is having peace-of-mind: knowing your data — all of it — is safe. For over 10 years Backblaze has been providing that peace-of-mind by offering completely unlimited cloud backup to our customers. And we continue to be committed to that. Knowing that your cloud backup vendor is not going to disappear or fundamentally change their service is an essential element in achieving that peace-of-mind.

Committed to Unlimited Backup

When Mozy discontinued their unlimited backup on Jan 31, 2011, a lot of people asked, “Does this mean Backblaze will discontinue theirs as well?” At that time I wrote the blog post Backblaze is committed to unlimited backup. That was seven years ago. Since then we’ve continued to make Backblaze cloud backup better: dramatically speeding up backups and restores, offering the unique and very popular Restore Return Refund program, enabling direct access and sharing of any file in your backup, and more. We also introduced Backblaze Groups to enable businesses and families to manage backups — all at no additional cost.

How That’s Possible

I’d like to answer the question, “How have you been able to do this when others haven’t?”

First, commitment. It’s not impossible to offer unlimited cloud backup, but it’s not easy. The Backblaze team has been committed to unlimited as a core tenet.

Second, we have pursued the technical, business, and cultural steps required to make it happen. We’ve designed our own servers, written our cloud storage software, run our own operations, and been continually focused on every place we could optimize a penny out of the cost of storage. We’ve built a culture at Backblaze that cares deeply about that.

Ensuring Peace-of-Mind

Price increases and plan changes happen in our industry, but Backblaze has consistently been the low price leader, and continues to stand by the foundational element of our service — truly unlimited backup storage. Carbonite just announced a price increase from $60 to $72/year, and while that’s not an astronomical increase, it’s important to keep in mind the service that they are providing at that rate. The basic Carbonite plan provides a service that doesn’t back up videos or external hard drives by default. We think that’s dangerous. No one wants to discover that their videos weren’t backed up after their computer dies, or have to worry about the safety and durability of their data. That is why we have continued to build on our foundation of unlimited, as well as making our service faster and more accessible. All of these serve the goal of ensuring peace-of-mind for our customers.

3 Months Free For You & A Friend

As part of our commitment to unlimited, you can refer your friends to receive three months of Backblaze service; the offer runs through March 15, 2018. When you Refer-a-Friend with your personal referral link and they subscribe, both of you will receive three months of service added to your accounts. See promotion details on our Refer-a-Friend page.

Want A Reminder When Your Carbonite Subscription Runs Out?

If you’re considering switching from Carbonite, we’d love to be your new backup provider. Enter your email and the date you’d like to be reminded in the form below and you’ll get a friendly reminder email from us to start a new backup plan with Backblaze. Or, you could start a free trial today.

We think you’ll be glad you switched, and you’ll have a chance to experience some of that Backblaze peace-of-mind for your data.

Please Send Me a Reminder When I Need a New Backup Provider

The post All-In on Unlimited Backup appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Astro Pi Mission Zero: your code is in space

Post Syndicated from David Honess original https://www.raspberrypi.org/blog/astro-pi-mission-zero-day/

Every school year, we run the European Astro Pi challenge to find the next generation of space scientists who will program two space-hardened Raspberry Pi units, called Astro Pis, living aboard the International Space Station.

Italian ESA Astronaut Paolo Nespoli with the Astro Pi units. Image credit: ESA.

Astro Pi Mission Zero

The 2017–2018 challenge included the brand-new non-competitive Mission Zero, which guaranteed that participants could have their code run on the ISS for 30 seconds, provided they followed the rules. They would also get a certificate showing the exact time period during which their code ran in space.

Astro Pi Mission Zero logo

We asked participants to write a simple Python program to display a personalised message and the air temperature on the Astro Pi screen. No special hardware was needed, since all the code could be written in a web browser using the Sense HAT emulator developed in partnership with Trinket.
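To give a flavour of what this produced, here is an illustrative sketch of a Mission Zero-style program (the greeting and colours are invented, not an official entry); it should run on a real Sense HAT and, in the same form, in the browser-based emulator.

# An illustrative Mission Zero-style program (not an official entry):
# scroll a personalised greeting plus the current air temperature.
from sense_hat import SenseHat

sense = SenseHat()
temperature = round(sense.get_temperature(), 1)   # air temperature in degrees Celsius
sense.show_message(
    "Hello, ISS crew! Temp: {} C".format(temperature),
    scroll_speed=0.08,             # lower value means faster scrolling
    text_colour=[255, 255, 0],     # yellow text on the default black background
)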

Scott McKenzie on Twitter

Students coding #astropi emulator to scroll a message to astronauts on @Raspberry_Pi in space this summer. Try it here: https://t.co/0KURq11X0L #Rm9Parents #CSforAll #ontariocodes

And now it’s time…

We received over 2500 entries for Mission Zero, and we’re excited to announce that tomorrow all entries with flight status will be run on the ISS…in SPAAACE!

There are 1771 Python programs with flight status, which will run back-to-back on Astro Pi VIS (Ed). The whole process will take about 14 hours. This means that everyone will get a timestamp showing 1 February, so we’re going to call this day Mission Zero Day!
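As a rough check on that figure, 1771 entries at 30 seconds each comes to about 53,000 seconds, or somewhere around 14 to 15 hours of back-to-back runtime.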

Part of each team’s certificate will be a map, like the one below, showing the exact location of the ISS while the team’s code was running.

The grey line is the ISS orbital path; the red marker shows the ISS’s location while the team’s code was running. Produced using the Google Static Maps API.

The programs will be run in the same sequence in which we received them. For operational reasons, we can’t guarantee that they will run while the ISS flies over any particular location. However, if you have submitted an entry to Mission Zero, there is a chance that your code will run while the ISS is right overhead!

Go out and spot the station

Spotting the ISS is a great activity to do by yourself or with your students. The station looks like a very fast-moving star that crosses the sky in just a few minutes. If you know when and where to look, and it’s not cloudy, you literally can’t miss it.

Source: Andreas Möller, Wikimedia Commons.

The ISS passes over most ground locations about twice a day. For it to be clearly visible, though, you need darkness on the ground while the station, thanks to its altitude, is still lit by the sun. There are a number of websites which can tell you when these visible passes occur, such as NASA’s Spot the Station. Each of the sites requires you to give your location so it can work out when visible passes will occur near you.

Visible ISS pass star chart from Heavens Above, on which familiar constellations such as the Plough (see label Ursa Major) can be seen.

A personal favourite of mine is Heavens Above. It’s slightly more fiddly to use than other sites, but it produces brilliant star charts that show you precisely where to look in the sky. This is how it works:

  1. Go to www.heavens-above.com
  2. To set your location, click on Unspecified in the top right-hand corner
  3. Enter your location (e.g. Cambridge, United Kingdom) into the text box and click Search
  4. The map should change to the correct location — scroll down and click Update
  5. You’ll be taken back to the homepage, but with your location showing at the top right
  6. Click on ISS in the Satellites section
  7. A table of dates will now show, which are the upcoming visible passes for your location
  8. Click on a row to view the star chart for that pass — the line is the path of the ISS, and the arrow shows direction of travel
  9. Be outside in cloudless weather at the start time, look towards the direction where the line begins, and hope the skies stay clear

If you go out and do this, then tweet some pictures to @raspberry_pi, @astro_pi, and @esa. Good luck!

More Astro Pi

Mission Zero certificates will be arriving in participants’ inboxes shortly. We would like to thank everyone who participated in Mission Zero this school year, and we hope that next time you’ll take it one step further and try Mission Space Lab.

Mission Zero and Mission Space Lab are two really exciting programmes that young people of all ages can take part in. If you would like to be notified when the next round of Astro Pi opens for registrations, sign up to our mailing list here.

The post Astro Pi Mission Zero: your code is in space appeared first on Raspberry Pi.