
Spooky Halloween Video Contest

Post Syndicated from Yev original https://www.backblaze.com/blog/spooky-halloween-video-contest/

Would You Like to Play a Game? Let’s make a scary movie, or at least a silly one.

Think you can create a really spooky Halloween video?

We’re giving out $100 Visa gift cards just in time for the holidays. Want a chance to win? You’ll need to make a spooky 30-second Halloween-themed video. We had a lot of fun with this the last time we did it a few years back, so we’re doing it again this year.

Here’s How to Enter

  1. Prepare a short video (30 seconds or less) recreating your favorite horror movie scene using your computer or hard drive as the victim — or make something original!
  2. Insert the following image at the end of the video (right-click and save as):
    [Image: Backblaze cloud backup]
  3. Upload your video to YouTube
  4. Post a link to your video on the Backblaze Facebook wall or on Twitter with the hashtag #Backblaze so we can see it and enter it into the contest. Or, link to it in the comments below!
  5. Share your video with friends

Common Questions
Q: How many people can be in the video?
A: However many you need in order to recreate the scene!
Q: Can I make it longer than 30 seconds?
A: Maybe 32 seconds, but that’s it. If you want to make a longer “director’s cut,” we’d love to see it, but the contest video should be close to 30 seconds. Please keep it short and spooky.
Q: Can I record it on an iPhone, Android, iPad, Camera, etc?
A: You can use whatever device you wish to record your video.
Q: Can I submit multiple videos?
A: If you have multiple favorite scenes, make a vignette! But please submit only one video.
Q: How many winners will there be?
A: We will select up to three winners total.

Contest Rules

  • To upload the video to YouTube, you must have a valid YouTube account and comply with all YouTube rules for age, content, copyright, etc.
  • To post a link to your video on the Backblaze Facebook wall, you must use a valid Facebook account and comply with all Facebook rules for age, content, copyrights, etc.
  • We reserve the right to remove, or not consider as a valid entry, any video we deem inappropriate. We reserve the exclusive right to determine what is inappropriate.
  • Backblaze reserves the right to use your video for promotional purposes.
  • The contest will end on October 29, 2017 at 11:59:59 PM Pacific Daylight Time. The winners (up to three) will be selected by Backblaze and will be announced on October 31, 2017.
  • We will be giving away gift cards to the top winners. The prizes will be mailed to the winners in a timely manner.
  • Please keep the content of the post PG rated — no cursing or extreme gore/violence.
  • By submitting a video you agree to all of these rules.

Need an example?


The Cost of Cloud Storage

Post Syndicated from Tim Nufire original https://www.backblaze.com/blog/cost-of-cloud-storage/

[Chart: the cost of the cloud as a percentage of revenue]

This week, we’re celebrating the one year anniversary of the launch of Backblaze B2 Cloud Storage. Today’s post is focused on giving you a peek behind the curtain about the costs of providing cloud storage. Why? Over the last 10 years, the most common question we get is still “how do you do it?” In this multi-billion dollar, global industry exhibiting exponential growth, none of the other major players seem to be willing to discuss the underlying costs. By exposing a chunk of the Backblaze financials, we hope to provide a better understanding of what it costs to run “the cloud,” and continue our tradition of sharing information for the betterment of the larger community.

Context
Backblaze built one of the industry’s largest cloud storage systems and we’re proud of that accomplishment. We bootstrapped the business and funded our growth through a combination of our own business operations and just $5.3M in equity financing ($2.8M of which was invested into the business – the other $2.5M was a tender offer to shareholders). To do this, we had to build our storage system efficiently and run it as a real, self-sustaining business. After over a decade in the data storage business, we have developed a deep understanding of cloud storage economics.

Definitions
I promise we’ll get into the costs of cloud storage soon, but some quick definitions first:

    Revenue: Money we collect from customers.
    Cost of Goods Sold (“COGS”): The costs associated with providing the service.
    Operating Expenses (“OpEx”): The costs associated with developing and selling the service.
    Income/Loss: What is left after subtracting COGS and OpEx from Revenue.

I’m going to focus today’s discussion on the Cost of Goods Sold (“COGS”): What goes into it, how it breaks down, and what percent of revenue it makes up. Backblaze is a roughly break-even business with COGS accounting for 47% of our revenue and the remaining 53% spent on our Operating Expenses (“OpEx”) like developing new features, marketing, sales, office rent, and other administrative costs that are required for us to be a functional company.

This post’s focus on COGS should let us answer the commonly asked question of “how do you provide cloud storage for such a low cost?”

Breaking Down Cloud COGS

Providing a cloud storage service requires the following components (COGS and OpEx – below we break out COGS):
[Chart: cloud infrastructure costs as a percentage of revenue]

  • Hardware: 23% of Revenue
    Backblaze stores data on hard drives. Those hard drives are “wrapped” with servers so they can connect to the public and store data. We’ve discussed our approach to how this works with our Vaults and Storage Pods. Our infrastructure is purpose built for data storage. That is, we thought about how data storage ought to work, and then built it from the ground up. Other companies may use different storage media like Flash, SSD, or even tape. But it all serves the same function of being the thing that data is actually stored on. For today, we’ll think of all this as “hardware.”

    We buy storage hardware that, on average, will last 5 years (60 months) before needing to be replaced. To account for hardware costs in a way that can be compared to our monthly expenses, we amortize them and recognize 1/60th of the purchase price each month (a small numeric sketch of this math follows the cost breakdown below).

    Storage Pods and hard drives are not the only hardware in our environment. We also have to buy the cabinets and rails that hold the servers, core servers that manage accounts/billing/etc., switches, routers, power strips, cables, and more. (Our post on bringing up a data center goes into some of this detail.) However, Storage Pods and the drives inside them make up about 90% of all the hardware cost.

  • Data Center (Space & Power): 8% of Revenue
    “The cloud” is a great marketing term and one that has caught on for our industry. That said, all “clouds” store data on something physical like hard drives. Those hard drives (and servers) are actual, tangible things that take up actual space on earth, not in the clouds.

    At Backblaze, we lease space in colocation facilities which offer a secure, temperature controlled, reliable home for our equipment. Other companies build their own data centers. It’s the classic rent vs buy decision; but it always ends with hardware in racks in a data center.

    Hardware also needs power to function. Not everyone realizes it, but electricity is a significant cost of running cloud storage. In fact, some data center space is billed simply as a function of an electricity bill.

    Every hard drive storing data adds incremental space and power need. This is a cost that scales with storage growth.

    I also want to make a comment on taxes. We pay sales and property tax on hardware, and it is amortized as part of the hardware section above. However, it’s worth thinking about taxes when considering the data center, since the location of the data center drives the amount of tax levied on the hardware placed inside of it.

  • People: 7% of Revenue
    Running a data center requires humans to make sure things go smoothly. The more data we store, the more human hands we need in the data center. All drives will fail eventually. When they fail, “stuff” needs to happen to get a replacement drive physically mounted inside the data center and filled with the customer data (all customer data is redundantly stored across multiple drives). The individuals that are associated specifically with managing the data center operations are included in COGS since, as you deploy more hard drives and servers, you need more of these people.

    Customer Support is the other group of people that are part of COGS. As customers use our services, questions invariably arise. To service our customers and get questions answered expediently, we staff customer support from our headquarters in San Mateo, CA. They do an amazing job! Staffing models, internally, are a function of the number of customers and the rate of acquiring new customers.

  • Bandwidth: 3% of Revenue
    We have over 350 PB of customer data being stored across our data centers. The bulk of that has been uploaded by customers over the Internet (the other option, our Fireball service, is 6 months old and is seeing great adoption). Uploading data over the Internet requires bandwidth – basically, an Internet connection similar to the one running to your home or office. But, for a data center, instead of contracting with Time Warner or Comcast, we go “upstream.” Effectively, we’re buying wholesale.

    Understanding how that dynamic plays out with your customer base is a significant driver of how a cloud provider sets its pricing. Being in business for a decade has explicit advantages here. Because we understand our customer behavior, and have reached a certain scale, we are able to buy bandwidth in sufficient bulk to offer the industry’s best download pricing at $0.02 / Gigabyte (compared to $0.05 from Amazon, Google, and Microsoft).

    Why does optimizing download bandwidth charges matter for customers of a data storage business? Because it directly affects your ability to retrieve and use your data, which is important.

  • Other Fees: 6% of Revenue
    We have grouped the remaining costs inside of “Other Fees.” This includes fees we pay to our payment processor as well as the costs of running our Restore Return Refund program.

    A payment processor is required for businesses like ours that need to accept credit cards securely over the Internet. The bulk of the money we pay to the payment processor is actually passed through to pay the credit card companies like AmEx, Visa, and Mastercard.

    The Restore Return Refund program is a unique program for our consumer and business backup business. Customers can download any and all of their files directly from our website. We also offer customers the ability to order a hard drive with some or all of their data on it; we then FedEx it to the customer wherever in the world she is. If the customer chooses, she can return the drive to us for a full refund. Customers love the program, but it does cost Backblaze money. We choose to subsidize the cost associated with this service in an effort to provide the best customer experience we can.
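
To make the amortization math and the percentages above concrete, here is a minimal Python sketch. The percentages, the five-year (60-month) hardware lifetime, and the $0.02 vs. $0.05 per gigabyte download rates are the figures quoted in this post; the $3,000 server price is purely illustrative, not an actual Backblaze number.

# COGS components as a percentage of revenue, as quoted in this post.
COGS_BREAKDOWN = {
    "hardware": 23,
    "data_center_space_and_power": 8,
    "people": 7,
    "bandwidth": 3,
    "other_fees": 6,
}

def monthly_amortization(purchase_price: float, lifetime_months: int = 60) -> float:
    """Straight-line amortization: recognize 1/60th of the purchase price each month."""
    return purchase_price / lifetime_months

total_cogs = sum(COGS_BREAKDOWN.values())
print(f"COGS share of revenue: {total_cogs}%")        # -> 47%
print(f"OpEx share of revenue: {100 - total_cogs}%")  # -> 53%

# Hypothetical example: a $3,000 storage server amortized over 5 years.
print(f"Monthly hardware cost: ${monthly_amortization(3000):.2f}")  # -> $50.00

# Downloading 1 TB (1,000 GB) at the quoted egress rates.
print(f"1 TB download: ${0.02 * 1000:.2f} at $0.02/GB vs ${0.05 * 1000:.2f} at $0.05/GB")  # -> $20.00 vs $50.00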

The Big Picture

At the beginning of the post, I mentioned that Backblaze is, effectively, a break even business. The reality is that our products drive a profitable business but those profits are invested back into the business to fund product development and growth. That means growing our team as the size and complexity of the business expands; it also means being fortunate enough to have the cash on hand to fund “reserves” of extra hardware, bandwidth, data center space, etc. In our first few years as a bootstrapped business, having sufficient buffer was a challenge. Having weathered that storm, we are particularly proud of being in a financial place where we can afford to make things a bit more predictable.

All this adds up to answer the question of how Backblaze has managed to carve out its slice of the cloud market – a market that is a key focus for some of the largest companies of our time. We have engineered a novel, purpose-built storage infrastructure with our Vaults and Pods. That infrastructure allows us to keep costs very, very low. Low costs enable us to offer the world’s most affordable, reliable cloud storage.

Does reliable, affordable storage matter? For a company like Vintage Aerial, it enables them to digitize 50 years’ worth of aerial photography of rural America and share that national treasure with the world. Having the best download pricing in the storage industry means Austin City Limits, a PBS show out of Austin, can digitize and preserve over 550 concerts.

We think offering purpose-built, affordable storage is important. It empowers our customers to monetize existing assets, make sure data is backed up (and not lost), and focus on their core business because we can handle their data storage needs.


casync — A tool for distributing file system images

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/casync-a-tool-for-distributing-file-system-images.html

Introducing casync

In the past months I have been working on a new project:
casync. casync takes
inspiration from the popular rsync file
synchronization tool as well as the probably even more popular
git revision control system. It combines the
idea of the rsync algorithm with the idea of git-style
content-addressable file systems, and creates a new system for
efficiently storing and delivering file system images, optimized for
high-frequency update cycles over the Internet. Its current focus is
on delivering IoT, container, VM, application, portable service or OS
images, but I hope to extend it later in a generic fashion to become
useful for backups and home directory synchronization as well (but
more about that later).

The basic technological building blocks casync is built from are
neither new nor particularly innovative (at least not anymore);
however, the way casync combines them is different from existing tools,
and that’s what makes it useful for a variety of use-cases that other
tools can’t cover that well.

Why?

I created casync after studying how today’s popular tools store and
deliver file system images. To briefly name a few: Docker has a
layered tarball approach,
OSTree serves the
individual files directly via HTTP and maintains packed deltas to
speed up updates, while other systems operate on the block layer and
place raw squashfs images (or other archival file systems, such as
ISO9660) for download on HTTP shares (in the better cases combined
with zsync data).

Neither of these approaches appeared fully convincing to me when used
in high-frequency update cycle systems. In such systems, it is
important to optimize towards a couple of goals:

  1. Most importantly, make updates cheap traffic-wise (for this most tools use image deltas of some form)
  2. Put boundaries on disk space usage on servers (keeping deltas between all version combinations clients might want to update between would mean storing a quadratically growing number of deltas on servers)
  3. Put boundaries on disk space usage on clients
  4. Be friendly to Content Delivery Networks (CDNs), i.e. serve neither too many small nor too many overly large files, and only require the most basic form of HTTP. Provide the repository administrator with high-level knobs to tune the average file size delivered.
  5. Simplicity to use for users, repository administrators and developers

I don’t think any of the tools mentioned above are really good on more
than a small subset of these points.

Specifically: Docker’s layered tarball approach dumps the “delta”
question onto the feet of the image creators: the best way to make
your image downloads minimal is to base your work on an existing image
clients might already have and inherit its resources, maintaining full
history. Here, revision control (a tool for the developer) is
intermingled with update management (a concept for optimizing
production delivery). As container histories grow, individual deltas
are likely to stay small, but on the other hand a brand-new deployment
usually requires downloading the full history onto the deployment
system, even though there’s no use for it there, and it entails
substantially larger disk usage and downloads.

OSTree’s serving of individual files is unfriendly to CDNs (as many
small files in file trees cause an explosion of HTTP GET
requests). To counter that, OSTree supports placing pre-calculated
delta images between selected revisions on the delivery servers, which
means a certain amount of revision management that leaks into the
clients.

Delivering direct squashfs (or other file system) images is almost
beautifully simple, but of course means every update requires a full
download of the newest image, which is both bad for disk usage and
generated traffic. Enhancing it with zsync makes this a much better
option, as it can reduce generated traffic substantially at very
little cost of history/meta-data (no explicit deltas between a large
number of versions need to be prepared server side). On the other hand,
server requirements in disk space and functionality (HTTP Range
requests) are minus points for the use-case I am interested in.

(Note: all the mentioned systems have great properties, and it’s not
my intention to badmouth them. The only point I am trying to make is
that for the use case I care about — file system image delivery with
high-frequency update cycles — each system comes with certain
drawbacks.)

Security & Reproducibility

Besides the issues pointed out above I wasn’t happy with the security
and reproducibility properties of these systems. In today’s world
where security breaches involving hacking and breaking into connected
systems happen every day, an image delivery system that cannot make
strong guarantees regarding data integrity is out of
date. Specifically, the tarball format is famously nondeterministic:
the very same file tree can result in any number of different
valid serializations depending on the tool used, its version and the
underlying OS and file system. Some tar implementations attempt to
correct that by guaranteeing that each file tree maps to exactly
one valid serialization, but such a property is always only specific
to the tool used. I strongly believe that any good update system must
guarantee on every single link of the chain that there’s only one
valid representation of the data to deliver, that can easily be
verified.

What casync Is

So much for the background on why I created casync. Now, let’s have a
look at what casync actually is like, and what it does. Here’s the brief
technical overview:

Encoding: Let’s take a large linear data stream, split it into
variable-sized chunks (the size of each being a function of the
chunk’s contents), and store these chunks in individual, compressed
files in some directory, each file named after a strong hash value of
its contents, so that the hash value may be used as the key for
retrieving the full chunk data. Let’s call this directory a “chunk
store”. At the same time, generate a “chunk index” file that lists
these chunk hash values plus their respective chunk sizes in a simple
linear array. The chunking algorithm is supposed to create variable,
but similarly sized chunks from the data stream, and do so in a way
that the same data results in the same chunks even if placed at
varying offsets. For more information see this blog story.

Decoding: Let’s take the chunk index file, and reassemble the large
linear data stream by concatenating the uncompressed chunks retrieved
from the chunk store, keyed by the listed chunk hash values.

As an extra twist, we introduce a well-defined, reproducible,
random-access serialization format for file trees (think: a more
modern tar), to permit efficient, stable storage of complete file
trees in the system, simply by serializing them and then passing them
into the encoding step explained above.

Finally, let’s put all this on the network: for each image you want to
deliver, generate a chunk index file and place it on an HTTP
server. Do the same with the chunk store, and share it between the
various index files you intend to deliver.

Why bother with all of this? Streams with similar contents will result
in mostly the same chunk files in the chunk store. This means it is
very efficient to store many related versions of a data stream in the
same chunk store, thus minimizing disk usage. Moreover, when
transferring linear data streams, chunks already known on the receiving
side can be reused, thus minimizing network traffic.

Why is this different from rsync or OSTree, or similar tools? Well,
one major difference between casync and those tools is that we
remove file boundaries before chunking things up. This means that
small files are lumped together with their siblings and large files
are chopped into pieces, which permits us to recognize similarities in
files and directories beyond file boundaries, and makes sure our chunk
sizes are pretty evenly distributed, without the file boundaries
affecting them.

The “chunking” algorithm is based on the buzhash rolling hash
function. SHA256 is used as the strong hash function to generate digests
of the chunks. xz is used to compress the individual chunks.
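
To make the encoding and decoding steps described above concrete, here is a small, self-contained Python sketch of content-defined chunking feeding a content-addressed chunk store. It is an illustration only, not casync’s actual implementation: for simplicity it uses a gear-style rolling hash rather than buzhash, and the constants (64 KiB average target, store layout, index format) are assumptions made for the example.

import hashlib
import lzma
import os
import random

rng = random.Random(0)
GEAR = [rng.getrandbits(32) for _ in range(256)]  # one random value per possible byte
AVG_CHUNK = 64 * 1024                             # aim for ~64 KiB average chunks
MASK = AVG_CHUNK - 1                              # power-of-two mask for the cut test

def split_chunks(data: bytes):
    """Yield variable-sized chunks whose boundaries depend only on nearby content,
    so the same data produces the same chunks even when shifted to other offsets."""
    h, start = 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        if (h & MASK) == 0:                       # fires on average once per AVG_CHUNK bytes
            yield data[start:i + 1]
            h, start = 0, i + 1
    if start < len(data):
        yield data[start:]

def store_chunk(store: str, chunk: bytes) -> str:
    """Compress one chunk and file it in the store under its SHA-256 digest."""
    digest = hashlib.sha256(chunk).hexdigest()
    path = os.path.join(store, digest)
    if not os.path.exists(path):                  # identical chunks are stored only once
        with open(path, "wb") as f:
            f.write(lzma.compress(chunk))         # xz compression, as casync uses
    return digest

def make_index(store: str, data: bytes):
    """Encode: return a chunk index as a list of (digest, size) pairs in stream order."""
    os.makedirs(store, exist_ok=True)
    return [(store_chunk(store, c), len(c)) for c in split_chunks(data)]

def extract(store: str, index) -> bytes:
    """Decode: reassemble the stream by concatenating chunks looked up by digest."""
    return b"".join(
        lzma.decompress(open(os.path.join(store, digest), "rb").read())
        for digest, _size in index
    )

Because the cut points depend only on the bytes near them, inserting or removing data early in a stream shifts offsets but leaves most later chunk boundaries, and therefore most chunk digests, unchanged; that locality is what makes chunk reuse across related versions work.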

Here’s a diagram that would hopefully explain a bit how the encoding
process works, were it not for my crappy drawing skills:

[Diagram: the casync encoding process]

The diagram shows the encoding process from top to bottom. It starts
with a block device or a file tree, which is then serialized and
chunked up into variable sized blocks. The compressed chunks are then
placed in the chunk store, while a chunk index file is written listing
the chunk hashes in order. (The original SVG of this graphic may be
found here.)

Details

Note that casync operates on two different layers, depending on the
use-case of the user:

  1. You may use it on the block layer. In this case the raw block data
    on disk is taken as-is, read directly from the block device, split
    into chunks as described above, compressed, stored and delivered.

  2. You may use it on the file system layer. In this case, the
    file tree serialization format mentioned above comes into play:
    the file tree is serialized depth-first (much like tar would do
    it) and then split into chunks, compressed, stored and delivered.

The fact that it may be used on both the block and file system layer
opens it up for a variety of different use-cases. In the VM and IoT
ecosystems shipping images as block-level serializations is more
common, while in the container and application world file-system-level
serializations are more typically used.

Chunk index files referring to block-layer serializations carry the
.caibx suffix, while chunk index files referring to file system
serializations carry the .caidx suffix. Note that you may also use
casync as direct tar replacement, i.e. without the chunking, just
generating the plain linear file tree serialization. Such files
carry the .catar suffix. Internally, .caibx files are identical to
.caidx files; the only difference is semantic: .caidx files
describe a .catar file, while .caibx files may describe any other
blob. Finally, chunk stores are directories carrying the .castr
suffix.

Features

Here are a couple of other features casync has:

  1. When downloading a new image you may use casync‘s --seed=
    feature: each block device, file, or directory specified is processed
    using the same chunking logic described above, and is used as
    preferred source when putting together the downloaded image locally,
    avoiding network transfer of it. This of course is useful whenever
    updating an image: simply specify one or more old versions as seed and
    only download the chunks that truly changed since then. Note that
    using seeds requires no history relationship between seed and the new
    image to download. This has major benefits: you can even use it to
    speed up downloads of relatively foreign and unrelated data. For
    example, when downloading a container image built using Ubuntu you can
    use your Fedora host OS tree in /usr as seed, and casync will
    automatically use whatever it can from that tree, for example timezone
    and locale data that tends to be identical between
    distributions. Example: casync extract
    http://example.com/myimage.caibx --seed=/dev/sda1 /dev/sda2. This
    will place the block-layer image described by the indicated URL in the
    /dev/sda2 partition, using the existing /dev/sda1 data as seeding
    source. An invocation like this could typically be used by IoT systems
    with an A/B partition setup. Example 2: casync extract
    http://example.com/mycontainer-v3.caidx --seed=/srv/container-v1
    --seed=/srv/container-v2 /src/container-v3 is very similar but
    operates on the file system layer, and uses two old container versions
    to seed the new version.

  2. When operating on the file system level, the user has fine-grained
    control over the meta-data included in the serialization. This is
    relevant since different use-cases tend to require a different set of
    saved/restored meta-data. For example, when shipping OS images, file
    access bits/ACLs and ownership matter, while file modification times
    hurt. When doing personal backups OTOH file ownership matters little
    but file modification times are important. Moreover different backing
    file systems support different feature sets, and storing more
    information than necessary might make it impossible to validate a tree
    against an image if the meta-data cannot be replayed in full. Due to
    this, casync provides a set of --with= and --without= parameters
    that allow fine-grained control of the data stored in the file tree
    serialization, including the granularity of modification times and
    more. The precise set of selected meta-data features is also always
    part of the serialization, so that seeding can work correctly and
    automatically.

  3. casync tries to be as accurate as possible when storing file
    system meta-data. This means that besides the usual baseline of file
    meta-data (file ownership and access bits), and more advanced features
    (extended attributes, ACLs, file capabilities), a number of more exotic
    types of data are stored as well, including Linux
    chattr(1) file attributes, as well as FAT file attributes
    (you may wonder why the latter? — EFI is FAT, and /efi is part of
    the comprehensive serialization of any host). In the future I intend
    to extend this further, for example storing btrfs sub-volume
    information where available. Note that as described above every single
    type of meta-data may be turned off and on individually, hence if you
    don’t need FAT file bits (and I figure it’s pretty likely you don’t),
    then they won’t be stored.

  4. The user creating .caidx or .caibx files may control the desired
    average chunk length (before compression) freely, using the
    --chunk-size= parameter. Smaller chunks increase the number of
    generated files in the chunk store and increase HTTP GET load on the
    server, but also ensure that sharing between similar images is
    improved, as identical patterns in the images stored are more likely
    to be recognized. By default casync will use a 64K average chunk
    size. Tweaking this can be particularly useful when adapting the
    system to specific CDNs, or when delivering compressed disk images
    such as squashfs (see below).

  5. Emphasis is placed on making all invocations reproducible,
    well-defined and strictly deterministic. As mentioned above this is a
    requirement to reach the intended security guarantees, but is also
    useful for many other use-cases. For example, the casync digest
    command may be used to calculate a hash value identifying a specific
    directory in all desired detail (use --with= and --without= to pick
    the desired detail). Moreover the casync mtree command may be used
    to generate a BSD mtree(5) compatible manifest of a directory tree,
    .caidx or .catar file.

  6. The file system serialization format is nicely composable. By this
    I mean that the serialization of a file tree is the concatenation of
    the serializations of all files and file sub-trees located at the
    top of the tree, with zero meta-data references from any of these
    serializations into the others. This property is essential to ensure
    maximum reuse of chunks when similar trees are serialized.

  7. When extracting file trees or disk image files, casync
    will automatically create
    reflinks
    from any specified seeds if the underlying file system supports it
    (such as btrfs, ocfs, and future xfs). After all, instead of
    copying the desired data from the seed, we can just tell the file
    system to link up the relevant blocks. This works both when extracting
    .caidx and .caibx files — the latter of course only when the
    extracted disk image is placed in a regular raw image file on disk,
    rather than directly on a plain block device, as plain block devices
    do not know the concept of reflinks.

  8. Optionally, when extracting file trees, casync can
    create traditional UNIX hard-links for identical files in specified
    seeds (--hardlink=yes). This works on all UNIX file systems, and can
    save substantial amounts of disk space. However, this only works for
    very specific use-cases where disk images are considered read-only
    after extraction, as any changes made to one tree will propagate to
    all other trees sharing the same hard-linked files, as that’s the
    nature of hard-links. In this mode, casync exposes OSTree-like
    behavior, which is built heavily around read-only hard-link trees.

  9. casync tries to be smart when choosing what to include in file
    system images. Implicitly, file systems such as procfs and sysfs are
    excluded from serialization, as they expose API objects, not real
    files. Moreover, the “nodump” (+d)
    chattr(1) flag is honored by
    default, permitting users to mark files to exclude from serialization.

  10. When creating and extracting file trees casync may apply an
    automatic or explicit UID/GID shift. This is particularly useful when
    transferring container images for use with Linux user namespacing.

  11. In addition to local operation, casync currently supports HTTP,
    HTTPS, FTP and ssh natively for downloading chunk index files and
    chunks (the ssh mode requires installing casync on the remote host,
    but an sftp mode not requiring that should be easy to
    add). When creating index files or chunks, only ssh is supported as
    remote back-end.

  12. When operating on block-layer images, you may expose locally or
    remotely stored images as local block devices. Example: casync mkdev
    http://example.com/myimage.caibx
    exposes the disk image described by
    the indicated URL as local block device in /dev, which you then may
    use the usual block device tools on, such as mount or fdisk (only
    read-only though). Chunks are downloaded on access with high priority,
    and at low priority when idle in the background. Note that in this
    mode, casync also plays a role similar to “dm-verity”, as all blocks
    are validated against the strong digests in the chunk index file
    before passing them on to the kernel’s block layer. This feature is
    implemented through Linux’ NBD kernel facility.

  13. Similarly, when operating on file-system-layer images, you may mount
    locally or remotely stored images as regular file systems. Example:
    casync mount http://example.com/mytree.caidx /srv/mytree mounts the
    file tree image described by the indicated URL as a local directory
    /srv/mytree. This feature is implemented through Linux’ FUSE kernel
    facility. Note that special care is taken that the images exposed this
    way can be packed up again with casync make and are guaranteed to
    return the bit-by-bit exact same serialization again that it was
    mounted from. No data is lost or changed while passing things through
    FUSE (OK, strictly speaking this is a lie, we do lose ACLs, but that’s
    hopefully just a temporary gap to be fixed soon).

  14. In IoT A/B fixed size partition setups the file systems placed in
    the two partitions are usually much smaller than the partition size,
    in order to keep some room for later, larger updates. casync is able
    to analyze the super-block of a number of common file systems in order
    to determine the actual size of a file system stored on a block
    device, so that writing a file system to such a partition and reading
    it back again will result in reproducible data. Moreover this speeds
    up the seeding process, as there’s little point in seeding the
    white-space after the file system within the partition.

Example Command Lines

Here’s how to use casync, explained with a few examples:

$ casync make foobar.caidx /some/directory

This will create a chunk index file foobar.caidx in the local
directory, and populate the chunk store directory default.castr
located next to it with the chunks of the serialization (you can
change the name for the store directory with --store= if you
like). This command operates on the file-system level. A similar
command operating on the block level:

$ casync make foobar.caibx /dev/sda1

This command creates a chunk index file foobar.caibx in the local
directory describing the current contents of the /dev/sda1 block
device, and populates default.castr in the same way as above. Note
that you may as well read a raw disk image from a file instead of a
block device:

$ casync make foobar.caibx myimage.raw

To reconstruct the original file tree from the .caidx file and
the chunk store of the first command, use:

$ casync extract foobar.caidx /some/other/directory

And similar for the block-layer version:

$ casync extract foobar.caibx /dev/sdb1

or, to extract the block-layer version into a raw disk image:

$ casync extract foobar.caibx myotherimage.raw

The above are the most basic commands, operating on local data
only. Now let’s make this more interesting, and reference remote
resources:

$ casync extract http://example.com/images/foobar.caidx /some/other/directory

This extracts the specified .caidx onto a local directory. This of
course assumes that foobar.caidx was uploaded to the HTTP server in
the first place, along with the chunk store. You can use any command
you like to accomplish that, for example scp or
rsync. Alternatively, you can let casync do this directly when
generating the chunk index:

$ casync make ssh.example.com:images/foobar.caidx /some/directory

This will use ssh to connect to the ssh.example.com server, and then
place the .caidx file and the chunks on it. Note that this mode of
operation is “smart”: this scheme will only upload chunks currently
missing on the server side, and not re-transmit what already is
available.

Note that you can always configure the precise path or URL of the
chunk store via the --store= option. If you do not do that, then the
store path is automatically derived from the path or URL: the last
component of the path or URL is replaced by default.castr.

Of course, when extracting .caidx or .caibx files from remote sources,
using a local seed is advisable:

$ casync extract http://example.com/images/foobar.caidx --seed=/some/existing/directory /some/other/directory

Or on the block layer:

$ casync extract http://example.com/images/foobar.caibx --seed=/dev/sda1 /dev/sdb2

When creating chunk indexes on the file system layer casync will by
default store meta-data as accurately as possible. Let’s create a chunk
index with reduced meta-data:

$ casync make foobar.caidx --with=sec-time --with=symlinks --with=read-only /some/dir

This command will create a chunk index for a file tree serialization
that has three features above the absolute baseline supported: 1s
granularity time-stamps, symbolic links and a single read-only bit. In
this mode, all the other meta-data bits are not stored, including
nanosecond time-stamps, full UNIX permission bits, file ownership or
even ACLs or extended attributes.

Now let’s make a .caidx file available locally as a mounted file
system, without extracting it:

$ casync mount http://example.com/images/foobar.caidx /mnt/foobar

And similar, let’s make a .caibx file available locally as a block device:

$ casync mkdev http://example.com/images/foobar.caibx

This will create a block device in /dev and print the used device
node path to STDOUT.

As mentioned, casync is big about reproducibility. Let’s make use of
that to calculate a digest identifying a very specific version of
a file tree:

$ casync digest .

This digest will include all meta-data bits casync and the underlying
file system know about. Usually, to make this useful you want to
configure exactly what meta-data to include:

$ casync digest --with=unix .

This makes use of the --with=unix shortcut for selecting meta-data
fields. Specifying --with=unix selects all meta-data that
traditional UNIX file systems support. It is a shortcut for writing out:
--with=16bit-uids --with=permissions --with=sec-time --with=symlinks
--with=device-nodes --with=fifos --with=sockets.

Note that when calculating digests or creating chunk indexes you may
also use the negative --without= option to remove specific features
but start from the most precise:

$ casync digest --without=flag-immutable

This generates a digest with the most accurate meta-data, but leaves
one feature out: chattr(1)‘s
immutable (+i) file flag.

To list the contents of a .caidx file use a command like the following:

$ casync list http://example.com/images/foobar.caidx

or

$ casync mtree http://example.com/images/foobar.caidx

The former command will generate a brief list of files and
directories, not too different from tar t or ls -al in its
output. The latter command will generate a BSD
mtree(5) compatible
manifest. Note that casync actually stores substantially more file
meta-data than mtree files can express, though.

What casync isn’t

  1. casync is not an attempt to minimize serialization and downloaded
    deltas to the extreme. Instead, the tool is supposed to find a good
    middle ground, that is good on traffic and disk space, but not at the
    price of convenience or requiring explicit revision control. If you
    care about updates that are absolutely minimal, there are binary delta
    systems around that might be an option for you, such as Google’s
    Courgette.

  2. casync is not a replacement for rsync, or git or zsync or
    anything like that. They have very different use-cases and
    semantics. For example, rsync permits you to directly synchronize two
    file trees remotely. casync just cannot do that, and it is unlikely
    it ever will.

Where next?

casync is supposed to be a generic synchronization tool. Its primary
focus for now is delivery of OS images, but I’d like to make it useful
for a couple other use-cases, too. Specifically:

  1. To make the tool useful for backups, encryption is missing. I have
    pretty concrete plans how to add that. When implemented, the tool
    might become an alternative to restic,
    BorgBackup or
    tarsnap.

  2. Right now, if you want to deploy casync in real-life, you still
    need to validate the downloaded .caidx or .caibx file yourself, for
    example with some gpg signature. It is my intention to integrate with
    gpg in a minimal way so that signing and verifying chunk index files
    is done automatically.

  3. In the longer run, I’d like to build an automatic synchronizer for
    $HOME between systems from this. Each $HOME instance would be
    stored automatically in regular intervals in the cloud using casync,
    and conflicts would be resolved locally.

  4. casync is written in a shared library style, but it is not yet
    built as one. Specifically this means that almost all of casync‘s
    functionality is supposed to be available as C API soon, and
    applications can process casync files on every level. It is my
    intention to make this library useful enough so that it will be easy
    to write a module for GNOME’s gvfs subsystem in order to make remote
    or local .caidx files directly available to applications (as an
    alternative to casync mount). In fact the idea is to make this all
    flexible enough that even the remoting back-ends can be replaced
    easily, for example to replace casync‘s default HTTP/HTTPS back-ends
    built on CURL with GNOME’s own HTTP implementation, in order to share
    cookies, certificates, … There’s also an alternative method to
    integrate with casync in place already: simply invoke casync as a
    sub-process. casync will inform you about a certain set of state
    changes using a mechanism compatible with
    sd_notify(3). In the
    future it will also propagate progress data this way and more.

  5. I intend to add a new seeding back-end that sources chunks from
    the local network. After downloading the new .caidx file off the
    Internet casync would then search for the listed chunks on the local
    network first before retrieving them from the Internet. This should
    speed things up on all installations that have multiple similar
    systems deployed in the same network.

Further plans are listed tersely in the
TODO file.

FAQ:

  1. Is this a systemd project? — casync is hosted under the
    github systemd umbrella, and the
    projects share the same coding style. However, the code-bases are
    distinct and without interdependencies, and casync works fine both
    on systemd systems and systems without it.

  2. Is casync portable? — At the moment: no. I only run Linux and
    that’s what I code for. That said, I am open to accepting portability
    patches (unlike for systemd, which doesn’t really make sense on
    non-Linux systems), as long as they don’t interfere too much with the
    way casync works. Specifically this means that I am not too
    enthusiastic about merging portability patches for OSes lacking the
    openat(2) family
    of APIs.

  3. Does casync require reflink-capable file systems to work, such
    as btrfs?
    — No it doesn’t. The reflink magic in casync is
    employed when the file system permits it, and it’s good to have it,
    but it’s not a requirement, and casync will implicitly fall back to
    copying when it isn’t available. Note that casync supports a number
    of file system features on a variety of file systems that aren’t
    available everywhere, for example FAT’s system/hidden file flags or
    xfs‘s projinherit file flag.

  4. Is casync stable? — I just tagged the first, initial
    release. While I have been working on it for quite some time and it
    is quite featureful, this is the first time I am advertising it publicly,
    and it has hence received very little testing outside of its own test
    suite. I am also not fully ready to commit to the stability of the
    current serialization or chunk index format. I don’t see any breakages
    coming for it though. casync is pretty light on documentation right
    now, and does not even have a man page. I also intend to correct that
    soon.

  5. Are the .caidx/.caibx and .catar file formats open and
    documented?
    casync is Open Source, so if you want to know the
    precise format, have a look at the sources for now. It’s definitely my
    intention to add comprehensive docs for both formats however. Don’t
    forget this is just the initial version right now.

  6. casync is just like $SOMEOTHERTOOL! Why are you reinventing
    the wheel (again)?
    — Well, because casync isn’t “just like” some
    other tool. I am pretty sure I did my homework, and that there is no
    tool just like casync right now. The tools coming closest are probably
    rsync, zsync, tarsnap, restic, but they are quite different beasts
    each.

  7. Why did you invent your own serialization format for file trees?
    Why don’t you just use tar?
    — That’s a good question, and other
    systems — most prominently tarsnap — do that. However, as mentioned
    above tar doesn’t enforce reproducibility. It also doesn’t really do
    random access: if you want to access some specific file you need to
    read every single byte stored before it in the tar archive to find
    it, which is of course very expensive. The serialization casync
    implements places a focus on reproducibility, random access, and
    meta-data control. Much like traditional tar it can still be
    generated and extracted in a stream fashion though.

  8. Does casync save/restore SELinux/SMACK file labels? — At the
    moment not. That’s not because I wouldn’t want it to, but simply
    because I am not a guru of either of these systems, and didn’t want to
    implement something I do not fully grok nor can test. If you look at
    the sources you’ll find that there’s already some definitions in place
    that keep room for them though. I’d be delighted to accept a patch
    implementing this fully.

  9. What about delivering squashfs images? How well does chunking
    work on compressed serializations?
    – That’s a very good point!
    Usually, if you apply a chunking algorithm to a compressed data
    stream (let’s say a tar.gz file), then changing a single bit at the
    front will propagate into the entire remainder of the file, so that
    minimal changes will explode into major changes. Thankfully this
    doesn’t apply that strictly to squashfs images, as it provides
    random access to files and directories and thus breaks up the
    compression streams in regular intervals to make seeking easy. This
    fact is beneficial for systems employing chunking, such as casync as
    this means single bit changes might affect their vicinity but will not
    explode in an unbounded fashion. In order to achieve best results when
    delivering squashfs images through casync, the block sizes of
    squashfs and the chunk sizes of casync should be matched up
    (using casync‘s --chunk-size= option). How precisely to choose
    both values is left as a research subject for the user, for now.

  10. What does the name casync mean? – It’s a synchronizing
    tool, hence the -sync suffix, following rsync‘s naming. It makes
    use of the content-addressable concept of git, hence the ca-
    prefix.

  11. Where can I get this stuff? Is it already packaged? – Check
    out the sources on GitHub. I just tagged the first version. Martin
    Pitt has packaged casync for Ubuntu. There is also an ArchLinux
    package. Zbigniew Jędrzejewski-Szmek has prepared a Fedora RPM
    that hopefully will soon be included in the distribution.

Should you care? Is this a tool for you?

Well, that’s up to you really. If you are involved with projects that
need to deliver IoT, VM, container, application or OS images, then
maybe this is a great tool for you — but other options exist, some of
which are linked above.

Note that casync is an Open Source project: if it doesn’t do exactly
what you need, prepare a patch that adds what you need, and we’ll
consider it.

If you are interested in the project and would like to talk about this
in person, I’ll be presenting casync soon at Kinvolk’s Linux
Technologies Meetup in Berlin, Germany. You are invited. I also intend
to talk about it at All Systems Go!, also in Berlin.

Extending the Airplane Laptop Ban

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/05/extending_the_a.html

The Department of Homeland Security is rumored to be considering extending the current travel ban on large electronics for Middle Eastern flights to European ones as well. The likely reaction of airlines will be to implement new traveler programs, effectively allowing wealthier and more frequent fliers to bring their computers with them. This will only exacerbate the divide between the haves and the have-nots — all without making us any safer.

In March, both the United States and the United Kingdom required that passengers from 10 Muslim countries give up their laptop computers and larger tablets, and put them in checked baggage. The new measure was based on reports that terrorists would try to smuggle bombs onto planes concealed in these larger electronic devices.

The security measure made no sense for two reasons. First, moving these computers into the baggage holds doesn’t keep them off planes. Yes, it is easier to detonate a bomb that’s in your hands than to remotely trigger it in the cargo hold. But it’s also more effective to screen laptops at security checkpoints than it is to place them in checked baggage. TSA already does this kind of screening randomly and occasionally: making passengers turn laptops on to ensure that they’re functional computers and not just bomb-filled cases, and running chemical tests on their surface to detect explosive material.

And, two, banning laptops on selected flights just forces terrorists to buy more roundabout itineraries. It doesn’t take much creativity to fly Doha-Amsterdam-New York instead of direct. Adding Amsterdam to the list of affected airports makes the terrorist add yet another itinerary change; it doesn’t remove the threat.

Which brings up another question: If this is truly a threat, why aren’t domestic flights included in this ban? Remember that anyone boarding a plane to the United States from these Muslim countries has already received a visa to enter the country. This isn’t perfect security — the infamous underwear bomber had a visa, after all — but anyone who could detonate a laptop bomb on his international flight could do it on his domestic connection.

I don’t have access to classified intelligence, and I can’t comment on whether explosive-filled laptops are truly a threat. But, if they are, TSA can set up additional security screenings at the gates of US-bound flights worldwide and screen every laptop coming onto the plane. It wouldn’t be the first time we’ve had additional security screening at the gate. And they should require all laptops to go through this screening, prohibiting them from being stashed in checked baggage.

This measure is nothing more than security theater against what appears to be a movie-plot threat.

Banishing laptops to the cargo holds brings with it a host of other threats. Passengers run the risk of their electronics being stolen from their checked baggage — something that has happened in the past. And, depending on the country, passengers also have to worry about border control officials intercepting checked laptops and making copies of what’s on their hard drives.

Safety is another concern. We’re already worried about large lithium-ion batteries catching fire in airplane baggage holds; adding a few hundred of these devices will considerably exacerbate the risk. Both FedEx and UPS no longer accept bulk shipments of these batteries after two jets crashed in 2010 and 2011 due to combustion.

Of course, passengers will rebel against this rule. Having access to a computer on these long transatlantic flights is a must for many travelers, especially the high-revenue business-class travelers. They also won’t accept the delays and confusion this rule will cause as it’s rolled out. Unhappy passengers fly less, or fly other routes on other airlines without these restrictions.

I don’t know how many passengers are choosing to fly to the Middle East via Toronto to avoid the current laptop ban, but I suspect there may be some. If Europe is included in the new ban, many more may consider adding Canada to their itineraries, as well as choosing European hubs that remain unaffected.

As passengers voice their disapproval with their wallets, airlines will rebel. Already Emirates has a program to loan laptops to their premium travelers. I can imagine US airlines doing the same, although probably for an extra fee. We might learn how to make this work: keeping our data in the cloud or on portable memory sticks and using unfamiliar computers for the length of the flight.

A more likely response will be comparable to what happened after the US increased passenger screening post-9/11. In the months and years that followed, we saw different ways for high-revenue travelers to avoid the lines: faster first-class lanes, and then the extra-cost trusted traveler programs that allow people to bypass the long lines, keep their shoes on their feet and leave their laptops and liquids in their bags. It’s a bad security idea, but it keeps both frequent fliers and airlines happy. It would be just another step to allow these people to keep their electronics with them on their flight.

The problem with this response is that it solves the problem for frequent fliers, while leaving everyone else to suffer. This is already the case; those of us enrolled in a trusted traveler program forget what it’s like to go through “normal” security screening. And since frequent fliers — likely to be more wealthy — no longer see the problem, they don’t have any incentive to fix it.

Dividing security checks into haves and have-nots is bad social policy, and we should actively fight any expansion of it. If the TSA implements this security procedure, it should implement it for every flight. And there should be no exceptions. Force every politically connected flier, from members of Congress to the lobbyists that influence them, to do without their laptops on planes. Let the TSA explain to them why they can’t work on their flights to and from D.C.

This essay previously appeared on CNN.com.

EDITED TO ADD: US officials are backing down.

Grafana 4.0 Stable Release

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2016/12/12/grafana-4.0-stable-release/

Grafana v4.0.2 stable is now available for download. After about 4 weeks of beta fixes and testing,
we are proud to announce that Grafana v4.0 stable is now released and production ready. This release contains a ton of minor
new features, fixes and improved UX. But on top of the usual new goodies is a core new feature: Alerting!
Read on below for a detailed description of what’s new in Grafana v4!

Alerting

Alerting is a really revolutionary feature for Grafana. It transforms Grafana from a
visualization tool into a truly mission critical monitoring tool. The alert rules are very easy to
configure using your existing graph panels and threshold levels can be set simply by dragging handles to
the right side of the graph. The rules will continually be evaluated by grafana-server and
notifications will be sent out when the rule conditions are met.

This feature has been worked on for over a year with many iterations and rewrites
just to make sure the foundations are really solid. We are really proud to finally release it!
Since alert rule execution is processed in the backend, not all data source plugins are supported yet.
Right now Graphite, Prometheus, InfluxDB and OpenTSDB are supported. Elasticsearch is being worked
on but will not be ready for the v4 release.

Rules

The rule config allows you to specify a name, how often the rule should be evaluated and a series
of conditions that all need to be true for the alert to fire.

Currently the only condition type that exists is a Query condition that allows you to
specify a query letter, time range and an aggregation function. The letter refers to
a query you already have added in the Metrics tab. The result from the
query and the aggregation function is a single value that is then used in the threshold check.
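
As a rough mental model (not Grafana’s actual code), evaluating a Query condition boils down to: run the referenced query over its time range, reduce the returned series to a single value with the chosen aggregation function, and compare that value against the threshold. A minimal Python sketch with hypothetical data and names:

from statistics import mean

# Aggregation functions that reduce a series of datapoints to one value.
AGGREGATIONS = {"avg": mean, "min": min, "max": max, "sum": sum, "last": lambda s: s[-1]}

def evaluate_condition(series, aggregation="avg", operator=">", threshold=100.0):
    """Return True when the aggregated value crosses the configured threshold."""
    value = AGGREGATIONS[aggregation](series)
    if operator == ">":
        return value > threshold
    if operator == "<":
        return value < threshold
    raise ValueError(f"unknown operator: {operator}")

# Hypothetical datapoints returned by query "A" over the last 5 minutes.
datapoints = [80.0, 95.0, 130.0, 120.0]
if evaluate_condition(datapoints, aggregation="avg", operator=">", threshold=100.0):
    print("alert fires")  # avg = 106.25, which is above the 100.0 threshold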

We plan to add other condition types in the future, like Other Alert, where you can include the state
of another alert in your conditions, and Time Of Day.

Notifications

Alerting would not be very useful if there were no way to send notifications when rules trigger and change state. You
can set up notifications of different types. We currently have Slack, PagerDuty, Email and Webhook, with more in the
pipe that will be added during the beta period. The notifications can then be added to your alert rules.
If you have configured an external image store in the grafana.ini config file (s3 and webdav options available)
you can get very rich notifications with an image of the graph and the metric
values all included in the notification.

Annotations

Alert state changes are recorded in a new annotation store that is built into Grafana. This store
currently only supports storing annotations in Grafana’s own internal database (mysql, postgres or sqlite).
The Grafana annotation storage is currently only used for alert state changes but we hope to add the ability for users
to add graph comments in the form of annotations directly from within Grafana in a future release.

Alert List Panel

This new panel allows you to show alert rules or a history of alert rule state changes. You can filter based on the states you’re
interested in. It is a very useful panel for overview-style dashboards.

Ad-hoc filter variable

This is a new and very different type of template variable. It allows you to create new key/value filters on the fly,
with autocomplete for both keys and values. The filter condition is automatically applied to all
queries that use that data source. This feature opens up more exploratory dashboards. In the gif animation to the right
there is a dashboard for Elasticsearch log data. It uses one query variable that allows you to quickly change how the data
is grouped, and an interval variable for controlling the granularity of the time buckets. What was missing
was a way to dynamically apply filters to the log query. With the Ad-Hoc Filters variable you can
dynamically add filters to any log property!

UX Improvements

We always try to bring some UX/UI refinements & polish in every release.

TV-mode & Kiosk mode

Grafana is so often used on wall-mounted TVs that we figured a clean TV mode would be
really nice. In TV mode the top navbar and the row & panel controls all fade to transparent.

This happens automatically after one minute of user inactivity, but it can also be toggled manually
with the d v sequence shortcut. Any mouse movement or keyboard action will
restore the navbar & controls.

Another new feature is kiosk mode. This can be enabled with the d k
shortcut or by adding &kiosk to the URL when you load a dashboard.
In kiosk mode the navbar is completely hidden from view.

New row menu & add panel experience

We spent a lot of time improving the dashboard building experience, trying to make it both
more efficient and easier for beginners. After many good but not great experiments
with a build mode, we eventually decided to just improve the green row menu and
continue work on a build mode for a future release.

The new row menu automatically slides out when you mouse over the edge of the row. You no longer need
to hover over the small green icon and then click it to expand the row menu.

There are also some minor improvements to drag-and-drop behaviour. Now when you drag a panel from one row
to another, the panel is inserted and Grafana automatically makes room for it.
When you drag a panel within a row, you simply reorder the panels.

If you look at the animation to the right, you can see that you can drag and drop a new panel. This is not
required; you can also just click the panel type and it will be inserted at the end of the row
automatically. Dragging a new panel has the advantage that you can insert it wherever you want,
not just at the end of the row.

We plan to further improve dashboard building in the future with a richer grid & layout system.

Keyboard shortcuts

Grafana v4 introduces a number of really powerful keyboard shortcuts. You can now focus a panel
by hovering over it with your mouse. With a panel focused you can simply hit e to toggle panel
edit mode, or v to toggle fullscreen mode. p r removes the panel, and p s opens the share
modal.

Some nice navigation shortcuts are:

  • g h – go to home dashboard
  • s s – open search with starred pre-selected
  • s t – open search in tags list view

Upgrade & Breaking changes

There are no breaking changes. Old dashboards and features should work the same. Grafana-server will automatically upgrade its db
schema on restart. It’s advisable to back up Grafana’s database before updating.

If you are using plugins, make sure to update them as some might not work perfectly with v4.

You can update plugins using grafana-cli

grafana-cli plugins update-all

Changelog

Check out the CHANGELOG.md file for a complete list
of new features, changes, and bug fixes.

Download

Head to the v4 download page for download links & instructions.

Big thanks to all the Grafana users and devs out there who have helped with bug reports, feature
requests and pull requests!

Until next time, keep on graphing!
Torkel Ödegaard

Guessing Credit Card Security Details

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2016/12/guessing_credit.html

Researchers have found that they can guess various credit-card-number security details by spreading their guesses around multiple websites so as not to trigger any alarms.

From a news article:

Mohammed Ali, a PhD student at the university’s School of Computing Science, said: “This sort of attack exploits two weaknesses that on their own are not too severe but when used together, present a serious risk to the whole payment system.

“Firstly, the current online payment system does not detect multiple invalid payment requests from different websites.

“This allows unlimited guesses on each card data field, using up to the allowed number of attempts — typically 10 or 20 guesses — on each website.

“Secondly, different websites ask for different variations in the card data fields to validate an online purchase. This means it’s quite easy to build up the information and piece it together like a jigsaw.

“The unlimited guesses, when combined with the variations in the payment data fields make it frighteningly easy for attackers to generate all the card details one field at a time.

“Each generated card field can be used in succession to generate the next field and so on. If the hits are spread across enough websites then a positive response to each question can be received within two seconds — just like any online payment.

“So even starting with no details at all other than the first six digits — which tell you the bank and card type and so are the same for every card from a single provider — a hacker can obtain the three essential pieces of information to make an online purchase within as little as six seconds.”

That’s card number, expiration date, and CVV code.

From the paper:

Abstract: This article provides an extensive study of the current practice of online payment using credit and debit cards, and the intrinsic security challenges caused by the differences in how payment sites operate. We investigated the Alexa top-400 online merchants’ payment sites, and realised that the current landscape facilitates a distributed guessing attack. This attack subverts the payment functionality from its intended purpose of validating card details, into helping the attackers to generate all security data fields required to make online transactions. We will show that this attack would not be practical if all payment sites performed the same security checks. As part of our responsible disclosure measure, we notified a selection of payment sites about our findings, and we report on their responses. We will discuss potential solutions to the problem and the practical difficulty to implement these, given the varying technical and business concerns of the involved parties.

BoingBoing post:

The researchers believe this method has already been used in the wild, as part of a spectacular hack against Tesco bank last month.

MasterCard is immune to this hack because they detect the guesses, even though they’re distributed across multiple websites. Visa is not.

AWS Snowball Update – Job Management API & S3 Adapter

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-snowball-update-job-management-api-s3-adapter/

We introduced AWS Import/Export Snowball last fall from the re:Invent stage. The Snowball appliance is designed for customers who need to transfer large amounts of data into or out of AWS on a one-time or recurring basis (read AWS Import/Export Snowball – Transfer 1 Petabyte Per Week Using Amazon-Owned Storage Appliances to learn more).

Today we are launching two important additions to Snowball. Here’s the scoop:

  • Snowball Job Management API – The new Snowball API lets you build applications that create and manage Snowball jobs.
  • S3 Adapter – The new Snowball S3 Adapter lets you access a Snowball appliance as if it were an S3 endpoint.

Time to dive in!

Snowball Job Management API
The original Snowball model was interactive and console-driven. You could create a job (basically “Send me a Snowball”) and then monitor its progress, tracking the shipment, transit, delivery, and return to AWS visually. This was great for one-off jobs, but did not meet the needs of customers who wanted to integrate Snowball into their existing backup or data transfer model. Based on the requests that we received from these customers and from our Storage Partners, we are introducing a Snowball Job Management API today.

The Snowball Job Management API gives our customers and partners the power to make Snowball an intrinsic, integrated part of their data management solutions. Here are the primary functions:

  • CreateJob – Creates an import or export job and initiates shipment of an appliance.
  • ListJobs – Fetches a list of jobs and associated job states.
  • DescribeJob – Fetches information about a specific job.

Read the API Reference to learn more!
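
For example, here is a minimal sketch of calling the new API from Python with boto3. The job ID below is a placeholder, and a real CreateJob call needs additional details (shipping address, IAM role, and so on), so it is omitted here:

import boto3

# A sketch of the Snowball Job Management API via boto3.
# The job ID below is a placeholder.
snowball = boto3.client('snowball')

# ListJobs: fetch jobs and their associated states
for job in snowball.list_jobs()['JobListEntries']:
    print(job['JobId'], job['JobState'])

# DescribeJob: fetch information about a specific job
details = snowball.describe_job(JobId='JID123e4567-e89b-12d3-a456-426655440000')
print(details['JobMetadata']['JobState'])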

I’m looking forward to reading about creative and innovative applications that make use of this new API! Leave me a comment and let me know what you come up with.

S3 Adapter
The new Snowball S3 Adapter allows you to access a Snowball as if it were an Amazon S3 endpoint running on-premises. This allows you to use your existing, S3-centric tools to move data to or from a Snowball.

The adapter is available for multiple Linux distributions and Windows releases, and is easy to install:

  1. Download the appropriate file from the Snowball Tools page and extract its contents to a local directory.
  2. Verify that the adapter’s configuration is appropriate for your environment (the adapter listens on port 8080 by default).
  3. Connect your Snowball to your network and get its IP address from the built-in display on the appliance.
  4. Visit the Snowball Console to obtain the unlock code and the job manifest.
  5. Launch the adapter, providing it with the IP address, unlock code, and manifest file.

With the adapter up and running, you can use your existing S3-centric tools by simply configuring them to use the local endpoint (the IP address of the on-premises host and the listener port). For example, here’s how you would run the s3 ls command on the on-premises host:

$ aws s3 ls --endpoint http://localhost:8080

After you copy your files to the Snowball, you can easily verify that the expected number of files were copied:

$ snowball validate

The initial release of the adapter supports a subset of the S3 API including GET on buckets and on the service, HEAD on a bucket and on objects, PUT and DELETE on objects, and all of the multipart upload operations. If you plan to access the adapter using your own code or third party tools, some testing is advisable.
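
If you would rather script the transfer than use the CLI, the same endpoint can be used from an SDK. Here is a sketch in Python with boto3; the endpoint address, bucket name, and file names are placeholders for your environment:

import boto3

# A sketch: point a standard S3 client at the on-premises Snowball S3 Adapter.
# The endpoint address, bucket name, and file names are placeholders.
s3 = boto3.client('s3', endpoint_url='http://localhost:8080')

# List the buckets exposed by the appliance (GET on the service)
print(s3.list_buckets()['Buckets'])

# Copy a local file onto the Snowball (PUT / multipart upload on objects)
s3.upload_file('backup.tar.gz', 'my-snowball-bucket', 'backups/backup.tar.gz')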

To learn more, read about the Snowball Transfer Adapter.

Available Now
These new features are available now and you can start using them today!


Jeff;

 

Computing and weather stations at Eastlea Community School

Post Syndicated from clive original https://www.raspberrypi.org/blog/eastlea-community-school/

In my day, you were lucky if you had some broken Clackers and a half-sucked, flocculent gobstopper in your trouser pockets. But here I am, half a century later, watching a swarm of school pupils running around the playground with entire computers attached to them.

Or microcontrollers, at least. This was Eastlea Community School’s Technology Day, and Steph and I had been invited along by ICT and computing teacher Mr Richards, a long-term Raspberry Pi forum member and Pi enthusiast. The day was a whole school activity, involving 930 pupils and 100 staff, showcasing how computing and technology can be used across the curriculum. In the playground, PE students had designed and coded micro:bits to measure all manner of sporting metrics. In physics, they were investigating g-forces. In the ICT and computing rooms, whole cohorts were learning to code. This was really innovative stuff.

shelves of awesome

All ICT classrooms should have shelves like this

A highlight of the tour was Mr Richards’ classroom, stuffed with electronics, robots, and hacking goodness, and pupils coming and going. It was a really creative space. Impressively, there are Raspberry Pis permanently installed on every desk, which is just how we envisaged it: a normal classroom tool for digital making.

pis on table

All this was amazing, and certainly the most impressive cross-curricular use of computing I’ve seen in a school. But having lived and breathed the Raspberry Pi Oracle weather station project for several months, I was really keen to see what they’d done with theirs. And it was a corker. Students from the computing club had built and set up the station in their lunch breaks, and installed it in a small garden area.

eastlea ws team

Mr Richards and the Eastlea Community School weather station team

Then they had hacked it, adding a solar panel, battery and WiFi. This gets round the problems of how to power the station and how to transfer data. The standard way is Power over Ethernet, which uses the same cable for power and data, but this is not always the optimal solution, depending on location. It’s not as simple as sticking a solar panel on a stick either. What happens when it’s cloudy? Will the battery recharge in winter? Mr Richards and his students have spent a lot of time investigating such questions, and it’s exactly the sort of problem-solving and engineering that we want to encourage. Also, we love hacking.

eastlea weather station garden

Not content with these achievements, they plan to add a camera to monitor wildlife and vegetation, perhaps tying it in with the weather data. They’re also hoping to install another weather station elsewhere, so that they can compare the data and investigate the school microclimate in more detail. The weather station itself will be used for teaching and learning this September.

Eastlea Community School’s weather station really is a showcase for the project, and we’d like to thank Mr Richards and his students for working so hard on it. If you want to learn more about solar panels and other hacks, then head over to our weather station forum.


Weather station update

The remaining weather station kits have started shipping to schools this week! We sent an email out recently for people to confirm delivery addresses, and if you’ve done this you should have yours soon. If you were offered a weather station last year and have not had an email from us in the last few weeks (early July), then please contact us immediately at [email protected].

The post Computing and weather stations at Eastlea Community School appeared first on Raspberry Pi.

AWS Becomes First Cloud Service Provider to Adopt New PCI DSS 3.2

Post Syndicated from Chad Woolf original https://blogs.aws.amazon.com/security/post/Tx20SIO4LU1XDFA/AWS-Becomes-First-Cloud-Service-Provider-to-Adopt-New-PCI-DSS-3-2

We are happy to announce the availability of the Amazon Web Services PCI DSS 3.2 Compliance Package for the 2016/2017 cycle. AWS is the first cloud service provider (CSP) to successfully complete the assessment against the newly released PCI Data Security Standard (PCI DSS) version 3.2, 18 months in advance of the mandatory February 1, 2018, deadline. The AWS Attestation of Compliance (AOC), available upon request, now features 26 PCI DSS certified services, including the latest additions of Amazon EC2 Container Service (ECS), AWS Config, and AWS WAF (a web application firewall). We at AWS are committed to this international information security and compliance program, and adopting the new standard as early as possible once again demonstrates our commitment to information security as our highest priority. Our customers (and customers of our customers) can operate confidently as they store and process credit card information (and any other sensitive data) in the cloud knowing that AWS products and services are tested against the latest and most mature set of PCI compliance requirements.

What’s new in PCI DSS 3.2?

The PCI Standards Council published PCI DSS 3.2 in April 2016 as the most updated set of requirements available. PCI DSS version 3.2 has revised and clarified the online credit card transaction requirements around encryption, access control, change management, application security, and risk management programs. Specific changes, per the PCI Security Standards Council’s Chief Technology Officer Troy Leach, include:

  • A change management process is now required as part of implementing a continuous monitoring environment (versus a yearly assessment).
  • Service providers now are required to detect and report on failures of critical security control systems.
  • The penetration testing requirement was increased from yearly to once every six months.
  • Multi-factor authentication is a requirement for personnel with non-console administrative access to systems handling card data.
  • Service providers are now required to perform quarterly reviews to confirm that personnel are following security policies and operational procedures.

Intended use of the Compliance Package

The AWS PCI DSS Compliance Package is intended to be used by AWS customers and their compliance advisors to understand the scope of the AWS Service Provider PCI DSS assessment and expectations for responsibilities when using AWS products as part of the customer’s cardholder data environment. Customers and assessors should be familiar with the AWS PCI FAQs, security best practices and recommendations published in Technical Workbook: PCI Compliance in the AWS Cloud. This Compliance Package will also assist AWS customers in:

  • Planning to host a PCI Cardholder Data Environment at AWS.
  • Preparing for a PCI DSS assessment.
  • Assessing, documenting, and certifying the deployment of a Cardholder Data Environment on AWS.

Additionally, the AWS PCI DSS Compliance Package contains AWS’s Attestation of Compliance (AoC). Provided by a PCI SSC Qualified Security Assessor Company, the AoC attests that AWS is a PCI DSS “Compliant” Level 1 service provider. Service provider Level 1, the highest level requiring the most stringent assessment requirements, is required for any service provider that stores, processes, and/or transmits more than 300,000 transactions annually. Our AoC also provides AWS customers assurance that the AWS infrastructure meets all of the applicable PCI DSS requirements. Note: As a part of the Payment Brand’s annual PCI DSS compliance validation process for Service Providers, AWS AoC is also approved by Visa and MasterCard.

Our Compliance Package also includes a Responsibility Summary, which illustrates the Shared Responsibility Model between AWS and customers to fulfill each of the PCI DSS requirements. This document was validated by a Qualified Security Assessor Company and the contents in this document are aligned with the AWS Report on Compliance.

This document includes:

  • An Executive Summary, a Business Description, and the Description of PCI DSS In-Scope Services.
  • PCI DSS Responsibility Requirements – AWS & Customers Responsibilities for In-Scope Services.
  • Appendix A1: Additional PCI DSS Requirements for Shared Hosting Providers.
  • Appendix A2: Additional PCI DSS Requirements for Entities Using SSL/Early TLS.

To request an AWS PCI DSS Compliance Package, please contact AWS Sales and Business Development. If you have any other questions about this package or its contents, please contact your AWS Sales or Business Development representative or visit AWS Compliance website for more information.

Additional resources

Chad Woolf
Director, AWS Risk and Compliance

Error Handling Patterns in Amazon API Gateway and AWS Lambda

Post Syndicated from Bryan Liston original https://aws.amazon.com/blogs/compute/error-handling-patterns-in-amazon-api-gateway-and-aws-lambda/


Ryan Green @ryangtweets
Software Development Engineer, API Gateway

A common API design practice is to define an explicit contract for the types of error responses that the API can produce. This allows API consumers to implement a robust error-handling mechanism which may include user feedback or automatic retries, improving the usability and reliability of applications consuming your API.

In addition, well-defined input and output contracts, including error outcomes, allows strongly-typed SDK client generation which further improves the application developer experience. Similarly, your API backend should be prepared to handle the various types of errors that may occur and/or surface them to the client via the API response.

This post discusses some recommended patterns and tips for handling error outcomes in your serverless API built on Amazon API Gateway and AWS Lambda.

HTTP status codes

In HTTP, error status codes are generally divided between client (4xx) and server (5xx) errors. It’s up to your API to determine which errors are appropriate for your application. The following list shows some common patterns of basic API errors.

  • Data Validation – 400 (Bad Request): The client sends some invalid data in the request, for example, missing or incorrect content in the payload or parameters. Could also represent a generic client error.
  • Authentication/Authorization – 401 (Unauthorized) or 403 (Forbidden): The client is not authenticated (401) or is not authorized to access the requested resource (403).
  • Invalid Resource – 404 (Not Found): The client is attempting to access a resource that doesn’t exist.
  • Throttling – 429 (Too Many Requests): The client is sending more than the allowed number of requests per unit time.
  • Dependency Issues – 502 (Bad Gateway) or 504 (Gateway Timeout): A dependent service is throwing errors (502) or timing out (504).
  • Unhandled Errors – 500 (Internal Server Error) or 503 (Service Unavailable): The service failed in an unexpected way (500), or is failing but is expected to recover (503).

For more information about HTTP server status codes, see RFC2616 section 10.5 on the W3C website.

Routing Lambda function errors to API Gateway HTTP responses

In API Gateway, AWS recommends that you model the various types of HTTP responses that your API method may produce, and define a mapping from the various error outcomes in your backend Lambda implementation to these HTTP responses.

In Lambda, function error messages are always surfaced in the “errorMessage” field in the response. Here’s how it’s populated in the various runtimes:

Node.js (4.3):

exports.handler = function(event, context, callback) {
    callback(new Error("the sky is falling!");
};

Java:

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class LambdaFunctionHandler implements RequestHandler<String, String> {
    @Override
    public String handleRequest(String input, Context context) {
        throw new RuntimeException("the sky is falling!");
    }
}

Python:

def lambda_handler(event, context):
    raise Exception('the sky is falling!')

Each of these results in the following Lambda response body:

{
  "errorMessage" : "the sky is falling!",
    …
}

The routing of Lambda function errors to HTTP responses in API Gateway is achieved by pattern matching against this “errorMessage” field in the Lambda response. This allows various function errors to be routed to API responses with an appropriate HTTP status code and response body.

The Lambda function must exit with an error in order for the response pattern to be evaluated – it is not possible to “fake” an error response by simply returning an “errorMessage” field in a successful Lambda response.

Note: Lambda functions failing due to a service error, i.e. before the Lambda function code is executed, are not subject to the API Gateway routing mechanism. These types of errors include internal server errors, Lambda function or account throttling, or failure of Lambda to parse the request body. Generally, these types of errors are returned by API Gateway as a 500 response. AWS recommends using CloudWatch Logs to troubleshoot these types of errors.

API Gateway method response and integration response

In API Gateway, the various HTTP responses supported by your method are represented by method responses. These define an HTTP status code as well as a model schema for the expected shape of the payload for the response.

Model schemas are not required on method responses but they enable support for strongly-typed SDK generation. For example, the generated SDKs can unmarshall your API error responses into appropriate exception types which are thrown from the SDK client.

The mapping from a Lambda function error to an API Gateway method response is defined by an integration response. An integration response defines a selection pattern used to match the Lambda function “errorMessage” and routes it to an associated method response.
Note: API Gateway uses Java pattern-style regexes for response mapping. For more information, see Pattern in the Oracle documentation.
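
If you manage your API programmatically rather than through the console, the same wiring can be expressed with the AWS SDK. Here is a hedged sketch using boto3; the REST API ID, resource ID, and method below are placeholders, not values from this post:

import boto3

# A sketch (identifiers are placeholders): create a 400 method response and an
# integration response whose selection pattern matches the Lambda errorMessage.
apigw = boto3.client('apigateway')

apigw.put_method_response(
    restApiId='abc123',       # placeholder REST API id
    resourceId='res456',      # placeholder resource id
    httpMethod='GET',
    statusCode='400',
)

apigw.put_integration_response(
    restApiId='abc123',
    resourceId='res456',
    httpMethod='GET',
    statusCode='400',
    selectionPattern=r'^\[BadRequest\].*',  # matched against the Lambda errorMessage
)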

Example:

Lambda function (Node.js 4.3):

exports.handler = (event, context, callback) => {
   callback ("the sky is falling!");
};

Lambda response body:

{
  "errorMessage": "the sky is falling!"
}

API Gateway integration response:

Selection pattern : “the sky is falling!”

Method response : 500

API Gateway response:

Status: 500

Response body:

{
  "errorMessage": "the sky is falling!"
}

In this example, API Gateway returns the Lambda response body verbatim, a.k.a. “passthrough”. It is possible to define mapping templates on the integration response to transform the Lambda response body into a different form for the API Gateway method response. This is useful when you want to format or filter the response seen by the API client.

When a Lambda function completes successfully or if none of the integration response patterns match the error message, API Gateway responds with the default integration response (typically, HTTP status 200). For this reason, it is imperative that you design your integration response patterns such that they capture every possible error outcome from your Lambda function. Because the evaluation order is undefined, it is inadvisable to define a “catch-all” (i.e., “.*”) error pattern which may be evaluated before the default response.

Common patterns for error handling in API Gateway and Lambda

There are many ways to structure your serverless API to handle error outcomes. The following section will identify two successful patterns to consider when designing your API.

Simple prefix-based

This common pattern uses a prefix in the Lambda error message string to route error types.

You would define a static set of prefixes, and create integration responses to capture each and route them to the appropriate method response. An example mapping might look like the following:

  • [BadRequest] – 400
  • [Forbidden] – 403
  • [NotFound] – 404
  • [InternalServerError] – 500

Example:

Lambda function (NodeJS):

exports.handler = (event, context, callback) => {
    callback("[BadRequest] Validation error: Missing field 'name'");
};

Lambda output:

{
  "errorMessage": "[BadRequest] Validation error: Missing field 'name'"
}

API Gateway integration response:

Selection pattern: “^\[BadRequest\].*”

Method response: 400

API Gateway response:

Status: 400

Response body:

{
  "errorMessage": "[BadRequest] Validation error: Missing field 'name'"
}

If you don’t want to expose the error prefix to API consumers, you can perform string processing within a mapping template and strip the prefix from the errorMessage field.
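
For reference, the same prefix pattern in a Python function might look like the following sketch; the 'name' field and the greeting response are illustrative, not from the original post:

def lambda_handler(event, context):
    # A sketch of the prefix pattern in Python: the exception message becomes the
    # Lambda "errorMessage", so API Gateway can match on the "[BadRequest]" prefix.
    name = event.get('name')  # 'name' is an illustrative request field
    if not name:
        raise Exception("[BadRequest] Validation error: Missing field 'name'")
    return {'greeting': 'Hello, ' + name}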

Custom error object serialization

Lambda functions can return a custom error object serialized as a JSON string, and fields in this object can be used to route to the appropriate API Gateway method response.

This pattern uses a custom error object with an “httpStatus” field and defines an explicit 1-to-1 mapping from the value of this field to the method response.

An API Gateway mapping template is defined to deserialize the custom error object and build a custom response based on the fields in the Lambda error.

Lambda function (Node.js 4.3):

exports.handler = (event, context, callback) => {
    var myErrorObj = {
        errorType : "InternalServerError",
        httpStatus : 500,
        requestId : context.awsRequestId,
        message : "An unknown error has occurred. Please try again."
    }
    callback(JSON.stringify(myErrorObj));
};

Lambda function (Java):

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.HashMap;
import java.util.Map;

public class LambdaFunctionHandler implements RequestHandler<String, String> {
    @Override
    public String handleRequest(String input, Context context) {

        Map<String, Object> errorPayload = new HashMap<>();
        errorPayload.put("errorType", "BadRequest");
        errorPayload.put("httpStatus", 400);
        errorPayload.put("requestId", context.getAwsRequestId());
        errorPayload.put("message", "An unknown error has occurred. Please try again.");

        String message;
        try {
            message = new ObjectMapper().writeValueAsString(errorPayload);
        } catch (JsonProcessingException e) {
            throw new RuntimeException("Could not serialize the error payload", e);
        }

        throw new RuntimeException(message);
    }
}

Note: this example uses Jackson ObjectMapper for JSON serialization. For more information, see ObjectMapper on the FasterXML website.
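
A comparable sketch in Python (not part of the original post) serializes the error object with json.dumps before raising it:

import json

def lambda_handler(event, context):
    # A sketch: the serialized object becomes the Lambda "errorMessage",
    # so API Gateway can pattern-match on the "httpStatus" field.
    my_error_obj = {
        'errorType': 'InternalServerError',
        'httpStatus': 500,
        'requestId': context.aws_request_id,
        'message': 'An unknown error has occurred. Please try again.'
    }
    raise Exception(json.dumps(my_error_obj))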

Lambda output:

{
  "errorMessage": "{\"errorType\":\"InternalServerError\",\"httpStatus\":500,\"requestId\":\"40cd9bf6-0819-11e6-98f3-415848322efb\",\"message\":\"An unknown error has occurred. Please try again.\"}"
}

Integration response:

Selection pattern: “.*"httpStatus":500.*”

Method response: 500

Mapping template:

#set ($errorMessageObj = $util.parseJson($input.path('$.errorMessage')))
{
  "type" : "$errorMessageObj.errorType",
  "message" : "$errorMessageObj.message",
  "request-id" : "$errorMessageObj.requestId"
}

Note: This template makes use of the $util.parseJson() function to parse elements from the custom Lambda error object. For more information, see Accessing the $util Variable.

API Gateway response:

Status: 500

Response body:

{
  "type": "InternalServerError",
  "message": " An unknown error has occurred. Please try again.",
  "request-id": "e308b7b7-081a-11e6-9ab9-117c7feffb09"
}

This is a full Swagger example of the custom error object serialization pattern. This can be imported directly into API Gateway for testing or as a starting point for your API.

{
  "swagger": "2.0",
  "info": {
    "version": "2016-04-21T23:52:49Z",
    "title": "Best practices for API error responses with API Gateway and Lambda"
  },
  "schemes": [
    "https"
  ],
  "paths": {
    "/lambda": {
      "get": {
        "consumes": [
          "application/json"
        ],
        "produces": [
          "application/json"
        ],
        "parameters": [
          {
            "name": "status",
            "in": "query",
            "required": true,
            "type": "string"
          }
        ],
        "responses": {
          "200": {
            "description": "200 response",
            "schema": {
              "$ref": "#/definitions/Empty"
            }
          },
          "400": {
            "description": "400 response",
            "schema": {
              "$ref": "#/definitions/Error"
            }
          },
          "403": {
            "description": "403 response",
            "schema": {
              "$ref": "#/definitions/Error"
            }
          },
          "404": {
            "description": "404 response",
            "schema": {
              "$ref": "#/definitions/Error"
            }
          },
          "500": {
            "description": "500 response",
            "schema": {
              "$ref": "#/definitions/Error"
            }
          }
        },
        "x-amazon-apigateway-integration": {
          "responses": {
            "default": {
              "statusCode": "200"
            },
            ".\*httpStatus\\\":404.\*": {
              "statusCode": "404",
              "responseTemplates": {
                "application/json": "#set ($errorMessageObj = $util.parseJson($input.path('$.errorMessage')))\n#set ($bodyObj = $util.parseJson($input.body))\n{\n  \"type\" : \"$errorMessageObj.errorType\",\n  \"message\" : \"$errorMessageObj.message\",\n  \"request-id\" : \"$errorMessageObj.requestId\"\n}"
              }
            },
            ".\*httpStatus\\\":403.\*": {
              "statusCode": "403",
              "responseTemplates": {
                "application/json": "#set ($errorMessageObj = $util.parseJson($input.path('$.errorMessage')))\n#set ($bodyObj = $util.parseJson($input.body))\n{\n  \"type\" : \"$errorMessageObj.errorType\",\n  \"message\" : \"$errorMessageObj.message\",\n  \"request-id\" : \"$errorMessageObj.requestId\"\n}"
              }
            },
            ".\*httpStatus\\\":400.\*": {
              "statusCode": "400",
              "responseTemplates": {
                "application/json": "#set ($errorMessageObj = $util.parseJson($input.path('$.errorMessage')))\n#set ($bodyObj = $util.parseJson($input.body))\n{\n  \"type\" : \"$errorMessageObj.errorType\",\n  \"message\" : \"$errorMessageObj.message\",\n  \"request-id\" : \"$errorMessageObj.requestId\"\n}"
              }
            },
            ".\*httpStatus\\\":500.\*": {
              "statusCode": "500",
              "responseTemplates": {
                "application/json": "#set ($errorMessageObj = $util.parseJson($input.path('$.errorMessage')))\n#set ($bodyObj = $util.parseJson($input.body))\n{\n  \"type\" : \"$errorMessageObj.errorType\",\n  \"message\" : \"$errorMessageObj.message\",\n  \"request-id\" : \"$errorMessageObj.requestId\"\n}"
              }
            }
          },
          "httpMethod": "POST",
          "requestTemplates": {
            "application/json": "{\"failureStatus\" : $input.params('status')\n}"
          },
          "uri": "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/[MY_FUNCTION_ARN]/invocations",
          "type": "aws"
        }
      }
    }
  },
  "definitions": {
    "Empty": {
      "type": "object"
    },
    "Error": {
      "type": "object",
      "properties": {
        "message": {
          "type": "string"
        },
        "type": {
          "type": "string"
        },
        "request-id": {
          "type": "string"
        }
      }
    }
  }
}

Conclusion

There are many ways to represent errors in your API. While API Gateway and Lambda provide the basic building blocks, it is helpful to follow some best practices when designing your API. This post highlights a few successful patterns that we have identified but we look forward to seeing other patterns emerge from our serverless API users.

The Backdoor Factory (BDF) – Patch Binaries With Shellcode

Post Syndicated from Darknet original http://feedproxy.google.com/~r/darknethackers/~3/wVDf5vrafmk/

The Backdoor Factory or BDF is a tool which enables you to patch binaries with shellcode and continue normal execution exactly as the executable binary would have in its pre-patched state. Some executables have built-in protection, as such this tool will not work on all binaries. It is advisable that you test target binaries […]

The post…

Read the full post at darknet.org.uk

Combine NoSQL and Massively Parallel Analytics Using Apache HBase and Apache Hive on Amazon EMR

Post Syndicated from Ben Snively original https://blogs.aws.amazon.com/bigdata/post/Tx3EGE8Z90LZ9WX/Combine-NoSQL-and-Massively-Parallel-Analytics-Using-Apache-HBase-and-Apache-Hiv

Ben Snively is a Solutions Architect with AWS

Jon Fritz, a Senior Product Manager for Amazon EMR, co-authored this post

With today’s launch of Amazon EMR release 4.6, you can now quickly and easily provision a cluster with Apache HBase 1.2. Apache HBase is a massively scalable, distributed big data store in the Apache Hadoop ecosystem. It is an open-source, non-relational, versioned database which runs on top of the Hadoop Distributed Filesystem (HDFS), and it is built for random, strictly consistent, real-time access to tables with billions of rows and millions of columns. It has tight integration with Apache Hadoop, Apache Hive, and Apache Pig, so you can easily combine massively parallel analytics with fast data access. Apache HBase’s data model, throughput, and fault tolerance are a good match for workloads in ad tech, web analytics, financial services, applications using time-series data, and many more.

Table structure in Apache HBase, like many NoSQL technologies, should be directly influenced by the queries and access patterns of the data. Query performance varies drastically based on the way the cluster has to process and return the data. In this post, we’ll demonstrate query performance differences by showing you how to launch an EMR cluster with HBase and restore a table from a snapshot in Amazon S3. The table in the snapshot contains approximately 3.5 million rows, and you’ll perform look-ups and scans using the HBase shell as well as perform SQL queries over the same dataset using Hive with the Hive query editor in the Hue UI.

Introduction to Apache HBase

HBase is considered a persistent, multidimensional, sorted map, where each cell is indexed by a row key and column key (family and qualifier). Each cell can contain multiple versions of a value captured with a long variable, which defaults to a timestamp value. The reason it’s considered multidimensional is that there are several parameters that contribute to the cell location.

Table Structure

An HBase table is composed of one or more column families that are defined for the table. Each column family defines shared storage and a set of properties for an arbitrary grouping of columns. Column families are predefined (either at table creation time or by modifying an existing table), but the columns themselves can be dynamically created and discovered while adding or updating rows.

HBase stores data in HDFS, which spreads data stored in a table across the cluster. All the columns in a column family are stored together in a set of HFiles (also known as a column-oriented storage). A rowkey, which is immutable and uniquely defines a row, usually spans multiple HFiles. Rowkeys are treated as byte arrays (byte[]) and are stored in a sorted order in the multi-dimensional sorted map. Cells in your table are byte arrays too.

Data from a table is served by RegionServers running on nodes in your cluster. Each region server manages a namespace of rowkeys, and HBase can split regions as tables get larger, to keep the namespace for each region to a manageable size.

Query Performance

Lower query latency and higher throughput are achieved when each query scans less data and the queries are distributed across all the RegionServers. Querying for a specific row and column allows the cluster to quickly find the RegionServer and the underlying files that store the information and return it to the caller. A partial range, if known, can speed up queries by reducing the rows that need to be scanned when a single row is unknown. By default, HBase also utilizes row-level bloom filters to reduce the number of disk reads per Get request.

The diagram below shows the relationship between cardinality and query performance:

HBase shell query walkthrough

Using the console or the CLI, launch a new EMR 4.6 cluster that has the HBase, Hive, and Hue applications selected. Next, you restore a table snapshot on your cluster. We created this snapshot from an HBase table for this demo, and snapshotting is a convenient way to back-up and restore tables for production HBase clusters. Below is the schema of the data stored in the snapshot:

The rowkey is a composite value that combines lastname, firstname, and customerId. There are three column families that group columns containing information about address, credit card, and contact information.

After the cluster is running, recover the sample HBase snapshot from Amazon S3. HBase uses Hadoop MapReduce and EMRFS under the hood to transfer snapshots quickly from Amazon S3. SSH to the master node of your cluster, and use the HBase shell to create an empty table which will be populated by the restored data.

hbase shell
>> create 'customer', 'address', 'cc', 'contact'

Next, run an HBase command (outside of the shell) to copy the snapshot from Amazon S3 to HDFS on your cluster.

sudo su hbase -s /bin/sh -c 'hbase snapshot export -D hbase.rootdir=s3://us-east-1.elasticmapreduce.samples/hbase-demo-customer-data/snapshot -snapshot customer_snapshot1 -copy-to hdfs://<MASTERDNS>:8020/user/hbase -mappers 2 -chuser hbase -chmod 700'

Finally, use the HBase shell to disable the ‘customer’ table, restore the snapshot, and re-enable the table.

hbase shell
>> disable 'customer'
>> restore_snapshot 'customer_snapshot1'
>> enable 'customer'

To demonstrate how specifying different portions of the key structure in the multi-dimensional map influences performance, you will perform a variety of queries using the HBase shell.

Get the credit card type for a specific customer

In this case, you specify every parameter in the multi-dimensional map and quickly return the cell from the RegionServer. The syntax for this ‘get’ request is ‘TABLE_NAME’, ‘ROWKEY’, ‘COLUMN_FAMILY:COLUMN’

hbase(main):008:0> 
    get 'customer', 'armstrong_zula_8570365786', 'cc:type'

Get all the address data from the address column family for a specific customer

In this case, you return all of the data from the ‘address’ column family for a specific customer. As you can see, there is a time penalty when compared to just returning one specific cell.

hbase(main):011:0> 
    get 'customer', 'armstrong_zula_8570365786', {COLUMN => 'address'}


Get all the data on a specific customer

In this case, you return all of the data for a specific customer. There is a time cost for returning additional information.

hbase(main):004:0> get 'customer', 'armstrong_zula_8570365786'

 

Get all the cities for each row that has a customer with a last name starting with “armstrong”

In this case, there is a partial prefix for the rowkey as well as a specific column family/qualifier for which you are querying. The HBase cluster is able to determine which Regions and RegionServers it needs to query. After that is done, each RegionServer can check its HFiles to see whether they contain that specific rowkey and column data.

There are multiple performance improvements that a customer can enable on their table, such as bloom filters. Bloom filters allow the RegionServer to skip over some of the HFiles quickly.

hbase(main):014:0> 
scan 'customer', {STARTROW => 'armstrong_', ENDROW => 'armstronh', COLUMN => 'address:city'}

Get all customers that use a Visa credit card

In this example, HBase has to scan over every RegionServer, and search all the HFiles that contain the cc Column Family. The full list of customers that use Visa credit cards is returned.

hbase(main):014:0> 
scan 'customer', { COLUMNS => 'cc:type',  
                   FILTER => "ValueFilter( =, 'binaryprefix:visa' )" }

 

Running SQL analytics over data in HBase tables using Hive

HBase can organize your dataset for fast look-ups and scans, easily update rows, and bulk update your tables. Though it doesn’t have a native SQL interface to run massively parallel analytics workloads, HBase tightly integrates with Apache Hive, which utilizes Hadoop MapReduce as an execution engine, allowing you to write SQL queries on your HBase tables quickly or join data in HBase with other datasets.

For this post, you will write your Hive queries and browse tables in the Hive Metastore in the Hue UI, which runs on the master node of your cluster. To connect to Hue, see Launch the Hue Web Interface.

After you have signed into Hue in your browser, explore the UI by choosing Data Browsers, HBase Browser. Select the “customer” table to view your HBase table in the browser. Here you can see a sample of rows in your table and can query, add, and update rows through the UI. To learn more about the functionality available, see HBase Browser in the Hue documentation. Use the search box to scan HBase for 100 records with the rowkey prefix “smith” and return the credit card type and state:

Smith* +100 [cc:type,address:state]

 

Next, go to the Hive editor under Query Editor and create an external Hive table over your data in HBase:

CREATE EXTERNAL TABLE customer
(rowkey STRING,
street STRING,
city STRING,
state STRING,
zip STRING,
cctype STRING,
ccnumber STRING,
ccexpire STRING,
phone STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,address:street,address:city,address:state,address:zip,cc:type,cc:number,cc:expire,contact:phone', 'hbase.scan.cacheblocks' = 'false', 'hbase.scan.cache' = '1000')
TBLPROPERTIES ('hbase.table.name' = 'customer');

Hive uses the HBase storage handler to interact with HBase tables, and you specify the column information and other information in the SerDe properties. In this table, you disable caching the blocks requested by Hive in RegionServers to avoid replacing records in the cache, and set the number of batched records returned from HBase to a client to 1,000. Other settings for the storage handler can be found in the Class HBaseSerDe topic in the Hive documentation.                        

After creating the Hive table, you can browse the schema and sample the data in the Metastore Tables section under the Data Browsers category:

Now that you have created your table, run some SQL queries. Note that Hive creates one mapper per input split, and calculates one split per region in your HBase table (see the HBase storage handler page for more information). The table in this example has two regions, so the Hive job uses two mappers regardless of the job’s true memory requirements.

Mappers use the memory settings present on the cluster, and the default mapper sizes for each instance type can be found in the EMR documentation. However, the way Hive calculates the number of mappers (1 per region) may cause too few mappers to be created for the actual data size, which in turn can cause out-of-memory issues for your job. To give each mapper more memory than the defaults, you can set these parameters in your Hive query editor (these stay alive for the duration of your Hive session in Hue):

SET mapreduce.map.memory.mb=2000;
SET mapreduce.map.java.opts=-Xmx1500m;

Execute, and now you have increased the mapper memory size. Now, count the rows in the HBase table:

SELECT count (rowkey)
FROM customer;

As you can see from the logs, Hive creates a MapReduce job, and each mapper uses predicate push-downs with the RegionServer scan API to return data efficiently. Now, perform a more advanced query: finding the count of each credit card type for all customers from California.

SELECT cctype, count(rowkey) AS count
FROM customer
WHERE state = 'CA'
GROUP BY cctype;

Choose Chart in the results section to view the output in a graphical representation.

Querying, fusing, and aggregating data from S3 and HBase

You can create multiple tables in the Hive Metastore, and have each table backed by different external sources including Amazon S3, HDFS, and HBase. These sources can be used together in a join, enabling you to enrich data across sources and aggregate metrics across datasets.

In this example, you are going to use data stored in Amazon S3 to help enrich the data already stored in HBase. The data in HBase has the abbreviation for each state in each row. However, there is a CSV file in S3 that contains the state abbreviation (as a join key), full state name, governor name, and governor’s political affiliation.

First, you need to create an external table over Amazon S3. Use the Hive query editor in Hue to create the table:

CREATE EXTERNAL TABLE StateInfo(
        StateName STRING, 
        Abbr STRING,
        Name STRING,
        Affiliation STRING)
     ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    location 's3://us-east-1.elasticmapreduce.samples/hbase-demo-customer-data/state-data/';

In the following query, you find the number of customers per credit card type and state name. To do this, query tables in both HBase and S3 as input sources, and join the tables on the state abbreviation:

SELECT cctype, statename, count(rowkey)
FROM customer c
JOIN stateinfo s ON (c.state = s.abbr)
GROUP BY cctype, statename;

Next, you can see which governors have the most constituents who use Visa credit cards:

SELECT name, max(affiliation) AS affiliation, count(rowkey)
FROM customer c
JOIN stateinfo s ON (c.state = s.abbr)
WHERE cctype = 'visa'
GROUP BY name;

Conclusion

In this post, you explored the HBase table structure, restoring table snapshots from S3, and the relative performance of various GET requests. Next, you used Hive to create an external table over data in HBase, and ran SQL queries using the Hive HBase storage handler on your HBase data. Finally, you joined data from an external, relational table in Amazon S3 with data in non-relational format in HBase.

If you have any questions about using HBase 1.2 on Amazon EMR, or have interesting use cases leveraging HBase with other applications in the Apache ecosystem, please comment below.

If you have questions or suggestions, please leave a comment below.


Poland vs the United States: immigration

Post Syndicated from Michal Zalewski original http://lcamtuf.blogspot.com/2015/07/poland-vs-united-states-immigration.html

This is the eleventh article in a short series about Poland, Europe, and the United States. To explore the entire series, start here.

There are quite a few corners of the world where the ratio of immigrants to native-born citizens is remarkably high. Many of these places are small or rapidly growing countries – say, Monaco or Qatar. Some others, including several European states, just happen to be on the receiving end of transient, regional demographic shifts; for example, in the past decade, over 500,000 people moved from Poland to the UK. But on the list of foreigner-friendly destinations, the US deserves a special spot: it is an enduring home to by far the largest, most diverse, and quite possibly best-assimilated migrant population in the world.

The inner workings of the American immigration system are a fascinating mess – a tangle of complex regulation, of multiple overlapping bureaucracies, and of quite a few unique social norms. The bureaucratic machine itself is ruthlessly efficient, issuing several million non-tourist visas and processing over 700,000 naturalization applications every year. But the system is also marred by puzzling dysfunction: for example, it allows highly skilled foreign students to attend US universities, sometimes granting them scholarships – only to show many of them the door the day they graduate. It runs a restrictive H-1B visa program that ties foreign workers to their petitioning employers, preventing them from seeking better wages – thus artificially depressing the salaries of some citizen and permanent resident employees who now have to compete with H-1B captives. It also neglects the countless illegal immigrants who, with the tacit approval of legislators and business owners, prop up many facets of the economy – but are denied the ability to join the society even after decades of staying out of trouble and doing honest work.

Despite being fairly picky about the people it admits into its borders, in many ways, the United States is still an exceptionally welcoming country: very few other developed nations unconditionally bestow citizenship onto all children born on their soil, run immigration lotteries, or allow newly-naturalized citizens to invite their parents, siblings, and adult children over, no questions asked. At the same time, the US immigration system has a shameful history of giving credence to populist fears about alien cultures – and of implementing exclusionary policies that, at one time or another, targeted anyone from the Irish, to Poles, to Arabs, to people from many parts of Asia or Africa. Some pundits still find this sort of scaremongering fashionable, now seeing Mexico as the new threat to the national identity and to the American way of life. The claim made very little sense 15 years ago – and makes even less of it today, as the migration from the region has dropped precipitously and has been eclipsed by the inflow from other parts of the world.

The contradictions, the dysfunction, and the occasional prejudice aside, what always struck me about the United States is that immigration is simply a part of the nation’s identity; the principle of welcoming people from all over the world and giving them a fair chance is an axiom that is seldom questioned in any serious way. When surveyed, around 80% Americans can identify their own foreign ancestry – and they often do this with enthusiasm and pride. Europe is very different, with national identity being a more binary affair; I always felt that over there, accepting foreigners is seen as a humanitarian duty, not an act of nation-building – and that this attitude makes it harder for the newcomers to truly integrate into the society.

In the US, as a consequence of treating contemporary immigrants as equals, many newcomers face a strong social pressure to make it on their own, to accept American values, and to adopt the American way of life; it is a powerful, implicit social contract that very few dare to willingly renege on. In contrast to this, post-war Europe approaches the matter differently, seeing greater moral value in letting the immigrants preserve their cultural identity and customs, with the state stepping in to help them jumpstart their new lives through a variety of education programs and financial benefits. It is a noble concept, although I’m not sure if the compassionate European approach always worked better than the more ruthless and pragmatic American method: in France and in the United Kingdom, massive migrant populations have been condemned to a life of exclusion and hopelessness, giving rise to social unrest and – in response – to powerful anti-immigrant sentiments and policies. I think this hasn’t happened to nearly the same extent in the US, perhaps simply because the social contract is structured in a different way – but then, I know eminently reasonable folks who would disagree.

As for my own country of origin, it occupies an interesting spot. Historically a cosmopolitan nation, Poland has lost much of its foreign population and ethnic minorities to the horrors of World War II and to the policies implemented within the Soviet Bloc – eventually becoming one of the most culturally and ethnically homogeneous nations on the continent. Today, migrants comprise less than 1% of its populace, and most of them come from the neighboring, culturally similar Slavic states. Various flavors of xenophobia run deep in the society, playing right into the recent pan-European anti-immigration sentiments. As I’m writing this, Poland is fighting the European Commission tooth and nail not to take three thousand asylum seekers from Syria; many politicians and pundits want to first make sure that all the refugees are of Christian faith. For many Poles, reasonable concerns over non-assimilation and extremism blend with a wholesale distrust of foreign cultures.

For the next article in the series, click here.

On journeys

Post Syndicated from Michal Zalewski original http://lcamtuf.blogspot.com/2015/03/on-journeys.html

– 1 –

Poland is an ancient country whose history is deeply intertwined with that of Western civilization. In its glory days, the Polish-Lithuanian Commonwealth sprawled across vast expanses of land in central Europe, from the Black Sea to the Baltic Sea. But over the past two centuries, it suffered a series of military defeats and political partitions at the hands of its closest neighbors: Russia, Austria, Prussia, and – later – Germany.

After more than a hundred years of foreign rule, Poland re-emerged as an independent state in 1918, only to face the armies of Nazi Germany at the onset of World War II. With Poland’s European allies reneging on their earlier military guarantees, the fierce fighting left the country in ruins. Some six million people died within its borders – more than ten times the death toll in France or in the UK. Warsaw was reduced to a sea of rubble, with perhaps one in ten buildings still standing by the end of the war.

With the collapse of the Third Reich, Franklin D. Roosevelt, Winston Churchill, and Joseph Stalin sat down in Yalta to decide the new order for war-torn Europe. At Stalin’s behest, Poland and its neighboring countries were placed under Soviet political and military control, forming what has become known as the Eastern Bloc.

Over the next several decades, the Soviet satellite states experienced widespread repression and economic decline. But weakened by the expense of the Cold War, the communist chokehold on the region eventually began to wane. In Poland, the introduction of martial law in 1981 could not put an end to sweeping labor unrest. Narrowly dodging the specter of Soviet intervention, the country regained its independence in 1989 and elected its first democratic government; many other Eastern Bloc countries soon followed suit.

Ever since then, Poland has enjoyed a period of unprecedented growth and has emerged as one of the more robust capitalist democracies in the region. In just two decades, it shed many of its backward, state-run heavy industries and adopted a modern, service-oriented economy. But the effects of the devastating war and the lost decades under communist rule still linger on – whether you look at the country’s infrastructure, at its socialist-realist cityscapes, at its political traditions, or at the depressingly low median wage.

When thinking about the American involvement in the Cold War, people around the world may recall Vietnam, Bay of Pigs, or the proxy wars fought in the Middle East. But in Poland and many of its neighboring states, the picture you remember the most is the fall of the Berlin Wall.

– 2 –

I was born in Warsaw in the winter of 1981, at the onset of martial law, with armored vehicles rolling onto Polish streets. My mother, like many of her generation, moved to the capital in the sixties as a part of an effort to rebuild and repopulate the war-torn city. My grandma would tell eerie stories of Germans and Soviets marching through their home village somewhere in the west. I liked listening to the stories; almost every family in Poland had some to tell.

I did not get to know my father. I knew his name; he was a noted cinematographer who worked on big-ticket productions back in the day. He left my mother when I was very young and never showed interest in staying in touch. He had a wife and other children, so it might have been that.

Compared to him, mom hasn’t done well for herself. We ended up in social housing in one of the worst parts of the city, on the right bank of the Vistula river. My early memories from school are those of classmates sniffing glue from crumpled grocery bags. I remember my family waiting in lines for rationed toilet paper and meat. As a kid, you don’t think about it much.

The fall of communism came suddenly. I have a memory of grandma listening to broadcasts from Radio Free Europe, but I did not understand what they were all about. I remember my family cheering one afternoon, transfixed to a black-and-white TV screen. I recall my Russian language class morphing into English; I had my first taste of bananas and grapefruits. There is the image of the monument of Feliks Dzierżyński coming down. I remember being able to go to a better school on the other side of Warsaw – and getting mugged many times on the way.

The transformation brought great wealth to some, but many others have struggled to find their place in the fledgling and sometimes ruthless capitalist economy. Well-educated and well read, my mom ended up in the latter pack, at times barely making ends meet. I think she was in part a victim of circumstance, and in part a slave to a way of thinking that did not permit the possibility of taking chances or pursuing happiness.

– 3 –

Mother always frowned upon popular culture, seeing it as unworthy of an educated mind. For a time, she insisted that I only listen to classical music. She angrily shunned video games, comic books, and cartoons. I think she perceived technology as trivia; the only field of science she held in high regard was abstract mathematics, perhaps for its detachment from the mundane world. She hoped that I would learn Latin, a language she could read and write; that I would practice drawing and painting; or that I would read more of the classics of modernist literature.

Of course, I did almost none of that. I hid my grunge rock tapes between Tchaikovsky, listened to the radio under the sheets, and watched the reruns of The A-Team while waiting for her to come back from work. I liked electronics and chemistry a lot more than math. And when I laid my hands on my first computer – an 8-bit relic of British engineering from 1982 – I soon knew that these machines, in their incredible complexity and flexibility, were what I wanted to spend my time on.

I suspected I could be a competent programmer, but never had enough faith in my skill. Yet, in learning about computers, I realized that I had a knack for understanding complex systems and poking holes in how they work. With a couple of friends, we joined the nascent information security community in Europe, comparing notes on mailing lists. Before long, we were taking on serious consulting projects for banks and the government – usually on weekends and after school, but sometimes skipping a class or two. Well, sometimes more than that.

All of a sudden, I was facing an odd choice. I could stop, stay in school and try to get a degree – going back every night to a cramped apartment, my mom sleeping on a folding bed in the kitchen, my personal space limited to a bare futon and a tiny desk. Or, I could seize the moment and try to make it on my own, without hoping that one day, my family would be able to give me a head start.

I moved out, dropped out of school, and took on a full-time job. It paid somewhere around $12,000 a year – a pittance anywhere west of the border, but a solid wage in Poland even today. Not much later, I was making two times as much, about the upper end of what one could hope for in this line of work. I promised myself to keep taking courses after hours, but I wasn’t good at sticking to the plan. I moved in with my girlfriend, and at the age of 19, I felt for the first time that things were going to be all right.

– 4 –

Growing up in Europe, you get used to the barrage of low-brow swipes taken at the United States. Your local news will never pass up the opportunity to snicker about the advances of creationism somewhere in Kentucky. You can stay tuned for a panel of experts telling you about the vastly inferior schools, the medieval justice system, and the striking social inequality on the other side of the pond. You don’t doubt their words – but deep down inside, no matter how smug the critics are, or how seemingly convincing their arguments, the American culture still draws you in.

My moment of truth came in the summer of 2000. A company from Boston asked me if I’d like to talk about a position on their research team; I looked at the five-digit figure and could not believe my luck. Moving to the US was an unreasonable risk for a kid who could barely speak English and had no safety net to fall back to. But that did not matter: I knew I had no prospects of financial independence in Poland – and besides, I simply needed to experience the New World through my own eyes.

Of course, even with a job offer in hand, getting into the United States is not an easy task. An engineering degree and a willing employer open up a straightforward path; it is simple enough that some companies would abuse the process to source cheap labor for menial, low-level jobs. With a visa tied to the petitioning company, such captive employees could not seek better wages or more rewarding work.

But without a degree, the options shrink drastically. For me, the only route would be a seldom-granted visa reserved for extraordinary skill – meant for the recipients of the Nobel Prize and other folks who truly stand out in their field of expertise. The attorneys looked over my publication record, citations, and the supporting letters from other well-known people in the field. Especially given my age, they thought we had a good shot. A few stressful months later, it turned out that they were right.

The week of my twentieth birthday, I packed two suitcases and boarded a plane to Boston. My girlfriend joined me, miraculously securing a scholarship at a local university to continue her physics degree; her father helped her with some of the costs. We had no idea what we were doing; we had perhaps a few hundred bucks on us, enough to get us through the first couple of days. Four thousand miles away from our place of birth, we were starting a brand new life.

– 5 –

The cultural shock gets you, but not in the sense you imagine. You expect big contrasts, a single eye-opening day to remember for the rest of your life. But driving down a highway in the middle of a New England winter, I couldn’t believe how ordinary the world looked: just trees, boxy buildings, and pavements blanketed with dirty snow.

Instead of a moment of awe, you drown in a sea of small, inconsequential things, draining your energy and making you feel helpless and lost. It’s how you turn on the shower; it’s where you can find a grocery store; it’s what they meant by that incessant “paper or plastic” question at the checkout line. It’s how you get a mailbox key, how you make international calls, it’s how you pay your bills with a check. It’s the rules at the roundabout, it’s your social security number, it’s picking the right toll lane, it’s getting your laundry done. It’s setting up a dial-up account and finding the food you like in the sea of unfamiliar brands. It’s doing all this without Google Maps or a Facebook group to connect with other expats nearby.

The other thing you don’t expect is losing touch with your old friends; you can call or e-mail them every day, but your social frames of reference begin to drift apart, leaving less and less to talk about. The acquaintances you make in the office will probably never replace the folks you grew up with. We managed, but we weren’t prepared for that.

– 6 –

In the summer, we had friends from Poland staying over for a couple of weeks. By the end of their trip, they asked to visit New York City one more time; we liked the Big Apple, so we took them on a familiar ride down I-95. One of them went to see the top of World Trade Center; the rest of us just walked around, grabbing something to eat before we all headed back. A few days later, we were all standing in front of a TV, watching September 11 unfold in real time.

We felt horror and outrage. But when we roamed the unsettlingly quiet streets of Boston, greeted by flags and cardboard signs urging American drivers to honk, we understood that we were strangers a long way from home – and that our future in this country hung in the balance more than we would have thought.

Permanent residency is a status that gives a foreigner the right to live in the US and do almost anything they please – change jobs, start a business, or live off one’s savings all the same. For many immigrants, the pursuit of this privilege can take a decade or more; for some others, it stays forever out of reach, forcing them to abandon the country in a matter of days as their visas expire or companies fold. With my O-1 visa, I always counted myself among the lucky ones. Sure, it tied me to an employer, but I figured that sorting it out wouldn’t be a big deal.

That proved to be a mistake. In the wake of 9/11, an agency known as the Immigration and Naturalization Service was being dismantled and replaced by a division within the Department of Homeland Security. My own seemingly straightforward immigration petition ended up somewhere in the bureaucratic vacuum that formed between the two administrative bodies. I waited patiently, watching the deepening market slump, and seeing my employer’s prospects get dimmer and dimmer every month. I was ready for the inevitable, with other offers in hand, prepared to make my move perhaps the very first moment I could. But the paperwork just would not come through. With the Boston office finally shutting down, we packed our bags and booked flights. We faced the painful admission that for three years, we had chased nothing but a pipe dream. The only thing we had to show for it was two adopted cats, now sitting frightened somewhere in the cargo hold.

The now-worthless approval came through two months later; the lawyers, cheerful as ever, were happy to send me a scan. The hollowed-out remnants of my former employer were eventually bought by Symantec – the very place from where I had my backup offer in hand.

– 7 –

In a way, Europe’s obsession with America’s flaws made it easier to come home without ever explaining how the adventure really played out. When asked, I could just wing it: a mention of the death penalty or permissive gun laws would always get you a knowing nod, allowing the conversation to move on.

Playing to other people’s preconceptions takes little effort; lying to yourself calls for more skill. It doesn’t help that when you come back after three years away from home, you notice all the small annoyances that you used to simply tune out. Back then, Warsaw still had a run-down vibe: the dilapidated road from the airport; the drab buildings on the other side of the river; the uneven pavements littered with dog poop; the dirty walls at my mother’s place, with barely any space to turn. You can live with it, of course – but it’s a reminder that you settled for less, and it’s a sensation that follows you every step of the way.

But more than the sights, I couldn’t forgive myself something else: that I was coming back home with just loose change in my pocket. There are some things that a failed communist state won’t teach you, and personal finance is one of them; I always looked at money just as a reward for work, something you get to spend to brighten your day. The indulgences were never extravagant: perhaps I would take the cab more often, or have take-out every day. But no matter how much I made, I kept living paycheck-to-paycheck – the only way I knew, the way our family always did.

– 8 –

With a three-year stint in the US on your resume, you don’t have a hard time finding a job in Poland. You face the music in a different way. I ended up with a salary around a fourth of what I used to make in Massachusetts, but I simply decided not to think about it much. I wanted to settle down, work on interesting projects, marry my girlfriend, have a child. I started doing consulting work whenever I could, setting almost all the proceeds aside.

After four years with T-Mobile in Poland, I had enough saved to get us through a year or so – and in a way, it changed the way I looked at my work. Being able to take on ambitious challenges and learn new things started to matter more than jumping ships for a modest salary bump. Burned by the folly of pursuing riches in a foreign land, I put a premium on boring professional growth.

Comically, all this introspection made me realize that from where I stood, I had almost nowhere left to go. Sure, Poland had telcos, refineries, banks – but they all consumed the technologies developed elsewhere, shipped here in a shrink-wrapped box; as far as their IT went, you could hardly tell the companies apart. To be a part of the cutting edge, you had to pack your bags, book a flight, and take a jump into the unknown. I sure as heck wasn’t ready for that again.

And then, out of the blue, Google swooped in with an offer to work for them from the comfort of my home, dialing in for a videoconference every now and then. The starting pay was about the same, but I had no second thoughts. I didn’t say it out loud, but deep down inside, I already knew what needed to happen next.

– 9 –

We moved back to the US in 2009, two years after taking the job, already on the hook for a good chunk of Google’s product security and with the comfort of knowing where we stood. In a sense, my motive was petty: you could call it a desire to vindicate a failed adolescent dream. But in many other ways, I have grown fond of the country that shunned us once before; and I wanted our children to grow up without ever having to face the tough choices and the uncertain prospects I had to deal with in my earlier years.

This time, we knew exactly what to do: a quick stop at a grocery store on the way from the airport, followed by an e-mail to our immigration folks to get the green card paperwork out the door. A bit more than half a decade later, we were standing in a theater in Campbell, reciting the Oath of Allegiance and clinging on to our new certificates of US citizenship.

The ceremony closed a long and interesting chapter in my life. But more importantly, standing in that hall with people from all over the globe made me realize that my story is not extraordinary; many of them had lived through experiences far more harrowing and captivating than mine. If anything, my tale is hard to tell apart from that of countless other immigrants from the former Eastern Bloc. By some estimates, in the US alone, the Polish diaspora is about 9 million strong.

I know that the Poland of today is not the Poland I grew up in. It’s not even the Poland I came back to in 2003; the gap to Western Europe is shrinking every single year. But I am grateful to now live in a country that welcomes more immigrants than any other place on Earth – and at the end of their journey, makes many of them feel at home. It also makes me realize how small and misguided must be the conversations we are having about immigration – not just here, but all over the developed world.

To explore other articles in this short series about Poland, click here. You can also directly proceed to the next entry here.

Lapni.bg – go on, take a bite

No matter how well a man lives, one day he runs into on-line shopping.
The above sums up, as briefly and clearly as possible, the sad picture of Bulgarian on-line commerce – if it can be called on-line at all, and if it can be called commerce at all. The story begins with me deciding to buy a voucher for something (what exactly is beside the point) from some site… say, from Lapni.bg. Everything is wonderful: I have picked the offer and I am ready to shop:
1. I check how one can pay. Of course there is no information on the site – it consists of a pile of offers. Or maybe there is, but it is hidden away in some secret place. I do notice, however, that the home page shows big logos of various payment instruments, and among them sits the PayPal logo – apparently they accept PayPal payments too. Great – let’s continue.
Apparently they accept PayPal
2. I discover, to my surprise, that I already have an account on the site; naturally I don’t remember the password… I click “forgotten password” and, after a fairly standard procedure, receive a new one. So far so good. I decide to change the newly generated password to something I hope to remember more easily, and the hunt begins… It turns out, 10 minutes later, that the password is changed from the “My vouchers” menu. Great, how did I not think of that earlier? Fine, “My vouchers” has a “Settings” submenu… and there you can change the password… which is entered into? You guessed it – not a password field, but a plain text field. I admit, it is very convenient: you can see your own password!
3. So far, nothing serious. I changed my password and headed for the secret page for ordering an offer. And what do I find there – a big “Make a gift” button… OK, I won’t play dumb: I did figure out that you order with the big red plus, but it was hardly obvious.
The red plus
You click the red plus and, after a few checkout screens, you end up at a “Pay now” button. You click it and the following form loads:
The payment form
I don’t know whether it strikes you, but it certainly strikes me that PayPal is missing from the listed payment options – rather disappointing. After some deliberation I decide to pick the “Credit / Debit card – pay directly with your credit/debit card” option, even though I am generally rather suspicious, not to say paranoid, about all kinds of payment systems of dubious quality, origin, and functionality. We carry on and – surprise! We land on some sinister BORICA site that looks roughly like this:
BORICA
Let’s start with the fact that the merchant logo does not load… in some browsers the wise caption “Merchant Logo” is displayed, in others there is just an outlined hole. My first suspicion was that the logo was not served over https and that is why it would not load; on closer inspection I found that it simply does not work – the server just returns an empty response, with a Content-Type: text/plain header on top of that. Not that this surprises me – it is BORICA, after all.
Next, below the form sits the following remark: “If your card supports 3D authentication, you may be required to identify yourself after pressing the ‘Pay’ button.” Since my card was not enrolled in 3D authentication, I calmly carried on. A message then appeared saying that my card actually does support 3D Secure and that, even though I had declined that option at my bank, if I want to pay through BORICA’s system I will have to agree to use said 3D Secure. But more on that later…
4. I filled in all the fields and clicked the coveted “Pay” button. Some progress bars twitched… and “the system said no” – it told me an error had occurred… some error, nobody knows which exactly, try again later. But it is not clear whether the payment went through or not. Big deal – just try again; if you end up paying twice, so be it.
Here I make an unexpected detour that was not part of the plan for this whole on-line shopping exercise. I dig out my electronic signature, boot up the other laptop, because that is where everything for it is installed… and log into the on-line banking of the bank that issued the card, to check whether there have been any card authorizations within the last hour. At least if a payment had gone through, I would know to go argue with someone. But no – nothing had gone through, thankfully.
5. Since the BORICA site carries a big warning not to use the BACK button or REFRESH (a telling sign of the programmers’ skill, I know it from experience), I decide to go back to Lapni.bg manually and try to pay a second time. I go back, but there is no option to pay for an order that, for whatever reason, was not paid the first time. Fine – we’ll place a new order… The technique is well rehearsed by now – click, click, click… done, we are on the BORICA site again… I fill in the details once more, “Pay”… progress bar… hooray – no error… and a message appears saying that this card supports 3D Secure and that I must enter some password, which I naturally do not have, since I do not use 3D Secure. After reading some help page – which, incidentally, is set up to open by default when you press Enter in any field of the form – it becomes clear that even though I do not use 3D Secure, if I want to pay through this system I will have to register my card for 3D Secure with the issuing bank…
What follows is another part of the story that I may tell some other time… but let’s just say that about 20–30 minutes later I have 3D Secure on the card and know the password in question… Naturally, the BORICA session has long since expired and everything starts from scratch.
It should be noted here that the number of vouchers on Lapni.bg is limited, and the offer says so explicitly. I notice that every time I order a voucher and fail to pay for it, the count of “sold” vouchers goes up. And if you think the reason is that someone else is buying at the same moment – I don’t think so, because all of this is happening in the small hours of the night, and the more likely scenario is simply that the system is brain-dead.
6. I go through the whole scenario again, place a new order, now knowing all the pitfalls, reach the BORICA payment step – no error, no “try again later”… it asks for the secret 3D Secure password… I enter it (I received it from the bank literally minutes ago)… and “the system said no” – the password is supposedly wrong. I enter it a second time… “the system said no”… the third time, very carefully, I type it outside the password field so I can see exactly what comes out (here I ironically recalled how convenient the Lapni.bg form is, where a plain text field is used instead of a password field), copy the 100% correct password, paste it, and… “the system said no”… On that third attempt it decided I was an evil hacker, told me I could not pay, and threw me out… Unbelievable!
7. I went back to the on-line banking to check whether I had perhaps mistyped the password when registering for 3D Secure… even though there had been a field for re-entering it – alas, it turned out there was no way to verify this. The only option is to change the password, for the modest sum of 10 stotinki (a few cents). I cursed the lot of them (once again)… and decided that before changing the password (it is a matter of principle, not of the 10 stotinki) I would try once more to walk the whole path from start to finish. Maybe something would finally work… Meanwhile, after every failed attempt I go and check whether there is a card authorization, because by now I trust nobody and no system.
And so I started, for the umpteenth time, to fill in all the fields and little boxes from scratch… found the offer, ordered it once more, picked a payment method, went to the BORICA site, entered the details, it asked for the 3D Secure password… and lo and behold – the very same password that was wrong a moment ago, without my changing it, is now no longer wrong.
8. A “payment successful” message; I check with the bank – there is a successful card authorization; an SMS about the payment arrives; fanfares, confetti… joy – an hour-long battle is about to end with a victory of man over on-line commerce. I go back to Lapni.bg and there is nothing there… When you use on-line payment and shopping tools you expect things to happen in real time – alas, it turns out they happen “within a few minutes”… A few minutes later, everything showed up.
9. Meanwhile, a few other problems came up that are not described above:
9.1. The Lapni.bg site has no contact phone number to call if you run into problems like the ones described above.
9.2. The BORICA site says to contact the administrator, but naturally there is neither a phone number nor an e-mail address.
9.3. Lapni.bg has some secret links that I only managed to reach the next day, because some titan of engineering thought added JavaScript for infinite scroll, and the moment you scroll to the very bottom to see the links in the footer, more offers load dynamically and the footer slips further down… you can chase it like that until you lose your mind.
9.4. The Lapni.bg search box is eager to search as you type… and it decides that you are typing whenever a key is pressed. The trouble is that there is no timeout, so no matter how fast you type, it tries to reload the results on every keystroke, and the result is a mess. On top of that, if you press the arrow keys in the search field (i.e. you type nothing at all), the search results reload yet again.
So, some would conclude that my attempt at on-line shopping was a success, because all’s well that ends well. I, however, will say NO, THE ATTEMPT WAS A FAILURE, because I refuse to accept that an utterly elementary Internet purchase should end up taking almost two full hours! This is an activity that is supposed to be fast, accessible, and easy.
P.S. While writing this post and taking screenshots of the Lapni.bg and BORICA sites, I unexpectedly discovered that there actually is a PayPal payment option… it just appears on other offers. Nothing anywhere explains why some offers can be paid for with PayPal and others cannot. Maybe the price is the deciding factor, maybe something else. The site does not mention it, though… the FAQ says: “PayPal.com is an international electronic payment system. It supports all kinds of credit cards, as well as Visa Electron debit cards that support electronic payments. To pay via PayPal.com you must have a registered account in advance, as well as an added and confirmed bank card. If you do not have a PayPal.com account, please consider the other payment methods.”

systemd for Administrators, Part XV

Post Syndicated from Lennart Poettering original http://0pointer.net/blog/projects/watchdog.html

Quickly following the previous iteration, here's now the fifteenth installment of my ongoing series on systemd for Administrators:

Watchdogs

There are three big target audiences we try to cover with systemd:
the embedded/mobile folks, the desktop people and the server
folks. While the systems used by embedded/mobile tend to be
underpowered and have few resources available, desktops tend to be
much more powerful machines — but still much less resourceful than
servers. Nonetheless there are surprisingly many features that matter
to both extremes of this axis (embedded and servers), but not the
center (desktops). One of them is support for watchdogs in
hardware and software.

Embedded devices frequently rely on watchdog hardware that resets
them automatically if software stops responding (more specifically,
stops signalling the hardware in fixed intervals that it is still
alive). This is required to increase reliability and make sure that,
regardless of what happens, the best effort is made to get the system
working again. Functionality like this makes little sense on the
desktop[1]. However, on
high-availability servers, watchdogs are again frequently used.

Starting with version 183 systemd provides full support for
hardware watchdogs (as exposed in /dev/watchdog to
userspace), as well as supervisor (software) watchdog support for
individual system services. The basic idea is the following: if enabled,
systemd will regularly ping the watchdog hardware. If systemd or the
kernel hangs, this ping will not happen anymore and the hardware will
automatically reset the system. This way systemd and the kernel are
protected from boundless hangs — by the hardware. To make the chain
complete, systemd then exposes a software watchdog interface for
individual services so that they can also be restarted (or some other
action taken) if they begin to hang. This software watchdog logic can
be configured individually for each service in the ping frequency and
the action to take. Putting both parts together (i.e. hardware
watchdogs supervising systemd and the kernel, as well as systemd
supervising all other services) we have a reliable way to watchdog
every single component of the system.

To make use of the hardware watchdog it is sufficient to set the
RuntimeWatchdogSec= option in
/etc/systemd/system.conf. It defaults to 0 (i.e. no hardware
watchdog use). Set it to a value like 20s and the watchdog is
enabled. After 20s of no keep-alive pings the hardware will reset
itself. Note that systemd will send a ping to the hardware at half the
specified interval, i.e. every 10s. And that’s already all there is to
it. By enabling this single, simple option you have turned on
supervision by the hardware of systemd and the kernel beneath
it.[2]

Note that the hardware watchdog device (/dev/watchdog) is
single-user only. That means that you can either enable this
functionality in systemd, or use a separate external watchdog daemon,
such as the aptly named watchdog.

ShutdownWatchdogSec= is another option that can be
configured in /etc/systemd/system.conf. It controls the
watchdog interval to use during reboots. It defaults to 10min, and
adds extra reliability to the system reboot logic: if a clean reboot
is not possible and shutdown hangs, we rely on the watchdog hardware
to reset the system abruptly, as an extra safety net.
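
For illustration, here is a minimal sketch of the corresponding fragment of /etc/systemd/system.conf; both options live in the [Manager] section, and the values shown are just examples taken from the discussion above, not recommendations:

[Manager]
RuntimeWatchdogSec=20s
ShutdownWatchdogSec=10min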

So much for the hardware watchdog logic. These two options are
really everything that is necessary to make use of the hardware
watchdogs. Now, let's have a look at how to add watchdog logic to
individual services.

First of all, to make software watchdog-supervisable, it needs to be
patched to send out “I am alive” signals at regular intervals in its
event loop. Patching this is relatively easy. First, a daemon needs to
read the WATCHDOG_USEC= environment variable. If it is set,
it will contain the watchdog interval in usec, formatted as an ASCII text
string, as configured for the service. The daemon should then
issue sd_notify(“WATCHDOG=1”)
calls every half of that interval. A daemon patched this way should
transparently support watchdog functionality by checking whether the
environment variable is set and honouring the value it is set to.
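
To make this more concrete, here is a minimal C sketch of such a main loop, assuming sd_notify() from sd-daemon.h is available (link against libsystemd, or libsystemd-daemon on older releases); the daemon's real work and all error handling are omitted:

#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <systemd/sd-daemon.h>

int main(void) {
        /* WATCHDOG_USEC= is set by the manager when WatchdogSec= is configured. */
        uint64_t watchdog_usec = 0;
        const char *e = getenv("WATCHDOG_USEC");
        if (e)
                watchdog_usec = strtoull(e, NULL, 10);

        for (;;) {
                /* ... one iteration of the daemon's actual work goes here ... */

                if (watchdog_usec > 0)
                        /* Tell systemd we are still alive. */
                        sd_notify(0, "WATCHDOG=1");

                /* Ping at half the configured interval; fall back to 1s when no
                   watchdog is configured (simplified for this sketch). */
                usleep(watchdog_usec > 0 ? (useconds_t) (watchdog_usec / 2) : 1000000);
        }
}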

To enable the software watchdog logic for a service (which has been
patched to support the logic pointed out above) it is sufficient to
set WatchdogSec= to the desired failure latency. See systemd.service(5)
for details on this setting. This causes WATCHDOG_USEC= to be
set for the service’s processes and will cause the service to enter a
failure state as soon as no keep-alive ping is received within the
configured interval.

Having a service enter a failure state as soon as the watchdog logic
detects a hang is, on its own, hardly sufficient to build a reliable
system. The next step is to configure whether the service shall be
restarted and how often, and what to do if it then still fails. To
enable automatic service restarts on failure set
Restart=on-failure for the service. To limit how many
times a service may be restarted within a given time window, use the
combination of StartLimitBurst= and StartLimitInterval=.
If that limit is reached, a special action can be
taken. This action is configured with StartLimitAction=. The
default is none, i.e. no further action is taken and
the service simply remains in the failure state without any further
restart attempts. The other three possible values are
reboot, reboot-force and
reboot-immediate. reboot attempts a clean reboot,
going through the usual, clean shutdown logic. reboot-force
is more abrupt: it will not actually try to cleanly shut down any
services, but immediately kills all remaining services and unmounts
all file systems and then forcibly reboots (this way all file systems
will be clean but reboot will still be very fast). Finally,
reboot-immediate does not attempt to kill any process or
unmount any file systems. Instead it just hard reboots the machine
without delay. reboot-immediate hence comes closest to a
reboot triggered by a hardware watchdog. All these settings are
documented in systemd.service(5).

Putting this all together we now have pretty flexible options to
watchdog-supervise a specific service and configure automatic restarts
of the service if it hangs, plus take ultimate action if that doesn’t
help.

Here’s an example unit file:

[Unit]
Description=My Little Daemon
Documentation=man:mylittled(8)

[Service]
ExecStart=/usr/bin/mylittled
WatchdogSec=30s
Restart=on-failure
StartLimitInterval=5min
StartLimitBurst=4
StartLimitAction=reboot-force

This service will automatically be restarted if it hasn’t pinged
the system manager for longer than 30s or if it fails otherwise. If it
is restarted this way more often than 4 times within 5 minutes, action is
taken and the system is quickly rebooted, with all file systems being
clean when it comes up again.
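
To double-check that the manager picked up the watchdog setting for a
unit, the configured interval can be queried from the unit’s properties,
for instance like this (assuming the property is exposed under the name
WatchdogUSec, as in current systemd versions):

systemctl show -p WatchdogUSec mylittled.service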

And that’s already all I wanted to tell you about! With hardware
watchdog support right in PID 1, as well as supervisor watchdog
support for individual services, we should provide everything you need
for most watchdog use cases. Whether you are building an embedded or
mobile appliance or working with high-availability servers, please give
this a try!

(Oh, and if you wonder why in heaven PID 1 needs to deal with
/dev/watchdog, and why this shouldn’t be kept in a separate
daemon, then please read this again and try to understand that this is
all about the supervisor chain we are building here, where the hardware watchdog
supervises systemd, and systemd supervises the individual
services. Also, we believe that a service not responding should be
treated in a similar way as any other service error. Finally, pinging
/dev/watchdog is one of the most trivial operations in the OS
(basically little more than an ioctl() call), so the support for this
is no more than a handful of lines of code. Maintaining this externally
with complex IPC between PID 1 (and the daemons) and this watchdog
daemon would be drastically more complex, error-prone and resource
intensive.)

Note that the built-in hardware watchdog support of systemd does
not conflict with other watchdog software by default. systemd does not
make use of /dev/watchdog by default, and you are welcome to
use external watchdog daemons in conjunction with systemd, if this
better suits your needs.

And one last thing: if you wonder whether your hardware has a
watchdog, then the answer is: almost definitely yes, if it is anything more
recent than a few years old. If you want to verify this, try the wdctl
tool from recent util-linux, which shows you everything you need to
know about your watchdog hardware.
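
For example, a plain invocation such as

wdctl /dev/watchdog

prints the watchdog device, its driver identity, the currently configured
timeout and the supported flags (the exact fields depend on your hardware
and util-linux version).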

I’d like to thank the great folks from Pengutronix for contributing
most of the watchdog logic. Thank you!

Footnotes

[1] Though actually most desktops tend to include watchdog
hardware these days too, as this is cheap to build and available in
most modern PC chipsets.

[2] So, here’s a free tip for you if you hack on the core
OS: don’t enable this feature while you hack. Otherwise your system
might suddenly reboot while you are in the middle of tracing through PID
1 with gdb, since stopping it for a moment means that no
hardware ping can be sent…