Tag Archives: dam

Five Best Practices to Securely Preserve Your Video, Photo, and Other Data

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/five-best-practices-to-securely-preserve-your-video-photo-and-other-data/

computer and camera overlooking a lake

Whether you’re working with video, photo, audio, or other data, preserving the security of your data has to be at the top of your priority list. Data security might sound like a challenging proposition, but by following just a handful of guidelines it becomes a straightforward and easily accomplished task.

We’d like to share what we consider best practices for maintaining the safety of your data. For both seasoned pros and those just getting started with digital media, these best practices are important to implement and revisit regularly. We believe that by following these practices — independently of which specific data storage software, service, or device you use — you will ensure that all your media and other data are kept secure to the greatest extent possible.

The Five Best Practices to Keep Your Digital Media Safe

1 — Keep Multiple Copies of Your Media Files

Everyone by now is likely familiar with the 3-2-1 strategy for maintaining multiple copies of your data (video, photos, digital asset management catalogs, etc.). Following a 3-2-1 strategy simply means that you should always have at least three copies of your active data, two of which are local, and at least one that is in another location.

a tech standing looking at a pod full of hard drives in a data center
Choose a reliable storage provider

Mind you, this is for active data, that is, files and other data that you are currently working on and want to have backed up in case of accident, theft, or hardware failure. Once you’re finished working with your data, you should consider archiving your data, which we’ve also written about on our blog.

2 — Use Trustworthy Vendors

There are times when you can legitimately cut corners to save money, and there are times when you shouldn’t. When it comes to your digital media and services, you want to go with the best. That means using top-notch memory cards, HDDs and SSDs, software, and cloud services.

For hardware devices and software, it’s always helpful to read reviews or talk with others using the devices to find out how well they work. For hard drive reliability, our Drive Stats blog posts can be informative and are a unique source of information in the data storage industry.

For cloud storage, you want a vendor with a strong track record of reliability and cost stability. You don’t want to use a cloud service or other SaaS vendor that has a history of making it difficult or expensive to access or download your data from their service. A top-notch service vendor will be transparent in their business practices, inform you when there are any outages in their service or maintenance windows, and try as hard as possible to make things right if problems occur.

3 — Always Use Encryption (The Strongest Available)

Encrypting your data provides a number of benefits. It protects your data no matter where it is stored, and also when it is being moved — potentially the most vulnerable exposure your data will have.

Encrypted data can’t be altered or corrupted without the changes being detected, which provides another advantage. Encryption also enables you to meet requirements for privacy and security compliance and to keep up with changing rules and regulations.

Encryption comes in different flavors. You should always select the strongest encryption available, and make sure that any passwords or multi-factor authentication you use are strong and unique for each application.
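As a concrete (and hedged) illustration, if you want to add your own layer of encryption to an archive before it ever leaves your machine, one common option is OpenSSL on the command line. This is just a sketch, not a prescribed workflow: the file names are placeholders, and it assumes OpenSSL 1.1.1 or later for the -pbkdf2 option.

$ # Encrypt a local archive with AES-256 before uploading it anywhere
$ openssl enc -aes-256-cbc -pbkdf2 -salt -in project-archive.tar -out project-archive.tar.enc
$ # Decrypt it later with the same passphrase
$ openssl enc -d -aes-256-cbc -pbkdf2 -in project-archive.tar.enc -out project-archive.tar

Store the passphrase in a password manager; if you lose it, the data is as good as gone. Most backup clients and cloud storage services also offer built-in encryption, which is usually the simpler route.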

4 — Automate Whenever Possible

Don’t rely on your memory or personal discipline alone to remember to regularly back up your data. While we always start with the best of intentions, we are busy and we often let things slide (much like resolving to exercise regularly). It’s better to have a regular schedule that you commit to, and best if the backups happen automatically. Many backup and archive apps let you specify when backups, incremental backups, or snapshots occur. You usually can set how many copies of your data to keep, and whether backups are triggered by the date and time or when data changes.

Automating your backups and archives means that you won’t forget to back up and results in a greater likelihood that your data will not only be recoverable after an accident or hardware failure, but up to date. You’ll be glad for the reduced stress and worry in your life, as well.
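To make “automatic” concrete, here is a minimal sketch of a scheduled backup using cron and rsync on macOS or Linux. The schedule and paths are hypothetical placeholders, and a dedicated backup client will handle versioning and retention far better than this one-liner:

$ crontab -e
# Add a line like the following: every night at 2:00 a.m., copy the working folder to a second local drive
0 2 * * * rsync -a /Users/me/Projects/ /Volumes/BackupDrive/Projects/

Pair a local job like this with a cloud backup client to cover the off-site copy in your 3-2-1 strategy.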

5 — Be Mindful of Security in Your Workflow

Nobody wants to worry about security all the time, but if it’s ignored, sooner or later that inattention will catch up with you. The best way to both increase the security of your data and reduce stress in your life is to have a plan and implement it.

At its simplest, the concept of security mindfulness means that you should be conscious of how you handle your data during all stages of your workflow. Being mindful shouldn’t require you to overthink, stress or worry, but just to be aware of the possible outcomes of your decisions about how you’re handling your data.

If you follow the first four practices in this list, then this fifth concept should flow naturally from them. You’ve taken the right steps toward a long-term plan for maintaining your data securely.

Data Security Can Be Both Simple and Effective

The best security practices are the ones that are easy to follow consistently. If you pay attention to the five best practices we’ve outlined here, then you’re well on your way to secure data and peace of mind.

•  •  •

Note:  This post originally appeared on Lensrentals.com on September 18, 2018.

The post Five Best Practices to Securely Preserve Your Video, Photo, and Other Data appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

What’s the Diff: DAM vs MAM

Post Syndicated from Janet Lafleur original https://www.backblaze.com/blog/whats-the-diff-dam-vs-mam/

What's the Diff: DAM vs MAM

There’s a reason digital asset management (DAM) and media asset management (MAM) seem to be used interchangeably. Both help organizations centrally organize and manage assets —  images, graphics, documents, video, audio — so that teams can create content efficiently and securely. Both simplify managing those assets through the content life cycle, from raw source files through editing, to distribution, to archive. And, as a central repository, they enable teams to collaborate by giving team members direct access to shared assets.

A quick answer to the difference is that MAM is considered a subset of the broader DAM, with MAMs providing more video capabilities. But since most DAMs can manage videos, and MAMs vary widely in what kind of video-oriented features they offer, it’s worth diving deeper to understand these different asset management solutions.

What to Expect From Any Asset Manager

Before we focus on the differences, let’s outline the basic structure and the capabilities of any asset manager.  The best place to start is with the understanding that any given asset a team might want to work with — a video clip, a document, an image —  is usually presented by the asset manager as a single item to the user, but is actually composed of three elements: the master source file, a thumbnail or proxy that’s displayed, and metadata about the object itself. Note that in the context of asset management, metadata is more than simple file attributes (i.e. owner, date created, last modified date, size). It’s a broader set of attributes, including details about the actual content of the file. We’ll spell out more on that later. As far as capabilities, any DAM or MAM worth being called an asset manager should offer:

  • Collaboration — Members of content creation teams all should have direct access to assets in the asset management system from their own workstations.
  • Access control — Access to specific assets or groups of assets should be allowed or restricted based on the user’s rights and permission settings. This is particularly important if teams work in different departments or for different external clients.
  • Browse — Assets should be easily identifiable by more than their file name, such as thumbnails or proxies for videos, and browsable in the asset manager’s graphical interface.
  • Metadata search —  Assets should be searchable by attributes assigned to them, known as metadata. Metadata assignment capabilities should be flexible and extensible over time.
  • Preview — For larger or archived assets, a preview or quick review capability should be provided, such as playing video proxies or mouse-over zoom for thumbnails.
  • Versions — Based on permissions, team members should be able to add new versions of existing assets or add new assets so that material can be easily repurposed for future projects.

Why Metadata Matters So Much

Metadata is a critical element that distinguishes asset managers from file browsers. Without metadata, file names end up doing the heavy lifting with long names like 20190118-gbudman-broll-01-lv-0001.mp4, which strings together a shoot date, subject, camera number, clip number, and more. Structured file naming is not a bad practice, but it doesn’t scale easily to larger teams of contributors and creators. And metadata isn’t used only to search for assets; it can also be fed into other workflow applications integrated with the asset manager.

Metadata is particularly important for images and video because, unlike text-based documents, they can’t be searched for keywords. Metadata can describe in detail what’s in the image or video. For example, metadata for an image could be: male, beard, portrait, blue shirt, dark hair, fair skin, middle-aged, outdoors. And since videos are streams of images, their metadata goes one step further to describe elements at precise moments or ranges of time in the video, known as timecodes. For example, video of a football game could include metadata tags such as 00:10:30 kickoff, 00:15:37 interception, and 00:21:04 touchdown.

iconik MAM example displaying metadata for a BMW M635CSi

iconik MAM

Workflow Integration and Archive Support

More robust DAMs and MAMs go beyond the basic capabilities and offer a range of advanced features that simplify or otherwise support the creation process, also known as the workflow. These can include features for editorial review, automated metadata extraction (e.g., transcription or facial recognition), multilingual support, automated transcode, and much, much more. This is where different asset management solutions diverge the most and show their customization for a particular type of workflow or industry.

Regardless of whether you need all the bells and whistles in your asset manager, as your content library grows it will need storage management features, starting with archive. Archiving completed projects and assets that are infrequently used can conserve disk space on your server by moving them off to less expensive storage, such as cloud storage or digital tape. In particular, images and video are huge storage hogs, and the higher the resolution, the more storage capacity they consume. Regular archiving can keep costs down and keep you from having to upgrade your expensive storage server every year.

Asset managers with built-in archiving make moving content into and out of an archive seamless and straightforward. For most asset managers, assets can be archived directly from the graphical interface. After archive, the thumbnails or proxies of the archived assets continue to appear as before, with a visual indication that they’re archived on secondary storage. Users can retrieve the asset as before, albeit with some time delay that depends on the archive storage and network connection chosen.

A good asset manager will offer multiple choices for archive storage, from cloud storage to LTO tape to inexpensive disk, and from different vendors.  An excellent one will let you automatically make multiple copies to different archive storage for added data protection.

What is a MAM?

With all these common characteristics, what makes a media asset manager different than other asset managers is that it’s created for video production. While DAMs can generally manage video assets, and MAMs can manage images and documents, MAMs are designed from the ground up for creating and managing video content in a video production workflow. That means metadata creation and management, application integrations, and workflow orchestration are all video-oriented.

Metadata for video starts when it’s shot, with camera data, shoot notes or basic logging captured on set.  More detailed metadata cataloging happens when the content is ingested from the camera into the MAM for post-production. Nearly all MAMs offer some type of manual logging to create timecode-based metadata. MAMs built for live broadcast events like sports provide shortcut buttons for key events, such as a face off or slap shot in a hockey game.

More advanced systems offer additional tools for automated metadata extraction. For example, some will use facial recognition to automatically identify actors or public figures.

There is also metadata related to how, where, and how many times the asset has been used and what kinds of edits have been made from the original. There’s no end to what you can describe and categorize with metadata. Defining it for a content library of any reasonable size can be a major undertaking.

MAMs Integrate Video Production Applications

Unlike the more general-purpose DAMs, MAMs will integrate tools built specifically for video production. These widely ranging integrated applications include ingest tools, video editing suites, visual effects, graphics tools, transcode, quality assurance, file transport, specific distribution systems, and much more.

Modern MAM solutions integrate cloud storage throughout the workflow, not just for archive but also for creating content through proxy editing. In proxy editing, video editors work with a lower-resolution proxy of the video stored locally; those edits are then applied to the full-resolution version stored in the cloud when the final cut is rendered.
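To make the proxy idea concrete: a proxy is simply a smaller, lower-bitrate copy of the same clip. MAMs generate proxies automatically, but as a rough illustration only (this is not any particular MAM’s method, and the file names and settings are placeholders), you could create one yourself with ffmpeg:

$ # Create a 960-pixel-wide H.264 proxy of a full-resolution master clip
$ ffmpeg -i master_clip.mov -vf scale=960:-2 -c:v libx264 -crf 23 -c:a aac master_clip_proxy.mp4

The editor cuts against the lightweight proxy, and the MAM conforms those edits back to the full-resolution master in the cloud for the final render.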

MAMs May Be Tailored for Specific Industry Niches and Workflows

To sum up, the longer explanation for DAM vs MAM is that MAMs focus on video production, with better MAMs offering all the integrations needed for complex video workflows. And because video workflows are as varied as they are complex, MAMs often fall into specific niches within the industry: news, sports, post-production, film production, etc. The size of the organization or team matters too. To stay within their budget, a small post house may select a MAM with fewer of the advanced features that may be basic requirements for a larger multinational post-production facility.

That’s why there are so many MAMs on the market, and why choosing one can be a daunting task with a long evaluation process. And it’s why migrating from one asset manager to another is more common than you’d think. Pro tip: working with a trusted system integrator that serves your industry niche can save you a lot of heartache and money in the long run.

Finally, keep in mind that for legacy reasons, sometimes what’s marketed as a DAM will have all the video capabilities you’d expect from a MAM. So don’t let the name throw you off. Instead, look for an asset manager that fits your workflow with the features and integrated tools you need today, while also providing the flexibility you need as your business changes in the future.

Backblaze will be exhibiting at NAB 2019 in Las Vegas on April 8-11, 2019. Schedule a meeting with our cloud storage experts to learn how B2 Cloud Storage can streamline your workflow today!

The post What’s the Diff: DAM vs MAM appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How Much Photo & Video Data Do You Have Stored?

Post Syndicated from Jim Goldstein original https://www.backblaze.com/blog/how-much-photo-video-data-do-you-have-stored/

How Much Photo and Video Data Do You Have?

Backblaze’s Director of Marketing Operations, Jim, is not just a marketing wizard; he’s also worked as a professional photographer and run marketing for a gear rental business. He knows a lot of photographers. We thought that our readers would be interested in the results of an informal poll he recently conducted among his media friends about the amount of media data they store. You’re invited to contribute to the poll, as well!

— Editor

I asked my circle of professional and amateur photographer friends how much digital media data they have stored. It was a quick survey, and not in any way scientific, but it did show the range of data use by photographers and videographers.

Jim's media data storage poll

I received 64 responses. The answers ranged from less than 5 TB (17 users) to 2 petabytes (1 user). The most popular response was 10-19 TB (18 users). Here are the results.

Digital media storage poll results

Jim's digital media storage poll results

How Much Digital Media Do You Have Stored?

I wondered if the results would be similar if I expanded our survey to a wider audience.

The poll below replicates what I asked of my circle of professional and non-professional photographer and videographer friends. The poll results will be updated in real-time. I ask that you respond only once.

Backblaze is interested in the results as it will help us write blog articles that will be useful to our readership, and also offer cloud services suitable for the needs of our users. Please feel free to ask questions in the comments about cloud backup and storage, and about our products Backblaze Backup and Backblaze B2 Cloud Storage.

I’m anxious to see the results.

Our Poll — Please Vote!

How much photo/video data do you have in total (TB)?

Thanks for participating in the poll. If you’d like to provide more details about the data you store and how you do it, we’d love to hear from you in the comments.

The post How Much Photo & Video Data Do You Have Stored? appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Securely Managing Your Digital Media (SD, CF, SSD, and Beyond)

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/securely-managing-your-digital-media-sd-cf-ssd-and-beyond/

3 rows of 3 memory cards

This is the second in our post exchange series with our friends Zach Sutton and Ryan Hill at Lensrentals.com, who have an online site for renting photography, videography, and lighting equipment. You can read our post from last month on their blog, 3-2-1 Backup Best Practices using Cloud Archiving, and all posts on our blog in this series at Lensrentals post series.

— Editor

Managing digital media securely is crucial for all photographers and videographers. At Lensrentals.com, we take media security very seriously, with dozens of rented memory cards, hard drives, and other data devices returned to our facility every day. All of our media is inspected after each and every rental. Most of the cards returned to us in rental shipments are not properly reformatted and erased, so it’s part of our usual service to clear all the data from returned media to keep each client’s identity and digital property secure.

We’ve gotten pretty good at the routine of managing data and formatting storage devices for our clients while making sure our media has a long life and remains free from corruption. Before we get too involved in our process of securing digital media, we should first talk fundamentals.

The Difference Between Erasing and Reformatting Digital Media

When you insert a card in the camera, you’re likely given two options, either erase the card or format the card. There is an important distinction between the two. Erasing images from a card does just that — erases them. That’s it. It designates the area the prior data occupied on the card as available to write over and confirms to you that the data has been removed.

The term erase is a bit misleading here. The underlying data, the 1’s and 0’s that are recorded on the media, are still there. What really happens is that the drive’s address table is changed to show that the space the previous file occupied is available for new data.

This is the reason that simply erasing a file does not securely remove it. Data recovery software can be used to recover that old data as long as it hasn’t been overwritten with new data.

Formatting goes further. When you format a drive or memory card, all of the files are erased (even files you’ve designated as “protected”), and the process usually writes a new file system as well. This is a more effective method for removing all the data on the drive, since the space previously divided up for specific files gets a brand new structure unencumbered by whatever files were previously stored. Be aware, however, that it’s possible to retrieve older data even after a format. Whether that can happen depends on the formatting method and whether new data has overwritten what was previously stored.

To make sure that the older data cannot be recovered, a secure erase goes further still. Rather than simply marking the old data as space that can be overwritten, a secure erase writes random 1s and 0s over the disk so that the old data is no longer recoverable. This takes longer and is more taxing on the card, because data is being overwritten rather than simply removed.
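For a rough sense of what a secure erase does under the hood, the command-line equivalent is overwriting the whole device with random data. This is a sketch only: the device identifier below is a placeholder, the command will irreversibly destroy everything on whatever device you point it at, and on flash media with wear leveling an in-camera or manufacturer-supplied secure-erase tool is more trustworthy than a manual overwrite.

$ # Overwrite an entire card with random data (replace /dev/diskX with the card's identifier; use bs=1M on Linux)
$ sudo dd if=/dev/urandom of=/dev/diskX bs=1m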

Always Format a Card for the Camera You’re Going to Be Using

If you’ve ever tried to use the same memory card on cameras of different makes without formatting it, you may have seen problems with how the data files are displayed. Each camera system handles its file structure a little differently.

For this reason it’s advisable to format the card for the specific camera you’re using. If this is not done, there is a risk of corrupting data on the card.

Our Process For Securing Data

Our inspection process for recording media varies a little depending on what kind of card we’re inspecting. For standardized media like SD cards or compact flash cards, we simply use a card reader to format the card to exFAT. This is done in Disk Utility on the Apple MacBooks that we issue to each of our Video Technicians. We use exFAT specifically because it’s recognizable by just about every device. Since these cards are used in a wide variety of different cameras, recorders, and accessories, and we have no way of knowing at the point of inspection what device they’ll be used with, we have to choose a format that will allow any camera to recognize the card. While our customer may still have to format a card in a camera for file structure purposes, the card will at least always come formatted in a way that the camera can recognize.
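For reference, the command-line equivalent of that Disk Utility step on a Mac looks roughly like the following. The disk identifier and volume name are placeholders; run diskutil list first and double-check the identifier, because pointing this at the wrong disk will wipe it.

$ diskutil list
$ # Erase the card and format it as exFAT (disk identifier and name are examples)
$ diskutil eraseDisk ExFAT SDCARD /dev/disk4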

Sony SxS media
For proprietary media — things like REDMAGs, SxS, and other cards that we know will only be used in a particular camera — we use cameras to do the formatting. While the exFAT system would technically work, a camera-specific erase and format process saves the customer a step and allows us to more regularly double-check the media ports on our cameras. In fact, we actually format these cards twice at inspection. First, the Technician erases the card to clear out any customer footage that may have been left on it. Next, they record a new clip to the card, around 30 seconds, just to make sure everything is working as it’s supposed to. Finally, they format the card again, erasing the test footage before sending it to the shelf where it awaits use by another customer.

REDMAG Red Mini-Mag
You’ll notice that at no point in this process do we do a full secure erase. This is both to save time and to prevent unnecessary wear and tear on the cards. About 75% of the media we get back from orders still has footage on it, so we don’t get the impression that many of our customers are overly concerned with keeping their footage private once they’re done shooting. However, if you are among the 25% who may have a personal or professional interest in keeping your footage secure after shooting, we’d recommend that you securely erase the media before returning rented memory cards and drives. Or, if you’d rather we handle it, just send an email or note with your return order requesting that we perform a secure erase rather than simply formatting the cards, and we’ll be happy to oblige.

Managing your digital media securely can be easy if done right. Data management and backing up files, on the other hand, can be more involved and require more planning. If you have any questions on that topic, be sure to check out our recent blog post on proper data backup.

— Zach Sutton and Ryan Hill, lensrentals.com

The post Securely Managing Your Digital Media (SD, CF, SSD, and Beyond) appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Protecting Your Data From Camera to Archive

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/protecting-your-data-from-camera-to-archive/

Camera data getting backed up to Backblaze B2 cloud

Lensrentals.com is a highly respected company that rents photography and videography equipment. We’re a fan of their blog and asked Zach Sutton and Ryan Hill of Lensrentals to contribute something for our audience. We also contributed a post to their blog that was posted today: 3-2-1 Backup Best Practices using Cloud Archiving.

Enjoy!

— Editor

At Lensrentals.com we field a lot of support calls, and unfortunately some of the most common ones involve data catastrophes.

The first of these frequent calls comes from someone who thought they transferred over their footage or photos before returning their rental and discovered later that they were missing some images or footage. If we haven’t already gone through an inspection of those cards, it’s usually not a problem to send the cards back to them so they can collect their data. But if our techs have inspected the memory cards, then there isn’t much we can do. Our team at Lensrentals.com performs a full and secure reformatting of the cards to keep each customer’s data safe from the next renter. Once that footage is gone, it is unrecoverable and gone forever. This is never a fun conversation to have.

The second scenario is when a customer calls to tell us that they did manage to transfer all the footage over, but one or more of the clips or images were corrupted in the transferring process. Typically, people don’t discover this until after they’ve sent back the memory cards, and after we’ve already formatted the original media. This is another tough phone call to have. On occasion, data corruption happens in camera, but more often than not, the file gets corrupted during the transfer from the media to the computer or hard drive.

These kinds of problems aren’t entirely avoidable and are inherent risks users take when working with digital media. However, as with all risks, you can take proper steps to assure that your data is safe. If a problem arises, there are techniques you can use to work around it.

We’ve summarized our best suggestions for protecting your data from camera to archive in the following sections. We hope you find them useful.

How to Protect Your Digital Assets

Before Your Shoot

The first and most obvious step to take to assure your data is safe is to make sure you use reliable media. For us, we recommend using cards from brands you trust, such as SanDisk, Lexar, or ProGrade Digital (a company that took the reins from Lexar). For hard drives, SanDisk, Samsung, Western Digital, and Intel are all considered incredibly reliable. These brands may be more expensive than bargain brands but have been proven time and time again to be more reliable. The few extra dollars spent on reliable media will potentially save you thousands in the long run and will assure that your data is safe and free of corruption.

One of the most important things you should do before any shoot is format your memory card in the camera. Formatting in camera is a great way to minimize file corruption as it keeps the card’s file structure conforming to that camera manufacturer’s specifications, and it should be done every time before every shoot. Equally important, if the camera gives you an option to do a complete or secure format, take that option over the other low-level formatting options available. In the same vein, it’s essential to also take the time to research and see if your camera needs to unmount or “eject” the media before removing it physically. While this option applies more for video camera recording systems, like those found on the RED camera platform and the Odyssey 7Q, it’s always worth checking into to avoid any corruption of the data. More often than not, preventable data corruption happens when the users turn off the camera system before the media has been unmounted.

Finally, if you’re shooting for the entire day, you’ll want to make sure you have enough media on hand for the entire day, so that you do not need to back up and reformat cards throughout the shoot. While it’s possible to take footage off of the card, reformat it, and use it again for the same day, that is not something you’d want to be doing during the hectic environment of a shoot day — it’s best to have extra media on hand. We’ve all made a mistake and deleted a file we didn’t mean to, so it’s best to avoid that mistake by not having to delete or manage files while shooting. Play it safe, and only reformat when you have the time and clear head to do so.

During Your Shoot

On many modern camera systems, you have the option of dual-recording using two different card slots. If your camera offers this option, we cannot recommend it enough. Doubling the media you’re recording onto can overcome a failure in one of the memory cards. While the added cost may be a hard sell, it’s negligible when compared to all the money spent on lights, cameras, actors and lousy pizza for the day. Additionally, develop a system that works for you and keeps everything as organized as possible. Spent media shouldn’t be in the same location as unused media, and your file structure should be consistent throughout the entire shoot. A proper file structure not only saves time but assures that none of the footage goes missing after the shoot, lost in some random folder.

Camera memory cards

One of the most critical jobs on set is the work of a DIT (Digital Imaging Technician) for video, and a DT (Digital Technician) for photography. Essentially, the responsibilities of these positions are to keep the data archived and organized on a set, as well as metadata logging and other technical tasks involved in keeping a shoot organized. While it may not be cost effective to have a DIT/DT on every shoot, if the budget allows for it, I highly recommend you hire one to take on the responsibilities. Having someone on set who is solely responsible for safely backing up and organizing footage helps keep the rest of the crew focused on their obligations to assure nothing goes wrong. When they’re not transferring and archiving data, DIT/DTs also log metadata, color-correct footage, and help with the other preliminary editing processes. Even if the budget doesn’t allow for this position to be filled, work to find someone who can solely handle these processes while on set. You don’t want your camera operator to be in charge of also backing up and organizing footage if you can help it.

Ingest Software

If there is one piece of information we’d like for videographers and photographers to take away from this article, it is this: file-moving or ‘offloading’ software is worth the investment and should be used every time you shoot anything. For those who are unfamiliar with offload software, it’s any application that is designed to make it easier for you to back up footage from one location to another, and one shoot to another. In short, to avoid accidents or data corruption, it’s always best to have your media on a MINIMUM of two different devices. The easiest way to do this is to simply dump media onto two separate hard drives, and keep those drives separately stored. Ideally (if the budget allows), you’ll also keep all of your data on the original media for the day as well, making sure you have multiple copies stored in various locations. Many other options are available and recommended if possible, such as RAID arrays or even copying the data over to a cloud service such as Backblaze B2. Offloading software automates exactly this process, verifying all the data as it’s transferred along the way.
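If you’re not ready to invest in offload software yet, the bare-bones version of “dump media onto two separate hard drives” can be done from the command line, as in the sketch below. The volume and folder names are placeholders, and unlike real offload tools this does no logging, reporting, or per-file verification.

$ # Copy the card to two different drives (volume and folder names are examples)
$ rsync -a /Volumes/CARD_A/ /Volumes/ShuttleDrive1/Shoot_01/
$ rsync -a /Volumes/CARD_A/ /Volumes/ShuttleDrive2/Shoot_01/

Dedicated offload apps automate exactly this, add checksum reports, and are much harder to fat-finger at the end of a long shoot day.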

There are a few different recommendations I give for offloading software, all at different price points and with unique features. At the highest end of video production, you’ll often see DITs using a piece of software called Silverstack, which offers color grading functionalities, LTO tape support, and basic editing tools for creating daily edits. At a $600 annual price, it is the most expensive in this field and is probably overkill for most users. My own recommendation is a tool called ShotPut Pro. At $129, ShotPut Pro offers all the tools you’d need to build a great archiving process while sacrificing some of the color editing tools. ShotPut Pro can simultaneously copy and transfer files to multiple locations, build PDF reports, and verify all transfers. If you’re looking for something even cheaper, there are additional options such as Offload and Hedge. They’re both available for $99 each and give all the tools you’d need within their simple interfaces.

When it comes to photo, the two most obvious choices are Adobe Lightroom and Capture One Pro. While both are known more for their editing features, they also have a lot of archiving functions built into their ingest systems, allowing you to offload cards to multiple locations and make copies on the fly.

workstation with video camera and RAID NAS

When it comes to video, the most crucial feature all of these apps should have is an option called “checksum verification.” This subject can get complicated, but all you really need to know is that larger files are more likely to be corrupted when transferring and copying, so checksum verification confirms that the copy is identical to the original version down to the individual byte. It is by far the most reliable and effective way to ensure that entire volumes of data are copied without corruption or loss. Whichever application you choose, make sure checksum verification is an available feature and part of your workflow every time you’re copying video files. Checksum verification is also available in some photo ingest software, but corruption is less common with smaller files and is generally less of an issue there. Still, if possible, use it.
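Offload tools run checksum verification for you, but the underlying idea is simple enough to show by hand. A minimal sketch with standard command-line tools follows; the file and volume names are placeholders, and real offload software does this for every file automatically and writes a report.

$ # Hash the original clip on the card and the copy on the backup drive
$ shasum -a 256 /Volumes/CARD_A/CLIP0001.MOV
$ shasum -a 256 /Volumes/ShuttleDrive1/Shoot_01/CLIP0001.MOV
$ # If the two SHA-256 values match, the copy is byte-for-byte identical to the original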

Post-Production

Once you’ve completed your shoot and all of your data is safely transferred over to external drives, it’s time to look at how you can store your information long term. Different people approach archiving in different ways because none of us will have an identical workflow. There is no single correct way to archive your photos and videos, but there are a few rules that you’ll want to implement.

The first rule is the most obvious. You’ll want to make sure your media is stored on multiple drives. That way, if one of your drives dies on you, you still have a backup version of the work ready to go. The second rule of thumb is that you’ll want to store these backups in different locations. This can be extremely important if there is a fire in your office, or you’re a victim of a robbery. The most obvious way to do this is to back up or archive into a cloud service such as Backblaze B2. In my production experience I’ve seen multiple production houses implement a system where they store their backup hard drives in a safety deposit box at their bank. The final rule of thumb is especially important when you’re working with significant amounts of data, and that is to keep a working drive separate from an archive drive. The reason for this is an obvious one: all hard drives have a life expectancy, and you can prolong that by minimizing drive use. Having a working drive separate from your archive drives means that your archive drives will have fewer hours on them, thereby extending their practical life.
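As one example of satisfying the off-site rule, a lightweight way to push an archive drive to cloud storage is a sync tool such as rclone. This is a sketch of one possible approach, not a prescribed workflow: the remote name and bucket are placeholders you would first set up yourself with rclone config.

$ # Copy the contents of the archive drive to a cloud bucket (remote and bucket names are examples)
$ rclone copy /Volumes/ArchiveDrive b2:my-archive-bucket --progress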

Ryan Hill’s Workflow

To help visualize what we discussed above, I’ll lay out my personal workflow for you. Please keep in mind that I’m mainly a one-man band, so my workflow is based on me handling everything. I’m also working with a large variety of mediums, so nothing I’m doing is going to be video and camera specific as all of my video projects, photo projects, and graphic projects are organized in the same way. I won’t bore you with details on my file structure, except to say that everything in my root folder is organized by job number, followed by sub-folders with the data classified into categories. I keep track of which jobs are which with a Google Spreadsheet that organizes the job numbers with descriptions and client information. All of this information is secured within my Google account but also allows me to access it from anywhere if needed.

With archiving, my system is pretty simple. I’ve got a 4-drive RAID array in my office that gets updated every time I’m working on a new project. The array is set to RAID 1+0, which means I could lose up to two of the four hard drives (as long as both drives in a mirrored pair don’t fail) and still be able to recover the data. Usually, I’ll put 1TB drives in each bay, fill them as I work on projects, and replace them when they’re full. Once they’re full, I label them with the corresponding job numbers and store them in a plastic case on my bookshelf. By no means am I suggesting that my system is a perfect system, but for me, it’s incredibly adaptable to the various projects I work on. If I were robbed, or if my house caught fire, I would still have all of my work archived onto a cloud system, giving me a second level of security.

Finally, to finish up my backup solution, I also keep a two-bay Thunderbolt hard drive dock on my desk as my working drive system. Solid state drives (SSDs) and the Thunderbolt connection give me the speed and reliability I need from a drive that I’ll be working from and rendering outputs off of. For now, there is a single 960GB SSD in the first bay, with the option to extend to the second bay if I need additional storage. I start work by transferring the job file from my archive to the working drive, do whatever I need to do to the files, then replace the old job folder on my archive with the updated one at the end of the day. This way, if I were to have a drive failure, the worst I would lose is a day’s worth of work. For video projects or anything that takes a lot of data, I usually keep copies of all my source files on both my working and archive drive, and just replace the Adobe Premiere project file as I go. Again, this is just my system that works for me, and I recommend you develop one that works for your workflow while keeping your data safe.

The Takeaway

The critical point you should take away is that these sorts of strategies are things you should be thinking about at every step of your production. How does your camera or codec choice affect your media needs? How are you going to ensure safe data backup in the field? How are you going to work with all of this footage in post-production in a way that’s both secure and efficient? Answering all of these questions ahead of time will keep your media safe and your clients happy.

— Zach Sutton and Ryan Hill, lensrentals.com

The post Protecting Your Data From Camera to Archive appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Protecting coral reefs with Nemo-Pi, the underwater monitor

Post Syndicated from Janina Ander original https://www.raspberrypi.org/blog/coral-reefs-nemo-pi/

The German charity Save Nemo works to protect coral reefs, and they are developing Nemo-Pi, an underwater “weather station” that monitors ocean conditions. Right now, you can vote for Save Nemo in the Google.org Impact Challenge.

Nemo-Pi — Save Nemo

Save Nemo

The organisation says there are two major threats to coral reefs: divers and climate change. To make diving safer for reefs, Save Nemo installs buoy anchor points where diving tour boats can anchor without damaging corals in the process.

reef damaged by anchor
boat anchored at buoy

In addition, they provide dos and don’ts for how to behave on a reef dive.

The Nemo-Pi

To monitor the effects of climate change, and to help divers decide whether conditions are right at a reef while they’re still on shore, Save Nemo is also in the process of perfecting Nemo-Pi.

Nemo-Pi schematic — Nemo-Pi — Save Nemo

This Raspberry Pi-powered device is made up of a buoy, a solar panel, a GPS device, a Pi, and an array of sensors. Nemo-Pi measures water conditions such as current, visibility, temperature, carbon dioxide and nitrogen oxide concentrations, and pH. It also uploads its readings live to a public webserver.

Inside the Nemo-Pi device — Save Nemo

The Save Nemo team is currently doing long-term tests of Nemo-Pi off the coast of Thailand and Indonesia. They are also working on improving the device’s power consumption and durability, and testing prototypes with the Raspberry Pi Zero W.

web dashboard — Nemo-Pi — Save Nemo

The web dashboard showing live Nemo-Pi data

Long-term goals

Save Nemo aims to install a network of Nemo-Pis at shallow reefs (up to 60 metres deep) in South East Asia. Then diving tour companies can check the live data online and decide day-to-day whether tours are feasible. This will lower the impact of humans on reefs and help the local flora and fauna survive.

Coral reefs with fishes

A healthy coral reef

Nemo-Pi data may also be useful for groups lobbying for reef conservation, and for scientists and activists who want to shine a spotlight on the awful effects of climate change on sea life, such as coral bleaching caused by rising water temperatures.

Bleached coral

A bleached coral reef

Vote now for Save Nemo

If you want to help Save Nemo in their mission today, vote for them to win the Google.org Impact Challenge:

  1. Head to the voting web page
  2. Click “Abstimmen” in the footer of the page to vote
  3. Click “JA” in the footer to confirm

Voting is open until 6 June. You can also follow Save Nemo on Facebook or Twitter. We think this organisation is doing valuable work, and that their projects could be expanded to reefs across the globe. It’s fantastic to see the Raspberry Pi being used to help protect ocean life.

The post Protecting coral reefs with Nemo-Pi, the underwater monitor appeared first on Raspberry Pi.

Measuring the throughput for Amazon MQ using the JMS Benchmark

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/measuring-the-throughput-for-amazon-mq-using-the-jms-benchmark/

This post is courtesy of Alan Protasio, Software Development Engineer, Amazon Web Services

Just like compute and storage, messaging is a fundamental building block of enterprise applications. Message brokers (aka “message-oriented middleware”) enable different software systems, often written in different languages, on different platforms, running in different locations, to communicate and exchange information. Mission-critical applications, such as CRM and ERP, rely on message brokers to work.

A common performance consideration for customers deploying a message broker in a production environment is the throughput of the system, measured as messages per second. This is important to know so that application environments (hosts, threads, memory, etc.) can be configured correctly.

In this post, we demonstrate how to measure the throughput for Amazon MQ, a new managed message broker service for ActiveMQ, using JMS Benchmark. It should take between 15–20 minutes to set up the environment and an hour to run the benchmark. We also provide some tips on how to configure Amazon MQ for optimal throughput.

Benchmarking throughput for Amazon MQ

ActiveMQ can be used for a number of use cases. These range from simple fire-and-forget tasks (that is, asynchronous processing) and low-latency request-reply patterns to buffering requests before they are persisted to a database.

The throughput of Amazon MQ is largely dependent on the use case. For example, if you have non-critical workloads such as gathering click events for a non-business-critical portal, you can use ActiveMQ in a non-persistent mode and get extremely high throughput with Amazon MQ.

On the flip side, if you have a critical workload where durability is extremely important (meaning that you can’t lose a message), then you are bound by the I/O capacity of your underlying persistence store. We recommend using mq.m4.large for the best results. The mq.t2.micro instance type is intended for product evaluation. Performance is limited, due to the lower memory and burstable CPU performance.

Tip: To improve your throughput with Amazon MQ, make sure that you have consumers processing messages as fast as (or faster than) your producers are pushing messages.

Because it’s impossible to talk about how the broker (ActiveMQ) behaves for each and every use case, we walk through how to set up your own benchmark for Amazon MQ using our favorite open-source benchmarking tool: JMS Benchmark. We are fans of the JMS Benchmark suite because it’s easy to set up and deploy, and comes with a built-in visualizer of the results.

Non-Persistent Scenarios – Queue latency as you scale producer throughput

JMS Benchmark nonpersistent scenarios

Getting started

At the time of publication, you can create an mq.m4.large single-instance broker for testing for $0.30 per hour (US pricing).

This walkthrough covers the following tasks:

  1. Create and configure the broker.
  2. Create an EC2 instance to run your benchmark.
  3. Configure the security groups.
  4. Run the benchmark.

Step 1 – Create and configure the broker
Create and configure the broker using Tutorial: Creating and Configuring an Amazon MQ Broker.

Step 2 – Create an EC2 instance to run your benchmark
Launch the EC2 instance using Step 1: Launch an Instance. We recommend choosing the m5.large instance type.

Step 3 – Configure the security groups
Make sure that all the security groups are correctly configured to let the traffic flow between the EC2 instance and your broker.

  1. Sign in to the Amazon MQ console.
  2. From the broker list, choose the name of your broker (for example, MyBroker).
  3. In the Details section, under Security and network, choose the name of your security group or choose the expand icon ( ).
  4. From the security group list, choose your security group.
  5. At the bottom of the page, choose Inbound, Edit.
  6. In the Edit inbound rules dialog box, add a rule to allow traffic between your instance and the broker:
    • Choose Add Rule.
    • For Type, choose Custom TCP.
    • For Port Range, type the ActiveMQ SSL port (61617).
    • For Source, leave Custom selected and then type the security group of your EC2 instance.
    • Choose Save.

Your broker can now accept the connection from your EC2 instance.

Step 4 – Run the benchmark
Connect to your EC2 instance using SSH and run the following commands:

$ cd ~
$ curl -L https://github.com/alanprot/jms-benchmark/archive/master.zip -o master.zip
$ unzip master.zip
$ cd jms-benchmark-master
$ chmod a+x bin/*
$ env \
  SERVER_SETUP=false \
  SERVER_ADDRESS={activemq-endpoint} \
  ACTIVEMQ_TRANSPORT=ssl \
  ACTIVEMQ_PORT=61617 \
  ACTIVEMQ_USERNAME={activemq-user} \
  ACTIVEMQ_PASSWORD={activemq-password} \
  ./bin/benchmark-activemq

After the benchmark finishes, you can find the results in the ~/reports directory. As you may notice, the performance of ActiveMQ varies based on the number of consumers, producers, destinations, and message size.

Amazon MQ architecture

The last bit that’s important to know so that you can better understand the results of the benchmark is how Amazon MQ is architected.

Amazon MQ is architected to be highly available (HA) and durable. For HA, we recommend using the multi-AZ option. After a message is sent to Amazon MQ in persistent mode, the message is written to the highly durable message store that replicates the data across multiple nodes in multiple Availability Zones. Because of this replication, for some use cases you may see a reduction in throughput as you migrate to Amazon MQ. Customers have told us they appreciate the benefits of message replication as it helps protect durability even in the face of the loss of an Availability Zone.

Conclusion

We hope this gives you an idea of how Amazon MQ performs. We encourage you to run tests to simulate your own use cases.

To learn more, see the Amazon MQ website. You can try Amazon MQ for free with the AWS Free Tier, which includes up to 750 hours of a single-instance mq.t2.micro broker and up to 1 GB of storage per month for one year.

C is too low level

Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/05/c-is-too-low-level.html

I’m in danger of contradicting myself, after previously pointing out that x86 machine code is a high-level language, but this article claiming C is not a low-level language is bunk. C certainly has some problems, but it’s still the closest language to assembly. This is obvious from the fact that it’s still the fastest compiled language. What we see is a typical academic out of touch with the real world.

The author makes the (wrong) observation that we’ve been stuck emulating the PDP-11 for the past 40 years. C was written for the PDP-11, and since then CPUs have been designed to make C run faster. The author imagines a different world, one where CPU designers instead target something like LISP or Erlang as their preferred language. This misunderstands the state of the market. CPUs do indeed support lots of different abstractions, and C has evolved to accommodate this.


The author criticizes things like “out-of-order” execution, which has led to the Spectre side-channel vulnerabilities. Out-of-order execution is necessary to make C run faster. The author claims instead that those resources should be spent on having more, slower CPUs with more threads. This sacrifices single-threaded performance in exchange for a lot more threads executing in parallel. The author cites Sparc Tx CPUs as his ideal processor.

But here’s the thing, the Sparc Tx was a failure. To be fair, it’s mostly a failure because most of the time, people wanted to run old C code instead of new Erlang code. But it was still a failure at running Erlang.

Time after time, engineers keep finding that “out-of-order”, single-threaded performance is still the winner. A good example is ARM processors for both mobile phones and servers. All the theory points to in-order CPUs as being better, but all the products are out-of-order, because this theory is wrong. The custom ARM cores from Apple and Qualcomm used in most high-end phones are so deeply out-of-order they give Intel CPUs competition. The same is true on the server front with the latest Qualcomm Centriq and Cavium ThunderX2 processors, deeply out of order supporting more than 100 instructions in flight.

The Cavium is especially telling. Its ThunderX CPU had 48 simple cores which was replaced with the ThunderX2 having 32 complex, deeply out-of-order cores. The performance increase was massive, even on multithread-friendly workloads. Every competitor to Intel’s dominance in the server space has learned the lesson from Sparc Tx: many wimpy cores is a failure, you need fewer beefy cores. Yes, they don’t need to be as beefy as Intel’s processors, but they need to be close.

Even Intel’s “Xeon Phi” custom chip learned this lesson. This is their GPU-like chip, running 60 cores with 512-bit wide “vector” (sic) instructions, designed for supercomputer applications. Its first version was purely in-order. Its current version is slightly out-of-order. It supports four threads and focuses on basic number crunching, so in-order cores seem like the right approach, but Intel found in this case that out-of-order processing still provided a benefit. Practice is different than theory.

As an academic, the author of the above article focuses on abstractions. The criticism of C is that it has the wrong abstractions which are hard to optimize, and that if we instead expressed things in the right abstractions, it would be easier to optimize.

This is an intellectually compelling argument, but so far bunk.

The reason is that while the theoretical base language has issues, everyone programs using extensions to the language, like “intrinsics” (C ‘functions’ that map to assembly instructions). Programmers write libraries using these intrinsics, which then the rest of the normal programmers use. In other words, if your criticism is that C is not itself low level enough, it still provides the best access to low level capabilities.

Given that C can access new functionality in CPUs, CPU designers add new paradigms, from SIMD to transaction processing. In other words, while in the 1980s CPUs were designed to optimize C (stacks, scaled pointers), these days CPUs are designed to optimize tasks regardless of language.

The author of that article criticizes the memory/cache hierarchy, claiming it has problems. Yes, it has problems, but only compared to how well it normally works. The author praises the many simple cores/threads idea as hiding memory latency with little caching, but misses the point that caches also dramatically increase memory bandwidth. Intel processors are optimized to read a whopping 256 bits every clock cycle from L1 cache. Main memory bandwidth is orders of magnitude slower.

The author goes on to criticize cache coherency as a problem. C uses it, but other languages like Erlang don’t need it. But that’s largely due to the problems each language solves. Erlang solves the problem where a large number of threads work on largely independent tasks, needing to send only small messages to each other across threads. The problem C solves is when you need many threads working on a huge, common set of data.

For example, consider the “intrusion prevention system”. Any thread can process any incoming packet that corresponds to any region of memory. There’s no practical way of solving this problem without a huge coherent cache. It doesn’t matter which language or abstractions you use, it’s the fundamental constraint of the problem being solved. RDMA is an important concept that’s moved from supercomputer applications to the data center, such as with memcached. Again, we have the problem of huge quantities (terabytes worth) shared among threads rather than small quantities (kilobytes).

The fundamental issue the author of the paper is ignoring is decreasing marginal returns. Moore’s Law has gifted us more transistors than we can usefully use. We can’t apply those additional transistors to just one thing, because the useful returns we get diminish.

For example, Intel CPUs have two hardware threads per core. That’s because there are good returns by adding a single additional thread. However, the usefulness of adding a third or fourth thread decreases. That’s why many CPUs have only two threads, or sometimes four threads, but no CPU has 16 threads per core.

You can apply the same discussion to any aspect of the CPU, from register count, to SIMD width, to cache size, to out-of-order depth, and so on. Rather than focusing on one of these things and increasing it to the extreme, CPU designers make each a bit larger every process tick that adds more transistors to the chip.

The same applies to cores. It’s why the “more simpler cores” strategy fails, because more cores have their own decreasing marginal returns. Instead of adding cores tied to limited memory bandwidth, it’s better to add more cache. Such cache already increases the size of the cores, so at some point it’s more effective to add a few out-of-order features to each core rather than more cores. And so on.

The question isn’t whether we can change this paradigm and radically redesign CPUs to match some academic’s view of the perfect abstraction. Instead, the goal is to find new uses for those additional transistors. For example, “message passing” is a useful abstraction in languages like Go and Erlang that’s often more useful than sharing memory. It’s implemented with shared memory and atomic instructions, but I can’t help thinking it could be done better with direct hardware support.

Of course, as soon as they do that, it’ll become an intrinsic in C, then added to languages like Go and Erlang.

Summary

Academics live in an ideal world of abstractions; the rest of us live in practical reality. The reality is that the vast majority of programmers work with the C family of languages (JavaScript, Go, etc.), whereas academics love the epiphanies they learned using other languages, especially functional languages. CPUs are only superficially designed to run C and to preserve “PDP-11 compatibility”. Instead, they keep adding features to support other abstractions, abstractions that are also available to C. CPU designers are driven by decreasing marginal returns: they would love to add new abstractions to the hardware, because it’s a cheap way to make use of additional transistors. Academics are wrong to believe that the entire system needs to be redesigned from scratch. Instead, they just need to come up with new abstractions that CPU designers can add.

Raspberry Jam Cameroon #PiParty

Post Syndicated from Ben Nuttall original https://www.raspberrypi.org/blog/raspberry-jam-cameroon-piparty/

Earlier this year on 3 and 4 March, communities around the world held Raspberry Jam events to celebrate Raspberry Pi’s sixth birthday. We sent out special birthday kits to participating Jams — it was amazing to know the kits would end up in the hands of people in parts of the world very far from Raspberry Pi HQ in Cambridge, UK.

The Raspberry Jam Camer team: Damien Doumer, Eyong Etta, Loïc Dessap and Lionel Sichom, aka Lionel Tellem

Preparing for the #PiParty

One birthday kit went to Yaoundé, the capital of Cameroon. There, a team of four students in their twenties — Lionel Sichom (aka Lionel Tellem), Eyong Etta, Loïc Dessap, and Damien Doumer — were organising Yaoundé’s first Jam, called Raspberry Jam Camer, as part of the Raspberry Jam Big Birthday Weekend. The team knew one another through their shared interests and skills in electronics, robotics, and programming. Damien explains in his blog post about the Jam that they planned ahead for several activities for the Jam based on their own projects, so they could be confident of having a few things that would definitely be successful for attendees to do and see.

Show-and-tell at Raspberry Jam Cameroon

Loïc presented a Raspberry Pi–based, Android app–controlled robot arm that he had built, and Lionel coded a small video game using Scratch on Raspberry Pi while the audience watched. Damien demonstrated the possibilities of Windows 10 IoT Core on Raspberry Pi, showing how to install it, how to use it remotely, and what you can do with it, including building a simple application.

Loïc Dessap, wearing a Raspberry Jam Big Birthday Weekend T-shirt, sits at a table with a robot arm, a laptop with a Pi sticker and other components. He is making an adjustment to his set-up.

Loïc showcases the prototype robot arm he built

There was lots more too, with others discussing their own Pi projects and talking about the possibilities Raspberry Pi offers, including a Pi-controlled drone and car. Cake was a prevailing theme of the Raspberry Jam Big Birthday Weekend around the world, and Raspberry Jam Camer made sure they didn’t miss out.

A round pink-iced cake decorated with the words "Happy Birthday RBP" and six candles, on a table beside Raspberry Pi stickers, Raspberry Jam stickers and Raspberry Jam fliers

Yay, birthday cake!!

A big success

Most visitors to the Jam were secondary school students, while others were university students and graduates. The majority were unfamiliar with Raspberry Pi, but all wanted to learn about Raspberry Pi and what they could do with it. Damien comments that the fact most people were new to Raspberry Pi made the event more interactive rather than creating any challenges, because the visitors were all interested in finding out about the little computer. The Jam was an all-round success, and the team was pleased with how it went:

What I liked the most was that we sensitized several people about the Raspberry Pi and what one can be capable of with such a small but powerful device. — Damien Doumer

The Jam team rounded off the event by announcing that this was the start of a Raspberry Pi community in Yaoundé. They hope that they and others will be able to organise more Jams and similar events in the area to spread the word about what people can do with Raspberry Pi, and to help them realise their ideas.

The Raspberry Jam Camer team, wearing Raspberry Jam Big Birthday Weekend T-shirts, pose with young Jam attendees outside their venue

Raspberry Jam Camer gets the thumbs-up

The Raspberry Pi community in Cameroon

In a French-language interview about their Jam, the team behind Raspberry Jam Camer said they’d like programming to become the third official language of Cameroon, after French and English; their aim is to popularise programming and digital making across Cameroonian society. Neither of these fields is very familiar to most people in Cameroon, but both are very well aligned with the country’s ambitions for development. The team is conscious of the difficulties around the emergence of information and communication technologies in the Cameroonian context; in response, they are seizing the opportunities Raspberry Pi offers to give children and young people access to modern and constantly evolving technology at low cost.

Thanks to Lionel, Eyong, Damien, and Loïc, and to everyone who helped put on a Jam for the Big Birthday Weekend! Remember, anyone can start a Jam at any time — and we provide plenty of resources to get you started. Check out the Guidebook, the Jam branding pack, our specially-made Jam activities online (in multiple languages), printable worksheets, and more.

The post Raspberry Jam Cameroon #PiParty appeared first on Raspberry Pi.

Amazon Aurora Backtrack – Turn Back Time

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-aurora-backtrack-turn-back-time/

We’ve all been there! You need to make a quick, seemingly simple fix to an important production database. You compose the query, give it a once-over, and let it run. Seconds later you realize that you forgot the WHERE clause, dropped the wrong table, or made another serious mistake. You interrupt the query, but the damage has been done. You take a deep breath, whistle through your teeth, and wish that reality came with an Undo option. Now what?

New Amazon Aurora Backtrack
Today I would like to tell you about the new backtrack feature for Amazon Aurora. This is as close as we can come, given present-day technology, to an Undo option for reality.

This feature can be enabled at launch time for all newly-launched Aurora database clusters. To enable it, you simply specify how far back in time you might want to rewind, and use the database as usual (this is on the Configure advanced settings page):

Aurora uses a distributed, log-structured storage system (read Design Considerations for High Throughput Cloud-Native Relational Databases to learn a lot more); each change to your database generates a new log record, identified by a Log Sequence Number (LSN). Enabling the backtrack feature provisions a FIFO buffer in the cluster for storage of LSNs. This allows for quick access and recovery times measured in seconds.
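As a purely conceptual illustration (this is not Aurora’s implementation, and the type and field names below are invented), here is a toy Go sketch of why a bounded FIFO of change records makes a rewind window cheap: as long as the target LSN is still inside the buffer, rewinding to it only means applying the undo records that came after it.

package main

import "fmt"

type change struct {
	lsn  uint64 // Log Sequence Number of this change
	undo string // whatever is needed to reverse the change
}

type backtrackBuffer struct {
	changes []change
	limit   int // how far back in history we keep records
}

func (b *backtrackBuffer) record(c change) {
	b.changes = append(b.changes, c)
	if len(b.changes) > b.limit {
		b.changes = b.changes[1:] // FIFO: the oldest record falls out of the rewind window
	}
}

func (b *backtrackBuffer) rewindTo(lsn uint64) []change {
	var undos []change
	for i := len(b.changes) - 1; i >= 0 && b.changes[i].lsn > lsn; i-- {
		undos = append(undos, b.changes[i]) // newest first, everything after the target LSN
	}
	return undos
}

func main() {
	b := &backtrackBuffer{limit: 1000}
	for i := uint64(1); i <= 5; i++ {
		b.record(change{lsn: i, undo: fmt.Sprintf("undo change #%d", i)})
	}
	fmt.Println(b.rewindTo(2)) // undo records for LSNs 5, 4, 3
}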

After that regrettable moment when all seems lost, you simply pause your application, open up the Aurora Console, select the cluster, and click Backtrack DB cluster:

Then you select Backtrack and choose the point in time just before your epic fail, and click Backtrack DB cluster:

Then you wait for the rewind to take place, unpause your application, and proceed as if nothing had happened. When you initiate a backtrack, Aurora will pause the database, close any open connections, drop uncommitted writes, and wait for the backtrack to complete. Then it will resume normal operation and begin to accept requests. The instance state will be backtracking while the rewind is underway:

The console will let you know when the backtrack is complete:

If it turns out that you went back a bit too far, you can backtrack to a later time. Other Aurora features such as cloning, backups, and restores continue to work on an instance that has been configured for backtrack.

I’m sure you can think of some creative and non-obvious use cases for this cool new feature. For example, you could use it to restore a test database after running a test that makes changes to the database. You can initiate the restoration from the API or the CLI, making it easy to integrate into your existing test framework.
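For example, a test harness could trigger the rewind with a single CLI call along these lines (a sketch with a placeholder cluster name and timestamp, assuming an AWS CLI version recent enough to include the RDS backtrack command):

$ aws rds backtrack-db-cluster \
    --db-cluster-identifier my-test-cluster \
    --backtrack-to 2018-05-10T12:00:00Z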

Things to Know
This option applies to newly created MySQL-compatible Aurora database clusters and to MySQL-compatible clusters that have been restored from a backup. You must opt in when you create or restore a cluster; you cannot enable it for a running cluster.

This feature is available now in all AWS Regions where Amazon Aurora runs, and you can start using it today.

Jeff;

Creating a 1.3 Million vCPU Grid on AWS using EC2 Spot Instances and TIBCO GridServer

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/creating-a-1-3-million-vcpu-grid-on-aws-using-ec2-spot-instances-and-tibco-gridserver/

Many of my colleagues are fortunate to be able to spend a good part of their day sitting down with and listening to our customers, doing their best to understand ways that we can better meet their business and technology needs. This information is treated with extreme care and is used to drive the roadmap for new services and new features.

AWS customers in the financial services industry (often abbreviated as FSI) are looking ahead to the Fundamental Review of the Trading Book (FRTB) regulations that will come into effect between 2019 and 2021. Among other things, these regulations mandate a new approach to the “value at risk” calculations that each financial institution must perform in the four-hour time window after trading ends in New York and begins in Tokyo. Today, our customers report that this mission-critical calculation consumes on the order of 200,000 vCPUs, growing to between 400K and 800K vCPUs in order to meet the FRTB regulations. While there’s still some debate about the magnitude and frequency with which they’ll need to run this expanded calculation, the overall direction is clear.

Building a Big Grid
In order to make sure that we are ready to help our FSI customers meet these new regulations, we worked with TIBCO to set up and run a proof-of-concept grid in the AWS Cloud. The periodic nature of the calculation, along with the amount of processing power and storage needed to run it to completion within four hours, makes it a great fit for an environment where a vast amount of cost-effective compute power is available on an on-demand basis.

Our customers are already using the TIBCO GridServer on-premises and want to use it in the cloud. This product is designed to run grids at enterprise scale. It runs apps in a virtualized fashion, and accepts requests for resources, dynamically provisioning them on an as-needed basis. The cloud version supports Amazon Linux as well as the PostgreSQL-compatible edition of Amazon Aurora.

Working together with TIBCO, we set out to create a grid that was substantially larger than the current high-end prediction of 800K vCPUs, adding a 50% safety factor and then rounding up to reach 1.3 million vCPUs (5x the size of the largest on-premises grid). With that target in mind, the account limits were raised as follows:

  • Spot Instance Limit – 120,000
  • EBS Volume Limit – 120,000
  • EBS Capacity Limit – 2 PB

If you plan to create a grid of this size, you should also bring your friendly local AWS Solutions Architect into the loop as early as possible. They will review your plans, provide you with architecture guidance, and help you to schedule your run.

Running the Grid
We hit the Go button and launched the grid, watching as it bid for and obtained Spot Instances, each of which booted, initialized, and joined the grid within two minutes. The test workload used the Strata open source analytics & market risk library from OpenGamma and was set up with their assistance.

The grid grew to 61,299 Spot Instances (1.3 million vCPUs drawn from 34 instance types spanning 3 generations of EC2 hardware) as planned, with just 1,937 instances reclaimed and automatically replaced during the run, and cost $30,000 per hour to run, at an average hourly cost of $0.078 per vCPU. If the same instances had been used in On-Demand form, the hourly cost to run the grid would have been approximately $93,000.

Despite the scale of the grid, prices for the EC2 instances did not move during the bidding process. This is due to the overall size of the AWS Cloud and the smooth price change model that we launched late last year.

To give you a sense of the compute power, we computed that this grid would have taken the #1 position on the TOP 500 supercomputer list in November 2007 by a considerable margin, and the #2 position in June 2008. Today, it would occupy position #360 on the list.

I hope that you enjoyed this AWS success story, and that it gives you an idea of the scale that you can achieve in the cloud!

Jeff;

Ray Ozzie’s Encryption Backdoor

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/05/ray_ozzies_encr.html

Last month, Wired published a long article about Ray Ozzie and his supposed new scheme for adding a backdoor in encrypted devices. It’s a weird article. It paints Ozzie’s proposal as something that “attains the impossible” and “satisfies both law enforcement and privacy purists,” when (1) it’s barely a proposal, and (2) it’s essentially the same key escrow scheme we’ve been hearing about for decades.

Basically, each device has a unique public/private key pair and a secure processor. The public key goes into the processor and the device, and is used to encrypt whatever user key encrypts the data. The private key is stored in a secure database, available to law enforcement on demand. The only other trick is that for law enforcement to use that key, they have to put the device in some sort of irreversible recovery mode, which means it can never be used again. That’s basically it.
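To underline how small that cryptographic core is, here is a generic key-wrapping sketch in Go. This is not Ozzie’s actual design, just the textbook escrow step that any such scheme reduces to; the key size is illustrative and error handling is omitted. Everything identified below as hard (protecting the escrow database, the secure coprocessor, the policy questions) sits outside these few lines.

package main

import (
	"bytes"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"fmt"
)

func main() {
	// Stand-in for the vendor's escrow key pair; in a real scheme the private
	// half would live only in the vendor's vault, never on the device.
	escrowKey, _ := rsa.GenerateKey(rand.Reader, 2048)

	// Stand-in for the per-device key that actually encrypts the user's data.
	deviceKey := make([]byte, 32)
	rand.Read(deviceKey)

	// The device wraps its data key under the escrow public key and stores the blob.
	wrapped, _ := rsa.EncryptOAEP(sha256.New(), rand.Reader, &escrowKey.PublicKey, deviceKey, nil)

	// Whoever holds the escrow private key can unwrap the device key later.
	unwrapped, _ := rsa.DecryptOAEP(sha256.New(), rand.Reader, escrowKey, wrapped, nil)

	fmt.Println("recovered:", bytes.Equal(unwrapped, deviceKey))
}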

I have no idea why anyone is talking as if this were anything new. Several cryptographers have already explained why this key escrow scheme is no better than any other key escrow scheme. The short answer is (1) we won’t be able to secure that database of backdoor keys, (2) we don’t know how to build the secure coprocessor the scheme requires, and (3) it solves none of the policy problems around the whole system. This is the typical mistake non-cryptographers make when they approach this problem: they think that the hard part is the cryptography to create the backdoor. That’s actually the easy part. The hard part is ensuring that it’s only used by the good guys, and there’s nothing in Ozzie’s proposal that addresses any of that.

I worry that this kind of thing is damaging in the long run. There should be some rule that any backdoor or key escrow proposal be a fully specified proposal, not just some cryptography and hand-waving notions about how it will be used in practice. And before it is analyzed and debated, it should have to satisfy some sort of basic security analysis. Otherwise, we’ll be swatting pseudo-proposals like this one, while those on the other side of this debate become increasingly convinced that it’s possible to design one of these things securely.

Already people are using the National Academies report on backdoors for law enforcement as evidence that engineers are developing workable and secure backdoors. Writing in Lawfare, Alan Z. Rozenshtein claims that the report — and a related New York Times story — “undermine the argument that secure third-party access systems are so implausible that it’s not even worth trying to develop them.” Susan Landau effectively corrects this misconception, but the damage is done.

Here’s the thing: it’s not hard to design and build a backdoor. What’s hard is building the systems — both technical and procedural — around them. Here’s Rob Graham:

He’s only solving the part we already know how to solve. He’s deliberately ignoring the stuff we don’t know how to solve. We know how to make backdoors, we just don’t know how to secure them.

A bunch of us cryptographers have already explained why we don’t think this sort of thing will work in the foreseeable future. We write:

Exceptional access would force Internet system developers to reverse “forward secrecy” design practices that seek to minimize the impact on user privacy when systems are breached. The complexity of today’s Internet environment, with millions of apps and globally connected services, means that new law enforcement requirements are likely to introduce unanticipated, hard to detect security flaws. Beyond these and other technical vulnerabilities, the prospect of globally deployed exceptional access systems raises difficult problems about how such an environment would be governed and how to ensure that such systems would respect human rights and the rule of law.

Finally, Matthew Green:

The reason so few of us are willing to bet on massive-scale key escrow systems is that we’ve thought about it and we don’t think it will work. We’ve looked at the threat model, the usage model, and the quality of hardware and software that exists today. Our informed opinion is that there’s no detection system for key theft, there’s no renewability system, HSMs are terrifically vulnerable (and the companies largely staffed with ex-intelligence employees), and insiders can be suborned. We’re not going to put the data of a few billion people on the line in an environment where we believe with high probability that the system will fail.

EDITED TO ADD (5/14): An analysis of the proposal.

Announcing Local Build Support for AWS CodeBuild

Post Syndicated from Karthik Thirugnanasambandam original https://aws.amazon.com/blogs/devops/announcing-local-build-support-for-aws-codebuild/

Today, we’re excited to announce local build support in AWS CodeBuild.

AWS CodeBuild is a fully managed build service. There are no servers to provision and scale, or software to install, configure, and operate. You just specify the location of your source code, choose your build settings, and CodeBuild runs build scripts for compiling, testing, and packaging your code.

In this blog post, I’ll show you how to set up CodeBuild locally to build and test a sample Java application.

By building an application on a local machine you can:

  • Test the integrity and contents of a buildspec file locally.
  • Test and build an application locally before committing.
  • Identify and fix errors quickly from your local development environment.

Prerequisites

In this post, I am using AWS Cloud9 IDE as my development environment.

If you would like to use AWS Cloud9 as your IDE, follow the express setup steps in the AWS Cloud9 User Guide.

The AWS Cloud9 IDE comes with Docker and Git already installed. If you are going to use your laptop or desktop machine as your development environment, install Docker and Git before you start.

Steps to build CodeBuild image locally

Run git clone https://github.com/aws/aws-codebuild-docker-images.git to download this repository to your local machine.

$ git clone https://github.com/aws/aws-codebuild-docker-images.git

Let’s build a local CodeBuild image for the JDK 8 environment. The Dockerfile for JDK 8 is present in /aws-codebuild-docker-images/ubuntu/java/openjdk-8.

Edit the Dockerfile to remove the last line, ENTRYPOINT ["dockerd-entrypoint.sh"], and save the file.

Run cd ubuntu/java/openjdk-8 to change the directory in your local workspace.

Run docker build -t aws/codebuild/java:openjdk-8 . to build the Docker image locally. This command will take a few minutes to complete.

$ cd aws-codebuild-docker-images
$ cd ubuntu/java/openjdk-8
$ docker build -t aws/codebuild/java:openjdk-8 .

Steps to set up the CodeBuild local agent

Run the following Docker pull command to download the local CodeBuild agent.

$ docker pull amazon/aws-codebuild-local:latest --disable-content-trust=false

Now you have the local agent image on your machine and can run a local build.

Run the following git command to download a sample Java project.

$ git clone https://github.com/karthiksambandam/sample-web-app.git

Steps to use the local agent to build a sample project

Let’s build the sample Java project using the local agent.

Execute the following Docker command to run the local agent and build the sample web app repository you cloned earlier.

$ docker run -it -v /var/run/docker.sock:/var/run/docker.sock -e "IMAGE_NAME=aws/codebuild/java:openjdk-8" -e "ARTIFACTS=/home/ec2-user/environment/artifacts" -e "SOURCE=/home/ec2-user/environment/sample-web-app" amazon/aws-codebuild-local

Note: We need to provide three environment variables, namely IMAGE_NAME, SOURCE, and ARTIFACTS.

IMAGE_NAME: The name of your build environment image.

SOURCE: The absolute path to your source code directory.

ARTIFACTS: The absolute path to your artifact output folder.

When you run the sample project, you get a runtime error that says the YAML file does not exist. This is because a buildspec.yml file is not included in the sample web project. AWS CodeBuild requires a buildspec.yml to run a build. For more information about buildspec.yml, see Build Spec Example in the AWS CodeBuild User Guide.

Let’s add a buildspec.yml file with the following content to the sample-web-app folder and then rebuild the project.

version: 0.2

phases:
  build:
    commands:
      - echo Build started on `date`
      - mvn install

artifacts:
  files:
    - target/javawebdemo.war

$ docker run -it -v /var/run/docker.sock:/var/run/docker.sock -e "IMAGE_NAME=aws/codebuild/java:openjdk-8" -e "ARTIFACTS=/home/ec2-user/environment/artifacts" -e "SOURCE=/home/ec2-user/environment/sample-web-app" amazon/aws-codebuild-local

This time your build should be successful. Upon successful execution, look in the /artifacts folder for the final built artifacts.zip file to validate.

Conclusion:

In this blog post, I showed you how to quickly set up the CodeBuild local agent to build projects right from your local desktop machine or laptop. As you see, local builds can improve developer productivity by helping you identify and fix errors quickly.

I hope you found this post useful. Feel free to leave your feedback or suggestions in the comments.

MyEtherWallet DNS Hack Causes 17 Million USD User Loss

Post Syndicated from Darknet original https://www.darknet.org.uk/2018/04/myetherwallet-dns-hack-causes-17-million-usd-user-loss/?utm_source=rss&utm_medium=social&utm_campaign=darknetfeed

MyEtherWallet DNS Hack Causes 17 Million USD User Loss

Big news in the crypto scene this week was that the MyEtherWallet DNS hack that occurred managed to collect about $17 Million USD worth of Ethereum in just a few hours.

The hack itself could have been MUCH bigger, as it actually involved compromising 1300 Amazon AWS Route 53 DNS IP addresses; fortunately, though, only MEW was targeted, resulting in the damage being contained to the cryptosphere (as far as we know anyway).

Read the rest of MyEtherWallet DNS Hack Causes 17 Million USD User Loss now! Only available at Darknet.

[$] Exposing storage devices as memory

Post Syndicated from corbet original https://lwn.net/Articles/752969/rss

Storage devices are in a period of extensive change. As they get faster and become byte-addressable by the CPU, they tend to look increasingly like ordinary memory. But they aren’t memory, so it still isn’t clear what the best model for accessing them should be. Adam Manzanares led a session during the memory-management track of the 2018 Linux Storage, Filesystem, and Memory-Management Summit, where his proposal of a new access mechanism ran into some skepticism.

No, Ray Ozzie hasn’t solved crypto backdoors

Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/04/no-ray-ozzie-hasnt-solved-crypto.html

According to this Wired article, Ray Ozzie may have a solution to the crypto backdoor problem. No, he hasn’t. He’s only solving the part we already know how to solve. He’s deliberately ignoring the stuff we don’t know how to solve. We know how to make backdoors, we just don’t know how to secure them.

The vault doesn’t scale

Yes, Apple has a vault where they’ve successfully protected important keys. No, it doesn’t mean this vault scales. The more people and the more often you have to touch the vault, the less secure it becomes. We are talking thousands of requests per day from 100,000 different law enforcement agencies around the world. We are unlikely to protect this against incompetence and mistakes. We are definitely unable to secure this against deliberate attack.

A good analogy to Ozzie’s solution is LetsEncrypt for getting SSL certificates for your website, which is fairly scalable, using a private key locked in a vault for signing hundreds of thousands of certificates. That this scales seems to validate Ozzie’s proposal.

But at the same time, LetsEncrypt is easily subverted. LetsEncrypt uses DNS to verify your identity. But spoofing DNS is easy, as was shown in the recent BGP attack against a cryptocurrency wallet service. Attackers can create fraudulent SSL certificates with enough effort. We’ve got other protections against this, such as discovering and revoking the bad SSL certificate, so while damaging, it’s not catastrophic.

But with Ozzie’s scheme, equivalent attacks would be catastrophic, as it would lead to unlocking the phone and stealing all of somebody’s secrets.

In particular, consider what would happen if LetsEncrypt’s certificate was stolen (as Matthew Green points out). The consequence is that this would be detected and mass revocations would occur. If Ozzie’s master key were stolen, nothing would happen. Nobody would know, and evildoers would be able to freely decrypt phones. Ozzie claims his scheme can work because SSL works — but then his scheme includes none of the many protections necessary to make SSL work.

What I’m trying to show here is that in a lab, it all looks nice and pretty, but when attacked at scale, things break down — quickly. We have so much experience with failure at scale that we can judge Ozzie’s scheme as woefully incomplete. It’s not even up to the standard of SSL, and we have a long list of SSL problems.

Cryptography is about people more than math

We have a mathematically pure encryption algorithm called the “One Time Pad”. It can’t ever be broken, provably so with mathematics.

It’s also perfectly useless, as it’s not something humans can use. That’s why we use AES, which is vastly less secure (anything you encrypt today can probably be decrypted in 100 years). AES can be used by humans, whereas One Time Pads cannot be. (I learned the fallacy of One Time Pads on my grandfather’s knee; he was a WWII codebreaker who broke German messages whose operators tried to futz with One Time Pads.)

The same is true with Ozzie’s scheme. It focuses on the mathematical model but ignores the human element. We already know how to solve the mathematical problem in a hundred different ways. The part we don’t know how to secure is the human element.

How do we know the law enforcement person is who they say they are? How do we know the “trusted Apple employee” can’t be bribed? How can the law enforcement agent communicate securely with the Apple employee?

You think these things are theoretical, but they aren’t. Consider financial transactions. It used to be common that you could just email your bank/broker to wire funds into an account for such things as buying a house. Hackers have subverted that, intercepting messages, changing account numbers, and stealing millions. Most banks/brokers require additional verification before doing such transfers.

Let me repeat: Ozzie has only solved the part we already know how to solve. He hasn’t addressed these issues that confound us.

We still can’t secure security, much less secure backdoors

We already know how to decrypt iPhones: just wait a year or two for somebody to discover a vulnerability. The FBI claims it’s “going dark”, but that’s only true for timely decryption of phones. If they are willing to wait a year or two, a vulnerability will eventually be found that allows decryption.

That’s what’s happened with the “GrayKey” device that’s been all over the news lately. Apple is fixing it so that it won’t work on new phones, but it works on old phones.

Ozzie’s solution is based on the assumption that iPhones are already secure against things like GrayKey. Like his assumption “if Apple already has a vault for private keys, then we have such vaults for backdoor keys”, Ozzie is saying “if Apple already had secure hardware/software to secure the phone, then we can use the same stuff to secure the backdoors”. But we don’t really have secure vaults and we don’t really have secure hardware/software to secure the phone.

Again, to stress this point, Ozzie is solving the part we already know how to solve, but ignoring the stuff we don’t know how to solve. His solution is insecure for the same reason phones are already insecure.

Locked phones aren’t the problem

Phones are general-purpose computers. That means anybody can install an encryption app on the phone regardless of whatever other security the phone might provide. The police are powerless to stop this. Even if they make such encryption a crime, criminals will still use it.

That leads to a strange situation that the only data the FBI will be able to decrypt is that of people who believe they are innocent. Those who know they are guilty will install encryption apps like Signal that have no backdoors.

In the past this was rare, as people found learning new apps a barrier. These days, apps like Signal are so easy even drug dealers can figure out how to use them.

We know how to get Apple to give us a backdoor: just pass a law forcing them to. It may look like Ozzie’s scheme, or it may be something more secure designed by Apple’s engineers. Sure, it will weaken security on the phone for everyone, but those who truly care will just install Signal. But again we are back to the problem that Ozzie is solving the problem we know how to solve while ignoring the much larger problem, that of preventing people from installing their own encryption.

The FBI isn’t necessarily the problem

Ozzie phrases his solution in terms of U.S. law enforcement. Well, what about Europe? What about Russia? What about China? What about North Korea?

Technology is borderless. A solution in the United States that allows “legitimate” law enforcement requests will inevitably be used by repressive states for what we believe would be “illegitimate” law enforcement requests.

Ozzie sees himself as the hero helping law enforcement protect 300 million American citizens. He doesn’t see what he really is: the villain helping oppress 1.4 billion Chinese, 144 million Russians, and another couple billion people living under oppressive governments around the world.

Conclusion

Ozzie pretends the problem is political, that he’s created a solution that appeases both sides. He hasn’t. He’s solved the problem we already know how to solve. He’s ignored all the problems we struggle with, the problems we claim make secure backdoors essentially impossible. I’ve listed some in this post, but there are many more. Any famous person can create a solution that convinces fawning editors at Wired Magazine, but if Ozzie wants to move forward he’s going to have to work harder to appease doubting cryptographers.

Continued: the answers to your questions for Eben Upton

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/eben-q-a-2/

Last week, we shared the first half of our Q&A with Raspberry Pi Trading CEO and Raspberry Pi creator Eben Upton. Today we follow up with all your other questions, including your expectations for a Raspberry Pi 4, Eben’s dream add-ons, and whether we really could go smaller than the Zero.

Live Q&A with Eben Upton, creator of the Raspberry Pi

Get your questions to us now using #AskRaspberryPi on Twitter

With internet security becoming more necessary, will there be automated versions of VPN on an SD card?

There are already third-party tools which turn your Raspberry Pi into a VPN endpoint. Would we do it ourselves? Like the power button, it’s one of those cases where there are a million things we could do and so it’s more efficient to let the community get on with it.

Just to give a counterexample, while we don’t generally invest in optimising for particular use cases, we did invest a bunch of money into optimising Kodi to run well on Raspberry Pi, because we found that very large numbers of people were using it. So, if we find that we get half a million people a year using a Raspberry Pi as a VPN endpoint, then we’ll probably invest money into optimising it and feature it on the website as we’ve done with Kodi. But I don’t think we’re there today.

Have you ever seen any Pis running and doing important jobs in the wild, and if so, how does it feel?

It’s amazing how often you see them driving displays, for example in radio and TV studios. Of course, it feels great. There’s something wonderful about the geographic spread as well. The Raspberry Pi desktop is quite distinctive, both in its previous incarnation with the grey background and logo, and the current one where we have Greg Annandale’s road picture.

The PIXEL desktop on Raspberry Pi

And so it’s funny when you see it in places. Somebody sent me a video of them teaching in a classroom in rural Pakistan and in the background was Greg’s picture.

Raspberry Pi 4!?!

There will be a Raspberry Pi 4, obviously. We get asked about it a lot. I’m sticking to the guidance that I gave people that they shouldn’t expect to see a Raspberry Pi 4 this year. To some extent, the opportunity to do the 3B+ was a surprise: we were surprised that we’ve been able to get 200MHz more clock speed, triple the wireless and wired throughput, and better thermals, and still stick to the $35 price point.

We’re up against the wall from a silicon perspective; we’re at the end of what you can do with the 40nm process. It’s not that you couldn’t clock the processor faster, or put a larger processor which can execute more instructions per clock in there, it’s simply about the energy consumption and the fact that you can’t dissipate the heat. So we’ve got to go to a smaller process node and that’s an order of magnitude more challenging from an engineering perspective. There’s more effort, more risk, more cost, and all of those things are challenging.

With 3B+ out of the way, we’re going to start looking at this now. For the first six months or so we’re going to be figuring out exactly what people want from a Raspberry Pi 4. We’re listening to people’s comments about what they’d like to see in a new Raspberry Pi, and I’m hoping by early autumn we should have an idea of what we want to put in it and a strategy for how we might achieve that.

Could you go smaller than the Zero?

The challenge with Zero is that we’re periphery-limited. If you run your hand around the unit, there is no edge of that board that doesn’t have something there. So the question is: “If you want to go smaller than Zero, what feature are you willing to throw out?”

It’s a single-sided board, so you could certainly halve the PCB area if you fold the circuitry and use both sides, though you’d have to lose something. You could give up some GPIO and go back to 26 pins like the first Raspberry Pi. You could give up the camera connector, you could go to micro HDMI from mini HDMI. You could remove the SD card and just do USB boot. I’m inventing a product live on air! But really, you could get down to two thirds and lose a bunch of GPIO – it’s hard to imagine you could get to half the size.

What’s the one feature that you wish you could outfit on the Raspberry Pi that isn’t cost effective at this time? Your dream feature.

Well, more memory. There are obviously technical reasons why we don’t have more memory on there, but there are also market reasons. People ask “why doesn’t the Raspberry Pi have more memory?”, and my response is typically “go and Google ‘DRAM price’”. We’re used to the price of memory going down. And currently, we’re going through a phase where this has turned around and memory is getting more expensive again.

Machine learning would be interesting. There are machine learning accelerators which would be interesting to put on a piece of hardware. But again, they are not going to be used by everyone, so according to our method of pricing what we might add to a board, machine learning gets treated like a $50 chip. But that would be lovely to do.

Which citizen science projects using the Pi have most caught your attention?

I like the wildlife camera projects. We live out in the countryside in a little village, and we’re conscious of being surrounded by nature but we don’t see a lot of it on a day-to-day basis. So I like the nature cam projects, though, to my everlasting shame, I haven’t set one up yet. There’s a range of them, from very professional products to people taking a Raspberry Pi and a camera and putting them in a plastic box. So those are good fun.

Raspberry Shake seismometer

The Raspberry Shake seismometer

And there’s Meteor Pi from the Cambridge Science Centre, that’s a lot of fun. And the seismometer Raspberry Shake – that sort of thing is really nice. We missed the recent South Wales earthquake; perhaps we should set one up at our Californian office.

How does it feel to go to bed every day knowing you’ve changed the world for the better in such a massive way?

What feels really good is that when we started this in 2006 nobody else was talking about it, but now we’re part of a very broad movement.

We were in a really bad way: we’d seen a collapse in the number of applicants applying to study Computer Science at Cambridge and elsewhere. In our view, this reflected a move away from seeing technology as ‘a thing you do’ to seeing it as a ‘thing that you have done to you’. It is problematic from the point of view of the economy, industry, and academia, but most importantly it damages the life prospects of individual children, particularly those from disadvantaged backgrounds. The great thing about STEM subjects is that you can’t fake being good at them. There are a lot of industries where your Dad can get you a job based on who he knows and then you can kind of muddle along. But if your dad gets you a job building bridges and you suck at it, after the first or second bridge falls down, then you probably aren’t going to be building bridges anymore. So access to STEM education can be a great driver of social mobility.

By the time we were launching the Raspberry Pi in 2012, there was this wonderful movement going on. Code Club, for example, and CoderDojo came along. Lots of different ways of trying to solve the same problem. What feels really, really good is that we’ve been able to do this as part of an enormous community. And some parts of that community became part of the Raspberry Pi Foundation – we merged with Code Club, we merged with CoderDojo, and we continue to work alongside a lot of these other organisations. So in the two seconds it takes me to fall asleep after my face hits the pillow, that’s what I think about.

We’re currently advertising a Programme Manager role in New Delhi, India. Did you ever think that Raspberry Pi would be advertising a role like this when you were bringing together the Foundation?

No, I didn’t.

But if you told me we were going to be hiring somewhere, India probably would have been top of my list because there’s a massive IT industry in India. When we think about our interaction with emerging markets, India, in a lot of ways, is the poster child for how we would like it to work. There have already been some wonderful deployments of Raspberry Pi, for example in Kerala, without our direct involvement. And we think we’ve got something that’s useful for the Indian market. We have a product, we have clubs, we have teacher training. And we have a body of experience in how to teach people, so we have a physical commercial product as well as a charitable offering that we think are a good fit.

It’s going to be massive.

What is your favourite BBC type-in listing?

There was a game called Codename: Druid. There is a famous game called Codename: Droid which was the sequel to Stryker’s Run, which was an awesome, awesome game. And there was a type-in game called Codename: Druid, which was at the bottom end of what you would consider a commercial game.

codename druid

And I remember typing that in. And what was really cool about it was that the next month, the guy who wrote it did another article that talks about the memory map and which operating system functions used which bits of memory. So if you weren’t going to do disc access, which bits of memory could you trample on and know the operating system would survive.

babbage versus bugs Raspberry Pi annual

See the full listing for Babbage versus Bugs in the Raspberry Pi 2018 Annual

I still like type-in listings. The Raspberry Pi 2018 Annual has a type-in listing that I wrote for a Babbage versus Bugs game. I will say that’s not the last type-in listing you will see from me in the next twelve months. And if you download the PDF, you could probably copy and paste it into your favourite text editor to save yourself some time.

The post Continued: the answers to your questions for Eben Upton appeared first on Raspberry Pi.

Ransomware Update: Viruses Targeting Business IT Servers

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/ransomware-update-viruses-targeting-business-it-servers/

Ransomware warning message on computer

As ransomware attacks have grown in number in recent months, the tactics and attack vectors also have evolved. While the primary method of attack used to be to target individual computer users within organizations with phishing emails and infected attachments, we’re increasingly seeing attacks that target weaknesses in businesses’ IT infrastructure.

How Ransomware Attacks Typically Work

In our previous posts on ransomware, we described the common vehicles used by hackers to infect organizations with ransomware viruses. Most often, downloaders distribute trojan horses through malicious downloads and spam emails. The emails contain a variety of file attachments, which if opened, will download and run one of the many ransomware variants. Once a user’s computer is infected with a malicious downloader, it will retrieve additional malware, which frequently includes crypto-ransomware. After the files have been encrypted, a ransom payment is demanded of the victim in order to decrypt the files.

What’s Changed With the Latest Ransomware Attacks?

In 2016, a customized ransomware strain called SamSam began attacking the servers in primarily health care institutions. SamSam, unlike more conventional ransomware, is not delivered through downloads or phishing emails. Instead, the attackers behind SamSam use tools to identify unpatched servers running Red Hat’s JBoss enterprise products. Once the attackers have successfully gained entry into one of these servers by exploiting vulnerabilities in JBoss, they use other freely available tools and scripts to collect credentials and gather information on networked computers. Then they deploy their ransomware to encrypt files on these systems before demanding a ransom. Gaining entry to an organization through its IT center rather than its endpoints makes this approach scalable and especially unsettling.

SamSam’s methodology is to scour the Internet searching for accessible and vulnerable JBoss application servers, especially ones used by hospitals. It’s not unlike a burglar rattling doorknobs in a neighborhood to find unlocked homes. When SamSam finds an unlocked home (unpatched server), the software infiltrates the system. It is then free to spread across the company’s network by stealing passwords. As it traverses the network and systems, it encrypts files, preventing access until the victims pay the hackers a ransom, typically between $10,000 and $15,000. The low ransom amount has encouraged some victimized organizations to pay the ransom rather than incur the downtime required to wipe and reinitialize their IT systems.

The success of SamSam is due to its effectiveness rather than its sophistication. SamSam can enter and traverse a network without human intervention. Some organizations are learning too late that securing internet-facing services in their data center from attack is just as important as securing endpoints.

The typical steps in a SamSam ransomware attack are:

  1. Attackers gain access to a vulnerable server: they exploit vulnerable software or weak/stolen credentials.
  2. The attack spreads via remote access tools: attackers harvest credentials, create SOCKS proxies to tunnel traffic, and abuse RDP to install SamSam on more computers in the network.
  3. The ransomware payload is deployed: attackers run batch scripts to execute ransomware on compromised machines.
  4. A ransom demand is delivered requiring payment to decrypt files: demand amounts vary from victim to victim, and the relatively low amounts appear to be designed to encourage quick payment decisions.

What all the organizations successfully exploited by SamSam have in common is that they were running unpatched servers that made them vulnerable to SamSam. Some organizations had their endpoints and servers backed up, while others did not. Some of those without backups they could use to recover their systems chose to pay the ransom money.

Timeline of SamSam History and Exploits

Since its appearance in 2016, SamSam has been in the news with many successful incursions into healthcare, business, and government institutions.

March 2016
SamSam appears

SamSam campaign targets vulnerable JBoss servers
Attackers home in on healthcare organizations specifically, as they’re more likely to have unpatched JBoss machines.

April 2016
SamSam finds new targets

SamSam begins targeting schools and government.
After initial success targeting healthcare, attackers branch out to other sectors.

April 2017
New tactics include RDP

Attackers shift to targeting organizations with exposed RDP connections, and maintain focus on healthcare.
An attack on Erie County Medical Center costs the hospital $10 million over three months of recovery.
Erie County Medical Center attacked by SamSam ransomware virus

January 2018
Municipalities attacked

• Attack on Municipality of Farmington, NM.
• Attack on Hancock Health.
Hancock Regional Hospital notice following SamSam attack
• Attack on Adams Memorial Hospital
• Attack on Allscripts (Electronic Health Records), which includes 180,000 physicians, 2,500 hospitals, and 7.2 million patients’ health records.

February 2018
Attack volume increases

• Attack on Davidson County, NC.
• Attack on Colorado Department of Transportation.
SamSam virus notification

March 2018
SamSam shuts down Atlanta

• Second attack on Colorado Department of Transportation.
• City of Atlanta suffers a devastating attack by SamSam.
The attack has far-reaching impacts — crippling the court system, keeping residents from paying their water bills, limiting vital communications like sewer infrastructure requests, and pushing the Atlanta Police Department to file paper reports.
Atlanta Ransomware outage alert
• SamSam campaign nets $325,000 in 4 weeks.
Infections spike as attackers launch new campaigns. Healthcare and government organizations are once again the primary targets.

How to Defend Against SamSam and Other Ransomware Attacks

The best way to respond to a ransomware attack is to avoid having one in the first place. Beyond that, making sure your valuable data is backed up and unreachable by ransomware will ensure that your downtime and data loss are minimal or nonexistent if you ever do suffer an attack.

In our previous post, How to Recover From Ransomware, we listed the ten ways to protect your organization from ransomware.

  1. Use anti-virus and anti-malware software or other security policies to block known payloads from launching.
  2. Make frequent, comprehensive backups of all important files and isolate them from local and open networks. Cybersecurity professionals view data backup and recovery as by far the most effective solution to respond to a successful ransomware attack (74% in a recent survey).
  3. Keep offline backups of data stored in locations inaccessible from any potentially infected computer, such as disconnected external storage drives or the cloud, which prevents them from being accessed by the ransomware.
  4. Install the latest security updates issued by software vendors of your OS and applications. Remember to patch early and patch often to close known vulnerabilities in operating systems, server software, browsers, and web plugins.
  5. Consider deploying security software to protect endpoints, email servers, and network systems from infection.
  6. Exercise cyber hygiene, such as using caution when opening email attachments and links.
  7. Segment your networks to keep critical computers isolated and to prevent the spread of malware in case of attack. Turn off unneeded network shares.
  8. Turn off admin rights for users who don’t require them. Give users the lowest system permissions they need to do their work.
  9. Restrict write permissions on file servers as much as possible.
  10. Educate yourself, your employees, and your family in best practices to keep malware out of your systems. Update everyone on the latest email phishing scams and human engineering aimed at turning victims into abettors.

Please Tell Us About Your Experiences with Ransomware

Have you endured a ransomware attack or have a strategy to avoid becoming a victim? Please tell us of your experiences in the comments.

The post Ransomware Update: Viruses Targeting Business IT Servers appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Congratulations to Oracle on MySQL 8.0

Post Syndicated from Michael "Monty" Widenius original http://monty-says.blogspot.com/2018/04/congratulations-to-oracle-on-mysql-80.html

Last week, Oracle announced the general availability of MySQL 8.0. This is good news for database users, as it means Oracle is still developing MySQL.

I decided to celebrate the event by doing a quick test of MySQL 8.0. Here follows a step-by-step description of my first experience with MySQL 8.0.
Note that I did the following without reading the release notes, which is what I have done with every MySQL / MariaDB release to date; in this case it was not the right thing to do.

I pulled MySQL 8.0 from git@github.com:mysql/mysql-server.git
I was pleasantly surprised that 'cmake . ; make' worked without any compiler warnings! I even checked the compiler options used and noticed that MySQL was compiled with -Wall + several other warning flags. Good job MySQL team!

I did have a little trouble finding the mysqld binary as Oracle had moved it to ‘runtime_output_directory’; Unexpected, but no big thing.

Now it was time to install MySQL 8.0.

I did know that MySQL 8.0 has removed mysql_install_db, so I had to use the mysqld binary directly to install the default databases:
(I have specified datadir=/my/data3 in the /tmp/my.cnf file)

> cd runtime_output_directory
> mkdir /my/data3
> ./mysqld --defaults-file=/tmp/my.cnf --install

2018-04-22T12:38:18.332967Z 1 [ERROR] [MY-011011] [Server] Failed to find valid data directory.
2018-04-22T12:38:18.333109Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2018-04-22T12:38:18.333135Z 0 [ERROR] [MY-010119] [Server] Aborting

A quick look at the mysqld --help --verbose output showed that the right option is --initialize. My bad, let’s try again:

> ./mysqld --defaults-file=/tmp/my.cnf --initialize

2018-04-22T12:39:31.910509Z 0 [ERROR] [MY-010457] [Server] --initialize specified but the data directory has files in it. Aborting.
2018-04-22T12:39:31.910578Z 0 [ERROR] [MY-010119] [Server] Aborting

Now I had used the right options, but it still didn’t work.
I took a quick look around:

> ls /my/data3/
binlog.index

So even though mysqld noticed that the data3 directory was wrong, it still wrote things into it, and this even though I didn’t have --log-binlog enabled in the my.cnf file. Strange, but easy to fix:

> rm /my/data3/binlog.index
> ./mysqld --defaults-file=/tmp/my.cnf --initialize

2018-04-22T12:40:45.633637Z 0 [ERROR] [MY-011071] [Server] unknown variable ‘max-tmp-tables=100’
2018-04-22T12:40:45.633657Z 0 [Warning] [MY-010952] [Server] The privilege system failed to initialize correctly. If you have upgraded your server, make sure you’re executing mysql_upgrade to correct the issue.
2018-04-22T12:40:45.633663Z 0 [ERROR] [MY-010119] [Server] Aborting

The warning about the privilege system confused me a bit, but I ignored it for the time being and removed from my configuration files the variables that MySQL 8.0 no longer supports. I couldn’t find a list of the removed variables anywhere, so this was done by trial and error.

> ./mysqld --defaults-file=/tmp/my.cnf

2018-04-22T12:42:56.626583Z 0 [ERROR] [MY-010735] [Server] Can’t open the mysql.plugin table. Please run mysql_upgrade to create it.
2018-04-22T12:42:56.827685Z 0 [Warning] [MY-010015] [Repl] Gtid table is not ready to be used. Table ‘mysql.gtid_executed’ cannot be opened.
2018-04-22T12:42:56.838501Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2018-04-22T12:42:56.848375Z 0 [Warning] [MY-010441] [Server] Failed to open optimizer cost constant tables
2018-04-22T12:42:56.848863Z 0 [ERROR] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we’re sending the information to the error-log instead: MY-001146 – Table ‘mysql.component’ doesn’t exist
2018-04-22T12:42:56.848916Z 0 [Warning] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we’re sending the information to the error-log instead: MY-003543 – The mysql.component table is missing or has an incorrect definition.
….
2018-04-22T12:42:56.854141Z 0 [System] [MY-010931] [Server] /home/my/mysql-8.0/runtime_output_directory/mysqld: ready for connections. Version: ‘8.0.11’ socket: ‘/tmp/mysql.sock’ port: 3306 Source distribution.

I figured out that if there is a single wrong variable in the configuration file, running mysqld --initialize will leave the database in an inconsistent state. NOT GOOD! I am happy I didn’t try this in a production system!

Time to start over from the beginning:

> rm -r /my/data3/*
> ./mysqld --defaults-file=/tmp/my.cnf --initialize

2018-04-22T12:44:45.548960Z 5 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: px)NaaSp?6um
2018-04-22T12:44:51.221751Z 0 [System] [MY-013170] [Server] /home/my/mysql-8.0/runtime_output_directory/mysqld (mysqld 8.0.11) initializing of server has completed

Success!

I wonder why the temporary password is so complex; it could easily have been something one could remember without decreasing security, as it’s temporary after all. No big deal, one can always paste it from the logs. (Side note: MariaDB uses socket authentication on many systems and thus doesn’t need temporary installation passwords.)

Now let’s start the MySQL server for real to do some testing:

> ./mysqld --defaults-file=/tmp/my.cnf

2018-04-22T12:45:43.683484Z 0 [System] [MY-010931] [Server] /home/my/mysql-8.0/runtime_output_directory/mysqld: ready for connections. Version: ‘8.0.11’ socket: ‘/tmp/mysql.sock’ port: 3306 Source distribution.

And then let’s start the client:

> ./client/mysql --socket=/tmp/mysql.sock --user=root --password="px)NaaSp?6um"
ERROR 2059 (HY000): Plugin caching_sha2_password could not be loaded: /usr/local/mysql/lib/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory

Apparently MySQL 8.0 doesn’t work with old MySQL / MariaDB clients by default 🙁

I was testing this on a system with MariaDB installed, like most modern Linux systems today, and didn’t want to use the MySQL clients or libraries.

I decided to try to fix this by changing the authentication to the native (original) MySQL authentication method.

> mysqld --skip-grant-tables

> ./client/mysql --socket=/tmp/mysql.sock --user=root
ERROR 1045 (28000): Access denied for user ‘root’@’localhost’ (using password: NO)

Apparently --skip-grant-tables is not good enough anymore. Let’s try again with:

> mysqld --skip-grant-tables --default_authentication_plugin=mysql_native_password

> ./client/mysql --socket=/tmp/mysql.sock --user=root mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 8.0.11 Source distribution

Great, we are getting somewhere, now let's fix "root" to work with the old authentication:

MySQL [mysql]> update mysql.user set plugin="mysql_native_password",authentication_string=password("test") where user="root";
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '("test") where user="root"' at line 1

A quick look in the MySQL 8.0 release notes told me that the PASSWORD() function is removed in 8.0. Why???? I don't know how one is supposed to generate passwords in MySQL 8.0 that are compatible with old installations of MySQL. One could of course start an old MySQL or MariaDB version, execute the PASSWORD() function there and copy the result.
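
A sketch of that workaround: generate the hash on an old server and paste it into MySQL 8.0 (the hash shown below is, as far as I can tell, what PASSWORD('test') returns; use whatever your own server prints):

-- On an old MySQL or MariaDB server:
SELECT PASSWORD('test');
-- Copy the returned '*...' string, then on MySQL 8.0 (still running with --skip-grant-tables):
update mysql.user set plugin="mysql_native_password",
       authentication_string="*94BDCEBE19083CE2A1F959FD02F964C7AF4CFC29" where user="root";
FLUSH PRIVILEGES;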

I decided to fix this the easy way and use an empty password:

(Update: I later discovered that the right way would have been to use: FLUSH PRIVILEGES; ALTER USER 'root'@'localhost' IDENTIFIED BY 'test'; I however dislike this syntax as it has the password in clear text, which is easy to grab, and the command can't be used to easily update the mysql.user table. One must also disable the --skip-grant-tables mode to use this.)

MySQL [mysql]> update mysql.user set plugin="mysql_native_password",authentication_string="" where user="root";
Query OK, 1 row affected (0.077 sec)
Rows matched: 1 Changed: 1 Warnings: 0
 
I restarted mysqld:
> mysqld --default_authentication_plugin=mysql_native_password

> ./client/mysql --user=root --password="" mysql
ERROR 1862 (HY000): Your password has expired. To log in you must change it using a client that supports expired passwords.

Ouch, forgot that. Let's try again:

> mysqld --skip-grant-tables --default_authentication_plugin=mysql_native_password

> ./client/mysql --user=root --password="" mysql
MySQL [mysql]> update mysql.user set password_expired="N" where user="root";

Now the restart and test worked:

> ./mysqld --default_authentication_plugin=mysql_native_password

> ./client/mysql --user=root --password="" mysql

Finally I had a working account that I could use to create other users!

When looking at mysqld --help --verbose again, I noticed the option:

--initialize-insecure
Create the default database and exit. Create a super user
with empty password.

I decided to check if this would have made things easier:

> rm -r /my/data3/*
> ./mysqld --defaults-file=/tmp/my.cnf --initialize-insecure

2018-04-22T13:18:06.629548Z 5 [Warning] [MY-010453] [Server] root@localhost is created with an empty password ! Please consider switching off the --initialize-insecure option.

Hm. I don't understand the warning, as --initialize-insecure is not an option that one would use more than once and thus not something one would 'switch off'.

> ./mysqld --defaults-file=/tmp/my.cnf

> ./client/mysql --user=root --password="" mysql
ERROR 2059 (HY000): Plugin caching_sha2_password could not be loaded: /usr/local/mysql/lib/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory

Back to the beginning 🙁

To get things to work with old clients, one has to initialize the database with:
> ./mysqld --defaults-file=/tmp/my.cnf --initialize-insecure --default_authentication_plugin=mysql_native_password

Now I finally had MySQL 8.0 up and running and thought I would take it for a spin by running the "standard" MySQL/MariaDB sql-bench test suite. sql-bench was removed in MySQL 5.7, but as I happened to have MariaDB 10.3 installed, I decided to run it from there.

sql-bench is a single-threaded benchmark that measures the "raw" speed of some common operations. It gives you the 'maximum' performance for a single query. It's different from other benchmarks that measure the maximum throughput when you have a lot of users, but sql-bench still tells you a lot about what kind of performance to expect from the database.

I first tried to be clever and create the "test" database, which I needed for sql-bench, with
> mkdir /my/data3/test

but when I tried to run the benchmark, MySQL 8.0 complained that the test database didn’t exist.

MySQL 8.0 has gone away from the original concept of MySQL where the user can easily create directories and copy databases into the database directory. This may have serious implications for anyone backing up databases and/or trying to restore a backup with normal OS commands.
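
The supported way is to go through the server instead. A minimal sketch, assuming mysqladmin sits next to the mysql client in this build tree (a plain CREATE DATABASE statement works too):

> ./client/mysqladmin --socket=/tmp/mysql.sock --user=root create test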

I created the ‘test’ database with mysqladmin and then tried to run sql-bench:

> ./run-all-tests --user=root

The first run failed in test-ATIS:

Can't execute command 'create table class_of_service (class_code char(2) NOT NULL,rank tinyint(2) NOT NULL,class_description char(80) NOT NULL,PRIMARY KEY (class_code))'
Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'rank tinyint(2) NOT NULL,class_description char(80) NOT NULL,PRIMARY KEY (class_' at line 1

This happened because 'rank' is now a reserved word in MySQL 8.0. It is also reserved in ANSI SQL, but I don't know of any other database that has failed to run test-ATIS before. I have in the past run it against Oracle, PostgreSQL, Mimer, MSSQL, etc. without any problems.

MariaDB also has ‘rank’ as a keyword in 10.2 and 10.3 but one can still use it as an identifier.
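
One way to get the statement through MySQL 8.0 is to quote the reserved word as an identifier with backticks (a sketch; I don't claim this is exactly how the test was patched):

create table class_of_service (class_code char(2) NOT NULL,`rank` tinyint(2) NOT NULL,class_description char(80) NOT NULL,PRIMARY KEY (class_code));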

I fixed test-ATIS and then managed to run all tests on MySQL 8.0.

I ran the test with both MySQL 8.0 and MariaDB 10.3 using the InnoDB storage engine and with identical values for all InnoDB variables, table-definition-cache, and table-open-cache. I turned off the performance schema for both databases. All tests were run with a user with an empty password (to keep things comparable and because it was too complex to generate a password in MySQL 8.0).

The results are as follows.
Results per test in seconds:

Operation         |MariaDB|MySQL-8|
-----------------------------------
ATIS              | 153.00| 228.00|
alter-table       |  92.00| 792.00|
big-tables        | 990.00|2079.00|
connect           | 186.00| 227.00|
create            | 575.00|4465.00|
insert            |4552.00|8458.00|
select            | 333.00| 412.00|
table-elimination |1900.00|3916.00|
wisconsin         | 272.00| 590.00|
-----------------------------------

This is of course just a first view of the performance of MySQL 8.0 in a single user environment. Some reflections about the results:

  • The alter-table test is slower (as expected) in 8.0, as some of the alter tests benefit from the instant add column in MariaDB 10.3.
  • The connect test is also better for MariaDB, as we put a lot of effort into speeding this up in MariaDB 10.2.
  • table-elimination shows an optimization in MariaDB for the Anchor table model, which MySQL doesn't have.
  • CREATE and DROP TABLE are almost 8 times slower in MySQL 8.0 than in MariaDB 10.3. I assume this is the cost of 'atomic DDL'. This may also cause performance problems for any thread using the data dictionary when another thread is creating/dropping tables.
  • When looking at the individual test results, MySQL 8.0 was slower in almost every test, in many cases significantly slower.
  • The only test where MySQL was faster was "update_with_key_prefix". I checked this and noticed that there was a bug in the test: the column was updated to its original value (which should be instant with any storage engine). This is an old bug that MySQL has found and fixed and that we had not been aware of in the test or in MariaDB.
  • While writing this, I noticed that MySQL 8.0 now uses utf8mb4 as the default character set instead of latin1. This may affect some of the benchmarks slightly (not much, as most tests work with numbers, and Oracle claims that utf8mb4 is only 20% slower than latin1), but it needs to be verified; see the configuration sketch after this list for one way to pin both servers to the same character set.
  • Oracle claims that MySQL 8.0 is much faster on multi user benchmarks. The above test indicates that they may have done this by sacrificing single user performance.
  • We need to do more, and many different, benchmarks to better understand exactly what is going on. Stay tuned!
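
For reference, a minimal my.cnf sketch (my assumption, not part of the test setup above) that pins both servers to the same character set for a more directly comparable run:

[mysqld]
character_set_server=latin1
collation_server=latin1_swedish_ci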

Short summary of my first run with MySQL 8.0:

  • Using the new caching_sha2_password authentication as the default for new installations is likely to cause a lot of problems for users. No old application will be able to use MySQL 8.0, installed with default options, without moving to MySQL's client libraries. While working on this blog I saw MySQL users complain on IRC that not even MySQL Workbench can authenticate with MySQL 8.0. This is the first time in MySQL's history that such an incompatible change has been made! (A configuration sketch of the workaround follows after this list.)
  • Atomic DDL is a good thing (we plan to have this in MariaDB 10.4), but it should not have such a drastic impact on performance. I am also a bit skeptical of MySQL 8.0 having just one copy of the data dictionary, as if this gets corrupted you will lose all your data (a single point of failure).
  • MySQL 8.0 has several new reserved words and has removed a lot of variables, which makes upgrades hard. Before upgrading to MySQL 8.0, one has to check all of one's databases and applications to ensure that there are no conflicts.
  • As my test above shows, if you have a single deprecated variable in your configuration files, the installation of MySQL will abort and can leave the database in an inconsistent state. I did my tests, of course, by installing into an empty data directory, but one can assume that some of the problems may also happen when upgrading an old installation.
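
The configuration sketch mentioned in the first point above: putting the documented server option in my.cnf keeps mysql_native_password as the default, so old clients can keep connecting.

[mysqld]
default_authentication_plugin=mysql_native_password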

Conclusions:
In many ways, MySQL 8.0 has caught up with some earlier versions of MariaDB. For instance, in MariaDB 10.0, we introduced roles (four years ago). In MariaDB 10.1, we introduced encrypted redo/undo logs (three years ago). In MariaDB 10.2, we introduced window functions and CTEs (a year ago). However, some catch-up of MariaDB Server 10.2 features still remains for MySQL (such as check constraints, binlog compression, and log-based rollback).

MySQL 8.0 has a few interesting new features (mostly Atomic DDL and the JSON TABLE functions), but at the same time MySQL has strayed away from some of the fundamental cornerstone principles of MySQL:

From the start of the first version of MySQL in 1995, all development has been focused around 3 core principles:

  • Ease of use
  • Performance
  • Stability

With MySQL 8.0, Oracle has sacrificed two of these three.

In addition (as part of ease of use), while I was working on MySQL, we did our best to ensure that the following should hold:

  • Upgrades should be trivial
  • Things should be kept compatible, if possible (don’t remove features/options/functions that are used)
  • Minimize reserved words, don’t remove server variables
  • One should be able to use normal OS commands to create and drop databases, and to copy and move tables around within the same system or between different systems. With 8.0 and its data dictionary, taking backups of specific tables will be hard, even if the server is not running.
  • mysqldump should always be usable for backups and for moving to new releases.
  • Old clients and applications should be able to use 'any' MySQL server version unchanged. (Some Oracle client libraries, like the C++ one, by default only support the new X protocol and can thus not be used with older MySQL or any MariaDB version.)

We plan to add a data dictionary to MariaDB 10.4 or MariaDB 10.5, but in a way to not sacrifice any of the above principles!

The competition between MySQL and MariaDB is not just about a tactical arms race on features. It’s about design philosophy, or strategic vision, if you will.

This shows in two main ways: our respective view of the Storage Engine structure, and of the top-level direction of the roadmap.

On the Storage Engine side, MySQL is converging on InnoDB, even for clustering and partitioning. In doing so, they are abandoning the advantages of multiple ways of storing data. By contrast, MariaDB sees lots of value in the Storage Engine architecture: MariaDB Server 10.3 will see the general availability of MyRocks (for write-intensive workloads) and Spider (for scalable workloads). On top of that, we have ColumnStore for analytical workloads. One can use the CONNECT engine to join with other databases. The use of different storage engines for different workloads and different hardware is a competitive differentiator, now more than ever.
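
A minimal illustration (with made-up table names, and assuming the relevant engine plugins are installed) of how the engine is chosen per table in MariaDB:

create table metrics_log (id bigint not null, msg text) engine=MyRocks;                    -- write-intensive log data
create table orders (id bigint not null primary key, total decimal(10,2)) engine=InnoDB;   -- transactional data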

On the roadmap side, MySQL is carefully steering clear of features that close the gap between MySQL and Oracle. MariaDB has no such constraints. With MariaDB 10.3, we are introducing PL/SQL compatibility (Oracle's stored procedures) and AS OF (built-in system-versioned tables with point-in-time querying). For both of those features, MariaDB is the first Open Source database to do so. I don't expect Oracle to provide any of the above features in MySQL!

Also on the roadmap side, MySQL is not working with the ecosystem on extending its functionality. In 2017, MariaDB accepted more code contributions in one year than MySQL has done during its entire lifetime, and the rate is increasing!

I am sure that the experience I had with testing MySQL 8.0 would have been significantly better if MySQL had an open development model where the community could easily participate in developing and testing MySQL continuously. Most of the confusing error messages and strange behavior would have been found and fixed long before the GA release.

Before upgrading to MySQL 8.0, please read https://dev.mysql.com/doc/refman/8.0/en/upgrading-from-previous-series.html to see what problems you can run into! Don't expect that old installations or applications will work out of the box without testing, as a lot of features and options have been removed (the query cache, partitioning of MyISAM tables, etc.)! You probably also have to revise your backup methods, especially if you ever want to restore just a few tables. (With 8.0, I don't know how this can be easily done.)
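
As one sketch of such a pre-upgrade check (run against the old server; what else to check depends entirely on your applications), one can look for query cache usage and partitioned MyISAM tables:

show global variables like 'query_cache%';
select table_schema, table_name from information_schema.tables
  where engine='MyISAM' and create_options like '%partitioned%';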

According to the MySQL 8.0 release notes, one can't use mysqldump to copy a database to MySQL 8.0. One first has to move to a MySQL 5.7 GA version (with mysqldump, as recommended by Oracle) and then to MySQL 8.0 with an in-place upgrade. I assume this means that all old mysqldump backups are useless for MySQL 8.0?

MySQL 8.0 seems to be a one-way street to an unknown future. Up to MySQL 5.7 it has been trivial to move to MariaDB, and one could always move back to MySQL with mysqldump. All MySQL client libraries have worked with MariaDB and all MariaDB client libraries have worked with MySQL. With MySQL 8.0 this has changed in the wrong direction.

As long as you are using MySQL 5.7 and below, you have choices for your future; after MySQL 8.0 you have very little choice. But don't despair, as MariaDB will always be able to load a mysqldump file and it's very easy to upgrade your old MySQL installation to MariaDB 🙂
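
A minimal sketch of such a move with mysqldump (credentials and the file name are placeholders):

> mysqldump --user=root --password --all-databases --routines --events > all_databases.sql
> mysql --user=root --password < all_databases.sql     # run this against the MariaDB server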

I wish you good luck in trying MySQL 8.0 (and also the upcoming MariaDB 10.3)!