Tag Archives: Redundancy

How to Compete with Giants

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/how-to-compete-with-giants/

This post by Backblaze’s CEO and co-founder Gleb Budman is the sixth in a series about entrepreneurship. You can choose posts in the series from the list below:

  1. How Backblaze got Started: The Problem, The Solution, and the Stuff In-Between
  2. Building a Competitive Moat: Turning Challenges Into Advantages
  3. From Idea to Launch: Getting Your First Customers
  4. How to Get Your First 1,000 Customers
  5. Surviving Your First Year
  6. How to Compete with Giants

Perhaps your business is competing in a brand new space free from established competitors. Most of us, though, start companies that compete with existing offerings from large, established companies. You need to come up with a better mousetrap — not the first mousetrap.

That’s the challenge Backblaze faced. In this post, I’d like to share some of the lessons I learned from that experience.

Backblaze vs. Giants

Competing with established companies that are orders of magnitude larger can be daunting. How can you succeed?

I’ll set the stage by offering a few sets of giants we compete with:

  • When we started Backblaze, we offered online backup in a market where companies had been offering “online backup” for at least a decade, and even the newer entrants had raised tens of millions of dollars.
  • When we built our storage servers, the alternatives were EMC, NetApp, and Dell — each of which had a market cap of over $10 billion.
  • When we introduced our cloud storage offering, B2, our direct competitors were Amazon, Google, and Microsoft. You might have heard of them.

What did we learn by competing with these giants on a bootstrapped budget? Let’s take a look.

Determine What Success Means

For a long time Apple considered Apple TV to be a hobby, not a real product worth focusing on, because it did not generate a billion in revenue. For a $10 billion per year revenue company, a new business that generates $50 million won’t move the needle and often isn’t worth putting focus on. However, for a startup, getting to $50 million in revenue can be the start of a wildly successful business.

Lesson Learned: Don’t let the giants set your success metrics.

The Advantages Startups Have

The giants have a lot of advantages: more money, people, scale, resources, access, etc. Following their playbook and attacking head-on means you’re simply outgunned. Common paths to failure are trying to build more features, enter more markets, outspend on marketing, and other similar approaches where scale and resources are the primary determinants of success.

But being a startup affords many advantages most giants would salivate over. As a nimble startup you can leverage those to succeed. Let’s break down nine competitive advantages we’ve used that you can too.

1. Drive Focus

It’s hard to build a $10 billion revenue business doing just one thing, and most giants have a broad portfolio of businesses, with numerous products for each, targeting a variety of customer segments in multiple markets. That adds complexity and distributes management attention.

Startups get the benefit of having everyone in the company be extremely focused, often on a singular mission, product, customer segment, and market. While our competitors sell everything from advertising to Zantac, and are investing in groceries and shipping, Backblaze has focused exclusively on cloud storage. This means all of our best people (i.e. everyone) are focused on our cloud storage business. Where is all of your focus going?

Lesson Learned: Align everyone in your company to a singular focus to dramatically out-perform larger teams.

2. Use Lack-of-Scale as an Advantage

You may have heard Paul Graham say “Do things that don’t scale.” There are a host of things you can do specifically because you don’t have the same scale as the giants. Use that as an advantage.

When we look for data center space, we have more options than our largest competitors because there are simply more spaces available with room for 100 cabinets than for 1,000 cabinets. With some searching, we can find data center space that is better/cheaper.

When a flood in Thailand destroyed factories, causing the world’s supply of hard drives to plummet and prices to triple, we started drive farming. The giants certainly couldn’t. It was a bit crazy, but it let us keep prices unchanged for our customers.

Our Chief Cloud Officer, Tim, used to work at Adobe. Because of their size, any new product needed to always launch in a multitude of languages and in global markets. Once launched, they had scale. But getting any new product launched was incredibly challenging.

Lesson Learned: Use lack-of-scale to exploit opportunities that are closed to giants.

3. Build a Better Product

This one is probably obvious. If you’re going to provide the same product, at the same price, to the same customers — why do it? Remember that better does not always mean more features. Here’s one way we built a better product that didn’t require being a bigger company.

All online backup services required customers to choose what to include in their backup. We found that this was complicated for users since they often didn’t know what needed to be backed up. We flipped the model to back up everything and allow users to exclude if they wanted to, but it was not required. This reduced the number of features/options, while making it easier and better for the user.

This didn’t require the resources of a huge company; it just required understanding customers a bit deeper and thinking about the solution differently. Building a better product is the most classic startup competitive advantage.

Lesson Learned: Dig deep with your customers to understand and deliver a better mousetrap.

4. Provide Better Service

How can you provide better service? Use your advantages. Escalations from your customer care folks to engineering can go through fewer hoops. Fixing an issue and shipping can be quicker. Access to real answers on Twitter or Facebook can be more effective.

A strategic decision we made was to have all customer support people as full-time employees in our headquarters. This ensures they are in close contact with the whole company, so feedback can quickly go both ways.

Having a smaller team and fewer layers enables faster internal communication, which increases customer happiness. And the option to do things that don’t scale — such as help a customer in a unique situation — can go a long way in building customer loyalty.

Lesson Learned: Service your customers better by establishing clear internal communications.

5. Remove The Unnecessary

After determining that the industry standard EMC/NetApp/Dell storage servers would be too expensive to build our own cloud storage upon, we decided to build our own infrastructure. Many said we were crazy to compete with these multi-billion dollar companies and that it would be impossible to build a lower cost storage server. However, not only did it prove to not be impossible — it wasn’t even that hard.

One key trick? Remove the unnecessary. While EMC and others built servers to sell to other companies for a wide variety of use cases, Backblaze needed servers that only Backblaze would run, and for a single use case. As a result we could tailor the servers for our needs by removing redundancy from each server (since we would run redundant servers), and using lower-performance components (since we would get high-performance by running parallel servers).

What do your customers and use cases not need? This can trim costs and complexity while often improving the product for your use case.

Lesson Learned: Don’t think “what can we add” to what the giants offer — think “what can we remove.”

6. Be Easy

How many times have you visited a large company website, particularly one that’s not consumer-focused, only to leave saying, “Huh? I don’t understand what you do.” Keeping your website clear, and your product and pricing simple, will dramatically increase conversion and customer satisfaction. If you’re able to make it 2x easier and thus increase your conversion by 2x, you’ve just allowed yourself to spend ½ as much acquiring a customer.

Providing unlimited data backup wasn’t specifically about providing more storage — it was about making it easier. Since users didn’t know how much data they needed to back up, charging per gigabyte meant they wouldn’t know the cost. Providing unlimited data backup meant they could just relax.

Customers love easy — and being smaller makes easy easier to deliver. Use that as an advantage in your website, marketing materials, pricing, product, and in every other customer interaction.

Lesson Learned: Ease-of-use isn’t a slogan: it’s a competitive advantage. Treat it as seriously as any other feature of your product.

7. Don’t Be Afraid of Risk

Obviously unnecessary risks are unnecessary, and some risks aren’t worth taking. However, large companies that have given guidance to Wall Street with a $0.01 range on their earnings-per-share are inherently going to be very risk-averse. Use risk-tolerance to open up opportunities, and adjust your tolerance level as you scale. In your first year, there are likely an infinite number of ways your business may vaporize; don’t be too worried about taking a risk that might have a 20% downside when the upside is hockey stick growth.

Using consumer-grade hard drives in our servers could have caused pain and suffering for us years down the line, but they were priced at approximately 50% of enterprise drives. Giants wouldn’t have considered the option. Turns out, the consumer drives performed great for us.

Lesson Learned: Use calculated risks as an advantage.

8. Be Open

The larger a company grows, the more it wants to hide information. Some of this is driven by regulatory requirements as a public company. But most of this is cultural. Sharing something might cause a problem, so let’s not. All external communication is treated as a critical press release, with rounds and rounds of editing by multiple teams and approvals. However, customers are often desperate for information. Moreover, sharing information builds trust, understanding, and advocates.

I started blogging at Backblaze before we launched. When we blogged about our Storage Pod and open-sourced the design, many thought we were crazy to share this information. But it was transformative for us, establishing Backblaze as a tech thought leader in storage and giving people a sense of how we were able to provide our service at such a low cost.

Over the years we’ve developed a culture of being open internally and externally, on our blog and with the press, and in communities such as Hacker News and Reddit. Often we’ve been asked, “why would you share that!?” — but it’s the continual openness that builds trust. And that culture of openness is incredibly challenging for the giants.

Lesson Learned: Overshare to build trust and brand where giants won’t.

9. Be Human

As companies scale, typically a smaller percent of founders and executives interact with customers. The people who build the company become more hidden, the language feels “corporate,” and customers start to feel they’re interacting with the cliche “faceless, nameless corporation.” Use your humanity to your advantage. From day one the Backblaze About page listed all the founders, and my email address. While contacting us shouldn’t be the first path for a customer support question, I wanted it to be clear that we stand behind the service we offer; if we’re doing something wrong — I want to know it.

To scale it’s important to have processes and procedures, but sometimes a situation falls outside of a well-established process. While we want our employees to follow processes, they’re still encouraged to be human and “try to do the right thing.” How do you strike this balance? Simon Sinek gives a good talk about it: make your employees feel safe. If employees feel safe they’ll be human.

If your customer is a consumer, they’ll appreciate being treated as a human. Even if your customer is a corporation, the purchasing decision-makers are still people.

Lesson Learned: Being human is the ultimate antithesis to the faceless corporation.

Build Culture to Sustain Your Advantages at Scale

Presumably the goal is not to always be competing with giants, but to one day become a giant. Does this mean you’ll lose all of these advantages? Some, yes — but not all. Some of these advantages are cultural, and if you build these into the culture from the beginning, and fight to keep them as you scale, you can keep them as you become a giant.

Tesla still comes across as human, with Elon Musk frequently interacting with people on Twitter. Apple continues to provide great service through their Genius Bar. And, worst case, if you lose these at scale, you’ll still have the other advantages of being a giant such as money, people, scale, resources, and access.

Of course, some new startup will be gunning for you with grand ambitions, so just be sure not to get complacent. 😉

The post How to Compete with Giants appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

5 years with home NAS/RAID

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/09/5-years-with-home-nasraid.html

I have lots of data-sets (packet-caps, internet-scans), so I need a large RAID system to hold it all. As I described in 2012, I bought a home “NAS” system. I thought I’d give the 5 year perspective.

Reliability. I had two drives fail, which is about what is to be expected. Buying a new drive, swapping it in, and rebuilding the RAID went painlessly, though that’s because I used RAID6 (two drive redundancy). RAID5 (one drive redundancy) is for chumps.

Speed. I’ve been unhappy with the speed, but there’s not much I can do about it. Mechanical drive access times are slow, and I don’t see any way of fixing that.

Cost. It’s been $3000 over 5 years (including the two replacement drives). That comes out to $50/month. Amazon’s “Glacier” service is $108/month. Since we all have the same hardware costs, it’s unlikely that any online cloud storage can do better than doing it yourself.

Moore’s Law. For the same price as I spent 5 years ago, I can now get three times the storage, including faster processors in the NAS box. From that perspective, I’ve only spent $33/month on storage, as the remaining third still has value.
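The two monthly figures above can be reproduced with a few lines of arithmetic. This is only a sketch of one reading of the post: it assumes "the remaining third still has value" means one third of the purchase price is still recoverable, and the function name `monthly_cost` is made up for illustration.

```python
MONTHS = 5 * 12       # five years of ownership
TOTAL_SPENT = 3000    # dollars, including the two replacement drives (from the post)


def monthly_cost(total_dollars, months, residual_fraction=0.0):
    """Average monthly cost after subtracting the share of value that remains."""
    return total_dollars * (1.0 - residual_fraction) / months


raw = monthly_cost(TOTAL_SPENT, MONTHS)               # 3000 / 60
adjusted = monthly_cost(TOTAL_SPENT, MONTHS, 1 / 3)   # assumes 1/3 of the value remains
print(f"raw: ${raw:.2f}/month, adjusted: ${adjusted:.2f}/month")
# prints: raw: $50.00/month, adjusted: $33.33/month
```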

Ease-of-use. The reason to go with a NAS is ease-of-use, so I don’t have to mess with it. Yes, I’m a Linux sysadmin, but I have more than enough Linux boxen needing my attention. The NAS has been extremely easy to use, even dealing with the two disk failures.

Battery backup. The cheap $50 CyberPower UPS I bought never worked well and completely failed recently, so I’ve ordered a $150 APC unit to replace it.

Vendor. I chose Synology, and have no reason to complain. Of course they’ve had security vulnerabilities, but then, so have all their competition.

DLNA. This is a standard for streaming music among home devices. It never worked well. I suspect partly it’s Synology’s fault that they can’t transcode well. I suspect it’s also the apps I tried on the iPad which have obvious problems. I end up streaming to the iPad by simply using the SMB protocol to serve files rather than a video protocol.

Consumer vs. enterprise drives. I chose consumer rather than enterprise drives. I think this is always the best choice (RAID means inexpensive drives). But very smart people with experience in recovering data disagree with me.

If you are in the market. If you are building your own NAS, get a 4 or 5 bay device and RAID6. Two-drive redundancy is really important.
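To see what the RAID6 recommendation costs in capacity, here is a small sketch comparing usable space for 4- and 5-bay boxes. It relies only on the standard rule that RAID5 spends one drive's worth of space on parity and RAID6 spends two; the 4 TB drive size and the function name `raid_usable_tb` are assumptions for illustration, not figures from the post.

```python
# Drives' worth of capacity consumed by parity, which is also the number
# of simultaneous drive failures the array survives.
PARITY_DRIVES = {"RAID5": 1, "RAID6": 2}


def raid_usable_tb(level, bays, drive_tb):
    """Usable capacity in TB for an array of identical drives."""
    parity = PARITY_DRIVES[level]
    if bays < parity + 2:  # RAID5 needs at least 3 drives, RAID6 at least 4
        raise ValueError(f"{level} needs at least {parity + 2} drives")
    return (bays - parity) * drive_tb


DRIVE_TB = 4  # hypothetical drive size
for bays in (4, 5):
    for level in ("RAID5", "RAID6"):
        usable = raid_usable_tb(level, bays, DRIVE_TB)
        print(f"{bays} bays, {level}: {usable} TB usable, "
              f"survives {PARITY_DRIVES[level]} failed drive(s)")
```

With these assumed numbers, moving from RAID5 to RAID6 gives up one drive of capacity in exchange for surviving a second failure, for example one that occurs during a rebuild.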

Cloud Storage Doesn’t have to be Convoluted, Complex, or Confusing

Post Syndicated from Ahin Thomas original https://www.backblaze.com/blog/cloud-storage-pricing-comparison/

So why do many vendors make it so hard to get information about how much you’re storing and how much you’re being charged?

Cloud storage is fast becoming the central repository for mission critical information, irreplaceable memories, and in some cases entire corporate and personal histories. Given this responsibility, we believe cloud storage vendors have an obligation to be as transparent as possible in how they interact with their customers.

In that light we decided to challenge four cloud storage vendors and ask two simple questions:

  1. Can a customer understand how much data is stored?
  2. Can a customer understand the bill?

The detailed results are below, but if you wish to skip the details and the screen captures (TL;DR), we’ve summarized the results in the table below.

Summary of Cloud Storage Pricing Test

Our challenge was to upload 1 terabyte of data, store it for one month, and then download it.

Vendor | Visibility to Data Stored | Easy to Understand Bill | Cost
--- | --- | --- | ---
Backblaze B2 | Accurate, intuitive display of storage information. | Available on demand, and the site clearly defines what has and will be charged for. | $25
Microsoft Azure | Storage is being measured in KiB, but is billed by the GB. Even with a calculator, it is unclear how much storage we are using. | Available, but difficult to find. The nearly 30 day lag in billing creates business and accounting challenges. | $72
Amazon S3 | Incomplete. From the file browsing user interface, there is no reasonable way to understand how much data is being stored. | Available on demand. While there are some line items that seem unnecessary for our test, the bill is generally straight-forward to understand. | $71
Google Cloud Storage | Incomplete. From the file browsing user interface, there is no reasonable way to understand how much data is being stored. | Available, but provides descriptions in units that are not on the pricing table nor commonly used. | $100

Cloud Storage Test Details

For our tests, we chose Backblaze B2, Microsoft’s Azure, Amazon’s S3, and Google Cloud Storage. Our idea was simple: Upload 1 TB of data to the comparable service for each vendor, store it for 1 month, download that 1 TB, then document and share the results.

Let’s start with the most obvious observation, the cost charged by each vendor for the test:

Vendor | Cost
--- | ---
Backblaze B2 | $25
Microsoft Azure | $72
Amazon S3 | $71
Google Cloud Storage | $100

Later in this post, we’ll see if we can determine the different cost components (storage, downloading, transactions, etc.) for each vendor, but our first step is to see if we can determine how much data we stored. In some cases, the answer is not as obvious as it would seem.

Test 1: Can a Customer Understand How Much Data Is Stored?

At the core, a provider of a service ought to be able to tell a customer how much of the service he or she is using. In this case, one might assume that providers of Cloud Storage would be able to tell customers how much data is being stored at any given moment. It turns out, it’s not that simple.

Backblaze B2
Logging into a Backblaze B2 account, one is presented with a summary screen that displays all “buckets.” Each bucket displays key summary information, including data currently stored.

B2 Cloud Storage Buckets screenshot

Clicking into a given bucket, one can browse individual files. Each file displays its size, and multiple files can be selected to create a size summary.

B2 file tree screenshot

Summary: Accurate, intuitive display of storage information.

Microsoft Azure

Moving on to Microsoft’s Azure, things get a little more “exciting.” There was no area that we could find where one can determine the total amount of data, in GB, stored with Azure.

There’s an area entitled “usage,” but that wasn’t helpful.

Microsoft Azure cloud storage screenshot

We then moved on to “Overview,” but had a couple of challenges. The first issue was that we were presented with KiB (kibibyte) as a unit of measure. One GB (the unit of measure used in Azure’s pricing table) equates to roughly 976,563 KiB. It struck us as odd that things would be summarized by a unit of measure different from the billing unit of measure.

Microsoft Azure usage dashboard screenshot

Summary: Storage is being measured in KiB, but is billed by the GB. Even with a calculator, it is unclear how much storage we are using.
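The KiB-to-GB conversion behind the 976,563 figure is simple to script. The sketch below uses only the standard definitions (1 KiB = 1,024 bytes, 1 GB = 1,000,000,000 bytes); the function names are illustrative and are not part of any vendor tooling.

```python
KIB = 1024      # bytes in one kibibyte
GB = 10**9      # bytes in one (decimal) gigabyte


def kib_to_gb(kib):
    """Convert a dashboard reading in KiB to the GB unit used in a pricing table."""
    return kib * KIB / GB


def gb_to_kib(gb):
    """Convert GB to KiB, the direction quoted in the text."""
    return gb * GB / KIB


print(gb_to_kib(1))            # 976562.5, which rounds to the 976,563 quoted above
print(kib_to_gb(976_562_500))  # 1000.0, i.e. the 1 TB test upload expressed in GB
```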

Amazon S3

Next we checked on the data we were storing in S3. We again ran into problems.

In the bucket overview, we were able to identify our buckets. However, we could not tell how much data was being stored.

Amazon S3 cloud storage buckets screenshot

Drilling into a bucket, the detail view does tell us file size. However, there was no method for summarizing the data stored within that bucket or for multiple files.

Amazon S3 cloud storage buckets usage screenshot

Summary: Incomplete. From the file browsing user interface, there is no reasonable way to understand how much data is being stored.

Google Cloud Storage (“GCS”)

GCS proved to have its own quirks, as well.

One can easily find the “bucket” summary; however, it does not provide information on data stored.

Google Cloud Storage Bucket screenshot

Clicking into the bucket, one can see files and the size of an individual file. However, there is no way to see the total amount of data stored.

Google Cloud Storage bucket files screenshot

Summary: Incomplete. From the file browsing user interface, there is no reasonable way to understand how much data is being stored.

Test 1 Conclusions

We knew how much storage we were uploading and, in many cases, the user will have some sense of the amount of data they are uploading. However, it strikes us as odd that many vendors won’t tell you how much data you have stored. Even stranger are the vendors that provide reporting in a unit of measure that is different from the units in their pricing table.

Test 2: Can a Customer Understand The Bill?

The cloud storage industry has done itself no favors with its tiered pricing that requires a calculator to figure out what’s going on. Setting that aside for a moment, one would presume that bills would be created in clear, auditable ways.

Backblaze

Inside of the Backblaze user interface, one finds a navigation link entitled “Billing.” Clicking on that, the user is presented with line items for previous bills, payments, and an estimate for the upcoming charges.

Backblaze B2 billing screenshot

One can expand any given row to see the line item transactions composing each bill.

Backblaze B2 billing details screenshot

Summary: Available on demand, and the site clearly defines what has and will be charged for.

Azure

Trying to understand the Azure billing proved to be a bit tricky.

On August 6th, we logged into the billing console and were presented with this screen.

Microsoft Azure billing screenshot

As you can see, on Aug 6th, billing for the period of May-June was not available for download. For the period ending June 26th, we were charged nearly a month later, on July 24th. Clicking into that row item does display line item information.

Microsoft Azure cloud storage billing details screenshot

Summary: Available, but difficult to find. The nearly 30 day lag in billing creates business and accounting challenges.

Amazon S3

Amazon presents a clean billing summary and enables users to “drill down” into line items.

Going to the billing area of AWS, one can survey various monthly bills and is presented with a clean summary of billing charges.

AWS billing screenshot

Expanding into the billing detail, Amazon articulates each line item charge. Within each line item, charges are broken out into sub-line items for the different tiers of pricing.

AWS billing details screenshot

Summary: Available on demand. While there are some line items that seem unnecessary for our test, the bill is generally straight-forward to understand.

Google Cloud Storage (“GCS”)

This was an area where the GCS User Interface, which was otherwise relatively intuitive, became confusing.

Going to the Billing Overview page did not offer much in the way of an overview on charges.

Google Cloud Storage billing screenshot

Moving down to the “Transactions” section, however, did provide line item detail on all the charges incurred. But, similar to Azure introducing the concept of KiB, Google introduces the equally confusing gibibyte (GiB). While all of Google’s pricing tables are listed in terms of GB, the line items reference GiB. 1 GiB is about 1.07374 GB.

Google Cloud Storage billing details screenshot

Summary: Available, but provides descriptions in units that are not on the pricing table nor commonly used.
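To reconcile a line item quoted in GiB against a pricing table quoted in GB, the same kind of unit arithmetic applies. This sketch only converts units (1 GiB = 2^30 bytes, 1 GB = 10^9 bytes) and makes no claim about how any vendor actually computes its charges; the function names are illustrative.

```python
GIB = 2**30     # bytes in one gibibyte
GB = 10**9      # bytes in one (decimal) gigabyte


def gib_to_gb(gib):
    """Convert a line-item quantity in GiB to GB."""
    return gib * GIB / GB


def gb_to_gib(gb):
    """Convert GB to GiB."""
    return gb * GB / GIB


print(f"{gib_to_gb(1):.5f}")     # 1.07374, the factor quoted in the text
print(f"{gb_to_gib(1000):.2f}")  # 931.32, how a 1 TB (1,000 GB) upload reads in GiB
```

The roughly 7% gap between the two units is why a 1 TB upload shows up as about 931 GiB.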

Test 2 Conclusions

Clearly, some vendors do a better job than others in making their pricing available and understandable. From a transparency standpoint, it’s difficult to justify why a vendor would have their pricing table in units of X, but then put units of Y in the user interface.

Transparency: The Backblaze Way

Transparency isn’t easy. At Backblaze, we believe in investing time and energy into presenting the most intuitive user interfaces that we can create. We take pride in our heritage in the consumer backup space — servicing consumers has taught us how to make things understandable and usable. We do our best to apply those lessons to everything we do.

This philosophy reflects our desire to make our products usable, but it’s also part of a larger ethos of being transparent with our customers. We are being trusted with precious data. We want to repay that trust with, among other things, transparency.

It’s that spirit that was behind the decision to publish our hard drive performance stats, to open source the infrastructure that is behind us having the lowest cost of storage in the industry, and also to open source our erasure coding (the math that drives a significant portion of our redundancy for your data).

Why? We believe it’s not just about good user interface, it’s about the relationship we want to build with our customers.

The post Cloud Storage Doesn’t have to be Convoluted, Complex, or Confusing appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How to Configure an LDAPS Endpoint for Simple AD

Post Syndicated from Cameron Worrell original https://aws.amazon.com/blogs/security/how-to-configure-an-ldaps-endpoint-for-simple-ad/

Simple AD, which is powered by Samba 4, supports basic Active Directory (AD) authentication features such as users, groups, and the ability to join domains. Simple AD also includes an integrated Lightweight Directory Access Protocol (LDAP) server. LDAP is a standard application protocol for the access and management of directory information. You can use the BIND operation from Simple AD to authenticate LDAP client sessions. This makes LDAP a common choice for centralized authentication and authorization for services such as Secure Shell (SSH), client-based virtual private networks (VPNs), and many other applications. Authentication, the process of confirming the identity of a principal, typically involves the transmission of highly sensitive information such as user names and passwords. To protect this information in transit over untrusted networks, companies often require encryption as part of their information security strategy.

In this blog post, we show you how to configure an LDAPS (LDAP over SSL/TLS) encrypted endpoint for Simple AD so that you can extend Simple AD over untrusted networks. Our solution uses Elastic Load Balancing (ELB) to send decrypted LDAP traffic to HAProxy running on Amazon EC2, which then sends the traffic to Simple AD. ELB offers integrated certificate management, SSL/TLS termination, and the ability to use a scalable EC2 backend to process decrypted traffic. ELB also tightly integrates with Amazon Route 53, enabling you to use a custom domain for the LDAPS endpoint. The solution needs the intermediate HAProxy layer because ELB can direct traffic only to EC2 instances. To simplify testing and deployment, we have provided an AWS CloudFormation template to provision the ELB and HAProxy layers.

This post assumes that you have an understanding of concepts such as Amazon Virtual Private Cloud (VPC) and its components, including subnets, routing, Internet and network address translation (NAT) gateways, DNS, and security groups. You should also be familiar with launching EC2 instances and logging in to them with SSH. If needed, you should familiarize yourself with these concepts and review the solution overview and prerequisites in the next section before proceeding with the deployment.

Note: This solution is intended for use by clients requiring an LDAPS endpoint only. If your requirements extend beyond this, you should consider accessing the Simple AD servers directly or by using AWS Directory Service for Microsoft AD.

Solution overview

The following diagram and description illustrate and explain the Simple AD LDAPS environment. The CloudFormation template creates the items designated by the bracket (internal ELB load balancer and two HAProxy nodes configured in an Auto Scaling group).

Diagram of the Simple AD LDAPS environment

Here is how the solution works, as shown in the preceding numbered diagram:

  1. The LDAP client sends an LDAPS request to ELB on TCP port 636.
  2. ELB terminates the SSL/TLS session and decrypts the traffic using a certificate. ELB sends the decrypted LDAP traffic to the EC2 instances running HAProxy on TCP port 389.
  3. The HAProxy servers, which run in a fixed Auto Scaling group configuration, forward the LDAP request to the Simple AD servers listening on TCP port 389.
  4. The Simple AD servers send an LDAP response through the HAProxy layer to ELB. ELB encrypts the response and sends it to the client.

Note: Amazon VPC prevents a third party from intercepting traffic within the VPC. Because of this, the VPC protects the decrypted traffic between ELB and HAProxy and between HAProxy and Simple AD. The ELB encryption provides an additional layer of security for client connections and protects traffic coming from hosts outside the VPC.

Prerequisites

  1. Our approach requires an Amazon VPC with two public and two private subnets. The previous diagram illustrates the environment’s VPC requirements. If you do not yet have these components in place, follow these guidelines for setting up a sample environment:
    1. Identify a region that supports Simple AD, ELB, and NAT gateways. The NAT gateways are used with an Internet gateway to allow the HAProxy instances to access the internet to perform their required configuration. You also need to identify the two Availability Zones in that region for use by Simple AD. You will supply these Availability Zones as parameters to the CloudFormation template later in this process.
    2. Create or choose an Amazon VPC in the region you chose. In order to use Route 53 to resolve the LDAPS endpoint, make sure you enable DNS support within your VPC. Create an Internet gateway and attach it to the VPC, which will be used by the NAT gateways to access the internet.
    3. Create a route table with a default route to the Internet gateway. Create two NAT gateways, one per Availability Zone in your public subnets to provide additional resiliency across the Availability Zones. Together, the routing table, the NAT gateways, and the Internet gateway enable the HAProxy instances to access the internet.
    4. Create two private routing tables, one per Availability Zone. Create two private subnets, one per Availability Zone. The dual routing tables and subnets allow for a higher level of redundancy. Add each subnet to the routing table in the same Availability Zone. Add a default route in each routing table to the NAT gateway in the same Availability Zone. The Simple AD servers use subnets that you create.
    5. The LDAP service requires a DNS domain that resolves within your VPC and from your LDAP clients. If you do not have an existing DNS domain, follow the steps to create a private hosted zone and associate it with your VPC. To avoid encryption protocol errors, you must ensure that the DNS domain name is consistent across your Route 53 zone and in the SSL/TLS certificate (see Step 2 in the “Solution deployment” section).
  2. Make sure you have completed the Simple AD Prerequisites.
  3. We will use a self-signed certificate for ELB to perform SSL/TLS decryption. Alternatively, you can use a certificate issued by your preferred certificate authority or a certificate issued by AWS Certificate Manager (ACM).
    Note: To prevent unauthorized connections directly to your Simple AD servers, you can modify the Simple AD security group on port 389 to block traffic from locations outside of the Simple AD VPC. You can find the security group in the EC2 console by creating a search filter for your Simple AD directory ID. It is also important to allow the Simple AD servers to communicate with each other as shown on Simple AD Prerequisites.
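The subnet layout described in the prerequisites can be sketched with Python's standard ipaddress module. This is only an illustration under stated assumptions: the VPC CIDR (10.0.0.0/16), the /24 subnet size, and the Availability Zone names are hypothetical examples, and the function plan_subnets is not part of the solution's CloudFormation template.

```python
import ipaddress
from itertools import islice


def plan_subnets(vpc_cidr, zones=("us-east-1a", "us-east-1c"), new_prefix=24):
    """Carve one public and one private subnet per Availability Zone.

    The zone names and prefix length are illustrative assumptions.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    needed = 2 * len(zones)
    # Take the first equally sized blocks of the requested prefix length.
    blocks = list(islice(vpc.subnets(new_prefix=new_prefix), needed))
    if len(blocks) < needed:
        raise ValueError("VPC CIDR is too small for the requested subnets")
    public = dict(zip(zones, blocks[: len(zones)]))
    private = dict(zip(zones, blocks[len(zones):]))
    return {"public": public, "private": private}


plan = plan_subnets("10.0.0.0/16")
for tier, subnets in plan.items():
    for zone, cidr in subnets.items():
        print(f"{tier:8} {zone}  {cidr}")
```

Each private subnet would then get a default route to the NAT gateway in the public subnet of the same Availability Zone, as described in step 4 above.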

Solution deployment

This solution includes five main parts:

  1. Create a Simple AD directory.
  2. Create a certificate.
  3. Create the ELB and HAProxy layers by using the supplied CloudFormation template.
  4. Create a Route 53 record.
  5. Test LDAPS access using an Amazon Linux client.

1. Create a Simple AD directory

With the prerequisites completed, you will create a Simple AD directory in your private VPC subnets:

  1. In the Directory Service console navigation pane, choose Directories and then choose Set up directory.
  2. Choose Simple AD.
    Screenshot of choosing "Simple AD"
  3. Provide the following information:
    • Directory DNS – The fully qualified domain name (FQDN) of the directory, such as corp.example.com. You will use the FQDN as part of the testing procedure.
    • NetBIOS name – The short name for the directory, such as CORP.
    • Administrator password – The password for the directory administrator. The directory creation process creates an administrator account with the user name Administrator and this password. Do not lose this password because it is nonrecoverable. You also need this password for testing LDAPS access in a later step.
    • Description – An optional description for the directory.
    • Directory Size – The size of the directory.
      Screenshot of the directory details to provide
  4. Provide the following information in the VPC Details section, and then choose Next Step:
    • VPC – Specify the VPC in which to install the directory.
    • Subnets – Choose two private subnets for the directory servers. The two subnets must be in different Availability Zones. Make a note of the VPC and subnet IDs for use as CloudFormation input parameters. In the following example, the Availability Zones are us-east-1a and us-east-1c.
      Screenshot of the VPC details to provide
  5. Review the directory information and make any necessary changes. When the information is correct, choose Create Simple AD.

It takes several minutes to create the directory. From the AWS Directory Service console, refresh the screen periodically and wait until the directory Status value changes to Active before continuing. Choose your Simple AD directory and note the two IP addresses in the DNS address section. You will enter them when you run the CloudFormation template later.

Note: Full administration of your Simple AD implementation is out of scope for this blog post. See the documentation to add users, groups, or instances to your directory. Also see the previous blog post, How to Manage Identities in Simple AD Directories.

2. Create a certificate

In the previous step, you created the Simple AD directory. Next, you will generate a self-signed SSL/TLS certificate using OpenSSL. You will use the certificate with ELB to secure the LDAPS endpoint. OpenSSL is a standard, open source library that supports a wide range of cryptographic functions, including the creation and signing of x509 certificates. You then import the certificate into ACM, which is integrated with ELB.

  1. You must have a system with OpenSSL installed to complete this step. If you do not have OpenSSL, you can install it on Amazon Linux by running the command, sudo yum install openssl. If you do not have access to an Amazon Linux instance you can create one with SSH access enabled to proceed with this step. Run the command, openssl version, at the command line to see if you already have OpenSSL installed.
    $ openssl version
    OpenSSL 1.0.1k-fips 8 Jan 2015

  2. Create a private key using the openssl genrsa command.
    $ openssl genrsa 2048 > privatekey.pem
    Generating RSA private key, 2048 bit long modulus
    ......................................................................................................................................................................+++
    ..........................+++
    e is 65537 (0x10001)

  3. Generate a certificate signing request (CSR) using the openssl req command. Provide the requested information for each field. The Common Name is the FQDN for your LDAPS endpoint (for example, ldap.corp.example.com). The Common Name must use the domain name you will later register in Route 53. You will encounter certificate errors if the names do not match.
    $ openssl req -new -key privatekey.pem -out server.csr
    You are about to be asked to enter information that will be incorporated into your certificate request.

  4. Use the openssl x509 command to sign the certificate. The following example uses the private key from the previous step (privatekey.pem) and the signing request (server.csr) to create a public certificate named server.crt that is valid for 365 days. This certificate must be updated within 365 days to avoid disruption of LDAPS functionality.
    $ openssl x509 -req -sha256 -days 365 -in server.csr -signkey privatekey.pem -out server.crt
    Signature ok
    subject=/C=XX/L=Default City/O=Default Company Ltd/CN=ldap.corp.example.com
    Getting Private key

  5. You should see three files: privatekey.pem, server.crt, and server.csr.
    $ ls
    privatekey.pem server.crt server.csr

    Restrict access to the private key.

    $ chmod 600 privatekey.pem

    Keep the private key and public certificate for later use. You can discard the signing request because you are using a self-signed certificate and not using a Certificate Authority. Always store the private key in a secure location and avoid adding it to your source code.

  6. In the ACM console, choose Import a certificate.
  7. Using your favorite Linux text editor, paste the contents of your server.crt file in the Certificate body box.
  8. Using your favorite Linux text editor, paste the contents of your privatekey.pem file in the Certificate private key box. For a self-signed certificate, you can leave the Certificate chain box blank.
  9. Choose Review and import. Confirm the information and choose Import.
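Two points from the certificate steps lend themselves to a quick check: the Common Name must match the name you will later register in Route 53, and the certificate must be replaced before its 365 days run out. The following is a minimal sketch of both checks. The helper names (expected_common_name, names_match, renewal_deadline) and the issue date are hypothetical, and the sketch does not read the certificate file itself.

```python
from datetime import date, timedelta


def expected_common_name(record_label, hosted_zone):
    """Build the FQDN that the Route 53 record will resolve, e.g. ldap.corp.example.com."""
    return f"{record_label}.{hosted_zone.rstrip('.')}".lower()


def names_match(common_name, record_label, hosted_zone):
    """Return True if the certificate Common Name equals the planned DNS name."""
    return common_name.rstrip(".").lower() == expected_common_name(record_label, hosted_zone)


def renewal_deadline(issued_on, valid_days=365):
    """Date by which the certificate must be replaced (issue date plus validity)."""
    return issued_on + timedelta(days=valid_days)


print(names_match("ldap.corp.example.com", "ldap", "corp.example.com"))
print(renewal_deadline(date(2017, 8, 1)))  # assumed issue date, for illustration only
```

Running a check like this before importing the certificate can catch a name mismatch early, which is cheaper than debugging certificate errors from the LDAP client later.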

3. Create the ELB and HAProxy layers by using the supplied CloudFormation template

Now that you have created your Simple AD directory and SSL/TLS certificate, you are ready to use the CloudFormation template to create the ELB and HAProxy layers.

  1. Load the supplied CloudFormation template to deploy an internal ELB and two HAProxy EC2 instances into a fixed Auto Scaling group. After you load the template, provide the following input parameters. Note: You can find the parameters relating to your Simple AD from the directory details page by choosing your Simple AD in the Directory Service console.
    • HAProxyInstanceSize – The EC2 instance size for HAProxy servers. The default size is t2.micro and can scale up for large Simple AD environments.
    • MyKeyPair – The SSH key pair for EC2 instances. If you do not have an existing key pair, you must create one.
    • VPCId – The target VPC for this solution. Must be in the VPC where you deployed Simple AD and is available in your Simple AD directory details page.
    • SubnetId1 – The Simple AD primary subnet. This information is available in your Simple AD directory details page.
    • SubnetId2 – The Simple AD secondary subnet. This information is available in your Simple AD directory details page.
    • MyTrustedNetwork – Trusted network Classless Inter-Domain Routing (CIDR) to allow connections to the LDAPS endpoint. For example, use the VPC CIDR to allow clients in the VPC to connect.
    • SimpleADPriIP – The primary Simple AD Server IP. This information is available in your Simple AD directory details page.
    • SimpleADSecIP – The secondary Simple AD Server IP. This information is available in your Simple AD directory details page.
    • LDAPSCertificateARN – The Amazon Resource Name (ARN) for the SSL certificate. This information is available in the ACM console.
  2. Enter the input parameters and choose Next.
  3. On the Options page, accept the defaults and choose Next.
  4. On the Review page, confirm the details and choose Create. The stack will be created in approximately 5 minutes.
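If you prefer to prepare the stack inputs in a script rather than typing them into the console, the parameters can be collected and sanity-checked first. The sketch below is an assumption-laden illustration: the parameter names come from the list above, but every value in example_values is a made-up placeholder, and build_parameters is a hypothetical helper. It only produces the list of ParameterKey/ParameterValue pairs that CloudFormation accepts for stack parameters; it does not call AWS.

```python
import ipaddress

# Parameters without a default in the list above.
REQUIRED = [
    "MyKeyPair", "VPCId", "SubnetId1", "SubnetId2",
    "MyTrustedNetwork", "SimpleADPriIP", "SimpleADSecIP",
    "LDAPSCertificateARN",
]


def build_parameters(values):
    """Validate the inputs and return them as ParameterKey/ParameterValue pairs."""
    params = {"HAProxyInstanceSize": "t2.micro"}  # default size named in the post
    params.update(values)
    missing = [name for name in REQUIRED if not params.get(name)]
    if missing:
        raise ValueError("missing parameters: " + ", ".join(missing))
    # Both calls raise ValueError on malformed input.
    ipaddress.ip_network(params["MyTrustedNetwork"])
    for key in ("SimpleADPriIP", "SimpleADSecIP"):
        ipaddress.ip_address(params[key])
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in sorted(params.items())]


# Hypothetical placeholder values, not real resource IDs.
example_values = {
    "MyKeyPair": "my-key-pair",
    "VPCId": "vpc-EXAMPLE",
    "SubnetId1": "subnet-EXAMPLE1",
    "SubnetId2": "subnet-EXAMPLE2",
    "MyTrustedNetwork": "10.0.0.0/16",
    "SimpleADPriIP": "10.0.2.10",
    "SimpleADSecIP": "10.0.3.10",
    "LDAPSCertificateARN": "arn:aws:acm:us-east-1:111122223333:certificate/EXAMPLE",
}

for item in build_parameters(example_values):
    print(item["ParameterKey"], "=", item["ParameterValue"])
```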

4. Create a Route 53 record

The next step is to create a Route 53 record in your private hosted zone so that clients can resolve your LDAPS endpoint.

  1. If you do not have an existing DNS domain for use with LDAP, create a private hosted zone and associate it with your VPC. The hosted zone name should be consistent with your Simple AD (for example, corp.example.com).
  2. When the CloudFormation stack is in CREATE_COMPLETE status, locate the value of the LDAPSURL on the Outputs tab of the stack. Copy this value for use in the next step.
  3. On the Route 53 console, choose Hosted Zones and then choose the zone you used for the Common Name box for your self-signed certificate. Choose Create Record Set and enter the following information:
    1. Name – The label of the record (such as ldap).
    2. Type – Leave as A – IPv4 address.
    3. Alias – Choose Yes.
    4. Alias Target – Paste the value of the LDAPSURL on the Outputs tab of the stack.
  4. Leave the defaults for Routing Policy and Evaluate Target Health, and choose Create.
    Screenshot of finishing the creation of the Route 53 record

5. Test LDAPS access using an Amazon Linux client

At this point, you have configured your LDAPS endpoint and now you can test it from an Amazon Linux client.

  1. Create an Amazon Linux instance with SSH access enabled to test the solution. Launch the instance into one of the public subnets in your VPC. Make sure the IP assigned to the instance is in the trusted IP range you specified in the CloudFormation parameter MyTrustedNetwork in Step 3.b.
  2. SSH into the instance and complete the following steps to verify access.
    1. Install the openldap-clients package and any required dependencies:
      sudo yum install -y openldap-clients
    2. Add the server.crt file to the /etc/openldap/certs/ directory so that the LDAPS client will trust your SSL/TLS certificate. You can copy the file using Secure Copy (SCP) or create it using a text editor.
    3. Edit the /etc/openldap/ldap.conf file and define the environment variables BASE, URI, and TLS_CACERT.
      • The value for BASE should match the configuration of the Simple AD directory name.
      • The value for URI should match your DNS alias.
      • The value for TLS_CACERT is the path to your public certificate.

Here is an example of the contents of the file.

BASE dc=corp,dc=example,dc=com
URI ldaps://ldap.corp.example.com
TLS_CACERT /etc/openldap/certs/server.crt
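The three values in ldap.conf follow mechanically from the directory FQDN and the DNS alias, so they can be derived rather than typed by hand. This is a small sketch of that derivation. The function names base_dn and ldap_conf are hypothetical, and the default record label (ldap) and certificate path are taken from the examples in this post.

```python
def base_dn(directory_fqdn):
    """Turn corp.example.com into dc=corp,dc=example,dc=com."""
    labels = directory_fqdn.strip(".").split(".")
    return ",".join(f"dc={label}" for label in labels)


def ldap_conf(directory_fqdn, record_label="ldap",
              cert_path="/etc/openldap/certs/server.crt"):
    """Render the BASE, URI, and TLS_CACERT lines for /etc/openldap/ldap.conf."""
    fqdn = directory_fqdn.strip(".")
    return "\n".join([
        f"BASE {base_dn(fqdn)}",
        f"URI ldaps://{record_label}.{fqdn}",
        f"TLS_CACERT {cert_path}",
    ])


print(ldap_conf("corp.example.com"))
```

For corp.example.com, the output matches the example file contents shown above.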

To test the solution, query the directory through the LDAPS endpoint, as shown in the following command. Replace corp.example.com with your domain name and use the Administrator password that you configured with the Simple AD directory.

$ ldapsearch -D "Administrator@corp.example.com" -W sAMAccountName=Administrator

You should see a response similar to the following response, which provides the directory information in LDAP Data Interchange Format (LDIF) for the administrator distinguished name (DN) from your Simple AD LDAP server.

# extended LDIF
#
# LDAPv3
# base <dc=corp,dc=example,dc=com> (default) with scope subtree
# filter: sAMAccountName=Administrator
# requesting: ALL
#

# Administrator, Users, corp.example.com
dn: CN=Administrator,CN=Users,DC=corp,DC=example,DC=com
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: user
description: Built-in account for administering the computer/domain
instanceType: 4
whenCreated: 20170721123204.0Z
uSNCreated: 3223
name: Administrator
objectGUID:: l3h0HIiKO0a/ShL4yVK/vw==
userAccountControl: 512
…

You can now use the LDAPS endpoint for directory operations and authentication within your environment. If you would like to learn more about how to interact with your LDAPS endpoint within a Linux environment, here are a few resources to get started:

Troubleshooting

If you receive an error such as the following error when issuing the ldapsearch command, there are a few things you can do to help identify issues.

ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
  • You might be able to obtain additional error details by adding the -d1 debug flag to the ldapsearch command in the previous section.
    $ ldapsearch -D "Administrator@corp.example.com" -W sAMAccountName=Administrator -d1

  • Verify that the parameters in ldap.conf match your configured LDAPS URI endpoint and that all parameters can be resolved by DNS. You can use the following dig command, substituting your configured endpoint DNS name.
    $ dig ldap.corp.example.com

  • Confirm that the client instance from which you are connecting is in the CIDR range of the CloudFormation parameter, MyTrustedNetwork.
  • Confirm that the path to your public SSL/TLS certificate, configured in ldap.conf as TLS_CACERT, is correct. You configured this in Step 5.b.3. You can check your SSL/TLS connection with the following command, substituting your configured endpoint DNS name for the string after -connect.
    $ echo -n | openssl s_client -connect ldap.corp.example.com:636

  • Verify that your HAProxy instances have the status InService in the EC2 console: Choose Load Balancers under Load Balancing in the navigation pane, highlight your LDAPS load balancer, and then choose the Instances tab.
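Two of the checks above, whether the client sits inside the trusted CIDR and whether the LDAPS port is reachable at all, can also be scripted. The following is a rough sketch using only the Python standard library. The function names are hypothetical, the addresses are example values, and can_connect only tests that a TCP connection to port 636 opens; it does not validate the certificate.

```python
import ipaddress
import socket


def client_is_trusted(client_ip, trusted_cidr):
    """Return True if client_ip falls inside the MyTrustedNetwork CIDR."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(trusted_cidr)


def can_connect(host, port=636, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refusals, and timeouts
        return False


# Example values only; substitute your own client IP and CIDR.
print(client_is_trusted("10.0.1.25", "10.0.0.0/16"))
print(client_is_trusted("192.168.5.4", "10.0.0.0/16"))
```

If client_is_trusted returns False, the security group created from MyTrustedNetwork is the likely cause of the "Can't contact LDAP server" error.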

Conclusion

You can use ELB and HAProxy to provide an LDAPS endpoint for Simple AD and transport sensitive authentication information over untrusted networks. You can explore using LDAPS to authenticate SSH users or integrate with other software solutions that support LDAP authentication. This solution’s CloudFormation template is available on GitHub.

If you have comments about this post, submit them in the “Comments” section below. If you have questions about or issues implementing this solution, start a new thread on the Directory Service forum.

– Cameron and Jeff

On ISO standardization of blockchains

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/08/on-iso-standardization-of-blockchains.html

So ISO, the primary international standards organization, is seeking to standardize blockchain technologies. On the surface, this seems a reasonable idea, creating a common standard that everyone can interoperate with.

But it can be a silly idea in practice. I mean, it should not be assumed that this is a good thing to do.

The value of official standards

You don’t need the official imprimatur of a government committee for something to be a “standard”. The Internet itself is a prime example of that.

In the 1980s, the ISO and the IETF (Internet Engineering Task Force) pursued competing standards for creating a world-wide “internet”. The IETF was an informal group of technologists that had essentially no official standing.

The ISO version of the Internet failed. Their process was to bring multiple stakeholders from business, government, and universities together in committees to debate competing interests. The result was something so horrible that it could never work in practice.

The IETF succeeded. It consisted of engineers just building things. Rather than officially “standardized”, these things were “described”, so that others knew enough to build their own version that interoperated. Once lots of different people built interoperating versions of something, then it became a “standard”.

In other words, the way the Internet came to be, standardization followed interoperability — it didn’t create interoperability.

In the end, the ISO gave up on their standards and adopted the IETF standards. The ISO brought no value to the development of Internet standards. Whether they ratified the Internet’s “TCP/IP” standard, ignored it, or condemned it, the Internet would exist today anyway, and a competing ISO-blessed internetwork would not.

The same question exists for blockchain technologies. Groups are off busy innovating quickly, creating their own standards. If the ISO blesses one, or creates its own, it’s unlikely to have any impact on interoperability.

Blockchain vs. chaining blocks

The excitement over blockchains is largely driven by people who don’t know the details, who don’t understand the difference between a blockchain like Bitcoin and the problem they are trying to solve.

Consider a record keeping system, especially public records. Storing them in a blockchain seems like a natural idea.

But in fact, it’s a terrible idea. A Bitcoin-style blockchain has a lot of features you don’t want, like “proof-of-work” signing. It is also missing necessary features, like bulk storage with redundancy (backups). Sure, Bitcoin has redundancy, but by brute force, storing the blockchain in thousands of places around the Internet. This is far from what a public records system would need, which would store a lot more data with far fewer backup copies (fewer than 10).

The only real overlap between Bitcoin and a public records system is a “signing chain”. But this is something that already existed before Bitcoin. It’s what the Bitcoin blockchain was built on top of; it’s not the blockchain itself.
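To make the distinction concrete, here is a toy sketch of chaining records so that tampering with any earlier record is detectable. It is deliberately simplified and rests on assumptions: it uses plain SHA-256 hashes where a real records system would use digital signatures, and the record layout and function names are invented for illustration. Note that nothing in it involves proof-of-work or thousands of replicated copies.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def make_record(payload, prev_hash):
    """Create a record whose hash covers its payload and the previous record's hash."""
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    digest = hashlib.sha256(body.encode()).hexdigest()
    return {"payload": payload, "prev": prev_hash, "hash": digest}


def build_chain(payloads):
    chain, prev = [], GENESIS
    for payload in payloads:
        record = make_record(payload, prev)
        chain.append(record)
        prev = record["hash"]
    return chain


def verify_chain(chain):
    """Recompute every hash and check that each record points at its predecessor."""
    prev = GENESIS
    for record in chain:
        recomputed = make_record(record["payload"], record["prev"])["hash"]
        if record["prev"] != prev or recomputed != record["hash"]:
            return False
        prev = record["hash"]
    return True


chain = build_chain(["deed 1", "deed 2", "deed 3"])
print(verify_chain(chain))
chain[1]["payload"] = "forged deed"
print(verify_chain(chain))
```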

It’s like people discovering “cryptography” for the first time when they looked at Bitcoin, ignoring the thousand-year history of crypto, and now every time they see a need for “crypto” they think “Bitcoin blockchain”.

Consensus and forking

The entire point of Bitcoin, the reason it was created, was as the antithesis to centralized standardization like ISO. Standardizing blockchains misses the entire point of their existence. The Bitcoin manifesto is that standardization comes from acclamation not proclamation, and that many different standards are preferable to a single one.

This is not just a theoretical idea but one built into Bitcoin’s blockchain technology. “Consensus” is achieved by the proof-of-work mechanism, so that those who do the most work are the ones that drive the consensus. When irreconcilable differences arise, the blockchain “forks”, with each side continuing on with their now non-interoperable blockchains. Such forks are not a sin, but part of the natural evolution.

We saw this with the recent fork of Bitcoin. There are now so many transactions that they exceed the size of blocks. One group chose a change to make transactions smaller. Another group chose a change to make block sizes larger.

It is this problem, of consensus, that is the innovation that Bitcoin created with blockchains, not the chain signing of public transaction records.
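For readers who have not seen it, the proof-of-work idea behind that consensus can be shown in a few lines. This is a toy sketch, not Bitcoin's actual scheme: Bitcoin hashes a block header with double SHA-256 against a numeric target, while the version below simply searches for a nonce that gives a single SHA-256 hash a chosen number of leading zero hex digits. The function name and difficulty value are assumptions for illustration.

```python
import hashlib
from itertools import count


def mine(data, difficulty=3):
    """Find a nonce so that sha256(data:nonce) starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{data}:{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest


nonce, digest = mine("block of transactions")
print(nonce, digest)
```

Finding the nonce takes many attempts, while checking it takes one hash. That asymmetry is what lets the participants who do the most work drive the consensus.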

Ethereum

What “blockchain standardization” is going to mean in practice is not the blockchain itself, but trying to standardize the Ethereum version. What makes Ethereum different is the “smart contracts” programming language, which has financial institutions excited.

This is a bad idea because from a cybersecurity perspective, Ethereum’s programming language is flawed. Different bugs in “smart contracts” have led to multiple $100-million hacks, such as the infamous “DAO collapse”.

While it has interesting possibilities, we should be scared of standardizing Ethereum’s language before it works.

Conclusion

People who matter are too busy innovating, creating their own blockchain standards. There is little that the ISO can do to improve this. Their official imprimatur is not needed to foster innovation and interoperability — if they are consequential at anything, it’ll just be interfering.

How to Increase the Redundancy and Performance of Your AWS Directory Service for Microsoft AD Directory by Adding Domain Controllers

Post Syndicated from Peter Pereira original https://aws.amazon.com/blogs/security/how-to-increase-the-redundancy-and-performance-of-your-aws-directory-service-for-microsoft-ad-directory-by-adding-domain-controllers/

You can now increase the redundancy and performance of your AWS Directory Service for Microsoft Active Directory (Enterprise Edition), also known as AWS Microsoft AD, directory by deploying additional domain controllers. Adding domain controllers increases redundancy, resulting in even greater resilience and higher availability. This new capability enables you to have at least two domain controllers operating, even if an Availability Zone were to be temporarily unavailable. The additional domain controllers also improve the performance of your applications by enabling directory clients to load-balance their requests across a larger number of domain controllers. For example, AWS Microsoft AD enables you to use larger fleets of Amazon EC2 instances to run .NET applications that perform frequent user attribute lookups.

AWS Microsoft AD is a highly available, managed Active Directory built on actual Microsoft Windows Server 2012 R2 in the AWS Cloud. When you create your AWS Microsoft AD directory, AWS deploys two domain controllers that are exclusively yours in separate Availability Zones for high availability. Now, you can deploy additional domain controllers easily via the Directory Service console or API, by specifying the total number of domain controllers that you want.

AWS Microsoft AD distributes the additional domain controllers across the Availability Zones and subnets within the Amazon VPC where your directory is running. AWS deploys the domain controllers, configures them to replicate directory changes, monitors for and repairs any issues, performs daily snapshots, and updates the domain controllers with patches. This reduces the effort and complexity of creating and managing your own domain controllers in the AWS Cloud.

In this blog post, I create an AWS Microsoft AD directory with two domain controllers in each Availability Zone. This ensures that I always have at least two domain controllers operating, even if an entire Availability Zone were to be temporarily unavailable. To accomplish this, first I create an AWS Microsoft AD directory with one domain controller per Availability Zone, and then I deploy one additional domain controller per Availability Zone.

Solution architecture

The following diagram shows how AWS Microsoft AD deploys all the domain controllers in this solution after you complete Steps 1 and 2. In Step 1, AWS Microsoft AD deploys the two required domain controllers across multiple Availability Zones and subnets in an Amazon VPC. In Step 2, AWS Microsoft AD deploys one additional domain controller per Availability Zone and subnet.

Solution diagram

Step 1: Create an AWS Microsoft AD directory

First, I create an AWS Microsoft AD directory in an Amazon VPC. I can add domain controllers only after AWS Microsoft AD configures my first two required domain controllers. In my example, my domain name is example.com.

When I create my directory, I must choose the VPC in which to deploy my directory (as shown in the following screenshot). Optionally, I can choose the subnets in which to deploy my domain controllers, and AWS Microsoft AD ensures I select subnets from different Availability Zones. In this case, I have no subnet preference, so I choose No Preference from the Subnets drop-down list. In this configuration, AWS Microsoft AD selects subnets from two different Availability Zones to deploy the directory.

Screenshot of choosing the VPC in which to create the directory

I then choose Next Step to review my configuration, and then choose Create Microsoft AD. It takes approximately 40 minutes for my domain controllers to be created. I can check the status from the AWS Directory Service console, and when the status is Active, I can add my two additional domain controllers to the directory.

Step 2: Deploy two more domain controllers in the directory

Now that I have created an AWS Microsoft AD directory and it is active, I can deploy two additional domain controllers in the directory. AWS Microsoft AD enables me to add domain controllers through the Directory Service console or API. In this post, I use the console.

To deploy two more domain controllers in the directory:

  1. I open the AWS Management Console, choose Directory Service, and then choose the Microsoft AD Directory ID. In my example, my recently created directory is example.com, as shown in the following screenshot.
    Screenshot of choosing the Directory ID
  2. I choose the Domain controllers tab next. Here I can see the two domain controllers that AWS Microsoft AD created for me in Step 1. It also shows the Availability Zones and subnets in which AWS Microsoft AD deployed the domain controllers.
    Screenshot showing the domain controllers, Availability Zones, and subnets
  3. I then choose Modify on the Domain controllers tab. I specify the total number of domain controllers I want by choosing the subtract and add buttons. In my example, I want four domain controllers in total for my directory.
    Screenshot showing how to specify the total number of domain controllers
  4. I choose Apply. AWS Microsoft AD deploys the two additional domain controllers and distributes them evenly across the Availability Zones and subnets in my Amazon VPC. Within a few seconds, I can see the Availability Zones and subnets in which AWS Microsoft AD deployed my two additional domain controllers with a status of Creating (see the following screenshot). While AWS Microsoft AD deploys the additional domain controllers, my directory continues to operate by using the active domain controllers—with no disruption of service.
    Screenshot of two additional domain controllers with a status of "Creating"
  5. When AWS Microsoft AD completes the deployment steps, all domain controllers are in Active status and available for use by my applications. As a result, I have improved the redundancy and performance of my directory.

Note: After deploying additional domain controllers, I can reduce the number of domain controllers by repeating the modification steps with a lower number of total domain controllers. Unless a directory is deleted, AWS Microsoft AD does not allow fewer than two domain controllers per directory in order to deliver fault tolerance and high availability.
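The reasoning behind choosing four domain controllers can be checked with a little arithmetic. The sketch below assumes, as described above, that the controllers are spread evenly across two Availability Zones; the function names are hypothetical and the model ignores everything except the count of controllers per zone.

```python
def controllers_per_zone(total, zones=2):
    """Distribute controllers as evenly as possible across the zones."""
    base, extra = divmod(total, zones)
    return [base + (1 if i < extra else 0) for i in range(zones)]


def surviving_after_zone_loss(total, zones=2):
    """Controllers still running if the fullest zone becomes unavailable."""
    return total - max(controllers_per_zone(total, zones))


for total in (2, 3, 4):
    print(total, controllers_per_zone(total), surviving_after_zone_loss(total))
```

With the default two controllers, losing one zone leaves a single controller. With four, two remain, which is the outcome this post aims for.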

Summary

In this blog post, I demonstrated how to deploy additional domain controllers in your AWS Microsoft AD directory. By adding domain controllers, you increase the redundancy and performance of your directory, which makes it easier for you to migrate and run mission-critical Active Directory–integrated workloads in the AWS Cloud without having to deploy and maintain your own AD infrastructure.

To learn more about AWS Directory Service, see the AWS Directory Service home page. If you have questions, post them on the Directory Service forum.

– Peter

Sync vs. Backup vs. Storage

Post Syndicated from Yev original https://www.backblaze.com/blog/sync-vs-backup-vs-storage/

Cloud Sync vs. Cloud Backup vs. Cloud Storage

Google Drive recently announced their new Backup and Sync feature for Google Drive, which allows users to select folders on their computer that they want to back up to their Google Drive account (note: these files count against your Google Drive storage limit). Whenever new backup services are announced, we get a lot of questions, so I thought we should take a minute to review the differences in cloud-based services.

What is the Cloud? Sync Vs Backup Vs Storage

There is still a lot of confusion in the space about what exactly the “cloud” is and how different services interact with it. When folks use a syncing and sharing service like Dropbox, Box, Google Drive, OneDrive or any of the others, they often assume those are acting as a cloud backup solution as well. Adding to the confusion, cloud storage services are often the backend for backup and sync services as well as standalone services. To help sort this out, we’ll define some of the terms below as they apply to a traditional computer set-up with a bunch of apps and data.

Cloud Sync (ex. Dropbox, iCloud Drive, OneDrive, Box, Google Drive) – these services sync folders on your computer to folders on other machines or to the cloud – allowing users to work from a folder or directory across devices. Typically these services have tiered pricing, meaning you pay for the amount of data you store with the service. If there is data loss, these services sometimes offer a rollback feature, but only files that are in the synced folders are available to be recovered.

Cloud Backup (ex. Backblaze Cloud Backup, Mozy, Carbonite) – these services work in the background automatically. The user does not need to take any action like setting up specific folders. Backup services typically back up any new or changed data on your computer to another location. Before the cloud took off, that location was primarily a CD or an external hard drive – but as cloud storage became more readily available it became the most popular storage medium. Typically these services have fixed pricing, and if there is a system crash or data loss, all backed up data is available for restore. In addition, these services have rollback features in case there is data loss / accidental file deletion.

Cloud Storage (ex. Backblaze B2, Amazon S3, Microsoft Azure) – these services are where many online backup and syncing and sharing services store data. Cloud storage providers typically serve as the endpoint for data storage. These services typically provide APIs, CLIs, and access points for individuals and developers to tie in their cloud storage offerings directly. These services are priced “per GB” meaning you pay for the amount of storage that you use. Since these services are designed for high-availability and durability, data can live solely on these services – though we still recommend having multiple copies of your data, just in case.

What Should You Use?

Backblaze strongly believes in a 3-2-1 Backup Strategy. A 3-2-1 strategy means having at least 3 total copies of your data, 2 of which are local but on different mediums (e.g. an external hard drive in addition to your computer’s local drive), and at least 1 copy offsite. The best setup is data on your computer, a copy on a hard drive that lives somewhere not inside your computer, and another copy with a cloud backup provider. Backblaze Cloud Backup is a great complement to other services, like Time Machine, Dropbox, and even the free-tiers of cloud storage services.
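If you want to check your own setup against the 3-2-1 rule, the rule is simple enough to write down as code. This is just an illustrative sketch: the Copy class, its fields, and the example media names are made up for the purpose, and the check follows the definition given above.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    medium: str    # e.g. "internal drive", "external drive", "cloud backup"
    offsite: bool  # True if the copy lives away from your computer's location


def satisfies_3_2_1(copies):
    """At least 3 copies, 2 local on different media, and at least 1 offsite."""
    local_media = {c.medium for c in copies if not c.offsite}
    has_offsite = any(c.offsite for c in copies)
    return len(copies) >= 3 and len(local_media) >= 2 and has_offsite


setup = [
    Copy("internal drive", offsite=False),
    Copy("external drive", offsite=False),
    Copy("cloud backup", offsite=True),
]
print(satisfies_3_2_1(setup))
```

A computer plus a sync folder alone fails this check, because it has only one local medium and fewer than three copies of most files.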

What is The Difference Between Cloud Sync and Backup?

Let’s take a look at some sync setups that we see fairly frequently.

Example 1) Users have one folder on their computer that is designated for Dropbox, Google Drive, OneDrive, or one of the other syncing/sharing services. Users save or place data into those directories when they want them to appear on other devices. Often these users are using the free-tier of those syncing and sharing services and only have a few GB of data uploaded in them.

Example 2) Users are paying for extended storage for Dropbox, Google Drive, OneDrive, etc… and use those folders as the “Documents” folder – essentially working out of those directories. Files in that folder are available across devices, however, files outside of that folder (e.g. living on the computer’s desktop or anywhere else) are not synced or stored by the service.

What both examples are missing, however, is the backup of photos, movies, videos, and the rest of the data on their computer. That’s where cloud backup providers excel, by automatically backing up user data with little or no set-up, and no need for the dragging-and-dropping of files. Backblaze actually scans your hard drive to find all the data, regardless of where it might be hiding. The result is that all of the user’s data is kept in the Backblaze cloud, and the portion of the data that is synced is also kept in that provider’s cloud – giving the user another layer of redundancy. Best of all, Backblaze will actually back up your Dropbox, iCloud Drive, Google Drive, and OneDrive folders.

Data Recovery

The most important feature to think about is how easy it is to get your data back from all of these services. With sync and share services, retrieving a lot of data, especially if you are in a high-data tier, can be cumbersome and take a while. Generally, the sync and share services only allow customers to download files over the Internet. If you are trying to download more than a couple of gigabytes of data, the process can take time and can be fraught with errors.
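A rough way to see why large restores are slow is to work out the transfer time. The sketch below assumes a sustained 50 Mbps connection and decimal units, both chosen only for illustration. Real downloads will usually be slower because of overhead and retries.

```python
def download_hours(size_gb, mbps):
    """Hours needed to transfer size_gb gigabytes at a sustained mbps megabits per second."""
    megabits = size_gb * 8 * 1000  # 1 GB = 8,000 megabits in decimal units
    return megabits / mbps / 3600


# The 50 Mbps link speed and the data sizes are assumptions, not measurements.
for size_gb in (2, 500, 2000):
    print(f"{size_gb} GB at 50 Mbps: {download_hours(size_gb, 50):.1f} hours")
```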

With cloud storage services, you can usually only retrieve data over the Internet as well, and you pay for both the storage and the egress of the data, so retrieving a large amount of data can be both expensive and time consuming.

Cloud backup services will enable you to download files over the internet too and can also suffer from long download times. At Backblaze we never want our customers to feel like we’re holding their data hostage, which is why we have a lot of restore options, including our Restore Return Refund policy, which allows people to restore their data via a USB Hard Drive, and then return that drive to us for a refund. Cloud sync providers do not provide this capability.

One popular data recovery use case we’ve seen when a person has a lot of data to restore is to download just the files that are needed immediately, and then order a USB Hard Drive restore for the remaining files that are not as time sensitive. The user gets all their files back in a few days, and their network is spared the download charges.

The bottom line is that all of these services have merit for different use-cases. Have questions about which is best for you? Sound off in the comments below!

The post Sync vs. Backup vs. Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Elixir Cross Referencer: new way to browse kernel sources

Post Syndicated from ris original https://lwn.net/Articles/725302/rss

Free Electrons has released the initial version of the Elixir Cross-Referencer, an online tool for cross-referencing Linux source code. Elixir uses a new engine written in Python that replaces LXR, the engine used in Free Electrons’ previous online tool. “Another reason that motivated a complete rewrite was that we wanted to provide an up-to-date reference (including the latest revisions) while keeping it immutable, so that external links to the source code wouldn’t get broken in the future. As a direct consequence, we would need to index many different revisions for each project, with potentially a lot of redundant information between them. That’s when we realized we could leverage the data model of Git to deal with this redundancy in an efficient manner, by indexing Git blobs, which are shared between revisions. In order to make sure queries under this strategy would be fast enough, we wrote a proof-of-concept in Python, and thus Elixir was born.”
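The idea of indexing blobs rather than revisions can be sketched in a few lines. This is not Elixir's code: the input format and the `index_revisions` helper are made up for illustration, and only the blob ID computation follows Git's SHA-1 object format.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the ID Git gives a blob: SHA-1 over a 'blob <size>' header, a NUL byte, and the content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def index_revisions(revisions):
    """Index each distinct blob once, however many revisions contain it.

    revisions maps a revision name to {path: file content}. This toy input
    format is an assumption for illustration, not Elixir's data model.
    """
    blob_index = {}      # blob id -> stand-in for per-blob index data
    revision_files = {}  # revision -> {path: blob id}
    for rev, files in revisions.items():
        revision_files[rev] = {}
        for path, content in files.items():
            blob_id = git_blob_id(content)
            revision_files[rev][path] = blob_id
            if blob_id not in blob_index:
                blob_index[blob_id] = len(content)  # placeholder for real indexing work
    return blob_index, revision_files


revisions = {
    "v1": {"a.c": b"int a;\n", "b.c": b"int b;\n"},
    "v2": {"a.c": b"int a;\n", "b.c": b"int b = 1;\n"},
}
blob_index, revision_files = index_revisions(revisions)
print(len(blob_index), "blobs indexed for", sum(len(f) for f in revision_files.values()), "files")
```

Because `a.c` is unchanged between the two revisions, it maps to the same blob ID and is indexed only once.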

AWS GovCloud (US) Heads East – New Region in the Works for 2018

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-govcloud-us-heads-east-new-region-in-the-works-for-2018/

AWS GovCloud (US) gives AWS customers a place to host sensitive data and regulated workloads in the AWS Cloud. The first AWS GovCloud (US) Region was launched in 2011 and is located on the west coast of the US.

I’m happy to announce that we are working on a second Region that we expect to open in 2018. The upcoming AWS GovCloud (US-East) Region will provide customers with added redundancy, data durability, and resiliency, and will also provide additional options for disaster recovery.

Like the existing region, which we now call AWS GovCloud (US-West), the new region will be isolated and meet top US government compliance requirements including International Traffic in Arms Regulations (ITAR), NIST standards, Federal Risk and Authorization Management Program (FedRAMP) Moderate and High, Department of Defense Impact Levels 2-4, DFARS, IRS 1075, and Criminal Justice Information Services (CJIS) requirements. Visit the GovCloud (US) page to learn more about the compliance regimes that we support.

Government agencies and the IT contractors that serve them were early adopters of AWS GovCloud (US), as were companies in regulated industries. These organizations are able to enjoy the flexibility and cost-effectiveness of public cloud while benefiting from the isolation and data protection offered by a region designed and built to meet their regulatory needs and to help them to meet their compliance requirements. Here’s a small sample from our customer base:

Federal (US) Government: Department of Veterans Affairs, General Services Administration 18F (Digital Services Delivery), NASA JPL, Defense Digital Service, United States Air Force, United States Department of Justice.

Regulated Industries: CSRA, Talen Energy, Cobham Electronics.

SaaS and Solution Providers: FIGmd, Blackboard, Splunk, GitHub, Motorola.

Federal, state, and local agencies that want to move their existing applications to the AWS Cloud can take advantage of the AWS Cloud Adoption Framework (CAF) offered by AWS Professional Services.

Jeff;

Building a Competitive Moat: Turning Challenges Into Advantages

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/turning-challenges-into-advantages/

castle on top of a storage pod

In my previous post on how Backblaze got started, I mentioned that “just because we knew the right solution, didn’t mean that it was possible.” I’ll dig into that here. The right solution was to offer unlimited backup for $5 per month. The price of storage at the time, however, would have likely forced us to price our unlimited backup service at 2x – 5x that.

We were faced with a difficult challenge – compromise a fundamental feature of our product by removing the unlimited storage element, increase our price point in order to cover our costs but likely limit our potential customer base, seek funding in order to run at a loss while we built market share with a hope/prayer we could make a profit in the future, or find another way (huge unknown that might not have a solution). Below I’ll dig into the options that were available, the paths we tried, and how this challenge completely transformed our company and ended up being our greatest technological advantage.

Available Options:

Use a Storage Service

Originally we intended to build the backup application, but leave the back-end storage to others; likely Amazon S3. This had many advantages:

  1. We would not have to worry about the storage at all
  2. It would scale up or down as we needed it
  3. We would pay only for what we used

Especially as a small, bootstrapped company with limited resources – these were incredible benefits.

There was just one problem. At S3’s then-current pricing ($0.15/GB/month), a customer storing just 33 GB would cost us 100% of the $5 per month we would collect. Additionally, we would need to pay S3 transaction and download charges, along with our engineering/support/marketing and other expenses. The conclusion: even if the average customer stored just 33 GB, it would cost us at least $10/month for a customer that we were charging just $5/month.
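The storage part of that math is easy to check. This sketch uses only the two prices quoted above and leaves out transaction fees, download fees, and every other expense.

```python
S3_PRICE_PER_GB_MONTH = 0.15   # S3's price at the time, as quoted in the post
SUBSCRIPTION_PER_MONTH = 5.00  # the price charged to each backup customer


def storage_cost(gb_stored):
    """Monthly S3 storage cost for one customer, excluding transaction and download fees."""
    return gb_stored * S3_PRICE_PER_GB_MONTH


def break_even_gb():
    """Amount of stored data at which storage alone uses up the whole subscription."""
    return SUBSCRIPTION_PER_MONTH / S3_PRICE_PER_GB_MONTH


print(f"Storage cost for 33 GB: ${storage_cost(33):.2f}/month")
print(f"Break-even point: {break_even_gb():.1f} GB")
```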

In 2007, when we were getting started, there were a few other storage services available. But all were more expensive. Despite the fantastic benefits of using such a service, it simply didn’t work for us.

Buy Storage Systems

Buying storage systems didn’t have all the benefits of using a storage service – we would have to forecast need, buy in big blocks up front, manage data centers, etc. – but it seemed the second-best option. Companies such as EMC, NetApp, Dell, and others sold hundreds of petabytes of storage systems where they provide the servers, software, and support.

Alas, there were two problems: One temporary, the other permanent (and fatal). The temporary problem was that these systems were hundreds of thousands of dollars just to get started. This was challenging for us from a cash-flow perspective, but it was just a question of coming up with the cash. The permanent problem was that these systems cost ~$1,000/TB of storage. Hard drives were selling for ~$100/TB, so there was a 10x markup for the storage system. That markup eliminated pursuing this path. What if the average customer had 100 GB to store? It would take us 20 months to pay off the purchase. We weren’t sure how much data the average customer would have, but the scenarios we were running made it seem like a $5/month price point was unsustainable.
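The 20-month figure follows directly from the numbers above. A minimal sketch, counting only the hardware purchase and ignoring data centers, bandwidth, staff, and redundancy overhead:

```python
def payback_months(gb_stored, cost_per_tb, price_per_month=5.0):
    """Months of subscription revenue needed to cover the storage hardware for one customer."""
    hardware_cost = gb_stored * cost_per_tb / 1000  # decimal units: 1 TB = 1,000 GB
    return hardware_cost / price_per_month


print(payback_months(100, 1000))  # storage system at ~$1,000/TB
print(payback_months(100, 100))   # bare hard drives at ~$100/TB
```

The second call shows why bare drives were so tempting: at ~$100/TB the same customer pays off the hardware in a tenth of the time.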

Our Choices Were:

Don’t Offer the Right Solution

If it’s impossible to offer unlimited backup for $5/month, there are certainly choices. We could have raised the price to $10/month, made the backup limited, or closed up shop altogether. All doable, none ideal.

Raise Funding

Plenty of companies raise funding before they can be self-sustaining, and it can work out great for everyone. We had raised funding for a previous company and believed we could have done it for Backblaze. And raising funding would have taken care of the cash-flow issue if we chose to buy storage systems.

However, it would have left us with a business with negative unit economics – we would lose money on every customer, and the faster we grew, the more money we would lose. VCs do fund these types of companies often (many of the delivery companies today fall in this realm) with the idea that, at scale, you improve your cost structure and possibly also charge more. But it’s a dangerous game since not only is the business not self-sustaining, it inevitably must be significantly altered in order to survive.

Find a Way to Store Data for Less

If there were some way to store data for less, significantly less, it could all work. We had a tiny glimmer of hope that it would be possible: Since hard drives only cost ~$100/TB, if we could somehow use those drives without adding much overhead, that would be quite affordable.

“we wanted to build a sustainable business from day one and build a culture that believes dollars come from customers.”

Our first decision was to not compromise our product by restricting the amount of storage. Although this would have been a much easier solution, it violated our core mission: Create a simple and inexpensive solution to back up all of your important data.

We had previously also decided not to raise funding to get started because we wanted to build a sustainable business from day one and build a culture that believes dollars come from customers. With those decisions made, we moved on to finding the best solution to fulfill our mission and create a viable company.

Experimentation

All we wanted was to attach hard drives to the Internet. If we could do that inexpensively, our backup application could store the data there and we could offer our unlimited backup service.

A hard drive needs to be connected to a server to be available on the Internet. It certainly wouldn’t be very cost effective to have one server for every hard drive, as the server costs would dominate the equation. Alternatively, trying to attach a lot of drives to a server resulted in needing expensive “enterprise” servers. The goal then became cost-efficiently attaching as many hard drives as possible to one server. According to its spec, USB is supposed to allow for 127 devices to be daisy-chained to a single port. We tried; it didn’t work. We considered Firewire, which could connect 63 devices, but the connectors are aimed at graphic designers and ended up too expensive on a unit-basis. Our next thought was to use small consumer-grade DAS (Direct-attached storage) devices and connect those to a server. We managed to attach 8 DAS devices with 4 drives each for a total of 32 hard drives connected to one server.

DAS units attached to a server
This worked well, but it was operationally challenging as none of these devices were meant to fit in a data center rack. Further complicating matters was that moving one of these setups required cabling 10 power cords, and separately moving 9 boxes. Fine at small scale, but very hard to scale up.

We realized that we didn’t need all the boxes, we just needed backplanes to connect the drives from the DAS boxes to the motherboard from the server. We found a different DAS box that supports port multipliers and took that backplane. How did we decide on that DAS box? Tim, co-founder & Chief Cloud Officer, remembers going to Fry’s and picking the box that looked “about right”.

That all laid the path for our eventual 45 drive design. The next thought was: If we could put all that in one box, it might be the solution we were looking for. The first iteration of this was a plywood box.

the first wooden storage pod

That eventually evolved into a steel server and what we refer to as a Storage Pod.

steel storage pod chassis

Building a Storage Platform

The Storage Pod became our key building block, but was just a tiny component of the ‘storage platform’. We had to write software that would run on each Storage Pod, software that would create redundancy between the Storage Pods, and central software and systems that would coordinate other aspects of the system to accept/load balance/validate/clean-up data. We had to find and train contract manufacturers to build the Storage Pods, find and negotiate data center space and bandwidth, set up processes to buy drives and track their reliability, hire people to maintain the systems, and set up the business processes to do all of this and more at scale.

All of this ended up taking tremendous technical effort, management engagement, and work from all corners of Backblaze. But it has also paid enormous dividends.

The Transformation

We started Backblaze thinking of ourselves as a backup company. In reality, we became a storage company with ‘backup’ as the first service we offered on our storage platform. Our backup service relies on the storage platform as, without the storage platform, we couldn’t offer unlimited backup. To enable the backup service, storage became the foundation of our company and is still what we live and breathe every day.

It didn’t just change how we built the service, it changed the fundamental DNA of the company.

Dividends

Creating our own storage platform was certainly hard. But it enabled us to offer our unlimited backup for a low price and do that while running a sustainable business.

“It didn’t just change how we built the service, it changed the fundamental DNA of the company.”

We felt that we had a service and price point that customers wanted, and we “unlocked” the way to let us build it. Having our storage platform also provides us with a deep connection to our customers and the storage community – we share how we build Storage Pods and how reliable hard drives in our environment have been. That content, in turn, helps bring awareness to Backblaze; the awareness helps establish the company as a tech leader; that reputation helps us recruit to our growing team and earns customers who are evaluating our solutions vs Storage Company X.

And after years of being a storage company with a backup service, and being asked all the time to just offer our storage directly, we launched our Backblaze B2 Cloud Storage service. We offer this raw storage at a price of $0.005/GB/month – that’s less than 1/4th of the price of S3.

If we had built our backup service on one of the existing storage services or storage systems, it would have been easier – but none of this would have been possible. This challenge, which we have spent a decade working to overcome, has also transformed our company and become our greatest technological advantage.

The post Building a Competitive Moat: Turning Challenges Into Advantages appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How Important is Hosting Location? Questions to Ask Your Hosting Provider

Post Syndicated from Sarah Wilson original https://www.anchor.com.au/blog/2017/05/site-hosting-location/

The importance of your hosting partner’s location will depend on your organisation’s requirements and your business needs. For a small business focussed on a single country with fairly low traffic, starting with shared hosting or a small virtual private server is generally the most cost-effective way to host your site. Hosting your site or web application in the same geographical location as your website visitors reduces page load times and latency (the lag between requesting data and receiving it) and will greatly improve user experience. If you are running a business-critical website or ecommerce application, or have customers or visitors from various global locations, then placing your website in an unsuitably located data centre will cost you much more than a monthly hosting fee!

Does the Location of My Hosting Provider Matter?

Location also matters when it comes to service. When selecting a hosting provider, you should be aware of their usual operating hours and ensure these fit your company’s requirements and time zone! Why does this matter? You can’t choose when an unexpected outage of your site happens, so if your hosting provider is based in a different time zone with limited service hours, there may be no one to help you. Here at Anchor, we’ve implemented a “follow the sun” support model around our 8am-6pm AEST operating hours. This means that wherever you are in the world with a problem, you can pick up the phone and our friendly support team will be on hand to help.

But Why Do I Care Where the Data Centre is?

Data centres require state of the art cooling and power in order to keep the servers and hardware in perfect condition. Redundancy in the network is also a vital part of infrastructure, i.e. if something in the network fails, there is backup infrastructure, power or cooling so it can operate as normal. The data centre that Anchor uses for our shared hosting and VPS is brand new, with 24/7 security and state of the art technologies, located in Sydney.

On the flip side of this location conundrum is how you serve customers in locations outside of Australia, which can now be achieved via a public cloud offering such as Amazon Web Services (AWS). With AWS, your business can leverage a global network of data centres so you can serve your customers and audiences wherever they are located! For example, if your business operates in Australia and you have customers in Singapore, the United Kingdom and the USA, you can have your application deployed in all four locations so that each customer session is routed to the closest server.
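As a simplified illustration of "routed to the closest server", the sketch below picks the deployment with the lowest latency for one visitor. The latency figures are invented, and in practice this decision is usually made by a DNS or load-balancing service rather than by your own code.

```python
def closest_region(latency_ms):
    """Return the name of the region with the lowest latency for this visitor."""
    return min(latency_ms, key=latency_ms.get)


# Hypothetical round-trip latencies in milliseconds for a visitor in London.
visitor_in_london = {
    "Australia": 280,
    "Singapore": 170,
    "United Kingdom": 12,
    "USA": 85,
}
print(closest_region(visitor_in_london))
```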

Anchor provides fully managed hosting in our own Sydney based data centre, or on public clouds such as AWS. Get in touch to find out more about our managed hosting services.

The post How Important is Hosting Location? Questions to Ask Your Hosting Provider appeared first on AWS Managed Services by Anchor.

Turning Steel Into Tin: The First 10 Years of Backblaze

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/backblaze-turns-10/

Today Backblaze celebrates turning 10 years old. Tin is the traditional gift for a ten year wedding anniversary: a sign of strength and flexibility. Getting to this point took not only the steel to make the servers, but tin as well.

How things have changed:

Category | 2007 | 2017
Team | Five Founders in a Palo Alto apartment | 55 employees around the country
Storage | Hard drives strung together | 60-drive, tool-less Storage Pods
Drives | 1 TB drives | 8 TB and 10 TB drives
Redundancy | RAID redundancy | 20-Storage Pod Vault redundancy
Storage | 45 TB of total stored customer data | 300,000+ TB
Customers | A few friends as customers | Hundreds of thousands across 125 countries
Files Saved | 1 customer’s data restored (mine) | 20 billion+ files restored
Business Lines | Consumer Backup | Consumer Backup, Business Backup, and Cloud Storage
Financials | $0 revenue | Millions in revenue and profitable
Mission | Make storing data astonishingly easy and low-cost | Make storing data astonishingly easy and low-cost

From Our Humble Beginnings In 2007:

To Over 300 Petabytes in 2017:

Someone recently asked me, “Does Backblaze look now the way you imagined it would ten years ago?” The honest answer? I had never imagined what it would look like in ten years; when trying to figure out how to even get to market, that timespan wasn’t even on my radar. When we were filming our first video, we were more concerned with finding our first customer than planning our first birthday.

Now, not only has it been a decade, but we’re signing 5-year data center contracts and talking about what the company will look like in ten years and beyond. We often say that we hope this will be the last job for our employees.

Thinking back, a few things I learned:

  • Staying in business is as much about commitment as cash.
  • The existential risks are hard to predict.
  • The hardest times are rarely due to tech or business; they’re personal.
  • The accepted wisdom is often wrong.
  • Less money often leads to better solutions.
  • Culture affects everything.

I’ll expand on all of these in future posts. Over the next few months, I’ll be more active in bringing back our entrepreneurship series of blog posts. Don’t worry, our Hard Drive Stats and all the other stuff aren’t going anywhere!

But today is about the community that makes Backblaze what it is: None of it would have been possible without all of the people who built Backblaze as their own company every day, our friends and boosters, partners, and (of course) our incredible customers. While I did not, and likely could not, imagine a decade ago where Backblaze would be today, I’m thrilled at where we have arrived. Thank you.

And tomorrow, onto the next decade.

The post Turning Steel Into Tin: The First 10 Years of Backblaze appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Back Up Your Tax Data Before It’s Too Late

Post Syndicated from Peter Cohen original https://www.backblaze.com/blog/backup-tax-data/

Backup Your Tax Data

Just filing taxes can be stressful enough. The last thing you need is a hassle with the IRS or your state’s Department of Revenue because you’ve lost or misplaced necessary paperwork.

So how long should you keep your tax records? The IRS recommends you keep records for up to 7 years. Don’t let data insecurity add to your tax season stress: Make sure your digital tax files and any supporting documentation are well-protected with these tips.

Prepare backups from your tax prep software

If you’re using TurboTax or another app to prepare this year’s taxes, make sure to save a copy of your tax data file on your device. And to be extra safe, export it to a universal format. A read-only format like PDF will do the job. The important thing is that you can look at the information you’ve submitted without needing proprietary software.

Intuit offers instructions on its web site to perform a TurboTax file backup. TaxAct also offers instructions for TaxAct file backup. If you’re using another tax prep software package, make sure to check the Help files or online support documentation for instructions on saving and exporting your tax data files.

The same goes for any accompanying files you’ve used as supporting documentation: Scanned receipts, bank statements, 1099s, real estate tax, mortgage statements, insurance receipts, and any other records you’ve accounted for on your tax forms. Digital copies – file formats like JPG, GIF or PDF – are fine with the IRS, as long as they’re legible.

Put all of your records in a “2017 Taxes” folder that’s stored prominently in your Documents folder, or wherever you keep the digital records that are most important to you.

Back Up Your Tax Returns Locally and Offsite

Once you’ve got all your tax return files in a single location, back them up. Start by archiving your data locally using Time Machine, Windows Backup, cloning software or whatever method you prefer. Don’t rest on your laurels with that, though: You need offsite backup too, to make sure that your data is safe no matter what happens.

Backblaze customer? Rest assured that Backblaze backs up that folder you created safely and securely. Backblaze backs up all your important files.

You can (and should) verify your files are being backed up from time to time. You should also test your backup periodically to make sure everything’s working as you expect.

Encrypt your tax records when transmitting and storing them in the cloud. Encryption is built into Backblaze. The same can’t be said for all cloud services, so check with other services to make sure they protect your data. Password-protecting individual files and folders adds another layer of protection.

If you’re not ready to come up with a complete backup strategy and are just looking for a quick fix, start with Backblaze and a USB thumbstick. Copy your files to the thumbstick and store it somewhere safe. If you’re not already backing up your computer, we can help. We’ve published guidelines for you in our Computer Backup Guide.

Don’t Depend On Just “Any” Cloud

Intuit offered a Backup to Cloud service as a pilot program for TurboTax customers. I used it to store my 2014 returns, but then I got an email cancelling the service the following year. Intuit gave only a short period to save the data before deleting it. That was a critical reminder for me not to trust any cloud services alone – I should have kept that data local.

It may sound odd coming from us (as a cloud storage company), but we wouldn’t depend alone on iCloud or any other cloud storage or sync solution. Keep a local backup and a cloud backup – that way you’ll be able to restore no matter what happens.

Some people like to add an extra layer of redundancy by printing out paper records of their tax returns too. A truly universal file format. If you have the space to do it and can store them safely, it couldn’t hurt. Mark them for deletion no less than 3-7 years from your filing date.

Give Copies To Someone (or Something) You Trust

For an extra layer of redundancy, pass along copies of your tax documents to someone you trust for safekeeping. A spouse or close family member, accountant, attorney – whoever you think can be trusted to keep the documents safe. It’s the real-life equivalent to our 3-2-1 Backup Strategy because the goal here is to keep a copy offsite for safekeeping with a trusted third party (like Backblaze).

Don’t Let Backup Be Taxing

It has been said that it’s impossible to be sure of anything but death and taxes. We can’t help you with the former, but hopefully, we’ve given you some good ideas on how to make the latter less stressful by making sure all your tax data is available, secure, and safely archived. That way, if and when you eventually need it, it’ll be no more than a few taps or clicks away. If you have other ideas for good tax backup solutions, sound off in the comments – we want to hear from you.

The post Back Up Your Tax Data Before It’s Too Late appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Labs: The Future of DNA Storage

Post Syndicated from Yev original https://www.backblaze.com/blog/backblaze-labs-future-dna-storage/

Backblaze Labs – the data storage innovation branch of Backblaze – is proud to present to you its newest innovation, Backblaze DNA Storage. Most recently Backblaze Labs designed Storage Pod 6.0 and built out our Vault infrastructure that powers Backblaze Personal Backup and B2 Cloud Storage. DNA Storage has long been in our sights and we’re happy to see our work validated as Science magazine has caught up to us, reporting its latest capabilities in their article DNA could store all of the world’s data in one room.

Backblaze has always been known for very dense storage, and when we started to see reports that we could store up to 1 Zettabyte of storage in a single gram of DNA, Backblaze Labs kicked itself into high gear. We were new to bio-storage, so we wanted to take it slow and steady.

The Ah-Ha Moment

In keeping with our bootstrapping ethos our team has not only been studying how to best employ DNA storage, but has also taken an active role by volunteering* in our alpha program as test subjects. Currently our server farm resides inside giant datacenters, but with DNA storage our ‘Storage Bods™’ could be… mobile!

*Note rumors of employees being “voluntold” to take part are just that, rumor.

Unfortunately this feature also proved to be the downfall of our earliest experiment – our first volunteer, Lego, ran away for a day. Fortunately we found her… but we realized a new approach was needed.

Our first human Storage Bod was Bob, our network engineer. His enthusiasm to take part, with the right motivation, was exactly what we needed. Aside from the sporadic bouts of nausea we were able to keep our first test unit of storage, roughly 100 TB of 70’s, 80’s & 90’s TV sitcoms, in place with some stability. Strangely we noticed some odd side effects with Bob kicking our refrigerator to get a drink and developing an odd laugh. Able to look past this we moved on to our next phase.

Now that our initial test “hosts” were used to the process, we started experimenting with data redundancy. It’s great to have a mobile backup, but what about making sure that the data exists in two places at once? We wanted this new project to be in keeping with our recommended 3-2-1 Backup strategy, didn’t we?

Unfortunately, our first attempts were again, not as successful as we’d have hoped. While we were able to start replicating the data, keeping it in two different locations proved to be a harder problem than we initially thought.

The breakthrough came when we stopped thinking about simply having the data in two locations, and instead focused on having identical copies of the data in two distinct units. Sound familiar? It should! We had stumbled onto the DNA-storage equivalent of RAID 1!

The best part about this setup was that when one set of data got corrupted, the other would know and could come in for repairs. It was a win/win. Or as we call it, a twin/twin!

Next Steps

We’d like to implement a Backblaze Vault type of set-up with our new Storage Bod systems, 20 Bods grouped together running our Reed-Solomon encoding algorithm to create 99.999999% DNA data durability. That way, even if three of the Storage Bods were to go down or leave town, all of the data can be recovered. We don’t know how all of this will work out yet, but we’re sure we’ll need to get a lot more office snacks for our Storage Bods and we’re excited about the possibility of having billions of terabytes of data walking around. What could go wrong?

The post Backblaze Labs: The Future of DNA Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How to Help Protect Dynamic Web Applications Against DDoS Attacks by Using Amazon CloudFront and Amazon Route 53

Post Syndicated from Holly Willey original https://aws.amazon.com/blogs/security/how-to-protect-dynamic-web-applications-against-ddos-attacks-by-using-amazon-cloudfront-and-amazon-route-53/

Using a content delivery network (CDN) such as Amazon CloudFront to cache and serve static text and images or downloadable objects such as media files and documents is a common strategy to improve webpage load times, reduce network bandwidth costs, lessen the load on web servers, and mitigate distributed denial of service (DDoS) attacks. AWS WAF is a web application firewall that can be deployed on CloudFront to help protect your application against DDoS attacks by giving you control over which traffic to allow or block by defining security rules. When users access your application, the Domain Name System (DNS) translates human-readable domain names (for example, www.example.com) to machine-readable IP addresses (for example, 192.0.2.44). A DNS service, such as Amazon Route 53, can effectively connect users’ requests to a CloudFront distribution that proxies requests for dynamic content to the infrastructure hosting your application’s endpoints.

In this blog post, I show you how to deploy CloudFront with AWS WAF and Route 53 to help protect dynamic web applications (with dynamic content such as a response to user input) against DDoS attacks. The steps shown in this post are key to implementing the overall approach described in AWS Best Practices for DDoS Resiliency and enable the built-in, managed DDoS protection service, AWS Shield.

Background

AWS hosts CloudFront and Route 53 services on a distributed network of proxy servers in data centers throughout the world called edge locations. Using the global Amazon network of edge locations for application delivery and DNS service plays an important part in building a comprehensive defense against DDoS attacks for your dynamic web applications. These web applications can benefit from the increased security and availability provided by CloudFront and Route 53, as well as from lower latency for end users.

The following screenshot of an Amazon.com webpage shows how static and dynamic content can compose a dynamic web application that is delivered via HTTPS protocol for the encryption of user page requests as well as the pages that are returned by a web server.

Screenshot of an Amazon.com webpage with static and dynamic content

The following map shows the global Amazon network of edge locations available to serve static content and proxy requests for dynamic content back to the origin as of the writing of this blog post. For the latest list of edge locations, see AWS Global Infrastructure.

Map showing Amazon edge locations

How AWS Shield, CloudFront, and Route 53 work to help protect against DDoS attacks

To help keep your dynamic web applications available when they are under DDoS attack, the steps in this post enable AWS Shield Standard by configuring your applications behind CloudFront and Route 53. AWS Shield Standard protects your resources from common, frequently occurring network and transport layer DDoS attacks. Attack traffic can be geographically isolated and absorbed using the capacity in edge locations close to the source. Additionally, you can configure geographical restrictions to help block attacks originating from specific countries.

The request-routing technology in CloudFront connects each client to the nearest edge location, as determined by continuously updated latency measurements. HTTP and HTTPS requests sent to CloudFront can be monitored, and access to your application resources can be controlled at edge locations using AWS WAF. Based on conditions that you specify in AWS WAF, such as the IP addresses that requests originate from or the values of query strings, traffic can be allowed, blocked, or allowed and counted for further investigation or remediation. The following diagram shows how static and dynamic web application content can originate from endpoint resources within AWS or your corporate data center. For more details, see How CloudFront Delivers Content and How CloudFront Works with Regional Edge Caches.

Route 53 DNS requests and subsequent application traffic routed through CloudFront are inspected inline. Always-on monitoring, anomaly detection, and mitigation against common infrastructure DDoS attacks such as SYN/ACK floods, UDP floods, and reflection attacks are built into both Route 53 and CloudFront. For a review of common DDoS attack vectors, see How to Help Prepare for DDoS Attacks by Reducing Your Attack Surface. When the SYN flood attack threshold is exceeded, SYN cookies are activated to avoid dropping connections from legitimate clients. Deterministic packet filtering drops malformed TCP packets and invalid DNS requests, only allowing traffic to pass that is valid for the service. Heuristics-based anomaly detection evaluates attributes such as type, source, and composition of traffic. Traffic is scored across many dimensions, and only the most suspicious traffic is dropped. This method allows you to avoid false positives while protecting application availability.

Route 53 is also designed to withstand DNS query floods, which are real DNS requests that can continue for hours and attempt to exhaust DNS server resources. Route 53 uses shuffle sharding and anycast striping to spread DNS traffic across edge locations and help protect the availability of the service.
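To make the idea of shuffle sharding more concrete, here is a minimal sketch in Python. It is not how Route 53 actually assigns name servers, which this post does not describe; the server names, the zone names, and the `shuffle_shard` helper are all hypothetical. The point is only that each customer gets its own small pseudo-random subset of a larger fleet, so a flood aimed at one customer's subset rarely covers every server that another customer depends on.

```python
import hashlib
import random


def shuffle_shard(customer_id, servers, shard_size=4):
    """Pick a stable pseudo-random subset of servers for one customer.

    Hypothetical helper for illustration only.
    """
    # Seed a private RNG from the customer id so the choice is repeatable.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return sorted(rng.sample(servers, shard_size))


# A made-up fleet of 32 name servers and two made-up hosted zones.
servers = [f"ns-{i}" for i in range(32)]
shard_a = shuffle_shard("zone-a.example", servers)
shard_b = shuffle_shard("zone-b.example", servers)

# Servers that both zones happen to share; usually a small subset.
shared = set(shard_a) & set(shard_b)
print(shard_a, shard_b, sorted(shared))
```

With 32 servers and shards of four, there are 35,960 possible shards, so two zones seldom land on exactly the same set of servers.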

The next four sections provide guidance about how to deploy CloudFront, Route 53, AWS WAF, and, optionally, AWS Shield Advanced.

Deploy CloudFront

To take advantage of application delivery with DDoS mitigations at the edge, start by creating a CloudFront distribution and configuring origins:

  1. Sign in to the AWS Management Console and open the CloudFront console.
  2. Choose Create Distribution.
  3. On the first page of the Create Distribution Wizard, in the Web section, choose Get Started.
  4. Specify origin settings for the distribution. The following screenshot of the CloudFront console shows an example CloudFront distribution configured with an Elastic Load Balancing load balancer origin, as shown in the previous diagram. I have configured this example to set the Origin SSL Protocols to use TLSv1.2 and the Origin Protocol Policy to HTTP Only. For more information about creating an HTTPS listener for your ELB load balancer and requesting a certificate from AWS Certificate Manager (ACM), see Getting Started with Elastic Load Balancing, Supported Regions, and Requiring HTTPS for Communication Between CloudFront and Your Custom Origin.
  5. Specify cache behavior settings for the distribution, as shown in the following screenshot. You can configure each URL path pattern with a set of associated cache behaviors. For dynamic web applications, set the Minimum TTL to 0 so that CloudFront will make a GET request with an If-Modified-Since header back to the origin. When CloudFront proxies traffic to the origin from edge locations and back, multiple concurrent requests for the same object are collapsed into a single request. The request is sent over a persistent connection from the edge location to the region over networks monitored by AWS. The use of a large initial TCP window size in CloudFront maximizes the available bandwidth, and TCP Fast Open (TFO) reduces latency.
  6. To ensure that all traffic to CloudFront is encrypted and to enable SSL termination from clients at global edge locations, specify Redirect HTTP to HTTPS for Viewer Protocol Policy. Moving SSL termination to CloudFront offloads computationally expensive SSL negotiation, helps mitigate SSL abuse, and reduces latency with the use of OCSP stapling and session tickets. For more information about options for serving HTTPS requests, see Choosing How CloudFront Serves HTTPS Requests. For dynamic web applications, set Allowed HTTP Methods to include all methods, set Forward Headers to All, and for Query String Forwarding and Caching, choose Forward all, cache based on all.
  7. Specify distribution settings for the distribution, as shown in the following screenshot. Enter your domain names in the Alternate Domain Names box and choose Custom SSL Certificate.
  8. Choose Create Distribution. Note the x.cloudfront.net Domain Name of the distribution. In the next section, you will configure Route 53 to route traffic to this CloudFront distribution domain name.
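If you prefer to script these console steps, the settings above can be collected into a configuration structure. The following Python sketch only builds a dictionary; it does not call AWS. The field names reflect my understanding of the DistributionConfig shape in the CloudFront API and should be checked against the current API reference before use. The fragment is partial (a real request needs more fields), the function name and origin ID are hypothetical, and the cookie setting is an assumption because the steps above do not mention cookies.

```python
def build_distribution_config(origin_domain, aliases, caller_reference,
                              origin_protocol_policy):
    """Return a partial, illustrative CloudFront distribution config.

    origin_protocol_policy is passed in by the caller and should be one of
    "http-only", "match-viewer", or "https-only".
    """
    all_methods = ["GET", "HEAD", "OPTIONS", "PUT", "POST", "PATCH", "DELETE"]
    return {
        "CallerReference": caller_reference,
        "Comment": "Dynamic web application behind CloudFront",
        "Enabled": True,
        # Step 7: alternate domain names for the distribution.
        "Aliases": {"Quantity": len(aliases), "Items": list(aliases)},
        # Step 4: origin settings (for example, a load balancer).
        "Origins": {
            "Quantity": 1,
            "Items": [{
                "Id": "app-origin",  # hypothetical identifier
                "DomainName": origin_domain,
                "CustomOriginConfig": {
                    "HTTPPort": 80,
                    "HTTPSPort": 443,
                    "OriginProtocolPolicy": origin_protocol_policy,
                    "OriginSslProtocols": {"Quantity": 1, "Items": ["TLSv1.2"]},
                },
            }],
        },
        # Steps 5 and 6: cache behavior for dynamic content.
        "DefaultCacheBehavior": {
            "TargetOriginId": "app-origin",
            "ViewerProtocolPolicy": "redirect-to-https",
            "MinTTL": 0,
            "AllowedMethods": {
                "Quantity": len(all_methods),
                "Items": all_methods,
                "CachedMethods": {"Quantity": 2, "Items": ["GET", "HEAD"]},
            },
            "ForwardedValues": {
                "QueryString": True,                         # forward all query strings
                "Headers": {"Quantity": 1, "Items": ["*"]},  # forward all headers
                "Cookies": {"Forward": "all"},               # assumption, not from the post
            },
        },
    }


config = build_distribution_config(
    origin_domain="my-load-balancer.example.com",  # placeholder origin
    aliases=["www.example.com"],
    caller_reference="example-2017-01",
    origin_protocol_policy="https-only",
)
```

Keeping the origin protocol policy as an argument makes the choice between HTTP and HTTPS to the origin an explicit decision rather than a default hidden in the sketch.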

Configure Route 53

When you created a web distribution in the previous section, CloudFront assigned a domain name to the distribution, such as d111111abcdef8.cloudfront.net. You can use this domain name in the URLs for your content, such as: http://d111111abcdef8.cloudfront.net/logo.jpg.

Alternatively, you might prefer to use your own domain name in URLs, such as: http://example.com/logo.jpg. You can accomplish this by creating a Route 53 alias resource record set that routes dynamic web application traffic to your CloudFront distribution by using your domain name. Alias resource record sets are virtual records specific to Route 53 that map your domain name to your CloudFront distribution. Alias resource record sets are similar to CNAME records except there is no charge for DNS queries to Route 53 alias resource record sets mapped to AWS services. Alias resource record sets are also not visible to resolvers, and they can be created for the root domain (zone apex) as well as subdomains.

A hosted zone, similar to a DNS zone file, is a collection of records that belongs to a single parent domain name. Each hosted zone has four nonoverlapping name servers in a delegation set. If a DNS query is dropped, the client automatically retries the next name server. If you have not already done so, complete two prerequisite steps before proceeding: register a domain name, and configure a public hosted zone for that domain.

After you have registered your domain name and configured your public hosted zone, follow these steps to create an alias resource record set:

  1. Sign in to the AWS Management Console and open the Route 53 console.
  2. In the navigation pane, choose Hosted Zones.
  3. Choose the name of the hosted zone for the domain that you want to use to route traffic to your CloudFront distribution.
  4. Choose Create Record Set.
  5. Specify the following values:
    • Name – Type the domain name that you want to use to route traffic to your CloudFront distribution. The default value is the name of the hosted zone. For example, if the name of the hosted zone is example.com and you want to use acme.example.com to route traffic to your distribution, type acme.
    • Type – Choose A – IPv4 address. If IPv6 is enabled for the distribution and you are creating a second resource record set, choose AAAA – IPv6 address.
    • Alias – Choose Yes.
    • Alias Target – In the CloudFront distributions section, choose the name that CloudFront assigned to the distribution when you created it.
    • Routing Policy – Accept the default value of Simple.
    • Evaluate Target Health – Accept the default value of No.
  6. Choose Create.
  7. If IPv6 is enabled for the distribution, repeat Steps 4 through 6. Specify the same settings except for the Type field, as explained in Step 5.
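The same record sets can also be described as data, which is handy if you automate DNS changes. The sketch below builds the kind of change batch that the Route 53 ChangeResourceRecordSets API accepts; it does not call AWS. The helper name is hypothetical, and the domain names are the examples used in this post. The hosted zone ID shown is the fixed value that Route 53 uses for alias targets that point to CloudFront distributions.

```python
# Fixed hosted zone ID used for alias records that target CloudFront.
CLOUDFRONT_HOSTED_ZONE_ID = "Z2FDTNDATAQYW2"


def alias_change_batch(record_name, distribution_domain, ipv6_enabled=False):
    """Build a change batch with an A (and optionally AAAA) alias record."""
    record_types = ["A", "AAAA"] if ipv6_enabled else ["A"]
    return {
        "Comment": "Alias to CloudFront distribution",
        "Changes": [
            {
                "Action": "CREATE",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": record_type,
                    "AliasTarget": {
                        "HostedZoneId": CLOUDFRONT_HOSTED_ZONE_ID,
                        "DNSName": distribution_domain,
                        # Matches the console default of No.
                        "EvaluateTargetHealth": False,
                    },
                },
            }
            for record_type in record_types
        ],
    }


batch = alias_change_batch(
    "acme.example.com",
    "d111111abcdef8.cloudfront.net",
    ipv6_enabled=True,
)
```

Alias records do not carry a TTL or a list of values, which is why the record set contains only a name, a type, and the alias target.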

The following screenshot of the Route 53 console shows a Route 53 alias resource record set that is configured to map a domain name to a CloudFront distribution.

If your dynamic web application requires geo redundancy, you can use latency-based routing in Route 53 to run origin servers in different AWS regions. Route 53 is integrated with CloudFront to collect latency measurements from each edge location. With Route 53 latency-based routing, each CloudFront edge location goes to the region with the lowest latency for the origin fetch.

Enable AWS WAF

AWS WAF is a web application firewall that helps detect and mitigate web application layer DDoS attacks by inspecting traffic inline. Application layer DDoS attacks use well-formed but malicious requests to evade mitigation and consume application resources. You can define custom security rules (also called web ACLs) that contain a set of conditions, rules, and actions to block attacking traffic. After you define web ACLs, you can apply them to CloudFront distributions, and web ACLs are evaluated in the priority order you specified when you configured them. Real-time metrics and sampled web requests are provided for each web ACL.
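The priority-ordered evaluation described above can be pictured with a small model. This is a simplified sketch, not the AWS WAF engine or its API: the `Rule` class, the rule names, and the request dictionary are all hypothetical. It assumes that the first matching allow or block rule decides the outcome, that a matching count rule is recorded and evaluation continues, and that a default action applies when nothing matches.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Callable


@dataclass
class Rule:
    priority: int                     # lower numbers are evaluated first
    name: str
    matches: Callable[[dict], bool]   # condition applied to a request
    action: str                       # "ALLOW", "BLOCK", or "COUNT"


def evaluate(rules, request, default_action="ALLOW"):
    """Return (final_action, names_of_count_rules_that_matched)."""
    counted = []
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(request):
            if rule.action == "COUNT":
                counted.append(rule.name)  # record the match and keep going
                continue
            return rule.action, counted    # ALLOW or BLOCK ends evaluation
    return default_action, counted


rules = [
    Rule(1, "count-probe", lambda r: "probe" in r["query"], "COUNT"),
    Rule(2, "block-test-net",
         lambda r: ip_address(r["ip"]) in ip_network("192.0.2.0/24"), "BLOCK"),
]

blocked = evaluate(rules, {"ip": "192.0.2.44", "query": "q=probe"})
allowed = evaluate(rules, {"ip": "198.51.100.7", "query": ""})
```

In this model, counting is a way to observe how often a condition would fire before you commit to blocking on it.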

You can configure AWS WAF whitelisting or blacklisting in conjunction with CloudFront geo restriction to prevent users in specific geographic locations from accessing your application. The AWS WAF API supports security automation such as blacklisting IP addresses that exceed request limits, which can be useful for mitigating HTTP flood attacks. Use the AWS WAF Security Automations Implementation Guide to implement rate-based blacklisting.

The following diagram shows how the (a) flow of CloudFront access logs files to an Amazon S3 bucket (b) provides the source data for the Lambda log parser function (c) to identify HTTP flood traffic and update AWS WAF web ACLs. As CloudFront receives requests on behalf of your dynamic web application, it sends access logs to an S3 bucket, triggering the Lambda log parser. The Lambda function parses CloudFront access logs to identify suspicious behavior, such as an unusual number of requests or errors, and it automatically updates your AWS WAF rules to block subsequent requests from the IP addresses in question for a predefined amount of time that you specify.

Diagram of the process
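The core of the log parser step can be sketched in a few lines of Python. This is an illustration of the idea, not the code from the AWS WAF Security Automations solution. It assumes the standard tab-separated CloudFront access log layout, in which lines beginning with # are headers and the client IP address (c-ip) is the fifth field. The function name and the request limit are hypothetical.

```python
from collections import Counter


def find_flood_ips(log_lines, request_limit):
    """Return the client IPs that appear more than request_limit times."""
    counts = Counter()
    for line in log_lines:
        if not line or line.startswith("#"):
            continue  # skip blank lines and the #Version / #Fields headers
        fields = line.split("\t")
        if len(fields) > 4:
            counts[fields[4]] += 1  # c-ip is the fifth tab-separated field
    return {ip for ip, seen in counts.items() if seen > request_limit}


def make_line(client_ip, path="/login", status="200"):
    """Build a shortened, made-up log line for the example below."""
    return "\t".join([
        "2017-03-01", "12:00:00", "SEA32", "512", client_ip,
        "GET", "d111111abcdef8.cloudfront.net", path, status,
    ])


sample_log = ["#Version: 1.0", "#Fields: date time x-edge-location sc-bytes c-ip"]
sample_log += [make_line("192.0.2.44") for _ in range(5)]
sample_log += [make_line("198.51.100.7") for _ in range(2)]

suspects = find_flood_ips(sample_log, request_limit=3)
```

A real parser would also look at error rates and would add the resulting addresses to an AWS WAF IP match condition for a limited time, as the paragraph above describes.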

In addition to automated rate-based blacklisting to help protect against HTTP flood attacks, prebuilt AWS CloudFormation templates are available to simplify the configuration of AWS WAF for a proactive application-layer security defense. The following diagram provides an overview of CloudFormation template input into the creation of the CommonAttackProtection stack that includes AWS WAF web ACLs used to block, allow, or count requests that meet the criteria defined in each rule.

Diagram of CloudFormation template input into the creation of the CommonAttackProtection stack

To implement these application layer protections, follow the steps in Tutorial: Quickly Setting Up AWS WAF Protection Against Common Attacks. After you have created your AWS WAF web ACLs, you can assign them to your CloudFront distribution by updating the settings.

  1. Sign in to the AWS Management Console and open the CloudFront console.
  2. Choose the link under the ID column for your CloudFront distribution.
  3. Choose Edit under the General tab.
  4. Choose your AWS WAF Web ACL from the drop-down list.
  5. Choose Yes, Edit.

Activate AWS Shield Advanced (optional)

Deploying CloudFront, Route 53, and AWS WAF as described in this post enables the built-in DDoS protections for your dynamic web applications that are included with AWS Shield Standard. (There is no upfront cost or charge for AWS Shield Standard beyond the normal pricing for CloudFront, Route 53, and AWS WAF.) AWS Shield Standard is designed to meet the needs of many dynamic web applications.

For dynamic web applications that have a high risk or history of frequent, complex, or high volume DDoS attacks, AWS Shield Advanced provides additional DDoS mitigation capacity, attack visibility, cost protection, and access to the AWS DDoS Response Team (DRT). For more information about AWS Shield Advanced pricing, see AWS Shield Advanced pricing. To activate advanced protection services, follow these steps:

  1. Sign in to the AWS Management Console and open the AWS WAF console.
  2. If this is your first time signing in to the AWS WAF console, choose Get started with AWS Shield Advanced. Otherwise, choose Protected resources.
  3. Choose Activate AWS Shield Advanced.
  4. Choose the resource type and resource to protect.
  5. For Name, enter a friendly name that will help you identify the AWS resources that are protected. For example, My CloudFront AWS Shield Advanced distributions.
  6. (Optional) For Web DDoS attack, select Enable. You will be prompted to associate an existing web ACL with these resources, or create a new ACL if you don’t have any yet.
  7. Choose Add DDoS protection.

Summary

In this blog post, I outline the steps to deploy CloudFront and configure Route 53 in front of your dynamic web application to leverage the global Amazon network of edge locations for DDoS resiliency. The post also provides guidance about enabling AWS WAF for application layer traffic monitoring and automated rules creation to block malicious traffic. I also cover the optional steps to activate AWS Shield Advanced, which helps build a more comprehensive defense against DDoS attacks for your dynamic web applications.

If you have comments about this post, submit them in the “Comments” section below. If you have questions about or issues implementing this solution, please open a new thread on the AWS WAF forum.

– Holly

Backup and Restore Time Machine using Synology and the B2 Cloud

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/time-machine-synology-b2-backup-restore/

B2 Cloud Storage, Time Machine, and Synology NAS
Have you ever wished that you could have Time Machine, your Synology NAS, and B2 Cloud Storage work together to automatically back up your Mac locally and to the cloud? That would be cool. Of course, you’d also want to be able to restore your Time Machine backup from your Synology NAS or the B2 cloud. And while you’re wishing, it would be great if you could have an encrypted USB Hard Drive show up at your doorstep with your Time Machine backup. Stop wishing! You can do all that today. Here’s how.

Overview

Apple’s Time Machine app, included with every Mac, creates automatic backups of your Mac computer. Typically, these backups are stored on a local external hard drive. Time Machine backups can also be stored on other devices such as a Network Attached Storage (NAS) system on your network. If your computer crashes or you get a new computer, you can restore your data from the Time Machine backup.

We advocate a “3-2-1” backup strategy that combines local storage like a Time Machine backup with offsite backup to provide an additional layer of security and redundancy. That’s 3 copies of your data: 2 local (your “live” version and your Time Machine backup), and 1 offsite. If something happens to your computer or your NAS – if they’re stolen, or if some sort of disaster strikes – you can still count on your cloud backup to keep you safe.

You can use Backblaze to back up your computer to the cloud and use Time Machine to create a local backup. In fact, many of our customers do exactly that. But there’s another way to approach this that’s more efficient: Make a copy of the Time Machine backup and send it offsite automatically.

A Streamlined 3-2-1 Backup Plan

diagram of automatic backup of your Mac locally and to the cloud

The idea is simple: Have Time Machine store its backup on your Synology NAS device, then sync the Time Machine backup from the Synology NAS to Backblaze B2 Cloud Storage. Once this is set up, the 3-2-1 backup process occurs automatically and your files are stored locally and off-site.

We’ve prepared a guide titled “How to backup your Time Machine backup to Synology and B2” in the Backblaze Knowledge Base to help you with the setup of Time Machine, Synology, and Backblaze B2. Please read through the instructions before starting the actual installation.

Restoring Your Time Machine Backup

The greatest backup process in the world is of little value if you can’t restore your data. With your Time Machine backup now stored on your Synology NAS and in B2, you have multiple ways to restore your files.

Day-to-day Restores

From time to time you may need to restore a file or two from your local backup, in this case, your Time Machine backup stored on your Synology NAS. This works just like having your Time Machine backup stored on a locally connected external hard drive:

  • On the Mac menu bar (top right) locate and click on the Time Machine icon.
  • Select “Enter Time Machine”.
  • Locate the file or files you wish to restore.
  • Click “Restore” to restore the selected file(s).

The only thing to remember is that your Synology NAS device needs to be accessible via your network to access the Time Machine backup.

Full Restores

Most often you would do a full restore of your Time Machine backup if you are replacing your computer or the hard/SSD drive inside.

Method 1: Restore from the Synology NAS device

The most straight-forward method is to restore the Time Machine backup directly from the Synology NAS device. You can restore your entire Time Machine backup to your new or reformatted Mac by having Apple’s Migration Assistant app use the Time Machine backup stored on the Synology NAS as the restore source. The Migration Assistant app is included with your Mac.

Of course, in the case of a disaster or theft, the Synology NAS may suffer the same fate as your Mac. In that case, you’ll want to restore your Time Machine backup from Backblaze B2. Here’s how.

Method 2: Restore a Time Machine Backup from B2 via a USB Hard Drive

The second method is to prepare a B2 snapshot of your Time Machine backup and then have the snapshot copied to a USB hard drive you purchase from Backblaze. Think of a snapshot as a container that holds a copy of the files you wish to download. Instead of downloading each file individually, you create a snapshot of the files and download one item, the snapshot. In this case, you create the snapshot of your Time Machine backup, and we copy the snapshot to the hard drive and FedEx it to you. You then use the USB Hard Drive as a restore source when using Migration Assistant.

We’ve prepared a guide titled, “How to restore your Time Machine backup from B2” in the Backblaze Knowledge Base to walk you through the process of restoring your Time Machine backup from Backblaze B2 using an encrypted USB Hard Drive.

Method 3: Restore a Time Machine Backup from B2 via Download

When using this method, give consideration to the size of the Time Machine backup. It is not uncommon for this file to be several hundred gigabytes or even a terabyte or two. Even with a reasonably fast network connection, downloading such a large file can take a considerable amount of time.

Prepare a snapshot of your Time Machine backup from B2 and download it to your “new” Mac. After you “unzip” the file you can use Migration Assistant on your new Mac to restore the Time Machine backup using the unzipped file as the restore source.
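To get a feel for how long a download might take before you choose between Method 2 and Method 3, a rough estimate is enough. The sketch below is simple arithmetic, not a Backblaze tool; the function name and the efficiency factor are assumptions, and real-world speeds vary with your connection and network conditions.

```python
def download_hours(size_gb, link_mbps, efficiency=0.8):
    """Estimate hours needed to download size_gb over a link_mbps connection.

    efficiency is an assumed fraction of the advertised speed that you
    actually get; 1.0 means the full line rate.
    """
    total_bits = size_gb * 8 * 1000**3                 # decimal gigabytes to bits
    bits_per_second = link_mbps * 1_000_000 * efficiency
    return total_bits / bits_per_second / 3600


snapshot_hours = download_hours(500, 100, efficiency=1.0)
print(f"{snapshot_hours:.1f} hours")
```

A 500 GB snapshot over a 100 Mbps connection at the full line rate works out to 40,000 seconds, or about 11 hours, and a terabyte or two takes proportionally longer.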

Summary

As we noted earlier, you can use Backblaze Computer Backup to back up your computer to the cloud and use Time Machine to create a local backup. That works fine, but if you are using a Synology NAS device in your environment, the 3-2-1 strategy discussed above gives you another option. In that case, all of the Time Machine backups in your home or office can reside on the Synology NAS. Then you don’t need an external drive to store the Time Machine backup for each computer and all of the Time Machine backups can sync automatically to Backblaze B2 Cloud Storage.

In summary, if you have a Mac, a Synology NAS, and a Backblaze B2 account you can have an automatic 3-2-1 Time Machine backup of the files on your computer. You don’t have to drag and drop files into backup folders, remember to hit the “backup now” button, or hoard backup external USB drives in your closet. Enjoy automatic, continuous backup, locally and in the cloud. 3-2-1 backup has never been so easy.

The post Backup and Restore Time Machine using Synology and the B2 Cloud appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

The Cloud’s Software: A Look Inside Backblaze

Post Syndicated from Peter Cohen original https://www.backblaze.com/blog/the-clouds-software-a-look-inside-backblaze/

When most of us think about “the cloud,” we have an abstract idea that it’s computers in a data center somewhere – racks of blinking lights and lots of loud fans. There’s truth to that. Have a look inside our datacenter to get an idea. But besides the impressive hardware – and the skilled techs needed to keep it running – there’s software involved. Let’s take a look at a few of the software tools that keep our operation working.

Our data center is populated with Storage Pods, the servers that hold the data you entrust to us if you’re a Backblaze customer or you use B2 Cloud Storage. Inside each Storage Pod are dozens of 3.5-inch spinning hard disk drives – the same kind you’ll find inside a desktop PC. Storage Pods are mounted on racks inside the data center. Those Storage Pods work together in Vaults.

Vault Software

The Vault software that keeps those Storage Pods humming is one of the backbones of our operation. It’s what makes it possible for us to scale our services to meet your needs with durability, scalability, and fast performance.

The Vault software distributes data across 20 different Storage Pods, with the data spread evenly across all 20 pods. Drives in the same position inside each Storage Pod are grouped together in software in what we call a “tome.” When a file gets uploaded to Backblaze, it’s split into pieces we call “shards” and distributed across all 20 drives.

Each file is stored as 20 shards: 17 data shards and three parity shards. As the name implies, the data shards comprise the information in the files you upload to Backblaze. Parity shards add redundancy so that a file can be completely restored from a Vault even if some of the pieces are not available.

Because those shards are distributed across 20 Storage Pods in 20 cabinets, a Storage Pod can go down and the Vault will still operate unimpeded. An entire cabinet can lose power and the Vault will still work fine.

Files can be written to the Vault even if a Storage Pod is down, with two parity shards still protecting the data. Even in the extreme, and unlikely, case where three Storage Pods in a Vault are offline, the files in the vault are still available because they can be reconstructed from the 17 available pieces.

Reed-Solomon Erasure Coding

Erasure coding makes it possible to rebuild a data file even if parts of the original are lost. Having effective erasure coding is vital in a distributed environment like a Backblaze Vault. It helps us keep your data safe even when the hardware that the data is stored on needs to be serviced.

We use Reed-Solomon erasure encoding. It’s a proven technique used in Linux RAID systems, by Microsoft in its Azure cloud storage, and by Facebook too. The Backblaze Vault Architecture is capable of delivering 99.99999% annual durability thanks in part to our Reed-Solomon erasure coding implementation.
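To show why any 17 of 20 shards are enough, here is a toy sketch of the idea behind Reed-Solomon coding. It is not our implementation. Production libraries typically do their arithmetic in the finite field GF(2^8) with matrix operations; this sketch swaps in a simpler approach, Lagrange interpolation modulo the prime 257, and treats each shard as a single number rather than a block of bytes. All names in it are illustrative.

```python
PRIME = 257            # small prime field; a stand-in for GF(2^8)
DATA_SHARDS = 17
PARITY_SHARDS = 3
TOTAL_SHARDS = DATA_SHARDS + PARITY_SHARDS


def interpolate_at(points, x):
    """Evaluate at x the unique polynomial through points, modulo PRIME."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        numerator, denominator = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                numerator = numerator * (x - xj) % PRIME
                denominator = denominator * (xi - xj) % PRIME
        # pow(d, -1, PRIME) is the modular inverse (Python 3.8 or later).
        total = (total + yi * numerator * pow(denominator, -1, PRIME)) % PRIME
    return total


def encode(data):
    """Append 3 parity values to 17 data values (each in 0..255)."""
    if len(data) != DATA_SHARDS:
        raise ValueError("expected exactly 17 data values")
    points = list(enumerate(data))
    parity = [interpolate_at(points, x) for x in range(DATA_SHARDS, TOTAL_SHARDS)]
    return list(data) + parity


def recover(surviving):
    """Rebuild the 17 data values from any 17 surviving {index: value} shards."""
    if len(surviving) < DATA_SHARDS:
        raise ValueError("need at least 17 shards to recover the data")
    points = sorted(surviving.items())[:DATA_SHARDS]
    return [interpolate_at(points, x) for x in range(DATA_SHARDS)]


data = list(b"seventeen bytes!!")   # exactly 17 byte values
shards = encode(data)

# Simulate three Storage Pods going offline by dropping three shards.
surviving = {i: s for i, s in enumerate(shards) if i not in (2, 9, 18)}
restored = recover(surviving)
```

The 17 data values define a polynomial of degree at most 16, and any 17 points on that polynomial pin it down completely, which is why it does not matter which three shards go missing.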

Here’s our own Brian Beach with an explanation of how Reed-Solomon encoding works:

We threw out the Linux RAID software we had been using prior to the implementation of the Vaults and wrote our own Reed-Solomon implementation from scratch. We’re very proud of it. So much so that we’ve released it as open source that you can use in your own projects, if you wish.

We developed our Reed-Solomon implementation as a Java library. Why? When we first started this project, we assumed that we would need to write it in C to make it run as fast as we needed. It turns out that modern Java virtual machines working on our servers are great, and just-in-time compilers produce code that runs pretty quickly.

All the work we’ve done to build a reliable, scalable, affordable solution for storing data in a “cloud” led to the creation of B2 Cloud Storage. B2 lets you store your data in the cloud for a fraction of what you’d spend elsewhere – 1/4 the price of Amazon S3, for example.

Using Our Storage

Having over 300 Petabytes of data storage available isn’t very useful unless we can store data and reliably restore it too. We offer two ways to store data with Backblaze: via a client application or via direct access. Our client application, Backblaze Computer Backup, is installed on your Mac or Windows system and basically does everything related to automatically backing up your computer. We locate the files that are new or changed and back them up. We manage versions, deduplicate files, and more. The Backblaze app does all the work behind the scenes.

The other way to use our storage is via direct access. You can use a Web GUI, a Command Line Interface (CLI) or an Application Programming Interface (API). With any of these methods, you are in charge of what gets stored in the Backblaze cloud. This is what Backblaze B2 is all about. You can log into B2 and use the Web GUI to drag and drop files that are stored in the Backblaze cloud. You decide what gets added and deleted, and how many versions of a file you want to keep. Think of B2 as your very own bucket in the cloud where you can store your files.

We also have mobile apps for iOS and Android devices to help you view and share any backed up files you have on the go. You can download them, play back or view media files, and share them as you need.

We focused on creating a native, integrated experience for you when you use our software. We didn’t take a shortcut to create a Java app for the desktop. On the Mac our app is built using Xcode and on the PC it was built using C. The app is designed for lightweight, unobtrusive performance. If you do need to adjust its performance, we give you that ability. You have control over throttling the backup rate. You can even adjust the number of CPU threads dedicated to Backblaze, if you choose.

When we first released the software almost a decade ago we had no idea that we’d iterate it more than 1,000 times. That’s the threshold we reached late last year, however! We released version 4.3.0 in December. We’re still plugging away at it and have plans for the future, too.

Our Philosophy: Keep It Simple

“Keep It Simple” is the philosophy that underlies all of the technology that powers our hardware. It makes it possible for you to affordably, reliably back up your computers and store data in the cloud.

We’re not interested in creating elaborate, difficult-to-implement solutions or pricing schemes that confuse and confound you. Our backup service is unlimited and unthrottled for one low price. We offer cloud storage for 1/4th the price of the competition. And we make it easy to access with desktop, mobile and web interfaces, command line tools and APIs.

Hopefully we’ve shed some light on the software that lets our cloud services operate. Have questions? Join the discussion and let us know.

The post The Cloud’s Software: A Look Inside Backblaze appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

What’s the Diff: Hot and Cold Data Storage

Post Syndicated from Peter Cohen original https://www.backblaze.com/blog/whats-the-diff-hot-and-cold-data-storage/

Hot And Cold Storage

Differentiating cloud data storage by “temperature” is common practice when it comes to describing the tiered storage setups offered by various cloud storage providers. “Hot” and “cold” describe how often that data is accessed. What’s the actual difference, and how does each temperature fit your cloud storage strategy? Let’s take a look.

First of all, let’s get this out of the way: There’s no set industry definition of what hot and cold actually mean. So some of this may need to be adapted to your specific circumstances. You’re bound to see some variance or disagreement if you research the topic.

Hot Storage

“Hot” storage is data you need to access right away, where performance is at a premium. Hot storage often goes hand in hand with cloud computing. If you’re depending on cloud services not only to store your data but also to process it, you’re looking at hot storage.

Business-critical information that needs to be accessed frequently and quickly is hot storage. If performance is of the essence – if you need the data stored on SSDs instead of hard drives, because speed is that much of a factor – then that’s hot storage.

High-performance primary storage comes at a price, though. Cloud data storage providers charge a premium for hot data storage, because it’s resource-intensive. Microsoft’s Azure Hot Blobs and Amazon AWS services don’t come cheap.

Read on for how our B2 Cloud Storage fits the hot storage model. But first, let’s talk about cold storage.

Cold Storage

“Cold” storage is information that you don’t need to access very often. Inactive data that doesn’t need to be accessed for months, years, decades, or potentially ever is the sort of content that cold storage is ideal for. Practical examples of data suitable for cold storage include old projects, records you might need for auditing or bookkeeping purposes at some point in the future, or other content you only need to access infrequently.

Data retrieval and response time for cold cloud storage systems are typically slower than services designed for active data manipulation. Practical examples of cold cloud storage include services like Amazon Glacier and Google Coldline.

Storage prices for cold cloud storage systems are typically lower than for warm or hot storage. But cold storage often incurs higher per-operation costs than other kinds of cloud storage. Access to the data typically requires patience and planning.
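To make that tradeoff concrete, here is a minimal sketch of the arithmetic. The prices and the `Tier` structure are hypothetical, chosen only for illustration; they are not any provider's actual rates. The idea is that a cold tier wins as long as you retrieve less than a break-even amount of data each month.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    storage_per_gb: float    # dollars per GB stored per month (hypothetical)
    retrieval_per_gb: float  # dollars per GB read back (hypothetical)


def monthly_cost(tier: Tier, stored_gb: float, retrieved_gb: float) -> float:
    """Storage charge plus retrieval charge for one month."""
    return tier.storage_per_gb * stored_gb + tier.retrieval_per_gb * retrieved_gb


def break_even_retrieval_gb(hot: Tier, cold: Tier, stored_gb: float) -> float:
    """Monthly retrieval volume at which both tiers cost the same.

    Assumes the cold tier is cheaper to store and more expensive to read.
    """
    storage_saving = (hot.storage_per_gb - cold.storage_per_gb) * stored_gb
    extra_per_gb_read = cold.retrieval_per_gb - hot.retrieval_per_gb
    return storage_saving / extra_per_gb_read


# Made-up prices for illustration only.
hot = Tier("hot", storage_per_gb=0.020, retrieval_per_gb=0.00)
cold = Tier("cold", storage_per_gb=0.004, retrieval_per_gb=0.02)

if __name__ == "__main__":
    stored = 1000  # GB kept in storage
    for retrieved in (0, 100, 1000):
        print(retrieved,
              round(monthly_cost(hot, stored, retrieved), 2),
              round(monthly_cost(cold, stored, retrieved), 2))
    print("break-even GB/month:", round(break_even_retrieval_gb(hot, cold, stored)))
```

With these made-up numbers, the cold tier is cheaper until you read back roughly 800 GB a month from a 1,000 GB archive. Plug in real quotes from the providers you are comparing before drawing conclusions.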

Apocryphally, “cold” storage meant just that: Data physically stored away from the hot machines running the media. Today, cold storage is still sometimes used to describe purely offline storage – that is, data that’s not stored in the cloud at all. Sometimes this is data that you might want to quarantine from the Internet altogether – for example, cryptocurrency like Bitcoin. Sometimes this is that old definition of cold storage: data that is archived on some sort of durable medium and stored in a secure offsite facility.

How B2 Cloud Storage Fits the Cold and Hot Model

We’ve designed B2 Cloud Storage to be instantly available. With B2, you won’t have delays accessing your information like you might have with offline or some nearline systems. Your data is available when you need it.

B2 is built on the physical architecture and advanced software framework we’ve been developing for the past decade to power our signature backup services. B2 Cloud Storage sports multiple layers of redundancy to make sure that your data is stored safely and is available when you need it.

We’ve taken the concept of hot storage a step further by offering reliable, affordable, and scalable storage in the cloud for a mere fraction of what others charge. We’re one-quarter the price of Amazon.

B2 Cloud Storage changes the pricing model for cloud storage so much that our customers have found it economical to migrate away altogether from slow, inconvenient and frustrating cold storage and offline archival systems. Our media and entertainment customers are using B2 instead of LTO tape systems, for example.

What Temperature Is Your Cloud Storage?

Different organizations have different needs, so there’s no right answer about what temperature your cloud data should be. It’s imperative to your bottom line that you don’t pay for more than what you need. That’s why we’ve designed B2 to be an affordable and reliable cloud storage solution. Get started today and you’ll get the first 10GB for free!

Have a different idea of what hot and cold storage are? Have questions that aren’t answered here? Join the discussion!

The post What’s the Diff: Hot and Cold Data Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Four Tips To Help Photographers and Videographers Get The Most From B2

Post Syndicated from Peter Cohen original https://www.backblaze.com/blog/four-b2-tips-photographers-videographers/

B2 for Photographers and Videographers

Photographers and videographers regularly push the limit of data storage and archiving solutions, especially as camera makers constantly increase megapixel sensor density. Network Attached Storage (NAS) systems are indispensable to help store these large amounts of data, whether it’s 20-50 megapixel stills or 1080p, 4K or 8K video footage. Data creep is inevitable. What can you do to get the most out of your NAS and future backup strategy? Enter B2 Cloud Storage.

B2 Cloud Storage can help you make sure your archived photos and videos are safe for as long as you need them in a secure offsite location. B2 is reliable cloud storage available for a fraction of the price of other cloud storage services: One-quarter what you’d pay Amazon. It’s easy to use thanks to a powerful web GUI, an open API and even a CLI.

Here are some tips to get the most out of B2.

1. B2 Integrates With Popular NAS Systems

Synology is one of the most popular vendors of NAS systems currently in use in small to medium-sized businesses (SMBs) like many photography and video businesses. If you’re currently using a Synology NAS, you can begin backing up and syncing to B2 right away. Synology’s Cloud Sync app – available as part of Synology’s DiskStation Manager (DSM) software – supports B2.

CloudBerry is an enormously popular app for backing up Windows Server systems, and it supports B2. CloudBerry makes a version of its software to support Synology and makes apps for other platforms too.

2. A Complete Backup Strategy Will Save Your Bacon

“Don’t put all your eggs in one basket.” At this point, if you’re like many of us, you may not even think about backing up your NAS because it has built-in redundancy. If one drive dies, you can rebuild the NAS by replacing it.

But that doesn’t tell the whole story. What happens if the NAS hardware itself bites the bullet? What happens if there’s a natural disaster like a flood, fire or other calamity that claims the NAS?

To that end, it’s important to avoid relying on any single system, because that makes you dependent on a single point of failure. Make sure your NAS is backed up, and back it up locally before you back it up to the cloud (it’s what we call the 3-2-1 Backup Strategy and it’s worked pretty well for us and our customers over the years).
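As the 3-2-1 rule is commonly stated, you keep at least three copies of your data, on at least two different kinds of media, with at least one copy offsite. A quick way to sanity-check your own setup is sketched below. The `BackupCopy` type and the example labels are hypothetical, made up for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupCopy:
    label: str     # human-readable name for this copy (hypothetical)
    medium: str    # e.g. "nas", "external-drive", "cloud"
    offsite: bool  # True if the copy lives away from your primary location


def satisfies_3_2_1(copies: List[BackupCopy]) -> bool:
    """Check the three conditions of the 3-2-1 rule as commonly stated."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    nas_only = [BackupCopy("studio NAS", "nas", offsite=False)]
    full_plan = [
        BackupCopy("studio NAS", "nas", offsite=False),
        BackupCopy("local backup drive", "external-drive", offsite=False),
        BackupCopy("cloud bucket", "cloud", offsite=True),
    ]
    print(satisfies_3_2_1(nas_only))
    print(satisfies_3_2_1(full_plan))
```

A NAS on its own fails the check no matter how many drives it holds, because its internal redundancy still counts as a single copy in a single place.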

3. Store What You Need Online

Taking a hybrid approach to data archival and storage can be a smart way to spread the risk and provide alternate access to your work when in a pinch. Hybrid cloud storage offloads some of what you’ve archived to the cloud. You can leave what you need or what you think you might need immediately in local storage. Offload what you don’t need right away to the cloud.

Cloud access can be a time (and life) saver when you unexpectedly need to access archived projects. You don’t have to hunt for optical media such as CDs or DVDs. And if you’re storing archived content offsite, factor in the cost and time needed to deliver such media. Using a B2 cloud repository simplifies and speeds the process greatly. You’ll spend less time finding and restoring projects and more time getting work done.

Worried about backups and archives taking up huge amounts of cloud storage space? Don’t be. B2 supports Lifecycle Rules to make it easier to automatically hide and delete older versions of files. Using B2’s powerful web interface, you can specify whether to keep all versions of a file, keep only the last version of a file, keep prior versions for a specific number of days, or apply other criteria you specify. Lifecycle Rules can be applied to any bucket you create. (A bucket is a B2 file repository, the topmost organizational structure of data stored on B2.)
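To get a feel for what those options mean for your storage footprint, here is a simplified local model of the three choices. It is a sketch under stated assumptions, not B2's actual implementation: the function name and the `keep_prior_days` parameter are hypothetical, and the model assumes a prior version becomes eligible for deletion a fixed number of days after a newer version replaces it.

```python
from datetime import date
from typing import List, Optional


def versions_to_keep(upload_dates: List[date],
                     today: date,
                     keep_prior_days: Optional[int]) -> List[date]:
    """Return the upload dates of the versions that survive the policy.

    keep_prior_days=None  -> keep all versions
    keep_prior_days=0     -> keep only the last version
    keep_prior_days=N     -> keep a prior version for N days after it
                             was replaced by a newer upload
    """
    ordered = sorted(upload_dates)
    if not ordered or keep_prior_days is None:
        return ordered

    kept = []
    # Pair each version with the upload that replaced it.
    for older, newer in zip(ordered, ordered[1:]):
        days_since_replaced = (today - newer).days
        if days_since_replaced < keep_prior_days:
            kept.append(older)
    kept.append(ordered[-1])  # the latest version is always kept
    return kept


if __name__ == "__main__":
    uploads = [date(2017, 1, 1), date(2017, 3, 1), date(2017, 6, 20)]
    today = date(2017, 6, 30)
    print(versions_to_keep(uploads, today, None))  # keep everything
    print(versions_to_keep(uploads, today, 0))     # latest only
    print(versions_to_keep(uploads, today, 30))    # prior versions for 30 days
```

In the 30-day case, the January upload is dropped because it was replaced months ago, while the March upload survives because it was replaced only ten days before `today`.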

If you’re interested in a pre-built hybrid cloud solution that works with B2, check out OpenIO.

4. Use B2 To Share Files

With B2, you can create public content to share with others with a web-friendly URL. You can share proofs, rough cuts or other content you’d like to make available to your clients. That means you don’t have to host it locally. So you’re not consuming your own network bandwidth and you aren’t compromising the security of your network to outside users.

Safely contain what you want to share in a public or private bucket. B2 supports your ability to manage content sharing as you see fit. You can change buckets from public to private with a single click from our web interface.
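If you hand out links to files in a public bucket, file names with spaces or other special characters need to be percent-encoded. The sketch below assumes the shareable URL has the form `<download host>/file/<bucket name>/<file name>`; the host and bucket shown are placeholders, so use the download URL your own account reports.

```python
from urllib.parse import quote


def public_file_url(download_host: str, bucket: str, file_name: str) -> str:
    """Build a shareable link for a file in a public bucket.

    Assumes the URL layout <host>/file/<bucket>/<file name>.
    Slashes in the file name are kept so folder-style names still work.
    """
    host = download_host.rstrip("/")
    return f"{host}/file/{quote(bucket, safe='')}/{quote(file_name, safe='/')}"


if __name__ == "__main__":
    # Placeholder host and bucket, for illustration only.
    print(public_file_url("https://download.example.com",
                          "client-proofs",
                          "smith wedding/IMG 0001.jpg"))
    # https://download.example.com/file/client-proofs/smith%20wedding/IMG%200001.jpg
```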

Our web interface is an easy way to upload content to share with others. We also support many third-party apps including Cyberduck and DropShare. More details are available in our help section: How Can I Upload to B2.

If you’re concerned about overrunning your cloud storage budget, take comfort that B2 provides you with data cap and alert management features, so you’ll never be hit with a surprise bill.

Hopefully we’ve given you some ideas of how you can integrate B2 into your own workflow to help ease your archiving burden and make it easier for you to share files securely and safely. Are you a photo or video pro using B2? What are your biggest data storage challenges? Let us know in the comments, and feel free to share other tips and techniques!

The post Four Tips To Help Photographers and Videographers Get The Most From B2 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.