The alliance is expected to bring together representatives from across the engineering community, including academia, industry, and professional societies, to define and prioritize high-impact fundamental research addressing national, global, and societal needs. ERVA says it will recommend to the NSF research topics it believes the agency ought to fund.
IEEE is one of more than a dozen professional engineering societies that have joined as affiliate partners. The organizations plan to participate in the alliance’s events and have the opportunity to contribute ideas.
NSF hopes the initiative will spur advances.
“When engineers come together behind a big challenge, we create amazing discoveries and innovations that can lead to exciting new fields,” Dawn Tilbury, NSF assistant director for engineering and an IEEE Fellow, said in a news release announcing the alliance.
A UNIFIED VOICE
IEEE Fellow Barry W. Johnson is the alliance’s founding executive director and its co-principal investigator. He says that what is unique about ERVA is that it is bringing in the entire engineering community: not just electrical engineers but also mechanical engineers, civil engineers, biomedical engineers, and others.
Rather than having 20 or 30 professional societies speaking on behalf of a technology individually, he says, “we want ERVA to communicate with a consistent, unified voice.”
What is most unusual about ERVA, Johnson says, is its strong emphasis on the participants’ diversity, including their different geographic areas, disciplines, and career stages.
“We’ll then supplement them with individuals from the science communities so that we get a true, multidisciplinary group,” he says. “We believe that the future is going to reside in multidisciplinary activities.”
Johnson says the NSF traditionally has focused on ideas “bubbling up from the research community.” An individual or organization would submit a proposal to the foundation, which would then vet it for funding. Another method the foundation has used to identify research topics is what Johnson calls visioning sessions. They might include workshops in which participants, including NSF program directors, identify new and emerging topics within a technical area before they become commonplace—for example, quantum computing.
ERVA’s process will begin with surveys of the research community to help identify potential research themes, Johnson says. The process is likely to include the use of research intelligence based on analyses of publication and patent databases.
Once a potential theme is identified, a task force of eight to 12 experts will conduct the visioning process and issue a report to ERVA’s leadership that includes recommendations, Johnson says. Once the report is finalized, he adds, it will be shared broadly with the engineering community, including university professors and researchers at companies that might want to get involved.
“What NSF really wants to accomplish is to be proactive in identifying new and emerging areas so that it achieves its vision to be the global leader in research and innovation,” Johnson says.
ERVA’s principal investigator is Dorota A. Grejner-Brzezinska, vice president for knowledge enterprise at Ohio State University. In addition to Johnson, the ERVA co-principal investigators are IEEE Fellows Charles Johnson-Bey and Edl Shamiloglu, and UIDP President and CEO Anthony M. Boccanfuso. An advisory board has been established as well as a standing council, which Johnson calls the “intellectual brain trust of the organization.” It is expected to help identify technologies the alliance should consider.
The three groups met for the first time virtually on 11 June. A video of the meeting is available.
Johnson is looking to hire a full-time executive director to oversee the organization and its full-time staff.
CALL TO ACTION FOR IEEE MEMBERS
Johnson wants IEEE members who are experts in specific technologies to help the alliance with the visioning activity by subscribing to be an ERVA Champion. There are already more than 400. He also calls on IEEE members to provide the alliance with ideas for research themes.
“The key thing about the ERVA is getting ideas from the broad engineering community,” he says, “with IEEE being a critical component.”
Netflix is revolutionizing the way a modern studio operates. Our mission in Studio Engineering is to build a unified, global, and digital studio that powers the effective production of amazing content.
Netflix produces some of the world’s most beloved and award-winning films and series, including The Irishman, The Crown, La Casa de Papel, Ozark, and Tiger King. In an effort to effectively and efficiently produce this content we are looking to improve and automate many areas of the production process. We combine our entertainment knowledge and our technical expertise to provide innovative technical solutions from the initial pitch of an idea to the moment our members hit play.
Why Does Studio Engineering Exist?
The journey of a Netflix Original title, from the moment it first comes to us as a pitch to the press of the play button, is incredibly complex. Producing great content requires a significant amount of coordination and collaboration from Netflix employees and external vendors across the various production phases. This process starts before the deal has been struck and continues all the way through launch on the service, involving people representing finance, scheduling, human resources, facilities, asset delivery, and many other business functions. In this overview, we will shed light on the complexity and magnitude of this journey and update this post with links to deeper technical blogs over time.
Mission at a Glance
Creative pitch: Combine the best of machine learning and human intuition to help Netflix understand how a proposed title compares to other titles, estimate how many subscribers will enjoy it, and decide whether or not to produce it.
Business negotiations: Empower the Netflix Legal team with data to help with deal negotiations and acquisition of rights to produce and stream the content.
Pre-Production: Provide solutions for planning resource needs and for discovering people and vendors, so that we can continue expanding the scale of our productions. Any given production requires the collaboration of hundreds of people with varying expertise, so finding exactly the right people and vendors for each job is essential.
Production: Enable content creation from script to screen that optimizes the production process for efficiency and transparency. Free up creative resources to focus on what’s important: producing amazing and entertaining content.
Post-Production: Help our creative partners collaborate to refine content into their final vision with digital content logistics and orchestration.
Studio Engineering will be publishing a series of articles providing business and technical insights as we further explore the details behind the journey from pitch to play. Stay tuned as we expand on each stage of the content lifecycle over the coming months!
We often talk about “innovation”, and “digital innovation” (or “technical innovation”) in particular, when it comes to tech startups. It has, unfortunately, become a cliché, and now “innovation” is devoid of meaning. I’ve been trying to do some meaningful analysis of the “innovation landscape” and to classify what is being called “innovation”.
And the broad classification I got to is “technical innovation” vs “process innovation”. In the majority of cases, tech startups are actually process innovations. They get existing technology and try to optimize some real world process with it. These processes include “communicating with friends online”, “getting in touch with business contacts online”, “getting a taxi online”, “getting a date online”, “ordering food online”, “uploading photos online”, and so on. There is no inherent technical innovation in any of these – they either introduce new (and better) processes, or they optimize existing ones.
And don’t get me wrong – these are all very useful things. In fact, this is what “digital transformation” means – doing things electronically that were previously done in an analogue way, or doing things that were previously not possible in the analogue world. And the better you imagine or reimagine the process, the more effective your company will be.
In many cases these digital transformation tools have to deal with real-world complexities – legislation, entrenched behaviour, edge cases. E.g. you can easily write food delivery software. You get the order, you notify the store, you optimize the delivery people’s routes to collect and deliver as much food as possible, and you’re good to go. And then you “hit” the real world, where there are traffic jams, temporarily closed streets, restricted parking, unresponsive restaurants, unresponsive customers, online menus that are out of sync with what’s in stock, worsened weather conditions, messed up orders, part-time job regulations that differ by country, and so on. And you have to navigate that maze in order to deliver a digitally transformed food delivery service.
There is nothing technically complex about that – any kid with some PHP and JS knowledge can write the software by finding answers to the programming hurdles on Stack Overflow. In that sense, it is not so technically innovative. The hard part is the processes and the real-world complexities. And of course, turning that into a profitable business.
In the long run, these non-technical innovations end up producing technical innovation. Facebook had nothing interesting on the technical side in the beginning. Then it had millions of users and had to scale, and then it became interesting – how to process so much data, how to scale to multiple parts of the world, how to optimize the storage of so many photos, and so on. Facebook gave us Cassandra, Twitter gave us Snowflake, LinkedIn gave us Kafka. There are many more examples, and it’s great that these companies open source some of their internally developed technologies. But these are side-effects of the scale, not an inherent technical innovation that led to the scale in the first place.
And then there are the technical innovation companies. I think they are a much rarer phenomenon, and the prime example is Google – the company started as a consequence of a research paper. Roughly speaking, the paper outlined a technical innovation in search that made all other approaches to search obsolete. We can say that Bitcoin was such an innovation, as well. In some cases it’s not the founders that develop the original research, but they derive their product from existing computer science research. They combine multiple papers, adapt them to the needs of the real world (because, as we know, research papers often rely on a “spherical horse in vacuum”) and build something useful.
As a personal side-note here, some of my (side) projects were purely process innovations – I once made an online bus ticket reservation service (before such a thing existed in my country), then I made a social network aggregator (that was arguably better than existing ones at the time). And they were much less interesting than my more technically innovative projects, like Computoser (which has some original research) or LogSentinel (which combines several research papers into a product).
A subset of technical innovation is the so-called “deep tech” – projects that are supposed to enable future innovation. This can be simplified as “applied research” in fields such as computer vision, AI, and biomedicine. This is where you need a lot of R&D, not simply “pouring” code for a few months.
Just as “process innovation” companies eventually lead to technical innovation, technical innovation companies eventually (or immediately) lead to process improvements. Google practically changed the way we consume information, so its impact on the processes is rather high. And to me, that’s the goal of each company – to get to change behaviour. It’s much more interesting to do that using never-before-done technical feats, but if you can do it without the technical bits (i.e. by simply building a website/app using current web/mobile frameworks), good for you.
If you become a successful company, you’ll necessarily have both types of innovation, regardless of how you started. And in order to have a successful company, you have to improve processes and change behaviour. You have to do digital transformation. In the long run, it doesn’t make that much of a difference which was first – the technology or the process innovation. Although from a business and investment perspective, it’s easier for competitors to copy the processes and harder to copy internal R&D.
Whether we should call process innovation “technical innovation” – I don’t think so, but that ship has already sailed – anything that uses technology is now “technical innovation”, even if it’s a WordPress site. But for technical people it’s much more challenging and rewarding to be based on actual technical innovation. We just have to remember that we have to solve real-world problems, improve or introduce processes and change behaviour.
In November 2013, the first commercially available helium-filled hard drive was introduced by HGST, a Western Digital subsidiary. The 6 TB drive was not only unique in being helium-filled, it was, for the moment, the highest-capacity hard drive available. Fast forward a little over 4 years and 12 TB helium-filled drives are readily available, 14 TB drives can be found, and 16 TB helium-filled drives are arriving soon.
Backblaze has been purchasing and deploying helium-filled hard drives over the past year and we thought it was time to start looking at their failure rates compared to traditional air-filled drives. This post will provide an overview, then we’ll continue the comparison on a regular basis over the coming months.
The Promise and Challenge of Helium Filled Drives
We all know that helium is lighter than air — that’s why helium-filled balloons float. Inside of an air-filled hard drive there are rapidly spinning disk platters that rotate at a given speed, 7200 rpm for example. The air inside adds an appreciable amount of drag on the platters that in turn requires an appreciable amount of additional energy to spin the platters. Replacing the air inside of a hard drive with helium reduces the amount of drag, thereby reducing the amount of energy needed to spin the platters, typically by 20%.
We also know that after a few days, a helium-filled balloon sinks to the ground. This was one of the key challenges in using helium inside of a hard drive: helium escapes from most containers, even if they are well sealed. It took years for hard drive manufacturers to create containers that could contain helium while still functioning as a hard drive. This container innovation allows helium-filled drives to function at spec over the course of their lifetime.
Checking for Leaks
Three years ago, we identified SMART 22 as the attribute assigned to recording the status of helium inside of a hard drive. We have both HGST and Seagate helium-filled hard drives, but only the HGST drives currently report the SMART 22 attribute. It appears the normalized and raw values for SMART 22 currently report the same value, which starts at 100 and goes down.
To date only one HGST drive has reported a value of less than 100, with multiple readings between 94 and 99. That drive continues to perform fine, with no other errors or any correlating changes in temperature, so we are not sure whether the change in value is trying to tell us something or if it is just a wonky sensor.
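As a rough illustration of the leak check described above, here is a minimal Python sketch that flags drives whose SMART 22 value has dropped below its starting value of 100. The record layout, the field names (`serial`, `smart_22`), and the sample readings are hypothetical and chosen only for the example; drives that do not report the attribute are represented by `None` and skipped.

```python
def helium_warnings(readings, baseline=100):
    """Return sorted serial numbers of drives whose SMART 22 value fell below baseline.

    `readings` is an iterable of dicts with hypothetical keys "serial" and
    "smart_22". A value of None means the drive does not report SMART 22.
    """
    flagged = set()
    for reading in readings:
        value = reading.get("smart_22")
        if value is not None and value < baseline:
            flagged.add(reading["serial"])
    return sorted(flagged)


# Hypothetical sample data, not real drive readings.
sample = [
    {"serial": "HGST-A", "smart_22": 100},
    {"serial": "HGST-B", "smart_22": 97},
    {"serial": "HGST-B", "smart_22": 94},
    {"serial": "SEAGATE-C", "smart_22": None},  # attribute not reported
]

print(helium_warnings(sample))
```

A flagged drive is only a prompt for a closer look. As noted above, a lower value may simply reflect a faulty sensor.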
Helium versus Air-Filled Hard Drives
There are several different ways to compare these two types of drives. Below we decided to use just our 8, 10, and 12 TB drives in the comparison. We did this since we have helium-filled drives in those sizes. We left out of the comparison all of the drives that are 6 TB and smaller as none of the drive models we use are helium-filled. We are open to trying different comparisons. This just seemed to be the best place to start.
The most obvious observation is that there seems to be little difference in the Annualized Failure Rate (AFR) based on whether they contain helium or air. One conclusion, given this evidence, is that helium doesn’t affect the AFR of hard drives versus air-filled drives. My prediction is that the helium drives will eventually prove to have a lower AFR. Why? Drive Days.
Let’s go back in time to Q1 2017 when the air-filled drives listed in the table above had a similar number of Drive Days to the current number of Drive Days for the helium drives. We find that the failure rate for the air-filled drives at the time (Q1 2017) was 1.61%. In other words, when the drives were in use a similar number of hours, the helium drives had a failure rate of 1.06% while the failure rate of the air-filled drives was 1.61%.
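To make the Drive Days comparison concrete, here is a small Python sketch of how an annualized failure rate can be computed from a failure count and a number of drive days. It assumes the usual definition, failures divided by drive years, expressed as a percentage. The failure counts and drive days below are made-up round numbers for illustration, not the figures behind the rates quoted above.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return the AFR as a percentage: failures per drive-year of operation."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


# Hypothetical inputs: two populations with the same number of drive days.
helium_afr = annualized_failure_rate(failures=10, drive_days=365_000)
air_afr = annualized_failure_rate(failures=16, drive_days=365_000)

print(f"helium: {helium_afr:.2f}%  air: {air_afr:.2f}%")
```

Because both populations are divided by the same number of drive years, the comparison reflects failures per unit of usage rather than raw failure counts.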
Helium or Air?
My hypothesis is that after normalizing the data so that the helium and air-filled drives have the same (or similar) usage (Drive Days), the helium-filled drives we use will continue to have a lower Annualized Failure Rate versus the air-filled drives we use. I expect this trend to continue for the next year at least. What side do you come down on? Will the Annualized Failure Rate for helium-filled drives be better than air-filled drives or vice-versa? Or do you think the two technologies will eventually produce the same AFR over time? Pick a side and we’ll document the results over the next year and see where the data takes us.
In today’s guest post, seventh-grade students Evan Callas, Will Ross, Tyler Fallon, and Kyle Fugate share their story of using the Raspberry Pi Oracle Weather Station in their Innovation Lab class, headed by Raspberry Pi Certified Educator Chris Aviles.
United Nations Sustainable Goals
Over the past couple of weeks in our Innovation Lab class, our teacher, Mr Aviles, has challenged us students to design a project that helps solve one of the United Nations Sustainable Goals. We chose Climate Action. Innovation Lab is a class that gives students the opportunity to learn about the crossroads where technology, the environment, and entrepreneurship meet. Everyone takes their own paths in innovation and learns about the environment using project-based learning.
Raspberry Pi Oracle Weather Station
For our climate change challenge, we decided to build a Raspberry Pi Oracle Weather Station. Tackling the issues of climate change in a way that helps our community stood out to us because we knew that with the help of this weather station we could send local data to farmers and fishermen in town. Recent changes in climate have been affecting farmers’ crops. Unexpected rain, heat, and other unusual weather patterns can completely destabilize the natural growth of the plants and destroy their crops altogether. The amount of labour output needed by farmers has also significantly increased, forcing farmers to grow more food with fewer resources. By using our Raspberry Pi Oracle Weather Station to alert local farmers and fishermen, we can help them be more prepared and aware of the weather, leading to better crops and safe boating.
Growing teamwork and coding skills
The process of setting up our weather station was fun and simple. Raspberry Pi made the instructions very easy to understand and read, which was very helpful for our team who had little experience in coding or physical computing. We enjoyed working together as a team and were happy to be growing our teamwork skills.
Once we constructed and coded the weather station, we learned that we needed to support the station with PVC pipes. After we completed these steps, we brought the weather station up to the roof of the school and began collecting data. Our information is currently being sent to the Initial State dashboard so that we can share the information with anyone interested. This information will also be recorded and seen by other schools, businesses, and others from around the world who are using the weather station. For example, we can see the weather in countries such as France, Greece and Italy.
Raspberry Pi allows us to build these amazing projects that help us to enjoy coding and physical computing in a fun, engaging, and impactful way. We picked climate change because we care about our community and would like to make a substantial contribution to our town, Fair Haven, New Jersey. It is not every day that kids are given these kinds of opportunities, and we are very lucky and grateful to go to a school and learn from a teacher where these opportunities are given to us. Thanks, Mr Aviles!
To see more awesome projects by Mr Aviles’s class, you can keep up with him on his blog and follow him on Twitter.
For some organizations, the idea of “going serverless” can be daunting. But with an understanding of best practices – and the right tools – many serverless applications can be fully functional with only a few lines of code and little else.
Examples of fully-serverless-application use cases include:
Web or mobile backends – Create fully-serverless, mobile applications or websites by creating user-facing content in a native mobile application or static web content in an S3 bucket. Then have your front-end content integrate with Amazon API Gateway as a backend service API. Lambda functions will then execute the business logic you’ve written for each of the API Gateway methods in your backend API.
Chatbots and virtual assistants – Build new serverless ways to interact with your customers, like customer support assistants and bots ready to engage customers on your company-run social media pages. The Amazon Alexa Skills Kit (ASK) and Amazon Lex have the ability to apply natural-language understanding to user-voice and freeform-text input so that a Lambda function you write can intelligently respond and engage with them.
Internet of Things (IoT) backends – AWS IoT has direct-integration for device messages to be routed to and processed by Lambda functions. That means you can implement serverless backends for highly secure, scalable IoT applications for uses like connected consumer appliances and intelligent manufacturing facilities.
Using AWS Lambda as the logic layer of a serverless application can enable faster development speed and greater experimentation – and innovation – than in a traditional, server-based environment.
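To give a sense of how little code the web or mobile backend case can need, here is a minimal Python sketch of a Lambda function behind an API Gateway method. It assumes the Lambda proxy integration, in which the function receives the request as an event dictionary and returns a dictionary with `statusCode`, `headers`, and `body`. The greeting logic and the `name` query parameter are hypothetical stand-ins for real business logic.

```python
import json


def lambda_handler(event, context):
    """Handle an API Gateway proxy request and return a proxy-style response."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # "name" is a hypothetical parameter

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local smoke test with a hand-built event; no AWS resources are involved.
    fake_event = {"queryStringParameters": {"name": "reader"}}
    print(lambda_handler(fake_event, None))
```

The handler has no server or framework setup of its own, so routing, scaling, and TLS are left to API Gateway and Lambda.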
Andrew Baird is a Sr. Solutions Architect for AWS. Prior to becoming a Solutions Architect, Andrew was a developer, including time as an SDE with Amazon.com. He has worked on large-scale distributed systems, public-facing APIs, and operations automation.
This post courtesy of Paul Johnston, AWS Senior Developer Advocate – Serverless
Welcome to the first edition of the AWS Serverless ICYMI (In case you missed it) quarterly recap! Every quarter we’ll share all of the most recent product launches, feature enhancements, blog posts, webinars, Twitch live streams, and other interesting things that you might have missed!
These runtimes give Lambda developers and development teams even greater options for coding serverless, on-demand, compute solutions.
The AWS SAM 1.4.0 release was one of its biggest. The release added features for configuring many aspects of Amazon API Gateway, including CORS support, regional endpoints, binary media types, and stage settings. It also included per function concurrency support, tags and TableName for SimpleTable, and many documentation updates. Check out the release notes for the full list!
AppSync came out of the whitelisted preview and added a whole bunch of new features:
We’re always looking to help people start learning how to build serverless applications. Our serverless web application workshops are online and you can do the hands-on labs yourself: Build a Serverless web application
Still looking for more?
The Serverless landing page has lots of information including a resources page containing case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials. Check it out!
Microsoft has issued a press release describing the security dangers involved with the Internet of things (“a weaponized stove, baby monitors that spy, the contents of your refrigerator being held for ransom”) and introducing “Microsoft Azure Sphere” as a combination of hardware and software to address the problem. “Unlike the RTOSes common to MCUs today, our defense-in-depth IoT OS offers multiple layers of security. It combines security innovations pioneered in Windows, a security monitor, and a custom Linux kernel to create a highly-secured software environment and a trustworthy platform for new IoT experiences.”
American Public Television was like many organizations that have been around for a while. They were entrenched using an older technology — in their case, tape storage and distribution — that once met their needs but was limiting their productivity and preventing them from effectively collaborating with their many media partners. APT’s VP of Technology knew that he needed to move into the future and embrace cloud storage to keep APT ahead of the game.
Since 1961, American Public Television (APT) has been a leading distributor of groundbreaking, high-quality, top-rated programming to the nation’s public television stations. Gerry Field is the Vice President of Technology at APT and is responsible for delivering their extensive program catalog to 350+ public television stations nationwide.
In the time since Gerry joined APT in 2007, the industry has been in digital overdrive. During that time APT has continued to acquire and distribute the best in public television programming to their technically diverse subscribers.
This created two challenges for Gerry. First, new technology and format proliferation were driving dramatic increases in digital storage. Second, many of APT’s subscribers struggled to keep up with the rapidly changing industry. While some subscribers had state-of-the-art satellite systems to receive programming, others had to wait for the post office to drop off programs recorded on tape weeks earlier. With no slowdown on the horizon of innovation in the industry, Gerry knew that his storage and distribution systems would reach a crossroads in no time at all.
Living the tape paradigm
The digital media industry is only a few years removed from its film, and later videotape, roots. Tape was the input and the output of the industry for many years. As a consequence, the tools and workflows used by the industry were built and designed to work with tape. Over time, the “file” slowly replaced the tape as the object to be captured, edited, stored and distributed. Trouble was, many of the systems and more importantly workflows were based on processing tape, and these have proven to be hard to change.
At APT, Gerry realized the limits of the tape paradigm and began looking for technologies and solutions that enabled workflows based on file and object based storage and distribution.
Thinking file based storage and distribution
For data (digital media) storage, APT, like everyone else, started by installing onsite storage servers. As the amount of digital data grew, more storage was added. In addition, APT was expanding its distribution footprint by creating or partnering with distribution channels such as CreateTV and APT Worldwide. This dramatically increased the number of programming formats and the amount of data that had to be stored. As a consequence, updating, maintaining, and managing the APT storage systems was becoming a major challenge and a major resource hog.
Knowing that his in-house storage system was only going to cost more time and money, Gerry decided it was time to look at cloud storage. But that wasn’t the only reason he looked at the cloud. While most people consider cloud storage as just a place to back up and archive files, Gerry was envisioning how the ubiquity of the cloud could help solve his distribution challenges. The trouble was the price of cloud storage from vendors like Amazon S3 and Microsoft Azure was a non-starter, especially for a non-profit. Then Gerry came across Backblaze. B2 Cloud Storage service met all of his performance requirements, and at $0.005/GB/month for storage and $0.01/GB for downloads it was nearly 75% less than S3 or Azure.
Gerry did the math and found that he could economically incorporate B2 Cloud Storage into his IT portfolio, using it for both program submission and for active storage and archiving of the APT programs. In addition, B2 now gives him the foundation necessary to receive and distribute programming content over the Internet. This is especially useful for organizations that can’t conveniently access satellite distribution systems. Not to mention downloading from the cloud is much faster than sending a tape through the mail.
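The kind of math involved can be sketched in a few lines of Python, using the two B2 prices quoted above ($0.005/GB/month for storage and $0.01/GB for downloads). The storage and download volumes are hypothetical round numbers, not APT’s actual figures, and the sketch ignores any other charges or free allowances a real bill might include.

```python
STORAGE_USD_PER_GB_MONTH = 0.005  # storage price quoted in the article
DOWNLOAD_USD_PER_GB = 0.01        # download price quoted in the article


def monthly_cost(stored_gb: float, downloaded_gb: float) -> float:
    """Estimate one month's bill from stored and downloaded volume in GB."""
    if stored_gb < 0 or downloaded_gb < 0:
        raise ValueError("volumes must be non-negative")
    storage = stored_gb * STORAGE_USD_PER_GB_MONTH
    downloads = downloaded_gb * DOWNLOAD_USD_PER_GB
    return round(storage + downloads, 2)


# Hypothetical workload: 100,000 GB stored and 20,000 GB downloaded in a month.
cost = monthly_cost(stored_gb=100_000, downloaded_gb=20_000)
print(f"Estimated monthly cost: ${cost:,.2f}")
```

Because billing is purely usage-based, the same function can be re-run as the catalog grows, without any upfront capacity purchase to factor in.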
Adding B2 Cloud Storage to their infrastructure has helped American Public Television address two key challenges. First, they now have “unlimited” storage in the cloud without having to add any hardware. In addition, with B2, they only pay for the storage they use. That means they don’t have to buy storage upfront trying to match the maximum amount of storage they’ll ever need. Second, by using B2 as a distribution source for their programming, APT subscribers, especially the smaller and remote ones, can get content faster and more reliably without having to perform costly upgrades to their infrastructure.
The road ahead
As APT gets used to their file-based infrastructure and workflow, they are considering a number of cost-saving and income-generating ideas. Here are a few:
Program Submissions — New content can be uploaded from anywhere using a web browser, an Internet connection, and a login. For example, a producer in Cambodia can upload their film to B2. From there the film is downloaded to an in-house system where it is processed and transcoded using compute. The finished film is added to the APT catalog and added to B2. Once there, the program is instantly available for subscribers to order and download.
“The affordability and performance of Backblaze B2 is what allowed us to make the B2 cloud part of the APT data storage and distribution strategy into the future.” — Gerry Field
Easier Previews — At any time, work in process or finished programs can be made available for download from the B2 cloud. One place this could be useful is where a subscriber needs to review a program to comply with local policies and practices before airing. In the old system, each “one-off” was a time consuming manual process.
Instant Subscriptions — There are many organizations such as schools and businesses that want to use just one episode of a desired show. With an e-commerce based website, current or even archived programming kept in B2 could be available to download or stream for a minimal charge.
At APT there were multiple technologies needed to make their file-based infrastructure work, but as Gerry notes, having an affordable, trustworthy, cloud storage service like B2 is one of the critical building blocks needed to make everything work together.
At AWS, our customers have always been the motivation for our innovation. In turn, we’re committed to helping them accelerate the pace of their own innovation. It was in the spirit of helping our customers achieve their objectives faster that we launched AWS Lambda in 2014, eliminating the burden of server management and enabling AWS developers to focus on business logic instead of the challenges of provisioning and managing infrastructure.
In the years since, our customers have built amazing things using Lambda and other serverless offerings, such as Amazon API Gateway, Amazon Cognito, and Amazon DynamoDB. Together, these services make it easy to build entire applications without the need to provision, manage, monitor, or patch servers. By removing much of the operational drudgery of infrastructure management, we’ve helped our customers become more agile and achieve faster time-to-market for their applications and services. By eliminating cold servers and cold containers with request-based pricing, we’ve also eliminated the high cost of idle capacity and helped our customers achieve dramatically higher utilization and better economics.
After we launched Lambda, though, we quickly learned an important lesson: A single Lambda function rarely exists in isolation. Rather, many functions are part of serverless applications that collectively deliver customer value. Whether as combinations of event sources and event handlers, as serverless web apps that pair APIs and functions for dynamic content with static content repositories, or as collections of functions that together provide a microservice architecture, our customers were building and delivering serverless architectures for every conceivable problem. Despite the economic and agility benefits that hundreds of thousands of AWS customers were enjoying with Lambda, we realized there was still more we could do.
How Customer Feedback Inspired Us to Innovate
We heard from our customers that getting started—either from scratch or when augmenting their implementation with new techniques or technologies—remained a challenge. When we looked for serverless assets to share, we found stellar examples built by serverless pioneers that represented a multitude of solutions across industries.
There were apps to facilitate monitoring and logging, to process image and audio files, to create Alexa skills, and to integrate with notification and location services. These apps ranged from “getting started” examples to complete, ready-to-run assets. What was missing, however, was a unified place for customers to discover this diversity of serverless applications and a step-by-step interface to help them configure and deploy them.
We also heard from customers and partners that building their own ecosystems—ecosystems increasingly composed of functions, APIs, and serverless applications—remained a challenge. They wanted a simple way to share samples, create extensibility, and grow consumer relationships on top of serverless approaches.
We built the AWS Serverless Application Repository to help solve both of these challenges by offering publishers and consumers of serverless apps a simple, fast, and effective way to share applications and grow user communities around them. Now, developers can easily learn how to apply serverless approaches to their implementation and business challenges by discovering, customizing, and deploying serverless applications directly from the Serverless Application Repository. They can also find libraries, components, patterns, and best practices that augment their existing knowledge, helping them bring services and applications to market faster than ever before.
How the AWS Serverless Application Repository Inspires Innovation for All Customers
Companies that want to create ecosystems, share samples, deliver extensibility and customization options, and complement their existing SaaS services use the Serverless Application Repository as a distribution channel, producing apps that can be easily discovered and consumed by their customers. AWS partners like HERE have introduced their location and transit services to thousands of companies and developers. Partners like Datadog, Splunk, and TensorIoT have showcased monitoring, logging, and IoT applications to the serverless community.
Individual developers are also publishing serverless applications that push the boundaries of innovation—some have published applications that leverage machine learning to predict the quality of wine while others have published applications that monitor crypto-currencies, instantly build beautiful image galleries, or create fast and simple surveys. All of these publishers are using serverless apps, and the Serverless Application Repository, as the easiest way to share what they’ve built. Best of all, their customers and fellow community members can find and deploy these applications with just a few clicks in the Lambda console. Apps in the Serverless Application Repository are free of charge, making it easy to explore new solutions or learn new technologies.
Finally, we at AWS continue to publish apps for the community to use. From apps that leverage Amazon Cognito to sync user data across applications to our latest collection of serverless apps that enable users to quickly execute common financial calculations, we’re constantly looking for opportunities to contribute to community growth and innovation.
At AWS, we’re more excited than ever by the growing adoption of serverless architectures and the innovation that services like AWS Lambda make possible. Helping our customers create and deliver new ideas drives us to keep inventing ways to make building and sharing serverless apps even easier. As the number of applications in the Serverless Application Repository grows, so too will the innovation that it fuels for both the owners and the consumers of those apps. With the general availability of the Serverless Application Repository, our customers become more than the engine of our innovation—they become the engine of innovation for one another.
In Part 2, we take a deeper look at the differences between HDDs and SSDs, how both HDD and SSD technologies are evolving, and how Backblaze takes advantage of SSDs in our operations and data centers.
The first time you booted a computer or opened an app on a computer with a solid-state-drive (SSD), you likely were delighted. I know I was. I loved the speed, silence, and just the wow factor of this new technology that seemed better in just about every way compared to hard drives.
I was ready to fully embrace the promise of SSDs. And I have. My desktop uses an SSD for booting, applications, and for working files. My laptop has a single 512GB SSD. I still use hard drives, however. The second, third, and fourth drives in my desktop computer are HDDs. The external USB RAID I use for local backup uses HDDs in four drive bays. When my laptop is at my desk it is attached to a 1.5TB USB backup hard drive. HDDs still have a place in my personal computing environment, as they likely do in yours.
Nothing stays the same for long, however, especially in the fast-changing world of computing, so we are certain to see new storage technologies coming to the fore, perhaps with even more wow factor.
Before we get to what’s coming, let’s review the primary differences between HDDs and SSDs in a little more detail in the following table.
A Comparison of HDDs to SSDs
Attribute | HDD | SSD
Power Draw/Battery Life | More power draw, averages 6–7 watts and therefore uses more battery | Less power draw, averages 2–3 watts, resulting in 30+ minute battery boost
Cost | Only around $0.03 per gigabyte, very cheap (buying a 4TB model) | Expensive, roughly $0.20–$0.30 per gigabyte (based on buying a 1TB drive)
Capacity | Typically around 500GB and 2TB maximum for notebook size drives; 10TB max for desktops | Typically not larger than 1TB for notebook size drives; 4TB for desktops
Operating System Boot Time | Around 30–40 seconds average bootup time | Around 8–13 seconds average bootup time
Noise | Audible clicks and spinning platters can be heard | There are no moving parts, hence no sound
Vibration | The spinning of the platters can sometimes result in vibration | No vibration as there are no moving parts
Heat Produced | HDD doesn’t produce much heat, but it will have a measurable amount more heat than an SSD due to moving parts and higher power draw | Lower power draw and no moving parts so little heat is produced
Failure Rate | Mean time between failure rate of 1.5 million hours | Mean time between failure rate of 2.0 million hours
File Copy / Write Speed | The range can be anywhere from 50–120MB/s | Generally above 200MB/s and up to 550MB/s for cutting edge drives
Encryption | Full Disk Encryption (FDE) supported on some models | Full Disk Encryption (FDE) supported on some models
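To make the per-gigabyte prices in the comparison concrete, here is a minimal Python sketch that estimates what a drive of a given capacity would cost at those rates. The function names are hypothetical, the prices are the ones quoted above, and 1TB is treated as 1,000GB for simplicity.

```python
# Per-gigabyte prices quoted in the comparison (US dollars).
HDD_PRICE_PER_GB = 0.03          # based on buying a 4TB hard drive
SSD_PRICE_PER_GB = (0.20, 0.30)  # low and high estimates, based on a 1TB SSD


def drive_cost(capacity_gb: float, price_per_gb: float) -> float:
    """Return the approximate purchase price of a drive of the given size."""
    return capacity_gb * price_per_gb


def cost_comparison(capacity_gb: float) -> dict:
    """Estimate HDD and SSD prices for the same capacity."""
    ssd_low, ssd_high = SSD_PRICE_PER_GB
    return {
        "hdd": round(drive_cost(capacity_gb, HDD_PRICE_PER_GB), 2),
        "ssd_low": round(drive_cost(capacity_gb, ssd_low), 2),
        "ssd_high": round(drive_cost(capacity_gb, ssd_high), 2),
    }


if __name__ == "__main__":
    # Compare the two technologies at 1TB (1,000GB).
    print(cost_comparison(1000))
```

At these rates the SSD costs several times as much as a hard drive of the same capacity, which is why price still favors HDDs for bulk storage.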
The HDD has an amazing history of improvement and innovation. From its inception in 1956 the hard drive has decreased in size 57,000 times, increased storage 1 million times, and decreased cost 2,000 times. In other words, the cost per gigabyte has decreased by 2 billion times in about 60 years.
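The cost-per-gigabyte figure follows from multiplying the two improvements together. Here is a quick check in Python, assuming the 2,000-times cost decrease refers to the price of a drive rather than the price per gigabyte (the function name is hypothetical):

```python
def cost_per_gb_improvement(capacity_factor: int, drive_cost_factor: int) -> int:
    """Combine a capacity increase and a per-drive cost decrease.

    If a drive holds `capacity_factor` times more data and costs
    `drive_cost_factor` times less, each gigabyte is cheaper by the
    product of the two factors.
    """
    return capacity_factor * drive_cost_factor


if __name__ == "__main__":
    # 1 million times more storage, 2,000 times lower cost.
    print(f"{cost_per_gb_improvement(1_000_000, 2_000):,}")  # prints 2,000,000,000
```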
Hard drive manufacturers made these dramatic advances by reducing the size, and consequently the seek times, of platters while increasing their density, improving disk reading technologies, adding multiple arms and read/write heads, developing better bus interfaces, and increasing spin speed and reducing friction with techniques such as filling drives with helium.
In 2005, the drive industry introduced perpendicular recording technology to replace the older longitudinal recording technology, which enabled areal density to reach more than 100 gigabits per square inch. Longitudinal recording aligns data bits horizontally in relation to the drive’s spinning platter, parallel to the surface of the disk, while perpendicular recording aligns bits vertically, perpendicular to the disk surface.
Other technologies such as bit patterned media recording (BPMR) are contributing to increased densities, as well. Introduced by Toshiba in 2010, BPMR is a proposed hard disk drive technology that could succeed perpendicular recording. It records data using nanolithography in magnetic islands, with one bit per island. This contrasts with current disk drive technology where each bit is stored in 20 to 30 magnetic grains within a continuous magnetic film.
Shingled magnetic recording (SMR) is a magnetic storage data recording technology used in HDDs to increase storage density and overall per-drive storage capacity. Shingled recording writes new tracks that overlap part of the previously written magnetic track, leaving the previous track narrower and allowing for higher track density. Thus, the tracks partially overlap similar to roof shingles. This approach was selected because physical limitations prevent recording magnetic heads from having the same width as reading heads, leaving recording heads wider.
Track Spacing Enabled by SMR Technology (Seagate)
To increase the amount of data stored on a drive’s platter requires cramming the magnetic regions closer together, which means the grains need to be smaller so they won’t interfere with each other. In 2002, Seagate successfully demoed heat-assisted magnetic recording (HAMR). HAMR records magnetically using laser-thermal assistance that ultimately could lead to a 20 terabyte drive by 2019. (See our post on HAMR by Seagate’s CTO Mark Re, What is HAMR and How Does It Enable the High-Capacity Needs of the Future?)
Western Digital claims that its competing microwave-assisted magnetic recording (MAMR) could enable drive capacity to increase up to 40TB by the year 2025. Some industry watchers and drive manufacturers predict increases in areal density from today’s 0.86 terabits per square inch (Tbpsi) to 10 Tbpsi by 2025, resulting in as much as 100TB drive capacity in the next decade.
The future certainly does look bright for HDDs continuing to be with us for a while.
The Outlook for SSDs
SSDs are also in for some amazing advances.
SATA (Serial Advanced Technology Attachment) is the common hardware interface that allows the transfer of data to and from HDDs and SSDs. SATA SSDs are fine for the majority of home users, as they are generally cheaper, operate at a lower speed, and have a lower write life.
While SATA is fine for everyday computing, in a RAID (Redundant Array of Independent Disks), server array, or data center environment, a better alternative has often been to use ‘SAS’ drives, which stands for Serial Attached SCSI. This is another type of interface that, again, is usable with either HDDs or SSDs. ‘SCSI’ stands for Small Computer System Interface (which is why SAS drives are sometimes referred to as ‘scuzzy’ drives). SAS offers higher IOPS (input/output operations per second) than SATA, meaning it can read and write data faster. This has made SAS an optimal choice for systems that require high performance and availability.
On an enterprise level, SAS prevails over SATA, as SAS supports over-provisioning to prolong write life and has been specifically designed to run in environments that require constant drive usage.
PCIe (Peripheral Component Interconnect Express) is a high speed serial computer expansion bus standard that supports drastically higher data transfer rates over SATA or SAS interfaces due to the fact that there are more channels available for the flow of data.
Many leading drive manufacturers have been adopting PCIe as the standard for new home and enterprise storage and some peripherals. For example, you’ll see that the latest Apple Macbooks ship with PCIe-based flash storage, something that Apple has been adopting over the years with their consumer devices.
PCIe can also be used within data centers for RAID systems and to create high-speed networking capabilities, increasing overall performance and supporting the newer and higher capacity HDDs.
As we covered in Part 1, SSDs are based on a type of non-volatile flash memory called NAND. The latest trend in NAND flash is quad-level-cell (QLC) NAND. NAND is subdivided into types based on how many bits of data are stored in each physical memory cell. SLC (single-level-cell) stores one bit, MLC (multi-level-cell) stores two, TLC (triple-level-cell) stores three, and QLC (quad-level-cell) stores four.
Storing more data per cell makes NAND more dense, but it also makes the memory slower — it takes more time to read and write data when so much additional information (and so many more charge states) are stored within the same cell of memory.
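The growth in charge states is easy to quantify: a cell that stores n bits has to distinguish 2^n distinct charge levels. A minimal Python sketch of that relationship follows; the dictionary and function names are illustrative only.

```python
# Bits stored per physical memory cell for each NAND type.
NAND_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def charge_states(bits_per_cell: int) -> int:
    """Number of distinct charge levels a cell must distinguish."""
    return 2 ** bits_per_cell


if __name__ == "__main__":
    for name, bits in NAND_TYPES.items():
        print(f"{name}: {bits} bit(s) per cell, {charge_states(bits)} charge states")
```

Each added bit doubles the number of levels, so a QLC cell has to resolve 16 states where an SLC cell resolves only 2. That is why reads and writes take longer as density goes up.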
QLC NAND memory is built on older process nodes with larger cells that can more easily store multiple bits of data. The new NAND tech has higher overall reliability with higher total number of program / erase cycles (P/E cycles).
QLC NAND wafer from which individual microcircuits are made
QLC NAND promises to produce faster and denser SSDs. The effect on price also could be dramatic. Tom’s Hardware is predicting that the advent of QLC could push 512GB SSDs down to $100.
Beyond HDDs and SSDs
There is significant work being done that is pushing the bounds of data storage beyond what is possible with spinning platters and microcircuits. A team at Harvard University has used genome-editing to encode video into live bacteria.
We’ve already discussed the benefits of SSDs. The benefits of SSDs that apply particularly to the data center are:
Low power consumption — When you are running lots of drives, power usage adds up. Anywhere you can conserve power is a win.
Speed — Data can be accessed faster, which is especially beneficial for caching databases and other data affecting overall application or system performance.
Lack of vibration — Reducing vibration improves reliability thereby reducing problems and maintenance. Racks don’t need the size and structural rigidity housing SSDs that they need housing HDDs.
Low noise — Data centers will become quieter as more SSDs are deployed.
Low heat production — The less heat generated the less cooling and power required in the data center.
Faster booting — The faster a storage chassis can get online or a critical server can be rebooted after maintenance or a problem, the better.
Greater areal density — Data centers will be able to store more data in less space, which increases efficiency in all areas (power, cooling, etc.)
The top drive manufacturers say that they expect HDDs and SSDs to coexist for the foreseeable future in all areas — home, business, and data center, with customers choosing which technology and product will best fit their application.
How Backblaze Uses SSDs
In just about all respects, SSDs are superior to HDDs. So why don’t we replace the 100,000+ hard drives we have spinning in our data centers with SSDs?
Our operations team takes advantage of the benefits and savings of SSDs wherever they can, using them in every place that’s appropriate other than primary data storage. They’re particularly useful in our caching and restore layers, where we use them strategically to speed up data transfers. SSDs also speed up access to B2 Cloud Storage metadata. Our operations team is considering moving to SSDs to boot our Storage Pods, where the cost of a small SSD is competitive with hard drives, and their other attributes (small size, lack of vibration, speed, low power consumption, reliability) are all pluses.
A Future with Both HDDs and SSDs
IDC predicts that total data created will grow from approximately 33 zettabytes in 2018 to about 160 zettabytes in 2025. (See What’s a Byte? if you’d like help understanding the size of a zettabyte.)
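To put that forecast in perspective, the implied compound annual growth rate can be worked out from the two endpoints. This is a small Python sketch using only the figures quoted above; the function name is hypothetical.

```python
def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    return (end / start) ** (1 / years) - 1


if __name__ == "__main__":
    # 33 zettabytes in 2018 growing to 160 zettabytes in 2025.
    rate = compound_annual_growth_rate(33, 160, 2025 - 2018)
    print(f"Implied annual growth: {rate:.1%}")
```

With these inputs the rate works out to roughly 25 percent per year, which means the amount of data created doubles about every three years.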
Annual Size of the Global Datasphere
Over 90% of enterprise drive shipments today are HDD, according to IDC. By 2025, SSDs will comprise almost 20% of drive shipments. SSDs will gain share, but total growth in data created will result in massive sales of both HDDs and SSDs.
Enterprise Byte Shipments: HDD and SSD
As both HDD and SSD sales grow, so does the capacity of both technologies. Given the benefits of SSDs in many applications, we’re likely going to see SSDs replacing HDDs in all but the highest capacity uses.
It’s clear that there are merits to both HDDs and SSDs. If you’re not running a data center, and don’t have more than one or two terabytes of data to store on your home or business computer, your first choice likely should be an SSD. They provide a noticeable improvement in performance during boot-up and data transfer, and are smaller, quieter, and more reliable as well. Save the HDDs for secondary drives, NAS, RAID, and local backup devices in your system.
Perhaps some day we’ll look back at the days of spinning platters with the same nostalgia we look back at stereo LPs, and some of us will have an HDD paperweight on our floating anti-gravity desk as a conversation piece. Until the day that SSDs’ performance, capacity, and finally, price, expel the last HDD from the home and data center, we can expect to live in a world that contains both solid-state drives and magnetic platter HDDs, and as users we will reap the benefits from both technologies.
Don’t miss future posts on HDDs, SSDs, and other topics, including hard drive stats, cloud storage, and tips and tricks for backing up to the cloud. Use the Join button above to receive notification of future posts on our blog.
This is part two of a series on the factors that an organization needs to consider when opening a data center and the challenges that must be met in the process.
In Part 1 of this series, we looked at the different types of data centers, the importance of location in planning a data center, data center certification, and the single most expensive factor in running a data center, power.
In Part 2, we continue to look at factors that need to be considered both by those interested in a dedicated data center and those seeking to colocate in an existing center.
In part 1, we began our discussion of the power requirements of data centers.
As we discussed, redundancy and failover are chief requirements for data center power. A redundantly designed power supply system is also a necessity for maintenance, as it enables repairs to be performed on one network, for example, without having to turn off servers, databases, or electrical equipment.
The common critical components of a data center’s power flow are:
Utility Supply
Generators
Transfer Switches
Distribution Panels
Uninterruptible Power Supplies (UPS)
Power Distribution Units (PDU)
Utility Supply is the power that comes from one or more utility grids. While most of us consider the grid to be our primary power supply (hats off to those of you who manage to live off the grid), politics, economics, and distribution make utility supply power susceptible to outages, which is why data centers must have autonomous power available to maintain availability.
Generators are used to supply power when the utility supply is unavailable. They convert mechanical energy, usually from motors, to electrical energy.
Transfer Switches are used to transfer electric load from one source or electrical device to another, such as from one utility line to another, from a generator to a utility, or between generators. The transfer could be manually activated or automatic to ensure continuous electrical power.
Distribution Panels get the power where it needs to go, taking a power feed and dividing it into separate circuits to supply multiple loads.
A UPS, as we touched on earlier, ensures that continuous power is available even when the main power source isn’t. It often consists of batteries that can come online almost instantaneously when the current power ceases. The power from a UPS does not have to last a long time as it is considered an emergency measure until the main power source can be restored. Another function of the UPS is to filter and stabilize the power from the main power supply.
Data center UPSs
PDU stands for Power Distribution Unit. It is the device that distributes power to the individual pieces of equipment.
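The way these components back each other up can be sketched as a simple priority order: use utility power when it is available, fall back to a generator once it is running, and rely on the UPS to bridge the gap in between. The Python below is only an illustrative model. The function name, its parameters, and the priority order are assumptions made for this example, not a description of any real transfer switch controller.

```python
def select_power_source(utility_ok: bool,
                        generator_running: bool,
                        ups_charge_fraction: float) -> str:
    """Pick the source that should carry the load right now (hypothetical model)."""
    if utility_ok:
        return "utility"      # normal operation
    if generator_running:
        return "generator"    # longer-term backup once it has started
    if ups_charge_fraction > 0:
        return "ups"          # batteries bridge the gap almost instantly
    return "none"             # every source is exhausted: an outage


if __name__ == "__main__":
    # Utility fails, generator has not started yet, batteries are full.
    print(select_power_source(False, False, 1.0))
    # A moment later the generator is up and takes over the load.
    print(select_power_source(False, True, 0.9))
```

The sketch reflects the point made above that the UPS only has to last until another source is available.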
After power, the networking connections to the data center are of prime importance. Can the data center obtain and maintain high-speed networking connections to the building? With networking, as with all aspects of a data center, availability is a primary consideration. Data center designers think of all possible ways service can be interrupted or lost, even briefly. Details such as the vulnerabilities in the route the network connections make from the core network (the backhaul) to the center, and where network connections enter and exit a building, must be taken into consideration in network and data center design.
Routers and switches are used to transport traffic between the servers in the data center and the core network. Just as with power, network redundancy is a prime factor in maintaining availability of data center services. Two or more upstream service providers are required to ensure that availability.
How fast a customer can transfer data to a data center is affected by: 1) the speed of the connections the data center has with the outside world, 2) the quality of the connections between the customer and the data center, and 3) the distance of the route from customer to the data center. The longer the route and the greater the number of packets that must be transferred, the more significant a role latency will play in the data transfer. Latency is the delay before a transfer of data begins following an instruction for its transfer. Generally latency, not speed, will be the most significant factor in transferring data to and from a data center. Packets transferred using the TCP/IP protocol suite, which is the conceptual model and set of communications protocols used on the internet and similar computer networks, must be acknowledged when received (ACK’d), and each acknowledgment requires a communications round trip. If the data is in larger packets, the number of ACKs required is reduced, so latency will be a smaller factor in the overall network communications speed.
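The effect of packet size and latency can be illustrated with a deliberately simplified model in which every packet waits for its own acknowledgment before the next one is sent. Real TCP keeps many packets in flight at once, so treat this Python sketch as a rough upper bound on the cost of latency and not as a description of actual TCP behavior. The function name and the example numbers are assumptions.

```python
import math


def transfer_time_seconds(total_bytes: int,
                          packet_bytes: int,
                          bandwidth_bytes_per_s: float,
                          round_trip_s: float) -> float:
    """Estimate transfer time when each packet costs one round trip.

    Time spent pushing bytes onto the wire depends only on bandwidth;
    time spent waiting for ACKs grows with the number of packets.
    """
    packets = math.ceil(total_bytes / packet_bytes)
    wire_time = total_bytes / bandwidth_bytes_per_s
    ack_wait = packets * round_trip_s
    return wire_time + ack_wait


if __name__ == "__main__":
    size = 100_000_000   # 100MB file
    link = 12_500_000    # 100Mbps link, expressed in bytes per second
    rtt = 0.05           # 50ms round trip
    for packet in (1_500, 9_000):
        seconds = transfer_time_seconds(size, packet, link, rtt)
        print(f"{packet}-byte packets: {seconds:,.0f} seconds")
```

In this model the larger packets finish far sooner on the same link, because fewer round trips are spent waiting for acknowledgments.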
Those interested in testing the overall speed and latency of their connection to Backblaze’s data centers can use the Check Your Bandwidth tool on our website.
Data center telecommunications equipment
Data center under floor cable runs
Computer, networking, and power generation equipment generates heat, and there are a number of solutions employed to rid a data center of that heat. The location and climate of the data center are of great importance to the data center designer because climatic conditions largely dictate which cooling technologies should be deployed, which in turn affects the power used and the cost of that power. The power required and the cost of managing a data center in a warm, humid climate will differ greatly from those of managing one in a cool, dry climate. Innovation is strong in this area and many new approaches to efficient and cost-effective cooling are used in the latest data centers.
Switch’s uninterruptible, multi-system, HVAC Data Center Cooling Units
There are three primary ways data center cooling can be achieved:
Room Cooling cools the entire operating area of the data center. This method can be suitable for small data centers, but becomes more difficult and inefficient as IT equipment density and center size increase.
Row Cooling concentrates on cooling a data center on a row by row basis. In its simplest form, hot aisle/cold aisle data center design involves lining up server racks in alternating rows with cold air intakes facing one way and hot air exhausts facing the other. The rows composed of rack fronts are called cold aisles. Typically, cold aisles face air conditioner output ducts. The rows the heated exhausts pour into are called hot aisles. Typically, hot aisles face air conditioner return ducts.
Rack Cooling tackles cooling on a rack by rack basis. Air-conditioning units are dedicated to specific racks. This approach allows for maximum densities to be deployed per rack. This works best in data centers with fully loaded racks, otherwise there would be too much cooling capacity, and the air-conditioning losses alone could exceed the total IT load.
Data Centers are high-security facilities as they house business, government, and other data that contains personal, financial, and other secure information about businesses and individuals.
This list contains the physical-security considerations when opening or co-locating in a data center:
Layered Security Zones. Systems and processes are deployed to allow only authorized personnel in certain areas of the data center. Examples include keycard access, alarm systems, mantraps, secure doors, and staffed checkpoints.
Physical Barriers. Physical barriers, fencing, and reinforced walls are used to protect facilities. In a colocation facility, one customer’s racks and servers are often inaccessible to other customers colocating in the same data center.
Backblaze racks secured in the data center
Monitoring Systems. Advanced surveillance technology monitors and records activity on approaching driveways, building entrances, exits, loading areas, and equipment areas. These systems also can be used to monitor and detect fire and water emergencies, providing early detection and notification before significant damage results.
Top-tier providers evaluate their data center security and facilities on an ongoing basis. Technology becomes outdated quickly, so providers must stay on top of new approaches and technologies in order to protect valuable IT assets.
Entering the high-security areas of a data center requires passing through a security checkpoint where credentials are verified.
The gauntlet of cameras and steel bars one must pass before entering this data center
Facilities and Services
Data center colocation providers often differentiate themselves by offering value-added services. In addition to the required space, power, cooling, connectivity and security capabilities, the best solutions provide several on-site amenities. These accommodations include offices and workstations, conference rooms, and access to phones, copy machines, and office equipment.
Additional features may consist of kitchen facilities, break rooms and relaxation lounges, storage facilities for client equipment, and secure loading docks and freight elevators.
Would You Like to Know More About the Challenges of Opening and Running a Data Center?
That’s it for part 2 of this series. If readers are interested, we could write a post about some of the new technologies and trends affecting data center design and use. Please let us know in the comments.
Don’t miss future posts on data centers and other topics, including hard drive stats, cloud storage, and tips and tricks for backing up to the cloud. Use the Join button above to receive notification of future posts on our blog.
Helping people to get into making is at the heart of what we do, and so we’ve created a brand-new, free online course to help educators start their own makerspaces. If you’re interested in the maker movement, then this course is for you! Sign up now and start learning with Build a Makerspace for Young People on FutureLearn.
Find out how to create and run a makerspace for young people. Look at the pedagogy and approaches behind digital making.
Dive into the maker movement
From planning to execution, this course will cover everything you need to know to set up and lead your very own makerspace. You’ll learn about different approaches to designing makerspace environments, understand the pedagogy that underpins the maker movement, and create your own makerspace action plan. By the end of the course, you will be well versed in makerspace culture, and you’ll have the skills and knowledge to build a successful and thriving makerspace in your community.
Let makerspace experts lead your journey
This new course features five fantastic case studies about real-life makerspace educators. They’ll share their stories of starting a makerspace: what worked, what didn’t, and what’s next on their journey. Hear from Jessica Simons as she describes her experience starting the MCHS Maker Lab, connect with Patrick Ferrell as he details his teaching at the Jocelyn H. Lee Innovation Lab, and learn from Nick Provenzano as he shares his top tips on how to ensure the legacy of your makerspace. These accomplished educators will give you their practical advice and expert insights, helping you learn the best practices of starting a makerspace environment.
Connect with educators worldwide
By taking this course, you’ll also be connecting with talented and like-minded educators from across the globe. This is your opportunity to develop a community of practice while learning from fellow teachers, librarians, and community leaders who are also engaged in the maker movement.
“I like this course and how it progresses from introducing the concept of makerspaces and how they have come to education, all the way through to creating my own action plan to get started.”— Makerspace Educator in Hayward, California USA
Sign up now
The first run of our Build a Makerspace for Young People course starts on 12 March 2018. You can sign up and access all content for four weeks. After that period, we’ll run the course again multiple times throughout the year. Enjoy, and happy making!
We have been busy adding new features and capabilities to Amazon Redshift, and we wanted to give you a glimpse of what we’ve been doing over the past year. In this article, we recap a few of our enhancements and provide a set of resources that you can use to learn more and get the most out of your Amazon Redshift implementation.
In 2017, we made more than 30 announcements about Amazon Redshift. We listened to you, our customers, and delivered Redshift Spectrum, a feature of Amazon Redshift that gives you the ability to extend analytics to your data lake without moving data. We launched new DC2 nodes, doubling performance at the same price. We also announced many new features that provide greater scalability, better performance, more automation, and easier ways to manage your analytics workloads.
To see a full list of our launches, visit our what’s new page—and be sure to subscribe to our RSS feed.
Major launches in 2017
Amazon Redshift Spectrum—extend analytics to your data lake, without moving data
We launched Amazon Redshift Spectrum to give you the freedom to store data in Amazon S3, in open file formats, and have it available for analytics without the need to load it into your Amazon Redshift cluster. It enables you to easily join datasets across Redshift clusters and S3 to provide unique insights that you would not be able to obtain by querying independent data silos.
With Redshift Spectrum, you can run SQL queries against data in an Amazon S3 data lake as easily as you analyze data stored in Amazon Redshift. And you can do it without loading data or resizing the Amazon Redshift cluster based on growing data volumes. Redshift Spectrum separates compute and storage to meet workload demands for data size, concurrency, and performance. Redshift Spectrum scales processing across thousands of nodes, so results are fast, even with massive datasets and complex queries. You can query open file formats that you already use—such as Apache Avro, CSV, Grok, ORC, Apache Parquet, RCFile, RegexSerDe, SequenceFile, TextFile, and TSV—directly in Amazon S3, without any data movement.
“For complex queries, Redshift Spectrum provided a 67 percent performance gain,” said Rafi Ton, CEO, NUVIAD. “Using the Parquet data format, Redshift Spectrum delivered an 80 percent performance improvement. For us, this was substantial.”
DC2 nodes—twice the performance of DC1 at the same price
We launched second-generation Dense Compute (DC2) nodes to provide low latency and high throughput for demanding data warehousing workloads. DC2 nodes feature powerful Intel E5-2686 v4 (Broadwell) CPUs, fast DDR4 memory, and NVMe-based solid state disks (SSDs). We’ve tuned Amazon Redshift to take advantage of the better CPU, network, and disk on DC2 nodes, providing up to twice the performance of DC1 at the same price. Our DC2.8xlarge instances now provide twice the memory per slice of data and an optimized storage layout with 30 percent better storage utilization.
“Redshift allows us to quickly spin up clusters and provide our data scientists with a fast and easy method to access data and generate insights,” said Bradley Todd, technology architect at Liberty Mutual. “We saw a 9x reduction in month-end reporting time with Redshift DC2 nodes as compared to DC1.”
On average, our customers are seeing 3x to 5x performance gains for most of their critical workloads.
We introduced short query acceleration to speed up execution of queries such as reports, dashboards, and interactive analysis. Short query acceleration uses machine learning to predict the execution time of a query, and to move short running queries to an express short query queue for faster processing.
We launched results caching to deliver sub-second response times for queries that are repeated, such as dashboards, visualizations, and those from BI tools. Results caching has an added benefit of freeing up resources to improve the performance of all other queries.
We also introduced late materialization to reduce the amount of data scanned for queries with predicate filters by batching and factoring in the filtering of predicates before fetching data blocks in the next column. For example, if only 10 percent of the table rows satisfy the predicate filters, Amazon Redshift can potentially save 90 percent of the I/O for the remaining columns to improve query performance.
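The arithmetic behind that example can be sketched in a few lines of Python. This is a simplified model, not a description of Redshift internals: it counts column values rather than storage blocks, and the function name, row count, and column count are all assumptions chosen for illustration.

```python
def values_read(total_rows: int, selectivity: float, other_columns: int) -> dict:
    """Compare values read with and without late materialization.

    Simplified model (an assumption, not Redshift's actual block-level I/O):
    one filter column is always scanned in full, and the remaining columns
    are read either for every row (eager) or only for matching rows (late).
    """
    filter_col = total_rows
    eager_rest = total_rows * other_columns
    late_rest = round(total_rows * selectivity) * other_columns
    return {
        "eager": filter_col + eager_rest,
        "late": filter_col + late_rest,
        "saved_on_other_columns": 1 - late_rest / eager_rest,
    }


# Hypothetical table: 1,000,000 rows, 10 percent match the predicate,
# and the query reads 4 columns besides the filter column.
result = values_read(total_rows=1_000_000, selectivity=0.10, other_columns=4)
print(result)
```

In this model the saving on the remaining columns equals one minus the selectivity, which is where the 90 percent figure comes from. Real savings depend on how matching rows are distributed across data blocks.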
We launched query monitoring rules and pre-defined rule templates. These features make it easier for you to set metrics-based performance boundaries for workload management (WLM) queries, and specify what action to take when a query goes beyond those boundaries. For example, for a queue that’s dedicated to short-running queries, you might create a rule that aborts queries that run for more than 60 seconds. To track poorly designed queries, you might have another rule that logs queries that contain nested loops.
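To make the rule logic concrete, here is a minimal Python sketch that models the two example rules. It is not the Amazon Redshift WLM configuration format or API; the `Rule` class, the metric keys, and the `evaluate` function are hypothetical names used only to show how a metrics-based boundary maps to an action.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Rule:
    name: str
    predicate: Callable[[Dict[str, float]], bool]  # True when the boundary is crossed
    action: str  # "abort" or "log" in this sketch


# Hypothetical rules mirroring the two examples in the text.
RULES: List[Rule] = [
    Rule("short_queue_timeout", lambda m: m["execution_seconds"] > 60, "abort"),
    Rule("nested_loop_detected", lambda m: m["nested_loop_joins"] > 0, "log"),
]


def evaluate(metrics: Dict[str, float], rules: List[Rule] = RULES) -> List[Tuple[str, str]]:
    """Return (rule name, action) for every rule whose predicate matches."""
    return [(r.name, r.action) for r in rules if r.predicate(metrics)]


print(evaluate({"execution_seconds": 75, "nested_loop_joins": 0}))
print(evaluate({"execution_seconds": 5, "nested_loop_joins": 2}))
```

In the actual service, rules are defined as part of the WLM configuration rather than in application code; the sketch only illustrates the predicate-then-action shape.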
Amazon Redshift and Redshift Spectrum serve customers across a variety of industries and sizes, from startups to large enterprises. Visit our customer page to see the success that customers are having with our recent enhancements. Learn how companies like Liberty Mutual Insurance saw a 9x reduction in month-end reporting time using DC2 nodes. On this page, you can find case studies, videos, and other content that show how our customers are using Amazon Redshift to drive innovation and business results.
In addition, check out these resources to learn about the success our customers are having building out a data warehouse and data lake integration solution with Amazon Redshift:
You can enhance your Amazon Redshift data warehouse by working with industry-leading experts. Our AWS Partner Network (APN) Partners have certified their solutions to work with Amazon Redshift. They offer software, tools, integration, and consulting services to help you at every step. Visit our Amazon Redshift Partner page and choose an APN Partner. Or, use AWS Marketplace to find and immediately start using third-party software.
To see what our Partners are saying about Amazon Redshift Spectrum and our DC2 nodes mentioned earlier, read these blog posts:
If you are evaluating or considering a proof of concept with Amazon Redshift, or you need assistance migrating your on-premises or other cloud-based data warehouse to Amazon Redshift, our team of product experts and solutions architects can help you with architecting, sizing, and optimizing your data warehouse. Contact us using this support request form, and let us know how we can assist you.
If you are an Amazon Redshift customer, we offer a no-cost health check program. Our team of database engineers and solutions architects give you recommendations for optimizing Amazon Redshift and Amazon Redshift Spectrum for your specific workloads. To learn more, email us at [email protected].
Larry Heathcote is a Principal Product Marketing Manager at Amazon Web Services for data warehousing and analytics. Larry is passionate about seeing the results of data-driven insights on business outcomes. He enjoys family time, home projects, grilling out and the taste of classic barbeque.
This is part one of a series. The second part will be posted later this week. Use the Join button above to receive notification of future posts in this series.
Though most of us have never set foot inside of a data center, as citizens of a data-driven world we nonetheless depend on the services that data centers provide almost as much as we depend on a reliable water supply, the electrical grid, and the highway system. Every time we send a tweet, post to Facebook, check our bank balance or credit score, watch a YouTube video, or back up a computer to the cloud, we are interacting with a data center.
In this series, The Challenges of Opening a Data Center, we’ll talk in general terms about the factors that an organization needs to consider when opening a data center and the challenges that must be met in the process. Many of the factors to consider will be similar for opening a private data center or seeking space in a public data center, but we’ll assume for the sake of this discussion that our needs are more modest than requiring a data center dedicated solely to our own use (i.e. we’re not Google, Facebook, or China Telecom).
Data center technology and management are changing rapidly, with new approaches to design and operation appearing every year. This means we won’t be able to cover everything happening in the world of data centers in our series. However, we hope our brief overview proves useful.
What is a Data Center?
A data center is the structure that houses a large group of networked computer servers typically used by businesses, governments, and organizations for the remote storage, processing, or distribution of large amounts of data.
While many organizations will have computing services in the same location as their offices that support their day-to-day operations, a data center is a structure dedicated to 24/7 large-scale data processing and handling.
Depending on how you define the term, there are anywhere from a half million data centers in the world to many millions. While it’s possible to say that an organization’s on-site servers and data storage can be called a data center, in this discussion we are using the term data center to refer to facilities that are expressly dedicated to housing computer systems and associated components, such as telecommunications and storage systems. The facility might be a private center, which is owned or leased by one tenant only, or a shared data center that offers what are called “colocation services,” and rents space, services, and equipment to multiple tenants in the center.
A large, modern data center operates around the clock, placing a priority on providing secure and uninterrupted service, and generally includes redundant or backup power systems or supplies, redundant data communication connections, environmental controls, fire suppression systems, and numerous security devices. Such a center is an industrial-scale operation, often using as much electricity as a small town.
Types of Data Centers
There are a number of ways to classify data centers: according to how they will be used; whether they are owned or used by one or multiple organizations; whether and how they fit into a topology of other data centers; which technologies and management approaches they use for computing, storage, cooling, power, and operations; and, increasingly visible these days, how green they are.
Data centers can be loosely classified into three types according to who owns them and who uses them.
Exclusive Data Centers are facilities wholly built, maintained, operated and managed by the business for the optimal operation of its IT equipment. Some of these centers belong to well-known companies such as Facebook, Google, or Microsoft, while others belong to less public-facing big telecoms, insurance companies, or other service providers.
Managed Hosting Providers are data centers managed by a third party on behalf of a business. The business does not own the data center or space within it. Rather, the business rents the IT equipment and infrastructure it needs instead of purchasing them outright.
Colocation Data Centers are usually large facilities built to accommodate multiple businesses within the center. The business rents its own space within the data center and subsequently fills the space with its IT equipment, or possibly uses equipment provided by the data center operator.
Backblaze, for example, doesn’t own its own data centers but colocates in data centers owned by others. As Backblaze’s storage needs grow, Backblaze increases the space it uses within a given data center and/or expands to other data centers in the same or different geographic areas.
Availability is Key
When designing or selecting a data center, an organization needs to decide what level of availability is required for its services. The type of business or service it provides likely will dictate this. Any organization that provides real-time and/or critical data services will need the highest level of availability and redundancy, as well as the ability to rapidly fail over (transfer operation to another center) when and if required. Some organizations require multiple data centers not just to handle the computer or storage capacity they use, but to provide alternate locations for operation if something should happen temporarily or permanently to one or more of their centers.
Organizations that can’t afford any downtime at all will typically operate data centers that have a mirrored site that can take over if something happens to the first site, or they operate a second site in parallel to the first one. These data center topologies are called Active/Passive and Active/Active, respectively. Should disaster or an outage occur, disaster mode would dictate immediately moving all of the primary data center’s processing to the second data center.
While some data center topologies are spread throughout a single country or continent, others extend around the world. Practically, data transmission speeds put a cap on centers that can be operated in parallel with the appearance of simultaneous operation. Linking two data centers located apart from each other — say no more than 60 miles to limit data latency issues — together with dark fiber (leased fiber optic cable) could enable both data centers to be operated as if they were in the same location, reducing staffing requirements yet providing immediate failover to the secondary data center if needed.
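A rough calculation shows why distances on the order of 60 miles keep latency low. The Python sketch below estimates propagation delay only; the refractive index of 1.47 is an assumed typical value for optical fiber, and the function name is hypothetical. Real links add delay from switching equipment and from cable routes that are longer than the straight-line distance.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in a vacuum
FIBER_REFRACTIVE_INDEX = 1.47       # assumption: typical value for optical fiber
KM_PER_MILE = 1.609344


def fiber_delay_ms(distance_miles: float, round_trip: bool = False) -> float:
    """Estimate propagation delay over fiber, in milliseconds.

    Covers propagation only; equipment and routing overhead are ignored.
    """
    distance_km = distance_miles * KM_PER_MILE
    speed_in_fiber_km_s = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    one_way_ms = distance_km / speed_in_fiber_km_s * 1000
    return 2 * one_way_ms if round_trip else one_way_ms


print(f"one way:    {fiber_delay_ms(60):.2f} ms")
print(f"round trip: {fiber_delay_ms(60, round_trip=True):.2f} ms")
```

Under these assumptions, a 60-mile link costs a little under half a millisecond each way, which is small enough for many applications to treat the two sites as one.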
This redundancy of facilities and ensured availability is of paramount importance to those needing uninterrupted data center services.
Leadership in Energy and Environmental Design (LEED) is a rating system devised by the United States Green Building Council (USGBC) for the design, construction, and operation of green buildings. Facilities can achieve ratings of certified, silver, gold, or platinum based on criteria within six categories: sustainable sites, water efficiency, energy and atmosphere, materials and resources, indoor environmental quality, and innovation and design.
Green certification has become increasingly important in data center design and operation as data centers require great amounts of electricity and often cooling water to operate. Green technologies can reduce costs for data center operation, as well as make the arrival of data centers more amenable to environmentally-conscious communities.
The ACT, Inc. data center in Iowa City, Iowa was the first data center in the U.S. to receive LEED-Platinum certification, the highest level available.
ACT Data Center exterior
ACT Data Center interior
Factors to Consider When Selecting a Data Center
There are numerous factors to consider when deciding to build or to occupy space in a data center. Aspects such as proximity to available power grids, telecommunications infrastructure, networking services, transportation lines, and emergency services can affect costs, risk, security and other factors that need to be taken into consideration.
The size of the data center will be dictated by the business requirements of the owner or tenant. A data center can occupy one room of a building, one or more floors, or an entire building. Most of the equipment is often in the form of servers mounted in 19-inch rack cabinets, which are usually placed in single rows forming corridors (so-called aisles) between them. This allows staff access to the front and rear of each cabinet. Servers differ greatly in size from 1U servers (i.e. one “U” or “RU” rack unit measuring 44.45 millimeters or 1.75 inches), to Backblaze’s Storage Pod design that fits a 4U chassis, to large freestanding storage silos that occupy many square feet of floor space.
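Rack-unit arithmetic is simple enough to sketch in Python. The conversion uses the 1.75-inch rack unit at 25.4 millimeters per inch; the 42U rack height in the usage example is an assumption (a common full-height size), not a figure from this post, and the function names are hypothetical.

```python
MM_PER_INCH = 25.4
RACK_UNIT_INCHES = 1.75  # height of one rack unit (1U)


def rack_units_to_mm(units: int) -> float:
    """Convert a height in rack units to millimeters."""
    return units * RACK_UNIT_INCHES * MM_PER_INCH


def chassis_per_rack(rack_height_u: int, chassis_height_u: int) -> int:
    """How many chassis of a given height fit in a rack, ignoring space
    reserved for switches, power distribution, or airflow."""
    return rack_height_u // chassis_height_u


print(f"4U chassis height: {rack_units_to_mm(4):.1f} mm")
print(f"4U chassis in an assumed 42U rack: {chassis_per_rack(42, 4)}")
```

In practice the usable count is lower, because some rack units are typically given over to networking and power equipment.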
Location will be one of the biggest factors to consider when selecting a data center and encompasses many other factors that should be taken into account, such as geological risks, neighboring uses, and even local flight paths. Access to suitable available power at a suitable price point is often the most critical factor and the longest lead time item, followed by broadband service availability.
With more and more data centers available providing varied levels of service and cost, the choices increase each year. Data center brokers can be employed to find a data center, just as one might use a broker for home or other commercial real estate.
Websites listing available colocation space, such as upstack.io, or entire data centers for sale or lease, are widely used. A common practice is for a customer to publish its data center requirements, and the vendors compete to provide the most attractive bid in a reverse auction.
Business and Customer Proximity
The center’s closeness to a business or organization may or may not be a factor in the site selection. The organization might wish to be close enough to manage the center or supervise the on-site staff from a nearby business location. The location of customers might be a factor, especially if data transmission speeds and latency are important, or the business or customers have regulatory, political, tax, or other considerations that dictate areas suitable or not suitable for the storage and processing of data.
Local climate is a major factor in data center design because the climatic conditions dictate what cooling technologies should be deployed. In turn, this affects uptime and the costs associated with cooling, which can total 50% or more of a center’s power costs. The topology and the cost of managing a data center in a warm, humid climate will vary greatly from managing one in a cool, dry climate. Nevertheless, data centers are located in both extremely cold regions and extremely hot ones, with innovative approaches used in both extremes to maintain desired temperatures within the center.
Geographic Stability and Extreme Weather Events
An obvious major factor in locating a data center is the stability of the actual site with regard to weather, seismic activity, and the likelihood of weather events such as hurricanes, as well as fire or flooding.
Backblaze’s Sacramento data center describes its location as one of the most stable geographic locations in California, outside fault zones and floodplains.
Sometimes the location of the center comes first and the facility is hardened to withstand anticipated threats, such as Equinix’s NAP of the Americas data center in Miami, one of the largest single-building data centers on the planet (six stories and 750,000 square feet), which is built 32 feet above sea level and designed to withstand category 5 hurricane winds.
Equinix “NAP of the Americas” Data Center in Miami
Most data centers don’t have the extreme protection or history of the Bahnhof data center, which is located inside the ultra-secure former nuclear bunker Pionen, in Stockholm, Sweden. It is buried 100 feet below ground inside the White Mountains and secured behind 15.7 in. thick metal doors. It prides itself on its self-described “Bond villain” ambiance.
Bahnhof Data Center under White Mountain in Stockholm
Usually, the data center owner or tenant will want to take into account the balance between cost and risk in the selection of a location. The Ideal quadrant below is obviously favored when making this compromise.
Risk mitigation also plays a strong role in pricing. The extent to which providers must implement special building techniques and operating technologies to protect the facility will affect price. When selecting a data center, organizations must make note of the data center’s certification level on the basis of regulatory requirements in the industry. These certifications can ensure that an organization is meeting necessary compliance requirements.
Electrical power usually represents the largest cost in a data center. The cost a service provider pays for power will be affected by the source of the power, the regulatory environment, the facility size and the rate concessions, if any, offered by the utility. At higher level tiers, battery, generator, and redundant power grids are a required part of the picture.
Fault tolerance and power redundancy are absolutely necessary to maintain uninterrupted data center operation. Parallel redundancy is a safeguard to ensure that an uninterruptible power supply (UPS) system is in place to provide electrical power if necessary. The UPS system can be based on batteries, saved kinetic energy, or some type of generator using diesel or another fuel. The center will operate on the UPS system with another UPS system acting as a backup power generator. If a power outage occurs, the additional UPS system power generator is available.
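The benefit of parallel redundancy can be estimated with a standard reliability formula: if each of n units is available with probability a and failures are independent, at least one unit is available with probability 1 - (1 - a)^n. The Python sketch below applies it; the 99 percent unit availability is an illustrative assumption, not a figure from this post, and independence rarely holds perfectly in practice.

```python
def parallel_availability(unit_availability: float, units: int) -> float:
    """Probability that at least one of several redundant units is available.

    Assumes unit failures are independent, which is an idealization:
    shared fuel supplies or a common fault can take out several units at once.
    """
    return 1 - (1 - unit_availability) ** units


for n in (1, 2, 3):
    print(f"{n} unit(s): {parallel_availability(0.99, n):.6f}")
```

Under these assumptions, adding a second unit moves availability from 99 percent to 99.99 percent, which is why redundant power paths feature so heavily in higher-availability designs.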
Many data centers require the use of independent power grids, with service provided by different utility companies or services, to protect against loss of electrical service no matter what the cause. Some data centers have intentionally located themselves near national borders so that they can obtain redundant power from not just separate grids, but from separate geopolitical sources.
Higher redundancy levels required by a company will invariably lead to higher prices. If one requires high availability backed by a service-level agreement (SLA), one can expect to pay more than another company with less demanding redundancy requirements.
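When comparing SLAs, it helps to translate an availability percentage into the downtime it permits. The Python sketch below does that conversion for a 365-day year; the percentages shown are common illustrative values, not terms offered by any provider mentioned here, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # ignores leap years


def allowed_downtime_hours(availability_percent: float) -> float:
    """Hours of downtime per year permitted by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% availability allows about {hours:.2f} hours of downtime per year")
```

Each additional "nine" cuts the permitted downtime by a factor of ten, which helps explain why stricter availability commitments cost more.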
Stay Tuned for Part 2 of The Challenges of Opening a Data Center
That’s it for part 1 of this post. In subsequent posts, we’ll take a look at some other factors to consider when moving into a data center such as network bandwidth, cooling, and security. We’ll take a look at what is involved in moving into a new data center (including stories from Backblaze’s experiences). We’ll also investigate what it takes to keep a data center running, and some of the new technologies and trends affecting data center design and use. You can discover all posts on our blog tagged with “Data Center” by following the link https://www.backblaze.com/blog/tag/data-center/.
AWS has released a new whitepaper that has been requested by many AWS customers: AWS Policy Perspectives: Data Residency. Data residency is the requirement that all customer content processed and stored in an IT system must remain within a specific country’s borders, and it is one of the foremost concerns of governments that want to use commercial cloud services. General cybersecurity concerns and concerns about government requests for data have contributed to a continued focus on keeping data within countries’ borders. In fact, some governments have determined that mandating data residency provides an extra layer of security.
This approach, however, is counterproductive to the data protection objectives and the IT modernization and global economic growth goals that many governments have set as milestones. This new whitepaper addresses the real and perceived security risks expressed by governments when they demand in-country data residency by identifying the most likely and prevalent IT vulnerabilities and security risks, explaining the native security embedded in cloud services, and highlighting the roles and responsibilities of cloud service providers (CSPs), governments, and customers in protecting data.
Large-scale, multinational CSPs, often called hyperscale CSPs, represent a transformational disruption in technology because of how they support their customers with high degrees of efficiency, agility, and innovation as part of world-class security offerings. The whitepaper explains how hyperscale CSPs, such as AWS, that might be located out of country provide their customers the ability to achieve high levels of data protection through safeguards on their own platform and with turnkey tooling for their customers. They do this while at the same time preserving nation-state regulatory sovereignty.
The whitepaper also considers the commercial, public-sector, and economic effects of data residency policies and offers considerations for governments to evaluate before enforcing requirements that can unintentionally limit public-sector digital transformation goals, in turn possibly leading to increased cybersecurity risk.
AWS continues to engage with governments around the world to hear and address their top-of-mind security concerns. We take seriously our commitment to advocate for our customers’ interests and enforce security from “ground zero.” This means that when customers use AWS, they can have the confidence that their data is protected with a level of assurance that meets, if not exceeds, their needs, regardless of where the data resides.
Interesting article by Major General Hao Yeli, Chinese People’s Liberation Army (ret.), a senior advisor at the China International Institute for Strategic Society, Vice President of China Institute for Innovation and Development Strategy, and the Chair of the Guanchao Cyber Forum.
Against the background of globalization and the internet era, the emerging cyber sovereignty concept calls for breaking through the limitations of physical space and avoiding misunderstandings based on perceptions of binary opposition. Reinforcing a cyberspace community with a common destiny, it reconciles the tension between exclusivity and transferability, leading to a comprehensive perspective. China insists on its cyber sovereignty, meanwhile, it transfers segments of its cyber sovereignty reasonably. China rightly attaches importance to its national security, meanwhile, it promotes international cooperation and open development.
China has never been opposed to multi-party governance when appropriate, but rejects the denial of government’s proper role and responsibilities with respect to major issues. The multilateral and multiparty models are complementary rather than exclusive. Governments and multi-stakeholders can play different leading roles at the different levels of cyberspace.
In the internet era, the law of the jungle should give way to solidarity and shared responsibilities. Restricted connections should give way to openness and sharing. Intolerance should be replaced by understanding. And unilateral values should yield to respect for differences while recognizing the importance of diversity.
Last week I attended a talk given by Bryan Mistele, president of Seattle-based INRIX. Bryan’s talk provided a glimpse into the future of transportation, centering around four principal attributes, often abbreviated as ACES:
Autonomous – Cars and trucks are gaining the ability to scan and to make sense of their environments and to navigate without human input.
Connected – Vehicles of all types have the ability to take advantage of bidirectional connections (either full-time or intermittent) to other cars and to cloud-based resources. They can upload road and performance data, communicate with each other to run in packs, and take advantage of traffic and weather data.
Electric – Continued development of battery and motor technology will make electric vehicles more convenient, cost-effective, and environmentally friendly.
Shared – Ride-sharing services will change usage from an ownership model to an as-a-service model (sound familiar?).
Individually and in combination, these emerging attributes mean that the cars and trucks we will see and use in the decade to come will be markedly different than those of the past.
On the Road with AWS
AWS customers are already using our AWS IoT, edge computing, Amazon Machine Learning, and Alexa products to bring this future to life – vehicle manufacturers, their tier 1 suppliers, and AutoTech startups all use AWS for their ACES initiatives. AWS Greengrass is playing an important role here, attracting design wins and helping our customers to add processing power and machine learning inferencing at the edge.
AWS customer Aptiv (formerly Delphi) talked about their Automated Mobility on Demand (AMoD) smart vehicle architecture in an AWS re:Invent session. Aptiv’s AMoD platform will use Greengrass and microservices to drive the onboard user experience, along with edge processing, monitoring, and control. Here’s an overview:
Another customer, Denso of Japan (one of the world’s largest suppliers of auto components and software) is using Greengrass and AWS IoT to support their vision of Mobility as a Service (MaaS). Here’s a video:
AWS at CES
The AWS team will be out in force at CES in Las Vegas and would love to talk to you. They’ll be running demos that show how AWS can help to bring innovation and personalization to connected and autonomous vehicles.
Personalized In-Vehicle Experience – This demo shows how AWS AI and Machine Learning can be used to create a highly personalized and branded in-vehicle experience. It makes use of Amazon Lex, Polly, and Amazon Rekognition, but the design is flexible and can be used with other services as well. The demo encompasses driver registration, login and startup (including facial recognition), voice assistance for contextual guidance, personalized e-commerce, and vehicle control. Here’s the architecture for the voice assistance:
Connected Vehicle Solution – This demo shows how a connected vehicle can combine local and cloud intelligence, using edge computing and machine learning at the edge. It handles intermittent connections and uses AWS DeepLens to train a model that responds to distracted drivers. Here’s the overall architecture, as described in our Connected Vehicle Solution:
Digital Content Delivery – This demo will show how a customer uses a web-based 3D configurator to build and personalize their vehicle. It will also show a high-resolution (4K) 3D image and an optional immersive AR/VR experience, both designed for use within a dealership.
Autonomous Driving – This demo will showcase the AWS services that can be used to build autonomous vehicles. There’s a 1/16th scale model vehicle powered and driven by Greengrass and an overview of a new AWS Autonomous Toolkit. As part of the demo, attendees drive the car, training a model via Amazon SageMaker for subsequent on-board inferencing, powered by Greengrass ML Inferencing.
To speak to one of my colleagues or to set up a time to see the demos, check out the Visit AWS at CES 2018 page.
Some Resources
If you are interested in this topic and want to learn more, the AWS for Automotive page is a great starting point, with discussions on connected vehicles & mobility, autonomous vehicle development, and digital customer engagement.
When you are ready to start building a connected vehicle, the AWS Connected Vehicle Solution contains a reference architecture that combines local computing, sophisticated event rules, and cloud-based data processing and storage. You can use this solution to accelerate your own connected vehicle projects.
At inception, Backblaze was a consumer company. Thousands upon thousands of individuals came to our website and gave us $5/mo to keep their data safe. But, we didn’t sell business solutions. It took us years before we had a sales team. In the last couple of years, we’ve released products that businesses of all sizes love: Backblaze B2 Cloud Storage and Backblaze for Business Computer Backup. Those businesses want to integrate Backblaze deeply into their infrastructure, so it’s time to hire our first Sales Engineer!
Founded in 2007, Backblaze started with a mission to make backup software elegant and provide complete peace of mind. Over the course of almost a decade, we have become a pioneer in robust, scalable, low-cost cloud backup. Recently, we launched B2 – robust and reliable object storage at just $0.005/GB/mo. Part of our differentiation is being able to offer the lowest price of any of the big players while still being profitable.
We’ve managed to nurture a team-oriented culture with amazingly low turnover. We value our people and their families. Don’t forget to check out our “About Us” page to learn more about the people and some of our perks.
We have built a profitable, high growth business. While we love our investors, we have maintained control over the business. That means our corporate goals are simple – grow sustainably and profitably.
Some Backblaze Perks:
Competitive healthcare plans
Competitive compensation and 401k
All employees receive Option grants
Unlimited vacation days
Fully stocked Micro kitchen
Catered breakfast and lunches
Awesome people who work on awesome projects
Normal work hours
Get to bring your pets into the office
San Mateo Office – located near Caltrain and Highways 101 & 280.
Backblaze B2 cloud storage is a building block for almost any computing service that requires storage. Customers need our help integrating B2 into everything from iOS apps to Docker containers. Some customers integrate directly with the API using the programming language of their choice; others want to solve a specific problem using ready-made software already integrated with B2.
At the same time, our computer backup product is deepening its integration into enterprise IT systems. We are commonly asked how to set Windows policies, integrate with Active Directory, and install the client via remote management tools.
We are looking for a sales engineer who can help our customers navigate the integration of Backblaze into their technical environments.
Are you 1/2” deep into many different technologies, and unafraid to dive deeper?
Can you confidently talk with customers about their technology, even if you have to look up all the acronyms right after the call?
Are you excited to set up complicated software in a lab and write knowledge base articles about your work?
Then Backblaze is the place for you!
Enough about Backblaze already, what’s in it for me?
In this role, you will be given the opportunity to learn about the technologies that drive innovation today: diverse technologies that customers are using day in and day out. And more importantly, you’ll learn how to learn new technologies.
Just as an example, in the past 12 months, we’ve had the opportunity to learn and become experts in these diverse technologies:
How to set up VM servers for lab environments, both on-prem and using cloud services.
Create an automatically “resetting” demo environment for the sales team.
Set up Microsoft Domain Controllers with Active Directory and AD Federation Services.
Learn the basics of OAuth and web single sign-on (SSO).
Archive video workflows from camera to media asset management systems.
How to install and monitor online backup installations using RMM tools, like JAMF.
Tape (LTO) systems. (Yes – people still use tape for storage!)
How can I know if I’ll succeed in this role?
Confidence. Be able to ask customers questions about their environments and convey to them your technical acumen.
Curiosity. Always want to learn about customers’ situations, how they got there and what problems they are trying to solve.
Organization. You’ll work with customers, integration partners, and Backblaze team members on projects of various lengths. You can context switch and either have a great memory or keep copious notes. Your checklists have their own checklists.
You are versed in:
The fundamentals of Windows, Linux and Mac OS X operating systems. You shouldn’t be afraid to use a command line.
Building, installing, integrating and configuring applications on any operating system.
Debugging failures – reading logs, monitoring usage, and effective Google searching to fix problems excite you.
The basics of TCP/IP networking and the HTTP protocol.
Novice development skills in any programming or scripting language, with a basic understanding of data structures and program flow.
Your background contains:
Bachelor’s degree in computer science or the equivalent.
2+ years of experience as a pre- or post-sales engineer.
The right extra credit:
There are literally hundreds of previous experiences you can have had that would make you perfect for this job. Some experiences that we know would be helpful for us are below, but make sure you tell us your stories!
Experience using or programming against Amazon S3.
Experience with large on-prem storage – NAS, SAN, Object. And backing up data on such storage with tools like Veeam, Veritas and others.
Experience with photo or video media. Media archiving is a key market for Backblaze B2.
Experience with Windows Servers, Active Directory, Group policies and the like.
What’s it like working with the Sales team?
The Backblaze sales team collaborates. We help each other out by sharing ideas, templates, and our customers’ experiences. When we talk about our accomplishments, there is no “I did this,” only “we.” We are truly a team.
We are honest with each other and our customers and communicate openly. We aim to have fun by embracing crazy ideas and creative solutions. We try to think not outside the box, but with no boxes at all. Customers are the driving force behind the success of the company and we care deeply about their success.
Wu builds a big manifesto about how real-world institutions can’t be trusted. Certainly, this reflects the rhetoric from a vocal wing of Bitcoin fanatics, but it’s not the Bitcoin manifesto.
Instead, the word “trust” in the Bitcoin paper is much narrower, referring to how online merchants can’t trust credit cards (for example). When I bought school supplies for my niece when she studied in Canada, the online site wouldn’t accept my U.S. credit card. They didn’t trust my credit card. However, they trusted my Bitcoin, so I used that payment method instead, and succeeded in the purchase.
Real-world currencies like dollars are tethered to the real world, which means no single transaction can be trusted, because “they” (the credit-card company, the courts, etc.) may decide to reverse the transaction. The manifesto behind Bitcoin is that a transaction cannot be reversed and thus can always be trusted.
Deliberately confusing the micro-trust in a transaction and macro-trust in banks and governments is a sort of bait-and-switch.
The wrong inspiration
“It was, after all, a carnival of human errors and misfeasance that inspired the invention of Bitcoin in 2009, namely, the financial crisis.”
Not true. Bitcoin did not appear fully formed out of the void, but was instead based upon a series of innovations that predate the financial crisis by a decade. Moreover, the financial crisis had little to do with “currency”. The values of the dollar and other major currencies were essentially unscathed by the crisis. Certainly, enthusiasts looking backward like to cherry-pick the financial crisis as yet one more reason why the offline world sucks, but it had little to do with Bitcoin.
In crypto we trust
It’s not in code that Bitcoin trusts, but in crypto. Satoshi makes that clear in one of his posts on the subject:
A generation ago, multi-user time-sharing computer systems had a similar problem. Before strong encryption, users had to rely on password protection to secure their files, placing trust in the system administrator to keep their information private. Privacy could always be overridden by the admin based on his judgment call weighing the principle of privacy against other concerns, or at the behest of his superiors. Then strong encryption became available to the masses, and trust was no longer required. Data could be secured in a way that was physically impossible for others to access, no matter for what reason, no matter how good the excuse, no matter what.
You don’t possess Bitcoins. Instead, all the coins are on the public blockchain under your “address”. What you possess is the secret, private key that matches the address. Transferring Bitcoin means using your private key to unlock your coins and transfer them to another. If you print out your private key on paper, and delete it from the computer, it can never be hacked.
Trust is in this crypto operation. Trust is in your private crypto key.
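To make “what you possess is the private key” concrete, here is a toy sketch in Python. It is not Bitcoin’s actual scheme: Bitcoin signs transactions with ECDSA over the secp256k1 curve, while this sketch swaps in a Lamport one-time signature built only from SHA-256 so that it runs with the standard library alone. The function names and the “address” format are made up for illustration.

```python
import hashlib
import secrets


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    # Private key: 256 pairs of random secrets, one pair per bit of a message hash.
    private_key = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    # Public key: the hash of every secret. Publishing it reveals nothing usable.
    public_key = [(sha256(a), sha256(b)) for a, b in private_key]
    return private_key, public_key


def address_of(public_key) -> str:
    # Toy "address" (an assumption for this sketch): a hash of the public key.
    return sha256(b"".join(a + b for a, b in public_key)).hex()


def message_bits(message: bytes):
    # The 256 bits of SHA-256(message), most significant bit first.
    digest = sha256(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private_key, message: bytes):
    # Reveal one secret from each pair, chosen by the matching hash bit.
    # A Lamport key must sign only one message, since each signature leaks half of it.
    return [private_key[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(public_key, message: bytes, signature) -> bool:
    # Anyone can check a signature using only public data.
    if len(signature) != 256:
        return False
    bits = message_bits(message)
    return all(sha256(sig) == public_key[i][bit]
               for i, (sig, bit) in enumerate(zip(signature, bits)))


if __name__ == "__main__":
    private_key, public_key = generate_keypair()
    transfer = b"move 1 coin from " + address_of(public_key).encode() + b" to bob"
    signature = sign(private_key, transfer)
    print("valid transfer:", verify(public_key, transfer, signature))
    print("tampered transfer:", verify(public_key, b"move 100 coins to mallory", signature))
```

The point of the sketch is the asymmetry: verification needs only public information, but producing a valid signature requires the secrets. Nobody else, not an administrator and not a court, can sign on your behalf without them.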
We don’t trust the code
The manifesto “in code we trust” has been proven wrong again and again. We don’t trust computer code (software) in the cryptocurrency world.
The most profound example is something known as the “DAO” on top of Ethereum, Bitcoin’s major competitor. Ethereum allows “smart contracts” containing code. The quasi-religious manifesto of the DAO smart-contract is that the “code is the contract”, that all the terms and conditions are specified within the smart-contract code, completely untethered from real-world terms-and-conditions.
Then a hacker found a bug in the DAO smart-contract and stole most of the money.
In principle, this is perfectly legal, because “the code is the contract”, and the hacker just used the code. In practice, the system didn’t live up to this. The Ethereum core developers, acting as central bankers, rewrote the Ethereum code to fix this one contract, returning the money to its original owners. They did this because those core developers were themselves heavily invested in the DAO and got their money back.
Similar things happen with the original Bitcoin code. A disagreement has arisen about how to expand Bitcoin to handle more transactions. One group wants smaller and “off-chain” transactions. Another group wants a “large blocksize”. This caused a “fork” in Bitcoin with two versions, “Bitcoin” and “Bitcoin Cash”. The fork championed by the core developers (central bankers) is worth around $20,000 right now, while the other fork is worth around $2,000.
So it’s still “in central bankers we trust”, it’s just that now these central bankers are mostly online instead of offline institutions. They have proven to be even more corrupt than real-world central bankers. It’s certainly not the code that is trusted.
Wu repeats the well-known reference to Amazon during the dot-com bubble. If you bought Amazon’s stock for $107 right before the dot-com crash, it still would be one of the wisest investments you could’ve made. Amazon shares are now worth around $1,200 each.
The implication is that Bitcoin, too, may have such long-term value. Even if you buy it today and it crashes tomorrow, it may still be worth ten times its current value in another decade or two.
This is a poor analogy, for three reasons.
The first reason is that we knew the Internet had fundamentally transformed commerce. We knew there were going to be winners in the long run; it was just a matter of picking who would win (Amazon) and who would lose (Pets.com). We have yet to prove Bitcoin will be similarly transformative.
The second reason is that businesses are real: they generate real income. While the stock price may include some irrational exuberance, it’s ultimately still based on the rational expectations of how much the business will earn. With Bitcoin, it’s almost entirely irrational exuberance: there are no long-term returns.
The third flaw in the analogy is that there are an essentially infinite number of cryptocurrencies. We saw this today as Coinbase started trading Bitcoin Cash, a fork of Bitcoin. The two are nearly identical, so there’s little reason one should be so much more valuable than the other. It’s only a fickle fad that makes one more valuable than another, not business fundamentals. The successful future cryptocurrency is unlikely to exist today, but will be invented in the future.
The lesson of the dot-com bubble is not that Bitcoin will have long-term value, but that cryptocurrency companies like Coinbase and BitPay will have long-term value. Or, the lesson is that “old” companies like JPMorgan that are early adopters of the technology will grow faster than their competitors.
The point of Wu’s paper is to distinguish between trust in traditional real-world institutions and trust in computer software code. This is an inaccurate reading of the situation.
Bitcoin is not about replacing real-world institutions but about untethering online transactions.
The trust in Bitcoin is in crypto — the power crypto gives individuals instead of third-parties.
The trust is not in the code. Bitcoin is a “cryptocurrency” not a “codecurrency”.