Tag Archives: Case Study

Future-Proofing Backups for Your Business

Post Syndicated from Natasha Rabinov original https://www.backblaze.com/blog/future-proofing-backups-for-your-business/

screenshot of PagerDuty dashboard

An alert from PagerDuty sets off alarm bells for anyone in IT. Alerts might signify that a disk is nearly full or has failed entirely. Although unpleasant and imminently critical, hard drive failures come as no surprise to IT Managers. They are prepared for those alerts and have likely seen such incidents and failures before. Experience has shown them that it is not a question of if hard drives will fail, but when.

In fact, from the moment they are hired, IT Managers begin protecting company data and assessing points of failure. On that first day, the threat of data loss may inadvertently come from internal mistakes.

When my execs are on the road, what happens if they lose their laptop? How can I get them the latest version of their files, no matter where they are?

Although execs may be the ones losing their laptops, they will surely turn to IT Managers to recover data right away.

Questions continue to build up when thinking about company growth and the impact on IT.

We just hired another five people and the server is almost full. When will I have both time and budget to spin up another one?

IT Managers are typically not lacking in projects; they are often short on time. Budgets certainly matter but time management is also a problem — there already isn’t enough time in the day.

Many of these IT issues can be mitigated if they are tackled early on. The right backup solution is simple, silent, and affordable. A low-touch solution can give IT Managers back two precious resources, time and budget. They can move on to other projects while their backups run automatically in the background, not interrupting their users. The best plan is something that scales as the company grows from its first IT manager through IPO.

A good example of a company that future-proofed their backups is PagerDuty. At the time of their first IT hire, they accurately assessed their current and future backup needs. Here is their story.

PagerDuty server rack

Case Study: How PagerDuty Future Proofed Backups

The first thing Matt Spring, IT Manager, noticed when he joined PagerDuty was that they worked in the cloud. While everyone carried around a laptop or perhaps had a desktop system, there were no file servers, no database servers, no mail servers, no servers of any kind located in the office. At first glance, it seemed everyone simply connected to the internet and used cloud-based applications, but as Matt soon discovered many people were also using Mac-based applications as well. Matt instinctively knew he had a backup problem and he had to act to ensure the organization would not lose important data.

The backup problem that Matt faced is one encountered by companies that use both cloud-based and PC/Mac-based applications in their environment. For example, a company might use a cloud-based HR system, but Office applications on their laptops and desktops. While Matt had some confidence the cloud-based data was backed up, the local data on the company’s laptops and desktops was not being backed up.

As Matt was building a list of backup vendors to consider, he included Backblaze. He was familiar with Backblaze because he had been following their Hard Drive Stats blog posts. He appreciated the company’s transparency and included them in the list. His primary criteria for selecting a backup service were:

  • Able to be installed with little or no user involvement
  • Automatically back up all the data on a laptop or desktop with no user intervention
  • Affordable

After a review process, he chose Backblaze Business Backup.

As PagerDuty grew, so did the number of laptops and desktops, and Matt and his team ensured that Backblaze was installed on all of them. This was especially important to PagerDuty as some of the newly hired employees worked in locations across the globe. Matt could send them a system provisioned with Backblaze and from the moment the new employee started working, they were being backed up to the Backblaze cloud.

One of the features that Matt likes is how Backblaze scans the users’ system looking for data to back up versus having to pick and choose folders and files. He points out how he once restored an Office autorecovery file from Backblaze when a user forgot to save several hours of work before their system crashed. He commented that, “no other cloud backup system that I know of would have automatically backed up that file.”

Over the years that PagerDuty has been a Backblaze Business Backup customer, they’ve had several instances where they needed to restore data. Every restore has been successful.

As a bonus, most users find the restore process easy enough so they can restore their own files, but Matt and his team have from time to time done complete system restores to replace a failed or lost system.

“No other cloud backup system that I know of would have automatically backed up that file.”

— Matt Spring, PagerDuty

Recently, Matt and his team upgraded for free to the most recent release of Backblaze Business Backup. This release included Business Groups. This feature allows administrators to organize users into groups for billing and management purposes. For example, groups might be created for administrators, executives, and staff. Groups can also be managed or unmanaged. Contractors and interns could be in a managed group where IT controls the restore process, while company employees could be in one or more unmanaged groups, as desired.

Matt Appreciates the Flexibility That Business Groups Provides

As PagerDuty continues to grow, Matt expects the organization to be more and more dispersed, with even more employees and contractors spread out internationally. While this presents many challenges for IT, one thing Matt already has covered is having the data sitting on the company’s laptops and desktops automatically and securely backed up to the Backblaze cloud.

If you are interested in learning more, you can read the PagerDuty case study. Or, if you would like to back up all of your end users’ data today and future proof your backups, we invite you to try out Backblaze Business Backup.

The post Future-Proofing Backups for Your Business appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Panna Cooking Creates the Perfect Storage Recipe with Backblaze and 45 Drives

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/panna-cooking-creates-perfect-storage-recipe/

Panna Cooking custard dessert with strawberries

Panna Cooking is the smart home cook’s go-to resource for learning to cook, video recipes, and shopping lists, all delivered by 40+ of the world’s best chefs. Video is the primary method Panna uses to communicate with their customers. Joshua Stenseth is a full-time editor and part-time IT consultant in charge of wrangling the video content Panna creates.

Like many organizations, digital media archive storage wasn’t top of mind over the years at Panna Cooking. Over time, more and more media projects were archived to various external hard drives and dutifully stored in the archive closet. The external drive archive solution was inexpensive and was easy to administer for Joshua, until…

Joshua stared at the request from the chef. She wanted to update a recipe video from a year ago to include in her next weekly video. The edits being requested were the easy part as the new footage was ready to go. The trouble was locating the old digital video files. Over time, Panna had built up a digital video archive that resided on over 100 external hard drives scattered around the office. The digital files that Joshua needed were on one of those drives.

Panna Cooking, like many growing organizations, learned that the easy-to-do tasks, like using external hard drives for data archiving, don’t scale very well. This is especially true for media content such as video and photographs. It is easy to get overwhelmed.

Panna Cooking dessert

Joshua was given the task of ensuring all the content created by Panna was economically stored, readily available, and secured off site. The Panna Cooking case study details how Joshua was able to consolidate their existing scattered archive by employing the Hybrid Cloud Storage package from 45 Drives, a flexible, highly affordable, media archiving solution consisting of a 45 Drives Storinator storage server, Rclone, Duplicity, and B2 Cloud Storage.

The 45 Drives’ innovative Hybrid Cloud Storage package and their partnership with Backblaze B2 Cloud Storage was the perfect solution. The Hybrid Cloud Storage package installs on the Storinator system and utilizes Rclone or Duplicity to back up or sync files to the B2 cloud. This gave Panna a fully operational local storage system that sends changes automatically to the B2 cloud. For Joshua and his fellow editors, the Storinator/Backblaze B2 solution gave them the best of both worlds, high performance local storage and readily accessible, affordable off-site cloud storage all while eliminating their old archive: the closet full of external hard drives.

The post Panna Cooking Creates the Perfect Storage Recipe with Backblaze and 45 Drives appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

American Public Television Embraces the Cloud — And the Future

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/american-public-television-embraces-the-cloud-and-the-future/

American Public Television website

American Public Television was like many organizations that have been around for a while. They were entrenched using an older technology — in their case, tape storage and distribution — that once met their needs but was limiting their productivity and preventing them from effectively collaborating with their many media partners. APT’s VP of Technology knew that he needed to move into the future and embrace cloud storage to keep APT ahead of the game.
Since 1961, American Public Television (APT) has been a leading distributor of groundbreaking, high-quality, top-rated programming to the nation’s public television stations. Gerry Field is the Vice President of Technology at APT and is responsible for delivering their extensive program catalog to 350+ public television stations nationwide.

In the time since Gerry  joined APT in 2007, the industry has been in digital overdrive. During that time APT has continued to acquire and distribute the best in public television programming to their technically diverse subscribers.

This created two challenges for Gerry. First, new technology and format proliferation were driving dramatic increases in digital storage. Second, many of APT’s subscribers struggled to keep up with the rapidly changing industry. While some subscribers had state-of-the-art satellite systems to receive programming, others had to wait for the post office to drop off programs recorded on tape weeks earlier. With no slowdown on the horizon of innovation in the industry, Gerry knew that his storage and distribution systems would reach a crossroads in no time at all.

American Public Television logo

Living the tape paradigm

The digital media industry is only a few years removed from its film, and later videotape, roots. Tape was the input and the output of the industry for many years. As a consequence, the tools and workflows used by the industry were built and designed to work with tape. Over time, the “file” slowly replaced the tape as the object to be captured, edited, stored and distributed. Trouble was, many of the systems and more importantly workflows were based on processing tape, and these have proven to be hard to change.

At APT, Gerry realized the limits of the tape paradigm and began looking for technologies and solutions that enabled workflows based on file and object based storage and distribution.

Thinking file based storage and distribution

For data (digital media) storage, APT, like everyone else, started by installing onsite storage servers. As the amount of digital data grew, more storage was added. In addition, APT was expanding its distribution footprint by creating or partnering with distribution channels such as CreateTV and APT Worldwide. This dramatically increased the number of programming formats and the amount of data that had to be stored. As a consequence, updating, maintaining, and managing the APT storage systems was becoming a major challenge and a major resource hog.

APT Online

Knowing that his in-house storage system was only going to cost more time and money, Gerry decided it was time to look at cloud storage. But that wasn’t the only reason he looked at the cloud. While most people consider cloud storage as just a place to back up and archive files, Gerry was envisioning how the ubiquity of the cloud could help solve his distribution challenges. The trouble was the price of cloud storage from vendors like Amazon S3 and Microsoft Azure was a non-starter, especially for a non-profit. Then Gerry came across Backblaze. B2 Cloud Storage service met all of his performance requirements, and at $0.005/GB/month for storage and $0.01/GB for downloads it was nearly 75% less than S3 or Azure.

Gerry did the math and found that he could economically incorporate B2 Cloud Storage into his IT portfolio, using it for both program submission and for active storage and archiving of the APT programs. In addition, B2 now gives him the foundation necessary to receive and distribute programming content over the Internet. This is especially useful for organizations that can’t conveniently access satellite distribution systems. Not to mention downloading from the cloud is much faster than sending a tape through the mail.

Adding B2 Cloud Storage to their infrastructure has helped American Public Television address two key challenges. First, they now have “unlimited” storage in the cloud without having to add any hardware. In addition, with B2, they only pay for the storage they use. That means they don’t have to buy storage upfront trying to match the maximum amount of storage they’ll ever need. Second, by using B2 as a distribution source for their programming APT subscribers, especially the smaller and remote ones, can get content faster and more reliably without having to perform costly upgrades to their infrastructure.

The road ahead

As APT gets used to their file based infrastructure and workflow, there are a number of cost saving and income generating ideas they are pondering which are now worth considering. Here are a few:

Program Submissions — New content can be uploaded from anywhere using a web browser, an Internet connection, and a login. For example, a producer in Cambodia can upload their film to B2. From there the film is downloaded to an in-house system where it is processed and transcoded using compute. The finished film is added to the APT catalog and added to B2. Once there, the program is instantly available for subscribers to order and download.

“The affordability and performance of Backblaze B2 is what allowed us to make the B2 cloud part of the APT data storage and distribution strategy into the future.” — Gerry Field

Easier Previews — At any time, work in process or finished programs can be made available for download from the B2 cloud. One place this could be useful is where a subscriber needs to review a program to comply with local policies and practices before airing. In the old system, each “one-off” was a time consuming manual process.

Instant Subscriptions — There are many organizations such as schools and businesses that want to use just one episode of a desired show. With an e-commerce based website, current or even archived programming kept in B2 could be available to download or stream for a minimal charge.

At APT there were multiple technologies needed to make their file-based infrastructure work, but as Gerry notes, having an affordable, trustworthy, cloud storage service like B2 is one of the critical building blocks needed to make everything work together.

The post American Public Television Embraces the Cloud — And the Future appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Simplicity is a Feature for Cloud Backup

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/distributed-cloud-backup-for-businesses/

cloud on a blue background
For Joel Wagener, Director of IT at AIBS, simplicity is an important feature he looks for in software applications to use in his organization. So maybe it’s not unexpected that Joel chose Backblaze for Business to back up AIBS’s staff computers. According to Joel, “It just works.”American Institute of Biological Sciences

AIBS (The American Institute of Biological Sciences) is a non-profit scientific association dedicated to advancing biological research and education. Founded in 1947 as part of the National Academy of Sciences, AIBS later became independent and now has over 100 member organizations. AIBS works to ensure that the public, legislators, funders, and the community of biologists have access to and use information that will guide them in making informed decisions about matters that require biological knowledge.

AIBS started using Backblaze for Business Cloud Backup several years ago to make sure that the organization’s data was backed up and protected from accidental loss or computer failure. AIBS is based in Washington, D.C., but is a virtual organization, with staff dispersed around the United States. AIBS needed a backup solution that worked anywhere a staff member was located, and was easy to use, as well. Joel has made Backblaze a default part of the configuration management for all the AIBS endpoints, which in their case are exclusively Macintosh.

AIBS biological images

“We started using Backblaze on a single computer in 2014, then not too long after that decided to deploy it to all our endpoints,” explains Joel. “We use Groups to oversee backups and for central billing, but we let each user manage their own computer and restore files on their own if they need to.”

“Backblaze stays out of the way until we need it. It’s fairly lightweight, and I appreciate that it’s simple,” says Joel. “It doesn’t throttle backups and the price point is good. I have family members who use Backblaze, as well.”

Backblaze’s Groups feature permits an organization to oversee and manage the user accounts, including restores, or let users handle that themselves. This flexibility fits a variety of organizations, where various degrees of oversight or independence are desirable. The finance and HR departments could manage their own data, for example, while the rest of the organization could be managed by IT. All groups can be billed centrally no matter how other functionality is set up.

“If we have a computer that needs repair, we can put a loaner computer in that person’s hands and they can immediately get the data they need directly from the Backblaze cloud backup, which is really helpful. When we get the original computer back from repair we can do a complete restore and return it to the user all ready to go again. When we’ve needed restores, Backblaze has been reliable.”

Joel also likes that the memory footprint of Backblaze is light — the clients for both Macintosh and Windows are native, and designed to use minimum system resources and not impact any applications used on the computer. He also likes that updates to the client software are pushed out when necessary.

Backblaze for Business

Backblaze for Business also helps IT maintain archives of users’ computers after they leave the organization.

“We like that we have a ready-made archive of a computer when someone leaves,” said Joel. The Backblaze backup is there if we need to retrieve anything that person was working on.”

There are other capabilities in Backblaze that Joel likes, but hasn’t had a chance to use yet.

“We’ve used Casper (Jamf) to deploy and manage software on endpoints without needing any interaction from the user. We haven’t used it yet for Backblaze, but we know that Backblaze supports it. It’s a handy feature to have.”

”It just works.”
— Joel Wagener, AIBS Director of IT

Perhaps the best thing about Backblaze for Business isn’t a specific feature that can be found on a product data sheet.

“When files have been lost, Backblaze has provided us access to multiple prior versions, and this feature has been important and worked well several times,” says Joel.

“That provides needed peace of mind to our users, and our IT department, as well.”

The post Simplicity is a Feature for Cloud Backup appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Amazon Redshift – 2017 Recap

Post Syndicated from Larry Heathcote original https://aws.amazon.com/blogs/big-data/amazon-redshift-2017-recap/

We have been busy adding new features and capabilities to Amazon Redshift, and we wanted to give you a glimpse of what we’ve been doing over the past year. In this article, we recap a few of our enhancements and provide a set of resources that you can use to learn more and get the most out of your Amazon Redshift implementation.

In 2017, we made more than 30 announcements about Amazon Redshift. We listened to you, our customers, and delivered Redshift Spectrum, a feature of Amazon Redshift, that gives you the ability to extend analytics to your data lake—without moving data. We launched new DC2 nodes, doubling performance at the same price. We also announced many new features that provide greater scalability, better performance, more automation, and easier ways to manage your analytics workloads.

To see a full list of our launches, visit our what’s new page—and be sure to subscribe to our RSS feed.

Major launches in 2017

Amazon Redshift Spectrumextend analytics to your data lake, without moving data

We launched Amazon Redshift Spectrum to give you the freedom to store data in Amazon S3, in open file formats, and have it available for analytics without the need to load it into your Amazon Redshift cluster. It enables you to easily join datasets across Redshift clusters and S3 to provide unique insights that you would not be able to obtain by querying independent data silos.

With Redshift Spectrum, you can run SQL queries against data in an Amazon S3 data lake as easily as you analyze data stored in Amazon Redshift. And you can do it without loading data or resizing the Amazon Redshift cluster based on growing data volumes. Redshift Spectrum separates compute and storage to meet workload demands for data size, concurrency, and performance. Redshift Spectrum scales processing across thousands of nodes, so results are fast, even with massive datasets and complex queries. You can query open file formats that you already use—such as Apache Avro, CSV, Grok, ORC, Apache Parquet, RCFile, RegexSerDe, SequenceFile, TextFile, and TSV—directly in Amazon S3, without any data movement.

For complex queries, Redshift Spectrum provided a 67 percent performance gain,” said Rafi Ton, CEO, NUVIAD. “Using the Parquet data format, Redshift Spectrum delivered an 80 percent performance improvement. For us, this was substantial.

To learn more about Redshift Spectrum, watch our AWS Summit session Intro to Amazon Redshift Spectrum: Now Query Exabytes of Data in S3, and read our announcement blog post Amazon Redshift Spectrum – Exabyte-Scale In-Place Queries of S3 Data.

DC2 nodes—twice the performance of DC1 at the same price

We launched second-generation Dense Compute (DC2) nodes to provide low latency and high throughput for demanding data warehousing workloads. DC2 nodes feature powerful Intel E5-2686 v4 (Broadwell) CPUs, fast DDR4 memory, and NVMe-based solid state disks (SSDs). We’ve tuned Amazon Redshift to take advantage of the better CPU, network, and disk on DC2 nodes, providing up to twice the performance of DC1 at the same price. Our DC2.8xlarge instances now provide twice the memory per slice of data and an optimized storage layout with 30 percent better storage utilization.

Redshift allows us to quickly spin up clusters and provide our data scientists with a fast and easy method to access data and generate insights,” said Bradley Todd, technology architect at Liberty Mutual. “We saw a 9x reduction in month-end reporting time with Redshift DC2 nodes as compared to DC1.”

Read our customer testimonials to see the performance gains our customers are experiencing with DC2 nodes. To learn more, read our blog post Amazon Redshift Dense Compute (DC2) Nodes Deliver Twice the Performance as DC1 at the Same Price.

Performance enhancements— 3x-5x faster queries

On average, our customers are seeing 3x to 5x performance gains for most of their critical workloads.

We introduced short query acceleration to speed up execution of queries such as reports, dashboards, and interactive analysis. Short query acceleration uses machine learning to predict the execution time of a query, and to move short running queries to an express short query queue for faster processing.

We launched results caching to deliver sub-second response times for queries that are repeated, such as dashboards, visualizations, and those from BI tools. Results caching has an added benefit of freeing up resources to improve the performance of all other queries.

We also introduced late materialization to reduce the amount of data scanned for queries with predicate filters by batching and factoring in the filtering of predicates before fetching data blocks in the next column. For example, if only 10 percent of the table rows satisfy the predicate filters, Amazon Redshift can potentially save 90 percent of the I/O for the remaining columns to improve query performance.

We launched query monitoring rules and pre-defined rule templates. These features make it easier for you to set metrics-based performance boundaries for workload management (WLM) queries, and specify what action to take when a query goes beyond those boundaries. For example, for a queue that’s dedicated to short-running queries, you might create a rule that aborts queries that run for more than 60 seconds. To track poorly designed queries, you might have another rule that logs queries that contain nested loops.

Customer insights

Amazon Redshift and Redshift Spectrum serve customers across a variety of industries and sizes, from startups to large enterprises. Visit our customer page to see the success that customers are having with our recent enhancements. Learn how companies like Liberty Mutual Insurance saw a 9x reduction in month-end reporting time using DC2 nodes. On this page, you can find case studies, videos, and other content that show how our customers are using Amazon Redshift to drive innovation and business results.

In addition, check out these resources to learn about the success our customers are having building out a data warehouse and data lake integration solution with Amazon Redshift:

Partner solutions

You can enhance your Amazon Redshift data warehouse by working with industry-leading experts. Our AWS Partner Network (APN) Partners have certified their solutions to work with Amazon Redshift. They offer software, tools, integration, and consulting services to help you at every step. Visit our Amazon Redshift Partner page and choose an APN Partner. Or, use AWS Marketplace to find and immediately start using third-party software.

To see what our Partners are saying about Amazon Redshift Spectrum and our DC2 nodes mentioned earlier, read these blog posts:

Resources

Blog posts

Visit the AWS Big Data Blog for a list of all Amazon Redshift articles.

YouTube videos

GitHub

Our community of experts contribute on GitHub to provide tips and hints that can help you get the most out of your deployment. Visit GitHub frequently to get the latest technical guidance, code samples, administrative task automation utilities, the analyze & vacuum schema utility, and more.

Customer support

If you are evaluating or considering a proof of concept with Amazon Redshift, or you need assistance migrating your on-premises or other cloud-based data warehouse to Amazon Redshift, our team of product experts and solutions architects can help you with architecting, sizing, and optimizing your data warehouse. Contact us using this support request form, and let us know how we can assist you.

If you are an Amazon Redshift customer, we offer a no-cost health check program. Our team of database engineers and solutions architects give you recommendations for optimizing Amazon Redshift and Amazon Redshift Spectrum for your specific workloads. To learn more, email us at [email protected].

If you have any questions, email us at [email protected].

 


Additional Reading

If you found this post useful, be sure to check out Amazon Redshift Spectrum – Exabyte-Scale In-Place Queries of S3 Data, Using Amazon Redshift for Fast Analytical Reports and How to Migrate Your Oracle Data Warehouse to Amazon Redshift Using AWS SCT and AWS DMS.


About the Author

Larry Heathcote is a Principle Product Marketing Manager at Amazon Web Services for data warehousing and analytics. Larry is passionate about seeing the results of data-driven insights on business outcomes. He enjoys family time, home projects, grilling out and the taste of classic barbeque.

 

 

 

Ode to ‘Locate My Computer’

Post Syndicated from Yev original https://www.backblaze.com/blog/laptop-locator-can-save-you/

Laptop locator signal

Some things don’t get the credit they deserve. For one of our engineers, Billy, the Locate My Computer feature is near and dear to his heart. It took him a while to build it, and it requires some regular updates, even after all these years. Billy loves the Locate My Computer feature, but really loves knowing how it’s helped customers over the years. One recent story made us decide to write a bit of a greatest hits post as an ode to one of our favorite features — Locate My Computer.

What is it?

Locate My Computer, as you’ll read in the stories below, came about because some of our users had their computers stolen and were trying to find a way to retrieve their devices. They realized that while some of their programs and services like Find My Mac were wiped, in some cases, Backblaze was still running in the background. That created the ability to use our software to figure out where the computer was contacting us from. After manually helping some of the individuals that wrote in, we decided to build it in as a feature. Little did we know the incredible stories it would lead to. We’ll get into that, but first, a little background on why the whole thing came about.

Identifying the Customer Need

“My friend’s laptop was stolen. He tracked the thief via @Backblaze for weeks & finally identified him on Facebook & Twitter. Digital 007.”

Mat —
In December 2010, we saw a tweet from @DigitalRoyalty which read: “My friend’s laptop was stolen. He tracked the thief via @Backblaze for weeks & finally identified him on Facebook & Twitter. Digital 007.” Our CEO was manning Twitter at the time and reached out for the whole story. It turns out that Mat Miller had his laptop stolen, and while he was creating some restores a few days later, he noticed a new user was created on his computer and was backing up data. He restored some of those files, saw some information that could help identify the thief, and filed a police report. Read the whole story: Digital 007 — Outwitting The Thief.

Mark —
Following Mat Miller’s story we heard from Mark Bao, an 18-year old entrepreneur and student at Bentley University who had his laptop stolen. The laptop was stolen out of Mark’s dorm room and the thief started using it in a variety of ways, including audition practice for Dancing with the Stars. Once Mark logged in to Backblaze and saw that there were new files being uploaded, including a dance practice video, he was able to reach out to campus police and got his laptop back. You can read more about the story on: 18 Year Old Catches Thief Using Backblaze.

After Mat and Mark’s story we thought we were onto something. In addition to those stories that had garnered some media attention, we would occasionally get requests from users that said something along the lines of, “Hey, my laptop was stolen, but I had Backblaze installed. Could you please let me know if it’s still running, and if so, what the IP address is so that I can go to the authorities?” We would help them where we could, but knew that there was probably a much more efficient method of helping individuals and businesses keep track of their computers.

Some of the Greatest Hits, and the Mafia Story

In May of 2011, we launched “Locate My Computer.” This was our way of adding a feature to our already-popular backup client that would allow users to see a rough representation of where their computer was located, and the IP address associated with its last known transmission. After speaking to law enforcement, we learned that those two things were usually enough for the authorities to subpoena an ISP and get the physical address of the last known place the computer phoned home from. From there, they could investigate and, if the device was still there, return it to its rightful owner.

Bridgette —
Once the feature went live the stories got even more interesting. Almost immediately after we launched Locate My Computer, we were contacted by Bridgette, who told us of a break-in at her house. Luckily no one was home at the time, but the thief was able to get away with her iMac, DSLR, and a few other prized possessions. As soon as she reported the robbery to the police, they were able to use the Locate My Computer feature to find the thief’s location and recover her missing items. We even made a case study out of Bridgette’s experience. You can read it at: Backblaze And The Stolen iMac.

“Joe” —
The crazy recovery stories didn’t end there. Shortly after Bridgette’s story, we received an email from a user (“Joe” — to protect the innocent) who was traveling to Argentina from the United States and had his laptop stolen. After he contacted the police department in Buenos Aires, and explained to them that he was using Backblaze (which the authorities thought was a computer tracking service, and in this case, we were), they were able to get the location of the computer from an ISP in Argentina. When they went to investigate, they realized that the perpetrators were foreign nationals connected to the mafia, and that in addition to a handful of stolen laptops, they were also in the possession of over $1,000,000 in counterfeit currency! Read the whole story about “Joe” and how: Backblaze Found $1 Million in Counterfeit Cash!

The Maker —
After “Joe,” we thought that our part in high-profile “busts was over, but we were wrong. About a year later we received word from a “maker” who told us that he was able to act as an “internet super-sleuth” and worked hard to find his stolen computer. After a Maker Faire in Detroit, the maker’s car was broken into while they were getting BBQ following a successful show. While some of the computers were locked and encrypted, others were in hibernation mode and wide open to prying eyes. After the police report was filed, the maker went to Backblaze to retrieve his lost files and remembered seeing the little Locate My Computer button. That’s when the story gets really interesting. The victim used a combination of ingenuity, Craigslist, Backblaze, and the local police department to get his computer back, and make a drug bust along the way. Head over to Makezine.com to read about how:How Tracking Down My Stolen Computer Triggered a Drug Bust.

Una —
While we kept hearing praise and thanks from our customers who were able to recover their data and find their computers, a little while passed before we would hear a story that was as incredible as the ones above. In July of 2016, we received an email from Una who told us one of the most amazing stories of perseverance that we’d ever heard. With the help of Backblaze and a sympathetic constable in Australia, Una tracked her stolen computer’s journey across 6 countries. She got her computer back and we wrote up the whole story: How Una Found Her Stolen Laptop.

And the Hits Keep on Coming

The most recent story came from “J,” and we’ll share the whole thing with you because it has a really nice conclusion:

Back in September of 2017, I brought my laptop to work to finish up some administrative work before I took off for a vacation. I work in a mall where traffic [is] plenty and more specifically I work at a kiosk in the middle of the mall. This allows for a high amount of traffic passing by every few seconds. I turned my back for about a minute to put away some paperwork. At the time I didn’t notice my laptop missing. About an hour later when I was gathering my belongings for the day I noticed it was gone. I was devastated. This was a high end MacBook Pro that I just purchased. So we are not talking about a little bit of money here. This was a major investment.

Time [went] on. When I got back from my vacation I reached out to my LP (Loss Prevention) team to get images from our security to submit to the police with some thread of hope that they would find whomever stole it. December approached and I did not hear anything. I gave up hope and assumed that the laptop was scrapped. I put an iCloud lock on it and my Find My Mac feature was saying that laptop was “offline.” I just assumed that they opened it, saw it was locked, and tried to scrap it for parts.

Towards the end of January I got an email from Backblaze saying that the computer was successfully backed up. This came as a shock to me as I thought it was wiped. But I guess however they wiped it didn’t remove Backblaze from the SSD. None the less, I was very happy. I sifted through the backup and found the person’s name via the search history. Then, using the Locate my Computer feature I saw where it came online. I reached out on social media to the person in question and updated the police. I finally got ahold of the person who stated she bought it online a few weeks backs. We made arrangements and I’m happy to say that I am typing this email on my computer right now.

J finished by writing: “Not only did I want to share this story with you but also wanted to say thanks! Apple’s find my computer system failed. The police failed to find it. But Backblaze saved the day. This has been the best $5 a month I have ever spent. Not only that but I got all my stuff back. Which made the deal even better! It was like it was never gone.”

Have a Story of Your Own?

We’re more than thrilled to have helped all of these people restore their lost data using Backblaze. Recovering the actual machine using Locate My Computer though, that’s the icing on the cake. We’re proud of what we’ve been able to build here at Backblaze, and we really enjoy hearing stories from people who have used our service to successfully get back up and running, whether that meant restoring their data or recovering their actual computer.

If you have any interesting data recovery or computer recovery stories that you’d like to share with us, please email press@backblaze.com and we’ll share it with Billy and the rest of the Backblaze team. We love hearing them!

The post Ode to ‘Locate My Computer’ appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Now Available: New Digital Training to Help You Learn About AWS Big Data Services

Post Syndicated from Sara Snedeker original https://aws.amazon.com/blogs/big-data/now-available-new-digital-training-to-help-you-learn-about-aws-big-data-services/

AWS Training and Certification recently released free digital training courses that will make it easier for you to build your cloud skills and learn about using AWS Big Data services. This training includes courses like Introduction to Amazon EMR and Introduction to Amazon Athena.

You can get free and unlimited access to more than 100 new digital training courses built by AWS experts at aws.training. It’s easy to access training related to big data. Just choose the Analytics category on our Find Training page to browse through the list of courses. You can also use the keyword filter to search for training for specific AWS offerings.

Recommended training

Just getting started, or looking to learn about a new service? Check out the following digital training courses:

Introduction to Amazon EMR (15 minutes)
Covers the available tools that can be used with Amazon EMR and the process of creating a cluster. It includes a demonstration of how to create an EMR cluster.

Introduction to Amazon Athena (10 minutes)
Introduces the Amazon Athena service along with an overview of its operating environment. It covers the basic steps in implementing Athena and provides a brief demonstration.

Introduction to Amazon QuickSight (10 minutes)
Discusses the benefits of using Amazon QuickSight and how the service works. It also includes a demonstration so that you can see Amazon QuickSight in action.

Introduction to Amazon Redshift (10 minutes)
Walks you through Amazon Redshift and its core features and capabilities. It also includes a quick overview of relevant use cases and a short demonstration.

Introduction to AWS Lambda (10 minutes)
Discusses the rationale for using AWS Lambda, how the service works, and how you can get started using it.

Introduction to Amazon Kinesis Analytics (10 minutes)
Discusses how Amazon Kinesis Analytics collects, processes, and analyzes streaming data in real time. It discusses how to use and monitor the service and explores some use cases.

Introduction to Amazon Kinesis Streams (15 minutes)
Covers how Amazon Kinesis Streams is used to collect, process, and analyze real-time streaming data to create valuable insights.

Introduction to AWS IoT (10 minutes)
Describes how the AWS Internet of Things (IoT) communication architecture works, and the components that make up AWS IoT. It discusses how AWS IoT works with other AWS services and reviews a case study.

Introduction to AWS Data Pipeline (10 minutes)
Covers components like tasks, task runner, and pipeline. It also discusses what a pipeline definition is, and reviews the AWS services that are compatible with AWS Data Pipeline.

Go deeper with classroom training

Want to learn more? Enroll in classroom training to learn best practices, get live feedback, and hear answers to your questions from an instructor.

Big Data on AWS (3 days)
Introduces you to cloud-based big data solutions such as Amazon EMR, Amazon Redshift, Amazon Kinesis, and the rest of the AWS big data platform.

Data Warehousing on AWS (3 days)
Introduces you to concepts, strategies, and best practices for designing a cloud-based data warehousing solution, and demonstrates how to collect, store, and prepare data for the data warehouse.

Building a Serverless Data Lake (1 day)
Teaches you how to design, build, and operate a serverless data lake solution with AWS services. Includes topics such as ingesting data from any data source at large scale, storing the data securely and durably, using the right tool to process large volumes of data, and understanding the options available for analyzing the data in near-real time.

More training coming in 2018

We’re always evaluating and expanding our training portfolio, so stay tuned for more training options in the new year. You can always visit us at aws.training to explore our latest offerings.

How to Make Your Web App More Reliable and Performant Using webpack: a Yahoo Mail Case Study

Post Syndicated from mikesefanov original https://yahooeng.tumblr.com/post/168508200981

yahoodevelopers:

image

By Murali Krishna Bachhu, Anurag Damle, and Utkarsh Shrivastava

As engineers on the Yahoo Mail team at Oath, we pride ourselves on the things that matter most to developers: faster development cycles, more reliability, and better performance. Users don’t necessarily see these elements, but they certainly feel the difference they make when significant improvements are made. Recently, we were able to upgrade all three of these areas at scale by adopting webpack® as Yahoo Mail’s underlying module bundler, and you can do the same for your web application.

What is webpack?

webpack is an open source module bundler for modern JavaScript applications. When webpack processes your application, it recursively builds a dependency graph that includes every module your application needs. Then it packages all of those modules into a small number of bundles, often only one, to be loaded by the browser.

webpack became our choice module bundler not only because it supports on-demand loading, multiple bundle generation, and has a relatively low runtime overhead, but also because it is better suited for web platforms and NodeJS apps and has great community support.

image

Comparison of webpack to other open source bundlers


How did we integrate webpack?

Like any developer does when integrating a new module bundler, we started integrating webpack into Yahoo Mail by looking at its basic config file. We explored available default webpack plugins as well as third-party webpack plugins and then picked the plugins most suitable for our application. If we didn’t find a plugin that suited a specific need, we wrote the webpack plugin ourselves (e.g., We wrote a plugin to execute Atomic CSS scripts in the latest Yahoo Mail experience in order to decrease our overall CSS payload**).

During the development process for Yahoo Mail, we needed a way to make sure webpack would continuously run in the background. To make this happen, we decided to use the task runner Grunt. Not only does Grunt keep the connection to webpack alive, but it also gives us the ability to pass different parameters to the webpack config file based on the given environment. Some examples of these parameters are source map options, enabling HMR, and uglification.

Before deployment to production, we wanted to optimize the javascript bundles for size to make the Yahoo Mail experience faster. webpack provides good default support for this with the UglifyJS plugin. Although the default options are conservative, they give us the ability to configure the options. Once we modified the options to our specifications, we saved approximately 10KB.

image

Code snippet showing the configuration options for the UglifyJS plugin


Faster development cycles for developers

While developing a new feature, engineers ideally want to see their code changes reflected on their web app instantaneously. This allows them to maintain their train of thought and eventually results in more productivity. Before we implemented webpack, it took us around 30 seconds to 1 minute for changes to reflect on our Yahoo Mail development environment. webpack helped us reduce the wait time to 5 seconds.

More reliability

Consumers love a reliable product, where all the features work seamlessly every time. Before we began using webpack, we were generating javascript bundles on demand or during run-time, which meant the product was more prone to exceptions or failures while fetching the javascript bundles. With webpack, we now generate all the bundles during build time, which means that all the bundles are available whenever consumers access Yahoo Mail. This results in significantly fewer exceptions and failures and a better experience overall.

Better Performance

We were able to attain a significant reduction of payload after adopting webpack.

  1. Reduction of about 75 KB gzipped Javascript payload
  2. 50% reduction on server-side render time
  3. 10% improvement in Yahoo Mail’s launch performance metrics, as measured by render time above the fold (e.g., Time to load contents of an email).

Below are some charts that demonstrate the payload size of Yahoo Mail before and after implementing webpack.

image

Payload before using webpack (JavaScript Size = 741.41KB)


image

Payload after switching to webpack (JavaScript size = 669.08KB)


image

Conclusion

Shifting to webpack has resulted in significant improvements. We saw a common build process go from 30 seconds to 5 seconds, large JavaScript bundle size reductions, and a halving in server-side rendering time. In addition to these benefits, our engineers have found the community support for webpack to have been impressive as well. webpack has made the development of Yahoo Mail more efficient and enhanced the product for users. We believe you can use it to achieve similar results for your web application as well.

**Optimized CSS generation with Atomizer

Before we implemented webpack into the development of Yahoo Mail, we looked into how we could decrease our CSS payload. To achieve this, we developed an in-house solution for writing modular and scoped CSS in React. Our solution is similar to the Atomizer library, and our CSS is written in JavaScript like the example below:

image

Sample snippet of CSS written with Atomizer


Every React component creates its own styles.js file with required style definitions. React-Atomic-CSS converts these files into unique class definitions. Our total CSS payload after implementing our solution equaled all the unique style definitions in our code, or only 83KB (21KB gzipped).

During our migration to webpack, we created a custom plugin and loader to parse these files and extract the unique style definitions from all of our CSS files. Since this process is tied to bundling, only CSS files that are part of the dependency chain are included in the final CSS.

NetNeutrality vs. limiting FaceTime

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/11/netneutrality-vs-limiting-facetime.html

People keep retweeting this ACLU graphic in regards to NetNeutrality. In this post, I debunk the fourth item. In previous posts [1] [2] I debunk other items.

But here’s the thing: the FCC allowed these restrictions, despite the FCC’s “Open Internet” order forbidding such things. In other words, despite the graphic’s claims it “happened without net neutrality rules”, the opposite is true, it happened with net neutrality rules.

The FCC explains why they allowed it in their own case study on the matter. The short version is this: AT&T’s network couldn’t handle the traffic, so it was appropriate to restrict it until some time in the future (the LTE rollout) until it could. The issue wasn’t that AT&T was restricting FaceTime in favor of its own video-calling service (it didn’t have one), but it was instead an issue of “bandwidth management”.
When Apple released FaceTime, they themselves restricted it’s use to WiFi, preventing its use on cell phone networks. That’s because Apple recognized mobile networks couldn’t handle it.
When Apple flipped the switch and allowed it’s use on mobile networks, because mobile networks had gotten faster, they clearly said “carrier restrictions may apply”. In other words, it said “carriers may restrict FaceTime with our blessing if they can’t handle the load”.
When Tim Wu wrote his paper defining “NetNeutrality” in 2003, he anticipated just this scenario. He wrote:

“The goal of bandwidth management is, at a general level, aligned with network neutrality.”

He doesn’t give “bandwidth management” a completely free pass. He mentions the issue frequently in his paper with a less favorable description, such as here:

Similarly, while managing bandwidth is a laudable goal, its achievement through restricting certain application types is an unfortunate solution. The result is obviously a selective disadvantage for certain application markets. The less restrictive means is, as above, the technological management of bandwidth. Application-restrictions should, at best, be a stopgap solution to the problem of competing bandwidth demands. 

And that’s what AT&T’s FaceTime limiting was: an unfortunate stopgap solution until LTE was more fully deployed, which is fully allowed under Tim Wu’s principle of NetNeutrality.

So the ACLU’s claim above is fully debunked: such things did happen even with NetNeutrality rules in place, and should happen.

Finally, and this is probably the most important part, AT&T didn’t block it in the network. Instead, they blocked the app on the phone. If you jailbroke your phone, you could use FaceTime as you wished. Thus, it’s not a “network” neutrality issue because no blocking happened in the network.

Backing Up the Modern Enterprise with Backblaze for Business

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/endpoint-backup-solutions/

Endpoint backup diagram

Organizations of all types and sizes need reliable and secure backup. Whether they have as few as 3 or as many as 300,000 computer users, an organization’s computer data is a valuable business asset that needs to be protected.

Modern organizations are changing how they work and where they work, which brings new challenges to making sure that company’s data assets are not only available, but secure. Larger organizations have IT departments that are prepared to address these needs, but often times in smaller and newer organizations the challenge falls upon office management who might not be as prepared or knowledgeable to face a work environment undergoing dramatic changes.

Whether small or large, local or world-wide, for-profit or non-profit, organizations need a backup strategy and solution that matches the new ways of working in the enterprise.

The Enterprise Has Changed, and So Has Data Use

More and more, organizations are working in the cloud. These days organizations can operate just fine without their own file servers, database servers, mail servers, or other IT infrastructure that used to be standard for all but the smallest organization.

The reality is that for most organizations, though, it’s a hybrid work environment, with a combination of cloud-based and PC and Macintosh-based applications. Legacy apps aren’t going away any time soon. They will be with us for a while, with their accompanying data scattered amongst all the desktops, laptops and other endpoints in corporate headquarters, home offices, hotel rooms, and airport waiting areas.

In addition, the modern workforce likely combines regular full-time employees, remote workers, contractors, and sometimes interns, volunteers, and other temporary workers who also use company IT assets.

The Modern Enterprise Brings New Challenges for IT

These changes in how enterprises work present a problem for anyone tasked with making sure that data — no matter who uses it or where it lives — is adequately backed-up. Cloud-based applications, when properly used and managed, can be adequately backed up, provided that users are connected to the internet and data transfers occur regularly — which is not always the case. But what about the data on the laptops, desktops, and devices used by remote employees, contractors, or just employees whose work keeps them on the road?

The organization’s backup solution must address all the needs of the modern organization or enterprise using both cloud and PC and Mac-based applications, and not be constrained by employee or computer location.

A Ten-Point Checklist for the Modern Enterprise for Backing Up

What should the modern enterprise look for when evaluating a backup solution?

1) Easy to deploy to workers’ computers

Whether installed by the computer user or an IT person locally or remotely, the backup solution must be easy to implement quickly with minimal demands on the user or administrator.

2) Fast and unobtrusive client software

Backups should happen in the background by efficient (native) PC and Macintosh software clients that don’t consume valuable processing power or take memory away from applications the user needs.

3) Easy to configure

The backup solutions must be easy to configure for both the user and the IT professional. Ease-of-use means less time to deploy, configure, and manage.

4) Defaults to backing up all valuable data

By default, the solution backs up commonly used files and folders or directories, including desktops. Some backup solutions are difficult and intimidating because they require that the user chose what needs to be backed up, often missing files and folders/directories that contain valuable data.

5) Works automatically in the background

Backups should happen automatically, no matter where the computer is located. The computer user, especially the remote or mobile one, shouldn’t be required to attach cables or drives, or remember to initiate backups. A working solution backs up automatically without requiring action by the user or IT administrator.

6) Data restores are fast and easy

Whether it’s a single file, directory, or an entire system that must be restored, a user or IT sysadmin needs to be able to restore backed up data as quickly as possible. In cases of large restores to remote locations, the ability to send a restore via physical media is a must.

7) No limitations on data

Throttling, caps, and data limits complicate backups and require guesses about how much storage space will be needed.

8) Safe & Secure

Organizations require that their data is secure during all phases of initial upload, storage, and restore.

9) Easy-to-manage

The backup solution needs to provide a clear and simple web management interface for all functions. Designing for ease-of-use leads to efficiency in management and operation.

10) Affordable and transparent pricing

Backup costs should be predictable, understandable, and without surprises.

Two Scenarios for the Modern Enterprise

Enterprises exist in many forms and types, but wanting to meet the above requirements is common across all of them. Below, we take a look at two common scenarios showing how enterprises face these challenges. Three case studies are available that provide more information about how Backblaze customers have succeeded in these environments.

Enterprise Profile 1

The needs of a smaller enterprise differ from those of larger, established organizations. This organization likely doesn’t have anyone who is devoted full-time to IT. The job of on-boarding new employees and getting them set up with a computer likely falls upon an executive assistant or office manager. This person might give new employees a checklist with the software and account information and lets users handle setting up the computer themselves.

Organizations in this profile need solutions that are easy to install and require little to no configuration. Backblaze, by default, backs up all user data, which lets the organization be secure in knowing all the data will be backed up to the cloud — including files left on the desktop. Combined with Backblaze’s unlimited data policy, organizations have a truly “set it and forget it” platform.

Customizing Groups To Meet Teams’ Needs

The Groups feature of Backblaze for Business allows an organization to decide whether an individual client’s computer will be Unmanaged (backups and restores under the control of the worker), or Managed, in which an administrator can monitor the status and frequency of backups and handle restores should they become necessary. One group for the entire organization might be adequate at this stage, but the organization has the option to add additional groups as it grows and needs more flexibility and control.

The organization, of course, has the choice of managing and monitoring users using Groups. With Backblaze’s Groups, organizations can set user-based access rules, which allows the administrator to create restores for lost files or entire computers on an employee’s behalf, to centralize billing for all client computers in the organization, and to redeploy a recovered computer or new computer with the backed up data.

Restores

In this scenario, the decision has been made to let each user manage her own backups, including restores, if necessary, of individual files or entire systems. If a restore of a file or system is needed, the restore process is easy enough for the user to handle it by herself.

Case Study 1

Read about how PagerDuty uses Backblaze for Business in a mixed enterprise of cloud and desktop/laptop applications.

PagerDuty Case Study

In a common approach, the employee can retrieve an accidentally deleted file or an earlier version of a document on her own. The Backblaze for Business interface is easy to navigate and was designed with feedback from thousands of customers over the course of a decade.

In the event of a lost, damaged, or stolen laptop,  administrators of Managed Groups can  initiate the restore, which could be in the form of a download of a restore ZIP file from the web management console, or the overnight shipment of a USB drive directly to the organization or user.

Enterprise Profile 2

This profile is for an organization with a full-time IT staff. When a new worker joins the team, the IT staff is tasked with configuring the computer and delivering it to the new employee.

Backblaze for Business Groups

Case Study 2

Global charitable organization charity: water uses Backblaze for Business to back up workers’ and volunteers’ laptops as they travel to developing countries in their efforts to provide clean and safe drinking water.

charity: water Case Study

This organization can take advantage of additional capabilities in Groups. A Managed Group makes sense in an organization with a geographically dispersed work force as it lets IT ensure that workers’ data is being regularly backed up no matter where they are. Billing can be company-wide or assigned to individual departments or geographical locations. The organization has the choice of how to divide the organization into Groups (location, function, subsidiary, etc.) and whether the Group should be Managed or Unmanaged. Using Managed Groups might be suitable for most of the organization, but there are exceptions in which sensitive data might dictate using an Unmanaged Group, such as could be the case with HR, the executive team, or finance.

Deployment

By Invitation Email, Link, or Domain

Backblaze for Business allows a number of options for deploying the client software to workers’ computers. Client installation is fast and easy on both Windows and Macintosh, so sending email invitations to users or automatically enrolling users by domain or invitation link, is a common approach.

By Remote Deployment

IT might choose to remotely and silently deploy Backblaze for Business across specific Groups or the entire organization. An administrator can silently deploy the Backblaze backup client via the command-line, or use common RMM (Remote Monitoring and Management) tools such as Jamf and Munki.

Restores

Case Study 3

Read about how Bright Bear Technology Solutions, an IT Managed Service Provider (MSP), uses the Groups feature of Backblaze for Business to manage customer backups and restores, deploy Backblaze licenses to their customers, and centralize billing for all their client-based backup services.

Bright Bear Case Study

Some organizations are better equipped to manage or assist workers when restores become necessary. Individual users will be pleased to discover they can roll-back files to an earlier version if they wish, but IT will likely manage any complete system restore that involves reconfiguring a computer after a repair or requisitioning an entirely new system when needed.

This organization might chose to retain a client’s entire computer backup for archival purposes, using Backblaze B2 as the cloud storage solution. This is another advantage of having a cloud storage provider that combines both endpoint backup and cloud object storage among its services.

The Next Step: Server Backup & Data Archiving with B2 Cloud Storage

As organizations grow, they have increased needs for cloud storage beyond Macintosh and PC data backup. Backblaze’s object cloud storage, Backblaze B2, provides low-cost storage and archiving of records, media, and server data that can grow with the organization’s size and needs.

B2 Cloud Storage is available through the same Backblaze management console as Backblaze Computer Backup. This means that Admins have one console for billing, monitoring, deployment, and role provisioning. B2 is priced at 1/4 the cost of Amazon S3, or $0.005 per month per gigabyte (which equals $5/month per terabyte).

Why Modern Enterprises Chose Backblaze

Backblaze for Business

Businesses and organizations select Backblaze for Business for backup because Backblaze is designed to meet the needs of the modern enterprise. Backblaze customers are part of a a platform that has a 10+ year track record of innovation and over 400 petabytes of customer data already under management.

Backblaze’s backup model is proven through head-to-head comparisons to back up data that other backup solutions overlook in their default configurations — including valuable files that are needed after an accidental deletion, theft, or computer failure.

Backblaze is the only enterprise-level backup company that provides TOTP (Time-based One-time Password) via both SMS and Authentication app to all accounts at no incremental charge. At just $50/year/computer, Backblaze is affordable for any size of enterprise.

Modern Enterprises can Meet The Challenge of The Changing Data Environment

With the right backup solution and strategy, the modern enterprise will be prepared to ensure that its data is protected from accident, disaster, or theft, whether its data is in one office or dispersed among many locations, and remote and mobile employees.

Backblaze for Business is an affordable solution that enables organizations to meet the evolving data demands facing the modern enterprise.

The post Backing Up the Modern Enterprise with Backblaze for Business appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Say Hello to the New Atlassian

Post Syndicated from Chris De Santis original https://www.anchor.com.au/blog/2017/09/hello-new-atlassian/

Who is Atlassian?

Atlassian is an Australian IT company that develops enterprise software, with its best-known products being its issue-tracking app, Jira, and team collaboration and wiki product, Confluence.

In December 2015, Atlassian went public and made their initial public offering (IPO) under the symbol TEAM, valuing them at $4.37 billion. In summary, they big.

What happened?

A facelift

It’s a nice sunny day in Sydney in mid-September of 2017, and Atlassian, after 15 years of consistency, has rebranded, changing their look and feel for a brighter and funner one, compared to the dreary previous look.New Atlassian Branding VideoIt’s a hell of a lot simpler and, as they show in the above video, it’s going to be used with a lot more creativity and flair in mind—it’s flexible in a sense that they can use it in a lot more ways than before, with a lot more colours than before.

Atlassian Logo ComparisonThe blues they’re using now work super-well with the logos on a white background, whereas the white logos on their new champion, brand colour blue can go both ways: some can see it as a bold, daring step which is quite attractive, while others can see it as off-putting and not very user-friendly.

New Atlassian Logo Versions

What’s it all mean?

Symbolism

In his announcement blog, Atlassian Co-Founder & Co-CEO, Mike Cannon-Brookes, mentions that the branding change reflects their newly-shifted focus on the concept of teamwork. He continues to explain that their previous logo depicted the sky-holding Greek titan Atlas and symbolised legendary service and support. But, while it has become renown, they’re shifting their focus on the concept of teamwork—why focus on something you’ve already done right, right?

Atlassian Logo EvolutionThe new logo contains more symbolism than meets the eye, as can be interpreted as:

  • Two people high-fiving
  • A mountain to scale
  • The letter “A” (seen as two pillars reinforcing each other)
Product logos

Atlassian has created and acquired many products in their adventure so far, and they all seemed to have a similar art style, but something always felt off about their consistency. Well, needless to say, this was addressed with Atlassian’s very own “identity system”, which is a pretty cool term for a consistent logo-look for 14+ products, to fit them under one brand.

New Atlassian Product LogosThe result is a set of unique marks that “still feel very related to each other”. Whereas, I also see a new set of “unknown” Pokémon.

Typeface

New Atlassian TypefaceTo add a cherry on top, Atlassian will be using their own custom-made typeface called Charlie Sans, specifically designed to balance legibility with personality–that’s probably the best way to describe it. Otherwise, I’d say, out of purely-constructive criticism, that there isn’t much difference between itself and any of the other staple fonts; i.e. Arial, Verdana, etc. Then again, I’m not a professional designer.

It doesn’t look as distinct as their previous typeface, but, to be fair, it does look very slick next to the new product logos.

Well…

What do you think about it all?

 

Image credits: Atlassian

The post Say Hello to the New Atlassian appeared first on AWS Managed Services by Anchor.

Preserving the Music of Austin City Limits

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/preserving-the-music-of-austin-city-limits/

Austin City Limits

KLRU-TV, Austin PBS created Austin City Limits 42 years ago and has produced it ever since. Austin City Limits is the longest-running music series in television history. Over the years, KLRU accumulated over 550 episodes and thousands of hours of unaired footage stored on videotape. When KLRU decided to preserve their collection they turned to Backblaze for help with uploading and storing this unparalleled musical anthology in the Backblaze B2 cloud.

Upload: Backblaze B2 Fireball

KLRU started their preservation efforts by digitizing their collection of videotapes. After some internal processing, they were ready to upload the files to Backblaze, but there was a problem – one facing many organizations with a stash of historical digital data – their network connection was “slow”. It was fine for daily work, but uploading terabytes of data was not going to work.

“We would not have been able to get this project off the ground without the B2 Fireball.” – James Cole, KLRU

Backblaze B2 Fireball to the rescue. The B2 Fireball is a secure, shippable, data ingest system capable of transporting up to 40 terabytes of data from your location to Backblaze where the data is ingested into your B2 account. Designed for those organizations that have large amounts of data locally that they want to store in the cloud, the Backblaze B2 Fireball was just what KLRU needed to get the project started.

Preserve: Live Archive with B2

The KLRU staff is working hard to digitize and restore their entire musical archive and they are committed to preserving their data by having both a local copy and a cloud copy of their files. By choosing Backblaze B2 Cloud Storage versus a near-line or off-line storage solution KLRU now has an affordable live archive of their data they can access without delay anytime they need.

You can download and read the entire Austin City Limits case study for more details on how KLRU used B2 as part of their strategy to preserve their entire catalog of Austin City Limits content for future generations.

Dave Grohl Austin City Limits performance

The post Preserving the Music of Austin City Limits appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Amazon EC2 Container Service – Launch Recap, Customer Stories, and Code

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-container-service-launch-recap-customer-stories-and-code/

Today seems like a good time to recap some of the features that we have added to Amazon EC2 Container Service over the last year or so, and to share some customer success stories and code with you! The service makes it easy for you to run any number of Docker containers across a managed cluster of EC2 instances, with full console, API, CloudFormation, CLI, and PowerShell support. You can store your Linux and Windows Docker images in the EC2 Container Registry for easy access.

Launch Recap
Let’s start by taking a look at some of the newest ECS features and some helpful how-to blog posts that will show you how to use them:

Application Load Balancing – We added support for the application load balancer last year. This high-performance load balancing option runs at the application level and allows you to define content-based routing rules. It provides support for dynamic ports and can be shared across multiple services, making it easier for you to run microservices in containers. To learn more, read about Service Load Balancing.

IAM Roles for Tasks – You can secure your infrastructure by assigning IAM roles to ECS tasks. This allows you to grant permissions on a fine-grained, per-task basis, customizing the permissions to the needs of each task. Read IAM Roles for Tasks to learn more.

Service Auto Scaling – You can define scaling policies that scale your services (tasks) up and down in response to changes in demand. You set the desired minimum and maximum number of tasks, create one or more scaling policies, and Service Auto Scaling will take care of the rest. The documentation for Service Auto Scaling will help you to make use of this feature.

Blox – Scheduling, in a container-based environment, is the process of assigning tasks to instances. ECS gives you three options: automated (via the built-in Service Scheduler), manual (via the RunTask function), and custom (via a scheduler that you provide). Blox is an open source scheduler that supports a one-task-per-host model, with room to accommodate other models in the future. It monitors the state of the cluster and is well-suited to running monitoring agents, log collectors, and other daemon-style tasks.

Windows – We launched ECS with support for Linux containers and followed up with support for running Windows Server 2016 Base with Containers.

Container Instance Draining – From time to time you may need to remove an instance from a running cluster in order to scale the cluster down or to perform a system update. Earlier this year we added a set of lifecycle hooks that allow you to better manage the state of the instances. Read the blog post How to Automate Container Instance Draining in Amazon ECS to see how to use the lifecycle hooks and a Lambda function to automate the process of draining existing work from an instance while preventing new work from being scheduled for it.

CI/CD Pipeline with Code* – Containers simplify software deployment and are an ideal target for a CI/CD (Continuous Integration / Continuous Deployment) pipeline. The post Continuous Deployment to Amazon ECS using AWS CodePipeline, AWS CodeBuild, Amazon ECR, and AWS CloudFormation shows you how to build and operate a CI/CD pipeline using multiple AWS services.

CloudWatch Logs Integration – This launch gave you the ability to configure the containers that run your tasks to send log information to CloudWatch Logs for centralized storage and analysis. You simply install the Amazon ECS Container Agent and enable the awslogs log driver.

CloudWatch Events – ECS generates CloudWatch Events when the state of a task or a container instance changes. These events allow you to monitor the state of the cluster using a Lambda function. To learn how to capture the events and store them in an Elasticsearch cluster, read Monitor Cluster State with Amazon ECS Event Stream.

Task Placement Policies – This launch provided you with fine-grained control over the placement of tasks on container instances within clusters. It allows you to construct policies that include cluster constraints, custom constraints (location, instance type, AMI, and attribute), placement strategies (spread or bin pack) and to use them without writing any code. Read Introducing Amazon ECS Task Placement Policies to see how to do this!

EC2 Container Service in Action
Many of our customers from large enterprises to hot startups and across all industries, such as financial services, hospitality, and consumer electronics, are using Amazon ECS to run their microservices applications in production. Companies such as Capital One, Expedia, Okta, Riot Games, and Viacom rely on Amazon ECS.

Mapbox is a platform for designing and publishing custom maps. The company uses ECS to power their entire batch processing architecture to collect and process over 100 million miles of sensor data per day that they use for powering their maps. They also optimize their batch processing architecture on ECS using Spot Instances. The Mapbox platform powers over 5,000 apps and reaches more than 200 million users each month. Its backend runs on ECS allowing it to serve more than 1.3 billion requests per day. To learn more about their recent migration to ECS, read their recent blog post, We Switched to Amazon ECS, and You Won’t Believe What Happened Next.

Travel company Expedia designed their backends with a microservices architecture. With the popularization of Docker, they decided they would like to adopt Docker for its faster deployments and environment portability. They chose to use ECS to orchestrate all their containers because it had great integration with the AWS platform, everything from ALB to IAM roles to VPC integration. This made ECS very easy to use with their existing AWS infrastructure. ECS really reduced the heavy lifting of deploying and running containerized applications. Expedia runs 75% of all apps on AWS in ECS allowing it to process 4 billion requests per hour. Read Kuldeep Chowhan‘s blog post, How Expedia Runs Hundreds of Applications in Production Using Amazon ECS to learn more.

Realtor.com provides home buyers and sellers with a comprehensive database of properties that are currently for sale. Their move to AWS and ECS has helped them to support business growth that now numbers 50 million unique monthly users who drive up to 250,000 requests per second at peak times. ECS has helped them to deploy their code more quickly while increasing utilization of their cloud infrastructure. Read the Realtor.com Case Study to learn more about how they use ECS, Kinesis, and other AWS services.

Instacart talks about how they use ECS to power their same-day grocery delivery service:

Capital One talks about how they use ECS to automate their operations and their infrastructure management:

Code
Clever developers are using ECS as a base for their own work. For example:

Rack is an open source PaaS (Platform as a Service). It focuses on infrastructure automation, runs in an isolated VPC, and uses a single-tenant build service for security.

Empire is also an open source PaaS. It provides a Heroku-like workflow and is targeted at small and medium sized startups, with an emphasis on microservices.

Cloud Container Cluster Visualizer (c3vis) helps to visualize resource utilization within ECS clusters:

Stay Tuned
We have plenty of new features in the works for ECS, so stay tuned!

Jeff;

 

Querying OpenStreetMap with Amazon Athena

Post Syndicated from Seth Fitzsimmons original https://aws.amazon.com/blogs/big-data/querying-openstreetmap-with-amazon-athena/

This is a guest post by Seth Fitzsimmons, member of the 2017 OpenStreetMap US board of directors. Seth works with clients including the Humanitarian OpenStreetMap Team, Mapzen, the American Red Cross, and World Bank to craft innovative geospatial solutions.

OpenStreetMap (OSM) is a free, editable map of the world, created and maintained by volunteers and available for use under an open license. Companies and non-profits like Mapbox, Foursquare, Mapzen, the World Bank, the American Red Cross and others use OSM to provide maps, directions, and geographic context to users around the world.

In the 12 years of OSM’s existence, editors have created and modified several billion features (physical things on the ground like roads or buildings). The main PostgreSQL database that powers the OSM editing interface is now over 2TB and includes historical data going back to 2007. As new users join the open mapping community, more and more valuable data is being added to OpenStreetMap, requiring increasingly powerful tools, interfaces, and approaches to explore its vastness.

This post explains how anyone can use Amazon Athena to quickly query publicly available OSM data stored in Amazon S3 (updated weekly) as an AWS Public Dataset. Imagine that you work for an NGO interested in improving knowledge of and access to health centers in Africa. You might want to know what’s already been mapped, to facilitate the production of maps of surrounding villages, and to determine where infrastructure investments are likely to be most effective.

Note: If you run all the queries in this post, you will be charged approximately $1 based on the number of bytes scanned. All queries used in this post can be found in this GitHub gist.

What is OpenStreetMap?

As an open content project, regular OSM data archives are made available to the public via planet.openstreetmap.org in a few different formats (XML, PBF). This includes both snapshots of the current state of data in OSM as well as historical archives.

Working with “the planet” (as the data archives are referred to) can be unwieldy. Because it contains data spanning the entire world, the size of a single archive is on the order of 50 GB. The format is bespoke and extremely specific to OSM. The data is incredibly rich, interesting, and useful, but the size, format, and tooling can often make it very difficult to even start the process of asking complex questions.

Heavy users of OSM data typically download the raw data and import it into their own systems, tailored for their individual use cases, such as map rendering, driving directions, or general analysis. Now that OSM data is available in the Apache ORC format on Amazon S3, it’s possible to query the data using Athena without even downloading it.

How does Athena help?

You can use Athena along with data made publicly available via OSM on AWS. You don’t have to learn how to install, configure, and populate your own server instances and go through multiple steps to download and transform the data into a queryable form. Thanks to AWS and partners, a regularly updated copy of the planet file (available within hours of OSM’s weekly publishing schedule) is hosted on S3 and made available in a format that lends itself to efficient querying using Athena.

Asking questions with Athena involves registering the OSM planet file as a table and making SQL queries. That’s it. Nothing to download, nothing to configure, nothing to ingest. Athena distributes your queries and returns answers within seconds, even while querying over 9 years and billions of OSM elements.

You’re in control. S3 provides high availability for the data and Athena charges you per TB of data scanned. Plus, we’ve gone through the trouble of keeping scanning charges as small as possible by transcoding OSM’s bespoke format as ORC. All the hard work of transforming the data into something highly queryable and making it publicly available is done; you just need to bring some questions.

Registering Tables

The OSM Public Datasets consist of three tables:

  • planet
    Contains the current versions of all elements present in OSM.
  • planet_history
    Contains a historical record of all versions of all elements (even those that have been deleted).
  • changesets
    Contains information about changesets in which elements were modified (and which have a foreign key relationship to both the planet and planet_history tables).

To register the OSM Public Datasets within your AWS account so you can query them, open the Athena console (make sure you are using the us-east-1 region) to paste and execute the following table definitions:

planet

CREATE EXTERNAL TABLE planet (
  id BIGINT,
  type STRING,
  tags MAP<STRING,STRING>,
  lat DECIMAL(9,7),
  lon DECIMAL(10,7),
  nds ARRAY<STRUCT<ref: BIGINT>>,
  members ARRAY<STRUCT<type: STRING, ref: BIGINT, role: STRING>>,
  changeset BIGINT,
  timestamp TIMESTAMP,
  uid BIGINT,
  user STRING,
  version BIGINT
)
STORED AS ORCFILE
LOCATION 's3://osm-pds/planet/';

planet_history

CREATE EXTERNAL TABLE planet_history (
    id BIGINT,
    type STRING,
    tags MAP<STRING,STRING>,
    lat DECIMAL(9,7),
    lon DECIMAL(10,7),
    nds ARRAY<STRUCT<ref: BIGINT>>,
    members ARRAY<STRUCT<type: STRING, ref: BIGINT, role: STRING>>,
    changeset BIGINT,
    timestamp TIMESTAMP,
    uid BIGINT,
    user STRING,
    version BIGINT,
    visible BOOLEAN
)
STORED AS ORCFILE
LOCATION 's3://osm-pds/planet-history/';

changesets

CREATE EXTERNAL TABLE changesets (
    id BIGINT,
    tags MAP<STRING,STRING>,
    created_at TIMESTAMP,
    open BOOLEAN,
    closed_at TIMESTAMP,
    comments_count BIGINT,
    min_lat DECIMAL(9,7),
    max_lat DECIMAL(9,7),
    min_lon DECIMAL(10,7),
    max_lon DECIMAL(10,7),
    num_changes BIGINT,
    uid BIGINT,
    user STRING
)
STORED AS ORCFILE
LOCATION 's3://osm-pds/changesets/';

 

Under the Hood: Extract, Transform, Load

So, what happens behind the scenes to make this easier for you? In a nutshell, the data is transcoded from the OSM PBF format into Apache ORC.

There’s an AWS Lambda function (running every 15 minutes, triggered by CloudWatch Events) that watches planet.openstreetmap.org for the presence of weekly updates (using rsync). If that function detects that a new version has become available, it submits a set of AWS Batch jobs to mirror, transcode, and place it as the “latest” version. Code for this is available at osm-pds-pipelines GitHub repo.

To facilitate the transcoding into a format appropriate for Athena, we have produced an open source tool, OSM2ORC. The tool also includes an Osmosis plugin that can be used with complex filtering pipelines and outputs an ORC file that can be uploaded to S3 for use with Athena, or used locally with other tools from the Hadoop ecosystem.

What types of questions can OpenStreetMap answer?

There are many uses for OpenStreetMap data; here are three major ones and how they may be addressed using Athena.

Case Study 1: Finding Local Health Centers in West Africa

When the American Red Cross mapped more than 7,000 communities in West Africa in areas affected by the Ebola epidemic as part of the Missing Maps effort, they found themselves in a position where collecting a wide variety of data was both important and incredibly beneficial for others. Accurate maps play a critical role in understanding human communities, especially for populations at risk. The lack of detailed maps for West Africa posed a problem during the 2014 Ebola crisis, so collecting and producing data around the world has the potential to improve disaster responses in the future.

As part of the data collection, volunteers collected locations and information about local health centers, something that will facilitate treatment in future crises (and, more importantly, on a day-to-day basis). Combined with information about access to markets and clean drinking water and historical experiences with natural disasters, this data was used to create a vulnerability index to select communities for detailed mapping.

For this example, you find all health centers in West Africa (many of which were mapped as part of Missing Maps efforts). This is something that healthsites.io does for the public (worldwide and editable, based on OSM data), but you’re working with the raw data.

Here’s a query to fetch information about all health centers, tagged as nodes (points), in Guinea, Sierra Leone, and Liberia:

SELECT * from planet
WHERE type = 'node'
  AND tags['amenity'] IN ('hospital', 'clinic', 'doctors')
  AND lon BETWEEN -15.0863 AND -7.3651
  AND lat BETWEEN 4.3531 AND 12.6762;

Buildings, as “ways” (polygons, in this case) assembled from constituent nodes (points), can also be tagged as medical facilities. In order to find those, you need to reassemble geometries. Here you’re taking the average of all nodes that make up a building (which will be the approximate center point, which is close enough for this purpose). Here is a query that finds both buildings and points that are tagged as medical facilities:

-- select out nodes and relevant columns
WITH nodes AS (
  SELECT
    type,
    id,
    tags,
    lat,
    lon
  FROM planet
  WHERE type = 'node'
),
-- select out ways and relevant columns
ways AS (
  SELECT
    type,
    id,
    tags,
    nds
  FROM planet
  WHERE type = 'way'
    AND tags['amenity'] IN ('hospital', 'clinic', 'doctors')
),
-- filter nodes to only contain those present within a bounding box
nodes_in_bbox AS (
  SELECT *
  FROM nodes
  WHERE lon BETWEEN -15.0863 AND -7.3651
    AND lat BETWEEN 4.3531 AND 12.6762
)
-- find ways intersecting the bounding box
SELECT
  ways.type,
  ways.id,
  ways.tags,
  AVG(nodes.lat) lat,
  AVG(nodes.lon) lon
FROM ways
CROSS JOIN UNNEST(nds) AS t (nd)
JOIN nodes_in_bbox nodes ON nodes.id = nd.ref
GROUP BY (ways.type, ways.id, ways.tags)
UNION ALL
SELECT
  type,
  id,
  tags,
  lat,
  lon
FROM nodes_in_bbox
WHERE tags['amenity'] IN ('hospital', 'clinic', 'doctors');

You could go a step further and query for additional tags included with these (for example, opening_hours) and use that as a metric for measuring “completeness” of the dataset and to focus on additional data to collect (and locations to fill out).

Case Study 2: Generating statistics about mapathons

OSM has a history of holding mapping parties. Mapping parties are events where interested people get together and wander around outside, gathering and improving information about sites (and sights) that they pass. Another form of mapping party is the mapathon, which brings together armchair and desk mappers to focus on improving data in another part of the world.

Mapathons are a popular way of enlisting volunteers for Missing Maps, a collaboration between many NGOs, education institutions, and civil society groups that aims to map the most vulnerable places in the developing world to support international and local NGOs and individuals. One common way that volunteers participate is to trace buildings and roads from aerial imagery, providing baseline data that can later be verified by Missing Maps staff and volunteers working in the areas being mapped.

(Image and data from the American Red Cross)

Data collected during these events lends itself to a couple different types of questions. People like competition, so Missing Maps has developed a set of leaderboards that allow people to see where they stand relative to other mappers and how different groups compare. To facilitate this, hashtags (such as #missingmaps) are included in OSM changeset comments. To do similar ad hoc analysis, you need to query the list of changesets, filter by the presence of certain hashtags in the comments, and group things by username.

Now, find changes made during Missing Maps mapathons at George Mason University (using the #gmu hashtag):

SELECT *
FROM changesets
WHERE regexp_like(tags['comment'], '(?i)#gmu');

This includes all tags associated with a changeset, which typically include a mapper-provided comment about the changes made (often with additional hashtags corresponding to OSM Tasking Manager projects) as well as information about the editor used, imagery referenced, etc.

If you’re interested in the number of individual users who have mapped as part of the Missing Maps project, you can write a query such as:

SELECT COUNT(DISTINCT uid)
FROM changesets
WHERE regexp_like(tags['comment'], '(?i)#missingmaps');

25,610 people (as of this writing)!

Back at GMU, you’d like to know who the most prolific mappers are:

SELECT user, count(*) AS edits
FROM changesets
WHERE regexp_like(tags['comment'], '(?i)#gmu')
GROUP BY user
ORDER BY count(*) DESC;

Nice job, BrokenString!

It’s also interesting to see what types of features were added or changed. You can do that by using a JOIN between the changesets and planet tables:

SELECT planet.*, changesets.tags
FROM planet
JOIN changesets ON planet.changeset = changesets.id
WHERE regexp_like(changesets.tags['comment'], '(?i)#gmu');


Using this as a starting point, you could break down the types of features, highlight popular parts of the world, or do something entirely different.

Case Study 3: Building Condition

With building outlines having been produced by mappers around (and across) the world, local Missing Maps volunteers (often from local Red Cross / Red Crescent societies) go around with Android phones running OpenDataKit  and OpenMapKit to verify that the buildings in question actually exist and to add additional information about them, such as the number of stories, use (residential, commercial, etc.), material, and condition.

This data can be used in many ways: it can provide local geographic context (by being included in map source data) as well as facilitate investment by development agencies such as the World Bank.

Here are a collection of buildings mapped in Dhaka, Bangladesh:

(Map and data © OpenStreetMap contributors)

For NGO staff to determine resource allocation, it can be helpful to enumerate and show buildings in varying conditions. Building conditions within an area can be a means of understanding where to focus future investments.

Querying for buildings is a bit more complicated than working with points or changesets. Of the three core OSM element types—node, way, and relation, only nodes (points) have geographic information associated with them. Ways (lines or polygons) are composed of nodes and inherit vertices from them. This means that ways must be reconstituted in order to effectively query by bounding box.

This results in a fairly complex query. You’ll notice that this is similar to the query used to find buildings tagged as medical facilities above. Here you’re counting buildings in Dhaka according to building condition:

-- select out nodes and relevant columns
WITH nodes AS (
  SELECT
    id,
    tags,
    lat,
    lon
  FROM planet
  WHERE type = 'node'
),
-- select out ways and relevant columns
ways AS (
  SELECT
    id,
    tags,
    nds
  FROM planet
  WHERE type = 'way'
),
-- filter nodes to only contain those present within a bounding box
nodes_in_bbox AS (
  SELECT *
  FROM nodes
  WHERE lon BETWEEN 90.3907 AND 90.4235
    AND lat BETWEEN 23.6948 AND 23.7248
),
-- fetch and expand referenced ways
referenced_ways AS (
  SELECT
    ways.*,
    t.*
  FROM ways
  CROSS JOIN UNNEST(nds) WITH ORDINALITY AS t (nd, idx)
  JOIN nodes_in_bbox nodes ON nodes.id = nd.ref
),
-- fetch *all* referenced nodes (even those outside the queried bounding box)
exploded_ways AS (
  SELECT
    ways.id,
    ways.tags,
    idx,
    nd.ref,
    nodes.id node_id,
    ARRAY[nodes.lat, nodes.lon] coordinates
  FROM referenced_ways ways
  JOIN nodes ON nodes.id = nd.ref
  ORDER BY ways.id, idx
)
-- query ways matching the bounding box
SELECT
  count(*),
  tags['building:condition']
FROM exploded_ways
GROUP BY tags['building:condition']
ORDER BY count(*) DESC;


Most buildings are unsurveyed (125,000 is a lot!), but of those that have been, most are average (as you’d expect). If you were to further group these buildings geographically, you’d have a starting point to determine which areas of Dhaka might benefit the most.

Conclusion

OSM data, while incredibly rich and valuable, can be difficult to work with, due to both its size and its data model. In addition to the time spent downloading large files to work with locally, time is spent installing and configuring tools and converting the data into more queryable formats. We think Amazon Athena combined with the ORC version of the planet file, updated on a weekly basis, is an extremely powerful and cost-effective combination. This allows anyone to start querying billions of records with simple SQL in no time, giving you the chance to focus on the analysis, not the infrastructure.

To download the data and experiment with it using other tools, the latest OSM ORC-formatted file is available via OSM on AWS at s3://osm-pds/planet/planet-latest.orcs3://osm-pds/planet-history/history-latest.orc, and s3://osm-pds/changesets/changesets-latest.orc.

We look forward to hearing what you find out!

Utopia

Post Syndicated from Eevee original https://eev.ee/blog/2017/03/08/utopia/

It’s been a while, but someone’s back on the Patreon blog topic tier! IndustrialRobot asks:

What does your personal utopia look like? Do you think we (as mankind) can achieve it? Why/why not?

Hm.

I spent the month up to my eyeballs in a jam game, but this question was in the back of my mind a lot. I could use it as a springboard to opine about anything, especially in the current climate: politics, religion, nationalism, war, economics, etc., etc. But all of that has been done to death by people who actually know what they’re talking about.

The question does say “personal”. So in a less abstract sense… what do I want the world to look like?

Mostly, I want everyone to have the freedom to make things.

I’ve been having a surprisingly hard time writing the rest of this without veering directly into the ravines of “basic income is good” and “maybe capitalism is suboptimal”. Those are true, but not really the tone I want here, and anyway they’ve been done to death by better writers than I. I’ve talked this out with Mel a few times, and it sounds much better aloud, so I’m going to try to drop my Blog Voice and just… talk.

*ahem*

Art versus business

So, art. Art is good.

I’m construing “art” very broadly here. More broadly than “media”, too. I’m including shitty robots, weird Twitter almost-bots, weird Twitter non-bots, even a great deal of open source software. Anything that even remotely resembles creative work — driven perhaps by curiosity, perhaps by practicality, but always by a soul bursting with ideas and a palpable need to get them out.

Western culture thrives on art. Most culture thrives on art. I’m not remotely qualified to defend this, but I suspect you could define culture in terms of art. It’s pretty important.

You’d think this would be reflected in how we discuss art, but often… it’s not. Tell me how often you’ve heard some of these gems.

  • I could do that.”
  • My eight-year-old kid could do that.”
  • Jokes about the worthlessness of liberal arts degrees.
  • Jokes about people trying to write novels in their spare time, the subtext being that only dreamy losers try to write novels, or something.
  • The caricature of a hippie working on a screenplay at Starbucks.

Oh, and then there was the guy who made a bot to scrape tons of art from artists who were using Patreon as a paywall — and a primary source of income. The justification was that artists shouldn’t expect to make a living off of, er, doing art, and should instead get “real jobs”.

I do wonder. How many of the people repeating these sentiments listen to music, or go to movies, or bought an iPhone because it’s prettier? Are those things not art that took real work to create? Is creating those things not a “real job”?

Perhaps a “real job” has to be one that’s not enjoyable, not a passion? And yet I can’t recall ever hearing anyone say that Taylor Swift should get a “real job”. Or that, say, pro football players should get “real jobs”. What do pro football players even do? They play a game a few times a year, and somehow this drives the flow of unimaginable amounts of money. We dress it up in the more serious-sounding “sport”, but it’s a game in the same general genre as hopscotch. There’s nothing wrong with that, but somehow it gets virtually none of the scorn that art does.

Another possible explanation is America’s partly-Christian, partly-capitalist attitude that you deserve exactly whatever you happen to have at the moment. (Whereas I deserve much more and will be getting it any day now.) Rich people are rich because they earned it, and we don’t question that further. Poor people are poor because they failed to earn it, and we don’t question that further, either. To do so would suggest that the system is somehow unfair, and hard work does not perfectly correlate with any particular measure of success.

I’m sure that factors in, but it’s not quite satisfying: I’ve also seen a good deal of spite aimed at people who are making a fairly decent chunk through Patreon or similar. Something is missing.

I thought, at first, that the key might be the American worship of work. Work is an inherent virtue. Politicians run entire campaigns based on how many jobs they’re going to create. Notably, no one seems too bothered about whether the work is useful, as long as someone decided to pay you for it.

Finally I stumbled upon the key. America doesn’t actually worship work. America worships business. Business means a company is deciding to pay you. Business means legitimacy. Business is what separates a hobby from a career.

And this presents a problem for art.

If you want to provide a service or sell a product, that’ll be hard, but America will at least try to look like it supports you. People are impressed that you’re an entrepreneur, a small business owner. Politicians will brag about policies made in your favor, whether or not they’re stabbing you in the back.

Small businesses have a particular structure they can develop into. You can divide work up. You can have someone in sales, someone in accounting. You can provide specifications and pay a factory to make your product. You can defer all of the non-creative work to someone else, whether that means experts in a particular field or unskilled labor.

But if your work is inherently creative, you can’t do that. The very thing you’re making is your idea in your style, driven by your experience. This is not work that’s readily parallelizable. Even if you sell physical merchandise and register as an LLC and have a dedicated workspace and do various other formal business-y things, the basic structure will still look the same: a single person doing the thing they enjoy. A hobbyist.

Consider the bulleted list from above. Those are all individual painters or artists or authors or screenwriters. The kinds of artists who earn respect without question are generally those managed by a business, those with branding: musical artists signed to labels, actors working for a studio. Even football players are part of a tangle of business.

(This doesn’t mean that business automatically confers respect, of course; tech in particular is full of anecdotes about nerds’ disdain for people whose jobs are design or UI or documentation or whathaveyou. But a businessy look seems to be a significant advantage.)

It seems that although art is a large part of what informs culture, we have a culture that defines “serious” endeavors in such a way that independent art cannot possibly be “serious”.

Art versus money

Which wouldn’t really matter at all, except that we also have a culture that expects you to pay for food and whatnot.

The reasoning isn’t too outlandish. Food is produced from a combination of work and resources. In exchange for getting the food, you should give back some of your own work and resources.

Obviously this is riddled with subtle flaws, but let’s roll with it for now and look at a case study. Like, uh, me!

Mel and I built and released two games together in the six weeks between mid-January and the end of February. Together, those games have made $1,000 in sales. The sales trail off fairly quickly within a few days of release, so we’ll call that the total gross for our effort.

I, dumb, having never actually sold anything before, thought this was phenomenal. Then I had the misfortune of doing some math.

Itch takes at least 10%, so we’re down to $900 net. Divided over six weeks, that’s $150 per week, before taxes — or $3.75 per hour if we’d been working full time.

Ah, but wait! There are two of us. And we hadn’t been working full time — we’d been working nearly every waking hour, which is at least twice “full time” hours. So we really made less than a dollar an hour. Even less than that, if you assume overtime pay.

From the perspective of capitalism, what is our incentive to do this? Between us, we easily have over thirty years of experience doing the things we do, and we spent weeks in crunch mode working on something, all to earn a small fraction of minimum wage. Did we not contribute back our own work and resources? Was our work worth so much less than waiting tables?

Waiting tables is a perfectly respectable way to earn a living, mind you. Ah, but wait! I’ve accidentally done something clever here. It is generally expected that you tip your waiter, because waiters are underpaid by the business, because the business assumes they’ll be tipped. Not tipping is actually, almost impressively, one of the rudest things you can do. And yet it’s not expected that you tip an artist whose work you enjoy, even though many such artists aren’t being paid at all.

Now, to be perfectly fair, both games were released for free. Even a dollar an hour is infinitely more than the zero dollars I was expecting — and I’m amazed and thankful we got as much as we did! Thank you so much. I bring it up not as a complaint, but as an armchair analysis of our systems of incentives.

People can take art for granted and whatever, yes, but there are several other factors at play here that hamper the ability for art to make money.

For one, I don’t want to sell my work. I suspect a great deal of independent artists and writers and open source developers (!) feel the same way. I create things because I want to, because I have to, because I feel so compelled to create that having a non-creative full-time job was making me miserable. I create things for the sake of expressing an idea. Attaching a price tag to something reduces the number of people who’ll experience it. In other words, selling my work would make it less valuable in my eyes, in much the same way that adding banner ads to my writing would make it less valuable.

And yet, I’m forced to sell something in some way, or else I’ll have to find someone who wants me to do bland mechanical work on their ideas in exchange for money… at the cost of producing sharply less work of my own. Thank goodness for Patreon, at least.

There’s also the reverse problem, in that people often don’t want to buy creative work. Everyone does sometimes, but only sometimes. It’s kind of a weird situation, and the internet has exacerbated it considerably.

Consider that if I write a book and print it on paper, that costs something. I have to pay for the paper and the ink and the use of someone else’s printer. If I want one more book, I have to pay a little more. I can cut those costs pretty considerable by printing a lot of books at once, but each copy still has a price, a marginal cost. If I then gave those books away, I would be actively losing money. So I can pretty well justify charging for a book.

Along comes the internet. Suddenly, copying costs nothing. Not only does it cost nothing, but it’s the fundamental operation. When you download a file or receive an email or visit a web site, you’re really getting a copy! Even the process which ultimately shows it on your screen involves a number of copies. This is so natural that we don’t even call it copying, don’t even think of it as copying.

True, bandwidth does cost something, but the rate is virtually nothing until you start looking at very big numbers indeed. I pay $60/mo for hosting this blog and a half dozen other sites — even that’s way more than I need, honestly, but downgrading would be a hassle — and I get 6TB of bandwidth. Even the longest of my posts haven’t exceeded 100KB. A post could be read by 64 million people before I’d start having a problem. If that were the population of a country, it’d be the 23rd largest in the world, between Italy and the UK.

How, then, do I justify charging for my writing? (Yes, I realize the irony in using my blog as an example in a post I’m being paid $88 to write.)

Well, I do pour effort and expertise and a fraction of my finite lifetime into it. But it doesn’t cost me anything tangible — I already had this hosting for something else! — and it’s easier all around to just put it online.

The same idea applies to a vast bulk of what’s online, and now suddenly we have a bit of a problem. Not only are we used to getting everything for free online, but we never bothered to build any sensible payment infrastructure. You still have to pay for everything by typing in a cryptic sequence of numbers from a little physical plastic card, which will then give you a small loan and charge the seller 30¢ plus 2.9% for the “convenience”.

If a website could say “pay 5¢ to read this” and you clicked a button in your browser and that was that, we might be onto something. But with our current setup, it costs far more than 5¢ to transfer 5¢, even though it’s just a number in a computer somewhere. The only people with the power and resources to fix this don’t want to fix it — they’d rather be the ones charging you the 30¢ plus 2.9%.

That leads to another factor of platforms and publishers, which are more than happy to eat a chunk of your earnings even when you do sell stuff. Google Play, the App Store, Steam, and anecdotally many other big-name comparative platforms all take 30% of your sales. A third! And that’s good! It seems common among book publishers to take 85% to 90%. For ebook sales — i.e., ones that don’t actually cost anything — they may generously lower that to a mere 75% to 85%.

Bless Patreon for only taking 5%. Itch.io is even better: it defaults to 10%, but gives you a slider, which you can set to anything from 0% to 100%.

I’ve mentioned all this before, so here’s a more novel thought: finite disposable income. Your audience only has so much money to spend on media right now. You can try to be more compelling to encourage them to spend more of it, rather than saving it, but ultimately everyone has a limit before they just plain run out of money.

Now, popularity is heavily influenced by social and network effects, so it tends to create a power law distribution: a few things are ridiculously hyperpopular, and then there’s a steep drop to a long tail of more modestly popular things.

If a new hyperpopular thing comes out, everyone is likely to want to buy it… but then that eats away a significant chunk of that finite pool of money that could’ve gone to less popular things.

This isn’t bad, and buying a popular thing doesn’t make you a bad person; it’s just what happens. I don’t think there’s any satisfying alternative that doesn’t involve radically changing the way we think about our economy.

Taylor Swift, who I’m only picking on because her infosec account follows me on Twitter, has sold tens of millions of albums and is worth something like a quarter of a billion dollars. Does she need more? If not, should she make all her albums free from now on?

Maybe she does, and maybe she shouldn’t. The alternative is for someone to somehow prevent her from making more money, which doesn’t sit well. Yet it feels almost heretical to even ask if someone “needs” more money, because we take for granted that she’s earned it — in part by being invested in by a record label and heavily advertised. The virtue is work, right? Don’t a lot of people work just as hard? (“But you have to be talented too!” Then please explain how wildly incompetent CEOs still make millions, and leave burning businesses only to be immediately hired by new ones? Anyway, are we really willing to bet there is no one equally talented but not as popular by sheer happenstance?)

It’s kind of a moot question anyway, since she’s probably under contract with billionaires and it’s not up to her.

Where the hell was I going with this.


Right, so. Money. Everyone needs some. But making it off art can be tricky, unless you’re one of the lucky handful who strike gold.

And I’m still pretty goddamn lucky to be able to even try this! I doubt I would’ve even gotten into game development by now if I were still working for an SF tech company — it just drained so much of my creative energy, and it’s enough of an uphill battle for me to get stuff done in the first place.

How many people do I know who are bursting with ideas, but have to work a tedious job to keep the lights on, and are too tired at the end of the day to get those ideas out? Make no mistake, making stuff takes work — a lot of it. And that’s if you’re already pretty good at the artform. If you want to learn to draw or paint or write or code, you have to do just as much work first, with much more frustration, and not as much to show for it.

Utopia

So there’s my utopia. I want to see a world where people have the breathing room to create the things they dream about and share them with the rest of us.

Can it happen? Maybe. I think the cultural issues are a fairly big blocker; we’d be much better off if we treated independent art with the same reverence as, say, people who play with a ball for twelve hours a year. Or if we treated liberal arts degrees as just as good as computer science degrees. (“But STEM can change the world!” Okay. How many people with computer science degrees would you estimate are changing the world, and how many are making a website 1% faster or keeping a lumbering COBOL beast running or trying to trick 1% more people into clicking on ads?)

I don’t really mean stuff like piracy, either. Piracy is a thing, but it’s… complicated. In my experience it’s not even artists who care the most about piracy; it’s massive publishers, the sort who see artists as a sponge to squeeze money out of. You know, the same people who make everything difficult to actually buy, infest it with DRM so it doesn’t work on half the stuff you own, and don’t even sell it in half the world.

I mean treating art as a free-floating commodity, detached from anyone who created it. I mean neo-Nazis adopting a comic book character as their mascot, against the creator’s wishes. I mean politicians and even media conglomerates using someone else’s music in well-funded videos and ads without even asking. I mean assuming Google Image Search, wonder that it is, is some kind of magical free art machine. I mean the snotty Reddit post I found while looking up Patreon’s fee structure, where some doofus was insisting that Patreon couldn’t possibly pay for a full-time YouTuber’s time, because not having a job meant they had lots of time to spare.

Maybe I should go one step further: everyone should create at least once or twice. Everyone should know what it’s like to have crafted something out of nothing, to be a fucking god within the microcosm of a computer screen or a sewing machine or a pottery table. Everyone should know that spark of inspiration that we don’t seem to know how to teach in math or science classes, even though it’s the entire basis of those as well. Everyone should know that there’s a good goddamn reason I listed open source software as a kind of art at the beginning of this post.

Basic income and more arts funding for public schools. If Uber can get billions of dollars for putting little car icons on top of Google Maps and not actually doing any of their own goddamn service themselves, I think we can afford to pump more cash into webcomics and indie games and, yes, even underwater basket weaving.

Classifying Elections as "Critical Infrastructure"

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/01/should_election.html

I am co-author on a paper discussing whether elections be classified as “critical infrastructure” in the US, based on experiences in other countries:

Abstract: With the Russian government hack of the Democratic National Convention email servers, and further leaks expected over the coming months that could influence an election, the drama of the 2016 U.S. presidential race highlights an important point: Nefarious hackers do not just pose a risk to vulnerable companies, cyber attacks can potentially impact the trajectory of democracies. Yet, to date, a consensus has not been reached as to the desirability and feasibility of reclassifying elections, in particular voting machines, as critical infrastructure due in part to the long history of local and state control of voting procedures. This Article takes on the debate in the U.S. using the 2016 elections as a case study but puts the issue in a global context with in-depth case studies from South Africa, Estonia, Brazil, Germany, and India. Governance best practices are analyzed by reviewing these differing approaches to securing elections, including the extent to which trend lines are converging or diverging. This investigation will, in turn, help inform ongoing minilateral efforts at cybersecurity norm building in the critical infrastructure context, which are considered here for the first time in the literature through the lens of polycentric governance.

The paper was speculative, but now it’s official. The U.S. election has been classified as critical infrastructure. I am tentatively in favor of this, but what really matter is what happens now. What does this mean? What sorts of increased security will election systems get? Will we finally get rid of computerized touch-screen voting?

EDITED TO ADD (1/16): This is a good article.

Using Wi-Fi to Detect Hand Motions and Steal Passwords

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2016/11/using_wi-fi_to_.html

This is impressive research: “When CSI Meets Public WiFi: Inferring Your Mobile Phone Password via WiFi Signals“:

Abstract: In this study, we present WindTalker, a novel and practical keystroke inference framework that allows an attacker to infer the sensitive keystrokes on a mobile device through WiFi-based side-channel information. WindTalker is motivated from the observation that keystrokes on mobile devices will lead to different hand coverage and the finger motions, which will introduce a unique interference to the multi-path signals and can be reflected by the channel state information (CSI). The adversary can exploit the strong correlation between the CSI fluctuation and the keystrokes to infer the user’s number input. WindTalker presents a novel approach to collect the target’s CSI data by deploying a public WiFi hotspot. Compared with the previous keystroke inference approach, WindTalker neither deploys external devices close to the target device nor compromises the target device. Instead, it utilizes the public WiFi to collect user’s CSI data, which is easy-to-deploy and difficult-to-detect. In addition, it jointly analyzes the traffic and the CSI to launch the keystroke inference only for the sensitive period where password entering occurs. WindTalker can be launched without the requirement of visually seeing the smart phone user’s input process, backside motion, or installing any malware on the tablet. We implemented Windtalker on several mobile phones and performed a detailed case study to evaluate the practicality of the password inference towards Alipay, the largest mobile payment platform in the world. The evaluation results show that the attacker can recover the key with a high successful rate.

That “high successful rate” is 81.7%.

News article.

AWS Week in Review – November 7, 2016

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-week-in-review-november-7-2016/

Let’s take a quick look at what happened in AWS-land last week. Thanks are due to the 16 internal and external contributors who submitted pull requests!

Monday

November 7

Tuesday

November 8

Wednesday

November 9

Thursday

November 10

Friday

November 11

Saturday

November 12

Sunday

November 13

New & Notable Open Source

  • Sippy Cup is a Python nanoframework for AWS Lambda and API Gateway.
  • Yesterdaytabase is a Python tool for constantly refreshing data in your staging and test environments with Lambda and CloudFormation.
  • ebs-snapshot-lambda is a Lambda function to snapshot EBS volumes and purge old snapshots.
  • examples is a collection of boilerplates and examples of serverless architectures built with the Serverless Framework and Lambda.
  • ecs-deploy-cli is a simple and easy way to deploy tasks and update services in AWS ECS.
  • Comments-Showcase is a serverless comment webapp that uses API Gateway, Lambda, DynamoDB, and IoT.
  • serverless-offline emulates Lambda and API Gateway locally for development of Serverless projects.
  • aws-sign-web is a JavaScript implementation of AWS Signature v4 for use within web browsers.
  • Zappa implements serverless Django on Lambda and API Gateway.
  • awsping is a console tool to check latency to AWS regions.

New SlideShare Presentations

Upcoming Events

Help Wanted

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.