Tag Archives: flickr

Yahoo! Fined 35 Million USD For Late Disclosure Of Hack

Post Syndicated from Darknet original https://www.darknet.org.uk/2018/05/yahoo-fined-35-million-usd-for-late-disclosure-of-hack/?utm_source=rss&utm_medium=social&utm_campaign=darknetfeed

Yahoo! Fined 35 Million USD For Late Disclosure Of Hack

Ah, Yahoo! is in trouble again. This time the news is that Yahoo! has been fined 35 million USD by the SEC for the two-year delay in disclosing its massive hack; we actually reported on the incident in 2016 when it became public – Massive Yahoo Hack – 500 Million Accounts Compromised.

Yahoo! has been having a rocky time for quite a few years now and just recently sold Flickr to SmugMug for an undisclosed amount; I hope that at least helps pay off some of the fine.

Read the rest of Yahoo! Fined 35 Million USD For Late Disclosure Of Hack now! Only available at Darknet.

#CensorshipMachine

Post Syndicated from nellyo original https://nellyo.wordpress.com/2018/03/03/illegal_/

 

On 1 March 2018 the European Commission published a Recommendation on measures to effectively tackle illegal content online, which presents the Commission's ideas on how to speed up the removal of illegal content. Separately from this, similar ideas are being developed in the proposals for the revision of media and copyright law, as well as in the discussions on combating disinformation and fake news.

I also spoke about this at the conference on fake news organised by the AEJ (Association of European Journalists) in November 2017: the EC recommends that private commercial companies be given the power to take down content uploaded by citizens. The European Commission is now taking these ideas further in the same direction.

In a legal system based on the rule of law, it is the court that should rule on interference with freedom of expression; at least until now, this was an uncontested position. The EC is encouraging a trend of empowering providers to make that judgement – just like the "three strikes" idea of some time ago.

The reaction of European Digital Rights (EDRi):

European politicians are working on the biggest internet filter we have ever seen. This may sound dramatic, but it really is not an exaggeration. If the proposal is adopted, websites such as Soundcloud, eBay, Facebook and Flickr will be forced to filter everything you want to upload. An algorithm will decide which of the content you upload is seen by the rest of the world and which is not.

This internet filter is envisaged in the proposals for new European legislation. Internet filters cannot and must not be used to regulate copyright. They do not work. But there is a much bigger problem: once installed, an internet filter can and will be used for countless other purposes. We would bet that politicians are gleefully looking forward to the internet filter so they can use it in their battles with fake news, terrorism or unwanted political opinions.

EDRi stresses that there are many reasons to oppose these proposals – here are three:

  • It is an attack on your freedom of expression.
  • Filters like these make a lot of mistakes.
  • Platforms will be encouraged to avoid risk – at the expense of your freedom.

What’s the Best Solution for Managing Digital Photos and Videos?

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/discovering-best-solution-for-photo-video-backup/

Digital Asset Management (DAM)

If you have spent any time, as we have, talking to photographers and videographers about how they back up and archive their digital photos and videos, then you know that there’s no one answer or solution that users have discovered to meet their needs.

Based on what we’ve heard, visual media artists are still searching for the best combination of software, hardware, and cloud storage to preserve their media, and to be able to search, retrieve, and reuse that media as easily as possible.

Yes, there are a number of solutions out there, and some users have created combinations of hardware, software, and services to meet their needs, but we have met few who claim to be satisfied with their solution for digital asset management (DAM), or expect that they will be using the same solution in just a year or two.

We’d like to open a dialog with professionals and serious amateurs to learn more about what you’re doing, what you’d like to do, and how Backblaze might fit into that solution.

We have a bit of cred in this field, as we currently have hundreds of petabytes of digital media files in our data centers from users of Backblaze Backup and Backblaze B2 Cloud Storage. We want to make our cloud services as useful as possible for photographers and videographers.

Tell Us Both Your Current Solution and Your Dream Solution

To get started, we’d love to hear from you about how you’re managing your photos and videos. Whether you’re an amateur or a professional, your experiences are valuable and will help us understand how to provide the best cloud component of a digital asset management solution.

Here are some questions to consider:

  • Are you using direct-attached drives, NAS (Network-Attached Storage), or offline storage for your media?
  • Do you use the cloud for media you’re actively working on?
  • Do you back up or archive to the cloud?
  • Do you have a catalog or record of the media you’ve archived that you use to search and retrieve it?
  • What’s different about how you work in the field (or traveling) versus how you work in a studio (or at home)?
  • What software and/or hardware currently works for you?
  • What’s the biggest impediment to working in the way you’d really like to?
  • How could the cloud work better for you?

Please Contribute Your Ideas

To contribute, please answer the following two questions in the comments below or send an email to [email protected]. Please comment or email your response by December 22, 2017.

  1. How are you currently backing up your digital photos, video files, and/or file libraries/catalogs? Do you have a backup system that uses attached drives, a local network, the cloud, or offline storage media? Does it work well for you?
  2. Imagine your ideal digital asset backup setup. What would it look like? Don’t be constrained by current products, technologies, brands, or solutions. Invent a technology or product if you wish. Describe an ideal system that would work the way you want it to.

We know you have opinions about managing photos and videos. Bring them on!

We’re soliciting answers far and wide from amateurs and experts, weekend video makers and well-known professional photographers. We have a few amateur and professional photographers and videographers here at Backblaze, and they are contributing their comments, as well.

Once we have gathered all the responses, we’ll write a post on what we learned about how people are currently working and what they would do if anything were possible. Look for that post after the beginning of the year.

Don’t Miss Future Posts on Media Management

We don’t want you to miss our future posts on photography, videography, and digital asset management. To receive email notices of blog updates (and no spam, we promise), enter your email address above using the Join button at the top of the page.

Come Back on Thursday for our Photography Post (and a Special Giveaway, too)

This coming Thursday we’ll have a blog post about the different ways that photographers and videographers are currently managing their digital media assets.

Plus, you’ll have the chance to win a valuable hardware/software combination for digital media management that I am sure you will appreciate. (You’ll have to wait until Thursday to find out what the prize is, but it has a total value of over $700.)

Past Posts on Photography, Videography, and Digital Asset Management

We’ve written a number of blog posts about photos, videos, and managing digital assets. We’ve posted links to some of them below.

Four Tips To Help Photographers and Videographers Get The Most From B2

How to Back Up Your Mac’s Photos Library

How To Back Up Your Flickr Library

Getting Video Archives Out of Your Closet

B2 Cloud Storage Roundup

Backing Up Photos While Traveling

Backing up photos while traveling – feedback

Should I Use an External Drive for Backup?

How to Connect your Synology NAS to B2

The post What’s the Best Solution for Managing Digital Photos and Videos? appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

CoderDojo Coolest Projects 2017

Post Syndicated from Ben Nuttall original https://www.raspberrypi.org/blog/coderdojo-coolest-projects-2017/

When I heard we were merging with CoderDojo, I was delighted. CoderDojo is a wonderful organisation with a spectacular community, and it’s going to be great to join forces with the team and work towards our common goal: making a difference to the lives of young people by making technology accessible to them.

You may remember that last year Philip and I went along to Coolest Projects, CoderDojo’s annual event at which their global community showcase their best makes. It was awesome! This year a whole bunch of us from the Raspberry Pi Foundation attended Coolest Projects with our new Irish colleagues, and as expected, the projects on show were as cool as can be.

Coolest Projects 2017 attendee

Crowd at Coolest Projects 2017

This year’s coolest projects!

Young maker Benjamin demoed his brilliant RGB LED table tennis ball display for us, and showed off his project tutorial website codemakerbuddy.com, which he built with Python and Flask.

Coolest Projects 2017 LED ping-pong ball display
Coolest Projects 2017 Benjamin and Oly

Next up, Aimee showed us a recipes app she’d made with the MIT App Inventor. It was a really impressive and well thought-out project.

Coolest Projects 2017 Aimee's cook book
Coolest Projects 2017 Aimee's setup

This very successful OpenCV face detection program with hardware installed in a teddy bear was great as well:

Coolest Projects 2017 face detection bear
Coolest Projects 2017 face detection interface
Coolest Projects 2017 face detection database

Helen’s and Oly’s favourite project involved…live bees!

Coolest Projects 2017 live bees

BEEEEEEEEEEES!

Its creator, 12-year-old Amy, said she wanted to do something to help the Earth. Her project uses various sensors to record data on the bee population in the hive. An adjacent monitor displays the data in a web interface:

Coolest Projects 2017 Aimee's bees

Coolest robots

I enjoyed seeing lots of GPIO Zero projects out in the wild, including this robotic lawnmower made by Kevin and Zach:

Raspberry Pi Lawnmower

Kevin and Zach’s Raspberry Pi lawnmower project, built with Python and GPIO Zero, shown at CoderDojo Coolest Projects 2017

Philip’s favourite make was a Pi-powered robot you can control with your mind! According to the maker, Laura, it worked really well with Philip because he has no hair.

Philip Colligan on Twitter

This is extraordinary. Laura from @CoderDojo Romania has programmed a mind controlled robot using @Raspberry_Pi @coolestprojects

And here are some pictures of even more cool robots we saw:

Coolest Projects 2017 coolest robot no.1
Coolest Projects 2017 coolest robot no.2
Coolest Projects 2017 coolest robot no.3

Games, toys, activities

Oly and I were massively impressed with the work of Mogamad, Daniel, and Basheerah, who programmed a (borrowed) Amazon Echo to make a voice-controlled text-adventure game using Java and the Alexa API. They’ve inspired me to try something similar using the AIY projects kit and adventurelib!

Coolest Projects 2017 Mogamad, Daniel, Basheerah, Oly
Coolest Projects 2017 Alexa text-based game

Christopher Hill did a brilliant job with his Home Alone LEGO house. He used sensors to trigger lights and sounds to make it look like someone’s at home, like in the film. I should have taken a video – seeing it in action was great!

Coolest Projects 2017 Lego home alone house
Coolest Projects 2017 Lego home alone innards
Coolest Projects 2017 Lego home alone innards closeup

Meanwhile, the Northern Ireland Raspberry Jam group ran a DOTS board activity, which turned their area into a conductive paint hazard zone.

Coolest Projects 2017 NI Jam DOTS activity 1
Coolest Projects 2017 NI Jam DOTS activity 2
Coolest Projects 2017 NI Jam DOTS activity 3
Coolest Projects 2017 NI Jam DOTS activity 4
Coolest Projects 2017 NI Jam DOTS activity 5
Coolest Projects 2017 NI Jam DOTS activity 6

Creativity and ingenuity

We really enjoyed seeing so many young people collaborating, experimenting, and taking full advantage of the opportunity to make real projects. And we loved how huge the range of technologies in use was: people employed all manner of hardware and software to bring their ideas to life.

Philip Colligan on Twitter

Wow! Look at that room full of awesome young people. @coolestprojects #coolestprojects @CoderDojo

Congratulations to the Coolest Projects 2017 prize winners, and to all participants. Here are some of the teams that won in the different categories:

Coolest Projects 2017 winning team 1
Coolest Projects 2017 winning team 2
Coolest Projects 2017 winning team 3

Take a look at the gallery of all winners over on Flickr.

The wow factor

Raspberry Pi co-founder and Foundation trustee Pete Lomas came along to the event as well. Here’s what he had to say:

It’s hard to describe the scale of the event, and photos just don’t do it justice. The first thing that hit me was the sheer excitement of the CoderDojo ninjas [the children attending Dojos]. Everyone was setting up for their time with the project judges, and their pure delight at being able to show off their creations was evident in both halls. Time and time again I saw the ninjas apply their creativity to help save the planet or make someone’s life better, and it’s truly exciting that we are going to help that continue and expand.

Even after 8 hours, enthusiasm wasn’t flagging – the awards ceremony was just brilliant, with ninjas high-fiving the winners on the way to the stage. This speaks volumes about the ethos and vision of the CoderDojo founders, where everyone is a winner just by being part of a community of worldwide friends. It was a brilliant introduction, and if this weekend was anything to go by, our merger certainly is a marriage made in Heaven.

Join this awesome community!

If all this inspires you as much as it did us, consider looking for a CoderDojo near you – and sign up as a volunteer! There’s plenty of time for young people to build up skills and start working on a project for next year’s event. Check out coolestprojects.com for more information.

The post CoderDojo Coolest Projects 2017 appeared first on Raspberry Pi.

AWS Hot Startups – May 2017

Post Syndicated from Tina Barr original https://aws.amazon.com/blogs/aws/aws-hot-startups-may-2017/

April showers bring May startups! This month we have three hot startups for you to check out. Keep reading to find out what they’re up to, and how they’re using AWS to do it.

Today’s post features the following startups:

  • Lobster – an AI-powered platform connecting creative social media users to professionals.
  • Visii – helping consumers find the perfect product using visual search.
  • Tiqets – a curated marketplace for culture and entertainment.

Lobster (London, England)

Every day, social media users generate billions of authentic images and videos to rival typical stock photography. Powered by Artificial Intelligence, Lobster enables brands, agencies, and the press to license visual content directly from social media users so they can find that piece of content that perfectly fits their brand or story. Lobster does the work of sorting through major social networks (Instagram, Flickr, Facebook, Vk, YouTube, and Vimeo) and cloud storage providers (Dropbox, Google Photos, and Verizon) to find media, saving brands and agencies time and energy. Using filters like gender, color, age, and geolocation can help customers find the unique content they’re looking for, while Lobster’s AI and visual recognition finds images instantly. Lobster also runs photo challenges to help customers discover the perfect image to fit their needs.

Lobster is an excellent platform for creative people to get their work discovered while also protecting their content. Users are treated as copyright holders and earn 75% of the final price of every sale. The platform is easy to use: new users simply sign in with an existing social media or cloud account and can start showcasing their artistic talent right away. Lobster allows users to connect to any number of photo storage sources so they’re able to choose which items to share and which to keep private. Once users have selected their favorite photos and videos to share, they can sit back and watch as their work is picked to become the signature for a new campaign or featured on a cool website – and start earning money for their work.

Lobster is using a variety of AWS services to keep everything running smoothly. The company uses Amazon S3 to store photography that was previously ordered by customers. When a customer purchases content, the respective piece of content must be available at any given moment, independent from the original source. Lobster is also using Amazon EC2 for its application servers and Elastic Load Balancing to monitor the state of each server.

To learn more about Lobster, check them out here!

Visii (London, England)

In today’s vast web, a growing number of products are being sold online and searching for something specific can be difficult. Visii was created to cater to businesses and help them extract value from an asset they already have – their images. Their SaaS platform allows clients to leverage an intelligent visual search on their websites and apps to help consumers find the perfect product for them. With Visii, consumers can choose an image and immediately discover more based on their tastes and preferences. Whether it’s clothing, artwork, or home decor, Visii will make recommendations to get consumers to search visually and subsequently help businesses increase their conversion rates.

There are multiple ways for businesses to integrate Visii on their website or app. Many of Visii’s clients choose to build against their API, but Visii also works closely with many clients to figure out the most effective way to do this for each unique case. This has led Visii to help build innovative user interfaces and figure out the best integration points to get consumers to search visually. Businesses can also integrate Visii on their website with a widget – they just need to provide a list of links to their products and Visii does the rest.

Visii runs their entire infrastructure on AWS. Their APIs and pipeline all sit in auto-scaling groups, with ELBs in front of them, sending things across into Amazon Simple Queue Service and Amazon Aurora. Recently, Visii moved from Amazon RDS to Aurora and noted that the process was incredibly quick and easy. Because they make heavy use of machine learning, it is crucial that their pipeline only runs when required and that they maximize the efficiency of their uptime.

To see how companies are using Visii, check out Style Picker and Saatchi Art.

Tiqets (Amsterdam, Netherlands)

Tiqets is making the ticket-buying experience faster and easier for travelers around the world.  Founded in 2013, Tiqets is one of the leading curated marketplaces for admission tickets to museums, zoos, and attractions. Their mission is to help travelers get the most out of their trips by helping them find and experience a city’s culture and entertainment. Tiqets partners directly with vendors to adapt to a customer’s specific needs, and is now active in over 30 cities in the US, Europe, and the Middle East.

With Tiqets, travelers can book tickets either ahead of time or at their destination for a wide range of attractions. The Tiqets app provides real-time availability and delivers tickets straight to customers’ phones via email, direct download, or in the app. Customers save time by skipping long lines (a perk of the app!), save trees (no need to physically print tickets), and, most importantly, make the most of their leisure time. For each attraction featured on Tiqets, there is a lot of helpful information, including the best modes of transportation, hours, commonly asked questions, and reviews from other customers.

The Tiqets platform consists of the consumer-facing website, the internal and external-facing APIs, and the partner self-service portals. For the app hosting and infrastructure, Tiqets uses AWS services such as Elastic Load Balancing, Amazon EC2, Amazon RDS, Amazon CloudFront, Amazon Route 53, and Amazon ElastiCache. Through the infrastructure orchestration of their AWS configuration, they can easily set up separate development or test environments while staying close to the production environment as well.

Tiqets is hiring! Be sure to check out their jobs page if you are interested in joining the Tiqets team.

Thanks for reading and don’t forget to check out April’s Hot Startups if you missed it.

-Tina Barr

 

 

How Backblaze Got Started: The Problem, The Solution, and the Stuff In-Between

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/how-backblaze-got-started/

How Backblaze Got Started

Backblaze will be celebrating its tenth anniversary this month. As I was reflecting on our path to get here, I realized that some of the issues we encountered along the way are universal to most startups. With that in mind, I’ll write a series of blog posts focused on the entrepreneurial journey. This post is the first and focuses on the birth of Backblaze. I hope you stick around and enjoy the Backblaze story along the way.

What’s Your Problem?

The entrepreneur builds things to solve problems – your own or someone else’s. That problem may be a lack of something that you wish existed or something broken you want to fix. Here’s the problem that kicked off Backblaze and how it got noticed:

Brian Wilson, now co-founder and CTO of Backblaze, had been doing tech support for friends and family, as many of us did. One day he got a panicked call from one of those friends, Lise.

Lise: “You’ve got to help me! My computer crashed!”
Brian: “No problem – we’ll get you a new laptop; where’s your backup?”
Lise: “Look, what I don’t need now is a lecture! What I need is for you to get my data back!”

Brian was religious about backing up data and had been for years. He burned his data onto a CD and a DVD, diversifying the media types he used. During the process, Brian periodically read some files from each of the discs to test his backups. Finally, Brian put one disc in his closet and mailed another to his brother in New Mexico to have it offsite. Brian did this every week!

Brian was obviously a lot more obsessive than most of us.

Lise, however, had the opposite problem. She had no backup. And she wasn’t alone.

Whose Problem Is It?

A serious pain-point for one person may turn out to be a serious pain-point for millions.

At this point, it would have been easy just to say, “Well that sucks” or blame Lise. “User error” and “they just don’t get it” are common refrains in tech. But blaming the user doesn’t solve the problem.

Brian started talking to people and asking, “Who doesn’t back up?” He also talked with me and some of the others that are now Backblaze co-founders, and we asked the same question to others.

It turned out that most people didn’t back up their computers. Lise wasn’t the anomaly; Brian was. And that was a problem.

Over the previous decade, everything had gone digital. Photos, movies, financials, taxes, everything. A single crashed hard drive could cause you to lose everything. And drives would indeed crash. Over time everything would be digital, and society as a whole would permanently lose vast amounts of information. Big problem.

Surveying the Landscape

There’s a well-known adage that “Having no competition may mean you have no market.” The corollary I’d add is that “Having competition doesn’t mean the market is full.”

Weren’t There Backup Solutions?

Yes. Plenty. In fact, we joked that we were thirty years too late to the problem.

“Solutions Exist” does not mean “Problem Solved.” Even though many backup solutions were available, most people did not back up their data.

What Were the Current Solutions?

At first glance, it seems clear we’d be competing with other backup services. But when I asked people “How do you back up your data today?”, here were the answers I heard most frequently:

  • Copy ‘My Documents’ directory to an external drive before going on vacation
  • Copy files to a USB key
  • Send important files to Gmail
  • Pray
  • And “Do I need to back up?” (I’ll talk about this one in another post.)

Sometimes people would mention a particular backup app or service, but this was rare.

What Was Wrong With the Current Solutions?

Existing backup systems had various issues. They would not back up all of the users’ data, for example. They would only back up periodically and thus didn’t have current data. Most solutions were not off-site, so fire, theft or another catastrophe could still wipe out data. Some weren’t automatic, which left more room for neglect and user error.

“Solutions Exist” does not mean “Problem Solved.”

In fairness, some backup products and services had already solved some of these issues. But few people used those products. I talked with a lot of people and asked, “Why don’t you use some backup software/service?”

The most common answer was, “I tried it…and it was too hard and too expensive.” We’d learn a lot more about what “hard” and “expensive” meant along the way.

Finding and Testing Solutions

Focus is critical for execution, but when brainstorming solutions, go broad.

We considered a variety of approaches to help people back up their files.

Peer-to-Peer Backup: This was the original idea. Two people would install our backup software, which would send each person’s data to the other’s computer. This idea had a lot going for it: the data would be off-site, it would work with existing hardware, and it was mildly viral.

Local Drive Backup: The backup software would send data to a USB hard drive. Manually copying files to an external drive was most people’s idea of backing up. However, no good software existed at the time to make this easy. (Time Machine for the Mac hadn’t launched yet.)

Backup To Online Services: Weirder and more unique, this idea stemmed from noticing that online services provided free storage: Flickr for photos; Google Docs for documents and spreadsheets; YouTube for movies; and so on. We considered writing software that would back up each file type to the service that supported it and back up the rest to Gmail.

Backup To Our Online Storage: We’d create a service that backed up data to the cloud. It may seem obvious now, but backing up to the cloud was just one of a variety of possibilities at the time. Also, initially, we didn’t mean ‘our’ storage. We assumed we would use S3 or some other storage provider.

The goal was to come up with a solution that was easy.

We put each solution we came up with through its paces. The goal was to come up with a solution that was easy: Easy for people to use. Easy to understand.

Peer-to-peer backup? First, we’d have to explain what it is (no small task) and then get buy-in from the user to host a backup on their machine. That meant having enough space on each computer, and both needed to be online at the same time. After our initial excitement with the idea, we came to the conclusion that there were too many opportunities for things to go wrong. Verdict: Not easy.

Backup software? Not off-site, and required the purchase of a hard drive. If the drive broke or wasn’t connected, no backup occurred. A useful solution but again, too many opportunities for things to go wrong. Verdict: Not easy.

Back up to online services? Users needed accounts at each, and none of the services supported all file types, so your data ended up scattered all over the place. Verdict: Not easy.

Back up to our online storage? The backup would be current, kept off-site, and updated automatically. It was easy for people to use, and easy to understand. Verdict: Easy!

Getting To the Solution

Don’t brainstorm forever. Problems don’t get solved on ideas alone.

We decided to back up to our online storage! It met many of the key goals. We started building.

Attempt #1

We built a backup software installer, a way to pick files and folders to back up, and the underlying engine that copies the files to remote storage. We tried to make it comfortable by minimizing clicks and questions.

Fail #1

This approach seemed easy enough to use, at least for us, but it turned out not to be for our target users.

We thought about the original answer we heard: “I tried it…and it was too hard and too expensive.”

“Too hard” is not enough information. What was too hard before? Were the icons too small? The text too long? A critical feature missing? Were there too many features to wade through? Or something else altogether?

Dig deeper into users’ actual needs

We reached out to a lot of friends, family, and co-workers and held some low-key pizza and beer focus groups. Those folks walked us through their backup experience. While there were a lot of difficult areas, the most complicated part was setting up what would be backed up.

“I had to get all the files and folders on my computer organized; then I could set up the backup.”

That’s like cleaning the garage. Sounds like a good idea, but life conspires to get in the way, and it doesn’t happen.

We had to solve that or users would never think of our service as ‘easy.’

Takeaway: Dig deeper into users’ actual needs.

Attempt #2

Trying to remove the need to “clean the garage,” we asked folks what they wanted to be backed up. They told us they wanted their photos, movies, music, documents, and everything important.

We listened and tried making it easier. We focused our second attempt at a backup solution by pre-selecting everything ‘important.’ We selected the documents folder and then went one step further by finding all the photo, movies, music, and other common file types on the computer. Now users didn’t have to select files and folders – we would do it for them!

Fail #2

More pizza and beer user testing had people ask, “But how do I know that my photos are being backed up?”

We told them, “We’re searching your whole computer for photos.”

“But my photos are in this weird format: .jpg, are those included? .gif? .psd?”

We learned that the backup process felt nebulous to users since they wouldn’t know what exactly would be selected. Users would always feel uncomfortable – and uncomfortable isn’t ‘easy.’

Takeaway: No, really, keep digging deeper into users’ actual needs. Identify their real problem, not the solution they propose.

Attempt #3

We took a step back and asked, “What do we know?”

We want all of our “important” files backed up, but it can be hard for us to identify what files those are. Having us guess makes us uncomfortable. So, forget the tech. What experience would be the right one?

Our answer was that the computer would just magically be backed up to the cloud.

Then one of our co-founders, Tim, wondered, “What if we didn’t ask any questions and just backed up everything?”

At first, we all looked at him askance. Back up everything? That was a lot of data. How would that be possible? But we came back to, “Is this the right answer? Yes. So let’s see if we can make it work.”

So we flipped the entire backup approach on its head.

We didn’t ask users, “What do you want to have backed up?” We asked, “What do you NOT want to be backed up?” If you didn’t know, we’d back up all your data. It took away the scary “pick your files” question and made people comfortable that all their necessary data was being backed up.

We ran that experience by users, and their surprised response was, “Really, that’s it?” Hallelujah.

Success.

Takeaway: Keep digging deeper. Don’t let the tech get in the way of understanding the real problem.

Pricing

Pricing isn’t a side-note – it’s part of the product. Understand how customers will perceive your pricing.

We had developed a solution that was easy to use and easy to understand. But could we make it easy to afford? How much do we charge?

We would be storing a lot of data for each customer. The more data they needed to store, the more it would cost us. We planned to put the data on S3, which charged $0.15/GB/month. So it would seem logical to follow that same pricing model.

People thought of the value of the service rather than an amount of storage.

People had no idea how much data they had on their hard drive and certainly not how much of it needed to be backed up. Worse, they could be off by 1000x if they weren’t sure about the difference between megabytes and gigabytes, as some were.
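To make the mismatch concrete, here is a quick back-of-the-envelope sketch (an illustration added here, not part of the original analysis): with metered pricing at the $0.15/GB/month S3 rate, the monthly bill scales directly with data size, which is exactly the quantity people couldn't estimate, and a megabyte/gigabyte mix-up moves the estimate by a factor of 1,000.

```python
# Metered per-GB pricing at the S3 rate quoted above ($0.15/GB/month).
# Illustrative arithmetic only.
PER_GB_MONTH = 0.15

for gigabytes in (10, 50, 100, 500):
    print(f"{gigabytes:>4} GB -> ${gigabytes * PER_GB_MONTH:6.2f}/month")
# ->   10 GB -> $  1.50/month ... 500 GB -> $ 75.00/month

# Mistaking gigabytes for megabytes changes the answer by 1000x:
print(f"100 MB -> ${0.1 * PER_GB_MONTH:.3f}/month")
```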

We had to solve that too, or users would never think of our service as ‘easy.’

I asked everyone I could find: “If we were to provide you a service that automatically would backup all of the data on your computer over the internet, what would that be worth to you?”

What I heard back was a bell-curve:

  • A small number of people said, “$0. It should be free. Everything on the net is free!”
  • A small number of people said, “$50 – $100/month. That’s incredibly valuable!”
  • But by far the majority said, “Hmm. If it were $5/month, that’d be a no-brainer.”

A few interesting takeaways:

  • Everyone assumed it would be a monthly charge even though I didn’t ask, “What would you pay per month.”
  • No one said, “I’d pay $x/GB/month,” so people thought of the value of the service rather than an amount of storage.
  • There may have been opportunities to offer a free service and attempt to monetize it in other ways or to charge $50 – $100/month/user, but these were the small markets.
  • At $5/month, there was a significant slice of the population that was excited to use it.

Conclusion On the Solution

Over and over again we heard, “I tried backing up, but it was too hard and too expensive.”

After really understanding what was complicated, we finally got our real solution: An unlimited online backup service that would back up all your data automatically and charge just $5/month.

Easy to use, easy to understand, and easy to afford. Easy in the ways that mattered to the people using the service.

Often looking backward things seem obvious. But we learned a lot along the way:

  • Having competition doesn’t mean the market is full. Just because solutions exist doesn’t mean the problem is solved.
  • Don’t brainstorm forever. Problems don’t get solved on ideas alone. Brainstorm options, but don’t get stuck in the brainstorming phase.
  • Dig deeper into users’ actual needs. Then keep digging. Don’t let your knowledge of tech get in the way of understanding the user. And be willing to shift course as you learn more.
  • Pricing isn’t a side-note. It’s part of the product. Understand how customers will perceive your pricing.

Just because we knew the right solution didn’t mean that it was possible. I’ll talk about that, along with how to launch, getting early traction, and more in future posts. What other questions do you have? Leave them in the comments.

The post How Backblaze Got Started: The Problem, The Solution, and the Stuff In-Between appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How To Back Up Your Flickr Library

Post Syndicated from Peter Cohen original https://www.backblaze.com/blog/how-to-backup-your-flickr-library/

Flickr and cloud backup image

UPDATE May 17, 2018: On April 20, Flickr announced that it is being acquired by the image hosting and sharing service SmugMug. At that time, Flickr users were told that they had until May 25, 2018, to either accept the new terms of service from SmugMug or download their photo files from Flickr and close their accounts. Here is an excerpt from the email that was sent to Flickr users:

We think you are going to love Flickr under SmugMug ownership, but you can choose to not have your Flickr account and data transferred to SmugMug until May 25, 2018. If you want to keep your Flickr account and data from being transferred, you must go to your Flickr account to download the photos and videos you want to keep, then delete your account from your Account Settings by May 25, 2018.

If you do not delete your account by May 25, 2018, your Flickr account and data will transfer to SmugMug and will be governed by SmugMug’s Terms and Privacy Policy.

We wanted to let our readers know of this change, and also help them download their photos if they wish to do so. To that end, we’ve updated a post we published a little over a year ago with instructions on how to download your photos from Flickr. It’s a good idea to have a backup of your photos on Flickr whether or not you plan to continue with the service.

To read more:

You can read Peter’s updated post from March 21, 2017, How to Back Up Your Flickr Library, below.

— Editor

Flickr is a popular photo blogging service used by pro and amateur photographers alike. Flickr helps you archive your photos in the cloud and share them publicly with others. What happens when Flickr is the only place you can find your photos, though?

I hadn’t given that contingency much thought. I’ve been a Flickr user since the pre-Yahoo days — 2004. I recently took stock of all the photos I’d uploaded to Flickr and realized something unsettling: I didn’t have some of these images on my Mac. It’s been 13 years and probably half a dozen computers since then, so I wasn’t surprised that some photos had fallen through the cracks.

I decided to be better safe than sorry. I set out to back up my entire Flickr library to make sure I had everything. And I’m here to pass along what I learned.

Flickr’s Camera Roll and Album Download Options

Most of Flickr’s workflow — and most of their supported apps — focus on getting images into Flickr, not out of Flickr. That doesn’t mean you can’t download images from Flickr, but it isn’t straightforward.

You can download photos directly from Flickr using their Camera Roll view, which organizes all your photos by the date they were taken. This is Flickr’s file-management interface, letting you select photos for whichever use you wish. Once you’ve selected the photos you want using the check boxes, Flickr will create a ZIP file that you can download. You are limited to 500 photos at a time, so this could take a number of repetitions if you have a lot of photos.

Flickr Camera Roll View screenshot

The download UI once you’ve made your photo selections:

Flickr Camera Roll options

You can also download Flickr albums. As with the Camera Roll, there is a limit to the number of photos you can download; for albums, the limit is 5,000 files at a time.

Flickr’s download albums selection dialog:

Flickr download albums

Guidelines from Flickr’s download help page:

screenshot of Flickr's download options

Third-party apps

Some third-party app makers have tapped into Flickr’s API to create various import and export services and apps.

Bulkr is one such app. The app, free to download, lets you download images from your Flickr library with the touch of a button. It’s dependent on Adobe Flash and requires Adobe AIR. Some features are unavailable unless you pay for the “Pro” version ($29).

Bulkr screenshot

Flickr downloadr is another free app that lets you download your Flickr library. It also works on Mac, Windows and Linux systems. No license encumbrances to download extra content — it’s released as open source.

Flickr Downloadr screenshot

I’ve tried them both on my library of over 8,000 images. In both cases, I just set up the app and let it run — it took a while, a couple of hours, to grab everything. So if you’re working with a large archive of Flickr images, I’d recommend setting aside some time when you can leave your computer running.
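If you’d rather script the export yourself, the same photo lists those apps read are exposed through Flickr’s public REST API. Below is a minimal, illustrative Python sketch, not an official tool: the API key and user ID are placeholders you’d supply yourself, flickr.people.getPublicPhotos only returns public photos (private photos require authenticated calls), and the original-size url_o field is only present when the account allows original downloads.

```python
import requests

API_KEY = "YOUR_FLICKR_API_KEY"   # placeholder: get one from Flickr's App Garden
USER_ID = "12345678@N00"          # placeholder: your Flickr NSID
REST = "https://api.flickr.com/services/rest/"

def fetch_page(page: int) -> dict:
    """Fetch one 500-photo page of the user's public photo list."""
    params = {
        "method": "flickr.people.getPublicPhotos",
        "api_key": API_KEY,
        "user_id": USER_ID,
        "extras": "url_o",        # ask Flickr to include the original-size URL
        "per_page": 500,
        "page": page,
        "format": "json",
        "nojsoncallback": 1,
    }
    return requests.get(REST, params=params, timeout=30).json()["photos"]

page, pages = 1, 1
while page <= pages:
    data = fetch_page(page)
    pages = data["pages"]
    for photo in data["photo"]:
        url = photo.get("url_o")  # missing if originals aren't downloadable
        if url:
            with open(f"{photo['id']}.jpg", "wb") as f:
                f.write(requests.get(url, timeout=60).content)
    page += 1
```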

What To Do With Your Flickr Images

You’ve downloaded the images to your local hard drive. What next? Catalog what you have. Both Macs and PCs include photo-cataloging software; the apps for each platform are both called “Photos.” They have the benefit of being free, built-in, and well supported by existing tools and workflows.

If the Photos apps included with your computer don’t suit you, there are other commercial options. Adobe Photoshop Lightroom is one of the more popular options that works with both Macs and Windows PCs. It’s included with Adobe’s $9.99 per month Creative Cloud Photography subscription (bundled with Photoshop), or you can buy it separately for $149.

Archive Your Backup

Now that you’ve downloaded all of your Flickr images, make sure they’re safe by backing them up. Back them up locally using Time Machine (on the Mac), Windows Backup or whatever means you prefer.

Even though you’ve gotten the images from the cloud by downloading them from Flickr, it’d be a good idea to store a backup copy offsite just in case. That’s keeping with the guidelines of the 3-2-1 Backup Strategy — a solid way to make sure that nothing bad can happen to your data.

Backblaze Backup and Backblaze B2 Cloud Storage are both great options, of course, for backing up and archiving your media, but the main thing is to make sure your photos are safe and sound. If anything happens to your computer or your local backup, you’ll still have a copy of those precious memories stored securely.

Need more tips on how to back up your computer? Check out our Computer Backup Guide for more details.

The post How To Back Up Your Flickr Library appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Bringing the Viewer In: The Video Opportunity in Virtual Reality

Post Syndicated from mikesefanov original https://yahooeng.tumblr.com/post/151940036881

By Satender Saroha, Video Engineering

Virtual reality (VR) 360° videos are the next frontier of how we engage with and consume content. Unlike a traditional scenario in which a person views a screen in front of them, VR places the user inside an immersive experience. A viewer is “in” the story, and not on the sidelines as an observer.

Ivan Sutherland, widely regarded as the father of computer graphics, laid out the vision for virtual reality in his famous 1965 speech, “Ultimate Display” [1]. In it, he said, “You shouldn’t think of a computer screen as a way to display information, but rather as a window into a virtual world that could eventually look real, sound real, move real, interact real, and feel real.”

Over the years, significant advancements have been made to bring reality closer to that vision. With the advent of headgear capable of rendering 3D spatial audio and video, realistic sound and visuals can be virtually reproduced, delivering immersive experiences to consumers.

When it comes to entertainment and sports, streaming in VR has become the new 4K HEVC/UHD of 2016. This has been accelerated by the release of new camera capture hardware like GoPro and streaming capabilities such as 360° video streaming from Facebook and YouTube. Yahoo streams lots of engaging sports, finance, news, and entertainment video content to tens of millions of users. The ability to produce and stream such content in 360° VR gives Yahoo a unique opportunity to offer new types of engagement, and to bring users a sense of depth and visceral presence.

While this is not an experience that is live in product, it is an area we are actively exploring. In this blog post, we take a look at what’s involved in building an end-to-end VR streaming workflow for both Live and Video on Demand (VOD). Our experiments and research go from camera rig setup, to video stitching, to encoding, to the eventual rendering of videos in video players on desktop and on VR headsets. We also discuss challenges yet to be solved and the opportunities they present in streaming VR.

1. The Workflow

Yahoo’s video platform has a workflow that is used internally to enable streaming to an audience of tens of millions with the click of a few buttons. During experimentation, we enhanced this same proven platform and set of APIs to build a complete 360°/VR experience. The diagram below shows the end-to-end workflow for streaming 360°/VR that we built on Yahoo’s video platform.

Figure 1: VR Streaming Workflow at Yahoo

1.1. Capturing 360° video

In order to capture a virtual reality video, you need access to a 360°-capable video camera. Such a camera uses either fish-eye lenses or has an array of wide-angle lenses to collectively cover a 360 (θ) by 180 (ϕ) sphere as shown below.

Though it sounds simple, there is a real challenge in capturing a scene in 3D 360° as most of the 360° video cameras offer only 2D 360° video capture.

In initial experiments, we tried capturing 3D video using two cameras side-by-side, for left and right eyes and arranging them in a spherical shape. However this required too many cameras – instead we use view interpolation in the stitching step to create virtual cameras.

Another important consideration with 360° video is the number of axes the camera is capturing video with. In traditional 360° video that is captured using only a single-axis (what we refer as horizontal video), a user can turn their head from left to right. But this setup of cameras does not support a user tilting their head at 90°.

To achieve true 3D in our setup, we went with 6-12 GoPro cameras having 120° field of view (FOV) arranged in a ring, and an additional camera each on top and bottom, with each one outputting 2.7K at 30 FPS.

1.2. Stitching 360° video

Projection Layouts

Because a 360° view is a spherical video, the surface of this sphere needs to be projected onto a planar surface in 2D so that video encoders can process it. There are two popular layouts:

Equirectangular layout: This is the most widely-used format in computer graphics to represent spherical surfaces in a rectangular form with an aspect ratio of 2:1. This format has redundant information at the poles which means some pixels are over-represented, introducing distortions at the poles compared to the equator (as can be seen in the equirectangular mapping of the sphere below).

Figure 2: Equirectangular Layout [2]
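As a quick illustration of the projection (my own sketch, not code from this pipeline), the mapping from a 3D viewing direction to a pixel in a 2:1 equirectangular frame is just a longitude/latitude conversion; the exact coordinate conventions vary between toolchains, so treat the axes here as an assumption.

```python
import math

def direction_to_equirect(x: float, y: float, z: float, width: int, height: int):
    """Map a 3D viewing direction to (u, v) pixel coordinates in a 2:1 frame."""
    r = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)                    # longitude in [-pi, pi]
    lat = math.asin(y / r)                    # latitude in [-pi/2, pi/2]
    u = (lon / (2 * math.pi) + 0.5) * width   # full 360 degrees across the width
    v = (0.5 - lat / math.pi) * height        # 180 degrees top to bottom
    return u, v

# Looking straight ahead lands in the center of the frame.
print(direction_to_equirect(0.0, 0.0, 1.0, 3840, 1920))   # -> (1920.0, 960.0)
```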

CubeMap layout: CubeMap layout is a format that has also been used in computer graphics. It contains six individual 2D textures that map to six sides of a cube. The figure below is a typical cubemap representation. In a cubemap layout, the sphere is projected onto six faces and the images are folded out into a 2D image, so pieces of a video frame map to different parts of a cube, which leads to extremely efficient compact packing. Cubemap layouts require about 25% fewer pixels compared to equirectangular layouts.

Figure 3: CubeMap Layout [3]
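The roughly 25% saving quoted above follows from simple pixel counting. Here is an illustrative calculation (my arithmetic, assuming each cube face spans 90° at the same angular resolution as the equirectangular equator, so a face is a quarter of the equirectangular width):

```python
# Pixel-count comparison: 2:1 equirectangular frame vs. six cube faces.
W, H = 3840, 1920            # equirectangular frame (2:1 aspect ratio)
face = W // 4                # one 90-degree cube face at equatorial resolution

equirect_pixels = W * H                       # 7,372,800
cubemap_pixels = 6 * face * face              # 5,529,600
print(1 - cubemap_pixels / equirect_pixels)   # -> 0.25, i.e. ~25% fewer pixels
```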

Stitching Videos

In our setup, we experimented with a couple of stitching softwares. One was from Vahana VR [4], and the other was a modified version of the open-source Surround360 technology that works with a GoPro rig [5]. Both softwares output equirectangular panoramas for the left and the right eye. Here are the steps involved in stitching together a 360° image:

Raw frame image processing: Converts uncompressed raw video data to RGB, which involves several steps, starting with black-level adjustment and continuing to demosaicing algorithms that recover the RGB color components of each pixel from its surrounding pixels. This also involves gamma correction, color correction, and anti-vignetting (undoing the reduction in brightness at the image periphery). Finally, this stage applies sharpening and noise-reduction algorithms to enhance the image and suppress noise.

Calibration: During the calibration step, stitching software takes steps to avoid vertical parallax while stitching overlapping portions in adjacent cameras in the rig. The purpose is to align everything in the scene, so that both eyes see every point at the same vertical coordinate. This step essentially matches the key points in images among adjacent camera pairs. It uses computer vision algorithms for feature detection like Binary Robust Invariant Scalable Keypoints (BRISK) [6] and AKAZE [7].
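To give a flavour of the feature-matching step, here is a small OpenCV sketch using AKAZE keypoints and a Hamming-distance brute-force matcher with a ratio test. This only illustrates the general technique named above; the file names are placeholders and the real Surround360/Vahana calibration is considerably more involved.

```python
import cv2

# Two overlapping frames from adjacent cameras in the rig (hypothetical names).
img_a = cv2.imread("cam_01.png", cv2.IMREAD_GRAYSCALE)
img_b = cv2.imread("cam_02.png", cv2.IMREAD_GRAYSCALE)

akaze = cv2.AKAZE_create()
kp_a, desc_a = akaze.detectAndCompute(img_a, None)
kp_b, desc_b = akaze.detectAndCompute(img_b, None)

# Binary descriptors are compared with Hamming distance.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = matcher.knnMatch(desc_a, desc_b, k=2)

# Lowe-style ratio test to keep only distinctive correspondences.
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} putative correspondences between the two cameras")
```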

Optical Flow: During stitching, to cover the gaps between adjacent real cameras and provide interpolated view, optical flow is used to create virtual cameras. The optical flow algorithm finds the pattern of apparent motion of image objects between two consecutive frames caused by the movement of the object or camera. It uses OpenCV algorithms to find the optical flow [8].
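For reference, a minimal dense optical flow computation with OpenCV's Farneback implementation looks like the sketch below. Frame file names are placeholders, and the actual stitcher applies flow between overlapping views from adjacent cameras to synthesize the virtual cameras, not just between consecutive frames of one camera.

```python
import cv2

prev_gray = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
next_gray = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Returns an HxWx2 array of per-pixel (dx, dy) displacements.
flow = cv2.calcOpticalFlowFarneback(
    prev_gray, next_gray, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0,
)
print(flow.shape)
```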

Below are the frames produced by the GoPro camera rig:

Figure 4: Individual frames from 12-camera rig

Figure 5: Stitched frame output with PtGui

Figure 6: Stitched frame with barrel distortion using Surround360

Figure 7: Stitched frame after removing barrel distortion using Surround360

To get the full depth in stereo, the rig is set up so that i = r * sin(FOV/2 – 360/n), where:

  • i = IPD/2, where IPD is the inter-pupillary distance between the eyes.
  • r = Radius of the rig.
  • FOV = Field of view of GoPro cameras, 120 degrees.
  • n = Number of cameras which is 12 in our setup.

Given that IPD is normally 6.4 cm, i should be greater than 3.2 cm. This implies that with a 12-camera setup, the radius of the rig comes to 14 cm. Usually, if there are more cameras, it is easier to avoid black stripes.

Reducing Bandwidth – FOV-based adaptive transcoding

For a truly immersive experience, users expect 4K (3840 x 2160) quality resolution at 60 frames per second (FPS) or higher. Given that typical HMDs have a FOV of 120 degrees, a full 360° video needs a resolution of at least 12K (11520 x 6480). 4K streaming needs a bandwidth of 25 Mbps [9]. So for 12K resolution, this effectively translates to more than 75 Mbps, and even more for higher framerates. However, the average Wi-Fi connection in the US has a bandwidth of 15 Mbps [10].

One way to address the bandwidth issue is by reducing the resolution of areas that are out of the field of view. Spatial sub-sampling is used during transcoding to produce multiple viewport-specific streams. Each viewport-specific stream has high resolution in a given viewport and low resolution in the rest of the sphere.

On the player side, we can modify traditional adaptive streaming logic to take the field of view into account. Depending on the video, if the user moves their head around a lot, this could trigger multiple buffer fetches and cause rebuffering. Ideally, this works best for videos where the excessive motion happens in one field of view at a time and does not span multiple fields of view at once. This work is still in an experimental stage.
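As a purely hypothetical illustration of what FOV-aware selection on the player side could look like (the names, thresholds, and playlist layout below are invented for this sketch, not taken from the Yahoo player):

```python
# Pick the viewport-specific rendition whose center is closest to the viewer's
# yaw, falling back to a uniform-resolution stream during rapid head motion.
VIEWPORT_STREAMS = {0: "front.m3u8", 90: "right.m3u8",
                    180: "back.m3u8", 270: "left.m3u8"}
UNIFORM_STREAM = "uniform.m3u8"

def pick_stream(yaw_degrees: float, yaw_rate_deg_per_s: float) -> str:
    if abs(yaw_rate_deg_per_s) > 120:   # head moving too fast: avoid thrashing
        return UNIFORM_STREAM
    yaw = yaw_degrees % 360
    center = min(VIEWPORT_STREAMS,
                 key=lambda c: min(abs(yaw - c), 360 - abs(yaw - c)))
    return VIEWPORT_STREAMS[center]

print(pick_stream(75.0, 10.0))   # -> right.m3u8
```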

The default output format from stitching software of both Surround360 and Vahana VR is equirectangular format. In order to reduce the size further, we pass it through a cubemap filter transform integrated into ffmpeg to get an additional pixel reduction of ~25%  [11] [12].
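The pipeline above used a custom cubemap transform filter integrated into ffmpeg ([11], [12]). As a rough present-day stand-in, recent ffmpeg builds ship a built-in v360 filter that performs the same kind of projection change; the sketch below (with placeholder file names) shows what such an invocation looks like, and is not the exact filter used in this pipeline.

```python
import subprocess

# Convert an equirectangular stitch to a 3x2 cubemap packing with ffmpeg's
# v360 filter (a stand-in for the custom cubemap filter cited in [11], [12]).
subprocess.run([
    "ffmpeg", "-i", "stitched_equirect.mp4",
    "-vf", "v360=input=equirect:output=c3x2",
    "-c:v", "libx264", "-crf", "18",
    "cubemap.mp4",
], check=True)
```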

At the end of the above steps, the stitching pipeline produces high-resolution stereo 3D panoramas, which are then ingested into the existing Yahoo Video transcoding pipeline to produce multi-bitrate HLS streams.

1.3. Adding a stitching step to the encoding pipeline

Live – In order to prepare for multi-bitrate streaming over the Internet, a live 360° video-stitched stream in RTMP is ingested into Yahoo’s video platform. A live Elemental encoder was used to re-encode and package the live input into multiple bit-rates for adaptive streaming on any device (iOS, Android, Browser, Windows, Mac, etc.)

Video on Demand – The existing Yahoo video transcoding pipeline was used to package multi-bitrate HLS streams from raw equirectangular MP4 source videos.
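For readers unfamiliar with the packaging step, the sketch below shows what producing a single HLS VOD rendition with plain ffmpeg looks like. It is only an illustration (file names, bitrate, resolution, and segment length are placeholders), not the Elemental or Yahoo pipeline configuration described above, and a real ladder would repeat this at several bitrates and generate a master playlist.

```python
import subprocess

# Package one rendition of an equirectangular source as HLS VOD segments.
subprocess.run([
    "ffmpeg", "-i", "equirect_source.mp4",
    "-c:v", "libx264", "-b:v", "8000k", "-s", "3840x1920",
    "-c:a", "aac", "-b:a", "128k",
    "-f", "hls", "-hls_time", "6", "-hls_playlist_type", "vod",
    "-hls_segment_filename", "vr_8000k_%03d.ts",
    "vr_8000k.m3u8",
], check=True)
```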

1.4. Rendering 360° video into the player

The spherical video stream is delivered to the Yahoo player in multiple bit rates. As a user changes their viewing angle, different portion of the frame are shown, presenting a 360° immersive experience. There are two types of VR players currently supported at Yahoo:

WebVR based Javascript Player – The Web community has been very active in enabling VR experiences natively without plugins from within browsers. The W3C has a Javascript proposal [13], which describes support for accessing virtual reality (VR) devices, including sensors and head-mounted displays on the Web. VR Display is the main starting point for all the device APIs supported. Some of the key interfaces and attributes exposed are:

  • VR Display Capabilities: Has attributes indicating position support, orientation support, and whether an external display is present.
  • VR Layer: Contains the HTML5 canvas element which is presented by VR Display when its submit frame is called. It also contains attributes defining the left bound and right bound textures within source canvas for presenting to an eye.
  • VREye Parameters: Has the information required to correctly render a scene for a given eye. For each eye, it has an offset – the distance from the midpoint between the user’s eyes to the center of that eye, which is half of the interpupillary distance (IPD). In addition, it maintains the current FOV of the eye, and the recommended renderWidth and renderHeight of each eye’s viewport.
  • Get VR Displays: Returns a list of VR Display(s) HMDs accessible to the browser.

We implemented a subset of the WebVR spec in the Yahoo player (not in production yet) that lets you watch monoscopic and stereoscopic 3D video on supported web browsers (Chrome, Firefox, Samsung), including Oculus Gear VR-enabled phones. The Yahoo player takes the equirectangular video and maps its individual frames onto an HTML5 Canvas element. It uses the WebGL and Three.js libraries to do the computations for detecting the orientation and extracting the corresponding frames to display.

For web devices which support only monoscopic rendering like desktop browsers without HMD, it creates a single Perspective Camera object specifying the FOV and aspect ratio. As the device’s requestAnimationFrame is called it renders the new frames. As part of rendering the frame, it first calculates the projection matrix for FOV and sets the X (user’s right), Y (Up), Z (behind the user) coordinates of the camera position.

For devices that support stereoscopic rendering, like Samsung Gear VR phones, the WebVR player creates two PerspectiveCamera objects, one for the left eye and one for the right eye. Each PerspectiveCamera queries the VR device capabilities to get the eye parameters like FOV, renderWidth, and renderHeight every time a frame needs to be rendered at the native refresh rate of the HMD. The key difference between stereoscopic and monoscopic rendering is the perceived sense of depth that the user experiences, as video frames separated by an offset are rendered by separate canvas elements to each individual eye.

Cardboard VR – Google provides a VR SDK for both iOS and Android [14]. This simplifies common VR tasks like lens distortion correction, spatial audio, head tracking, and stereoscopic side-by-side rendering. For iOS, we integrated Cardboard VR functionality into our Yahoo Video SDK, so that users can watch stereoscopic 3D videos on iOS using Google Cardboard.

2. Results

With all the pieces in place, and experimentation done, we were able to successfully do a 360° live streaming of an internal company-wide event.

Figure 8: 360° Live streaming of Yahoo internal event

In addition to demonstrating our live streaming capabilities, we are also experimenting with showing 360° VOD videos produced with a GoPro-based camera rig. Here is a screenshot of one of the 360° videos being played in the Yahoo player.

Figure 9: Yahoo Studios produced 360° VOD content in the Yahoo Player

3. Challenges and Opportunities

3.1. Enormous amounts of data

As we alluded to in the video processing section of this post, delivering 4K resolution videos for each eye for each FOV at a high frame-rate remains a challenge. While FOV-adaptive streaming does reduce the size by providing high resolution streams separately for each FOV, providing an impeccable 60 FPS or more viewing experience still requires a lot more data than the current internet pipes can handle. Some of the other possible options which we are closely paying attention to are:

Compression efficiency with HEVC and VP9 – New codecs like HEVC and VP9 have the potential to provide significant compression gains. Open source HEVC codecs like x265 have shown a 40% compression performance gain compared to the currently ubiquitous H.264/AVC codec. Likewise, the VP9 codec from Google has shown similar 40% compression performance gains. The key challenges are hardware decoding support and browser support. But with Apple and Microsoft very much behind HEVC, and Firefox and Chrome already supporting VP9, we believe most browsers will support HEVC or VP9 within a year.

Using 10-bit color depth vs. 8-bit color depth – Traditional monitors display 8 bpc (bits per channel). Since each pixel has three channels (RGB), 8 bpc gives 256x256x256 combinations, roughly 16 million colors. With 10-bit color depth (1,024 levels per channel), far more colors can be represented. But the biggest stated advantage of 10-bit color depth is improved compression efficiency during encoding, even when the source uses only 8 bits per channel. Both the x264 and x265 codecs support 10-bit color depth, and ffmpeg already supports encoding at 10-bit depth.
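As a rough sketch of what a 10-bit encode looks like in practice (the file names, preset and CRF below are illustrative placeholders, not our production settings), ffmpeg with x265 can be driven like this:

    import subprocess

    # Encode an 8-bit source to 10-bit HEVC using x265 via ffmpeg.
    # yuv420p10le requests a 10-bit pixel format; preset and CRF are
    # placeholders that would be tuned per title.
    subprocess.run([
        "ffmpeg", "-i", "input_equirect.mp4",
        "-c:v", "libx265",
        "-pix_fmt", "yuv420p10le",
        "-preset", "medium",
        "-crf", "28",
        "-c:a", "copy",
        "output_10bit_hevc.mp4",
    ], check=True)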

3.2. Six degrees of freedom

With current camera rig workflows, users viewing the streams through an HMD get three degrees of freedom (DoF): the rotational movements of looking up/down, turning left/right, and tilting the head. You still can't get a different perspective by moving within the scene, i.e., by translating forward or backward. Until now, this true six-DoF immersive VR experience has only been possible in CG VR games. For video streaming, light-field cameras produced by Lytro are the first to capture light-field volume data from all directions [15]. But light-field video requires an order of magnitude more data than traditional fixed-FOV, fixed-IPD, fixed-lens camera rigs like GoPro. As bandwidth problems get resolved through better compression and better networks, achieving true immersion should become possible.

4. Conclusion

VR streaming is an emerging medium, and with the addition of 360° VR playback capability, Yahoo's video platform gives us a great starting point for exploring the opportunities that virtual reality brings to video. As we continue to work to delight our users with immersive video content, we remain focused on optimizing the rendering of high-quality 4K content in our players. We're looking at building FOV-based adaptive streaming capabilities and better compression during delivery. These capabilities, and the enhancement of our WebVR player to run on more HMDs like the HTC Vive and Oculus Rift, will set us on track to offer streaming capabilities across the entire spectrum. At the same time, we are keeping a close watch on advancements in spatial audio experiences, as well as in the ability to stream volumetric light-field videos to achieve true six degrees of freedom, with the aim of realizing the full potential of VR.

Glossary – VR concepts:

VR – Virtual reality, commonly referred to as VR, is an immersive computer-simulated reality experience that places viewers inside an experience. It “transports” viewers from their physical reality into a closed virtual reality. VR usually requires a headset device that takes care of sights and sounds, while the most-involved experiences can include external motion tracking, and sensory inputs like touch and smell. For example, when you put on VR headgear you suddenly start feeling immersed in the sounds and sights of another universe, like the deck of the Star Trek Enterprise. Though you remain physically at your place, VR technology is designed to manipulate your senses in a manner that makes you truly feel as if you are on that ship, moving through the virtual environment and interacting with the crew.

360 degree video – A 360° video is created with a camera system that simultaneously records all 360 degrees of a scene. It is a flat equirectangular video projection that is morphed into a sphere for playback on a VR headset. A standard world map is an example of equirectangular projection, which maps the surface of the world (sphere) onto orthogonal coordinates.

Spatial Audio – Spatial audio gives the creator the ability to place sound around the user. Unlike traditional mono/stereo/surround audio, it responds to head rotation in sync with video. While listening to spatial audio content, the user receives a real-time binaural rendering of an audio stream [17].

FOV – Field of view. A human can naturally see about 170 degrees of viewable area. Most consumer-grade head-mounted displays (HMDs), such as the Oculus Rift and HTC Vive, currently display 90 to 120 degrees.

Monoscopic video – In a monoscopic video, both eyes see the same flat image or video. A common camera setup involves six cameras filming six different fields of view, with stitching software used to form a single equirectangular video. The maximum output resolution for monoscopic video on Gear VR is 3480×1920 at 30 frames per second.

Presence – Presence is a kind of immersion where the low-level systems of the brain are tricked to such an extent that they react just as they would to non-virtual stimuli.

Latency – The time between when you move your head and when you see the corresponding update on the screen. Acceptable latency ranges from about 11 ms (for games) to 20 ms (for watching 360° VR videos).

Head Tracking – There are two forms:

  • Positional tracking – movements and related translations of your body, e.g., swaying from side to side.
  • Rotational (traditional) head tracking – looking left/right and up/down, and rolling the head like a clock rotation.

References:

[1] Ultimate Display Speech as reminisced by Fred Brooks: http://www.roadtovr.com/fred-brooks-ivan-sutherlands-1965-ultimate-display-speech/

[2] Equirectangular Layout Image: https://www.flickr.com/photos/[email protected]/10111691364/

[3] CubeMap Layout: http://learnopengl.com/img/advanced/cubemaps_skybox.png

[4] Vahana VR: http://www.video-stitch.com/

[5] Surround360 Stitching software: https://github.com/facebook/Surround360

[6] Computer Vision Algorithm BRISK: https://www.robots.ox.ac.uk/~vgg/rg/papers/brisk.pdf

[7] Computer Vision Algorithm AKAZE: http://docs.opencv.org/3.0-beta/doc/tutorials/features2d/akaze_matching/akaze_matching.html

[8] Optical Flow: http://docs.opencv.org/trunk/d7/d8b/tutorial_py_lucas_kanade.html

[9] 4K connection speeds: https://help.netflix.com/en/node/306

[10] Average connection speeds in US: https://www.akamai.com/us/en/about/news/press/2016-press/akamai-releases-fourth-quarter-2015-state-of-the-internet-report.jsp

[11] CubeMap transform filter for ffmpeg: https://github.com/facebook/transform

[12] FFMPEG software: https://ffmpeg.org/

[13] WebVR Spec: https://w3c.github.io/webvr/

[14] Google Daydream SDK: https://vr.google.com/cardboard/developers/

[15] Lytro LightField Volume for six DoF: https://www.lytro.com/press/releases/lytro-immerge-the-worlds-first-professional-light-field-solution-for-cinematic-vr

[16] 10 bit color depth: https://gist.github.com/l4n9th4n9/4459997

Personalized Group Recommendations Are Here | code.flickr.com

Post Syndicated from davglass original https://yahooeng.tumblr.com/post/151144204266

Personalized Group Recommendations Are Here | code.flickr.com:

There are two primary paradigms for the discovery of digital content. First is the search paradigm, in which the user is actively looking for specific content using search terms and filters (e.g., Google web search, Flickr image search, Yelp restaurant search, etc.). Second is a passive approach, in which the user browses content presented to them (e.g., NYTimes news, Flickr Explore, and Twitter trending topics). Personalization benefits both approaches by providing relevant content that is tailored to users' tastes (e.g., Google News, Netflix homepage, LinkedIn job search, etc.). We believe personalization can improve the user experience at Flickr by guiding both new and more experienced members as they explore photography. Today, we're excited to bring you personalized group recommendations.

Read more over at code.flickr.com

Docker comes to Raspberry Pi

Post Syndicated from Matt Richardson original https://www.raspberrypi.org/blog/docker-comes-to-raspberry-pi/

If you’re not already familiar with Docker, it’s a method of packaging software to include not only your code, but also other components such as a full file system, system tools, services, and libraries. You can then run the software on multiple machines without a lot of setup. Docker calls these packages containers.

Mayview Maersk by Flickr user Kees Torn

Think of it like a shipping container and you’ve got some idea of how it works. Shipping containers are a standard size so that they can be moved around at ports, and shipped via sea or land. They can also contain almost anything. Docker containers can hold your software’s code and its dependencies, so that it can easily run on many different machines. Developers often use them to create a web application server that runs on their own machine for development, and is then pushed to the cloud for the public to use.

While we’ve noticed people using Docker on Raspberry Pi for a while now, the latest release officially includes Raspbian Jessie installation support. You can now install the Docker client on your Raspberry Pi with just one terminal command:

curl -sSL get.docker.com | sh

From there, you can create your own container or download pre-made starter containers for your projects. The documentation is thorough and easy to follow.

Docker Swarm

One way to use Raspberry Pi and Docker together is Docker Swarm. With Swarm containers running on a bunch of networked Raspberry Pis, you can combine them into a cluster, build a more powerful machine, and explore how a Docker Swarm works. Alex Ellis shows you how in this video:

Docker Swarm mode Deep Dive on Raspberry Pi (scaled)

Get all the details @ http://blog.alexellis.io/live-deep-dive-pi-swarm/

You can follow along with Alex’s written tutorial as well. He has even taken it further by using Pi Zero’s USB gadget capabilities to create a tiny Docker Swarm:

Alex Ellis on Twitter

Look ma, no Ethernet! 8 core @Docker 1.12 swarm boom USB OTG @Raspberry_Pi @pimoroni pic.twitter.com/frlSQ9ePpr

The Raspberry Pi already makes many computing tasks easier; why not add deploying remote applications to that list with Docker?

The post Docker comes to Raspberry Pi appeared first on Raspberry Pi.

Aquaponics

Post Syndicated from Liz Upton original https://www.raspberrypi.org/blog/aquaponics/

So then. Aquaponics. I’d assumed it was something to do with growing underwater plants. Dead wrong.

My educative moment occurred at Disneyworld’s Epcot a couple of years ago. There’s a ride called The Land, where, after enduring  a selection of creaking dioramas illustrating different US habitats, you’re taken on a little motorised punt thing on a watery track through greenhouses groaning under the weight of four-kilogramme mega-lemons, arboreal tomatoes and Mickey-shaped pumpkins.

Giant lemon, from Arild Finne Nybø on Flickr.

At the end of the…river thing…, you’ll find a section on aquaponics. An aquaponics system creates an incredibly efficient symbiotic environment for raising food. Aquatic food (prawns, fish and the like) is raised in water. Waste products from those creatures, which in an aquatic-only environment would degrade the quality of the water, are diverted into a hydroponic system, where nitrogen-fixing bacteria turn them into nitrates and nitrites, which are used to feed edible plants. The water can then be recirculated into the fish tank.

Finesse is required. You need to be able to monitor and control temperature, drainage and pumping. Professional systems are expensive, so the enterprising aquaponics practitioner will want to build their own. Enter the Raspberry Pi. And a shipping container, a shed and some valves.

Raspberry Pi Controlled IBC based Aquaponics

Raspberry Pi Controlled IBC based Aquaponics. Details and scripts available at http://www.instructables.com/id/Raspberry-Pi-Controlled-Aquaponics/

MatthewH415, the maker, has documented the whole build at Instructables. He says:

This build uses the IBC method of Aquaponics, with modifications to include a Raspberry Pi for controlling a pump, solenoid drain, and temperature probes for water and air temperatures. The relays and timing is controlled with python scripting. Temperature and control data is collected every minute and sent to plot.ly for graphing, and future expansion will include sensors for water level and PH values for additional control.

All of my scripts are available at github.com, feel free to use them for your aquaponics setup. Thanks to Chris @ plot.ly for the help with streaming data to their service, and to the amazingly detailed build instructions provided at IBCofAquaponics.com.
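His scripts are the place to look for the real thing. Purely as a minimal sketch of the general approach (the GPIO pin number, timings and sensor path below are made-up placeholders, not taken from Matthew's build), a relay-and-temperature loop on a Raspberry Pi might look like this:

    import time
    import RPi.GPIO as GPIO

    PUMP_PIN = 17  # placeholder BCM pin driving the pump relay

    def read_temp_c(path="/sys/bus/w1/devices/28-000000000000/w1_slave"):
        # DS18B20 probes expose readings via the 1-Wire sysfs interface;
        # the device ID in the path is a placeholder.
        with open(path) as f:
            data = f.read()
        return int(data.split("t=")[-1]) / 1000.0

    GPIO.setmode(GPIO.BCM)
    GPIO.setup(PUMP_PIN, GPIO.OUT)

    try:
        while True:
            print("water temp:", read_temp_c(), "C")
            GPIO.output(PUMP_PIN, GPIO.HIGH)   # run the pump
            time.sleep(15 * 60)                # flood cycle
            GPIO.output(PUMP_PIN, GPIO.LOW)    # let the bed drain
            time.sleep(45 * 60)
    finally:
        GPIO.cleanup()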

We love it. Thanks Matthew; come the apocalypse, we at Pi Towers are happy in the safe and secure knowledge that we’ll at least have tilapia and cucumbers.

The post Aquaponics appeared first on Raspberry Pi.

Simple workflow for building web service APIs

Post Syndicated from yahoo original https://yahooeng.tumblr.com/post/142418165386

Norbert Potocki, Software Engineer @ Yahoo Inc.

APIs are at the core of server-client communication, and well-defined API contracts are essential to the overall experience of client developer communities. At Yahoo, we have explored the best methods to develop APIs, both external (like Apiary, Apigee, and API Gateway) and internal. Our main focus was to devise a methodology that provides a simple way to build new server endpoints while guaranteeing a stable, streamlined integration for client developers. The workflow itself can be used with one of many domain-specific languages (DSLs) for API modeling (e.g., Swagger, RAML, Ardielle). The main driver for this project was the need to build a new generation of Flickr APIs. Flickr has a long tradition of exposing rich capabilities via its API and innovating in this space; one of Flickr's contributions to the domain was inventing an early version of the OAuth protocol. In this post, we will share a simple workflow that demonstrates the new approach to building APIs. For the purposes of this article we will focus on Swagger, although the workflow can easily be adapted to another DSL.

Our goals for developing the workflow:

  • Standardize parts of the API but allow for them to be easily replaced or extended
  • Maximize automation opportunities
    • Auto-generation of documentation
    • SDKs
    • API validation tests
  • Drive efficiency with code re-usability
  • Focus on developer productivity
    • Well-documented, easy to use development tools
    • Easy to follow workflow
    • Room for customization
  • Encourage collaboration with the open source community
    • Use open standards
    • Use a DSL for interface modeling

What we tried before and why it didn’t work

Let’s take a look at two popular approaches to building APIs.

The first is an implementation-centric approach. Backend developers implement the service, thereby defining the API in the code. If you're lucky, there will be some code-level documentation, like javadoc, attached. Other teams (e.g., frontend, mobile, or third-party engineers) have to read the javadoc and/or the code to understand the nuances of the API contract. They need to understand the programming language used, and the implementation has to be publicly available. Versioning may be tricky, and the situation can get worse when multiple development teams work on different services that belong to a single umbrella project: there's a chance that their APIs will fall out of sync, whether in how they implement rate-limiting headers or in the versioning models they use.

The other approach is to use an in-house-developed DSL and tools. But there is already a slew of mature open-source DSLs and tools available on the market, and opting for those may be more efficient. Swagger is a perfect example: many engineers know it, and you can share your early API design with the open-source community and get feedback. Also, there's no extra learning curve involved, so the chances that somebody will contribute are higher.

The New Workflow

API components

Let’s start by discussing the API elements we work with and what roles they play in the workflow:


Figure 1 – API components
  • API specification: the centerpiece of each service, including an API contract described in a well-defined and structured,
    human-readable format.
  • Documentation: developer-friendly API reference, a user’s guide and examples of API usage. Partially
    generated from API specification and partially hand-crafted.
  • SDKs (Software Development Kits): a set of programming
    language-specific libraries that simplify interactions with APIs. They typically abstract out lower layers (e.g. HTTP) and expose API
    input and output as concepts specific to the language (e.g. objects in Java). Partially generated from the API specification.
  • Implementation: the business logic that directs how APIs provide functionality. Validated against API specification via API
    tests.
  • API tests: tests validating the implementation against the specification.

API specification – separating contract from implementation

An API specification is one of the most important parts of a web service. You may want to consider
keeping the contract definition separate from the implementation because it will allow you to:

  • Deliver the interface to customers faster (and get feedback sooner)
  • Keep your implementation private while still clearly communicating to customers what the service contract is
  • Describe APIs in a programming language-agnostic way to allow the use of different technologies for implementation
  • Work on SDKs, API documentation and clients in parallel to implementation

There are
a few popular domain-specific languages that can be used for describing the contract of your service. At Flickr, we use Swagger, and keep
all Swagger files in a GitHub repository called “api-spec”. It contains multiple yaml files that describe different parts of the API –
both reusable elements and service-specific endpoint and resource definitions.


Figure 2 – Repository holding API specification

To give you a taste of Swagger, here's what a combined yaml file could look like:


Figure 3 – sample Swagger YAML file
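In rough outline, a minimal Swagger 2.0 definition has the following shape (the paths and fields are illustrative placeholders, not Flickr's actual spec):

    swagger: "2.0"
    info:
      title: Photos API
      version: "1.0.0"
    basePath: /v1
    paths:
      /photos/{id}:
        get:
          summary: Fetch a single photo's metadata
          parameters:
            - name: id
              in: path
              required: true
              type: string
          responses:
            "200":
              description: Photo metadata
              schema:
                $ref: "#/definitions/Photo"
    definitions:
      Photo:
        type: object
        properties:
          id:
            type: string
          title:
            type: string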

One nice aspect of Swagger is the Swagger editor. It's a browser-based IDE that shows you a live preview of your API documentation (generated from the spec) and also provides a console for querying a mock backend implementation. Here's how it looks:


Figure 4 – API editor

Once the changes are approved and merged to master, a number of CD (continuous delivery) pipelines kick in: one for each SDK that we host and another for generating documentation. There is also an option to trigger a CD pipeline that generates stubs for the backend implementation, but that decision is left up to the service owner.

The power of documentation

Documentation is the most important yet most unappreciated part of software engineering. At Yahoo, we devote
lots of attention to documenting APIs, and in this workflow we keep the documentation in a separate GitHub repository. GitHub offers a great feature called GitHub
Pages
that allows us to host documentation on their servers and avoid building a custom CD pipeline for documentation. It also gives
you the ability to edit files directly in the browser. GitHub pages are powered by Jekyll, which
serves HTML pages directly from the repository. You use Markdown files to provide
content, select a web template to use and push it to the “gh-pages” branch:


Figure 5 – documentation repository

The repo contains both a hand-crafted user’s guide and an automatically generated API reference. The API reference is generated from the API specification and put in a directory called “api-reference”.


Figure 6 – API reference directory

The API reference is generated by a simple CD pipeline. Every time you merge changes to the master branch of the API specification repository, it assembles the yaml files into a single Swagger json file and submits it as a pull request to the documentation repository. Here's the simple node.js script that does the transformation:


Figure 7 – merging multiple Swagger files
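The original script is a short node.js program; the same transformation can be sketched in a few lines (shown here in Python with PyYAML; file and directory names are placeholders):

    import glob
    import json
    import yaml

    merged = {}
    for path in sorted(glob.glob("api-spec/*.yaml")):
        with open(path) as f:
            part = yaml.safe_load(f) or {}
        # Shallow-merge top-level sections such as paths and definitions.
        for key, value in part.items():
            if isinstance(value, dict):
                merged.setdefault(key, {}).update(value)
            else:
                merged[key] = value

    with open("swagger.json", "w") as f:
        json.dump(merged, f, indent=2)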

And a snippet from the CD pipeline steps that creates the pull request:


Figure 8 – generate documentation pull-request

The “api-reference” directory also contains the Swagger UI code, which is responsible for rendering the Swagger json file in the browser. It also provides a
console that allows you to send requests against a test backend instance, and comes in very handy when a customer wants to quickly explore
our APIs. Here’s how the final result looks:


Figure 9 – Swagger UI

Why having SDKs is important

Calling an API is
fun. Dealing with failures, re-tries and HTTP connection issues – not so much. That’s where services which have a dedicated SDK really
shine. An SDK can either be a thin wrapper around an HTTP client that deals with marshalling of requests and responses, or a fat client
that has extra business logic in it. Since this extra business logic is handcrafted most of the time, we will exclude it from the
discussion and focus on a thin client instead.

Thin API clients can usually be auto-generated from API specifications. We have a CD
pipeline (similar to the documentation CD pipeline) that is responsible for this process. Each SDK is kept in a separate GitHub
repository. For each API specification change, all SDKs are regenerated and pushed (as pull-requests) to appropriate repositories. Take a
look at the swagger-codegen project to learn more about SDK generation.

It’s worth mentioning that the thin layer could also be generated at runtime, based on the Swagger json file itself.

API implementation

The major question that comes up when implementing an API is: should we automatically generate the stub code? In our experience it may be worth it, but most often it's not. API stub scaffolding saves you some initial work when you add a new API. However, different service owners prefer to structure their code in different ways (packages, class names, how code is divided between REST controllers, etc.), so it's expensive to develop a one-size-fits-all generator.

The last topic we want to cover is validating the implementation against the API specification. Validation happens via tests (written in Cucumber) that are executed with every change to the implementation. We validate the API response schemas, different failure scenarios (for valid HTTP status code usage), returned headers, the rate-limiting mechanism, pagination and more. To maximize code reuse and simplify the test code, we use one of the thin SDKs for API calls within the tests.


Figure 10 – Implementation validation
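Purely as an illustration of the kinds of checks involved (the endpoint, header name and fields below are placeholders; the real suite is written in Cucumber and goes through the thin SDKs), a contract check boils down to something like:

    import requests

    def test_photo_endpoint_contract():
        # Placeholder endpoint for illustration; real tests run against a
        # test backend instance.
        resp = requests.get("https://api.example.com/v1/photos/123")
        assert resp.status_code == 200
        assert "X-RateLimit-Remaining" in resp.headers
        body = resp.json()
        assert {"id", "title"} <= set(body)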

Summary

In this article, we provided a simple yet comprehensive overview of the approach to working with APIs that we use at Flickr, and examined its key features, including the clear separation of different system components (specification, implementation, documentation, SDKs, tests), developer-friendliness and automation options. We also presented the workflow that binds all the components together in an easy-to-use, streamlined way.

Configuration management for distributed systems (using GitHub and cfg4j)

Post Syndicated from yahoo original https://yahooeng.tumblr.com/post/141920508211

Norbert Potocki, Software Engineer @ Yahoo Inc.

Warm up: Why configuration management?

When working with large-scale software systems, configuration management becomes crucial: supporting non-uniform environments gets greatly simplified if you decouple code from configuration. While building complex software and products such as Flickr, we had to come up with a simple yet powerful way to manage configuration. Popular approaches to solving this problem include using configuration files or having a dedicated configuration service. Our new solution combines the extremely popular GitHub with the cfg4j library, giving you a very flexible approach that works with applications of any size.

Why should I decouple configuration from the code?

  • Faster configuration changes (e.g., flipping feature toggles): configuration can simply be injected without requiring parts of your code to be reloaded and re-executed. Config-only updates tend to be faster than code deployments.
  • Different configuration for different environments: running your app on a laptop or in a test environment requires a different set of settings than a production instance.
  • Keeping credentials private: if you don't have a dedicated credential store, it may be convenient to keep credentials as part of configuration. They usually aren't supposed to be "public" but the code still may be. Be a good sport and don't keep credentials in a public GitHub repo 🙂

Meet the Gang: Overview of configuration management players

Let’s see what configuration-specific components we’ll be working with today:

Figure 1 – Overview of configuration management components
  • Configuration repository and editor: where your configuration lives. We're using Git for storing configuration files and GitHub as an ad hoc editor.
  • Push cache: an intermediary store that we use to improve fetch speed and to ease the load on GitHub servers.
  • CD pipeline: a continuous deployment pipeline pushing changes from the repository to the push cache and validating config correctness.
  • Configuration library: fetches configs from the push cache and exposes them to your business logic.
  • Bootstrap configuration: initial configuration specifying where your push cache is located (so that the library knows where to get configuration from).

All these players work as a team to provide an end-to-end configuration management solution.

The Coach: Configuration repository and editor

The first thing you might expect from the configuration repository and editor is ease of use. Let’s enumerate what that means:

  • Configuration should be easy to read and write
  • It should be straightforward to add a new configuration set
  • You most certainly want to be able to review changes if your team is bigger than one person
  • It's nice to see a history of changes, especially when you're trying to fix a bug in the middle of the night
  • Support from popular IDEs – freedom of choice is priceless
  • Multi-tenancy support (optional) is often pragmatic

So what options are out there that may satisfy those requirements? Three very popular formats for storing configuration are YAML, Java property files, and XML. We use YAML because it is widely supported by multiple programming languages and IDEs, and it's very readable and easy to understand, even for the non-engineer.

We could use a dedicated configuration store; however, the great thing about files is that they can be easily versioned by
version control tools like Git, which we decided to use as it’s widely known and proven.

Git provides us with a history of changes and an easy way to branch off configuration. It also has great support in the form of GitHub which we
use both as an editor (built-in support for YAML files) and collaboration tool (pull requests, forks, review tool). Both are nicely glued
together by following the Git flow branching model. Here's an example of a configuration file that we use:


Figure 2 – configuration file preview
One of the goals was
to make managing multiple configuration sets (execution environments) a breeze. We needed the ability to add and remove environments quickly. If
you look at the screenshot below, you’ll notice a “prod-us-east” directory in the path. For every environment, we stored a separate directory
with config files in Git. All of them have the exact same structure and only differ in YAML file contents.

This solution makes working with environments simple and comes in very handy during local development or new production fleet rollout (see
use cases at the end of this article). Here’s a sample config repo for a project that has only one “feature”:


Figure 3 – support for multiple environments
Some of the
products that we work with at Yahoo have a very granular architecture – hundreds of micro-services working together. For scenarios like
this, it’s convenient to store configurations for all services in a single repository, which greatly reduces the overhead of maintaining multiple
repositories. We support this use case by having multiple top-level directories each holding configurations for one service only.

The Sprinter: Push cache

The main role of the push cache is to decrease the load on the GitHub servers and improve configuration fetch time. Since speed is the only concern here, we decided to keep the push cache simple – it's just a key-value store. Consul was our choice; the nice thing is that it's fully distributed.

You can install Consul clients on the edge nodes and they will keep synchronizing across the fleet. This greatly improves both the reliability and the performance of the system. If performance is not a concern, any key-value store will do. You can also skip the push cache altogether and connect directly to GitHub, which comes in handy during development (see the use cases below to learn more about this).

The Manager: CD Pipeline

When the configuration repository is updated, a CD pipeline kicks in. It fetches the configuration, converts it into a more optimized format and pushes it to the cache. Additionally, the CD pipeline validates the configuration (once at the pull-request stage and again after it is merged to master) and controls multi-phase deployment by deploying a config change to only 20% of production hosts at a time.

The Mascot: Bootstrap configuration

Before we can connect to the push cache to fetch configuration we need to know where it is.
That’s where bootstrap configuration comes into play – it’s very simple. The config contains the hostname, port to connect to, and the name of
the environment to use. You need to put this config with your code or as part of the CD pipeline. This simple yaml file binding Spring
profiles to different Consul hosts suffices for our needs:


Figure 4 – bootstrap configuration
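In rough outline, a bootstrap file of this kind can be as small as the following sketch (host names, ports, environment names and property keys are placeholders, not our actual file):

    # application.yml – placeholder values
    ---
    spring:
      profiles: local
    configstore:
      host: localhost
      port: 8500
      environment: local
    ---
    spring:
      profiles: production
    configstore:
      host: consul.prod.example.com
      port: 8500
      environment: prod-us-east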

The Cool Guy: Configuration library


The configuration library takes care of fetching the configuration from the push cache and exposing it to your business logic. We use a library called cfg4j ("configuration for java"). This library reloads configurations from the push cache every few seconds and injects them into configuration objects that our code uses. It also takes care of local caching, merging properties from different repositories, and falling back to user-provided defaults when necessary (read more at http://www.cfg4j.org/). Briefly summarizing how we use cfg4j's features:

  • Configuration auto-reloading: each service reloads its configuration every ~30 seconds and re-configures itself automatically.
  • Multi-environment support: for our multiple environments (beta, performance, canary, production-us-west, production-us-east, etc.).
  • Local caching: remedies service interruption when the push cache or configuration repository is down, and also improves the performance of obtaining configs.
  • Fallback and merge strategies: simplifies local development and provides support for multiple configuration repositories.
  • Integration with Dependency Injection containers: because we love DI :D.

If you want to play with this library yourself, there are plenty of examples both in its documentation and in the cfg4j-sample-apps GitHub repository.
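cfg4j itself is a Java library; purely as a language-agnostic sketch of the reload-and-cache pattern described above (this is not cfg4j's API), the idea is roughly:

    import threading
    import time

    class ReloadingConfig:
        """Periodically re-fetches config and caches the last good copy."""

        def __init__(self, fetch, interval_seconds=30, defaults=None):
            self._fetch = fetch                  # callable returning a dict
            self._cache = dict(defaults or {})   # fallback / last known good values
            self._interval = interval_seconds
            threading.Thread(target=self._reload_loop, daemon=True).start()

        def _reload_loop(self):
            while True:
                try:
                    self._cache.update(self._fetch())
                except Exception:
                    pass  # keep serving the cached values on failure
                time.sleep(self._interval)

        def get(self, key, default=None):
            return self._cache.get(key, default)

A service that keeps calling config.get("feature.enabled") picks up new values within one reload interval, and an outage of the push cache simply leaves the last fetched values in place.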

The Heavy Lifter: Configurable code
The most important piece is the business logic. To make the best use of a configuration service, the business logic has to be able to re-configure itself at runtime. A few rules of thumb:

  • Use dependency injection for injecting configuration; we do this with the Spring Framework (see the bootstrap configuration above for host/port values).
  • Use configuration objects to inject configuration instead of providing values directly. The difference: with direct configuration injection, values won't reload as the config changes; with configuration injection via "interface binding", they will.
The exercise: Common use-cases (applying our simple solution)

Configuration during development (local overrides)

When you develop a feature, a main concern is the ability to evolve your code quickly, and a full configuration management pipeline is not conducive to that. We use the following approaches when doing local development:

  • Add a temporary configuration file to the project and use cfg4j's MergeConfigurationSource to read config from both the configuration store and your file. By making your local file the primary configuration source, you get an override mechanism: if a property is found in your file, it is used; if not, cfg4j falls back to the values from the configuration store.
  • Fork the configuration repository, make changes to the fork and use cfg4j's GitConfigurationSource to access it directly (no push cache required).
  • Set up your private push cache, point your service to the cache and edit values in it directly.
Configuration defaults

When you work with multiple environments, some of them may share a common configuration. That’s when using configuration defaults may be
convenient. You can do this by creating a “default” environment and using cfg4j’s MergeConfigurationSource
for reading config first from the original environment and then (as a fallback) from the “default” environment.

Dealing with outages

The configuration repository, the push cache and the configuration CD pipeline can all experience outages. To minimize the impact of such events, it's good practice to cache the configuration locally (in memory) after each fetch; cfg4j does that automatically.

Responding to incidents – ultra fast configuration updates (skipping configuration CD pipeline)
Tests can't always detect all problems. Bugs leak into the production environment, and at times it's important to make a config change as fast as possible to stop the fire. If you're using a push cache, the fastest way to modify config values is to make changes directly within the cache. Consul offers a rich REST API and web UI for updating configuration in the key-value store.
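For example, a value in Consul's KV store can be flipped with a single HTTP PUT (the host, key and value below are placeholders):

    import requests

    # Flip a value directly in the push cache via the Consul KV HTTP API.
    resp = requests.put(
        "http://consul.example.com:8500/v1/kv/myapp/prod-us-east/feature.enabled",
        data="false",
    )
    resp.raise_for_status()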

Keeping code and configuration in sync
Verifying that code and configuration are kept in sync happens at the configuration CD pipeline level. One part of the continuous deployment process deploys the code into a temporary execution environment and points it at the branch that contains the configuration changes. Once the service is up, we execute a batch of functional tests to verify configuration correctness.

The cool down: Summary
The presented solution is the result of work we put into building huge-scale photo-serving services. We needed a simple yet flexible configuration management system, and combining Git, GitHub, Consul and cfg4j provided a very satisfactory solution that we encourage you to try.

I want to thank the following people for reviewing this article: Bhautik Joshi, Elanna Belanger, Archie Russell.

Happy Baba Marta

Post Syndicated from Боян Юруков original http://feedproxy.google.com/~r/yurukov-blog/~3/GbiFtlWfEeI/

Another year, another Baba Marta. Once again I made martenitsi using the method I wrote about 8 years ago. This time, though, Kalina helped me, running after the twisting martenitsa. Below is a photo of the result. They are the same as in 2009, 2011, 2012 and 2013. I also made an even bigger martenitsa for the front door.



And here are the small martenitsi:
This year I also noticed something interesting. In one of my posts from 7 years ago I photographed the martenitsi on my hand. I discovered that the photo is used in quite a few articles explaining the tradition of the martenitsa. Many of them are on Ukrainian and Russian sites; some of them can be seen in this Google search. I suppose the reason is that when searching for „мартеница" in Cyrillic or Latin script, the photo used to appear among the top results. It no longer does, since I deleted my Flickr account along with all the photos in it.


CaffeOnSpark Open Sourced for Distributed Deep Learning on Big Data Clusters

Post Syndicated from yahoo original https://yahooeng.tumblr.com/post/139916828451

yahoohadoop:

By Andy Feng (@afeng76), Jun Shi and Mridul Jain (@mridul_jain), Yahoo Big ML Team
Introduction
Deep learning (DL) is a critical capability required by Yahoo product teams (e.g., Flickr, Image Search) to gain intelligence from massive amounts of online data. Many existing DL frameworks require a separate cluster for deep learning, and multiple programs have to be created for a typical machine learning pipeline (see Figure 1). The separate clusters require large datasets to be transferred between them, and introduce unwanted system complexity and latency for end-to-end learning.
Figure 1: ML Pipeline with multiple programs on separated clusters

As discussed in our earlier Tumblr post, we believe that deep learning should be conducted in the same cluster along with existing data processing pipelines to support feature engineering and traditional (non-deep) machine learning. We created CaffeOnSpark to allow deep learning training and testing to be embedded into Spark applications (see Figure 2). 
Figure 2: ML Pipeline with single program on one cluster

CaffeOnSpark: API & Configuration and CLI

CaffeOnSpark is designed to be a Spark deep learning package. Spark MLlib supports a variety of non-deep-learning algorithms for classification, regression, clustering, recommendation, and so on. Deep learning is a key capability that Spark MLlib currently lacks, and CaffeOnSpark is designed to fill that gap. The CaffeOnSpark API supports DataFrames so that you can easily interface with a training dataset that was prepared using a Spark application, and extract the predictions from the model, or features from intermediate layers, for results and data analysis using MLlib or SQL.
Figure 3: CaffeOnSpark as a Spark Deep Learning package

1:  def main(args: Array[String]): Unit = {
2:    val ctx = new SparkContext(new SparkConf())
3:    val cos = new CaffeOnSpark(ctx)
4:    val conf = new Config(ctx, args).init()
5:    val dl_train_source = DataSource.getSource(conf, true)
6:    cos.train(dl_train_source)
7:    val lr_raw_source = DataSource.getSource(conf, false)
8:    val extracted_df = cos.features(lr_raw_source)
9:    val lr_input_df = extracted_df.withColumn("Label", cos.floatarray2doubleUDF(extracted_df(conf.label)))
10:     .withColumn("Feature", cos.floatarray2doublevectorUDF(extracted_df(conf.features(0))))
11:   val lr = new LogisticRegression().setLabelCol("Label").setFeaturesCol("Feature")
12:   val lr_model = lr.fit(lr_input_df)
13:   lr_model.write.overwrite().save(conf.outputPath)
14: }

Figure 4: Scala application using CaffeOnSpark together with MLlib

Scala program in Figure 4 illustrates how CaffeOnSpark and MLlib work together:
L1-L4 … You initialize a Spark context, and use it to create CaffeOnSpark and configuration object.
L5-L6 … You use CaffeOnSpark to conduct DNN training with a training dataset on HDFS.
L7-L8 …. The learned DL model is applied to extract features from a feature dataset on HDFS.
L9-L12 … MLlib uses the extracted features to perform non-deep learning (more specifically logistic regression for classification).
L13 … You could save the classification model onto HDFS.

As illustrated in Figure 4, CaffeOnSpark enables deep learning steps to be seamlessly embedded in Spark applications. It eliminates unwanted data movement in traditional solutions (as illustrated in Figure 1), and enables deep learning to be conducted on big-data clusters directly. Direct access to big-data and massive computation power are critical for DL to find meaningful insights in a timely manner.
CaffeOnSpark uses the same configuration files for solvers and neural networks as standard Caffe. As illustrated in our example, the neural network will have a MemoryData layer with two extra parameters:

  • source_class, specifying a data source class
  • source, specifying the dataset location

The initial CaffeOnSpark release has several built-in data source classes (including com.yahoo.ml.caffe.LMDB for LMDB databases and com.yahoo.ml.caffe.SeqImageDataSource for Hadoop sequence files). Users can easily introduce customized data source classes to interact with existing data formats.

CaffeOnSpark applications are launched with standard Spark commands, such as spark-submit. Here are two examples of spark-submit commands. The first uses CaffeOnSpark to train a DNN model and save it to HDFS. The second is a custom Spark application that embeds CaffeOnSpark along with MLlib.
First command:
spark-submit --files caffenet_train_solver.prototxt,caffenet_train_net.prototxt --num-executors 2 --class com.yahoo.ml.caffe.CaffeOnSpark caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar -train -persistent -conf caffenet_train_solver.prototxt -model hdfs:///sample_images.model -devices 2
Second command:
spark-submit --files caffenet_train_solver.prototxt,caffenet_train_net.prototxt --num-executors 2 --class com.yahoo.ml.caffe.examples.MyMLPipeline caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar -features fc8 -label label -conf caffenet_train_solver.prototxt -model hdfs:///sample_images.model -output hdfs:///image_classifier_model -devices 2

System Architecture
Figure 5: System Architecture

Figure 5 describes the system architecture of CaffeOnSpark. We launch Caffe engines on GPU or CPU devices within the Spark executor by invoking a JNI layer with fine-grained memory management. Unlike traditional Spark applications, CaffeOnSpark executors communicate with each other through an MPI allreduce-style interface over TCP/Ethernet or RDMA/InfiniBand. This Spark+MPI architecture enables CaffeOnSpark to achieve performance similar to dedicated deep learning clusters.
Many deep learning jobs are long running, and it is important to handle potential system failures. CaffeOnSpark snapshots the training state periodically, so a job can resume from the previous state after a failure.
Open Source
In the last several quarters, Yahoo has applied CaffeOnSpark on several projects, and we have received much positive feedback from our internal users. Flickr teams, for example, made significant improvements on image recognition accuracy with CaffeOnSpark by training with millions of photos from the Yahoo Webscope Flickr Creative Commons 100M dataset on Hadoop clusters.
CaffeOnSpark is beneficial to deep learning community and the Spark community. In order to advance the fields of deep learning and artificial intelligence, Yahoo is happy to release CaffeOnSpark at github.com/yahoo/CaffeOnSpark under Apache 2.0 license.
CaffeOnSpark can be tested on AWS EC2 or on your own Spark clusters. Please find the detailed instructions in the Yahoo GitHub repository, and share your feedback at [email protected]. Our goal is to make CaffeOnSpark widely available to deep learning scientists and researchers, and we welcome contributions from the community to make that happen.

Hadoop Turns 10

Post Syndicated from yahoo original https://yahooeng.tumblr.com/post/138742476996

yahoohadoop:

by Peter Cnudde, VP of Engineering
It is hard to believe that 10 years have already passed since Hadoop was started at Yahoo. We initially applied it to web search, but since then, Hadoop has become central to everything we do at the company. Today, Hadoop is the de facto platform for processing and storing big data for thousands of companies around the world, including most of the Fortune 500. It has also given birth to a thriving industry around it, comprised of a number of companies who have built their businesses on the platform and continue to invest and innovate to expand its capabilities.
At Yahoo, Hadoop remains a cornerstone technology on which virtually every part of our business relies on to power our world-class products, and deliver user experiences that delight more than a billion users worldwide. Whether it is content personalization for increasing engagement, ad targeting and optimization for serving the right ad to the right consumer, new revenue streams from native ads and mobile search monetization, data processing pipelines, mail anti-spam or search assist and analytics – Hadoop touches them all.
When it comes to scale, Yahoo still boasts one of the largest Hadoop deployments in the world. From a footprint standpoint, we maintain over 35,000 Hadoop servers as a central hosted platform running across 16 clusters with a combined 600 petabytes in storage capacity (HDFS), allowing us to execute 34 million monthly compute jobs on the platform.
But we aren’t stopping there, and actively collaborate with the Hadoop community to further push the scalability boundaries and advance technological innovation. We have used MapReduce historically to power batch-oriented processing, but continue to invest in and adopt low latency data processing stacks on top of Hadoop, such as Storm for stream processing, and Tez and Spark for faster batch processing.
What’s more, the applications of these innovations have spanned the gamut – from cool and fun features, like Flickr’s Magic View to one of our most exciting recent projects that involves combining Apache Spark and Caffe. The project allows us to leverage GPUs to power deep learning on Hadoop clusters. This custom deployment bridges the gap between HPC (High Performance Computing) and big data, and is helping position Yahoo as a frontrunner in the next generation of computing and machine learning.
We’re delighted by the impact the platform has made to the big data movement, and can’t wait to see what the next 10 years has in store.
Cheers!

Perceptual Image Compression at Flickr

Post Syndicated from yahoo original https://yahooeng.tumblr.com/post/130574301641

Archie Russell, Peter Norby, Saeideh Bakhshi

At Flickr our users really care about image quality. They also care a lot about how responsive our apps are. Addressing both of these concerns simultaneously is challenging; higher quality images have larger file sizes and are slower to transfer. Slow transfers are especially noticeable on mobile devices. Flickr had historically targeted high image quality, but in late 2014 we implemented a method to both maintain image quality and decrease file size. As image appearance is very important to our users, we performed an extensive user test before rolling this change out. Here's how we did it.

Background: JPEG Quality Settings

JPEG compression has several tunable knobs. The q-value is the best known of these; it adjusts the level of spatial detail stored for fine details: a higher q-value typically keeps more detail. However, as q-value gets very close to 100, file size increases dramatically, usually without improving image appearance.

If file size and app performance aren't an issue, dialing up q-value is an easy way to get really nice-looking images; this is what Flickr has done in the past. And if appearance isn't very important, dialing down q-value is a viable option. But if you want both, you're kind of stuck. Additionally, q-value is not one size fits all; some images look great at q-value 80 while others don't.

Another commonly adjusted setting is chroma subsampling, which alters the amount of color information stored in a JPEG file. With a setting of 4:4:4, the two chroma (color) channels in a JPG have as much information as the luminance channel. In an image with a setting of 4:2:0, each chroma channel has only a quarter as much information as in a 4:4:4 image.

Table 1: JPEG stored at different quality and chroma levels – q=96, chroma=4:4:4 (125KB); q=70, chroma=4:4:4 (67KB); q=96, chroma=4:2:0 (62KB); q=70, chroma=4:2:0 (62KB). The top image is saved at high quality and chroma level – notice the color and detail in the folds of the red flag. The bottom image has the lowest quality – notice artifacts along the right edges of the red flag.

Perceptual JPEG Compression

Ideally, we'd have an algorithm which automatically tuned all JPEG parameters to make a file smaller, but which would limit perceptible changes to the image. Technology exists that attempts to do this and can decrease image file size by over 30%. This compression ratio is highly dependent on image content and dimensions.

Fig 2. Compressed (l) and uncompressed (r) images. Compressed image is 36% smaller.

We were pleased with perceptually compressed images: in cursory examinations, compressed images were smaller and nearly indistinguishable from their sources. But we wanted to really quantify how well it worked before rolling it out. The standard computational tools for evaluating compression, such as SSIM, are fairly simplistic and don't do a great job at modeling how a user sees things. To really evaluate this technology we had to use a better measure of perceptibility: human minds.

To test whether our image compression would impact user perception of image quality, we put together a "taste test". The taste test was constructed as a game with multiple rounds where users looked at both compressed and uncompressed images. Users accumulated points the longer they played, and got more points for doing well at the game. We maintained a leaderboard to encourage participation and used only internal testers.
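As a toy illustration of how such side-by-side variants can be produced (this is not Flickr's production pipeline; the settings simply mirror Table 1), the Pillow imaging library can save the same source at different quality and chroma-subsampling levels:

    from PIL import Image

    img = Image.open("input.jpg")

    # The same image saved with different JPEG settings.
    # In Pillow, subsampling 0 = 4:4:4, 1 = 4:2:2, 2 = 4:2:0.
    img.save("q96_chroma444.jpg", "JPEG", quality=96, subsampling=0)
    img.save("q70_chroma420.jpg", "JPEG", quality=70, subsampling=2)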
The game's test images came from a diverse collection of 250 images contributed by Flickr staff. The images came from a variety of cameras and included a number of subjects from photographers with varying skill levels.

In each round, our test code randomly selected a test image and presented two variants of this image side by side. 50% of the time we presented the user two identical images; the rest of the time we presented one compressed image and one uncompressed image. We asked the tester if the two images looked the same or different. We expected that a user choosing randomly, or a user unable to distinguish the two cases, would answer correctly about half the time. We randomly swapped the location of the compressed images to compensate for user bias to the left or the right. If testers chose correctly, they were presented with a second question: "Which image did you prefer, and why?"

Fig 4. Screenshot of taste test.

Our test displayed images simultaneously to prevent testers noticing a longer load time for the larger, non-compressed image. The images were presented with either 320, 640, or 1600 pixels on their longest side. The 320 and 640px images were shown for 12 seconds before being dimmed out. The intent behind this detail was to represent how real users interacted with our images. The 1600px images stayed on screen for 20 seconds, as we expected larger images to be viewed for longer periods of time by real users.

Taste Test Outcome and Deployment

We ran our taste test for two weeks and analyzed our results. Although we let users play as long as they liked, we skipped the first result per user as a "warm-up" and considered only the subsequent ten results, which limited the potential for users training themselves to spot compression artifacts. We disregarded users that had fewer than eleven results.

Table 2. Taste test results. Testers selected "identical" at nearly the same rate, whether the input was identical or not.

When our testers were presented with two identical images, they thought the images were identical only 68.8% of the time(!), and when presented with a compressed image next to a non-compressed image, our testers thought the images were identical slightly less often: 67.6% of the time. This difference was small enough for us, and our statisticians told us it was statistically insignificant. Our image pairs were so similar that multiple testers thought all images were identical and reported that the test system was buggy. We inspected the images most often labeled different, and found no significant artifacts in the compressed versions.

So even in this side-by-side test, perceptual image compression was just barely noticeable. As the Flickr website wouldn't ever show compressed and uncompressed images at the same time, and the use of compression had large benefits in storage footprint and site performance, we elected to go forward.

At the beginning of 2014 we silently rolled out perceptual-based compression on our image thumbnails (we don't alter the "original" images uploaded by our users). The slight changes to image appearance went unnoticed by users, but user interactions with Flickr became much faster, especially for users with slow connections, while our storage footprint became much smaller. This was a best-case scenario for us.

Evaluating perceptual compression was a considerable task, but it gave us the confidence we needed to apply this compression in production to our users.
This marked the first time Flickr had adjusted image settings in years, and it was fun.

Fig 5. Taste test high score list

Epilogue

After eighteen months of perceptual compression at Flickr, we adjusted our settings slightly to shrink images an additional 15%. For our users on mobile devices, 15% fewer bytes per image makes for a much more responsive experience. We had run a taste test on this newer setting: users were able to spot our compression slightly more often than with our original settings. When presented with a pair of identical images, our testers declared the images identical 65.2% of the time; when presented with different images, they declared the images identical 62% of the time. It wasn't as imperceptible as our original approach, but we decided it was close enough to roll out.

Boy were we wrong! A few very vocal users spotted the compression and didn't like it at all. The Flickr Help Forum had a very lively thread which Petapixel picked up. We beat our heads against the wall, considered our options, and came up with a middle path between our initial and follow-on approaches, giving us smaller, faster-to-load files while still maintaining the appearance our users expect.

Through our use of perceptual compression, combined with our use of on-the-fly resize and COS, we were able to decrease our storage footprint dramatically while simultaneously improving user experience. It's a win all around, but we're not done yet; we still have a few tricks up our sleeves.