
Tech wishes for 2018

Post Syndicated from Eevee original https://eev.ee/blog/2018/02/18/tech-wishes-for-2018/

Anonymous asks, via money:

What would you like to see happen in tech in 2018?

(answer can be technical, social, political, combination, whatever)

Hmm.

Less of this

I’m not really qualified to speak in depth about either of these things, but let me put my foot in my mouth anyway:

The Blockchain™

Bitcoin was a neat idea. No, really! Decentralization is cool. Overhauling our terrible financial infrastructure is cool. Hash functions are cool.

Unfortunately, it seems to have devolved into mostly a get-rich-quick scheme for nerds, and by nearly any measure it’s turning into a spectacular catastrophe. Its “success” is measured in how much a bitcoin is worth in US dollars, which is pretty close to an admission from its own investors that its only value is in converting back to “real” money — all while that same “success” is making it less useful as a distinct currency.

Blah, blah, everyone already knows this.

What concerns me slightly more is the gold rush hype cycle, which is putting cryptocurrency and “blockchain” in the news and lending it all legitimacy. People have raked in millions of dollars on ICOs of novel coins I’ve never heard mentioned again. (Note: again, that value is measured in dollars.) Most likely, none of the investors will see any return whatsoever on that money. They can’t, really, unless a coin actually takes off as a currency, and that seems at odds with speculative investing since everyone either wants to hoard or ditch their coins. When the coins have no value themselves, the money can only come from other investors, and eventually the hype winds down and you run out of other investors.

I fear this will hurt a lot of people before it’s over, so I’d like for it to be over as soon as possible.


That said, the hype itself has gotten way out of hand too. First it was the obsession with “blockchain” like it’s a revolutionary technology, but hey, Git is a fucking blockchain. The novel part is the way it handles distributed consensus (which in Git is basically left for you to figure out), and that’s uniquely important to currency because you want to be pretty sure that money doesn’t get duplicated or lost when moved around.

But now we have startups trying to use blockchains for website backends and file storage and who knows what else? Why? What advantage does this have? When you say “blockchain”, I hear “single Git repository” — so when you say “email on the blockchain”, I have an aneurysm.

Bitcoin seems to have sparked imagination in large part because it’s decentralized, but I’d argue it’s actually a pretty bad example of a decentralized network, since people keep forking it. The ability to fork is a feature, sure, but the trouble here is that the Bitcoin family has no notion of federation — there is one canonical Bitcoin ledger and it has no notion of communication with any other. That’s what you want for currency, not necessarily other applications. (Bitcoin also incentivizes frivolous forking by giving the creator an initial pile of coins to keep and sell.)

And federation is much more interesting than decentralization! Federation gives us email and the web. Federation means I can set up my own instance with my own rules and still be able to meaningfully communicate with the rest of the network. Federation has some amount of tolerance for changes to the protocol, so such changes are more flexible and rely more heavily on consensus.

Federation is fantastic, and it feels like a massive tragedy that this rekindled interest in decentralization is mostly focused on peer-to-peer networks, which do little to address our current problems with centralized platforms.

And hey, you know what else is federated? Banks.

AI

Again, the tech is cool and all, but the marketing hype is getting way out of hand.

Maybe what I really want from 2018 is less marketing?

For one, I’ve seen a huge uptick in uncritically referring to any software that creates or classifies creative work as “AI”. Can we… can we not. It’s not AI. Yes, yes, nerds, I don’t care about the hair-splitting about the nature of intelligence — you know that when we hear “AI” we think of a human-like self-aware intelligence. But we’re applying it to stuff like a weird dog generator. Or to whatever neural network a website threw into production this week.

And this is dangerously misleading — we already had massive tech companies scapegoating The Algorithm™ for the poor behavior of their software, and now we’re talking about those algorithms as though they were self-aware, untouchable, untameable, unknowable entities of pure chaos whose decisions we are arbitrarily bound to. Ancient, powerful gods who exist just outside human comprehension or law.

It’s weird to see this stuff appear in consumer products so quickly, too. It feels quick, anyway. The latest iPhone can unlock via facial recognition, right? I’m sure a lot of effort was put into ensuring that the same person’s face would always be recognized… but how confident are we that other faces won’t be recognized? I admit I don’t follow all this super closely, so I may be imagining a non-problem, but I do know that humans are remarkably bad at checking for negative cases.

Hell, take the recurring problem of major platforms like Twitter and YouTube classifying anything mentioning “bisexual” as pornographic — because the word is also used as a porn genre, and someone threw a list of porn terms into a filter without thinking too hard about it. That’s just a word list, a fairly simple thing that any human can review; but suddenly we’re confident in opaque networks of inferred details?

I don’t know. “Traditional” classification and generation are much more comforting, since they’re a set of fairly abstract rules that can be examined and followed. Machine learning, as I understand it, is less about rules and much more about pattern-matching; it’s built out of the fingerprints of the stuff it’s trained on. Surely that’s just begging for tons of edge cases. They’re practically made of edge cases.


I’m reminded of a point I saw made a few days ago on Twitter, something I’d never thought about but should have. TurnItIn is a service for universities that checks whether students’ papers match any others, in order to detect cheating. But this is a paid service, one that fundamentally hinges on its corpus: a large collection of existing student papers. So students pay money to attend school, where they’re required to let their work be given to a third-party company, which then profits off of it? What kind of a goofy business model is this?

And my thoughts turn to machine learning, which is fundamentally different from an algorithm you can simply copy from a paper, because it’s all about the training data. And to get good results, you need a lot of training data. Where is that all coming from? How many for-profit companies are setting a neural network loose on the web — on millions of people’s work — and then turning around and selling the result as a product?

This is really a question of how intellectual property works in the internet era, and it continues our proud decades-long tradition of just kinda doing whatever we want without thinking about it too much. Nothing if not consistent.

More of this

A bit tougher, since computers are pretty alright now and everything continues to chug along. Maybe we should just quit while we’re ahead. There’s some real pie-in-the-sky stuff that would be nice, but it certainly won’t happen within a year, and may never happen except in some horrific Algorithmic™ form designed by people that don’t know anything about the problem space and only works 60% of the time but is treated as though it were bulletproof.

Federation

The giants are getting more giant. Maybe too giant? Granted, it could be much worse than Google and Amazon — it could be Apple!

Amazon has its own delivery service and brick-and-mortar stores now, as well as providing the plumbing for vast amounts of the web. They’re not doing anything particularly outrageous, but they kind of loom.

Ad company Google just put ad blocking in its majority-share browser — albeit for the ambiguously-noble goal of only blocking obnoxious ads so that people will be less inclined to install a blanket ad blocker.

Twitter is kind of a nightmare but no one wants to leave. I keep trying to use Mastodon as well, but I always forget about it after a day, whoops.

Facebook sounds like a total nightmare but no one wants to leave that either, because normies don’t use anything else, which is itself direly concerning.

IRC is rapidly bleeding mindshare to Slack and Discord, both of which are far better at the things IRC sadly never tried to do and absolutely terrible at the exact things IRC excels at.

The problem is the same as ever: there’s no incentive to interoperate. There’s no fundamental technical reason why Twitter and Tumblr and MySpace and Facebook can’t intermingle their posts; they just don’t, because why would they bother? It’s extra work that makes it easier for people to not use your ecosystem.

I don’t know what can be done about that, except to hope that a really big player decides to play nice out of the kindness of their heart. The really big federated success stories — say, the web — mostly won out because they came along first. At this point, how does a federated social network take over? I don’t know.

Social progress

I… don’t really have a solid grasp on what’s happening in tech socially at the moment. I’ve drifted a bit away from the industry part, which is where that all tends to come up. I have the vague sense that things are improving, but that might just be because the Rust community is the one I hear the most about, and it puts a lot of effort into being inclusive and welcoming.

So… more projects should be like Rust? Do whatever Rust is doing? And not so much what Linus is doing.

Open source funding

I haven’t heard this brought up much lately, but it would still be nice to see. The Bay Area runs on open source and is raking in zillions of dollars on its back; pump some of that cash back into the ecosystem, somehow.

I’ve seen a couple open source projects on Patreon, which is fantastic, but feels like a very small solution given how much money is flowing through the commercial tech industry.

Ad blocking

Nice. Fuck ads.

One might wonder where the money to host a website comes from, then? I don’t know. Maybe we should loop this in with the above thing and find a more informal way to pay people for the stuff they make when we find it useful, without the financial and cognitive overhead of A Transaction or Giving Someone My Damn Credit Card Number. You know, something like Bitco— ah, fuck.

Year of the Linux Desktop

I don’t know. What are we working on at the moment? Wayland? Do Wayland, I guess. Oh, and hi-DPI, which I hear sucks. And please fix my sound drivers so PulseAudio stops blaming them when it fucks up.

Pirates Crack Microsoft’s UWP Protection, Five Layers of DRM Defeated

Post Syndicated from Andy original https://torrentfreak.com/pirates-crack-microsofts-uwp-protection-five-layers-of-drm-defeated-180215/

Microsoft’s Universal Windows Platform (UWP) is a system that enables software developers to create applications that can run across many devices.

“The Universal Windows Platform (UWP) is the app platform for Windows 10. You can develop apps for UWP with just one API set, one app package, and one store to reach all Windows 10 devices – PC, tablet, phone, Xbox, HoloLens, Surface Hub and more,” Microsoft explains.

While the benefits of such a system are immediately apparent, critics say that UWP gives Microsoft an awful lot of control, not least since UWP software must be distributed via the Windows Store with Microsoft taking a cut.

Or that was the plan, at least.

Last evening it became clear that the UWP system, previously believed to be uncrackable, had fallen to pirates. After being released on October 31, 2017, the somewhat underwhelming Zoo Tycoon Ultimate Animal Collection became the first victim at the hands of popular scene group, CODEX.

“This is the first scene release of a UWP (Universal Windows Platform) game. Therefore we would like to point out that it will of course only work on Windows 10. This particular game requires Windows 10 version 1607 or newer,” the group said in its release notes.

CODEX release notes

CODEX says it’s important that the game isn’t allowed to communicate with the Internet so the group advises users to block the game’s executable in their firewall.

While that’s not a particularly unusual instruction, CODEX did reveal that various layers of protection had to be bypassed to make the game work. They’re listed by the group as MSStore, UWP, EAppX, XBLive, and Arxan, the latter being an anti-tamper system.

“It’s the equivalent of Denuvo (without the DRM License part),” cracker Voksi previously explained. “It still bloats the executable with useless virtual machines that only slow down your game.”

Arxan features

Arxan’s marketing comes off as extremely confident but may need amending in light of yesterday’s developments.

“Arxan uses code protection against reverse-engineering, key and data protection to secure servers and fortification of game logic to stop the bad guys from tampering. Sorry hackers, game over,” the company’s marketing reads.

What is unclear at this stage is whether Zoo Tycoon Ultimate Animal Collection represents a typical UWP release or if some particular flaw allowed CODEX to take it apart. The possibility of additional releases is certainly a tantalizing one for pirates but how long they will have to wait is unknown.

Whatever the outcome, Arxan calling “game over” is perhaps a little premature under the circumstances, but in this continuing arms race, they probably have another version of their anti-tamper tech up their sleeves…


Early Challenges: Making Critical Hires

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/early-challenges-making-critical-hires/


In 2009, Google disclosed that they had 400 recruiters on staff working to hire nearly 10,000 people. Someday, that might be your challenge, but most companies in their early days are looking to hire a handful of people — the right people — each year. Assuming you are closer to startup stage than Google stage, let’s look at who you need to hire, when to hire them, where to find them (and how to help them find you), and how to get them to join your company.

Who Should Be Your First Hires

In later stage companies, the roles in the company have been well fleshed out, don’t change often, and each role can be segmented to focus on a specific area. A large company may have an entire department focused on just cubicle layout; at a smaller company you may not have a single person whose actual job encompasses all of facilities. At Backblaze, our CTO has a passion and knack for facilities and mostly led that charge. Also, the needs of a smaller company are quick to change. One of our first hires was a QA person, Sean, who ended up being 100% focused on data center infrastructure. In the early stage, things can shift quite a bit and you need people that are broadly capable, flexible, and most of all willing to pitch in where needed.

That said, there are times you may need an expert. At a previous company we hired Jon, a PhD in Bayesian statistics, because we needed algorithmic analysis for spam fighting. However, even that person was able and willing not only to do the math but also to code, and not only to focus on Bayesian statistics but to explore a plethora of spam fighting options.

When To Hire

If you’ve raised a lot of cash and are willing to burn it with mistakes, you can guess at all the roles you might need and start hiring for them. No judgement: that’s a reasonable strategy if you’re cash-rich and time-poor.

If your cash is limited, try to see what you and your team are already doing and then hire people to take those jobs. It may sound counterintuitive, but if you’re already doing it presumably it needs to be done, you have a good sense of the type of skills required to do it, and you can bring someone on-board and get them up to speed quickly. That then frees you up to focus on tasks that can’t be done by someone else. At Backblaze, I ran marketing internally for years before hiring a VP of Marketing, making it easier for me to know what we needed. Once I was hiring, my primary goal was to find someone I could trust to take that role completely off of me so I could focus solely on my CEO duties.

Where To Find the Right People

Finding great people is always difficult, particularly when the skillsets you’re looking for are highly in-demand by larger companies with lots of cash and cachet. You, however, have one massive advantage: you need to hire 5 people, not 5,000.

People You Worked With

The absolute best people to hire are ones you’ve worked with before and already know are good in a work situation. Consider your last job, the one before, and the one before that. A significant number of the people we recruited at Backblaze came from our previous startup MailFrontier. We knew what they could do and how they would fit into the culture, and they knew us and thus could quickly meld into the environment. If you didn’t have a previous job, consider people you went to school with or perhaps individuals with whom you’ve done projects previously.

People You Know

Hiring friends, family, and others can be risky, but should be considered. Sometimes a friend can be a “great buddy,” but is not able to do the job or isn’t a good fit for the organization. Having to let go of someone who is a friend or family member can be rough. Have the conversation up front with them about that possibility, so you have the ability to stay friends if the position doesn’t work out. Having said that, if you get along with someone as a friend, that’s one critical component of succeeding together at work. At Backblaze we’ve hired a number of people successfully that were friends of someone in the organization.

Friends Of People You Know

Your network is likely larger than you imagine. Your employees, investors, advisors, spouses, friends, and other folks all know people who might be a great fit for you. Make sure they know the roles you’re hiring for and ask them if they know anyone that would fit. Search LinkedIn for the titles you’re looking for and see who comes up; if they’re a 2nd degree connection, ask your connection for an introduction.

People You Know About

Sometimes the person you want isn’t someone anyone knows, but you may have read something they wrote, used a product they’ve built, or seen a video of a presentation they gave. Reach out. You may get a great hire: worst case, you’ll let them know they were appreciated, and make them aware of your organization.

Other Places to Find People

There are a million other places to find people, including job sites, community groups, Facebook/Twitter, GitHub, and more. Consider where the people you’re looking for are likely to congregate online and in person.

A Comment on Diversity

Hiring “People You Know” can often result in “Hiring People Like You” with the same workplace experiences, culture, background, and perceptions. Some studies have shown [1, 2, 3, 4] that homogeneous groups deliver faster, while heterogeneous groups are more creative. Also, “Hiring People Like You” often propagates the lack of women and minorities in tech and leadership positions in general. When looking for people you know, take care not to discount those who don’t share your cultural background.

Helping People To Find You

Reaching out proactively to people is the most direct way to find someone, but you want potential hires coming to you as well. To do this, they have to a) be aware of you, b) know you have a role they’re interested in, and c) think they would want to work there. Let’s tackle a) and b) first below.

Your Blog

I started writing our blog before we launched the product and talked about anything I found interesting related to our space. For several years now our team has owned the content on the blog and in 2017 over 1.5 million people read it. Each time we have a position open it’s published to the blog. If someone finds reading about backup and storage interesting, perhaps they’d want to dig in deeper from the inside. Many of the people we’ve recruited have mentioned reading the blog as either how they found us or as a factor in why they wanted to work here.
[BTW, this is Gleb’s 200th post on Backblaze’s blog. The first was in 2008. — Editor]

Your Email List

In addition to the emails our blog subscribers receive, we send regular emails to our customers, partners, and prospects. These are largely focused on content we think is directly useful or interesting for them. However, once every few months we include a small mention that we’re hiring, along with the positions we’re looking for. Often a small blurb is all you need to capture the imagination of people who might find the jobs interesting themselves or can think of someone who might fit the bill.

Your Social Involvement

Whether it’s Twitter or Facebook, Hacker News or Slashdot, your potential hires are engaging in various communities. Being socially involved helps make people aware of you, reminds them of you when they’re considering a job, and paints a picture of what working with you and your company would be like. Adam was in a Reddit thread where we were discussing our Storage Pods, and that interaction was ultimately part of the reason he left Apple to come to Backblaze.

Convincing People To Join

Once you’ve found someone or they’ve found you, how do you convince them to join? They may be currently employed, have other offers, or have to relocate. Again, while the biggest companies have a number of advantages, you might have more unique advantages than you realize.

Why Should They Join You

Here are a set of items that you may be able to offer which larger organizations might not:

Role: Consider the strengths of the role. Perhaps it will have broader scope? More visibility at the executive level? No micromanagement? Ability to take risks? Option to create their own role?

Compensation: In addition to salary, will their options potentially be worth more since they’re getting in early? Can they trade-off salary for more options? Do they get option refreshes?

Benefits: In addition to healthcare, food, and 401(k) plans, are there unique benefits of your company? One company I knew took the entire team for a one-month working retreat abroad each year.

Location: Most people prefer to work close to home. If you’re located outside of the San Francisco Bay Area, you might be at a disadvantage for not being in the heart of tech. But if you find employees close to you, you’ve got a huge advantage. Sometimes it’s micro; even in the Bay Area the difference of 5 miles can save 20 minutes each way every day. We located the Backblaze headquarters in San Mateo, a middle-ground that made it accessible to those coming from San Jose and San Francisco. We also chose a downtown location near a train, restaurants, and cafes: all to make it easier and more pleasant. Also, are you flexible in letting your employees work remotely? Our systems administrator Elliott is about to embark on a long-term cross-country journey working from an RV.

Environment: Open office, cubicle, cafe, work-from-home? Loud/quiet? Social or focused? 24×7 or work-life balance? Different environments appeal to different people.

Team: Who will they be working with? A company with 100,000 people might have 100 brilliant ones you’d want to work with, but ultimately we work with our core team. Who will your prospective hires be working with?

Market: Some people are passionate about gaming, others biotech, still others food. The market you’re targeting will get different people excited.

Product: Have an amazing product people love? Highlight that. If you’re lucky, your potential hire is already a fan.

Mission: Curing cancer, making people happy, and other company missions inspire people to strive to be part of the journey. Our mission is to make storing data astonishingly easy and low-cost. If you care about data, information, knowledge, and progress, our mission helps drive all of them.

Culture: I left this for last, but believe it’s the most important. What is the culture of your company? Finding people who want to work in the culture of your organization is critical. If they like the culture, they’ll fit and continue it. We’ve worked hard to build a culture that’s collaborative, friendly, supportive, and open; one in which people like coming to work. For example, the five founders started with (and still have) the same compensation and equity. That started a culture of “we’re all in this together.” Build a culture that will attract the people you want, and convey what the culture is.

Writing The Job Description

Most job descriptions focus on all the requirements the candidate must meet. While important to communicate, the job description should first sell the job. Why would the appropriate candidate want the job? Then share some of the requirements you think are critical. Remember that people read not just what you say but how you say it. Try to write in a way that conveys what it is like to actually be at the company. Ahin, our VP of Marketing, said the job description itself was one of the things that attracted him to the company.

Orchestrating Interviews

Much can be said about interviewing well. I’m just going to say this: make sure that everyone who is interviewing knows that their job is not only to evaluate the candidate, but also to give them a sense of the culture and to sell them on the company. At Backblaze, we often have one person interview core prospects solely for company/culture fit.

Onboarding

Hiring success shouldn’t be defined by finding and hiring the right person, but instead by the right person being successful and happy within the organization. Ensure someone (usually their manager) provides them guidance on what they should be concentrating on doing during their first day, first week, and thereafter. Giving new employees opportunities and guidance so that they can achieve early wins and feel socially integrated into the company does wonders for bringing people on board smoothly.

In Closing

Our Director of Production Systems, Chris, said to me the other day that he looks for companies where he can work on “interesting problems with nice people.” I’m hoping you’ll find your own version of that and find this post useful in looking for your early and critical hires.

Of course, I’d be remiss if I didn’t say, if you know of anyone looking for a place with “interesting problems with nice people,” Backblaze is hiring. 😉


How I built a data warehouse using Amazon Redshift and AWS services in record time

Post Syndicated from Stephen Borg original https://aws.amazon.com/blogs/big-data/how-i-built-a-data-warehouse-using-amazon-redshift-and-aws-services-in-record-time/

This is a customer post by Stephen Borg, the Head of Big Data and BI at Cerberus Technologies.

Cerberus Technologies, in their own words: Cerberus is a company founded in 2017 by a team of visionary iGaming veterans. Our mission is simple – to offer the best tech solutions through a data-driven and a customer-first approach, delivering innovative solutions that go against traditional forms of working and process. This mission is based on the solid foundations of reliability, flexibility and security, and we intend to fundamentally change the way iGaming and other industries interact with technology.

Over the years, I have developed and created a number of data warehouses from scratch. Recently, I built a data warehouse for the iGaming industry single-handedly. To do it, I used the power and flexibility of Amazon Redshift and the wider AWS data management ecosystem. In this post, I explain how I was able to build a robust and scalable data warehouse without the large team of experts typically needed.

In two of my recent projects, I ran into challenges when scaling our data warehouse using on-premises infrastructure. Data was growing at many tens of gigabytes per day, and query performance was suffering. Scaling required major capital investment for hardware and software licenses, and also significant operational costs for maintenance and technical staff to keep it running and performing well. Unfortunately, I couldn’t get the resources needed to scale the infrastructure with data growth, and these projects were abandoned. Thanks to cloud data warehousing, the bottlenecks of infrastructure resources, capital expense, and operational costs have been significantly reduced or have gone away entirely. There is no more excuse for allowing obstacles of the past to delay delivering timely insights to decision makers, no matter how much data you have.

With Amazon Redshift and AWS, I delivered a cloud data warehouse to the business very quickly, and with a small team: me. I didn’t have to order hardware or software, and I no longer needed to install, configure, tune, or keep up with patches and version updates. Instead, I easily set up a robust data processing pipeline and we were quickly ingesting and analyzing data. Now, my data warehouse team can be extremely lean, and focus more time on bringing in new data and delivering insights. In this post, I show you the AWS services and the architecture that I used.

Handling data feeds

I have several different data sources that provide everything needed to run the business. The data includes activity from our iGaming platform, social media posts, clickstream data, marketing and campaign performance, and customer support engagements.

To handle the diversity of data feeds, I developed abstract integration applications using Docker that run on Amazon EC2 Container Service (Amazon ECS) and feed data to Amazon Kinesis Data Streams. These data streams can be used for real-time analytics. In my system, each record in Kinesis is preprocessed by an AWS Lambda function to cleanse and aggregate information, and Amazon Kinesis Data Firehose then stores it where I need it on Amazon S3. Suppose that you used an on-premises architecture to accomplish the same task. A team of data engineers would be required to maintain and monitor a Kafka cluster, develop applications to stream data, and maintain a Hadoop cluster and the infrastructure underneath it for data storage. With my stream processing architecture, there are no servers to manage, no disk drives to replace, and no service monitoring to write.
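
The preprocessing step is easy to picture as code. Below is a minimal sketch, in Python, of a Kinesis-triggered Lambda handler, assuming JSON payloads; the field names and cleansing rules are hypothetical, not taken from the author’s system.

```python
import base64
import json


def handler(event, context):
    """Sketch of a Kinesis-triggered Lambda that cleanses incoming records."""
    cleaned = []
    for record in event["Records"]:
        # Kinesis delivers each payload base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        data = json.loads(payload)

        # Hypothetical cleansing: drop empty fields, normalize a timestamp key.
        data = {k: v for k, v in data.items() if v not in (None, "")}
        if "ts" in data:
            data["event_time"] = data.pop("ts")

        cleaned.append(data)

    # A real pipeline would forward `cleaned` onward (for example to another
    # stream, or as the output of a Firehose transformation Lambda); here we
    # simply report how many records were processed.
    return {"processed": len(cleaned)}
```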

Setting up a Kinesis stream can be done with a few clicks, and the same goes for Kinesis Data Firehose. Firehose can be configured to automatically consume data from a Kinesis data stream and then write compressed data to Amazon S3 every N minutes. When I want to process a Kinesis data stream, it’s very easy to set up a Lambda function to be executed on each message received; I can just set a trigger from the AWS Lambda Management Console.
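
For readers who prefer the API to the console, roughly the same setup can be expressed with boto3. This is a sketch only: the stream name, ARNs, bucket, and buffering values are placeholders, not the configuration described in the post.

```python
import boto3

kinesis = boto3.client("kinesis")
firehose = boto3.client("firehose")

# A data stream for incoming events (shard count is a placeholder).
kinesis.create_stream(StreamName="events", ShardCount=2)

# A Firehose delivery stream that reads from the Kinesis stream and writes
# GZIP-compressed batches to S3 every 5 minutes or every 64 MB.
firehose.create_delivery_stream(
    DeliveryStreamName="events-to-s3",
    DeliveryStreamType="KinesisStreamAsSource",
    KinesisStreamSourceConfiguration={
        "KinesisStreamARN": "arn:aws:kinesis:eu-west-1:111122223333:stream/events",
        "RoleARN": "arn:aws:iam::111122223333:role/firehose-read-kinesis",
    },
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::111122223333:role/firehose-write-s3",
        "BucketARN": "arn:aws:s3:::example-data-lake",
        "Prefix": "raw/events/",
        "BufferingHints": {"IntervalInSeconds": 300, "SizeInMBs": 64},
        "CompressionFormat": "GZIP",
    },
)
```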

I also monitor the duration of function execution using Amazon CloudWatch and AWS X-Ray.

Regardless of the format in which I receive data from our partners, I can send it to Kinesis as JSON using my own formatters. After Firehose writes this to Amazon S3, I have everything in nearly the same structure I received it in, but compressed, encrypted, and optimized for reading.
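
The producer side can be equally small: a formatter that normalizes whatever a partner sends into JSON, plus a single put_record call. The field names below are illustrative assumptions rather than the author’s formatters.

```python
import json

import boto3

kinesis = boto3.client("kinesis")


def format_event(raw: dict) -> dict:
    # Hypothetical formatter: whatever shape a partner sends, emit uniform JSON.
    return {"source": raw.get("partner", "unknown"), "payload": raw}


def send(raw: dict) -> None:
    event = format_event(raw)
    kinesis.put_record(
        StreamName="events",
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=event["source"],  # spreads partners across shards
    )
```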

This data is automatically crawled by AWS Glue and placed into the AWS Glue Data Catalog. This means that I can immediately query the data directly on S3 using Amazon Athena or through Amazon Redshift Spectrum. Previously, I used Amazon EMR and an Amazon RDS–based metastore in Apache Hive for catalog management. Now I can avoid the complexity of maintaining Hive Metastore catalogs. Glue takes care of high availability and the operations side so that I know that end users can always be productive.
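
Once Glue has catalogued the files, an ad hoc query over the data sitting in S3 is a single Athena API call. The database, table, and result-bucket names here are placeholders, under the same assumptions as the sketches above.

```python
import boto3

athena = boto3.client("athena")

# Query the S3 data that Firehose landed and Glue catalogued.
response = athena.start_query_execution(
    QueryString="SELECT source, count(*) AS events FROM events GROUP BY source",
    QueryExecutionContext={"Database": "example_glue_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print(response["QueryExecutionId"])  # poll get_query_execution for completion
```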

Working with Amazon Athena and Amazon Redshift for analysis

I found Amazon Athena extremely useful out of the box for ad hoc analysis. Our engineers (me) use Athena to understand new datasets that we receive and to understand what transformations will be needed for long-term query efficiency.

For our data analysts and data scientists, we’ve selected Amazon Redshift. Amazon Redshift has proven to be the right tool for us over and over again. It easily processes 20+ million transactions per day, regardless of the footprint of the tables and the type of analytics required by the business. Latency is low and query performance expectations have been more than met. We use Redshift Spectrum for long-term data retention, which enables me to extend the analytic power of Amazon Redshift beyond local data to anything stored in S3, and without requiring me to load any data. Redshift Spectrum gives me the freedom to store data where I want, in the format I want, and have it available for processing when I need it.
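
Exposing that same Glue database to Redshift Spectrum takes one external-schema statement. The sketch below connects with psycopg2 (Redshift speaks the PostgreSQL wire protocol); the cluster endpoint, credentials, IAM role, and table name are all placeholders.

```python
import psycopg2

# Placeholder connection details for the Redshift cluster.
conn = psycopg2.connect(
    host="example-cluster.abc123.eu-west-1.redshift.amazonaws.com",
    port=5439, dbname="analytics", user="admin", password="example",
)
conn.autocommit = True  # keep DDL out of an explicit transaction block
cur = conn.cursor()

# Map the Glue Data Catalog database into Redshift as an external schema.
cur.execute("""
    CREATE EXTERNAL SCHEMA IF NOT EXISTS spectrum
    FROM DATA CATALOG DATABASE 'example_glue_db'
    IAM_ROLE 'arn:aws:iam::111122223333:role/redshift-spectrum'
""")

# Query S3-resident data alongside local Redshift tables, no loading required.
cur.execute("SELECT count(*) FROM spectrum.events")
print(cur.fetchone())
```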

To load data directly into Amazon Redshift, I use AWS Data Pipeline to orchestrate data workflows. I create Amazon EMR clusters on an intra-day basis, which I can easily adjust to run more or less frequently as needed throughout the day. EMR clusters are used together with Amazon RDS, Apache Spark 2.0, and S3 storage. The data pipeline application loads ETL configurations from Spring RESTful services hosted on AWS Elastic Beanstalk. The application then loads data from S3 into memory, aggregates and cleans the data, and then writes the final version of the data to Amazon Redshift. This data is then ready to use for analysis. Spark on EMR also helps with recommendations and personalization use cases for various business users, and I find this easy to set up and deliver what users want. Finally, business users use Amazon QuickSight for self-service BI to slice, dice, and visualize the data depending on their requirements.
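
In outline, the EMR step is a Spark job that reads the compressed files from S3, aggregates them, and writes the result to Amazon Redshift. This PySpark sketch uses a plain JDBC write with placeholder paths, table names, and credentials; a production job would more likely use a dedicated Redshift connector that stages data through S3 for a bulk COPY, driven by the ETL configuration served from the Spring services.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("intraday-etl").getOrCreate()

# Read the compressed JSON that Firehose wrote to S3 (placeholder path).
events = spark.read.json("s3://example-data-lake/raw/events/")

# Aggregate and clean in memory before loading the result into Redshift.
daily = (
    events
    .filter(F.col("event_time").isNotNull())
    .groupBy("source", F.to_date("event_time").alias("event_date"))
    .agg(F.count(F.lit(1)).alias("events"))
)

# Plain JDBC write for illustration; Redshift accepts PostgreSQL JDBC drivers.
(daily.write
    .format("jdbc")
    .option("url", "jdbc:postgresql://example-cluster.abc123.eu-west-1"
                   ".redshift.amazonaws.com:5439/analytics")
    .option("dbtable", "daily_events")
    .option("user", "admin")
    .option("password", "example")
    .option("driver", "org.postgresql.Driver")
    .mode("append")
    .save())
```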

Each AWS service in this architecture plays its part in saving precious time that’s crucial for delivery and getting different departments in the business on board. I found the services easy to set up and use, and all have proven to be highly reliable in our production environments. Once the architecture was in place, scaling out was either handled completely by the service or was a matter of a simple API call, and crucially it didn’t require me to change a single line of code. Increasing shards for Kinesis can be done in a minute by editing a stream. Increasing capacity for Lambda functions can be accomplished by editing the megabytes allocated for processing, and concurrency is handled automatically. EMR cluster capacity can easily be increased by changing the master and slave node types in Data Pipeline, or by using Auto Scaling. Lastly, RDS and Amazon Redshift can be easily upgraded without any major tasks to be performed by our team (again, me).
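
Those scaling operations also reduce to single API calls. The stream name, function name, and target values below are placeholders.

```python
import boto3

kinesis = boto3.client("kinesis")
aws_lambda = boto3.client("lambda")

# Double the Kinesis throughput by resharding the stream.
kinesis.update_shard_count(
    StreamName="events",
    TargetShardCount=4,
    ScalingType="UNIFORM_SCALING",
)

# Give the preprocessing Lambda more memory (CPU scales with it).
aws_lambda.update_function_configuration(
    FunctionName="events-preprocessor",
    MemorySize=512,
)
```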

In the end, using AWS services including Kinesis, Lambda, Data Pipeline, and Amazon Redshift allows me to keep my team lean and highly productive. I eliminated the cost and delays of capital infrastructure, as well as the late night and weekend calls for support. I can now give maximum value to the business while keeping operational costs down. My team pushed out an agile and highly responsive data warehouse solution in record time and we can handle changing business requirements rapidly, and quickly adapt to new data and new user requests.


Additional Reading

If you found this post useful, be sure to check out Deploy a Data Warehouse Quickly with Amazon Redshift, Amazon RDS for PostgreSQL and Tableau Server and Top 8 Best Practices for High-Performance ETL Processing Using Amazon Redshift.


About the Author

Stephen Borg is the Head of Big Data and BI at Cerberus Technologies. He has a background in platform software engineering, and first became involved in data warehousing using the typical RDBMS, SQL, ETL, and BI tools. He quickly became passionate about providing insight to help others optimize the business and add personalization to products.

Server vs Endpoint Backup — Which is Best?

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/endpoint-backup-for-distributed-computing/


How common are these statements in your organization?

  • “I know I saved that file. The application must have put it somewhere outside of my documents folder.” — Mike in Marketing
  • “I was on the road and couldn’t get a reliable VPN connection. I guess that’s why my laptop wasn’t backed up.” — Sally in Sales
  • “I try to follow file policies, but I had a deadline this week and didn’t have time to copy my files to the server.” — Felicia in Finance
  • “I just did a commit of my code changes and that was when the coffee mug was knocked over onto the laptop.” — Erin in Engineering
  • “If you need a file restored from backup, contact the help desk at [email protected]. The IT department will get back to you.” — XYZ corporate intranet
  • “Why don’t employees save files on the network drive like they’re supposed to?” — Isaac in IT

If these statements are familiar, most likely you rely on file server backups to safeguard your valuable endpoint data.

The problem is, the workplace has changed. Where server backups might have fit how offices worked at one time in the past, relying solely on server backups today means you could be missing valuable endpoint data from your backups. On top of that, you likely are unnecessarily expending valuable user and IT time in attempting to secure and restore endpoint data.

Times Have Changed, and so have Effective Enterprise Backup Strategies

The ways we use computers and handle files today are vastly different from just five or ten years ago. Employees are mobile, and we no longer are limited to monolithic PC and Mac-based office suites. Cloud applications are everywhere. Company-mandated network drive policies are difficult to enforce as office practices change, devices proliferate, and organizational culture evolves. Besides, your IT staff has other things to do than babysit your employees to make sure they follow your organization’s policies for managing files.

Server Backup has its Place, but Does it Support How People Work Today?

Many organizations still rely on server backup. If your organization works primarily in centralized offices with all endpoints — likely desktops — connected directly to your network, and you maintain tight control of how employees manage their files, it still might work for you.

Your IT department probably has set network drive policies that require employees to save files in standard places that are regularly backed up to your file server. Turns out, though, that even standard applications don’t always save files where IT would like them to be. They could be in a directory or folder that’s not regularly backed up.

As employees have become more mobile, they have adopted practices that enable them to access files from different places, but these practices might not fit in with your organization’s server policies. An employee saving a file to Dropbox might be planning to copy it to an “official” location later, but whether that ever happens could be doubtful. Often people don’t realize until it’s too late that accidentally deleting a file in one sync service directory means that all copies in all locations — even the cloud — are also deleted.

Employees are under increasing demands to produce, which means that network drive policies aren’t always followed; time constraints and deadlines can cause best practices to go out the window. Users will attempt to comply with policies as best they can — and you might get 70% or even 75% effective compliance — but getting even to that level requires training, monitoring, and repeatedly reminding employees of policies they need to follow — none of which leads to a good work environment.

Even if you get to 75% compliance with network file policies, what happens if the critical file needed to close out an end-of-year financial summary isn’t one of the files backed up? Getting from 70% to 80% or 90% of an endpoint’s files effectively backed up could require many additional hours from your IT department, and you still might not have backed up the one critical file you need later.

Your Organization Operates on its Data — And Today That Data Exists in Multiple Locations

Users are no longer tied to one endpoint, and may use different computers in the office, at home, or traveling. The greater the number of endpoints used, the greater the chance of an accidental or malicious device loss or data corruption. The loss of the Sales VP’s laptop at the airport on her way back from meeting with major customers can affect an entire organization and require weeks to resolve.

Even with the best intentions and efforts, following policies when out of the office can be difficult or impossible. Connecting to your private network when remote most likely requires a VPN, and VPN connectivity can be challenging from the lobby Wi-Fi at the Radisson. Server restores require time from the IT staff, which can mean taking resources away from other IT priorities and a growing backlog of requests from users who need their files as soon as possible. When users are dependent on IT to get back files critical to their work, employee productivity and often deadlines are affected.

Managing Finite Server Storage Is an Ongoing Challenge

Network drive backup usually requires on-premises data storage for endpoint backups. Since it is a finite resource, allocating that storage is another burden on your IT staff. To make sure that storage isn’t exceeded, IT departments often ration storage by department and/or user — another oversight duty for IT, and even more choices required by your IT department and department heads who have to decide which files to prioritize for backing up.

Adding Backblaze Endpoint Backup Improves Business Continuity and Productivity

Having an endpoint backup strategy in place can mitigate these problems and improve user productivity, as well. A good endpoint backup service, such as Backblaze Cloud Backup, will ensure that all devices are backed up securely, automatically, without requiring any action by the user or by your IT department.

For 99% of users, no configuration is required for Backblaze Backup. Everything on the endpoint is encrypted and securely backed up to the cloud, including program configuration files and files outside of standard document folders. Even temp files are backed up, which can prove invaluable when recovering a file after a crash or other program interruption. Cloud storage is unlimited with Backblaze Backup, so there are no worries about running out of storage or rationing file backups.

The Backblaze client can be silently and remotely installed to both Macintosh and Windows clients with no user interaction. And, with Backblaze Groups, your IT staff has complete visibility into when files were last backed up. IT staff can recover any backed up file, folder, or entire computer from the admin panel, and even give file restore capability to the user, if desired, which reduces dependency on IT and time spent waiting for restores.

With over 500 petabytes of customer data stored and one million files restored every hour of every day by Backblaze customers, you know that Backblaze Backup works for its users.

You Need Data Security That Matches the Way People Work Today

Both file server and endpoint backup have their places in an organization’s data security plan, but their use and value differ. If you already are using file server backup, adding endpoint backup will make a valuable contribution to your organization by reducing workload, improving productivity, and increasing confidence that all critical files are backed up.

By guaranteeing fast and automatic backup of all endpoint data, and matching the current way organizations and people work with data, Backblaze Backup will enable you to effectively and affordably meet the data security demands of your organization.


Rightscorp Has a Massive Database of ‘Repeat Infringers’ to Pursue

Post Syndicated from Ernesto original https://torrentfreak.com/rightscorp-has-a-massive-database-of-repeat-infringers-to-pursue-180208/

Last week the Fourth Circuit Court of Appeals ruled that ISPs are required to terminate ‘repeat infringers’ based on allegations from copyright holders alone, a topic that has been contested for years.

This means that copyright holders now have a bigger incentive to send takedown notices, as ISPs can’t easily ignore them. That’s music to the ears of the various piracy tracking companies, Rightscorp included.

The piracy monetization company always maintained that multiple complaints from copyright holders are enough to classify someone as a repeat infringer, without a court order, and the Fourth Circuit has now reached the same conclusion.

“After years of uncertainty on these issues, it is gratifying for the US Court of Appeals to proclaim the law on ISP liability for subscriber infringements to be essentially what Rightscorp has always said it is,” Rightscorp President Christopher Sabec says.

Rightscorp is pleased to see that the court shares its opinion since the verdict also provides new business opportunities. The company informs TorrentFreak that it’s ready to help copyright holders to hold ISPs responsible.

“Rightscorp has always stood with content holders who wish to protect their rights against ISPs that are not taking action against repeat infringers,” Sabec tells us.

“Now, with the law addressing ISP liability for subscriber infringements finally sharpened and clarified at the appellate level, we are ready to support all efforts by rights holders to compel ISPs to abide by their responsibilities under the DMCA.”

The piracy tracking company has a treasure trove of piracy data at its disposal to issue takedown requests or back lawsuits. Over the past five years, it amassed nearly a billion “records” of copyright infringements.

“Rightscorp’s data records include no less than 969,653,557 infringements over the last five years,” Sabec says.

This number includes a lot of repeat infringers, obviously. It’s made up of IP-addresses downloading the same file on several occasions and/or multiple files over time.

While it’s unlikely that account holders will be disconnected based on infringements that happened years ago, this type of historical data can be used in court cases. Rightscorp’s infringement notices are the basis of the legal action against Cox, and are being used as evidence in a separate RIAA case against ISP Grande communications as well.

Grande previously said that it refused to act on Rightscorp’s notices because it doubts their accuracy, but the tracking company contests this. That case is still ongoing and a final decision has yet to be reached.

For now, however, Rightscorp is marketing its hundreds of millions of recorded copyright infringements as an opportunity for rightsholders. And for a company that can use some extra cash in hand, that’s good news.


How I coined the term ‘open source’ (Opensource.com)

Post Syndicated from jake original https://lwn.net/Articles/746207/rss

Over at Opensource.com, Christine Peterson has published her account of coining the term “open source”. Originally written in 2006, her story on the origin of the term has now been published for the first time. The 20 year anniversary of the adoption of “open source” is being celebrated this year by the Open Source Initiative at various conferences (recently at linux.conf.au, at FOSDEM on February 3, and others). “Between meetings that week, I was still focused on the need for a better name and came up with the term “open source software.” While not ideal, it struck me as good enough. I ran it by at least four others: Eric Drexler, Mark Miller, and Todd Anderson liked it, while a friend in marketing and public relations felt the term “open” had been overused and abused and believed we could do better. He was right in theory; however, I didn’t have a better idea, so I thought I would try to go ahead and introduce it. In hindsight, I should have simply proposed it to Eric Raymond, but I didn’t know him well at the time, so I took an indirect strategy instead.

Todd had agreed strongly about the need for a new term and offered to assist in getting the term introduced. This was helpful because, as a non-programmer, my influence within the free software community was weak. My work in nanotechnology education at Foresight was a plus, but not enough for me to be taken very seriously on free software questions. As a Linux programmer, Todd would be listened to more closely.”

Hollywood Says Only Site-Blocking Left to Beat Piracy in New Zealand

Post Syndicated from Andy original https://torrentfreak.com/hollywood-says-only-site-blocking-left-to-beat-piracy-in-new-zealand-180123/

The Motion Picture Distributors’ Association (MPDA) is a non-profit organisation which represents major international film studios in New Zealand.

With companies including Fox, Sony, Paramount, Roadshow, Disney, and Universal on the books, the MPDA sings from the same sheet as the MPAA and MPA. It also hopes to achieve in New Zealand what its counterparts have achieved in Europe and Australia but cannot on home soil – mass pirate site blocking.

In a release heralding the New Zealand screen industry’s annual contribution of around NZ$1.05 billion to GDP and NZ$706 million to exports, MPDA Managing Director Matthew Cheetham says that despite the successes, serious challenges lie ahead.

“When we have the illegal file sharing site the Pirate Bay as New Zealand’s 19th most popular site in New Zealand, it is clear that legitimate movie and TV distribution channels face challenges,” Cheetham says.

MPDA members in New Zealand

In common with movie bosses in many regions, Cheetham is hoping that the legal system will rise to the challenge and assist distributors to tackle the piracy problem. In New Zealand, that might yet require a change in the law but given recent changes in Australia, that doesn’t seem like a distant proposition.

Last December, the New Zealand government announced an overhaul of the country’s copyright laws. A review of the Copyright Act 1994 was announced by the previous government and is now scheduled to go ahead this year. The government has already indicated a willingness to consider amendments to the Act in order to meet the objectives of New Zealand’s copyright regime.

“In New Zealand, piracy is almost an accepted thing, because no one’s really doing anything about it, because no one actually can do anything about it,” Cheetham said last month.

It’s quite unusual for Hollywood’s representatives to say nothing can be done about piracy. However, there was a small ray of hope this morning when Cheetham said that there is actually one option left.

“There’s nothing we can do in New Zealand apart from site blocking,” Cheetham said.

So, as the MPDA appears to pin its hopes on legislative change, other players in the entertainment industry are testing the legal system as it stands today.

Last September, Sky TV began a pioneering ‘pirate’ site-blocking challenge in the New Zealand High Court, applying for an injunction against several local ISPs to prevent their subscribers from accessing several pirate sites.

The boss of Vocus, one of the ISP groups targeted, responded angrily, describing Sky’s efforts as “dinosaur behavior” and something one would expect in North Korea, not in New Zealand.

“It isn’t our job to police the Internet and it sure as hell isn’t SKY’s either, all sites should be equal and open,” General Manager Taryn Hamilton said.

The response from ISPs suggests that even when the matter of site-blocking is discussed as part of the Copyright Act review, introducing specific legislation may not be smooth sailing. In that respect, all eyes will turn to the Sky process, to see if some precedent can be set there.

Finally, another familiar problem continues to raise its head down under. So-called “Kodi boxes” – the now generic phrase often used to describe set-top devices configured for piracy – are also on the content industries’ radar.

There are a couple of cases still pending against sellers, including one in which a budding entrepreneur sent out marketing letters claiming that his service was better than Sky’s offering. For seller Krish Reddy, this didn’t turn out well as the company responded with a NZ$1m lawsuit.

Generally, however, both content industries and consumers are having a good time in New Zealand but the MPDA’s Cheetham says that taking on pirates is never easy.

“It’s been called the golden age of television and a lot of premium movies have been released in the last 12 or 18 months. Content providers and distributors have really upped their game in the last five or 10 years to meet what people want but it’s very difficult to compete with free,” Cheetham concludes.


Security Breaches Don’t Affect Stock Price

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/01/security_breach.html

Interesting research: “Long-term market implications of data breaches, not,” by Russell Lange and Eric W. Burger.

Abstract: This report assesses the impact disclosure of data breaches has on the total returns and volatility of the affected companies’ stock, with a focus on the results relative to the performance of the firms’ peer industries, as represented through selected indices rather than the market as a whole. Financial performance is considered over a range of dates from 3 days post-breach through 6 months post-breach, in order to provide a longer-term perspective on the impact of the breach announcement.

Key findings:

  • While the difference in stock price between the sampled breached companies and their peers was negative (1.13%) in the first 3 days following announcement of a breach, by the 14th day the return difference had rebounded to + 0.05%, and on average remained positive through the period assessed.
  • For the differences in the breached companies’ betas and the beta of their peer sets, the differences in the means of 8 months pre-breach versus post-breach was not meaningful at 90, 180, and 360 day post-breach periods.

  • For the differences in the breached companies’ beta correlations against the peer indices pre- and post-breach, the difference in the means of the rolling 60 day correlation 8 months pre-breach versus post-breach was not meaningful at 90, 180, and 360 day post-breach periods.

  • In regression analysis, use of the number of accessed records, date, data sensitivity, and malicious versus accidental leak as variables failed to yield an R² greater than 16.15% for response variables of 3, 14, 60, and 90 day return differential, excess beta differential, and rolling beta correlation differential, indicating that the financial impact on breached companies was highly idiosyncratic.

  • Based on returns, the most impacted industries at the 3 day post-breach date were U.S. Financial Services, Transportation, and Global Telecom. At the 90 day post-breach date, the three most impacted industries were U.S. Financial Services, U.S. Healthcare, and Global Telecom.

The market isn’t going to fix this. If we want better security, we need to regulate the market.

Note: The article is behind a paywall. An older version is here. A similar article is here.

Early Challenges: Managing Cash Flow

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/managing-cash-flow/


This post by Backblaze’s CEO and co-founder Gleb Budman is the eighth in a series about entrepreneurship. You can choose posts in the series from the list below:

  1. How Backblaze got Started: The Problem, The Solution, and the Stuff In-Between
  2. Building a Competitive Moat: Turning Challenges Into Advantages
  3. From Idea to Launch: Getting Your First Customers
  4. How to Get Your First 1,000 Customers
  5. Surviving Your First Year
  6. How to Compete with Giants
  7. The Decision on Transparency
  8. Early Challenges: Managing Cash Flow

Use the Join button above to receive notification of new posts in this series.

Running out of cash is one of the quickest ways for a startup to go out of business. When you are starting a company the question of where to get cash is usually the top priority, but managing cash flow is critical for every stage in the lifecycle of a company. As a primarily bootstrapped but capital-intensive business, managing cash flow at Backblaze was and still is a key element of our success and requires continued focus. Let’s look at what we learned over the years.

Raising Your Initial Funding

When starting a tech business in Silicon Valley, the default assumption is that you will immediately try to raise venture funding. There are certainly many advantages to raising funding — not the least of which is that you don’t need to be cash-flow positive since you have cash in the bank and the expectation is that you will have a “burn rate,” i.e. you’ll be spending more than you make.

Note: While you’re not expected to be cash-flow positive, that doesn’t mean you don’t have to worry about cash. Cash-flow management will determine your burn rate. Whether you can get to cash-flow breakeven or need to raise another round of funding is a direct byproduct of your cash flow management.

Also, raising funding takes time (most successful fundraising cycles take 3-6 months start-to-finish), and time at a startup is in short supply. Constantly trying to raise funding can take away from product development and pursuing growth opportunities. If you’re not successful in raising funding, you then have to either shut down or find an alternate method of funding the business.

Sources of Funding

Depending on the stage of the company, type of company, and other factors, you may have access to different sources of funding. Let’s list a number of them:

Customers

Sales — the best kind of funding. It is non-dilutive, doesn’t have to be paid back, and is a direct metric of the success of your company.

Pre-Sales — some customers may be willing to pay you for a product in beta, a test, or pre-pay for a product they’ll receive when finished. Pre-Sales income also is great because it shares the characteristics of cash from sales, but you get the cash early. It also can be a good sign that the product you’re building fills a market need. We started charging for Backblaze computer backup while it was still in private beta, which allowed us to not only collect cash from customers, but also test the billing experience and users’ real desire for the service.

Services — if you’re a service company and customers are paying you for that work, great. Your capacity is effectively limited by the number of hours available in a day, but as demand grows, you can add more employees to increase the total number of billable hours.

Note: If you’re a product company and customers are paying you to consult, that can provide much needed cash, and could provide feedback toward the right product. However, it can also distract from your core business, send you down a path where you’re building a product for a single customer, and addict you to a path that prevents you from building a scalable business.

Investors

Yourself — you likely are putting your time into the business, and deferring salary in the process. You may also put your own cash into the business either as an investment or a loan.

Angels — angels are ideal as early investors since they are used to investing in businesses with little to no traction. AngelList is a good place to find them, though finding people you’re connected with through someone that knows you well is best.

Crowdfunding — a component of the JOBS Act permitted entrepreneurs to raise money from nearly anyone since May 2016. The SEC imposes limits on both investors and the companies. This article goes into some depth on the options and sites available.

VCs — VCs are ideal for companies that need to raise at least a few million dollars and intend to build a business that will be worth over $1 billion.

Debt

Friends & Family — F&F are often the first people to give you money because they are investing in you. It’s great to have some early supporters, but it also can be risky to take money from people who aren’t used to the risks. The key advice here is to only take money from people who won’t mind losing it. If someone is talking about using their children’s college funds or borrowing from their 401k, say ‘no thank you’ — even if they’re sure they want to loan you money.

Bank Loans — a variety of loan types exist, but most either require the company to have been operational for a couple years, be able to borrow against money the company has or is making, or be able to get a personal guarantee from the founders whereby their own credit is on the line. Fundera provides a good overview of loan options and can help secure some, but most will not be an option for a brand new startup.

Grants

Government — in some areas there is the potential for government grants to facilitate research. The SBIR program facilitates some such grants.

At Backblaze, we used a number of these options:

• Investors/Yourself
We loaned a cumulative total of a couple hundred thousand dollars to the company and invested our time by going without a salary for a year and a half.
• Customers/Pre-Sales
We started selling the Backblaze service while it was still in beta.
• Customers/Sales
We launched v1.0 and kept selling.
• Investors/Angels
After a year and a half, we raised $370k from 11 angels. All of them were either people whom we knew personally or were a strong recommendation from a mutual friend.
• Debt/Loans
After a couple years we were able to get equipment leases whereby the Storage Pods and hard drives were used as collateral to secure the lease on them.
• Investors/VCs
After five years we raised $5m from TMT Investments to add to the balance sheet and invest in growth.

The variety and quantity of sources we used is by no means uncommon.

GAAP vs. Cash

Most companies start tracking financials based on cash, and as they scale they switch to GAAP (Generally Accepted Accounting Principles). Cash is easier to track — we got paid $XXXX and spent $YYY — and as often mentioned, is required for the business to stay alive. GAAP has more subtlety and complexity, but provides a clearer picture of how the business is really doing. Backblaze was on a ‘cash’ system for the first few years, then switched to GAAP. For this post, I’m going to focus on things that help cash flow, not GAAP profitability.

Stages of Cash Flow Management

All-spend

In a pure service business (e.g. solo proprietor law firm), you may have no expenses other than your time, so this stage doesn’t exist. However, in a product business there is a period of time where you are building the product and have nothing to sell. You have zero cash coming in, but have cash going out. Your cash-flow is completely negative and you need funds to cover that.

Sales-generating

Starting to see cash come in from customers is thrilling. I initially had our system set up to email me with every $5 payment we received. You’re making sales, but not covering expenses.

Ramen-profitable

But it takes a lot of $5 payments to pay for servers and salaries, so for a while expenses are likely to outstrip sales. Getting to ramen-profitable is a critical stage where sales cover the business expenses and are “paying enough for the founders to eat ramen.” This extends the runway for a business, but is not completely sustainable, since presumably the founders can’t (or won’t) live forever on a subsistence salary.

Business-profitable

This is the ultimate stage whereby the business is truly profitable, including paying everyone market-rate salaries. A business at this stage is self-sustaining. (Of course, market shifts and plenty of other challenges can kill the business, but cash-flow issues alone will not.)

Note, I’m using the word ‘profitable’ here to mean this is still on a cash-basis.

Backblaze was in the all-spend stage for just over a year, during which time we built the service and hadn’t yet made the service available to customers. Backblaze was in the sales-generating stage for nearly another year before the company was barely ramen-profitable where sales were covering the company expenses and paying the founders minimum wage. (I say ‘barely’ since minimum wage in the SF Bay Area is arguably never subsistence.) It took almost three more years before the company was business-profitable, paying everyone including the founders market-rate.

Cash Flow Forecasting

When raising funding it’s helpful to think of milestones reached. You don’t necessarily need enough cash on day one to last for the next 100 years of the company. Some good milestones to consider are how much cash you need to prove there is a market need, prove you can build a product to meet that need, or get to ramen-profitable.

Two things to consider:

1) Unit Economics (COGS)

If your product is 100% software, this may not be relevant. Once software is built it costs effectively nothing to deliver the product to one customer or one million customers. However, in most businesses there is some incremental cost to provide the product. If you’re selling a hardware device, perhaps you sell it for $100 but it costs you $50 to make it. This is called “COGS” (Cost of Goods Sold).

Many products rely on cloud services where the costs scale with growth. That model works great, but it’s still important to understand what the costs are for the cloud service you use per unit of product you sell.

Support is often done by the founders early-on in a business, but that is another real cost to factor in and estimate on a per-user basis. Taking all of the per unit costs combined, you may charge $10/month/user for your service, but if it costs you $7/month/user in cloud services, you’re only netting $3/month/user.

2) Operating Expenses (OpEx)

These are expenses that don’t scale with the number of product units you sell. Typically this includes research & development, sales & marketing, and general & administrative expenses. Presumably there is a certain level of these functions required to build the product, market it, sell it, and run the organization. You can choose to invest or cut back on these, but you’ll still make the same amount per product unit.

Incremental Net Profit Per Unit

If you’ve calculated your COGS and your unit economics are “upside down,” where the amount you charge is less than what it costs you to provide your service, it’s worth thinking hard about how that’s going to change over time. If it will not change, there is no scale that will make the business work. Presuming you do make money on each unit of product you sell — what is sometimes referred to as “Contribution Margin” — consider how many of those product units you need to sell to cover your operating expenses as described above.

Calculating Your Profit

The math on getting to ramen-profitable is simple:

(Number of Product Units Sold x Contribution Margin) - Operating Expenses = Profit

If your operating expenses include subsistence salaries for the founders and profit > $0, you’re ramen-profitable.
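As a hypothetical example: selling 2,000 units a month at a $3 contribution margin against $5,000/month of operating expenses (including subsistence founder salaries) gives

(2,000 x $3) - $5,000 = $1,000

of monthly profit, so that business is ramen-profitable. At 1,500 units it would still be $500 short each month.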

Improving Cash Flow

Having access to sources of cash, whether from selling to customers or other methods, is excellent. But needing less cash gives you more choices and allows you to either dilute less, owe less, or invest more.

There are two ways to improve cash flow:

1) Collect More Cash

The best way to collect more cash is to provide more value to your customers and as a result have them pay you more. Additional features/products/services can allow this. However, you can also collect more cash by changing how you charge for your product. If you have a subscription, changing from charging monthly to yearly dramatically improves your cash flow. If you have a product that customers use up, selling a year’s supply instead of selling them one-by-one can help.
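To put hypothetical numbers on the billing-cycle point: a customer on a $10/month plan delivers $10 of cash in the first month, while the same customer prepaying $110 for an annual plan delivers eleven times that up front. The total collected over the year is similar, but the cash arrives when the business needs it most.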

2) Spend Less Cash

Reducing COGS is a fantastic way to spend less cash in a scalable way. If you can do this without harming the product or customer experience, you win. There are a myriad of ways to also reduce operating expenses, including taking sub-market salaries, using your home instead of renting office space, staying focused on your core product, etc.

Ultimately, collecting more and spending less cash dramatically simplifies the process of getting to ramen-profitable and later to business-profitable.

Be Careful (Why GAAP Matters)

A word of caution: while running out of cash will put you out of business immediately, overextending yourself will likely put you out of business not much later. GAAP shows how a business is really doing; cash doesn’t. If you only focus on cash, it is possible to commit yourself to both delivering products and repaying loans in the future in an unsustainable fashion. If you’re taking out loans, watch the total balance and monthly payments you’re committing to. If you’re asking customers for pre-payment, make sure you believe you can deliver on what they’ve paid for.

Summary

There are numerous challenges to building a business, and ensuring you have enough cash is amongst the most important. Having the cash to keep going lets you keep working on all of the other challenges. The frameworks above were critical for maintaining Backblaze’s cash flow and cash balance. Hopefully you can take some of the lessons we learned and apply them to your business. Let us know what works for you in the comments below.

The post Early Challenges: Managing Cash Flow appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

timeShift(GrafanaBuzz, 1w) Issue 28

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2018/01/05/timeshiftgrafanabuzz-1w-issue-28/

Happy new year! Grafana Labs is getting back in the swing of things after taking some time off to celebrate 2017, and spending time with family and friends. We’re diligently working on the new Grafana v5.0 release (planning v5.0 beta release by end of January), which includes a ton of new features, a new layout engine, and a polished UI. We’d love to hear your feedback!


Latest Stable Release

Grafana 4.6.3 is now available. Latest bugfixes include:

  • Gzip: Fixes bug with Gravatar images when gzip was enabled #5952
  • Alert list: Now shows alert state changes even after adding manual annotations on dashboard #99513
  • Alerting: Fixes bug where rules evaluated as firing when all conditions were false and the OR operator was used. #93183
  • Cloudwatch: CloudWatch no longer displays metrics’ default alias #101514, thx @mtanda

Download Grafana 4.6.3 Now


From the Blogosphere

Why Observability Matters – Now and in the Future: Our own Carl Bergquist teamed up with Neil Gehani, Director of Product at Weaveworks to discuss best practices on how to get started with monitoring your application and infrastructure. This video focuses on modern containerized applications instrumented to use Prometheus to generate metrics and Grafana to visualize them.

How to Install and Secure Grafana on Ubuntu 16.04: In this tutorial, you’ll learn how to install and secure Grafana with an SSL certificate and an Nginx reverse proxy, then you’ll modify Grafana’s default settings for even tighter security.

Monitoring Informix with Grafana: Ben walks us through how to use Grafana to visualize data from IBM Informix and offers a practical demonstration using Docker containers. He also talks about his philosophy of sharing dashboards across teams, important metrics to collect, and how he would like to improve his monitoring stack.

Monitor your hosts with Glances + InfluxDB + Grafana: Glances is a cross-platform system monitoring tool written in Python. This article takes you step by step through the pieces of the stack, installation, and configuration, and provides a sample dashboard to get you up and running.


GrafanaCon Tickets are Going Fast!

Lock in your seat for GrafanaCon EU while there are still tickets available! Join us March 1-2, 2018 in Amsterdam for 2 days of talks centered around Grafana and the surrounding monitoring ecosystem including Graphite, Prometheus, InfluxData, Elasticsearch, Kubernetes, and more.

We have some exciting talks lined up from Google, CERN, Bloomberg, eBay, Red Hat, Tinder, Fastly, Automattic, Prometheus, InfluxData, Percona and more! You can see the full list of speakers below, but be sure to get your ticket now.

Get Your Ticket Now

GrafanaCon EU will feature talks from:

“Google Bigtable”
Misha Brukman
PROJECT MANAGER,
GOOGLE CLOUD
GOOGLE

“Monitoring at Bloomberg”
Stig Sorensen
HEAD OF TELEMETRY
BLOOMBERG

“Monitoring at Bloomberg”
Sean Hanson
SOFTWARE DEVELOPER
BLOOMBERG

“Monitoring Tinder’s Billions of Swipes with Grafana”
Utkarsh Bhatnagar
SR. SOFTWARE ENGINEER
TINDER

“Grafana at CERN”
Borja Garrido
PROJECT ASSOCIATE
CERN

“Monitoring the Huge Scale at Automattic”
Abhishek Gahlot
SOFTWARE ENGINEER
Automattic

“Real-time Engagement During the 2016 US Presidential Election”
Anna MacLachlan
CONTENT MARKETING MANAGER
Fastly

“Real-time Engagement During the 2016 US Presidential Election”
Gerlando Piro
FRONT END DEVELOPER
Fastly

“Grafana v5 and the Future”
Torkel Odegaard
CREATOR | PROJECT LEAD
GRAFANA

“Prometheus for Monitoring Metrics”
Brian Brazil
FOUNDER
ROBUST PERCEPTION

“What We Learned Integrating Grafana with Prometheus”
Peter Zaitsev
CO-FOUNDER | CEO
PERCONA

“The Biz of Grafana”
Raj Dutt
CO-FOUNDER | CEO
GRAFANA LABS

“What’s New In Graphite”
Dan Cech
DIR, PLATFORM SERVICES
GRAFANA LABS

“The Design of IFQL, the New Influx Functional Query Language”
Paul Dix
CO-FOUNDER | CTO
INFLUXDATA

“Writing Grafana Dashboards with Jsonnet”
Julien Pivotto
OPEN SOURCE CONSULTANT
INUITS

“Monitoring AI Platform at eBay”
Deepak Vasthimal
MTS-2 SOFTWARE ENGINEER
EBAY

“Running a Power Plant with Grafana”
Ryan McKinley
DEVELOPER
NATEL ENERGY

“Performance Metrics and User Experience: A “Tinder” Experience”
Susanne Greiner
DATA SCIENTIST
WÜRTH PHOENIX S.R.L.

“Analyzing Performance of OpenStack with Grafana Dashboards”
Alex Krzos
SENIOR SOFTWARE ENGINEER
RED HAT INC.

“Storage Monitoring at Shell Upstream”
Arie Jan Kraai
STORAGE ENGINEER
SHELL TECHNICAL LANDSCAPE SERVICE

“The RED Method: How To Instrument Your Services”
Tom Wilkie
FOUNDER
KAUSAL

“Grafana Usage in the Quality Assurance Process”
Andrejs Kalnacs
LEAD SOFTWARE DEVELOPER IN TEST
EVOLUTION GAMING

“Using Prometheus and Grafana for Monitoring my Power Usage”
Erwin de Keijzer
LINUX ENGINEER
SNOW BV

“Weather, Power & Market Forecasts with Grafana”
Max von Roden
DATA SCIENTIST
ENERGY WEATHER

“Weather, Power & Market Forecasts with Grafana”
Steffen Knott
HEAD OF IT
ENERGY WEATHER

“Inherited Technical Debt – A Tale of Overcoming Enterprise Inertia”
Jordan J. Hamel
HEAD OF MONITORING PLATFORMS
AMGEN

“Grafanalib: Dashboards as Code”
Jonathan Lange
VP OF ENGINEERING
WEAVEWORKS

“The Journey of Shifting the MQTT Broker HiveMQ to Kubernetes”
Arnold Bechtoldt
SENIOR SYSTEMS ENGINEER
INOVEX

“Graphs Tell Stories”
Blerim Sheqa
SENIOR DEVELOPER
NETWAYS

“Graphite@Scale or How to Store Millions of Metrics per Second”
Vladimir Smirnov
SYSTEM ADMINISTRATOR
Booking.com


Upcoming Events:

In between code pushes we like to speak at, sponsor and attend all kinds of conferences and meetups. We also like to make sure we mention other Grafana-related events happening all over the world. If you’re putting on just such an event, let us know and we’ll list it here.

FOSDEM | Brussels, Belgium – Feb 3-4, 2018: FOSDEM is a free developer conference where thousands of developers of free and open source software gather to share ideas and technology. There is no need to register; all are welcome.

Jfokus | Stockholm, Sweden – Feb 5-7, 2018:
Carl Bergquist – Quickie: Monitoring? Not OPS Problem

Why should we monitor our system? Why can’t we just rely on the operations team anymore? They used to be able to do that. What’s currently changing? Presentation content: – Why do we monitor our system – How did it use to work? – What’s changing – Why do we need to shift focus – Everyone should be on call. – Resilience is the goal (the best way of having someone care about quality is to make them responsible).

Register Now

Jfokus | Stockholm, Sweden – Feb 5-7, 2018:
Leonard Gram – Presentation: DevOps Deconstructed

What’s a Site Reliability Engineer and how’s that role different from the DevOps engineer my boss wants to hire? I really don’t want to be on call, should I? Is Docker the right place for my code or am I better off just going straight to Serverless? And why should I care about any of it? I’ll try to answer some of these questions while looking at what DevOps really is about and how the commoditisation of servers through “the cloud” ties into it all. This session will be an opinionated piece from a developer who’s been on-call for the past 6 years and would like to convince you to do the same, at least once.

Register Now

Tweet of the Week

We scour Twitter each week to find an interesting/beautiful dashboard and show it off! #monitoringLove

Awesome! Let us know if you have any questions – we’re happy to help out. We also have a bunch of screencasts to help you get going.


Grafana Labs is Hiring!

We are passionate about open source software and thrive on tackling complex challenges to build the future. We ship code from every corner of the globe and love working with the community. If this sounds exciting, you’re in luck – WE’RE HIRING!

Check out our Open Positions


How are we doing?

That’s a wrap! Let us know what you think about timeShift. Submit a comment on this article below, or post something at our community forum. See you next year!

Follow us on Twitter, like us on Facebook, and join the Grafana Labs community.

Some notes on Meltdown/Spectre

Post Syndicated from Robert Graham original http://blog.erratasec.com/2018/01/some-notes-on-meltdownspectre.html

I thought I’d write up some notes.

You don’t have to worry if you patch. If you download the latest update from Microsoft, Apple, or Linux, then the problem is fixed for you and you don’t have to worry. If you aren’t up to date, then there’s a lot of other nasties out there you should probably also be worrying about. I mention this because while this bug is big in the news, it’s probably not news the average consumer needs to concern themselves with.

This will force a redesign of CPUs and operating systems. While not a big news item for consumers, it’s huge in the geek world. We’ll need to redesign operating systems and how CPUs are made.

Don’t worry about the performance hit. Some, especially avid gamers, are concerned about the claims of “30%” performance reduction when applying the patch. That’s only in some rare cases, so you shouldn’t worry too much about it. As far as I can tell, 3D games are unlikely to see more than a 1% performance degradation. If you imagine your game is suddenly slower after the patch, then something else broke it.

This wasn’t foreseeable. A common cliche is that such bugs happen because people don’t take security seriously, or that they are taking “shortcuts”. That’s not the case here. Speculative execution and timing issues with caches are inherent issues with CPU hardware. “Fixing” this would make CPUs run ten times slower. Thus, while we can tweak hardware going forward, the larger change will be in software.

There’s no good way to disclose this. The cybersecurity industry has a process for coordinating the release of such bugs, which appears to have broken down. In truth, it didn’t. Once Linus announced a security patch that would degrade performance of the Linux kernel, we knew the coming bug was going to be Big. Looking at the Linux patch, tracking backwards to the bug was only a matter of time. Hence, the release of this information was a bit sooner than some wanted. This is to be expected, and is nothing to be upset about.

It helps to have a name. Many are offended by the crassness of naming vulnerabilities and giving them logos. On the other hand, we are going to be talking about these bugs for the next decade. Having a recognizable name, rather than a hard-to-remember number, is useful.

Should I stop buying Intel? Intel has the worst of the bugs here. On the other hand, ARM and AMD alternatives have their own problems. Many want to deploy ARM servers in their data centers, but these are likely to expose bugs you don’t see on x86 servers. The software fix, “page table isolation”, seems to work, so there might not be anything to worry about. On the other hand, holding up purchases because of “fear” of this bug is a good way to squeeze price reductions out of your vendor. Conversely, later generation CPUs, “Haswell” and even “Skylake” seem to have the least performance degradation, so it might be time to upgrade older servers to newer processors.

Intel misleads. Intel has a press release that implies they are not impacted any worse than others. This is wrong: the “Meltdown” issue appears to apply only to Intel CPUs. I don’t like such marketing crap, so I mention it.



IPTV Provider Stops Selling New Subscriptions Under Pressure From “UK Authorities”

Post Syndicated from Andy original https://torrentfreak.com/iptv-provider-stops-selling-new-subscriptions-under-pressure-from-uk-authorities-171224/

Over the past couple of decades, piracy of live TV has broadly taken two forms. That which relies on breaking broadcaster encryption (such as card sharing and hacked set-top boxes), and the more recent developments of P2P and IPTV-style transmission.

With the former under pressure and P2P systems such as Sopcast and AceTorrent moving along in the background, streaming from servers is now the next big thing, whether that’s for free via third-party Kodi plugins or for a small fee from premium IPTV providers.

Of course, copyright holders don’t like any of this usage but with their for-profit strategy, commercial IPTV providers have a big target on their backs. More evidence of this was revealed recently when UK-based IPTV service ACE TV announced they were taking action to avoid problems in the country.

In a message to prospective and existing customers, ACE TV said that potential legal issues were behind its decision to accept no new customers while locking down its service.

“It saddens me to announce this, but due to pressure from the authorities in the UK, we are no longer selling new subscriptions. This obviously includes trials,” the announcement reads.

Noting that it would take new orders for just 24 hours more, ACE TV insisted that it wasn’t shutting down but would lock down the service while closing Facebook.

TF sources and unconfirmed rumors online suggest that the Federation Against Copyright Theft and partners the Premier League are involved. However, ACE TV didn’t respond to TorrentFreak’s request for comment so we’re unable to confirm or deny the allegations.

That being said, even if the threats came directly from the police, it’s likely that the approach would’ve been initially prompted by companies connected to FACT, since the anti-piracy outfit often puts forward names of services for investigation on behalf of its partners.

Perhaps surprisingly, ACE TV is legally incorporated in the UK as Ace Hosting Limited, a fact it makes clear on its website. While easy to find, the company’s registered address is shared by dozens of other companies, indicating a mail forwarding operation rather than a place where servers or staff can be found.

This proxy location may well be the reason the company feels emboldened to carry on some level of service rather than shutting down completely, but its legal basis for doing so is interesting at best, precarious at worst.

“This website, any content contained herein and any contract brought into being as a result of usage of this website are governed by and construed in accordance with English Law,” ACE TV’s website reads.

“The parties to any such contract agree to submit to the exclusive jurisdiction of the courts of England and Wales. All contracts are concluded in English.”

It seems likely that ACE TV has been threatened under UK law, since that’s where it’s incorporated. That would seem to explain why it’s concerned about UK authorities and their potential effect on the business. On the other hand, the service claims to operate entirely legally, but under the laws of the United States. It even has a repeat infringer policy.

“Ace Hosting operates as an intermediary to cache and deliver content hosted by others at the instruction of our subscribers. We cannot remove content hosted by others,” the company says.

“As an intermediary, we are entitled to rely upon (among other things) the DMCA safe harbor available to system caching service providers and we maintain policies and procedures to terminate subscribers that would be considered repeat infringers under the DMCA.”

Whether the notices on the site were drafted on legal advice or are simply there to present an air of authenticity is unclear, but it’s precarious for a service of this nature to rely solely on conduit status in order to avoid liability.

Marketing, prior conduct, and overall intent play a major role in such cases and when all of that is aired in the cold light of day, the situation can look very different to a judge, particularly in the UK, where no similar cases have been successfully defended to date.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Serverless @ re:Invent 2017

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/serverless-reinvent-2017/

At re:Invent 2014, we announced AWS Lambda, what is now the center of the serverless platform at AWS, and helped ignite the trend of companies building serverless applications.

This year, at re:Invent 2017, the topic of serverless was everywhere. We were incredibly excited to see the energy from everyone attending 7 workshops, 15 chalk talks, 20 skills sessions and 27 breakout sessions. Many of these sessions were repeated due to high demand, so we are happy to summarize and provide links to the recordings and slides of these sessions.

Over the course of the week leading up to and then the week of re:Invent, we also had over 15 new features and capabilities across a number of serverless services, including AWS Lambda, Amazon API Gateway, AWS Lambda@Edge, AWS SAM, and the newly announced AWS Serverless Application Repository!

AWS Lambda

Amazon API Gateway

  • Amazon API Gateway Supports Endpoint Integrations with Private VPCs – You can now provide access to HTTP(S) resources within your VPC without exposing them directly to the public internet. This includes resources available over a VPN or Direct Connect connection!
  • Amazon API Gateway Supports Canary Release Deployments – You can now use canary release deployments to gradually roll out new APIs. This helps you more safely roll out API changes and limit the blast radius of new deployments.
  • Amazon API Gateway Supports Access Logging – The access logging feature lets you generate access logs in different formats such as CLF (Common Log Format), JSON, XML, and CSV. The access logs can be fed into your existing analytics or log processing tools so you can perform more in-depth analysis or take action in response to the log data.
  • Amazon API Gateway Customize Integration Timeouts – You can now set a custom timeout for your API calls as low as 50ms and as high as 29 seconds (the default is 30 seconds).
  • Amazon API Gateway Supports Generating SDK in Ruby – This is in addition to support for SDKs in Java, JavaScript, Android and iOS (Swift and Objective-C). The SDKs that Amazon API Gateway generates save you development time and come with a number of prebuilt capabilities, such as working with API keys, exponential backoff, and exception handling.

AWS Serverless Application Repository

Serverless Application Repository is a new service (currently in preview) that aids in the publication, discovery, and deployment of serverless applications. With it you’ll be able to find shared serverless applications that you can launch in your account, while also sharing ones that you’ve created for others to do the same.

AWS Lambda@Edge

Lambda@Edge now supports content-based dynamic origin selection, network calls from viewer events, and advanced response generation. This combination of capabilities greatly increases the use cases for Lambda@Edge, such as allowing you to send requests to different origins based on request information, showing selective content based on authentication, and dynamically watermarking images for each viewer.

AWS SAM

Twitch Launchpad live announcements

Other service announcements

Here are some of the other highlights that you might have missed. We think these could help you make great applications:

AWS re:Invent 2017 sessions

Coming up with the right mix of talks for an event like this can be quite a challenge. The Product, Marketing, and Developer Advocacy teams for Serverless at AWS spent weeks reading through dozens of talk ideas to boil it down to the final list.

From feedback at other AWS events and webinars, we knew that customers were looking for talks that focused on concrete examples of solving problems with serverless, how to perform common tasks such as deployment, CI/CD, monitoring, and troubleshooting, and customer and partner examples of solving real world problems. To that end, we tried to settle on a good mix based on attendee experience and provide a track full of rich content.

Below are the recordings and slides of breakout sessions from re:Invent 2017. We’ve organized them for those getting started, those who are already beginning to build serverless applications, and the experts out there already running them at scale. Some of the videos and slides haven’t been posted yet, and so we will update this list as they become available.

Find the entire Serverless Track playlist on YouTube.

Talks for people new to Serverless

Advanced topics

Expert mode

Talks for specific use cases

Talks from AWS customers & partners

Looking to get hands-on with Serverless?

At re:Invent, we delivered instructor-led skills sessions to help attendees new to serverless applications get started quickly. The content from these sessions is already online and you can do the hands-on labs yourself!
Build a Serverless web application

Still looking for more?

We also recently completely overhauled the main Serverless landing page for AWS. This includes a new Resources page containing case studies, webinars, whitepapers, customer stories, reference architectures, and even more Getting Started tutorials. Check it out!

A New AWS Government, Education, and Nonprofits Blog Post: “AWS Achieves Full Empanelment for the Delivery of Cloud Services by India’s Ministry of Electronics and Information Technology”

Post Syndicated from Craig Liebendorfer original https://aws.amazon.com/blogs/security/a-new-aws-government-education-nonprofits-blog-post-aws-achieves-full-empanelment-for-the-delivery-of-cloud-services-by-indias-ministry-of-electronics-and-information-technology/

AWS Security by Design image

AWS recently announced that Amazon Internet Services Private Limited (AISPL), an Indian subsidiary of the Amazon Group that undertakes the resale and marketing of AWS Cloud services in India, has achieved full Cloud Service Provider (CSP) empanelment and successfully completed the Standardization Testing and Quality Certification (STQC) audit from the Indian Ministry of Electronics and Information Technology (MeitY) for cloud services delivered from the AWS Asia Pacific (Mumbai) Region.

With this certification, AISPL joins a list of approved providers that meet predefined government standards of quality, availability, and security. With 52 global security certifications, attestations, assurance programs, and quality audits already achieved, such as ISO 9001, SOC1, SOC2, and SOC3, AWS is the first global cloud service provider to earn this status in India.

To read the full blog post, see AWS Achieves Full Empanelment for the Delivery of Cloud Services by India’s Ministry of Electronics and Information Technology.

– Craig

BitTorrent Inc. Emerges Victorious Following EU Trademark Dispute

Post Syndicated from Andy original https://torrentfreak.com/bittorrent-inc-emerges-victorious-following-eu-trademark-dispute-171213/

For anyone familiar with the BitTorrent brand, there can only be one company that springs to mind. BitTorrent Inc., the outfit behind uTorrent that still employs BitTorrent creator Bram Cohen, seems the logical choice, but not everything is straightforward.

Back in June 2003, a company called BitTorrent Marketing GmbH filed an application to register an EU trademark for the term ‘BitTorrent’ with the European Union Intellectual Property Office (EUIPO). The company hoped to exploit the trademark for a wide range of uses from marketing, advertising, retail, mail order and Internet sales, to film, television and video licensing plus “providing of memory space on the internet”.

The trademark application was published in July 2004 and registered in June 2006. However, in June 2011 BitTorrent Inc. filed an application for its revocation on the grounds that the trademark had not been “put to genuine use in the European Union in connection with the services concerned within a continuous period of five years.”

A year later, the EUIPO notified BitTorrent Marketing GmbH that it had three months to submit evidence of the trademark’s use. After an application from the company, more time was given to present evidence and a deadline was set for November 21, 2011. Things did not go to plan, however.

On the very last day, BitTorrent Marketing GmbH responded to the request by fax, noting that a five-page letter had been sent along with 69 pages of additional evidence. But something went wrong, with the fax machine continually reporting errors. Several days later, the evidence arrived by mail, but that was technically too late.

In September 2013, BitTorrent Inc.’s application for the trademark to be revoked was upheld but in November 2013, BitTorrent Marketing GmbH (by now known as Hochmann Marketing GmbH) appealed against the decision to revoke.

Almost two years later in August 2015, an EUIPO appeal held that Hochmann “had submitted no relevant proof” before the specified deadline that the trademark had been in previous use. On this basis, the evidence could not be taken into account.

“[The appeal] therefore concluded that genuine use of the mark at issue had not been proven, and held that the mark must be revoked with effect from 24 June 2011,” EUIPO documentation reads.

However, Hochmann Marketing GmbH wasn’t about to give up, demanding that the decision be annulled and that EUIPO and BitTorrent Inc. should pay the costs. In response, EUIPO and BitTorrent Inc. demanded the opposite, that Hochmann’s action should be dismissed and they should pay the costs instead.

In its decision published yesterday, the EU General Court (Third Chamber) clearly sided with EUIPO and BitTorrent Inc.

“The [evidence] document clearly contains only statements that are not substantiated by any supporting evidence capable of adducing proof of the place, time, extent and nature of use of the mark at issue, especially because the evidence in question was submitted, in the present case, three days after the prescribed period expired,” the decision reads.

The decision also notes that the company was given an additional month to come up with evidence and then some – the evidence was actually due on a Saturday so the period was extended until Monday for the convenience of the company.

“Next, EUIPO had duly informed the applicant, by letter of 19 July 2011, that it was ‘required to submit the required evidence of use in reply to the request within three months of receipt of this communication’ and that ‘if no evidence of use [was] submitted within this period, the [EU] mark w[ould] be revoked’,” the decision reads, adding;

“That letter also included guidance on how to provide evidence in a timely manner. Consequently, the applicant knew not only what documents it must submit, but also what the consequences of late submission of evidence were.”

All things considered, the Court rejected Hochmann Marketing GmbH’s application, ultimately deciding that not enough evidence was produced and what did appear was too late. For that, the trademark remains revoked and Hochmann Marketing must cover EUIPO and BitTorrent Inc.’s legal costs.

This isn’t the first time that BitTorrent Inc. has taken on BitTorrent/Hochmann Marketing GmbH and won. In 2014, it took the company to court in the United States and walked away with a $2.2m damages award.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Screener Piracy Season Kicks Off With Louis C.K.’s ‘I Love You, Daddy’

Post Syndicated from Ernesto original https://torrentfreak.com/screener-piracy-season-kicks-off-with-louis-c-k-s-i-love-you-daddy-171211/

Towards the end of the year, movie screeners are sent out to industry insiders who cast their votes for the Oscars and other awards.

It’s a highly anticipated time for pirates who hope to get copies of the latest blockbusters early, which is traditionally what happens.

Last year the action started relatively late. It took until January before the first leak surfaced – Denzel Washington’s Fences – but more than a dozen made their way online soon after.

Today the first leak of the new screener season started to populate various pirate sites, Louis C.K.’s “I Love You, Daddy.” It was released by the infamous “Hive-CM8” group which also made headlines in previous years.

“I Love You, Daddy” was carefully chosen, according to a message posted in the release notes. Last month distributor The Orchard chose to cancel the film from its schedule after Louis C.K. was accused of sexual misconduct. With uncertainty surrounding the film’s release, “Hive-CM8” decided to get it out.

“We decided to let this one title go out this month, since it never made it to the cinema, and nobody knows if it ever will go to retail at all,” Hive-CM8 write in their NFO.

“Either way their is no perfect time to release it anyway, but we think it would be a waste to let a great Louis C.K. go unwatched and nobody can even see or buy it,” they add.

I Love You, Daddy

It is no surprise that the group put some thought into their decision. In 2015 they published several movies before their theatrical release, for which they later offered an apology, stating that this wasn’t acceptable.

Last year this stance was reiterated, noting that they would not leak any screeners before Christmas. Today’s release shows that this isn’t a golden rule, but it’s unlikely that they will push any big titles before they’re out in theaters.

“I Love You, Daddy” isn’t going to be seen in theaters anytime soon, but it might see an official release. This past weekend, news broke that Louis C.K. had bought back the rights from The Orchard and must pay back marketing costs, including a payment for the 12,000 screeners that were sent out.

Hive-CM8, meanwhile, suggest that they have more screeners in hand, although their collection isn’t yet complete.

“We are still missing some titles, anyone want to share for the collection? Yes we want to have them all if possible, we are collectors, we don’t want to release them all,” they write.

Finally, the group also has some disappointing news for Star Wars fans who are looking for an early copy of “The Last Jedi.” Hive-CM8 is not going to release it.

“Their will be no starwars from us, sorry wont happen,” they write.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Implementing Dynamic ETL Pipelines Using AWS Step Functions

Post Syndicated from Tara Van Unen original https://aws.amazon.com/blogs/compute/implementing-dynamic-etl-pipelines-using-aws-step-functions/

This post contributed by:
Wangechi Dole, AWS Solutions Architect
Milan Krasnansky, ING, Digital Solutions Developer, SGK
Rian Mookencherry, Director – Product Innovation, SGK

Data processing and transformation is a common use case you see in our customer case studies and success stories. Often, customers deal with complex data from a variety of sources that needs to be transformed and customized through a series of steps to make it useful to different systems and stakeholders. This can be difficult due to the ever-increasing volume, velocity, and variety of data. Today, data management challenges cannot be solved with traditional databases.

Workflow automation helps you build solutions that are repeatable, scalable, and reliable. You can use AWS Step Functions for this. A great example is how SGK used Step Functions to automate the ETL processes for their client. With Step Functions, SGK has been able to automate changes within the data management system, substantially reducing the time required for data processing.

In this post, SGK shares the details of how they used Step Functions to build a robust data processing system based on highly configurable business transformation rules for ETL processes.

SGK: Building dynamic ETL pipelines

SGK is a subsidiary of Matthews International Corporation, a diversified organization focusing on brand solutions and industrial technologies. SGK’s Global Content Creation Studio network creates compelling content and solutions that connect brands and products to consumers through multiple assets including photography, video, and copywriting.

We were recently contracted to build a sophisticated and scalable data management system for one of our clients. We chose to build the solution on AWS to leverage advanced, managed services that help to improve the speed and agility of development.

The data management system served two main functions:

  1. Ingesting a large amount of complex data to facilitate both reporting and product funding decisions for the client’s global marketing and supply chain organizations.
  2. Processing the data through normalization and applying complex algorithms and data transformations. The system goal was to provide information in the relevant context—such as strategic marketing, supply chain, product planning, etc. —to the end consumer through automated data feeds or updates to existing ETL systems.

We were faced with several challenges:

  • Output data that needed to be refreshed at least twice a day to provide fresh datasets to both local and global markets. That constant data refresh posed several challenges, especially around data management and replication across multiple databases.
  • The complexity of reporting business rules that needed to be updated on a constant basis.
  • Data that could not be processed as contiguous blocks of typical time-series data. The measurement of the data was done across seasons (that is, combinations of dates), which often resulted in up to three overlapping seasons at any given time.
  • Input data that came from 10+ different data sources. Each data source ranged from 1–20K rows with as many as 85 columns per input source.

These challenges meant that our small Dev team heavily invested time in frequent configuration changes to the system and data integrity verification to make sure that everything was operating properly. Maintaining this system proved to be a daunting task and that’s when we turned to Step Functions—along with other AWS services—to automate our ETL processes.

Solution overview

Our solution included the following AWS services:

  • AWS Step Functions: Before Step Functions was available, we were using multiple Lambda functions for this use case and running into memory limit issues. With Step Functions, we can execute steps in parallel, in a cost-efficient manner, without running into memory limitations.
  • AWS Lambda: The Step Functions state machine uses Lambda functions to implement the Task states. Our Lambda functions are implemented in Java 8.
  • Amazon DynamoDB provides us with an easy and flexible way to manage business rules. We specify our rules as Keys. These are key-value pairs stored in a DynamoDB table.
  • Amazon RDS: Our ETL pipelines consume source data from our RDS MySQL database.
  • Amazon Redshift: We use Amazon Redshift for reporting purposes because it integrates with our BI tools. Currently we are using Tableau for reporting which integrates well with Amazon Redshift.
  • Amazon S3: We store our raw input files and intermediate results in S3 buckets.
  • Amazon CloudWatch Events: Our users expect results at a specific time. We use CloudWatch Events to trigger Step Functions on an automated schedule.

Solution architecture

This solution uses a declarative approach to defining business transformation rules that are applied by the underlying Step Functions state machine as data moves from RDS to Amazon Redshift. An S3 bucket is used to store intermediate results. A CloudWatch Event rule triggers the Step Functions state machine on a schedule. The following diagram illustrates our architecture:

Here are more details for the above diagram:

  1. A rule in CloudWatch Events triggers the state machine execution on an automated schedule.
  2. The state machine invokes the first Lambda function.
  3. The Lambda function deletes all existing records in Amazon Redshift. Depending on the dataset, the Lambda function can create a new table in Amazon Redshift to hold the data.
  4. The same Lambda function then retrieves Keys from a DynamoDB table. Keys represent specific marketing campaigns or seasons and map to specific records in RDS.
  5. The state machine executes the second Lambda function using the Keys from DynamoDB.
  6. The second Lambda function retrieves the referenced dataset from RDS. The records retrieved represent the entire dataset needed for a specific marketing campaign.
  7. The second Lambda function executes in parallel for each Key retrieved from DynamoDB and stores the output in CSV format temporarily in S3.
  8. Finally, the Lambda function uploads the data into Amazon Redshift.

To understand the above data processing workflow, take a closer look at the Step Functions state machine for this example.

We walk you through the state machine in more detail in the following sections.

Walkthrough

To get started, you need to:

  • Create a schedule in CloudWatch Events
  • Specify conditions for RDS data extracts
  • Create Amazon Redshift input files
  • Load data into Amazon Redshift

Step 1: Create a schedule in CloudWatch Events
Create rules in CloudWatch Events to trigger the Step Functions state machine on an automated schedule. The following is an example cron expression to automate your schedule:
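A CloudWatch Events schedule uses a six-field cron syntax (minutes, hours, day-of-month, month, day-of-week, year). As a minimal sketch, a rule that fires at 03:00 and 14:00 UTC every day would use:

cron(0 3,14 * * ? *)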

In this example, the cron expression invokes the Step Functions state machine at 3:00am and 2:00pm (UTC) every day.

Step 2: Specify conditions for RDS data extracts
We use DynamoDB to store Keys that determine which rows of data to extract from our RDS MySQL database. An example Key is MCS2017, which stands for Marketing Campaign Spring 2017. Each campaign has a specific start and end date and the corresponding dataset is stored in RDS MySQL. A record in RDS contains about 600 columns, and each Key can represent up to 20K records.

A given day can have multiple campaigns with different start and end dates running simultaneously. In the following example DynamoDB item, three campaigns are specified for the given date.
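A minimal sketch of such an item, with illustrative attribute and campaign names, might look like this:

{
  "date": "2017-11-21",
  "campaigns": [ "MCS2017", "MCF2017", "MCH2017" ]
}

The first Lambda function turns this list into the key1, key2, key3, and keyT values that drive the state machine.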

The state machine example shown above uses Keys 31, 32, and 33 in the first ChoiceState and Keys 21 and 22 in the second ChoiceState. These keys represent marketing campaigns for a given day. For example, on Monday, there are only two campaigns requested. The ChoiceState with Keys 21 and 22 is executed. If three campaigns are requested on Tuesday, for example, then ChoiceState with Keys 31, 32, and 33 is executed. MCS2017 can be represented by Key 21 and Key 33 on Monday and Tuesday, respectively. This approach gives us the flexibility to add or remove campaigns dynamically.

Step 3: Create Amazon Redshift input files
When the state machine begins execution, the first Lambda function is invoked as the resource for FirstState, represented in the Step Functions state machine as follows:

"Comment": ” AWS Amazon States Language.", 
  "StartAt": "FirstState",
 
"States": { 
  "FirstState": {
   
"Type": "Task",
   
"Resource": "arn:aws:lambda:xx-xxxx-x:XXXXXXXXXXXX:function:Start",
    "Next": "ChoiceState" 
  } 

As described in the solution architecture, the purpose of this Lambda function is to delete existing data in Amazon Redshift and retrieve keys from DynamoDB. In our use case, we found that deleting existing records was more efficient and less time-consuming than finding the delta and updating existing records. On average, an Amazon Redshift table can contain about 36 million cells, which translates to roughly 65K records. The following is the code snippet for the first Lambda function in Java 8:

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class LambdaFunctionHandler implements RequestHandler<Map<String, Object>, Map<String, String>> {
    Map<String, String> keys = new HashMap<>();

    public Map<String, String> handleRequest(Map<String, Object> input, Context context) {
        Properties config = getConfig();
        // 1. Clean the Redshift table so the refreshed dataset replaces the old one
        new RedshiftDataService(config).cleaningTable();
        // 2. Read the current campaign keys from DynamoDB
        List<String> keyList = new DynamoDBDataService(config).getCurrentKeys();
        for (int i = 0; i < keyList.size(); i++) {
            keys.put("key" + (i + 1), keyList.get(i));
        }
        // 3. "keyT" carries the key count; ChoiceState uses it to pick a branch
        keys.put("keyT", String.valueOf(keyList.size()));
        // 4. Return the key values and the key count
        return keys;
    }
}
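The DynamoDBDataService helper isn’t shown in this post. As a rough sketch only — the table name, partition key, and “campaigns” attribute are assumptions carried over from the sample item above — getCurrentKeys() could wrap the DynamoDB Document API like this:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.Table;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class DynamoDBDataService {
    private final Properties config;

    public DynamoDBDataService(Properties config) {
        this.config = config;
    }

    // Returns the campaign keys stored for today's date, e.g. ["MCS2017", "MCF2017"]
    public List<String> getCurrentKeys() {
        DynamoDB dynamoDB = new DynamoDB(AmazonDynamoDBClientBuilder.defaultClient());
        Table table = dynamoDB.getTable(config.getProperty("campaignTable")); // table name is an assumption
        Item item = table.getItem("date", LocalDate.now().toString());        // "date" partition key is an assumption
        if (item == null) {
            return new ArrayList<>();
        }
        return item.getList("campaigns"); // "campaigns" list attribute as in the sample item above
    }
}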

The following JSON represents ChoiceState.

"ChoiceState": {
   "Type" : "Choice",
   "Choices": [ 
   {

      "Variable": "$.keyT",
     "StringEquals": "3",
     "Next": "CurrentThreeKeys" 
   }, 
   {

     "Variable": "$.keyT",
    "StringEquals": "2",
    "Next": "CurrentTwooKeys" 
   } 
 ], 
 "Default": "DefaultState"
}

The variable $.keyT represents the number of keys retrieved from DynamoDB. This variable determines which of the parallel branches should be executed. At the time of publication, Step Functions does not support dynamic parallel state. Therefore, choices under ChoiceState are manually created and assigned hardcoded StringEquals values. These values represent the number of parallel executions for the second Lambda function.

For example, if $.keyT equals 3, the second Lambda function is executed three times in parallel with keys $key1, $key2 and $key3 retrieved from DynamoDB. Similarly, if $.keyT equals 2, the second Lambda function is executed twice in parallel. The following JSON represents this parallel execution:

"CurrentThreeKeys": { 
  "Type": "Parallel",
  "Next": "NextState",
  "Branches": [ 
  {

     "StartAt": “key31",
    "States": { 
       “key31": {

          "Type": "Task",
        "InputPath": "$.key1",
        "Resource": "arn:aws:lambda:xx-xxxx-x:XXXXXXXXXXXX:function:Execution",
        "End": true 
       } 
    } 
  }, 
  {

     "StartAt": “key32",
    "States": { 
     “key32": {

        "Type": "Task",
       "InputPath": "$.key2",
         "Resource": "arn:aws:lambda:xx-xxxx-x:XXXXXXXXXXXX:function:Execution",
       "End": true 
      } 
     } 
   }, 
   {

      "StartAt": “key33",
       "States": { 
          “key33": {

                "Type": "Task",
             "InputPath": "$.key3",
             "Resource": "arn:aws:lambda:xx-xxxx-x:XXXXXXXXXXXX:function:Execution",
           "End": true 
       } 
     } 
    } 
  ] 
} 

Step 4: Load data into Amazon Redshift
The second Lambda function in the state machine extracts the records from RDS associated with the keys retrieved from DynamoDB. It processes the data and then loads it into an Amazon Redshift table. The following is a code snippet for the second Lambda function in Java 8.

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import java.util.Properties;

public class LambdaFunctionHandler implements RequestHandler<String, String> {
    public static String key = null;

    public String handleRequest(String input, Context context) {
        key = input;
        // 1. Get basic configuration for the helper classes and build an S3 client
        Properties config = getConfig();
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        // 2. Export query results from RDS into the S3 bucket
        new RdsDataService(config).exportDataToS3(s3, key);
        // 3. Import the query results from the S3 bucket into Redshift
        new RedshiftDataService(config).importDataFromS3(s3, key);
        System.out.println(input);
        return "SUCCESS";
    }
}
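The RedshiftDataService helper isn’t shown either. One common way to implement importDataFromS3 is to issue a Redshift COPY command over JDBC; the following is a sketch under that assumption, with the connection details, bucket, table name, and IAM role read from placeholder properties:

import com.amazonaws.services.s3.AmazonS3;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

public class RedshiftDataService {
    private final Properties config;

    public RedshiftDataService(Properties config) {
        this.config = config;
    }

    // Loads the CSV the export step wrote to S3 into the target Redshift table.
    // The AmazonS3 client isn't needed for COPY itself; the parameter just matches the handler's call.
    public void importDataFromS3(AmazonS3 s3, String key) {
        String sql = String.format(
            "COPY %s FROM 's3://%s/%s.csv' IAM_ROLE '%s' CSV",
            config.getProperty("redshiftTable"),    // target table (placeholder)
            config.getProperty("s3Bucket"),         // intermediate bucket (placeholder)
            key,                                    // campaign key, e.g. MCS2017
            config.getProperty("redshiftCopyRole")  // IAM role ARN with read access to the bucket
        );
        // Assumes the Redshift JDBC driver is on the classpath
        try (Connection conn = DriverManager.getConnection(
                config.getProperty("redshiftJdbcUrl"),
                config.getProperty("redshiftUser"),
                config.getProperty("redshiftPassword"));
             Statement stmt = conn.createStatement()) {
            stmt.execute(sql);
        } catch (SQLException e) {
            throw new RuntimeException("Redshift COPY failed for key " + key, e);
        }
    }
}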

After the data is loaded into Amazon Redshift, end users can visualize it using their preferred business intelligence tools.

Lessons learned

  • At the time of publication, the 1.5 GB memory hard limit for Lambda functions was inadequate for processing our complex workload. Step Functions gave us the flexibility to chunk our large datasets and process them in parallel, saving on costs and time.
  • In our previous implementation, we assigned each key a dedicated Lambda function along with CloudWatch rules for schedule automation. This approach proved to be inefficient and quickly became an operational burden. Previously, we processed each key sequentially, with each key adding about five minutes to the overall processing time. For example, processing three keys meant that the total processing time was three times longer. With Step Functions, the entire state machine executes in about five minutes.
  • Using DynamoDB with Step Functions gave us the flexibility to manage keys efficiently. In our previous implementations, keys were hardcoded in Lambda functions, which became difficult to manage due to frequent updates. DynamoDB is a great way to store dynamic data that changes frequently, and it works perfectly with our serverless architectures.

Conclusion

With Step Functions, we were able to fully automate the frequent configuration updates to our dataset, resulting in significant cost savings, reduced risk of data errors due to system downtime, and more time for us to focus on new product development rather than support related issues. We hope that you have found the information useful and that it can serve as a jump-start to building your own ETL processes on AWS with managed AWS services.

For more information about how Step Functions makes it easy to coordinate the components of distributed applications and microservices in any workflow, see the use case examples and then build your first state machine in under five minutes in the Step Functions console.

If you have questions or suggestions, please comment below.