Tag Archives: eNom

MPAA & RIAA Demand Tough Copyright Standards in NAFTA Negotiations

Post Syndicated from Andy original https://torrentfreak.com/mpaa-riaa-demand-tough-copyright-standards-in-nafta-negotiations-170621/

The North American Free Trade Agreement (NAFTA) between the United States, Canada, and Mexico was negotiated more than 25 years ago. With a quarter of a century of developments to contend with, the United States wants to modernize.

“While our economy and U.S. businesses have changed considerably over that period, NAFTA has not,” the government says.

With this in mind, the US requested comments from interested parties seeking direction for negotiation points. With those comments now in, groups like the MPAA and RIAA have been making their positions known. It’s no surprise that intellectual property enforcement is high on the agenda.

“Copyright is the lifeblood of the U.S. motion picture and television industry. As such, MPAA places high priority on securing strong protection and enforcement disciplines in the intellectual property chapters of trade agreements,” the MPAA writes in its submission.

“Strong IPR protection and enforcement are critical trade priorities for the music industry. With IPR, we can create good jobs, make significant contributions to U.S. economic growth and security, invest in artists and their creativity, and drive technological innovation,” the RIAA notes.

While both groups have numerous demands, it’s clear that each seeks an environment where not only infringers but also Internet platforms and services can be held liable.

For the RIAA, there is a big focus on the so-called ‘Value Gap’, a phenomenon found on user-uploaded content sites like YouTube that are able to offer infringing content while avoiding liability due to Section 512 of the DMCA.

“Today, user-uploaded content services, which have developed sophisticated on-demand music platforms, use this as a shield to avoid licensing music on fair terms like other digital services, claiming they are not legally responsible for the music they distribute on their site,” the RIAA writes.

“Services such as Apple Music, TIDAL, Amazon, and Spotify are forced to compete with services that claim they are not liable for the music they distribute.”

But if sites like YouTube are exercising their rights while acting legally under current US law, how can partners Canada and Mexico do any better? For the RIAA, that can be achieved by holding them to standards envisioned by the group when the DMCA was passed, not how things have panned out since.

Demanding that negotiators “protect the original intent” of safe harbor, the RIAA asks that a “high-level and high-standard service provider liability provision” is pursued. This, the music group says, should only be available to “passive intermediaries without requisite knowledge of the infringement on their platforms, and inapplicable to services actively engaged in communicating to the public.”

In other words, make sure that YouTube and similar sites won’t enjoy the same level of safe harbor protection as they do today.

The RIAA also requires any negotiated safe harbor provisions in NAFTA to be flexible in the event that the DMCA is tightened up in response to the ongoing safe harbor rules study.

In any event, NAFTA should not “support interpretations that no longer reflect today’s digital economy and threaten the future of legitimate and sustainable digital trade,” the RIAA states.

For the MPAA, Section 512 is also perceived as a problem. While noting that the original intent was to foster a system of shared responsibility between copyright owners and service providers, the MPAA says courts have subsequently let copyright holders down. Like the RIAA, the MPAA also suggests that Canada and Mexico can be held to higher standards.

“We recommend a new approach to this important trade policy provision by moving to high-level language that establishes intermediary liability and appropriate limitations on liability. This would be fully consistent with U.S. law and avoid the same misinterpretations by policymakers and courts overseas,” the MPAA writes.

“In so doing, a modernized NAFTA would be consistent with Trade Promotion Authority’s negotiating objective of ‘ensuring that standards of protection and enforcement keep pace with technological developments’.”

The MPAA also has some specific problems with Mexico, including unauthorized camcording. The Hollywood group says that 85 illicit audio and video recordings of films were linked to Mexican theaters in 2016. However, recording is not currently a criminal offense in Mexico.

Another issue for the MPAA is that criminal sanctions for commercial scale infringement are only available if the infringement is for profit.

“This has hampered enforcement against the above-discussed camcording problem but also against online infringement, such as peer-to-peer piracy, that may be on a scale that is immensely harmful to U.S. rightsholders but nonetheless occur without profit by the infringer,” the MPAA writes.

“The modernized NAFTA like other U.S. bilateral free trade agreements must provide for criminal sanctions against commercial scale infringements without proof of profit motive.”

Also of interest are the MPAA’s complaints against Mexico’s telecoms laws. Unlike in the US and many countries in Europe, Mexico’s ISPs are forbidden to hand out their customers’ personal details to rights holders looking to sue. This, the MPAA says, needs to change.

The submissions from the RIAA and MPAA can be found here and here (pdf).

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

The Pirate Bay Isn’t Affected By Adverse Court Rulings – Everyone Else Is

Post Syndicated from Andy original https://torrentfreak.com/the-pirate-bay-isnt-affected-by-adverse-court-rulings-everyone-else-is-170618/

For more than a decade The Pirate Bay has been the world’s most controversial site. Delivering huge quantities of copyrighted content to the masses, the platform is revered and reviled across the copyright spectrum.

Its reputation is one of a defiant Internet swashbuckler, but due to changes in how the site has been run in more recent times, its current philosophy is more difficult to gauge. What has never been in doubt, however, is the site’s original intent to be as provocative as possible.

Through endless publicity stunts, some real, some just for the ‘lulz’, The Pirate Bay managed to attract a massive audience, all while incurring the wrath of every major copyright holder in the world.

Make no mistake, they all queued up to strike back, but every subsequent rightsholder action was met by a Pirate Bay middle finger, two fingers, or chin flick, depending on the mood of the day. This only served to further delight the masses, who happily spread the word while keeping their torrents flowing.

This vicious circle of being targeted by the entertainment industries, mocking them, and then reaping the traffic benefits, developed into the cheapest long-term marketing campaign the Internet had ever seen. But nothing is ever truly for free and there have been consequences.

After the site taunted Hollywood and the music industry with its refusals to capitulate, the endless legal action it would ordinarily have been forced to participate in largely took place without The Pirate Bay being present. It doesn’t take a law degree to work out what happened in each and every one of those cases, whatever complex route they took through the legal system. No defense, no win.

For example, the web-blocking phenomenon across the UK, Europe, Asia and Australia was driven by the site’s absolute resilience and although there would clearly have been other scapegoats had The Pirate Bay disappeared, the site was the ideal bogeyman the copyright lobby required to move forward.

Filing blocking lawsuits and bringing hosts, advertisers, and ISPs on board for anti-piracy initiatives were also made easier with the ‘evil’ Pirate Bay still online. Immune from every anti-piracy technique under the sun, the existence of the platform in the face of all onslaughts only strengthened the cases of those arguing for even more drastic measures.

Over a decade, this has meant a significant tightening of the sharing and streaming climate. Without any big legislative changes but plenty of case law against The Pirate Bay, web-blocking is now a walk in the park, ad hoc domain seizures are a fairly regular occurrence, and few companies want to host sharing sites. Advertisers and brands are also hesitant over where they place their ads. It’s a very different world to the one of 10 years ago.

While it would be wrong to attribute every tightening of the noose to the actions of The Pirate Bay, there’s little doubt that the site and its chaotic image played a huge role in where copyright enforcement is today. The platform set out to provoke and succeeded in every way possible, gaining supporters in their millions. It could also be argued it kicked a hole in a hornets’ nest, releasing the hell inside.

But perhaps the site’s most amazing achievement is the way it has managed to stay online, despite all the turmoil.

This week yet another ruling, this time from the powerful European Court of Justice, found that by offering links in the manner it does, The Pirate Bay and other sites are liable for communicating copyright works to the public. Of course, this prompted the usual swathe of articles claiming that this could be the final nail in the site’s coffin.

Wrong.

In common with every ruling, legal defeat, and legislative restriction put in place due to the site’s activities, this week’s decision from the ECJ will have zero effect on the Pirate Bay’s availability. For right or wrong, the site was breaking the law long before this ruling and will continue to do so until it decides otherwise.

What we have instead is a further tightened legal landscape that will have a lasting effect on everything BUT the site, including weaker torrent sites, Internet users, and user-uploaded content sites such as YouTube.

With The Pirate Bay carrying on regardless, that is nothing short of remarkable.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Who’s To Blame For The Kodi Crackdown?

Post Syndicated from Andy original https://torrentfreak.com/whos-to-blame-for-the-kodi-crackdown-170611/

Perfectly legal as standard, the Kodi media player can be easily modified to turn it into the ultimate streaming piracy machine.

Uptake by users has been nothing short of phenomenal. Millions of people are now consuming illicit media through third-party Kodi addons. With free movies, TV shows, sports, live TV and more on tap, it’s not difficult to see why the system is so popular.

As a result, barely a day goes by without Kodi making headlines and this week was no exception. On Monday, TorrentFreak broke the news that the ZEMTV addon and TV Addons, one of the most popular addon communities, were being sued by Dish Network for copyright infringement.

Within hours of the announcement and apparently as a direct result, several addons (including the massively popular Phoenix) decided to throw in the towel. Quite understandably, users of the platforms were disappointed, and that predictably resulted in people attempting to apportion blame.

The first comment to catch the eye was posted directly beneath our article. Interestingly, it placed the blame squarely on our shoulders.

“Thanks Torrentfreak, for ruining Kodi,” it read.

While shooting the messenger is an option, it’s historically problematic. Town criers were the original newsreaders, delivering important messages to the public. Killing a town crier was considered treason, but it was also pointless – it didn’t change the facts on the ground.

So if we can’t kill those who read about a lawsuit in the public PACER system and reported it, who’s left to blame? Unsurprisingly, there’s no shortage of targets, but most of them fall short.

The underlying theme is that most people voicing a negative opinion about the profile of Kodi do not appreciate their previously niche piracy system being in the spotlight. Everything was just great when just a few people knew about the marvelous hidden world of ‘secret’ XBMC/Kodi addons, many insist, but seeing it in the mainstream press is a disaster. It’s difficult to disagree.

However, the point where this all falls down is when people are asked when the discussion about Kodi should’ve stopped. We haven’t questioned them all, of course, but it’s almost guaranteed that while most with a grievance didn’t want Kodi getting too big, they absolutely appreciate the fact that someone told them about it. Piracy and piracy techniques spread by word of mouth, so unfortunately people can’t have it both ways.

Interestingly, some people placed the blame on TV Addons, the site that hosts the addons themselves. They argued that the addon scene didn’t need such a high profile target and that the popularity of the site only brought unwanted attention. However, for every critic, there are apparently thousands who love what the site does to raise the profile of Kodi. Without that, it’s clear that there would be fewer users and indeed, fewer addons.

For TV Addons’ part, they’re extremely clear who’s responsible for bringing the heat. On numerous occasions in emails to TF, the operators of the repository have blamed those who have attempted to commercialize the Kodi scene. For them, the responsibility must be placed squarely on the shoulders of people selling ‘Kodi boxes’ on places like eBay and Amazon. Once big money got involved, that attracted the authorities, they argue.

With this statement in mind, TF spoke with a box seller who previously backed down from selling on eBay due to issues over Kodi’s trademark. He didn’t want to speak on the record but admitted to selling “a couple of thousand” boxes over the past two years, noting that all he did was respond to demand with supply.

And this brings us full circle and a bit closer to apportioning blame for the Kodi crackdown.

The bottom line is that when it comes to piracy, Kodi and its third-party ‘pirate’ addons are so good at what they do, it’s no surprise they’ve been a smash hit with Internet users. All of the content that anyone could want – and more – accessible in one package, on almost any platform? That’s what consumers have been demanding for more than a decade and a half.

That brings us to the unavoidable conclusion that modified Kodi simply got too good at delivering content outside controlled channels, and that success was impossible to moderate or calm. Quite simply, every user that added to the Kodi phenomenon by installing the software with ‘pirate’ addons has to shoulder some of the blame for the crackdown.

That might sound harsh but in the piracy world it’s never been any different. Without millions of users, The Pirate Bay raid would never have happened. Without users, KickassTorrents might still be rocking today. But of course, what would be the point?

Users might break sites and services, but they also make them. That’s the piracy paradox.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

MPAA Chief Praises Site-Blocking But Italians Love Piracy – and the Quality

Post Syndicated from Andy original https://torrentfreak.com/mpaa-chief-praises-site-blocking-but-italians-love-pirate-quality-170606/

After holding a reputation for being soft on piracy for many years, in more recent times Italy has taken a much tougher stance. The country now takes regular action against pirate sites and has a fairly aggressive site-blocking mechanism.

On Monday, the industry gathered in Rome and was presented with new data from local anti-piracy outfit FAPAV. The research revealed that while there has been some improvement over the past six years, 39% of Italians are still consuming illicit movies, TV shows, sporting events and other entertainment, at the rate of 669m acts of piracy every year.

While movie piracy is down 4% from 2010, the content most often consumed by pirates is still films, with 33% of the adult population engaging in illicit consumption during the past year.

The downward trend was not shared by TV shows, however. In the past seven years, piracy has risen to 22% of the population, up 13% on figures from 2010.

In keeping with the MPAA’s recent coding of piracy in 1.0, 2.0, and 3.0 variants (P2P as 1.0, streaming websites as 2.0, streaming devices/Kodi as 3.0), FAPAV said that Piracy 2.0 had become even more established recently, with site operators making considerable technological progress.

“The research tells us we can not lower our guard, we always have to work harder and with greater determination in communication and awareness, especially with regard to digital natives,” said FAPAV Secretary General, Bagnoli Rossi.

The FAPAV chief said that there needs to be emphasis in two areas. One, changing perceptions among the public over the seriousness of piracy via education and two, placing pressure on websites using the police, judiciary, and other law enforcement agencies.

“The pillars of anti-piracy protection are: the judicial authority, self-regulatory agreements, communication and educational activities,” said Rossi, adding that cooperation with Italy’s AGCOM had resulted in 94 sites being blocked over three years.

FAPAV research has traditionally focused on people aged 15 and up but the anti-piracy group believes that placing more emphasis on younger people (aged 10-14) is important since they also consume a lot of pirated content online. MPAA chief Chris Dodd, who was at the event, agreed with the sentiment.

“Today’s youth are the future of the audiovisual industry. Young people must learn to respect the people who work in film and television that in 96% of cases never appear [in front of camera] but still work behind the scenes,” Dodd said.

“It is important to educate and direct them towards legal consumption, which creates jobs and encourages investment. Technology has expanded options to consume content legally and at any time and place, but at the same time has given attackers the opportunity to develop illegal businesses.”

Despite large-scale site-blocking not being a reality in the United States, Dodd was also keen to praise Italy for its efforts while acknowledging the wider blocking regimes in place across the EU.

“We must not only act by blocking pirate sites (we have closed a little less than a thousand in Europe) but also focus on legal offers. Today there are 480 legal online distribution services worldwide. We must have more,” Dodd said.

The outgoing MPAA chief reiterated that movies, music, games and a wide range of entertainment products are all available online legally now. Nevertheless, piracy remains a “growing phenomenon” that has criminals at its core.

“Piracy is composed of criminal organizations, ready to steal sensitive data and to make illegal profits any way they can. It’s a business that harms the entire audiovisual market, which in Europe alone has a million working professionals. To promote the culture of legality means protecting this market and its collective heritage,” Dodd said.

In Italy, convincing pirates to go legal might be more easily said than done. Not only do millions download video every year, but the majority of pirates are happy with the quality too. 89% said they were pleased with the quality of downloaded movies while the satisfaction with TV shows was even greater with 91% indicating approval.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

When a Big Torrent Site Dies, Some Hope it Will Be Right Back

Post Syndicated from Andy original https://torrentfreak.com/when-a-big-torrent-site-dies-some-hope-it-will-be-right-back-170604/

For a niche that has had millions of words written about it over the past 18 years or so, most big piracy stories have had the emotions of people at their core.

When The Pirate Bay was taken down by the police eleven years ago it was global news, but the real story was the sense of disbelief and loss felt by millions of former users. Outsiders may dismiss these feelings, but they are very common and very real.

Of course, those negative emotions soon turned to glee when the site returned days later, but full-on, genuine resurrections are something that few big sites have been able to pull off since. What we have instead today is the sudden disappearance of iconic sites and a scrambling by third-party opportunists to fill in the gaps with look-a-like platforms.

The phenomenon has affected many big sites, from The Pirate Bay itself through to KickassTorrents, YTS/YIFY, and more recently, ExtraTorrent. When sites disappear, it’s natural for former users to look for replacements. And when those replacements look just like the real deal there’s a certain amount of comfort to be had. For many users, these sites provide the perfect antidote to their feelings of loss.

That being said, the clone site phenomenon has seriously got out of hand. Pioneered by players in the streaming site scene, fake torrent sites can now be found in abundance wherever there is a brand worth copying. ExtraTorrent operator SaM knew this when he closed his site last month, and he took the time to warn people away from them personally.

“Stay away from fake ExtraTorrent websites and clones,” he said.

It’s questionable how many listened.

Within days, users were flooding to fake ExtraTorrent sites, encouraged by some elements of the press. Despite having previously reported SaM’s clear warnings, some publications were still happy to report that ExtraTorrent was back, purely based on the word of the fake sites themselves. And I’ve got a bridge for sale, if you have the cash.

While misleading news reports must take some responsibility, it’s clear that when big sites go down a kind of grieving process takes place among dedicated former users, making some more likely to clutch at straws. While some simply move on, others who have grown more attached to a platform they used to call home can go into denial.

This reaction has often been seen in TF’s mailbox, when YTS/YIFY went down in particular. More recently, dozens of emails informed us that ExtraTorrent had gone, with many others asking when it was coming back. But the ones that stood out most were from people who had read SaM’s message, read TF’s article stating that ALL clones were fakes, yet still wanted to know if sites a, b and c were legitimate or not.

We approached a user on Reddit who had asked similar things and been derided by other users for his apparent reluctance to accept that ExtraTorrent had gone. We didn’t find stupidity (as a few in /r/piracy had cruelly suggested) but a genuine sense of loss.

“I loved the site dude, what can I say?” he told TF. “Just kinda got used to it and hung around. Before I knew it I was logging in every day. In time it just felt like home. I miss it.”

The user hadn’t seen the articles claiming that one of the imposter ExtraTorrent sites was the real deal. He did, however, seem a bit unsettled when we told him it was a fake. But when we asked if he was going to stop using it, we received an emphatic “no”.

“Dude it looks like ET and yeah it’s not quite the same but I can get my torrents. Why does it matter what crew [runs it]?” he said.

It does matter, of course. The loss of a proper torrent site like ExtraTorrent, which had releasers and a community, can never be replaced by a custom-skinned Pirate Bay mirror. No matter how much it looks like a lost friend, it’s actually a pig in lipstick that contributes little to the ecosystem.

That being said, it’s difficult to counter the fact that some of these clones make people happy. They fill a void that other sites, for mainly cosmetic reasons, can’t fill. With this in mind, the grounds for criticism weaken a little – but not much.

For anyone who has watched the Black Mirror episode ‘Be Right Back’, it’s clear that sudden loss can be a hard thing for humans to accept. When trying to fill the gap, what might initially seem like a good replacement is almost certainly destined to disappoint longer term, when the sub-standard copy fails to capture the heart and soul of the real deal.

It’s an issue that will occupy the piracy scene for some time to come, but interestingly, it’s also an argument that Hollywood has used against piracy itself for decades. But that’s another story.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

How NAGRA Fights Kodi and IPTV Piracy

Post Syndicated from Andy original https://torrentfreak.com/how-nagra-fights-kodi-and-iptv-piracy-170603/

Nagravision or NAGRA is one of the best known companies operating in the digital cable and satellite television content security space. Due to successes spanning several decades, the company has often proven unpopular with pirates.

In particular, Nagravision encryption systems have regularly been a hot topic for discussion on cable and satellite hacking forums, frustrating those looking to receive pay TV services without paying the high prices associated with them. However, the rise of the Internet is now presenting new challenges.

NAGRA still protects traditional cable and satellite pay TV services in 2017; Virgin Media in the UK is a long-standing customer, for example. But the rise of Internet streaming means that pirate content can now be delivered to the home with ease, completely bypassing the entire pay TV provider infrastructure. And, by extension, NAGRA’s encryption.

This means that NAGRA has been required to spread its wings.

As reported in April, NAGRA is establishing a lab to monitor and detect unauthorized consumption of content via set-top boxes, websites and other streaming platforms. That covers the now omnipresent Kodi phenomenon, alongside premium illicit IPTV services. TorrentFreak caught up with the company this week to find out more.

“NAGRA has an automated monitoring platform that scans all live channels and VOD assets available on Kodi,” NAGRA’s Ivan Schnider informs TF.

“The service we offer to our customers automatically finds illegal distribution of their content on Kodi and removes infringing streams.”

In the first instance, NAGRA sends standard takedown notices to hosting services to terminate illicit streams. The company says that while some companies are very cooperative, others are less so. When meeting resistance, NAGRA switches to more coercive methods, described here by Christopher Schouten, NAGRA Senior Director Product Marketing.

“Takedowns are generally sent to streaming platforms and hosting servers. When those don’t work, Advanced Takedowns allow us to use both technical and legal means to get results,” Schouten says.

“Numerous stories in recent days show how for instance popular Kodi plug-ins have been removed by their authors because of the mere threat of legal actions like this.”

At the center of operations is NAGRA’s Piracy Intelligence Portal, which offers customers a real-time view of worldwide online piracy trends, information on the infrastructure behind illegal services, as well as statistics and status of takedown requests.

“We measure takedown compliance very carefully using our Piracy Intelligence Portal, so we can usually predict the results we will get. We work on a daily basis to improve relationships and interfaces with those who are less compliant,” Schouten says.

The Piracy Intelligence Portal

While persuasion is probably the best solution, some hosts inevitably refuse to cooperate. However, NAGRA also offers the NexGuard system, which is able to determine the original source of the content.

“Using forensic watermarking to trace the source of the leak, we will be able to completely shut down the ‘leak’ at the source, independently and within minutes of detection,” Schouten says.

Whatever route is taken, NAGRA says that the aim is to take down streams as quickly as possible, something which hopefully undermines confidence in pirate services and encourages users to re-enter the legal market. Interestingly, the company also says it uses “technical means” to degrade pirate services to the point that consumers lose faith in them.

But while augmented Kodi setups and illicit IPTV are certainly considered a major threat in 2017, they are not the only problem faced by content companies.

While the Apple platform is quite tight, the open nature of Android means that there are a rising number of apps that can be sideloaded from the web. These allow pirate content to be consumed quickly and conveniently within a glossy interface.

Apps like Showbox, MovieHD and Terrarium TV have the movie and TV show sector wrapped up, while the popular Mobdro achieves the same with live TV, including premium sports. Schnider says NAGRA can handle apps like these and other emerging threats in a variety of ways.

“In addition to Kodi-related anti-piracy activities, NAGRA offers a service that automatically finds illegal distribution of content on Android applications, fully loaded STBs, M3U playlist and other platforms that provide plug-and-play solutions for the big TV screen; this service also includes the removal of infringing streams,” he explains.

M3U playlist piracy doesn’t get a lot of press. An M3U file is a text file that specifies locations where content (such as streams) can be found online.

In its basic ‘free’ form, it’s simply a case of finding an M3U file on an indexing site or blog and loading it into VLC. It’s not as flashy as any of the above apps, and unless one knows where to get the free M3Us quickly, many channels may already be offline. Premium M3U files are widely available, however, and tend to be pretty reliable.
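For illustration, a basic M3U playlist is nothing more than a plain text file along these lines; the channel names and URLs below are placeholders rather than real streams:

#EXTM3U
#EXTINF:-1,Example Channel One
http://example.com/streams/channel-one.m3u8
#EXTINF:-1,Example Channel Two
http://example.com/streams/channel-two.ts

Loading a file like this into VLC simply tells the player where to fetch each stream; there is no app or interface layered on top.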

But while attacking sources of infringing content is clearly a big part of NAGRA’s mission, the company also deploys softer strategies for dealing with pirates.

“Beyond disrupting pirate streams, raising awareness amongst users that these services are illegal and helping service providers deliver competing legitimate services, are also key areas in the fight against premium IPTV piracy where NAGRA can help,” Schnider says.

“Converting users of such services to legitimate paying subscribers represents a significant opportunity for content owners and distributors.”

For this to succeed, Schouten says there needs to be an understanding of the different motivators that lead an individual to commit piracy.

“Is it price? Is it availability? Is it functionality?” he asks.

Interestingly, he also reveals that lots of people are spending large sums of money on IPTV services they believe are legal but are not. Rather than putting customers off, the high prices actually add to the services’ air of legitimacy.

“These consumers can relatively easily be converted into paying subscribers if they can be convinced that pay-TV services offer superior quality, reliability, and convenience because let’s face it, most IPTV services are still a little dodgy to use,” he says.

“Education is also important; done through working with service providers to inform consumers through social media platforms of the risks linked to the use of illegitimate streaming devices / IPTV devices, e.g. purchasing boxes that may no longer work after a short period of time.”

And so the battle over content continues.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Building High-Throughput Genomics Batch Workflows on AWS: Workflow Layer (Part 4 of 4)

Post Syndicated from Andy Katz original https://aws.amazon.com/blogs/compute/building-high-throughput-genomics-batch-workflows-on-aws-workflow-layer-part-4-of-4/

Aaron Friedman is a Healthcare and Life Sciences Partner Solutions Architect at AWS

Angel Pizarro is a Scientific Computing Technical Business Development Manager at AWS

This post is the fourth in a series on how to build a genomics workflow on AWS. In Part 1, we introduced a general architecture, shown below, and highlighted the three common layers in a batch workflow:

  • Job
  • Batch
  • Workflow

In Part 2, you built a Docker container for each job that needed to run as part of your workflow, and stored them in Amazon ECR.

In Part 3, you tackled the batch layer and built a scalable, elastic, and easily maintainable batch engine using AWS Batch. This solution took care of dynamically scaling your compute resources in response to the number of runnable jobs in your job queue, as well as managing job placement.

In Part 4, you build out the workflow layer of your solution using AWS Step Functions and AWS Lambda. You then run an end-to-end genomic analysis, specifically exome secondary analysis, which you can repeat many times at a cost of less than $1 per exome.

Step Functions makes it easy to coordinate the components of your applications using visual workflows. Building applications from individual components that each perform a single function lets you scale and change your workflow quickly. You can use the graphical console to arrange and visualize the components of your application as a series of steps, which simplifies building and running multi-step applications. You can change and add steps without writing code, so you can easily evolve your application and innovate faster.

An added benefit of using Step Functions to define your workflows is that the state machines you create are immutable. While you can delete a state machine, you cannot alter it after it is created. For regulated workloads where auditing is important, you can be assured that state machines you used in production cannot be altered.

In this blog post, you will create a Lambda state machine to orchestrate your batch workflow. For more information on how to create a basic state machine, please see this Step Functions tutorial.

All code related to this blog series can be found in the associated GitHub repository here.

Build a state machine building block

To skip the following steps, we have provided an AWS CloudFormation template that can deploy your Step Functions state machine. You can use this in combination with the setup you did in part 3 to quickly set up the environment in which to run your analysis.

The state machine is composed of smaller state machines that submit a job to AWS Batch, and then poll and check its execution.

The steps in this building block state machine are as follows:

  1. A job is submitted.
    Each analytical module/job has its own Lambda function for submission and calls the batchSubmitJob Lambda function that you built in the previous blog post. You will build these specialized Lambda functions in the following section.
  2. The state machine queries the AWS Batch API for the job status.
    This is also a Lambda function.
  3. The job status is checked to see if the job has completed.
    If the job status equals SUCCEEDED, proceed to log the final job status. If the job status equals FAILED, end the execution of the state machine. In all other cases, wait 30 seconds and go back to Step 2.

Here is the JSON representing this state machine.

{
  "Comment": "A simple example that submits a Job to AWS Batch",
  "StartAt": "SubmitJob",
  "States": {
    "SubmitJob": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>::function:batchSubmitJob",
      "Next": "GetJobStatus"
    },
    "GetJobStatus": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>:function:batchGetJobStatus",
      "Next": "CheckJobStatus",
      "InputPath": "$",
      "ResultPath": "$.status"
    },
    "CheckJobStatus": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.status",
          "StringEquals": "FAILED",
          "End": true
        },
        {
          "Variable": "$.status",
          "StringEquals": "SUCCEEDED",
          "Next": "GetFinalJobStatus"
        }
      ],
      "Default": "Wait30Seconds"
    },
    "Wait30Seconds": {
      "Type": "Wait",
      "Seconds": 30,
      "Next": "GetJobStatus"
    },
    "GetFinalJobStatus": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<account-id>:function:batchGetJobStatus",
      "End": true
    }
  }
}

Building the Lambda functions for the state machine

You need two basic Lambda functions for this state machine. The first one submits a job to AWS Batch and the second checks the status of the AWS Batch job that was submitted.

In AWS Step Functions, you specify an input as JSON that is read into your state machine. Each state receives the aggregate of the steps immediately preceding it, and you can specify which components a state passes on to its children. Because you are using Lambda functions to execute tasks, one of the easiest routes to take is to modify the input JSON, represented as a Python dictionary, within the Lambda function and return the entire dictionary back for the next state to consume.

Building the batchSubmitIsaacJob Lambda function

For Step 1 above, you need a Lambda function for each of the steps in your analysis workflow. As you created a generic Lambda function in the previous post to submit a batch job (batchSubmitJob), you can use that function as the basis for the specialized functions you’ll include in this state machine. Here is such a Lambda function for the Isaac aligner.

from __future__ import print_function

import boto3
import json
import traceback

lambda_client = boto3.client('lambda')



def lambda_handler(event, context):
    try:
        # Generate output path for the aligned BAM results
        bam_s3_path = '/'.join([event['resultsS3Path'], event['sampleId'], 'bam/'])

        depends_on = event['dependsOn'] if 'dependsOn' in event else []

        # Generate run command
        command = [
            '--bam_s3_folder_path', bam_s3_path,
            '--fastq1_s3_path', event['fastq1S3Path'],
            '--fastq2_s3_path', event['fastq2S3Path'],
            '--reference_s3_path', event['isaac']['referenceS3Path'],
            '--working_dir', event['workingDir']
        ]

        if 'cmdArgs' in event['isaac']:
            command.extend(['--cmd_args', event['isaac']['cmdArgs']])
        if 'memory' in event['isaac']:
            command.extend(['--memory', event['isaac']['memory']])

        # Submit Payload
        response = lambda_client.invoke(
            FunctionName='batchSubmitJob',
            InvocationType='RequestResponse',
            LogType='Tail',
            Payload=json.dumps(dict(
                dependsOn=depends_on,
                containerOverrides={
                    'command': command,
                },
                jobDefinition=event['isaac']['jobDefinition'],
                jobName='-'.join(['isaac', event['sampleId']]),
                jobQueue=event['isaac']['jobQueue']
            )))

        response_payload = response['Payload'].read()

        # Update event
        event['bamS3Path'] = bam_s3_path
        event['jobId'] = json.loads(response_payload)['jobId']
        
        return event
    except Exception as e:
        traceback.print_exc()
        raise e

In the Lambda console, create a Python 2.7 Lambda function named batchSubmitIsaacJob and paste in the above code. Use the LambdaBatchExecutionRole that you created in the previous post. For more information, see Step 2.1: Create a Hello World Lambda Function.

This Lambda function reads in the inputs passed to the state machine it is part of, formats the data for the batchSubmitJob Lambda function, invokes that Lambda function, and then modifies the event dictionary to pass on to the subsequent states. You can repeat this for each of the other tools, which can be found in the tools//lambda/lambda_function.py script in the GitHub repo.
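To make the data flow concrete, here is an abbreviated sketch of what the function hands to the next state. The original input fields pass through untouched (only a few are shown), and two new keys are appended; the jobId value is a placeholder for whatever AWS Batch actually returns:

{
  "resultsS3Path": "s3://<bucket>/genomic-workflow/results",
  "sampleId": "NA12878_states_1",
  "isaac": {
    "jobDefinition": "isaac-myenv:1",
    "jobQueue": "<job-queue-arn>",
    "referenceS3Path": "s3://aws-batch-genomics-resources/reference/isaac/"
  },
  "bamS3Path": "s3://<bucket>/genomic-workflow/results/NA12878_states_1/bam/",
  "jobId": "<aws-batch-job-id>"
}

The batchGetJobStatus function described next relies on that jobId key being present.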

Building the batchGetJobStatus Lambda function

For Step 2 above, the process queries the AWS Batch DescribeJobs API action with jobId to identify the state that the job is in. You can put this into a Lambda function to integrate it with Step Functions.

In the Lambda console, create a new Python 2.7 function with the LambdaBatchExecutionRole IAM role. Name your function batchGetJobStatus and paste in the following code. This is similar to the batch-get-job-python27 Lambda blueprint.

from __future__ import print_function

import boto3
import json

print('Loading function')

batch_client = boto3.client('batch')

def lambda_handler(event, context):
    # Log the received event
    print("Received event: " + json.dumps(event, indent=2))
    # Get jobId from the event
    job_id = event['jobId']

    try:
        response = batch_client.describe_jobs(
            jobs=[job_id]
        )
        job_status = response['jobs'][0]['status']
        return job_status
    except Exception as e:
        print(e)
        message = 'Error getting Batch Job status'
        print(message)
        raise Exception(message)

Structuring state machine input

You have structured the state machine input so that general file references are included at the top-level of the JSON object, and any job-specific items are contained within a nested JSON object. At a high level, this is what the input structure looks like:

{
        "general_field_1": "value1",
        "general_field_2": "value2",
        "general_field_3": "value3",
        "job1": {},
        "job2": {},
        "job3": {}
}

Building the full state machine

By chaining these state machine components together, you can quickly build flexible workflows that can process genomes in multiple ways. The development of the larger state machine that defines the entire workflow uses four of the above building blocks. You use the Lambda functions that you built in the previous section. Rename each building block submission to match the tool name.

We have provided a CloudFormation template to deploy your state machine and the associated IAM roles. In the CloudFormation console, select Create Stack, choose your template (deploy_state_machine.yaml), and enter in the ARNs for the Lambda functions you created.

Continue through the rest of the steps and deploy your stack. Be sure to check the box next to "I acknowledge that AWS CloudFormation might create IAM resources."
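If you prefer the CLI, a deployment along the following lines should also work; note that the parameter key names shown here are hypothetical and must match whatever deploy_state_machine.yaml actually defines:

aws cloudformation create-stack \
  --stack-name genomics-workflow-state-machine \
  --template-body file://deploy_state_machine.yaml \
  --capabilities CAPABILITY_IAM \
  --parameters ParameterKey=IsaacLambdaArn,ParameterValue=<arn> \
               ParameterKey=StrelkaLambdaArn,ParameterValue=<arn> \
               ParameterKey=SamtoolsStatsLambdaArn,ParameterValue=<arn> \
               ParameterKey=SnpEffLambdaArn,ParameterValue=<arn> \
               ParameterKey=BatchGetJobStatusLambdaArn,ParameterValue=<arn>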

Once the CloudFormation stack is finished deploying, you should see the following image of your state machine.

In short, you first submit a job for Isaac, which is the aligner you are using for the analysis. Next, you use a Parallel state to split your output from "GetFinalIsaacJobStatus" and send it to both your variant calling step, Strelka, and your QC step, Samtools Stats. These are then run in parallel, and you annotate the results from your Strelka step with snpEff.
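To give a flavor of what that fan-out looks like in Amazon States Language, here is a heavily abbreviated sketch of a Parallel state. The state and Lambda function names are illustrative, and in the real workflow each branch contains the full submit/poll/check building block rather than a single task:

"RunStrelkaAndSamtoolsStats": {
  "Type": "Parallel",
  "End": true,
  "Branches": [
    {
      "StartAt": "SubmitStrelkaJob",
      "States": {
        "SubmitStrelkaJob": {
          "Type": "Task",
          "Resource": "arn:aws:lambda:us-east-1:<account-id>:function:batchSubmitStrelkaJob",
          "End": true
        }
      }
    },
    {
      "StartAt": "SubmitSamtoolsStatsJob",
      "States": {
        "SubmitSamtoolsStatsJob": {
          "Type": "Task",
          "Resource": "arn:aws:lambda:us-east-1:<account-id>:function:batchSubmitSamtoolsStatsJob",
          "End": true
        }
      }
    }
  ]
}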

Putting it all together

Now that you have built all of the components for a genomics secondary analysis workflow, test the entire process.

We have provided sequences from an Illumina sequencer that cover a region of the genome known as the exome. Most of the positions in the genome that we have currently associated with disease or human traits reside in this region, which is 1–2% of the entire genome. The workflow that you have built works for both analyzing an exome, as well as an entire genome.

Additionally, we have provided prebuilt reference genomes for Isaac, located at:

s3://aws-batch-genomics-resources/reference/

If you are interested, we have provided a script that sets up all of that data. To execute that script, run the following command on a large EC2 instance:

make reference REGISTRY=<your-ecr-registry>

Indexing and preparing this reference takes many hours on a large-memory EC2 instance. Be careful about the costs involved and note that the data is available through the prebuilt reference genomes.

Starting the execution

In a previous section, you established the structure of the JSON that is fed into your state machine. For ease, we have pre-populated this input JSON for you. You can also find it in the GitHub repo under workflow/test.input.json:

{
  "fastq1S3Path": "s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz",
  "fastq2S3Path": "s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz",
  "referenceS3Path": "s3://aws-batch-genomics-resources/reference/hg38.fa",
  "resultsS3Path": "s3://<bucket>/genomic-workflow/results",
  "sampleId": "NA12878_states_1",
  "workingDir": "/scratch",
  "isaac": {
    "jobDefinition": "isaac-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/highPriority-myenv",
    "referenceS3Path": "s3://aws-batch-genomics-resources/reference/isaac/"
  },
  "samtoolsStats": {
    "jobDefinition": "samtools_stats-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/lowPriority-myenv"
  },
  "strelka": {
    "jobDefinition": "strelka-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/highPriority-myenv",
    "cmdArgs": " --exome "
  },
  "snpEff": {
    "jobDefinition": "snpeff-myenv:1",
    "jobQueue": "arn:aws:batch:us-east-1:<account-id>:job-queue/lowPriority-myenv",
    "cmdArgs": " -t hg38 "
  }
}

You are now at the stage to run your full genomic analysis. Copy the above to a new text file, change paths and ARNs to the ones that you created previously, and save your JSON input as input.states.json.

In the CLI, execute the following command. You need the ARN of the state machine that you created previously:

aws stepfunctions start-execution --state-machine-arn <your-state-machine-arn> --input file://input.states.json

Your analysis has now started. By using Spot Instances with AWS Batch, you can quickly scale out your workflows while concurrently optimizing for cost. While this is not guaranteed, most executions of the workflows presented here should cost under $1 for a full analysis.

Monitoring the execution

The output from the above CLI command gives you the ARN that describes the specific execution. Copy that and navigate to the Step Functions console. Select the state machine that you created previously and paste the ARN into the search bar.

The screen shows information about your specific execution. On the left, you see where your execution currently is in the workflow.

In the following screenshot, you can see that your workflow has successfully completed the alignment job and moved onto the subsequent steps, which are variant calling and generating quality information about your sample.

You can also navigate to the AWS Batch console and see the progress of all of your jobs reflected there as well.
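If you would rather monitor from the command line, the same execution ARN works with the Step Functions CLI:

aws stepfunctions describe-execution --execution-arn <your-execution-arn>
aws stepfunctions get-execution-history --execution-arn <your-execution-arn>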

Finally, after your workflow has completed successfully, check out the S3 path to which you wrote all of your files by running an aws s3 ls --recursive command against the S3 results path specified in the input to your state machine execution.
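For example, assuming the results path from the input JSON shown earlier (substitute your own bucket name):

aws s3 ls --recursive s3://<bucket>/genomic-workflow/results/

You should see something similar to the following: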

2017-05-02 13:46:32 6475144340 genomic-workflow/results/NA12878_run1/bam/sorted.bam
2017-05-02 13:46:34    7552576 genomic-workflow/results/NA12878_run1/bam/sorted.bam.bai
2017-05-02 13:46:32         45 genomic-workflow/results/NA12878_run1/bam/sorted.bam.md5
2017-05-02 13:53:20      68769 genomic-workflow/results/NA12878_run1/stats/bam_stats.dat
2017-05-02 14:05:12        100 genomic-workflow/results/NA12878_run1/vcf/stats/runStats.tsv
2017-05-02 14:05:12        359 genomic-workflow/results/NA12878_run1/vcf/stats/runStats.xml
2017-05-02 14:05:12  507577928 genomic-workflow/results/NA12878_run1/vcf/variants/genome.S1.vcf.gz
2017-05-02 14:05:12     723144 genomic-workflow/results/NA12878_run1/vcf/variants/genome.S1.vcf.gz.tbi
2017-05-02 14:05:12  507577928 genomic-workflow/results/NA12878_run1/vcf/variants/genome.vcf.gz
2017-05-02 14:05:12     723144 genomic-workflow/results/NA12878_run1/vcf/variants/genome.vcf.gz.tbi
2017-05-02 14:05:12   30783484 genomic-workflow/results/NA12878_run1/vcf/variants/variants.vcf.gz
2017-05-02 14:05:12    1566596 genomic-workflow/results/NA12878_run1/vcf/variants/variants.vcf.gz.tbi

Modifications to the workflow

You have now built and run your genomics workflow. While diving deep into modifications to this architecture is beyond the scope of these posts, we wanted to leave you with several suggestions of how you might modify this workflow to satisfy additional business requirements.

  • Job tracking with Amazon DynamoDB
    In many cases, such as if you are offering Genomics-as-a-Service, you might want to track the state of your jobs with DynamoDB to get fine-grained records of how your jobs are running. This way, you can easily identify the cost of individual jobs and workflows that you run.
  • Resuming from failure
    Both AWS Batch and Step Functions natively support job retries and can cover many of the standard cases where a job might be interrupted. There may be cases, however, where your workflow fails in an unpredictable way. In this case, you can use custom error handling with AWS Step Functions to build out a workflow that is even more resilient. You can also build Fail states into your state machine to end execution at any point, such as when a batch job fails after a certain number of retries (see the sketch after this list).
  • Invoking Step Functions from Amazon API Gateway
    You can use API Gateway to build an API that acts as a "front door" to Step Functions. You can create a POST method that contains the input JSON to feed into the state machine you built. For more information, see the Implementing Serverless Manual Approval Steps in AWS Step Functions and Amazon API Gateway blog post.
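As a rough illustration of that last point, here is how a retry and a terminal Fail state might be bolted onto the SubmitJob task from the building block shown earlier; the error names and retry values are placeholders rather than part of the actual workflow:

"SubmitJob": {
  "Type": "Task",
  "Resource": "arn:aws:lambda:us-east-1:<account-id>:function:batchSubmitJob",
  "Retry": [
    {
      "ErrorEquals": ["States.TaskFailed"],
      "IntervalSeconds": 30,
      "MaxAttempts": 2
    }
  ],
  "Catch": [
    {
      "ErrorEquals": ["States.ALL"],
      "Next": "JobFailed"
    }
  ],
  "Next": "GetJobStatus"
},
"JobFailed": {
  "Type": "Fail",
  "Error": "BatchJobFailed",
  "Cause": "The AWS Batch job was not submitted or did not complete successfully"
}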

Conclusion

While the approach we have demonstrated in this series has been focused on genomics, it is important to note that this can be generalized to nearly any high-throughput batch workload. We hope that you have found the information useful and that it can serve as a jump-start to building your own batch workloads on AWS with native AWS services.

For more information about how AWS can enable your genomics workloads, be sure to check out the AWS Genomics page.

Please leave any questions and comments below.

Building High-Throughput Genomic Batch Workflows on AWS: Batch Layer (Part 3 of 4)

Post Syndicated from Andy Katz original https://aws.amazon.com/blogs/compute/building-high-throughput-genomic-batch-workflows-on-aws-batch-layer-part-3-of-4/

Aaron Friedman is a Healthcare and Life Sciences Partner Solutions Architect at AWS

Angel Pizarro is a Scientific Computing Technical Business Development Manager at AWS

This post is the third in a series on how to build a genomics workflow on AWS. In Part 1, we introduced a general architecture, shown below, and highlighted the three common layers in a batch workflow:

  • Job
  • Batch
  • Workflow

In Part 2, you built a Docker container for each job that needed to run as part of your workflow, and stored them in Amazon ECR.

In Part 3, you tackle the batch layer and build a scalable, elastic, and easily maintainable batch engine using AWS Batch.

AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. It dynamically provisions the optimal quantity and type of compute resources (for example, CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs that you submit. With AWS Batch, you do not need to install and manage your own batch computing software or server clusters, which allows you to focus on analyzing results, such as those of your genomic analysis.

Integrating applications into AWS Batch

If you are new to AWS Batch, we recommend reading Setting Up AWS Batch to ensure that you have the proper permissions and AWS environment.

After you have a working environment, you define several types of resources:

  • IAM roles that provide service permissions
  • A compute environment that launches and terminates compute resources for jobs
  • A custom Amazon Machine Image (AMI)
  • A job queue to submit the units of work and to schedule the appropriate resources within the compute environment to execute those jobs
  • Job definitions that define how to execute an application

After the resources are created, you’ll test the environment and create an AWS Lambda function to send generic jobs to the queue.

This genomics workflow covers the basic steps. For more information, see Getting Started with AWS Batch.

Creating the necessary IAM roles

AWS Batch simplifies batch processing by managing a number of underlying AWS services so that you can focus on your applications. As a result, you create IAM roles that give the service permissions to act on your behalf. In this section, deploy the AWS CloudFormation template included in the GitHub repository and extract the ARNs for later use.

To deploy the stack, go to the top level in the repo with the following command:

aws cloudformation create-stack --template-body file://batch/setup/iam.template.yaml --stack-name iam --capabilities CAPABILITY_NAMED_IAM

You can capture the output from this stack in the Outputs tab of the CloudFormation console.
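If you prefer to script this step, the same values are available from the CLI:

aws cloudformation describe-stacks --stack-name iam --query 'Stacks[0].Outputs'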

Creating the compute environment

In AWS Batch, you will set up a managed compute environment. Managed compute environments automatically launch and terminate compute resources on your behalf based on the aggregate resources needed by your jobs, such as vCPU and memory, and simple boundaries that you define.

When defining your compute environment, specify the following:

  • Desired instance types in your environment
  • Min and max vCPUs in the environment
  • The Amazon Machine Image (AMI) to use
  • Percentage of the On-Demand price to bid on the Spot Market
  • VPC subnets that can be used

AWS Batch then provisions an elastic and heterogeneous pool of Amazon EC2 instances based on the aggregate resource requirements of jobs sitting in the RUNNABLE state. If a mix of CPU and memory-intensive jobs are ready to run, AWS Batch provisions the appropriate ratio and size of CPU and memory-optimized instances within your environment. For this post, you will use the simplest configuration, in which instance types are set to "optimal" allowing AWS Batch to choose from the latest C, M, and R EC2 instance families.

While you could create this compute environment in the console, we provide the following CLI commands. Replace the subnet IDs and key name with your own private subnets and key, and the image-id with the image you will build in the next section.

ACCOUNTID=<your account id>
SERVICEROLE=<from output in CloudFormation template>
IAMFLEETROLE=<from output in CloudFormation template>
JOBROLEARN=<from output in CloudFormation template>
SUBNETS=<comma delimited list of subnets>
SECGROUPS=<your security groups>
SPOTPER=50 # percentage of on demand
IMAGEID=<ami-id corresponding to the one you created>
INSTANCEROLE=<from output in CloudFormation template>
REGISTRY=${ACCOUNTID}.dkr.ecr.us-east-1.amazonaws.com
KEYNAME=<your key name>
MAXCPU=1024 # max vCPUs in compute environment
ENV=myenv

# Creates the compute environment
aws batch create-compute-environment --compute-environment-name genomicsEnv-$ENV --type MANAGED --state ENABLED --service-role ${SERVICEROLE} --compute-resources type=SPOT,minvCpus=0,maxvCpus=$MAXCPU,desiredvCpus=0,instanceTypes=optimal,imageId=$IMAGEID,subnets=$SUBNETS,securityGroupIds=$SECGROUPS,ec2KeyPair=$KEYNAME,instanceRole=$INSTANCEROLE,bidPercentage=$SPOTPER,spotIamFleetRole=$IAMFLEETROLE
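Before moving on, you can confirm that the compute environment was created and has reached the VALID status:

aws batch describe-compute-environments --compute-environments genomicsEnv-$ENV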

Creating the custom AMI for AWS Batch

While you can use default Amazon ECS-optimized AMIs with AWS Batch, you can also provide your own image in managed compute environments. We will use this feature to provision additional scratch EBS storage on each of the instances that AWS Batch launches and also to encrypt both the Docker and scratch EBS volumes.

AWS Batch has the same requirements for your AMI as Amazon ECS. To build the custom image, modify the default Amazon ECS-Optimized Amazon Linux AMI in the following ways:

  • Attach a 1 TB scratch volume to /dev/sdb
  • Encrypt the Docker and new scratch volumes
  • Mount the scratch volume to /docker_scratch by modifying /etc/fstab

The first two tasks can be addressed when you create the custom AMI in the console. Spin up a small t2.micro instance, and proceed through the standard EC2 instance launch.

After your instance has launched, record the IP address and then SSH into the instance. Copy and paste the following code:

sudo yum -y update
sudo parted /dev/xvdb mklabel gpt
sudo parted /dev/xvdb mkpart primary 0% 100%
sudo mkfs -t ext4 /dev/xvdb1
sudo mkdir /docker_scratch
sudo echo -e '/dev/xvdb1\t/docker_scratch\text4\tdefaults\t0\t0' | sudo tee -a /etc/fstab
sudo mount -a

This auto-mounts your scratch volume to /docker_scratch, which is your scratch directory for batch processing. Next, create your new AMI and record the image ID.
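
If you prefer the CLI for this step, you can create the AMI from the configured instance with a command along the following lines. This is only a sketch: the instance ID is the t2.micro instance you just configured, and the AMI name and description are example values. The command returns the ImageId to use for the IMAGEID variable above.

# Create the custom AMI from the configured instance (the instance may reboot)
aws ec2 create-image --instance-id <your instance id> \
  --name "aws-batch-genomics-ami" \
  --description "ECS-optimized AMI with encrypted 1 TB scratch volume"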

Creating the job queues

AWS Batch job queues are used to coordinate the submission of batch jobs. Your jobs are submitted to job queues, which can be mapped to one or more compute environments. Job queues have priority relative to each other. You can also specify the order in which they consume resources from your compute environments.

In this solution, use two job queues. The first is for high priority jobs, such as alignment or variant calling. Set this queue with a high priority (1000) and map it back to the previously created compute environment. Next, create a second job queue for low priority jobs, such as quality statistics generation. To create these job queues, enter the following CLI commands:

aws batch create-job-queue --job-queue-name highPriority-${ENV} --compute-environment-order order=0,computeEnvironment=genomicsEnv-${ENV}  --priority 1000 --state ENABLED
aws batch create-job-queue --job-queue-name lowPriority-${ENV} --compute-environment-order order=0,computeEnvironment=genomicsEnv-${ENV}  --priority 1 --state ENABLED
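
As with the compute environment, you can quickly verify that both queues were created and are ready to accept jobs. A minimal check, assuming the same ENV variable as above:

# Confirm both job queues are ENABLED and VALID
aws batch describe-job-queues --job-queues highPriority-${ENV} lowPriority-${ENV} \
  --query 'jobQueues[*].[jobQueueName,state,status]' --output text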

Creating the job definitions

To run the Isaac aligner container image locally, supply the Amazon S3 locations for the FASTQ input sequences, the reference genome to align to, and the output BAM file. For more information, see tools/isaac/README.md.

The Docker container itself also requires information about a suitable mountable volume so that it can read and write temporary files without running out of space.

Note: In the following example, the FASTQ files and the reference files are in a publicly available bucket.

FASTQ1=s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz
FASTQ2=s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz
REF=s3://aws-batch-genomics-resources/reference/isaac/
BAM=s3://mybucket/genomic-workflow/test_results/bam/

mkdir ~/scratch

docker run --rm -ti -v ${HOME}/scratch:/scratch $REPO_URI --bam_s3_folder_path $BAM \
--fastq1_s3_path $FASTQ1 \
--fastq2_s3_path $FASTQ2 \
--reference_s3_path $REF \
--working_dir /scratch 

Containers running locally can typically use as much CPU and memory as the host has available. In AWS Batch, the CPU and memory requirements are hard limits that are allocated to the container at runtime.

Isaac is a fairly resource-intensive algorithm, as it creates an uncompressed index of the reference genome in memory to match the query DNA sequences. The large memory space is shared across multiple CPU threads, and Isaac can scale almost linearly with the number of CPU threads given to it as a parameter.

To fit these characteristics, choose an optimal instance size to maximize the number of CPU threads based on a given large memory footprint, and deploy a Docker container that uses all of the instance resources. In this case, we chose a host instance with 80+ GB of memory and 32+ vCPUs. The following code is example JSON that you can pass to the AWS CLI to create a job definition for Isaac.

aws batch register-job-definition --job-definition-name isaac-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/isaac",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":80000,
"vcpus":32,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

You can copy and paste the following code for the other three job definitions:

aws batch register-job-definition --job-definition-name strelka-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/strelka",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":32000,
"vcpus":32,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

aws batch register-job-definition --job-definition-name snpeff-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/snpeff",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":10000,
"vcpus":4,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

aws batch register-job-definition --job-definition-name samtoolsStats-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/samtools_stats",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":10000,
"vcpus":4,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

The value for "image" comes from the previous post on creating a Docker image and publishing to ECR. The value for jobRoleArn you can find from the output of the CloudFormation template that you deployed earlier. In addition to providing the number of CPU cores and memory required by Isaac, you also give it a storage volume for scratch and staging. The volume comes from the previously defined custom AMI.

Testing the environment

After you have created the Isaac job definition, you can submit the job using the AWS Batch submitJob API action. While the base mappings for Docker run are taken care of in the job definition that you just built, the job-specific parameters are supplied in the containerOverrides section of the API call. Here’s what this would look like in the CLI, using the same parameters as in the bash commands shown earlier:

aws batch submit-job --job-name testisaac --job-queue highPriority-${ENV} --job-definition isaac-${ENV}:1 --container-overrides '{
"command": [
    "--bam_s3_folder_path", "s3://mybucket/genomic-workflow/test_batch/bam/",
    "--fastq1_s3_path", "s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz",
    "--fastq2_s3_path", "s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz",
    "--reference_s3_path", "s3://aws-batch-genomics-resources/reference/isaac/",
    "--working_dir", "/scratch",
    "--cmd_args", " --exome "
  ]
}'

When you execute a submitJob call, a jobId is returned. You can then track the progress of your job using the describeJobs API action:

aws batch describe-jobs --jobs <jobId returned from submitJob>

You can also track the progress of all of your jobs in the AWS Batch console dashboard.

To see exactly where a RUNNING job is, use the link in the AWS Batch console to direct you to the appropriate location in CloudWatch Logs.
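
If you would rather pull the logs from the command line, the log stream name for a job is available in the describeJobs output. The following sketch assumes the default /aws/batch/job log group that AWS Batch jobs write to:

# Fetch the CloudWatch Logs output for a job via the CLI
LOGSTREAM=$(aws batch describe-jobs --jobs <jobId returned from submitJob> \
  --query 'jobs[0].container.logStreamName' --output text)
aws logs get-log-events --log-group-name /aws/batch/job --log-stream-name ${LOGSTREAM} \
  --query 'events[*].message' --output text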

Completing the batch environment setup

To finish, create a Lambda function to submit a generic AWS Batch job.

In the Lambda console, create a Python 2.7 Lambda function named batchSubmitJob. Copy and paste the following code. This is similar to the batch-submit-job-python27 Lambda blueprint. Use the LambdaBatchExecutionRole that you created earlier. For more information about creating functions, see Step 2.1: Create a Hello World Lambda Function.

from __future__ import print_function

import json
import boto3

batch_client = boto3.client('batch')

def lambda_handler(event, context):
    # Log the received event
    print("Received event: " + json.dumps(event, indent=2))
    # Get parameters for the SubmitJob call
    # http://docs.aws.amazon.com/batch/latest/APIReference/API_SubmitJob.html
    job_name = event['jobName']
    job_queue = event['jobQueue']
    job_definition = event['jobDefinition']
    
    # containerOverrides, dependsOn, and parameters are optional
    container_overrides = event['containerOverrides'] if event.get('containerOverrides') else {}
    parameters = event['parameters'] if event.get('parameters') else {}
    depends_on = event['dependsOn'] if event.get('dependsOn') else []
    
    try:
        response = batch_client.submit_job(
            dependsOn=depends_on,
            containerOverrides=container_overrides,
            jobDefinition=job_definition,
            jobName=job_name,
            jobQueue=job_queue,
            parameters=parameters
        )
        
        # Log response from AWS Batch
        print("Response: " + json.dumps(response, indent=2))
        
        # Return the jobId
        event['jobId'] = response['jobId']
        return event
    
    except Exception as e:
        print(e)
        message = 'Error submitting Batch job'
        print(message)
        raise Exception(message)
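
To test the function from the Lambda console, you can configure a test event that mirrors the submitJob call made earlier. The queue and job definition names below assume the myenv environment and bucket paths used throughout this post; adjust them to match your own resources.

{
  "jobName": "testisaac",
  "jobQueue": "highPriority-myenv",
  "jobDefinition": "isaac-myenv:1",
  "containerOverrides": {
    "command": [
      "--bam_s3_folder_path", "s3://mybucket/genomic-workflow/test_batch/bam/",
      "--fastq1_s3_path", "s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz",
      "--fastq2_s3_path", "s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz",
      "--reference_s3_path", "s3://aws-batch-genomics-resources/reference/isaac/",
      "--working_dir", "/scratch"
    ]
  }
}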

Conclusion

In part 3 of this series, you successfully set up your data processing, or batch, environment in AWS Batch. We also provided a Python script in the corresponding GitHub repo that takes care of all of the above CLI arguments for you, as well as building out the job definitions for all of the jobs in the workflow: Isaac, Strelka, SAMtools, and snpEff. You can check the script’s README for additional documentation.

In Part 4, we’ll cover the workflow layer using AWS Step Functions and AWS Lambda.

Please leave any questions and comments below.

Even Fake Leaks Can Help in Hollywood’s Anti-Piracy Wars

Post Syndicated from Andy original https://torrentfreak.com/even-fake-leaks-can-help-in-hollywoods-anti-piracy-wars-170527/

On Monday 15 May, during a town hall meeting in New York, Disney CEO Bob Iger informed a group of ABC employees that hackers had stolen one of the company’s movies.

The hackers allegedly informed the company that if a ransom was paid, then the copy would never see the light of day. Predictably, Disney refused to pay, the most sensible decision under the circumstances.

Although Disney didn’t name the ‘hacked’ film, it was named by Deadline as ‘Pirates of the Caribbean: Dead Men Tell No Tales’. A week later, the LA Times published a video claiming that the stolen film was indeed the latest entry in the successful ‘Pirates’ franchise.

From the beginning, however, something seemed off. Having made an announcement about the ‘hack’ to ABC employees, Disney suddenly didn’t want to talk anymore, declining all requests for comment. That didn’t make much sense – why make something this huge public if you don’t want to talk about it?

With this and other anomalies nagging, TF conducted its own investigation and this Wednesday – a week and a half after Disney’s announcement and a full three weeks after the company was contacted with a demand for cash – we published our findings.

Our conclusion was that the ‘hack’ almost certainly never happened and, from the beginning, no one had ever spoken about the new Pirates film being the ‘hostage’. Everything pointed to a ransom being demanded for a non-existent copy of The Last Jedi and that the whole thing was a grand hoax.

Multiple publications tried to get a comment from Disney before Wednesday, yet none managed to do so. Without compromising our sources, TF also sent an outline of our investigation to the company to get to the bottom of this saga. We were ignored.

Then, out of the blue, one day after we published our findings, Disney chief Bob Iger suddenly got all talkative again. Speaking with Yahoo Finance, Iger confirmed what we suspected all along – it was a hoax.

“To our knowledge we were not hacked,” Iger said. “We had a threat of a hack of a movie being stolen. We decided to take it seriously but not react in the manner in which the person who was threatening us had required.”

Let’s be clear here, if there were to be a victim in all of this, that would quite clearly be Disney. The company didn’t ask to be hacked, extorted, or lied to. But why would a company quietly sit on a dubious threat for two weeks, then confidently make it public as fact but refuse to talk, only to later declare it a hoax under pressure?

That may never be known, but Disney and its colleagues sure managed to get some publicity and sympathy in the meantime.

Publications such as the LA Times placed the threat alongside the ‘North Korea’ Sony hack, the more recent Orange is the New Black leak, and the WannaCry ransomware attacks that plagued the web earlier this month.

“Hackers are seizing the content and instead of just uploading it, they’re contacting the studios and asking for a ransom. That is a pretty recent phenomenon,” said MPAA content protection chief Dean Marks in the same piece.

“It’s scary,” an anonymous studio executive added. “It could happen to any one of us.”

While that is indeed the case and there is a definite need to take things seriously, this particular case was never credible. Not a single person interviewed by TF believed that a movie was available. Furthermore, there were many signs that the person claiming to have the movie was definitely not another TheDarkOverlord.

In fact, when TF was investigating the leak we had a young member of a release group more or less laugh at us for wasting our time trying to find out if it was real or not. Considering Disney’s massive resources (and the claim that the FBI had been involved), it’s difficult to conclude that the company hadn’t determined the same at a much earlier stage.

All that being said, trying to hoax Disney over a fake leak of The Last Jedi is an extremely dangerous game in its own right. Not only is extortion a serious crime, but dancing around pre-release leaks of Star Wars movies is just about as risky as it gets.

In June 2005, after releasing a workprint copy of Star Wars: Episode 3, the FBI took down private tracker EliteTorrents in a blaze of publicity. People connected to the leak received lengthy jail sentences. The same would happen again today, no doubt.

It might seem like fun and games now, but people screwing with Disney – for real, for money, or both – rarely come out on top. If a workprint of The Last Jedi does eventually become available (and of course that’s always a possibility), potential leakers should consider their options very carefully.

A genuine workprint leak could prompt the company to go to war, but in the meantime, fake-based extortion attempts only add fuel to the anti-piracy fire – in Hollywood’s favor.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Copyright Troll Piracy ‘Witness’ Went Back to the Future – and Lost

Post Syndicated from Andy original https://torrentfreak.com/copyright-troll-piracy-witness-went-back-to-the-future-and-lost-170526/

Since the early 2000s, copyright trolls have been attempting to squeeze cash from pirating Internet users and fifteen years later the practice is still going strong.

While there’s little doubt that trolls catch some genuine infringers in their nets, the claim that actions are all about protecting copyrights is a shallow one. The aim is to turn piracy into profit and history has shown us that the bigger the operation, the more likely it is they’ll cut corners to cut costs.

The notorious Guardaley trolling operation is a prime example. After snaring the IP addresses of hundreds of thousands of Internet users, the company extracts cash settlements in the United States, Europe and beyond. It’s a project of industrial scale based on intimidation of alleged infringers. But, when those people fight back, the scary trolls suddenly become less so.

The latest case of Guardaley running for the hills comes courtesy of SJD from troll-watching site FightCopyrightTrolls, who reports on an attempt by Guardaley partner Criminal Productions to extract settlement from Zach Bethke, an alleged downloader of the Ryan Reynolds movie, Criminal.

On May 12, Bethke’s lawyer, J. Christopher Lynch, informed Criminal Productions’ lawyer David A. Lowe that Bethke is entirely innocent.

“Neither Mr. Bethke nor his girlfriend copied your client’s movie and they do not know who, if anyone, may have done so,” Lynch wrote.

“Mr. Bethke does not use BitTorrent. Prior to this lawsuit, Mr. Bethke had never heard of your client’s movie and he has no interest in it. If he did have any interest in it, he could have rented it for no marginal cost using his Netflix or Amazon Prime accounts.”

Lynch went on to request that Criminal Productions drop the case. Failing that, he said, things would probably get more complicated. As reported last year, Lynch and Lowe have been regularly locking horns over these cases, with Lynch largely coming out on top.

Part of Lynch’s strategy has been to shine light on Guardaley’s often shadowy operations. He previously noted that its investigators were not properly licensed to operate in the U.S. and the company had been found to put forward a fictitious witness, among other things.

In the past, these efforts to bring Guardaley out into the open have resulted in its clients’, which include several film companies, dropping cases. Lynch, it appears, wants that to happen again in Bethke’s case, noting in his letter that it’s “long past due for a judge to question the qualifications” of the company’s so-called technical experts.

In doing so he calls Guardaley’s evidence into question once more, noting inconsistencies in the way alleged infringements were supposedly “observed” by “foreign investigator[s], with a direct financial interest in the matter.”

One of Lynch’s findings is that the “observations” of two piracy investigators overlap each others’ monitoring periods in separate cases, while reportedly monitoring the same torrent hash.

“Both declarations cover the same ‘hash number’ of the movie, i.e. the same soak. This overlap seems impossible if we stick with the fictions of the Complaint and Motion for Expedited Discovery that the declarant ‘observed’ the defendant ‘infringing’,” Lynch notes.

While these are interesting points, the quality of evidence presented by Guardaley and Criminal Productions is really called into question following another revelation. Daniel Macek, an ‘observing’ investigator used in numerous Guardaley cases, apparently has a unique talent.

As seen from the image below, the alleged infringements relating to Mr. Bethke’s case were carried out between June 25 and 28, 2016.

However, the declaration (pdf) filed with the Court on witness Macek’s behalf was signed and dated either June 14 or 16, more than a week before the infringements allegedly took place.

Time-traveler? Lynch thinks not.

“How can a witness sign a declaration that he observed something BEFORE it happened?” he writes.

“Criminal Productions submitted four such Declarations of Mr. Macek that were executed BEFORE the dates of the accompanying typed up list of observations that Mr. Macek swore that he made.

“Unless Daniel Macek is also Marty McFly, it is impossible to execute a declaration claiming to observe something that has yet to happen.”

So what could explain this strange phenomenon? Lynch believes he’s got to the bottom of that one too.

After comparing all four Macek declarations, he found that aside from the case numbers, the dates and signatures were identical. Instead of taking the issue of presenting evidence before the Court seriously, he believes Criminal Productions and partner Guardaley have been taking short cuts.

“From our review, it appears these metaphysical Macek declarations are not just temporally improper, they are also photocopies, including the signatures not separately executed,” he notes.

“We are astonished by your client’s foreign representatives’ apparent lack of respect for our federal judicial system. Use of duplicate signatures from a witness testifying to events that have yet to happen is on the same level of horror as the use of a fictitious witness and ‘his’ initials as a convenience to obtain subpoenas.”

Not entirely unexpectedly, five days later the case against Bethke and other defendants was voluntarily dismissed (pdf), indicating once again that like vampires, trolls do not like the light. Other lawyers defending similar cases globally should take note.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Thailand Arrests Brits Over Pirated Football Streams

Post Syndicated from Ernesto original https://torrentfreak.com/thailand-arrests-brits-over-pirated-football-streams-170517/

In recent years there’s been an increase in the availability of unlicensed TV streams, with vendors offering virtually any channel imaginable, for free or in exchange for a small fee.

Many of these IPTV packages are unlicensed. That makes them a lot cheaper to the end users, which explains why their popularity is growing.

While the phenomenon remained under the radar for a long time, more recently we have seen several raids on vendors who sell these ‘pirate’ subscriptions. After arrests in Spain and Poland, Thai authorities have also joined in.

Last week the Department of Special Investigation (DSI) arrested two British men, William Lloyd, 39 and William Robinson, 35, for their alleged involvement in selling unlicensed IPTV subscriptions. The pair were arrested together with 33-year-old local man, Supatra Raksasat.

The enforcement action followed a complaint from the Football Association Premier League Ltd (FAPL) and was made public yesterday. According to the authorities, the men sold pirate subscriptions to dozens of TV-channels through 365sport.tv.

365sport.tv

The website in question was taken offline and is no longer operational. However, cached versions show that the outfit sold subscriptions for 10 or 22 premium sports channels for a monthly fee of 600 Thai Baht ($17) or 999 Thai Baht ($29) respectively.

During the raids DSI, which is a special department of the Ministry of Justice, seized mobile phones, nine computer servers, nine computers, and a total of 49 set-top boxes, local media reports.

DSI deputy chief Suriya Singhakamol said that the men were also accused of offering unauthorized content through a variety of other sites targeted at expats, including Thaiexpat.tv, Hkexpat.tv, Indoexpat.tv, Vietexpat.tv, and Euroexpat.tv.

Following the Premier League complaint, DSI’s cybercrime unit launched a special investigation which found that 365sport.tv offered the unlicensed streams through Thai servers.

The authorities subsequently obtained arrest warrants through the Central Intellectual Property and International Trade Court.

While the case remains open, the two British suspects have been handed over to officials from the British embassy, which requested their bail. All unlicensed IPTV streams, meanwhile, are no longer online.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Texas Court Orders Temporary ‘Pre-Piracy’ Shutdown of Sports Streaming Sites

Post Syndicated from Ernesto original https://torrentfreak.com/texas-court-orders-temporary-pre-piracy-shutdown-of-sports-streaming-sites-170513/

Copyright holders often complain that they have virtually no means to target pirate sites, especially those run from overseas.

Interestingly, however, in recent months it has become apparent that the US Federal Court system can be used as a prime enforcement tool to shut down pirate domain names.

This is also the path Indian media outfit Times Content Limited (TCL) decided to go down. The company operates the cricket channel Willow TV and owns the US broadcasting rights to the Indian Premier League cricket tournament, which is currently ongoing.

Two weeks ago the company sued several sports streaming sites including smartcric.com and crickethdlive.com. These sites allow users to watch cricket games for free over the Internet, without permission.

To stop this from taking place, the Indian company requested a broad injunction, which the court granted last week.

The preliminary injunction (pdf) orders various third party providers to stop working with these sites effective immediately to prevent future copyright infringements. This also applies to any new domain names or websites the operators may launch.

“…all service providers whose services will enable or facilitate Defendants’ anticipated infringement are ordered to suspend all services with respect to smartcric.com, smartcric.eu, crickethdlive.com, and crickethdlive.pw, or any other website or domain that is redirected from the Websites and continues to distribute and publicly perform the 2017 IPL,” it reads.

Domain registries and registrars are not the only parties that are compelled to comply. It also lists a broad range of intermediaries including hosting companies, CDN services, advertising outfits, and streaming providers.

Where this order clearly differs from similar injunctions in the US is that it specifically targets “anticipated infringement.” Or put differently, it aims to prevent piracy before it takes place.

From the injunction

What stands out further is that the injunction is temporary in nature. It only applies while the Cricket tournament is active. This ends on May 22, after which the parties involved are free to lift or reverse the actions they took.

“For the avoidance of doubt, the Court’s intent is to ensure that Defendants’ Websites be rendered offline, inaccessible and incapable of receiving or displaying audio or video signals between the date of this order and 6:00 am. CDT on May 22, 2017,” the injunction reads.

Over the past few days several of the seized domain names have been placed in a Godaddy holding account belonging to the law firm that represents TCL. And per court order, they will stay there until said date.

That doesn’t mean, however, that the case is over after the tournament ends. In the complaint, TCL also requests damages and other punitive measures, which is something that has to be decided over at a later date.

TorrentFreak spoke to the operator of the streaming sites in question, who says that the lawsuit took him by surprise. After losing his initial domain names he registered several new ones, but these were swiftly taken down as well.

“I moved Smartcric.com to Smartcric.be and Crickethdlive.com to Crickethdlive.pw. However, both domains were suspended as well within a day. Later, I moved Crickethdlive content to Crickethdlive.to however that was suspended yesterday as well,” the operator says.

“It was shocking to see that non-US registries were following the order issued by a US court. It was unfair and unjust to comply with orders of a non-competent court by these registries.”

Interestingly, one of the domain names was registered through the domain name service Njalla, which Pirate Bay co-founder Peter Sunde recently launched. Sunde stresses that the domain was seized beyond their control and that no personal information was shared.

“We’re looking into the case at the moment, but the court took the domain and sent it to a legal firm. We have no way of going above the court and ICANN on this. However, we have of course not sent any information about the customer to anyone,” Sunde says.

The streaming site operator still doubts that he will get his domain names back after the injunction expires. Instead, he’s decided to focus his effort on finding a domain name that falls outside of the scope of the US courts.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

NO, Kodi Users Are Not Risking Ten Years in Prison

Post Syndicated from Andy original https://torrentfreak.com/no-kodi-users-are-not-risking-ten-years-in-prison-170507/

Piracy has always been a reasonably popular topic in the UK and there can barely be a person alive today who hasn’t either engaged in or been exposed to the phenomenon in some way. Just lately, however, things have really entered the mainstream.

The massive public interest is down to the set-top box craze, which is largely fueled by legal Kodi software augmented with infringing addons that provide free access to premium movies, TV channels and live sports.

While this a topic one might expect technology sites to report on, just recently UK tabloids have flooded the market with largely sensational stories about Kodi and piracy in general, which often recycle the same story time and again with SHOCKING click-bait headlines YOU JUST WON’T BELIEVE.

We’ve had to put up with misleading headlines and stories for months, so a while ago we made an effort to discuss the issues with tabloid reporters. Needless to say, we didn’t get very far. Most ignored our emails, but even those who responded weren’t prepared to do much.

One told us that his publication had decided that articles featuring Kodi were good for traffic while another promised to escalate our comments further up the chain of command. Within days additional articles with similar problems were being published regardless and this week things really boiled over.

10 Years for Kodi users? Hardly

The above report published in the Daily Express is typical of many doing the rounds at the moment. Taking Kodi as the popular search term, it shoe-horns the topic into areas of copyright law that do not apply to it, and ones certainly not covered by the Digital Economy Act cited in the headline.

As reported this week, the Digital Economy Act raises penalties for online copyright infringement offenses from two to ten years, but only in specific circumstances. Users streaming content to their homes via Kodi is absolutely not one of them.

To fall foul of the new law a user would need to communicate a copyrighted work to the public. In piracy terms that means ‘uploading’ and people streaming content via Kodi do nothing of the sort. The Digital Economy Act offers no remedy to deal with users streaming content – period – but let’s not allow the facts to get in the way of a click-inducing headline.

The Mirror has it wrong too

The Mirror article weaves in comments from Kieron Sharp from the Federation Against Copyright Theft. He notes that the new legislation should be targeted at people making a business out of infringement, which will hopefully be the case.

However, the article incorrectly extrapolates Sharp’s comments to mean that the law also applies to people streaming content via Kodi. Only making things more confusing, it then states that people “who casually stream a couple of movies every once in a while are extremely unlikely to be prosecuted to such extremes.”

Again, the Digital Economy Act has nothing to do with people streaming movies via Kodi but if we go along with the charade and agree that people who casually stream movies aren’t going to be prosecuted, why claim “10 year jail sentences for Kodi users” in the headline?

The bottom line is that there is nothing in the article itself that supports the article’s headline claim that Kodi users could go to jail for ten years. In itself, this is problematic from a reporting standpoint.

Published by IPSO, the Editors’ Code of Practice clearly states that “the Press must take care not to publish inaccurate, misleading or distorted information or images, including headlines not supported by the text.”

But singling out the Daily Express and The Mirror on this would be unfair. Dozens of other publications jumped on the same bandwagon, parroting the same misinformation, often with similar click-bait headlines.

For people dealing with these issues every day, the ins-and-outs of piracy alongside developing copyright law can be easier to grasp, so it’s perhaps a little unfair to expect general reporters to understand every detail of what can be extremely complex issues. Mistakes get made by everyone, that’s human nature.

But really, is there any excuse for headlines like this one published by the Sunday Express this morning?

According to the piece, readers of TorrentFreak are also at risk of spending ten years in prison. You couldn’t make this damaging nonsense up. Actually, apparently you can.

In addition to a lack of research, the problem here is the prevalence of click-bait headlines driving traffic and the inability of the underlying articles to live up to the hype. If we can moderate the headlines and report within them, the rest should simply fall into place. Ditch the NEEDLESS capital letters and stick to the facts.

Society in 2017 needs those more than ever.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

AWS Enables Consortium Science to Accelerate Discovery

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-enables-consortium-science-to-accelerate-discovery/

My colleague Mia Champion is a scientist (check out her publications), an AWS Certified Solutions Architect, and an AWS Certified Developer. The time that she spent doing research on large datasets gave her an appreciation for the value of cloud computing in the bioinformatics space, which she summarizes and explains in the guest post below!

Jeff;


Technological advances in scientific research continue to enable the collection of exponentially growing datasets that are also increasing in the complexity of their content. The global pace of innovation is now also fueled by the recent cloud-computing revolution, which provides researchers with a seemingly boundless, scalable, and agile infrastructure. Now, researchers can remove the hindrances of having to own and maintain their own sequencers, microscopes, compute clusters, and more. Using the cloud, scientists can easily store, manage, process, and share datasets for millions of patient samples, with gigabytes and more of data for each individual. As the American physicist John Bardeen once said: “Science is a collaborative effort. The combined results of several people working together is much more effective than could be that of an individual scientist working alone”.

Prioritizing Reproducible Innovation, Democratization, and Data Protection
Today, we have many individual researchers and organizations leveraging secure cloud-enabled data sharing on an unprecedented scale and producing innovative, customized analytical solutions using the AWS cloud. But can secure data sharing and analytics be done on such a collaborative scale as to revolutionize the way science is done across a domain of interest, or even across disciplines of science? Can building a cloud-enabled consortium of resources remove the analytical variability that leads to diminished reproducibility, which has long plagued the interpretability and impact of research discoveries? The answers to these questions are ‘yes’, and initiatives such as the Neuro Cloud Consortium, The Global Alliance for Genomics and Health (GA4GH), and The Sage Bionetworks Synapse platform, which powers many research consortiums including the DREAM challenges, are starting to put into practice model cloud initiatives that will not only provide impactful discoveries in the areas of neuroscience, infectious disease, and cancer, but are also revolutionizing the way in which scientific research is done.

Bringing Crowd Developed Models, Algorithms, and Functions to the Data
Collaborative projects have traditionally allowed investigators to download datasets such as those used for comparative sequence analysis or for training a deep learning algorithm on medical imaging data. Investigators were then able to develop and execute their analysis using institutional clusters, local workstations, or even laptops.

This method of collaboration is problematic for many reasons. The first concern is data security, since dataset download essentially permits “chain-data-sharing” with any number of recipients. Second, analyses run in compute environments that are not templated at some level risk producing results that cannot be reproduced by a different investigator, or even by the same investigator using a different compute environment. Third, the required data dump, processing, and then re-upload or distribution to the collaborative group is highly inefficient and dependent upon each individual’s networking and compute capabilities. Overall, traditional methods of scientific collaboration have introduced practices in which security is compromised and time to discovery is hampered.

Using the AWS cloud, collaborative researchers can share datasets easily and securely by taking advantage of Identity and Access Management (IAM) policy restrictions for user bucket access as well as S3 bucket policies or Access Control Lists (ACLs). To streamline analysis and ensure data security, many researchers are eliminating the necessity to download datasets entirely by leveraging resources that facilitate moving the analytics to the data source and/or taking advantage of remote API requests to access a shared database or data lake. One way our customers are accomplishing this is to leverage container-based Docker technology to provide collaborators with a way to submit algorithms or models for execution on the system hosting the shared datasets.
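
As a concrete illustration of the bucket-policy approach mentioned above, the following sketch grants a collaborator’s AWS account read-only access to a shared dataset bucket. The account ID and bucket name are placeholders, not references to any real project.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CollaboratorReadOnlyAccess",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111122223333:root" },
      "Action": [ "s3:GetObject", "s3:ListBucket" ],
      "Resource": [
        "arn:aws:s3:::shared-research-datasets",
        "arn:aws:s3:::shared-research-datasets/*"
      ]
    }
  ]
}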

Docker container images have all of the application’s dependencies bundled together, and therefore provide a high degree of versatility and portability, which is a significant advantage over using other executable-based approaches. In the case of collaborative machine learning projects, each docker container will contain applications, language runtime, packages and libraries, as well as any of the more popular deep learning frameworks commonly used by researchers including: MXNet, Caffe, TensorFlow, and Theano.

A common feature in these frameworks is the ability to leverage a host machine’s Graphical Processing Units (GPUs) for significant acceleration of the matrix and vector operations involved in the machine learning computations. As such, researchers with these objectives can leverage EC2’s new P2 instance types in order to power execution of submitted machine learning models. In addition, GPUs can be mounted directly to containers using the NVIDIA Docker tool and appear at the system level as additional devices. By leveraging Amazon EC2 Container Service and the EC2 Container Registry, collaborators are able to execute analytical solutions submitted to the project repository by their colleagues in a reproducible fashion as well as continue to build on their existing environment.  Researchers can also architect a continuous deployment pipeline to run their docker-enabled workflows.
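
For readers who want to see what this looks like in practice, a minimal sanity check with the NVIDIA Docker tool mentioned above might look like the following. The second command uses a hypothetical collaborator image name purely for illustration.

# Confirm the P2 instance's GPUs are visible from inside a container
nvidia-docker run --rm nvidia/cuda nvidia-smi

# Run a submitted model image against a shared data mount (hypothetical image name)
nvidia-docker run --rm -v /shared-data:/data consortium/mxnet-model:latest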

In conclusion, emerging cloud-enabled consortium initiatives serve as models for the broader research community for how cloud-enabled community science can expedite discoveries in Precision Medicine while also providing a platform where data security and discovery reproducibility is inherent to the project execution.

Mia D. Champion, Ph.D.

 

EC2 F1 Instances with FPGAs – Now Generally Available

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/ec2-f1-instances-with-fpgas-now-generally-available/

We launched the Developer Preview of the FPGA-equipped F1 instances at AWS re:Invent. The response to the announcement was quick and overwhelming! We received over 2000 requests for entry, and were able to provide over 200 developers with access to the Hardware Development Kit (HDK) and the actual F1 instances.

In the post that I wrote for re:Invent, I told you that:

This highly parallelized model is ideal for building custom accelerators to process compute-intensive problems. Properly programmed, an FPGA has the potential to provide a 30x speedup to many types of genomics, seismic analysis, financial risk analysis, big data search, and encryption algorithms and applications.

During the preview, partners and developers have been working on all sorts of exciting tools, services, and applications. I’ll tell you more about them in just a moment.

Now Generally Available
Today we are making the F1 instances generally available in the US East (Northern Virginia) Region, with plans to bring them to other regions before too long.

We continued to add features and functions during the preview, while also making the development tools more efficient and easier to use. Here’s a summary:

Developer Community – We launched the AWS FPGA Development Forum to provide a place for FPGA developers to hang out and to communicate with us and with each other.

HDK and SDK – We published the EC2 FPGA Hardware (HDK) and Software Development Kit to GitHub, and made many improvements in response to feedback that we received during the preview.

The improvements include support for VHDL (in addition to Verilog), an improved virtual lab environment (Virtual JTAG, Virtual LED, and Virtual DipSwitch), AWS libraries for FPGA management and the FPGA runtime, and support for OpenCL including the AWS OpenCL runtime library.

FPGA Developer AMI – This Marketplace AMI contains a full set of FPGA development tools including an RTL compiler and simulator, along with Xilinx SDAccel for OpenCL development, all tuned for use on C4, M4, and R4 instances.

FPGAs At Work
Here’s a sampling of the impressive work that our partners have been doing with the F1’s:

Edico Genome is deploying their DRAGEN Bio-IT Platform on F1 instances, with the expectation that it will provide whole-genome sequencing that runs in real time.

Ryft offers the Ryft Cloud, an accelerator for data analytics and machine learning that extends Elastic Stack. It sources data from Amazon Kinesis, Amazon Simple Storage Service (S3), Amazon Elastic Block Store (EBS), and local instance storage and uses massive bitwise parallelism to drive performance. The product supports high-level JDBC, ODBC, and REST interfaces along with low-level C, C++, Java, and Python APIs (see the Ryft API page for more information).

Reconfigure.io launched a cloud-based service that allows you to program FPGAs using the Go programming language. You can build, test, and deploy your code from within their cloud-based environment while taking advantage of concurrency-oriented language features such as goroutines (lightweight threads), channels, and selects.

NGCodec ported their RealityCodec video encoder to the F1 and used it to produce broadcast-quality video at 80 frames per second. Their solution can encode up to 32 independent video streams on a single F1 instance (read their new post, You Deserve Better than Grainy Giraffes, to learn more).

FPGAs In School & Research
Research groups and graduate classes at top-tier universities contacted us via AWS Educate and were eager to gain access to F1 instances.

UCLA‘s CS133 class (Parallel and Distributed Computing) is setting up an F1-based FPGA lab that will be operational within 3 or 4 weeks. According to UCLA Chancellor’s Professor Jason Cong, they are expanding multiple research projects to cover F1 including FPGA performance debugging, machine learning acceleration, Spark to FPGA compilation, and systolic array compilation.

Last month we announced that we are collaborating with the National Science Foundation (NSF) to foster innovation in big data research (read AWS Collaborates With the National Science Foundation to Foster Innovation to learn more and to find out how to apply for a grant).

FPGAs in the AWS Marketplace
As I shared in my original post, we have built a complete beginning to end solution that lets developers build FPGA-powered applications and services and list them in the AWS Marketplace. I can’t wait to see what kinds of cool things show up there!

Jeff;

Police Say “Criminal Gangs” Are Selling Pirate Media Players

Post Syndicated from Andy original https://torrentfreak.com/police-say-criminal-gangs-selling-pirate-media-players-170419/

For the millions of purist ‘pirates’ out there, obtaining free content online is a puzzle to be solved at home. Discovering the best sites, services, and tools is all part of the challenge and in order to keep things tidy, these should come at no cost too.

But for every self-sufficient pirate, there are dozens of other individuals who prefer not to get into the nuts and bolts of the activity but still want to enjoy the content on offer. It is these people that are reportedly fueling a new crime wave sweeping the streets, from the United States, through Europe, and beyond.

IPTV – whether that’s a modified Kodi setup or a subscription service – is now considered by stakeholders to be a major piracy threat and when people choose to buy ready-built devices, they are increasingly enriching “criminal gangs” who have moved in to make money from the phenomenon.

That’s the claim from Police Scotland, who yesterday held a seminar at Scottish Police College to discuss emerging threats in intellectual property crime. The event was attended by experts from across Europe, including stakeholders, Trading Standards, HM Revenue & Customs, and the UK Intellectual Property Office.

“The illegal use of Internet protocol television has risen by 143% in the past year and is predominantly being carried out online. This involves the uploading of streams, server hosting and sales of pre-configured devices,” Scottish Police said in a statement.

The conference was billed as an “opportunity to share ideas, knowledge and investigative techniques” that address this booming area of intellectual property infringement, increasingly being exploited by people looking to make a quick buck. The organized sale of Android-style set-top boxes pre-configured for piracy is being seen as a prime example.

In addition to eBay and Amazon sales, hundreds of adverts are being placed both online and in traditional papers by people selling devices already setup with Kodi and the necessary addons.

“Crime groups and criminals around Scotland are diversifying into what’s seen as less risk areas,” Chief Inspector Mark Leonard explains.

It goes without saying that both police and copyright holders are alarmed by the rise in sales of these devices. However, even the people who help to keep the ‘pirate’ addons maintained and circulated have a problem with it too.

“In my opinion, the type of people attracted to selling something like a preloaded Kodi box aren’t very educated and generally lean towards crooked or criminal activity,” Eleazar of the hugely popular TVAddons repository informs TorrentFreak.

“These box sellers bring people to our community who should never have used Kodi in the first place, people who feel they are owed something, people who see Kodi only as a piracy tool, and people who don’t have the technical aptitude to maintain their Kodi device themselves.”

But for sellers of these devices, that’s exactly why they exist – to help out people who would otherwise struggle to get a Kodi-enabled box up and running. However, there are clear signs that these sellers are feeling the heat and slowly getting the message that their activities could attract police attention.

On several occasions TorrentFreak has contacted major sellers of these devices for comment but none wish to go on the record. Smaller operators, such as those selling a few boxes on eBay, are equally cautious. One individual, who is already on police radar, insists that it’s not his fault that business is booming.

“Sky and the Premier League charge too much. It’s that simple,” he told TF.

“Your average John gives you a few quid and takes [the device] and plugs it in. Job done. How is that different from getting a mate to do it for you, apart from the drink?”

To some extent, Internet piracy has traditionally been viewed as a somewhat ‘geeky’ activity, carried out by the tech-savvy individual with a little know-how. However, the shift from the bedroom to the living room – fueled by box suppliers – has introduced a whole new audience to the activity.

“This is now seen as being normalized,” says Chief Inspector Mark Leonard.

“A family will sit and watch one of these IPTV devices. There’s also a public perception that this is a commodity which is victimless. Prevention is a big part of this so we need to change attitudes and behaviours of people that this damages the creative industries in Scotland as well.”

As things stand, everything points to the controversy over these devices being set to continue. Despite being under attack from all sides, their convenience and bargain-basement pricing means they will remain a hit with fans. This is one piracy battle set to rage for some time.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

A History of Removable Computer Storage

Post Syndicated from Peter Cohen original https://www.backblaze.com/blog/history-removable-computer-storage/

A History of Removable Storage

Almost from the start we’ve had a problem with computers: They create and consume more data than we can economically store. Hundreds of companies have been created around the need for more computer storage. These days if we need space we can turn to cloud services like our own B2 Cloud Storage, but it hasn’t always been that way. The history of removable computer storage is like the history of hard drives: A fascinating look into the ever-evolving technology of data storage.

The Birth of Removable Storage

Punch Cards

punch card

Before electronic computers existed, there were electrical, mechanical computing devices. Herman Hollerith, a U.S. census worker interested in simplifying the laborious process of tabulating census data, made a device that read information from rectangular cards with holes punched in particular locations to indicate information like marital status and age.

Hollerith’s cards long outlasted him and his machine. With the advent of electronic computers in the 1950s, punch cards became the de facto method of data input. The conventions introduced with punch cards, such as an 80 column width, affected everything from the way we’d make computer monitors to the format of text files for decades.

Open-Reel Tapes and Magnetic Cartridges

IBM 100 tape drive

Magnetic tape drives were standard issue for the mainframes and minicomputers used by businesses and other organizations from the advent of the computer industry in the 1950s up until the 1980s.

Tape drives started out on 10 1/2-inch reels. A thin metal strip recorded data magnetically. Watch any television program of this era and the scene with a computer will show you a device like this. The nine-track tapes developed by IBM for its computers could store up to 175 MB per tape. At the time, that was a tremendous amount of data, suitable for archiving days or weeks’ worth of data. These days 175 MBs might be enough to store a few dozen photos from your smartphone. Times have changed!

Eventually the big reel to reel systems would be replaced with much more portable, easier-to-use, and higher density magnetic tape cartridges. Mag tapes for data backup found their way into PCs in the 80s and 90s, though they, too, would be replaced by other removable media systems like CD-R burners.

Linear Tape-Open (LTO) made its debut in the late 1990s. These digital tape cartridges could store 100 GB each, making them ideal for backing up servers and archiving big projects. Since then, capacity has improved to 6.0 TB per tape. There’s still a demand for LTO data archival systems today. However, tape drives are nearing the end of their usefulness as better cloud options take over the backup and archival markets. Our own B2 Cloud Storage is rapidly making LTO a thing of the past.

Burning LTO

Winchester Drives

IBM 3340 Winchester drive

Spinning hard disk drives started out as huge refrigerator-sized boxes attached to mainframe computers. As more businesses found uses for computers, the need for storage increased, but allowable floor space did not. IBM’s solution for this problem came in the early 1970s: the IBM 3340, popularly known as a Winchester.

The 3340 sported removable data modules that contained hard drive platters which could store up to 70 MB. Instead of having to buy a whole new cabinet, companies leasing equipment from IBM could buy additional data modules to increase their storage capabilities.

From the start, the 3340 was a smashing success (okay, maybe smashing isn’t the best adjective to use when describing a hard drive, but you get the point). You could find these and their descendants connected to mainframes and minicomputers in corporate data centers throughout the 1970s and into the 1980s.

The Birth of the PC Brings New Storage Solutions

Cassette Recorder

TRS-80 w cassette drive

The 1970s saw another massive evolution of computers with the introduction of first generation personal computers. The first PCs lacked any built-in permanent storage. Hard disk drives were still very expensive. Even floppy disk drives were rare at the time. When you turned the computer off, you’d lose your data, unless you had something to store it on.

The solution that the first PC makers came up with was to use a cassette recorder. Cassette tapes had exploded in the consumer electronics market as a convenient and inexpensive way for people to record and listen to music and to use for voice dictation. At a time when long-distance phone calls were an expensive luxury, it was the original FaceTime for some of us, too: I remember, as a preschooler, recording and playing cassettes to stay in touch with my grandparents on the other side of the country.

So using a cassette recorder to store computer data made sense. The devices were already commonplace and relatively inexpensive. Type in a save command, and the computer played tones through a cable connected to the tape drive to differentiate binary 0s and 1s. Type in a load command, and you could play back the tape to read the program into memory. It was very slow. But it was better than nothing.

Floppy Disk

Commodore 1541

The 1970s saw the rise of the floppy disk, the portable storage format that ultimately reigned supreme for decades. The earliest models of floppy disks were eight inches in diameter and could hold about 80 KB. Eight-inch drives were more common in corporate computing, but when floppies came to personal computers, the smaller 5 1/4-inch design caught on like wildfire.

Floppy disks became commonplace alongside the Apples and Commodores of the day. You could squeeze about 120 KB onto one of those puppies. Doesn’t sound like a lot, but it was plenty of space for Apple DOS and Lode Runner.

Apple popularized the 3 1/2-inch size when it introduced the Macintosh in 1984. By the late 1980s the smaller floppy disk size – which would ultimately store 1.44 MB per disk – was the dominant removable storage medium of the day. And so it would remain for decades.

The Bernoulli Box

Bernoulli Box

In the early 1980s, a new product called the Bernoulli Box offered the convenience of removable cartridges like Winchester drives, but in a much smaller, more portable format. The Bernoulli Box was an important removable storage device for businesses that had transitioned from expensive mainframes and minicomputers to desktops.

Bernoulli cartridges worked on the same principle as floppies but were larger and in a much more shielded enclosure. The cartridges sported larger capacities than floppy disks, too. You could store 10 MB or 20 MB instead of the 1.44 MB limit on a floppy disk. Capacities would increase over time to 230 MB. Bernoulli Boxes and the cartridges were expensive, which kept them in the realm of business storage. Iomega, the Bernoulli Box’s creator, turned its attention to an enormously popular removable storage system you’ll read about later: the Zip drive.

SyQuest Disks

SyQuest drive

In the 1990s another removable storage device made its mark in the computer industry. SyQuest developed a removable storage system that used 44 MB (and later 88 MB) hard disk platters. SyQuest drives were mainstays of creative digital markets – I saw them almost anywhere I could find a Mac doing graphic design, desktop publishing, music, or video work.

SyQuest would be a footnote by the late 90s as Zip disks, recordable CDs and other storage media overtook them. Speaking of Zip disks…

The Click of Death

Zip Drive

The 1990s were a transitional period for personal computing (well, when isn’t it, really). Information density was increasing rapidly. We were still years away from USB thumb drives and ubiquitous high-speed Wi-Fi, so “sneakernet” – physically transporting information from one computer to another – was still the preferred way to get big projects back and forth. Floppy drives were too small, hard disks weren’t portable, and rewritable CDs were expensive.

Iomega came along with the Zip Drive, a removable storage system that used disks shaped like heavier-duty floppies, each capable of storing up to 100 MB. A high-density floppy could store 1.4 MB or so, so a Zip disk offered roughly 70 times as much portable storage. Zip Disks quickly became popular, but Iomega eventually redesigned them to lower the cost of manufacturing. The redesign came with a price: The drives failed more frequently and could damage the disk in the process.

The phenomenon became known as the Click of Death: The sound the actuator (the part with the read/write head) would make as it reset after hitting a damaged sector on the disk. Iomega would eventually settle a class-action lawsuit over the issue, but consumers were already moving away from the format.

Iomega developed a successor to the Zip drive: the Jaz drive. When it first came out, it could store 1 GB on a removable cartridge. Inside the cartridge was a spinning hard disk mechanism, not unlike the SyQuest drives that had been popular a few years earlier, but small enough to fit easily into a jacket pocket. Unfortunately, the Jaz drive developed reliability problems of its own – disks would get jammed in the drives, drives overheated, and some had vibration problems.

Recordable CDs and DVDs

Apple SuperDrive

As a storage medium, Compact Discs had been around since the 1980s, mainly popular as a music listening format. CD recorders existed almost from the beginning, but they were ridiculously huge and expensive: the size of a washing machine, with price tags in the tens of thousands of dollars. By the late 1990s the technology had improved, prices had dropped, and CD burners and recordable CD-R media became commonplace.

With our ever-increasing need for more storage, we moved on to DVD-R and DVD-RW systems within a few years, upping the total you could store per disc to 4.7 GB (eventually up to 8.5 GB per disc once dual-layer media and burners were introduced).

Blu-ray Disc offers even greater storage capacity and is popular in the home entertainment market, so some PCs have added recordable Blu-ray drives. Blu-ray sports capacities from 25 GB to 128 GB per disc depending on the format. Increasingly, though, even optical drives have become optional accessories as we’ve slimmed down our laptop computers to improve portability.

Magneto-Optical

Magneto-optical disk

Another optical format, Magneto-Optical (MO), was used on some computer systems in the ’80s and ’90s. It would also find its way into consumer products. A common cartridge capacity was 650 MB. Early systems could only write to a disc once, but later ones were rewritable.

The NeXT Computer, from NeXT, the other computer maker Steve Jobs founded besides Apple, was the earliest desktop system to feature an MO drive as standard equipment. Magneto-optical drives were available in 5 1/4-inch and 3 1/2-inch physical sizes, with capacities up to 9 GB per disc. The most popular consumer incarnation of magneto-optical storage is Sony’s MiniDisc.

Removable Storage Moves Beyond Computers

SD Cards


The most recent removable media format to see widespread adoption on personal computers is the Secure Digital (SD) card. SD cards have become the de facto standard for many smartphones, still cameras, and video cameras. Some variations can also serve up data securely thanks to password protection, the smartSD protocol, and Near Field Communication (NFC) support.

With no moving parts and non-volatile flash memory inside, SD cards are reliable, quiet and relatively fast methods of transporting and archiving data. What’s more, they come in different physical sizes to suit different device applications – everything from postage stamp-sized cards found in digital cameras to fingernail-sized micro cards found in phones.

Even compared to optical media like Blu-ray Discs, SD card capacities are remarkable. 128 GB and 256 GB cards are commonplace now. What’s more, the SDXC spec maxes out at 2 TB, with bus speeds fast enough to support 8K video capture. So there’s headroom for both performance and capacity.
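For a rough sense of scale, here’s a quick back-of-the-envelope comparison in Python, using the approximate capacities cited in this article (nominal marketing capacities, not formatted sizes), of how many of each older medium it would take to equal that 2 TB ceiling:

# Rough comparison of the SDXC spec's 2 TB ceiling against older removable
# media, using the approximate capacities cited in this article.
CAPACITIES_GB = {
    "3 1/2-inch floppy (1.44 MB)": 0.00144,
    "Zip disk (100 MB)": 0.1,
    "Jaz cartridge (1 GB)": 1.0,
    "single-layer DVD-R (4.7 GB)": 4.7,
    "single-layer Blu-ray (25 GB)": 25.0,
}

SDXC_MAX_GB = 2000  # 2 TB, the SDXC specification's maximum capacity

for name, capacity_gb in CAPACITIES_GB.items():
    count = SDXC_MAX_GB / capacity_gb
    print(f"One 2 TB SDXC card ~= {count:,.0f} x {name}")

By that math, a single maxed-out SDXC card stands in for well over a million floppy disks.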

The More Things Change

As computer hardware continues to improve and as we continue to demand higher performance and greater portability and convenience, portable media will change. But as we’ve found ourselves with ubiquitous, high-speed Internet connectivity, the very need for removable local storage has diminished. Now instead of archiving data on an external cartridge, disc or card, we can just upload it to the cloud and access it anywhere.

That doesn’t obviate the need for a good backup strategy, of course. It’s vital to keep your important files safe with a local archive or backup. For that, removable media like SD cards and rewritable DVDs and even external hard drives can continue to fill an important role. Remember to store your info offsite too, preferably with a continuous, secure and reliable backup method like Backblaze Cloud Backup: Unlimited, unthrottled and easy to use.

The post A History of Removable Computer Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Friday Squid Blogging: Squid Can Edit Their Own RNA

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/04/friday_squid_bl_573.html

This is just plain weird:

Rosenthal, a neurobiologist at the Marine Biological Laboratory, was a grad student studying a specific protein in squid when he got an inkling that some cephalopods might be different. Every time he analyzed that protein’s RNA sequence, it came out slightly different. He realized the RNA was occasionally substituting I’s for A’s, and wondered if squid might apply RNA editing to other proteins. Rosenthal joined Tel Aviv University bioinformaticists Noa Liscovitch-Brauer and Eli Eisenberg to find out.

In results published today, they report that the family of intelligent mollusks that includes squid, octopuses, and cuttlefish features thousands of RNA editing sites in its genes. Where the genetic material of humans, insects, and other multi-celled organisms reads like a book, the squid genome reads more like a Mad Lib.

So why do these creatures engage in RNA editing when most others largely abandoned it? The answer seems to lie in some crazy double-stranded cloverleaves that form alongside editing sites in the RNA. That information is like a tag for RNA editing. When the scientists studied octopuses, squid, and cuttlefish, they found that these species had retained those vast swaths of genetic information at the expense of making the small changes that facilitate evolution. “Editing is important enough that they’re forgoing standard evolution,” Rosenthal says.

He hypothesizes that the development of a complex brain was worth that price. The researchers found many of the edited proteins in brain tissue, creating the elaborate dendrites and axons of the neurons and tuning the shape of the electrical signals that neurons pass. Perhaps RNA editing, adopted as a means of creating a more sophisticated brain, allowed these species to use tools, camouflage themselves, and communicate.

Yet more evidence that these bizarre creatures are actually aliens.

Three more articles. Academic paper.

As usual, you can also use this squid post to talk about the security stories in the news that I haven’t covered.

Read my blog posting guidelines here.