Tag Archives: launch

Pirate Party Wins Big in Czech Parliament Elections

Post Syndicated from Ernesto original https://torrentfreak.com/pirate-party-wins-big-in-czech-parliament-elections-171023/

The Czech Pirates have made quite a name for themselves in recent years.

The political party previously took on a local anti-piracy outfit by launching their own movie download sites, making the point that linking is not a crime.

The bold move resulted in a criminal investigation, but the case was eventually dropped after it was deemed that the Pirates acted in accordance with EU law.

In the political arena, the Czech Pirate Party achieved several successes as well. In parliamentary elections, however, the party never managed to clear the required threshold. Until this weekend.

With 10.79% of the total vote, the Pirates won 22 seats in the national parliament. Not only that, they also became the third-largest political party in the country, in an election in which more than 30 parties participated.

The Czech Republic becomes the fourth country where a Pirate Party is represented in the national parliament, following Sweden, Germany, and Iceland, which is quite an achievement.

“It is the best result of any Pirate Party in history and gives us a great mandate to transform the dynamics of Czech politics. At the same time, we understand this as a huge responsibility towards the voters and the Pirate movement as a whole,” Tomáš Vymazal, one of the new Members of Parliament, tells TorrentFreak.

The Pirates

While there were some celebrations after the election result came in, the Czech Pirate Party is moving full steam ahead. The twenty-two newly elected members have already held their first meeting, discussing how to get the most out of their negotiations with other parties.

“The negotiation team has been established and the club’s chairman was elected. We’ll now need to set up our offices, hire assistants and distribute specific responsibilities among the club,” Vymazal says.

“One of the first issues we will open up a discussion about is how parliament will be fixing an historic anti-corruption bill.”

The bill in question makes sure that every contract the state or a state-owned business enters into is put on the record. However, the previous parliament introduced several exceptions and as a result, many of the money flows remain hidden from the public.

Like other Pirate parties, the Czech branch is by no means a single-issue outfit. The party has a broad vision which it distilled into a twenty-point program. In addition to fighting corruption, this includes tax reform and increasing teachers’ salaries, for example.

More classical pirate themes are also on the agenda of course. The Pirate Party wants to overhaul the country’s copyright legislation, stop internet censorship, and put an end to cell phone tracking. In addition, the use of medical marijuana should be allowed.

With the backing of hundreds of thousands of Czechs, these and other issues will certainly be on the political agenda during the years to come. It’s now up to the Pirates to make them a reality.

“We must do a very good job to successfully establish the Pirate Party in Czech politics and deliver on the promises we made to the voters,” Vymazal says.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

OSSIM Download – Open Source SIEM Tools & Software

Post Syndicated from Darknet original https://www.darknet.org.uk/2017/10/ossim-download-open-source-siem-tools-software/?utm_source=rss&utm_medium=social&utm_campaign=darknetfeed


OSSIM is a popular open source Security Information and Event Management (SIEM) product, providing event collection, normalization, and correlation.

OSSIM stands for Open Source Security Information Management. It was launched in 2003 by security engineers because of the lack of available open source products. OSSIM was created specifically to address the reality many security professionals face: a SIEM, whether open source or commercial, is virtually useless without the basic security controls necessary for security visibility.

Read the rest of OSSIM Download – Open Source SIEM Tools & Software now! Only available at Darknet.

Denuvo DRM Cracked within a Day of Release

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/10/denuvo_drm_crac.html

Denuvo is probably the best digital rights management system used to protect computer games. It’s regularly cracked within a day.

If Denuvo can no longer provide even a single full day of protection from cracks, though, that protection is going to look a lot less valuable to publishers. But that doesn’t mean Denuvo will stay effectively useless forever. The company has updated its DRM protection methods with a number of “variants” since its rollout in 2014, and chatter in the cracking community indicates a revamped “version 5” will launch any day now. That might give publishers a little more breathing room where their games can exist uncracked and force the crackers back to the drawing board for another round of the never-ending DRM battle.

BoingBoing post.

Related: Vice has a good history of DRM.

Steal This Show S03E09: Learning To Love Your Panopticon

Post Syndicated from Ernesto original https://torrentfreak.com/steal-show-s03e09-learning-love-panopticon/

If you enjoy this episode, consider becoming a patron and getting involved with the show. Check out Steal This Show’s Patreon campaign: support us and get all kinds of fantastic benefits!

In this episode we meet Diani Barreto from the Berlin Bureau of ExposeFacts. Launched in June 2014, ExposeFacts.org supports and encourages whistleblowers to disclose information that citizens need to make truly informed decisions in a democracy.

ExposeFacts aims to shed light on concealed activities that are relevant to human rights, corporate malfeasance, the environment, civil liberties and war.

Steal This Show aims to release bi-weekly episodes featuring insiders discussing copyright and file-sharing news. It complements our regular reporting by adding more room for opinion, commentary, and analysis.

The guests for our news discussions will vary, and we’ll aim to introduce voices from different backgrounds and persuasions. In addition to news, STS will also produce features interviewing some of the great innovators and minds.

Host: Jamie King

Guest: Diani Barreto

Produced by Jamie King
Edited & Mixed by Riley Byrne
Original Music by David Triana
Web Production by Siraje Amarniss

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

US Senators Ask Apple Why VPN Apps Were Removed in China

Post Syndicated from Andy original https://torrentfreak.com/us-senators-ask-apple-why-vpn-apps-were-removed-in-china-171020/

As part of what is now clearly a crackdown on Great Firewall-evading tools and services, Chinese government pressure reached technology giant Apple during the summer.

On or around July 29, Apple removed many of the most-used VPN applications from its Chinese app store. In a short email from the company, VPN providers were informed that VPN applications are considered illegal in China.

“We are writing to notify you that your application will be removed from the China App Store because it includes content that is illegal in China, which is not in compliance with the App Store Review Guidelines,” Apple informed the affected VPNs.

Apple’s email to VPN providers

Now, in a letter sent to Apple CEO Tim Cook, US senators Ted Cruz and Patrick Leahy express concern at the move by Apple, noting that if reports of the software removals are true, the company could be assisting China’s restrictive approach to the Internet.

“VPNs allow users to access the uncensored Internet in China and other countries that restrict Internet freedom. If these reports are true, we are concerned that Apple may be enabling the Chinese government’s censorship and surveillance of the Internet.”

Describing China as a country with “an abysmal human rights record, including with respect to the rights of free expression and free access to information, both online and offline”, the senators cite Reporters Without Borders who previously labeled the country as “the enemy of the Internet”.

While senators Cruz and Leahy go on to praise Apple for its contribution to the spread of information, they criticize the company for going along with the wishes of the Chinese government as it seeks to suppress knowledge and communication.

“While Apple’s many contributions to the global exchange of information are admirable, removing VPN apps that allow individuals in China to evade the Great Firewall and access the Internet privately does not enable people in China to ‘speak up’,” the senators write.

“To the contrary, if Apple complies with such demands from the Chinese government it inhibits free expression for users across China, particularly in light of the Cyberspace Administration of China’s new regulations targeting online anonymity.”

In January, a notice published by China’s Ministry of Industry and Information Technology said that the government had indeed launched a 14-month campaign to crack down on local ‘unauthorized’ Internet platforms.

This means that all VPN services have to be pre-approved by the Government if they want to operate in China. And the aggression against VPNs and their providers didn’t stop there.

In September, a Chinese man who sold Great Firewall-evading VPN software via a website was sentenced to nine months in prison by a Chinese court. Just weeks later, a software developer who set up a VPN for his own use but later sold access to the service was arrested and detained for three days.

This emerging pattern is clearly a concern for the senators, who are now demanding that Tim Cook respond to ten questions (pdf), including whether Apple raised concerns about China’s VPN removal demands and details of how many apps were removed from its store. The senators also want to see copies of any pro-free speech statements Apple has made in China.

Whether the letter will make any difference on the ground in China remains to be seen, but the public involvement of the senators and technology giant Apple is certain to thrust censorship and privacy further into the public eye.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Introducing Cost Allocation Tags for Amazon SQS

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/introducing-cost-allocation-tags-for-amazon-sqs/

You have long had the ability to tag your AWS resources and to see cost breakouts on a per-tag basis. Cost allocation was launched in 2012 (see AWS Cost Allocation for Customer Bills) and we have steadily added support for additional services, most recently DynamoDB (Introducing Cost Allocation Tags for Amazon DynamoDB), Lambda (AWS Lambda Supports Tagging and Cost Allocations), and EBS (New – Cost Allocation for AWS Snapshots).

Today, we are launching tag-based cost allocation for Amazon Simple Queue Service (SQS). You can now assign tags to your queues and use them to manage your costs at any desired level: application, application stage (for a loosely coupled application that communicates via queues), project, department, or developer. After you have tagged your queues, you can use the AWS Tag Editor to search for queues that have tags of interest.

Here’s how I would add three tags (app, stage, and department) to one of my queues:
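The tags can also be applied programmatically. Here is a minimal sketch using boto3; the queue URL, account ID, and tag values below are illustrative placeholders rather than values from the original post:

import boto3

sqs = boto3.client('sqs', region_name='us-west-2')

# Illustrative queue URL; replace with the URL of your own queue
queue_url = 'https://sqs.us-west-2.amazonaws.com/123456789012/my-queue'

# Apply the three cost allocation tags to the queue
sqs.tag_queue(
    QueueUrl=queue_url,
    Tags={'app': 'my-app', 'stage': 'prod', 'department': 'engineering'}
)

# Confirm the tags that are now attached to the queue
print(sqs.list_queue_tags(QueueUrl=queue_url).get('Tags'))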

This feature is available now in all AWS Regions and you can start using it today! To learn more about tagging, read Tagging Your Amazon SQS Queues. To learn more about cost allocation via tags, read Using Cost Allocation Tags. To learn more about how to use message queues to build loosely coupled microservices for modern applications, read our blog post (Building Loosely Coupled, Scalable, C# Applications with Amazon SQS and Amazon SNS) and watch the recording of our recent webinar, Decouple and Scale Applications Using Amazon SQS and Amazon SNS.

If you are coming to AWS re:Invent, plan to attend session ARC 330: How the BBC Built a Massive Media Pipeline Using Microservices. In the talk you will find out how they used SNS and SQS to improve the elasticity and reliability of the BBC iPlayer architecture.

Jeff;

Federate Database User Authentication Easily with IAM and Amazon Redshift

Post Syndicated from Thiyagarajan Arumugam original https://aws.amazon.com/blogs/big-data/federate-database-user-authentication-easily-with-iam-and-amazon-redshift/

Managing database users through federation allows you to manage authentication and authorization procedures centrally. Amazon Redshift now supports database authentication with IAM, enabling user authentication through enterprise federation. There's no need to manage separate database users and passwords, which further eases database administration. You can now manage users outside of AWS and authenticate them for access to an Amazon Redshift data warehouse. Do this by integrating IAM authentication and a third-party SAML-2.0 identity provider (IdP), such as AD FS, PingFederate, or Okta. In addition, database users can also be automatically created at their first login based on corporate permissions.

In this post, I demonstrate how you can extend the federation to enable single sign-on (SSO) to the Amazon Redshift data warehouse.

SAML and Amazon Redshift

AWS supports Security Assertion Markup Language (SAML) 2.0, which is an open standard for identity federation used by many IdPs. SAML enables federated SSO, which enables your users to sign in to the AWS Management Console. Users can also make programmatic calls to AWS API actions by using assertions from a SAML-compliant IdP. For example, if you use Microsoft Active Directory for corporate directories, you may be familiar with how Active Directory and AD FS work together to enable federation. For more information, see the Enabling Federation to AWS Using Windows Active Directory, AD FS, and SAML 2.0 AWS Security Blog post.

Amazon Redshift now provides the GetClusterCredentials API operation that allows you to generate temporary database user credentials for authentication. You can set up an IAM permissions policy that generates these credentials for connecting to Amazon Redshift. Extending the IAM authentication, you can configure the federation of AWS access through a SAML 2.0–compliant IdP. An IAM role can be configured to permit federated users to call the GetClusterCredentials action and generate temporary credentials to log in to Amazon Redshift databases. You can also set up policies to restrict access to Amazon Redshift clusters, databases, database user names, and user groups.
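As a quick illustration of what this API returns, here is a minimal boto3 sketch that requests temporary credentials. The cluster, database, user, and group names are the ones used in the walkthrough below; the full end-to-end connection example appears in Step 7:

import boto3

redshift = boto3.client('redshift', region_name='us-west-2')

# Request short-lived credentials for a database user, creating the user if needed
creds = redshift.get_cluster_credentials(
    ClusterIdentifier='examplecorp-dw',
    DbName='analytics',
    DbUser='nancy',
    DbGroups=['sales_grp'],
    AutoCreate=True,
    DurationSeconds=900
)

# The temporary DbUser/DbPassword pair can now be used to open a SQL connection
print(creds['DbUser'], creds['Expiration'])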

Amazon Redshift federation workflow

In this post, I demonstrate how you can use a JDBC- or ODBC-based SQL client to log in to the Amazon Redshift cluster using this feature. The SQL clients used with Amazon Redshift JDBC or ODBC drivers automatically manage the process of calling the GetClusterCredentials action, retrieving the database user credentials, and establishing a connection to your Amazon Redshift database. You can also use your database application to programmatically call the GetClusterCredentials action, retrieve database user credentials, and connect to the database. I demonstrate these features using an example company to show how different database user accounts can be managed easily using federation.

The following diagram shows how the SSO process works:

  1. JDBC/ODBC
  2. Authenticate using Corp Username/Password
  3. IdP sends SAML assertion
  4. Call STS to assume role with SAML
  5. STS Returns Temp Credentials
  6. Use Temp Credentials to get Temp cluster credentials
  7. Connect to Amazon Redshift using temp credentials

Walkthrough

Example Corp. is using Active Directory (IdP host: demo.examplecorp.com) to manage federated access for users in its organization. It has AWS account 123456789012 and currently manages an Amazon Redshift cluster with the cluster ID "examplecorp-dw" and the database "analytics" in the us-west-2 region for its Sales and Data Science teams. It wants the following access:

  • Sales users can access the examplecorp-dw cluster using the sales_grp database group
  • Sales users access examplecorp-dw through a JDBC-based SQL client
  • Sales users access examplecorp-dw through an ODBC connection, for their reporting tools
  • Data Science users access the examplecorp-dw cluster using the data_science_grp database group.
  • Partners access the examplecorp-dw cluster and query using the partner_grp database group.
  • Partners are not federated through Active Directory and are provided with separate IAM user credentials (with IAM user name examplecorpsalespartner).
  • Partners can connect to the examplecorp-dw cluster programmatically, using a language such as Python.
  • All users are automatically created in Amazon Redshift when they log in for the first time.
  • (Optional) Internal users do not specify database user or group information in their connection string. It is automatically assigned.
  • Data warehouse users can use SSO for the Amazon Redshift data warehouse using the preceding permissions.

Step 1:  Set up IdPs and federation

The Enabling Federation to AWS Using Windows Active Directory post demonstrated how to prepare Active Directory and enable federation to AWS. Using those instructions, you can establish trust between your AWS account and the IdP and enable user access to AWS using SSO.  For more information, see Identity Providers and Federation.

For this walkthrough, assume that this company has already configured SSO to their AWS account: 123456789012 for their Active Directory domain demo.examplecorp.com. The Sales and Data Science teams are not required to specify database user and group information in the connection string. The connection string can be configured by adding SAML Attribute elements to your IdP. Configuring these optional attributes enables internal users to conveniently avoid providing the DbUser and DbGroup parameters when they log in to Amazon Redshift.

The user-name attribute can be set up as follows, with a user ID (for example, nancy) or an email address (for example, nancy@examplecorp.com):

<Attribute Name="https://redshift.amazon.com/SAML/Attributes/DbUser">  
  <AttributeValue>user-name</AttributeValue>
</Attribute>

The AutoCreate attribute can be defined as follows:

<Attribute Name="https://redshift.amazon.com/SAML/Attributes/AutoCreate">
    <AttributeValue>true</AttributeValue>
</Attribute>

The sales_grp database group can be included as follows:

<Attribute Name="https://redshift.amazon.com/SAML/Attributes/DbGroups">
    <AttributeValue>sales_grp</AttributeValue>
</Attribute>

For more information about attribute element configuration, see Configure SAML Assertions for Your IdP.

Step 2: Create IAM roles for access to the Amazon Redshift cluster

The next step is to create IAM policies with permissions to call GetClusterCredentials and provide authorization for Amazon Redshift resources. To grant a SQL client the ability to retrieve the cluster endpoint, region, and port automatically, include the redshift:DescribeClusters action with the Amazon Redshift cluster resource in the IAM role.  For example, users can connect to the Amazon Redshift cluster using a JDBC URL without the need to hardcode the Amazon Redshift endpoint:

Previous:  jdbc:redshift://endpoint:port/database

Current:  jdbc:redshift:iam://clustername:region/dbname

Use IAM to create the following policies. You can also use an existing user or role and assign these policies. For example, if you already created an IAM role for IdP access, you can attach the necessary policies to that role. Here is the policy created for sales users for this example:

Sales_DW_IAM_Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "redshift:DescribeClusters"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:GetClusterCredentials"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw",
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:userid": "AIDIODR4TAW7CSEXAMPLE:${redshift:DbUser}@examplecorp.com"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:CreateClusterUser"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:JoinGroup"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/sales_grp"
            ]
        }
    ]
}

The policy uses the following parameter values:

  • Region: us-west-2
  • AWS Account: 123456789012
  • Cluster name: examplecorp-dw
  • Database group: sales_grp
  • IAM role: AIDIODR4TAW7CSEXAMPLE

Each statement in the policy is described below.

{
    "Effect": "Allow",
    "Action": [
        "redshift:DescribeClusters"
    ],
    "Resource": [
        "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw"
    ]
}

Allows users to retrieve the cluster endpoint, region, and port automatically for the Amazon Redshift cluster examplecorp-dw. This specification uses the resource format arn:aws:redshift:region:account-id:cluster:clustername. For example, the SQL client JDBC URL can be specified in the format jdbc:redshift:iam://clustername:region/dbname.

For more information, see Amazon Resource Names.

{
    "Effect": "Allow",
    "Action": [
        "redshift:GetClusterCredentials"
    ],
    "Resource": [
        "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw",
        "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
    ],
    "Condition": {
        "StringEquals": {
            "aws:userid": "AIDIODR4TAW7CSEXAMPLE:${redshift:DbUser}@examplecorp.com"
        }
    }
}

Generates a temporary token to authenticate into the examplecorp-dw cluster. "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}" restricts the corporate user name to the database user name for that user. This resource is specified using the format arn:aws:redshift:region:account-id:dbuser:clustername/dbusername.

The Condition block enforces that the AWS user ID should match "AIDIODR4TAW7CSEXAMPLE:${redshift:DbUser}@examplecorp.com", so that individual users can authenticate only as themselves. The AIDIODR4TAW7CSEXAMPLE role has the Sales_DW_IAM_Policy policy attached.

{
    "Effect": "Allow",
    "Action": [
        "redshift:CreateClusterUser"
    ],
    "Resource": [
        "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
    ]
}

Automatically creates database users in examplecorp-dw when they log in for the first time. Subsequent logins reuse the existing database user.

{
    "Effect": "Allow",
    "Action": [
        "redshift:JoinGroup"
    ],
    "Resource": [
        "arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/sales_grp"
    ]
}

Allows sales users to join the sales_grp database group through the resource "arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/sales_grp", which is specified in the format arn:aws:redshift:region:account-id:dbgroup:clustername/dbgroupname.

Similar policies can be created for Data Science users with access to join the data_science_grp group in examplecorp-dw. You can now attach the Sales_DW_IAM_Policy policy to the role that is mapped to the IdP application for SSO. For more information about how to define the claim rules, see Configuring SAML Assertions for the Authentication Response.

Because partners are not authorized using Active Directory, they are provided with IAM credentials and added to the partner_grp database group. The Partner_DW_IAM_Policy is attached to the IAM users for partners. The following policy allows partners to log in using the IAM user name as the database user name.

Partner_DW_IAM_Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "redshift:DescribeClusters"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:GetClusterCredentials"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw",
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ],
            "Condition": {
                "StringEquals": {
                    "redshift:DbUser": "${aws:username}"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:CreateClusterUser"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:JoinGroup"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/partner_grp"
            ]
        }
    ]
}

"redshift:DbUser": "${aws:username}" forces an IAM user to use the IAM user name as the database user name.

With the previous steps configured, you can now establish the connection to Amazon Redshift through JDBC- or ODBC-supported clients.

Step 3: Set up database user access

Before you start connecting to Amazon Redshift using the SQL client, set up the database groups for appropriate data access. Log in to your Amazon Redshift database as superuser to create a database group, using CREATE GROUP.

Log in to examplecorp-dw/analytics as superuser and create the following database groups:

CREATE GROUP sales_grp;
CREATE GROUP datascience_grp;
CREATE GROUP partner_grp;

Use the GRANT command to define access permissions to database objects (tables/views) for the preceding groups.
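As a minimal sketch of this step, the grants could be issued through the same PyGreSQL interface used later in this post; the schema name and superuser connection details here are hypothetical placeholders:

import pg

# Connect as a superuser (endpoint, user name, and password are placeholders)
conn = pg.connect(dbname='analytics',
                  host='examplecorp-dw.xxxxxxxxxxxx.us-west-2.redshift.amazonaws.com',
                  port=5439, user='masteruser', passwd='********')

# Grant read access on a hypothetical 'sales' schema to the database groups
conn.query("GRANT USAGE ON SCHEMA sales TO GROUP sales_grp, GROUP partner_grp;")
conn.query("GRANT SELECT ON ALL TABLES IN SCHEMA sales TO GROUP sales_grp, GROUP partner_grp;")

conn.close()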

Step 4: Connect to Amazon Redshift using the JDBC SQL client

Assume that sales user “nancy” is using the SQL Workbench client and JDBC driver to log in to the Amazon Redshift data warehouse. The following steps help set up the client and establish the connection:

  1. Download the latest Amazon Redshift JDBC driver from the Configure a JDBC Connection page
  2. Build the JDBC URL with the IAM option in the following format:
    jdbc:redshift:iam://examplecorp-dw:us-west-2/sales_db

Because the redshift:DescribeClusters action is assigned to the preceding IAM roles, it automatically resolves the cluster endpoints and the port. Otherwise, you can specify the endpoint and port information in the JDBC URL, as described in Configure a JDBC Connection.

Identify the following JDBC options for providing the IAM credentials (see the “Prepare your environment” section) and configure in the SQL Workbench Connection Profile:

plugin_name=com.amazon.redshift.plugin.AdfsCredentialsProvider 
idp_host=demo.examplecorp.com (The name of the corporate identity provider host)
idp_port=443  (The port of the corporate identity provider host)
user=examplecorp\nancy (corporate user name)
password=*** (corporate user password)

The SQL workbench configuration looks similar to the following screenshot:

Now, “nancy” can connect to examplecorp-dw by authenticating using the corporate Active Directory. Because the SAML attribute elements are already configured for nancy, she logs in as database user nancy and is assigned to the sales_grp database group. Similarly, other Sales and Data Science users can connect to the examplecorp-dw cluster. A custom Amazon Redshift ODBC driver can also be used to connect using a SQL client. For more information, see Configure an ODBC Connection.

Step 5: Connecting to Amazon Redshift using JDBC SQL Client and IAM Credentials

This optional step is necessary only when you want to enable users that are not authenticated with Active Directory. Partners are provided with IAM credentials that they can use to connect to the examplecorp-dw Amazon Redshift cluster. These IAM users have the Partner_DW_IAM_Policy attached, which allows them to join the partner_grp database group in Amazon Redshift. The following JDBC URL enables them to connect to the Amazon Redshift cluster:

jdbc:redshift:iam://examplecorp-dw/analytics?AccessKeyID=XXX&SecretAccessKey=YYY&DbUser=examplecorpsalespartner&DbGroup=partner_grp&AutoCreate=true

The AutoCreate option automatically creates a new database user the first time the partner logs in. There are several other options available to conveniently specify the IAM user credentials. For more information, see Options for providing IAM credentials.

Step 6: Connecting to Amazon Redshift using an ODBC client for Microsoft Windows

Assume that another sales user “uma” is using an ODBC-based client to log in to the Amazon Redshift data warehouse using Example Corp Active Directory. The following steps help set up the ODBC client and establish the Amazon Redshift connection in a Microsoft Windows operating system connected to your corporate network:

  1. Download and install the latest Amazon Redshift ODBC driver.
  2. Create a system DSN entry.
    1. In the Start menu, locate the driver folder or folders:
      • Amazon Redshift ODBC Driver (32-bit)
      • Amazon Redshift ODBC Driver (64-bit)
      • If you installed both drivers, you have a folder for each driver.
    2. Choose ODBC Administrator, and then type your administrator credentials.
    3. To configure the driver for all users on the computer, choose System DSN. To configure the driver for your user account only, choose User DSN.
    4. Choose Add.
  3. Select the Amazon Redshift ODBC driver, and choose Finish. Configure the following attributes:
    Data Source Name = any friendly name to identify the ODBC connection
    Database = analytics
    user = uma (corporate user name)
    Auth Type = Identity Provider: AD FS
    password = leave blank (Windows automatically authenticates)
    Cluster ID = examplecorp-dw
    idp_host = demo.examplecorp.com (The name of the corporate IdP host)

This configuration looks like the following:

  4. Choose OK to save the ODBC connection.
  5. Verify that uma is set up with the SAML attributes, as described in the “Set up IdPs and federation” section.

The user uma can now use this ODBC connection to establish the connection to the Amazon Redshift cluster using any ODBC-based tools or reporting tools such as Tableau. Internally, uma authenticates using the IAM role that has the Sales_DW_IAM_Policy attached and is assigned to the sales_grp database group.

Step 7: Connecting to Amazon Redshift using Python and IAM credentials

To enable partners to connect to the examplecorp-dw cluster programmatically, use Python on a computer such as an Amazon EC2 instance. Reuse the IAM users that are attached to the Partner_DW_IAM_Policy policy defined in Step 2.

The following steps show this set up on an EC2 instance:

  1. Launch a new EC2 instance with the Partner_DW_IAM_Policy role, as described in Using an IAM Role to Grant Permissions to Applications Running on Amazon EC2 Instances. Alternatively, you can attach an existing IAM role to an EC2 instance.
  2. This example uses the Python PostgreSQL driver (PyGreSQL) to connect to your Amazon Redshift cluster. To install PyGreSQL on Amazon Linux, run the following commands as the ec2-user:
    sudo easy_install pip
    sudo yum install postgresql postgresql-devel gcc python-devel
    sudo pip install PyGreSQL

  3. The following code snippet demonstrates programmatic access to Amazon Redshift for partner users:
    #!/usr/bin/env python
    """
    Usage:
    python redshiftscript.py
    
    * Copyright 2014, Amazon.com, Inc. or its affiliates. All Rights Reserved.
    *
    * Licensed under the Amazon Software License (the "License").
    * You may not use this file except in compliance with the License.
    * A copy of the License is located at
    *
    * http://aws.amazon.com/asl/
    *
    * or in the "license" file accompanying this file. This file is distributed
    * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
    * express or implied. See the License for the specific language governing
    * permissions and limitations under the License.
    """
    
    import sys
    import pg
    import boto3
    
    REGION = 'us-west-2'
    CLUSTER_IDENTIFIER = 'examplecorp-dw'
    DB_NAME = 'sales_db'
    DB_USER = 'examplecorpsalespartner'
    
    options = """keepalives=1 keepalives_idle=200 keepalives_interval=200
                 keepalives_count=6"""
    
    set_timeout_stmt = "set statement_timeout = 1200000"
    
    def conn_to_rs(host, port, db, usr, pwd, opt=options, timeout=set_timeout_stmt):
        rs_conn_string = """host=%s port=%s dbname=%s user=%s password=%s
                             %s""" % (host, port, db, usr, pwd, opt)
        print "Connecting to %s:%s:%s as %s" % (host, port, db, usr)
        rs_conn = pg.connect(dbname=rs_conn_string)
        rs_conn.query(timeout)
        return rs_conn
    
    def main():
        # describe the cluster and fetch the IAM temporary credentials
        global redshift_client
        redshift_client = boto3.client('redshift', region_name=REGION)
        response_cluster_details = redshift_client.describe_clusters(ClusterIdentifier=CLUSTER_IDENTIFIER)
        response_credentials = redshift_client.get_cluster_credentials(DbUser=DB_USER,DbName=DB_NAME,ClusterIdentifier=CLUSTER_IDENTIFIER,DurationSeconds=3600)
        rs_host = response_cluster_details['Clusters'][0]['Endpoint']['Address']
        rs_port = response_cluster_details['Clusters'][0]['Endpoint']['Port']
        rs_db = DB_NAME
        rs_iam_user = response_credentials['DbUser']
        rs_iam_pwd = response_credentials['DbPassword']
        # connect to the Amazon Redshift cluster
        conn = conn_to_rs(rs_host, rs_port, rs_db, rs_iam_user,rs_iam_pwd)
        # execute a query
        result = conn.query("SELECT sysdate as dt")
        # fetch results from the query
        for dt_val in result.getresult() :
            print dt_val
        # close the Amazon Redshift connection
        conn.close()
    
    if __name__ == "__main__":
        main()

You can save this Python program in a file (redshiftscript.py) and execute it at the command line as ec2-user:

python redshiftscript.py

Now partners can connect to the Amazon Redshift cluster using the Python script, and authentication is federated through the IAM user.

Summary

In this post, I demonstrated how to use federated access using Active Directory and IAM roles to enable single sign-on to an Amazon Redshift cluster. I also showed how partners outside an organization can be managed easily using IAM credentials.  Using the GetClusterCredentials API action, now supported by Amazon Redshift, lets you manage a large number of database users and have them use corporate credentials to log in. You don’t have to maintain separate database user accounts.

Although this post demonstrated the integration of IAM with AD FS and Active Directory, you can replicate this solution with your choice of SAML 2.0-compliant third-party identity providers (IdPs), such as PingFederate or Okta. For the different supported federation options, see Configure SAML Assertions for Your IdP.

If you have questions or suggestions, please comment below.


Additional Reading

Learn how to establish federated access to your AWS resources by using Active Directory user attributes.


About the Author

Thiyagarajan Arumugam is a Big Data Solutions Architect at Amazon Web Services and designs customer architectures to process data at scale. Prior to AWS, he built data warehouse solutions at Amazon.com. In his free time, he enjoys all outdoor sports and practices the Indian classical drum mridangam.

 

Implementing Default Directory Indexes in Amazon S3-backed Amazon CloudFront Origins Using Lambda@Edge

Post Syndicated from Ronnie Eichler original https://aws.amazon.com/blogs/compute/implementing-default-directory-indexes-in-amazon-s3-backed-amazon-cloudfront-origins-using-lambdaedge/

With the recent launch of Lambda@Edge, it’s now possible for you to provide even more robust functionality to your static websites. Amazon CloudFront is a content delivery network (CDN) service. In this post, I show how you can use Lambda@Edge along with the CloudFront origin access identity (OAI) for Amazon S3 and still provide simple URLs (such as www.example.com/about/ instead of www.example.com/about/index.html).

Background

Amazon S3 is a great platform for hosting a static website. You don’t need to worry about managing servers or underlying infrastructure—you just publish your static content to an S3 bucket. S3 provides a DNS name such as <bucket-name>.s3-website-<AWS-region>.amazonaws.com. Use this name for your website by creating a CNAME record in your domain’s DNS environment (or Amazon Route 53) as follows:

www.example.com -> <bucket-name>.s3-website-<AWS-region>.amazonaws.com

You can also put CloudFront in front of S3 to further scale the performance of your site and cache the content closer to your users. CloudFront can enable HTTPS-hosted sites, by either using a custom Secure Sockets Layer (SSL) certificate or a managed certificate from AWS Certificate Manager. In addition, CloudFront also offers integration with AWS WAF, a web application firewall. As you can see, it’s possible to achieve some robust functionality by using S3, CloudFront, and other managed services and not have to worry about maintaining underlying infrastructure.

One of the key concerns that you might have when implementing any type of WAF or CDN is that you want to force your users to go through the CDN. If you implement CloudFront in front of S3, you can achieve this by using an OAI. However, in order to do this, you cannot use the HTTP endpoint that is exposed by S3’s static website hosting feature. Instead, CloudFront must use the S3 REST endpoint to fetch content from your origin so that the request can be authenticated using the OAI. This presents some challenges in that the REST endpoint does not support redirection to a default index page.

CloudFront does allow you to specify a default root object (index.html), but it only works on the root of the website (such as http://www.example.com > http://www.example.com/index.html). It does not work on any subdirectory (such as http://www.example.com/about/). If you were to attempt to request this URL through CloudFront, CloudFront would do an S3 GetObject API call against a key that does not exist.

Of course, it is a bad user experience to expect users to always type index.html at the end of every URL (or even know that it should be there). Until now, there has not been an easy way to provide these simpler URLs (equivalent to the DirectoryIndex directive in an Apache Web Server configuration) to users through CloudFront, at least not if you still want to be able to restrict access to the S3 origin using an OAI. However, with the release of Lambda@Edge, you can use a JavaScript function running on the CloudFront edge nodes to look for these patterns and request the appropriate object key from the S3 origin.

Solution

In this example, you use the compute power at the CloudFront edge to inspect the request as it’s coming in from the client. You then rewrite the request so that CloudFront requests a default index object (index.html in this case) for any request URI that ends in ‘/’.

When a request is made against a web server, the client specifies the object to obtain in the request. You can use this URI and apply a regular expression to it so that these URIs get resolved to a default index object before CloudFront requests the object from the origin. Use the following code:

'use strict';
exports.handler = (event, context, callback) => {
    
    // Extract the request from the CloudFront event that is sent to Lambda@Edge
    var request = event.Records[0].cf.request;

    // Extract the URI from the request
    var olduri = request.uri;

    // Match any '/' that occurs at the end of a URI. Replace it with a default index
    var newuri = olduri.replace(/\/$/, '\/index.html');
    
    // Log the URI as received by CloudFront and the new URI to be used to fetch from origin
    console.log("Old URI: " + olduri);
    console.log("New URI: " + newuri);
    
    // Replace the received URI with the URI that includes the index page
    request.uri = newuri;
    
    // Return to CloudFront
    return callback(null, request);

};

To get started, create an S3 bucket to be the origin for CloudFront:

Create bucket

On the other screens, you can just accept the defaults for the purposes of this walkthrough. If this were a production implementation, I would recommend enabling bucket logging and specifying an existing S3 bucket as the destination for access logs. These logs can be useful if you need to troubleshoot issues with your S3 access.

Now, put some content into your S3 bucket. For this walkthrough, create two simple webpages to demonstrate the functionality:  A page that resides at the website root, and another that is in a subdirectory.

<s3bucketname>/index.html

<!doctype html>
<html>
    <head>
        <meta charset="utf-8">
        <title>Root home page</title>
    </head>
    <body>
        <p>Hello, this page resides in the root directory.</p>
    </body>
</html>

<s3bucketname>/subdirectory/index.html

<!doctype html>
<html>
    <head>
        <meta charset="utf-8">
        <title>Subdirectory home page</title>
    </head>
    <body>
        <p>Hello, this page resides in the /subdirectory/ directory.</p>
    </body>
</html>

When uploading the files into S3, you can accept the defaults. You add a bucket policy as part of the CloudFront distribution creation that allows CloudFront to access the S3 origin. You should now have an S3 bucket that looks like the following:
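If you prefer to script the upload, here is a minimal boto3 sketch that pushes the two pages into the bucket; the bucket name is a placeholder for the bucket you created above:

import boto3

s3 = boto3.client('s3')
bucket = 'your-bucket-name'  # placeholder; use the bucket created earlier

# Upload both pages, setting the content type so they are served as HTML
for key in ['index.html', 'subdirectory/index.html']:
    s3.upload_file(key, bucket, key, ExtraArgs={'ContentType': 'text/html'})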

Root of bucket

Subdirectory in bucket

Next, create a CloudFront distribution that your users will use to access the content. Open the CloudFront console, and choose Create Distribution. For Select a delivery method for your content, under Web, choose Get Started.

On the next screen, you set up the distribution. Below are the options to configure:

  • Origin Domain Name:  Select the S3 bucket that you created earlier.
  • Restrict Bucket Access: Choose Yes.
  • Origin Access Identity: Create a new identity.
  • Grant Read Permissions on Bucket: Choose Yes, Update Bucket Policy.
  • Object Caching: Choose Customize (I am changing the behavior to avoid having CloudFront cache objects, as this could affect your ability to troubleshoot while implementing the Lambda code).
    • Minimum TTL: 0
    • Maximum TTL: 0
    • Default TTL: 0

You can accept all of the other defaults. Again, this is a proof-of-concept exercise. After you are comfortable that the CloudFront distribution is working properly with the origin and Lambda code, you can re-visit the preceding values and make changes before implementing it in production.

CloudFront distributions can take several minutes to deploy (because the changes have to propagate out to all of the edge locations). After that’s done, test the functionality of the S3-backed static website. Looking at the distribution, you can see that CloudFront assigns a domain name:

CloudFront Distribution Settings

Try to access the website using a combination of various URLs:

http://<domainname>/:  Works

› curl -v http://d3gt20ea1hllb.cloudfront.net/
*   Trying 54.192.192.214...
* TCP_NODELAY set
* Connected to d3gt20ea1hllb.cloudfront.net (54.192.192.214) port 80 (#0)
> GET / HTTP/1.1
> Host: d3gt20ea1hllb.cloudfront.net
> User-Agent: curl/7.51.0
> Accept: */*
>
< HTTP/1.1 200 OK
< ETag: "cb7e2634fe66c1fd395cf868087dd3b9"
< Accept-Ranges: bytes
< Server: AmazonS3
< X-Cache: Miss from cloudfront
< X-Amz-Cf-Id: -D2FSRwzfcwyKZKFZr6DqYFkIf4t7HdGw2MkUF5sE6YFDxRJgi0R1g==
< Content-Length: 209
< Content-Type: text/html
< Last-Modified: Wed, 19 Jul 2017 19:21:16 GMT
< Via: 1.1 6419ba8f3bd94b651d416054d9416f1e.cloudfront.net (CloudFront), 1.1 iad6-proxy-3.amazon.com:80 (Cisco-WSA/9.1.2-010)
< Connection: keep-alive
<
<!doctype html>
<html>
    <head>
        <meta charset="utf-8">
        <title>Root home page</title>
    </head>
    <body>
        <p>Hello, this page resides in the root directory.</p>
    </body>
</html>
* Curl_http_done: called premature == 0
* Connection #0 to host d3gt20ea1hllb.cloudfront.net left intact

This is because CloudFront is configured to request a default root object (index.html) from the origin.

http://<domainname>/subdirectory/:  Doesn’t work

› curl -v http://d3gt20ea1hllb.cloudfront.net/subdirectory/
*   Trying 54.192.192.214...
* TCP_NODELAY set
* Connected to d3gt20ea1hllb.cloudfront.net (54.192.192.214) port 80 (#0)
> GET /subdirectory/ HTTP/1.1
> Host: d3gt20ea1hllb.cloudfront.net
> User-Agent: curl/7.51.0
> Accept: */*
>
< HTTP/1.1 200 OK
< ETag: "d41d8cd98f00b204e9800998ecf8427e"
< x-amz-server-side-encryption: AES256
< Accept-Ranges: bytes
< Server: AmazonS3
< X-Cache: Miss from cloudfront
< X-Amz-Cf-Id: Iqf0Gy8hJLiW-9tOAdSFPkL7vCWBrgm3-1ly5tBeY_izU82ftipodA==
< Content-Length: 0
< Content-Type: application/x-directory
< Last-Modified: Wed, 19 Jul 2017 19:21:24 GMT
< Via: 1.1 6419ba8f3bd94b651d416054d9416f1e.cloudfront.net (CloudFront), 1.1 iad6-proxy-3.amazon.com:80 (Cisco-WSA/9.1.2-010)
< Connection: keep-alive
<
* Curl_http_done: called premature == 0
* Connection #0 to host d3gt20ea1hllb.cloudfront.net left intact

If you use a tool such as cURL to test this, you notice that CloudFront and S3 are returning a blank response. The reason for this is that the subdirectory does exist, but it does not resolve to an S3 object. Keep in mind that S3 is an object store, so there are no real directories. User interfaces such as the S3 console present a hierarchical view of a bucket with folders based on the presence of forward slashes, but behind the scenes the bucket is just a collection of keys that represent stored objects.
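You can see this flat key structure for yourself with a minimal boto3 sketch; the bucket name is again a placeholder:

import boto3

s3 = boto3.client('s3')

# List everything that shares the 'subdirectory/' prefix; the results are flat keys, not a directory tree
resp = s3.list_objects_v2(Bucket='your-bucket-name', Prefix='subdirectory/')
for obj in resp.get('Contents', []):
    print(obj['Key'], obj['Size'])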

http://<domainname>/subdirectory/index.html:  Works

› curl -v http://d3gt20ea1hllb.cloudfront.net/subdirectory/index.html
*   Trying 54.192.192.130...
* TCP_NODELAY set
* Connected to d3gt20ea1hllb.cloudfront.net (54.192.192.130) port 80 (#0)
> GET /subdirectory/index.html HTTP/1.1
> Host: d3gt20ea1hllb.cloudfront.net
> User-Agent: curl/7.51.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 20 Jul 2017 20:35:15 GMT
< ETag: "ddf87c487acf7cef9d50418f0f8f8dae"
< Accept-Ranges: bytes
< Server: AmazonS3
< X-Cache: RefreshHit from cloudfront
< X-Amz-Cf-Id: bkh6opXdpw8pUomqG3Qr3UcjnZL8axxOH82Lh0OOcx48uJKc_Dc3Cg==
< Content-Length: 227
< Content-Type: text/html
< Last-Modified: Wed, 19 Jul 2017 19:21:45 GMT
< Via: 1.1 3f2788d309d30f41de96da6f931d4ede.cloudfront.net (CloudFront), 1.1 iad6-proxy-3.amazon.com:80 (Cisco-WSA/9.1.2-010)
< Connection: keep-alive
<
<!doctype html>
<html>
    <head>
        <meta charset="utf-8">
        <title>Subdirectory home page</title>
    </head>
    <body>
        <p>Hello, this page resides in the /subdirectory/ directory.</p>
    </body>
</html>
* Curl_http_done: called premature == 0
* Connection #0 to host d3gt20ea1hllb.cloudfront.net left intact

This request works as expected because you are referencing the object directly. Now, you implement the Lambda@Edge function to return the default index.html page for any subdirectory. Looking at the example JavaScript code, here’s where the magic happens:

var newuri = olduri.replace(/\/$/, '\/index.html');

You are going to use a JavaScript regular expression to match any ‘/’ that occurs at the end of the URI and replace it with ‘/index.html’. This is the equivalent of what S3 does on its own with static website hosting. However, as I mentioned earlier, you can’t rely on this if you want to use a policy on the bucket to restrict it so that users must access the bucket through CloudFront. That way, all requests to the S3 bucket must be authenticated using the S3 REST API. Because of this, you implement a Lambda@Edge function that takes any client request ending in ‘/’ and appends a default ‘index.html’ to the request before requesting the object from the origin.

In the Lambda console, choose Create function. On the next screen, skip the blueprint selection and choose Author from scratch, as you’ll use the sample code provided.

Next, configure the trigger. Choosing the empty box shows a list of available triggers. Choose CloudFront and select your CloudFront distribution ID (created earlier). For this example, leave Cache Behavior as * and CloudFront Event as Origin Request. Select the Enable trigger and replicate box and choose Next.

Lambda Trigger

Next, give the function a name and a description. Then, copy and paste the following code:

'use strict';
exports.handler = (event, context, callback) => {
    
    // Extract the request from the CloudFront event that is sent to Lambda@Edge
    var request = event.Records[0].cf.request;

    // Extract the URI from the request
    var olduri = request.uri;

    // Match any '/' that occurs at the end of a URI. Replace it with a default index
    var newuri = olduri.replace(/\/$/, '\/index.html');
    
    // Log the URI as received by CloudFront and the new URI to be used to fetch from origin
    console.log("Old URI: " + olduri);
    console.log("New URI: " + newuri);
    
    // Replace the received URI with the URI that includes the index page
    request.uri = newuri;
    
    // Return to CloudFront
    return callback(null, request);

};

Next, define a role that grants permissions to the Lambda function. For this example, choose Create new role from template, Basic Edge Lambda permissions. This creates a new IAM role for the Lambda function and grants the following permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:*:*:*"
            ]
        }
    ]
}

In a nutshell, these are the permissions that the function needs to create the necessary CloudWatch log group and log stream, and to put the log events so that the function is able to write logs when it executes.

After the function has been created, you can go back to the browser (or cURL) and re-run the test for the subdirectory request that failed previously:

› curl -v http://d3gt20ea1hllb.cloudfront.net/subdirectory/
*   Trying 54.192.192.202...
* TCP_NODELAY set
* Connected to d3gt20ea1hllb.cloudfront.net (54.192.192.202) port 80 (#0)
> GET /subdirectory/ HTTP/1.1
> Host: d3gt20ea1hllb.cloudfront.net
> User-Agent: curl/7.51.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Thu, 20 Jul 2017 21:18:44 GMT
< ETag: "ddf87c487acf7cef9d50418f0f8f8dae"
< Accept-Ranges: bytes
< Server: AmazonS3
< X-Cache: Miss from cloudfront
< X-Amz-Cf-Id: rwFN7yHE70bT9xckBpceTsAPcmaadqWB9omPBv2P6WkIfQqdjTk_4w==
< Content-Length: 227
< Content-Type: text/html
< Last-Modified: Wed, 19 Jul 2017 19:21:45 GMT
< Via: 1.1 3572de112011f1b625bb77410b0c5cca.cloudfront.net (CloudFront), 1.1 iad6-proxy-3.amazon.com:80 (Cisco-WSA/9.1.2-010)
< Connection: keep-alive
<
<!doctype html>
<html>
    <head>
        <meta charset="utf-8">
        <title>Subdirectory home page</title>
    </head>
    <body>
        <p>Hello, this page resides in the /subdirectory/ directory.</p>
    </body>
</html>
* Curl_http_done: called premature == 0
* Connection #0 to host d3gt20ea1hllb.cloudfront.net left intact

You have now configured a way for CloudFront to return a default index page for subdirectories in S3!

Summary

In this post, you used Lambda@Edge to be able to use CloudFront with an S3 origin access identity and serve a default root object on subdirectory URLs. To find out more about this use case, see Lambda@Edge integration with CloudFront in our documentation.

If you have questions or suggestions, feel free to comment below. For troubleshooting or implementation help, check out the Lambda forum.

Google Asked to Delist Pirate Movie Sites, ISPs Asked to Block Them

Post Syndicated from Andy original https://torrentfreak.com/google-asked-to-delist-pirate-movie-sites-isps-asked-to-block-them-171018/

After seizing several servers operated by popular private music tracker What.cd, last November French police went after a much bigger target.

Boasting millions of regular visitors, Zone-Telechargement (Zone-Download) was ranked the 11th most-visited website in the whole of the country. The site offered direct downloads of a wide variety of pirated content, including films, series, games, and music. Until the French Gendarmerie shut it down, that is.

After being founded in 2011 and enjoying huge growth following the 2012 raids against Megaupload, the Zone-Telechargement ‘brand’ was still popular with French users, despite the closure of the platform. It, therefore, came as no surprise that the site was quickly cloned by an unknown party and relaunched as Zone-Telechargement.ws.

The site has been doing extremely well following its makeover. To the annoyance of copyright holders, SimilarWeb reports the platform as France’s 37th most popular site with around 58 million visitors per month. That’s a huge achievement in less than 12 months.

Now, however, the site is receiving more unwanted attention. PCInpact says it has received information that several movie-focused organizations including the French National Film Center are requesting tough action against the site.

The National Federation of Film Distributors, the Video Publishing Union, the Association of Independent Producers and the Producers Union are all demanding the blocking of Zone-Telechargement by several local ISPs, alongside its delisting from search results.

The publication mentions four Internet service providers – Free, Numericable, Bouygues Telecom, and Orange – plus Google on the search engine front. At this stage, other search companies, such as Microsoft’s Bing, are not reported as part of the action.

In addition to Zone-Telechargement, several other ‘pirate’ sites (Papystreaming.org, Sokrostream.cc and Zonetelechargement.su, another site playing on the popular brand) are included in the legal process. All are described as “structurally infringing” by the complaining movie outfits, PCInpact notes.

The legal proceedings against the sites are based on Article 336-2 of the Intellectual Property Code. It’s ground already trodden by movie companies who, following a 2011 complaint, achieved victory in 2013 against several Allostreaming-linked sites.

In that case, the High Court of Paris ordered ISPs, several of which appear in the current action, to “implement all appropriate means including blocking” to prevent access to the infringing sites.

The Court also ordered Google, Microsoft, and Yahoo to “take all necessary measures to prevent the occurrence on their services of any results referring to any of the sites” on their platforms.

Also of interest is that the action targets a service called DL-Protecte.com, which according to local anti-piracy agency HADOPI, makes it difficult for rightsholders to locate infringing content while at the same time generates more revenue for pirate sites.

A judgment is expected in “several months.”

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

New ‘Coalition Against Piracy’ Will Crack Down on Pirate Streaming Boxes

Post Syndicated from Ernesto original https://torrentfreak.com/new-coalition-against-piracy-will-crack-down-on-pirate-streaming-boxes-171017/

Traditionally there have only been a handful of well-known industry groups fighting online piracy, but this appears to be changing.

Increasingly, major entertainment industry companies are teaming up in various regions to bundle their enforcement efforts against copyright infringement.

Earlier this year the Alliance for Creativity and Entertainment (ACE) was formed by major players including Disney, HBO, and NBCUniversal, and several of the same media giants are also involved in the newly founded Coalition Against Piracy (CAP).

CAP will coordinate anti-piracy efforts in Asia and is backed by CASBAA, Disney, Fox, HBO Asia, NBCUniversal, Premier League, Turner Asia-Pacific, A&E Networks, Astro, BBC Worldwide, National Basketball Association, TV5MONDE, Viacom International, and others.

The coalition has hired Neil Gane as its general manager. Gane is no stranger to anti-piracy work, as he previously served as the MPAA’s regional director in Australasia and was chief of the Australian Federation Against Copyright Theft.

The goal of CAP will be to assist in local enforcement actions against piracy, including the disruption and dismantling of local businesses that facilitate it. Pirate streaming boxes and apps will be among the main targets.

These boxes, which often use the legal Kodi player paired with infringing add-ons, are referred to as illicit streaming devices (ISDs) by industry insiders. They have grown in popularity all around the world and Asia is no exception.

“The prevalence of ISDs across Asia is staggering. The criminals who operate the ISD networks and the pirate websites are profiting from the hard work of talented creators, seriously damaging the legitimate content ecosystem as well as exposing consumers to dangerous malware”, Gane said, quoted by Indian Television.

Gane knows the region well and started his career working for the Hong Kong Police. He sees the pirate streaming box ecosystem as a criminal network which presents a major threat to the entertainment industries.

“This is a highly organized transnational crime with criminal syndicates profiting enormously at the expense of consumers as well as content creators,” Gane noted.

The Asian creative industry is a major growth market as more and more legal content is made available. However, the growth of these legal services is threatened by pirate boxes and apps. The Coalition Against Piracy hopes to curb this.

The launch of CAP, which will be formalized at the upcoming CASBAA anti-piracy convention in November, confirms the trend of localized anti-piracy coalitions which are backed by major industry players. We can expect to hear more from these during the years to come.

Just a few days ago the founding members of the aforementioned ACE anti-piracy initiative filed their first joint lawsuit in the US which, unsurprisingly, targets a seller of streaming boxes.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

How to Compete with Giants

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/how-to-compete-with-giants/

How to Compete with Giants

This post by Backblaze’s CEO and co-founder Gleb Budman is the sixth in a series about entrepreneurship. You can choose posts in the series from the list below:

  1. How Backblaze got Started: The Problem, The Solution, and the Stuff In-Between
  2. Building a Competitive Moat: Turning Challenges Into Advantages
  3. From Idea to Launch: Getting Your First Customers
  4. How to Get Your First 1,000 Customers
  5. Surviving Your First Year
  6. How to Compete with Giants


Perhaps your business is competing in a brand new space free from established competitors. Most of us, though, start companies that compete with existing offerings from large, established companies. You need to come up with a better mousetrap — not the first mousetrap.

That’s the challenge Backblaze faced. In this post, I’d like to share some of the lessons I learned from that experience.

Backblaze vs. Giants

Competing with established companies that are orders of magnitude larger can be daunting. How can you succeed?

I’ll set the stage by offering a few sets of giants we compete with:

  • When we started Backblaze, we offered online backup in a market where companies had been offering “online backup” for at least a decade, and even the newer entrants had raised tens of millions of dollars.
  • When we built our storage servers, the alternatives were EMC, NetApp, and Dell — each of which had a market cap of over $10 billion.
  • When we introduced our cloud storage offering, B2, our direct competitors were Amazon, Google, and Microsoft. You might have heard of them.

What did we learn by competing with these giants on a bootstrapped budget? Let’s take a look.

Determine What Success Means

For a long time Apple considered Apple TV to be a hobby, not a real product worth focusing on, because it did not generate a billion in revenue. For a $10 billion per year revenue company, a new business that generates $50 million won’t move the needle and often isn’t worth putting focus on. However, for a startup, getting to $50 million in revenue can be the start of a wildly successful business.

Lesson Learned: Don’t let the giants set your success metrics.

The Advantages Startups Have

The giants have a lot of advantages: more money, people, scale, resources, access, etc. Following their playbook and attacking head-on means you’re simply outgunned. Common paths to failure are trying to build more features, enter more markets, outspend on marketing, and other similar approaches where scale and resources are the primary determinants of success.

But being a startup affords many advantages most giants would salivate over. As a nimble startup, you can leverage those to succeed. Let’s break down nine competitive advantages we’ve used that you can use too.

1. Drive Focus

It’s hard to build a $10 billion revenue business doing just one thing, and most giants have a broad portfolio of businesses and numerous products for each, targeting a variety of customer segments in multiple markets. That adds complexity and distributes management attention.

Startups get the benefit of having everyone in the company be extremely focused, often on a singular mission, product, customer segment, and market. While our competitors sell everything from advertising to Zantac, and are investing in groceries and shipping, Backblaze has focused exclusively on cloud storage. This means all of our best people (i.e. everyone) are focused on our cloud storage business. Where is all of your focus going?

Lesson Learned: Align everyone in your company to a singular focus to dramatically outperform larger teams.

2. Use Lack-of-Scale as an Advantage

You may have heard Paul Graham say “Do things that don’t scale.” There are a host of things you can do specifically because you don’t have the same scale as the giants. Use that as an advantage.

When we look for data center space, we have more options than our largest competitors because there are simply more spaces available with room for 100 cabinets than for 1,000 cabinets. With some searching, we can find data center space that is better/cheaper.

When a flood in Thailand destroyed factories, causing the world’s supply of hard drives to plummet and prices to triple, we started drive farming. The giants certainly couldn’t. It was a bit crazy, but it let us keep prices unchanged for our customers.

Our Chief Cloud Officer, Tim, used to work at Adobe. Because of their size, any new product needed to always launch in a multitude of languages and in global markets. Once launched, they had scale. But getting any new product launched was incredibly challenging.

Lesson Learned: Use lack-of-scale to exploit opportunities that are closed to giants.

3. Build a Better Product

This one is probably obvious. If you’re going to provide the same product, at the same price, to the same customers — why do it? Remember that better does not always mean more features. Here’s one way we built a better product that didn’t require being a bigger company.

All online backup services required customers to choose what to include in their backup. We found that this was complicated for users since they often didn’t know what needed to be backed up. We flipped the model to back up everything and allow users to exclude if they wanted to, but it was not required. This reduced the number of features/options, while making it easier and better for the user.

This didn’t require the resources of a huge company; it just required understanding customers a bit deeper and thinking about the solution differently. Building a better product is the most classic startup competitive advantage.

Lesson Learned: Dig deep with your customers to understand and deliver a better mousetrap.

4. Provide Better Service

How can you provide better service? Use your advantages. Escalations from your customer care folks to engineering can go through fewer hoops. Fixing an issue and shipping can be quicker. Access to real answers on Twitter or Facebook can be more effective.

A strategic decision we made was to have all customer support people as full-time employees in our headquarters. This ensures they are in close contact with the whole company, so feedback quickly flows both ways.

Having a smaller team and fewer layers enables faster internal communication, which increases customer happiness. And the option to do things that don’t scale — such as help a customer in a unique situation — can go a long way in building customer loyalty.

Lesson Learned: Service your customers better by establishing clear internal communications.

5. Remove The Unnecessary

After determining that the industry standard EMC/NetApp/Dell storage servers would be too expensive to build our own cloud storage upon, we decided to build our own infrastructure. Many said we were crazy to compete with these multi-billion dollar companies and that it would be impossible to build a lower cost storage server. However, not only did it prove to not be impossible — it wasn’t even that hard.

One key trick? Remove the unnecessary. While EMC and others built servers to sell to other companies for a wide variety of use cases, Backblaze needed servers that only Backblaze would run, and for a single use case. As a result we could tailor the servers for our needs by removing redundancy from each server (since we would run redundant servers), and using lower-performance components (since we would get high-performance by running parallel servers).

What do your customers and use cases not need? This can trim costs and complexity while often improving the product for your use case.

Lesson Learned: Don’t think “what can we add” to what the giants offer — think “what can we remove.”

6. Be Easy

How many times have you visited a large company website, particularly one that’s not consumer-focused, only to leave saying, “Huh? I don’t understand what you do.” Keeping your website clear, and your product and pricing simple, will dramatically increase conversion and customer satisfaction. If you’re able to make it 2x easier and thus increase your conversion by 2x, you’ve just allowed yourself to spend half as much acquiring a customer.

Providing unlimited data backup wasn’t specifically about providing more storage — it was about making it easier. Since users didn’t know how much data they needed to back up, charging per gigabyte meant they wouldn’t know the cost. Providing unlimited data backup meant they could just relax.

Customers love easy — and being smaller makes easy easier to deliver. Use that as an advantage in your website, marketing materials, pricing, product, and in every other customer interaction.

Lesson Learned: Ease-of-use isn’t a slogan; it’s a competitive advantage. Treat it as seriously as any other feature of your product.

7. Don’t Be Afraid of Risk

Obviously unnecessary risks are unnecessary, and some risks aren’t worth taking. However, large companies that have given guidance to Wall Street with a $0.01 range on their earnings per share are inherently going to be very risk-averse. Use risk-tolerance to open up opportunities, and adjust your tolerance level as you scale. In your first year, there are likely an infinite number of ways your business may vaporize; don’t be too worried about taking a risk that might have a 20% downside when the upside is hockey stick growth.

Using consumer-grade hard drives in our servers may have caused pain and suffering for us years down the line, but they were priced at approximately 50% of enterprise drives. Giants wouldn’t have considered the option. Turns out, the consumer drives performed great for us.

Lesson Learned: Use calculated risks as an advantage.

8. Be Open

The larger a company grows, the more it wants to hide information. Some of this is driven by regulatory requirements as a public company. But most of this is cultural. Sharing something might cause a problem, so let’s not. All external communication is treated as a critical press release, with rounds and rounds of editing by multiple teams and approvals. However, customers are often desperate for information. Moreover, sharing information builds trust, understanding, and advocates.

I started blogging at Backblaze before we launched. When we blogged about our Storage Pod and open-sourced the design, many thought we were crazy to share this information. But it was transformative for us, establishing Backblaze as a tech thought leader in storage and giving people a sense of how we were able to provide our service at such a low cost.

Over the years we’ve developed a culture of being open internally and externally, on our blog and with the press, and in communities such as Hacker News and Reddit. Often we’ve been asked, “why would you share that!?” — but it’s the continual openness that builds trust. And that culture of openness is incredibly challenging for the giants.

Lesson Learned: Overshare to build trust and brand where giants won’t.

9. Be Human

As companies scale, typically a smaller percent of founders and executives interact with customers. The people who build the company become more hidden, the language feels “corporate,” and customers start to feel they’re interacting with the cliche “faceless, nameless corporation.” Use your humanity to your advantage. From day one the Backblaze About page listed all the founders, and my email address. While contacting us shouldn’t be the first path for a customer support question, I wanted it to be clear that we stand behind the service we offer; if we’re doing something wrong — I want to know it.

To scale it’s important to have processes and procedures, but sometimes a situation falls outside of a well-established process. While we want our employees to follow processes, they’re still encouraged to be human and “try to do the right thing.” How do you strike this balance? Simon Sinek gives a good talk about it: make your employees feel safe. If employees feel safe they’ll be human.

If your customer is a consumer, they’ll appreciate being treated as a human. Even if your customer is a corporation, the purchasing decision-makers are still people.

Lesson Learned: Being human is the ultimate antithesis to the faceless corporation.

Build Culture to Sustain Your Advantages at Scale

Presumably the goal is not to always be competing with giants, but to one day become a giant. Does this mean you’ll lose all of these advantages? Some, yes — but not all. Some of these advantages are cultural, and if you build these into the culture from the beginning, and fight to keep them as you scale, you can keep them as you become a giant.

Tesla still comes across as human, with Elon Musk frequently interacting with people on Twitter. Apple continues to provide great service through their Genius Bar. And, worst case, if you lose these at scale, you’ll still have the other advantages of being a giant such as money, people, scale, resources, and access.

Of course, some new startup will be gunning for you with grand ambitions, so just be sure not to get complacent. 😉

The post How to Compete with Giants appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Amazon Lightsail Update – Launch and Manage Windows Virtual Private Servers

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-lightsail-update-launch-and-manage-windows-virtual-private-servers/

I first told you about Amazon Lightsail last year in my blog post, Amazon Lightsail – the Power of AWS, the Simplicity of a VPS. Since last year’s launch, thousands of customers have used Lightsail to get started with AWS, launching Linux-based Virtual Private Servers.

Today we are adding support for Windows-based Virtual Private Servers. You can launch a VPS that runs Windows Server 2012 R2, Windows Server 2016, or Windows Server 2016 with SQL Server 2016 Express and be up and running in minutes. You can use your VPS to build, test, and deploy .NET or Windows applications without having to set up or run any infrastructure. Backups, DNS management, and operational metrics are all accessible with a click or two.

Servers are available in five sizes, with 512 MB to 8 GB of RAM, 1 or 2 vCPUs, and up to 80 GB of SSD storage. Prices (including software licenses) start at $10 per month.

You can try out a 512 MB server for one month (up to 750 hours) at no charge.

Launching a Windows VPS
To launch a Windows VPS, log in to Lightsail, click on Create instance, and select the Microsoft Windows platform. Then click on Apps + OS if you want to run SQL Server 2016 Express, or OS Only if Windows is all you need:

If you want to use a PowerShell script to customize your instance after it launches for the first time, click on Add launch script and enter the script:

Choose your instance plan, enter a name for your instance(s), and select the quantity to be launched, then click on Create:

Your instance will be up and running within a minute or so:

Click on the instance, and then click on Connect using RDP:

This will connect using a built-in, browser-based RDP client (you can also use the IP address and the credentials with another client):

Available Today
This feature is available today in the US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (London), EU (Ireland), EU (Frankfurt), Asia Pacific (Singapore), Asia Pacific (Mumbai), Asia Pacific (Sydney), and Asia Pacific (Tokyo) Regions.

Jeff;

 

PureVPN Explains How it Helped the FBI Catch a Cyberstalker

Post Syndicated from Andy original https://torrentfreak.com/purevpn-explains-how-it-helped-the-fbi-catch-a-cyberstalker-171016/

Early October, Ryan S. Lin, 24, of Newton, Massachusetts, was arrested on suspicion of conducting “an extensive cyberstalking campaign” against a 24-year-old Massachusetts woman, as well as her family members and friends.

The Department of Justice described Lin’s offenses as a “multi-faceted” computer hacking and cyberstalking campaign. Launched in April 2016 when he began hacking into the victim’s online accounts, Lin allegedly obtained personal photographs and sensitive information about her medical and sexual histories and distributed that information to hundreds of other people.

Details of what information the FBI compiled on Lin can be found in our earlier report but aside from his alleged crimes (which are both significant and repugnant), it was PureVPN’s involvement in the case that caused the most controversy.

In a report compiled by an FBI special agent, it was revealed that the Hong Kong-based company’s logs helped the authorities net the alleged criminal.

“Significantly, PureVPN was able to determine that their service was accessed by the same customer from two originating IP addresses: the RCN IP address from the home Lin was living in at the time, and the software company where Lin was employed at the time,” the agent’s affidavit reads.

Among many in the privacy community, this revelation was met with disappointment. On the PureVPN website the company claims to carry no logs and on a general basis, it’s expected that so-called “no-logging” VPN providers should provide people with some anonymity, at least as far as their service goes. Now, several days after the furor, the company has responded to its critics.

In a fairly lengthy statement, the company begins by confirming that it definitely doesn’t log what websites a user views or what content he or she downloads.

“PureVPN did not breach its Privacy Policy and certainly did not breach your trust. NO browsing logs, browsing habits or anything else was, or ever will be shared,” the company writes.

However, that’s only half the problem. While it doesn’t log user activity (what sites people visit or content they download), it does log the IP addresses that customers use to access the PureVPN service. These, given the right circumstances, can be matched to external activities thanks to logs carried by other web companies.

PureVPN talks about logs held by Google’s Gmail service to illustrate its point.

“A network log is automatically generated every time a user visits a website. For the sake of this example, let’s say a user logged into their Gmail account. Every time they accessed Gmail, the email provider created a network log,” the company explains.

“If you are using a VPN, Gmail’s network log would contain the IP provided by PureVPN. This is one half of the picture. Now, if someone asks Google who accessed the user’s account, Google would state that whoever was using this IP, accessed the account.

“If the user was connected to PureVPN, it would be a PureVPN IP. The inquirer [in the Lin case, the FBI] would then share timestamps and network logs acquired from Google and ask them to be compared with the network logs maintained by the VPN provider.”

Now, if PureVPN carried no logs – literally no logs – it would not be able to help with this kind of inquiry. That was the case last year when the FBI approached Private Internet Access for information and the company was unable to assist.

However, as is made pretty clear by PureVPN’s explanation, the company does log user IP addresses and timestamps which reveal when a user was logged on to the service. It doesn’t matter that PureVPN doesn’t log what the user allegedly did online, since the third-party service already knows that information to the precise second.

Following the example, Gmail knows that a user sent an email at 10:22am on Monday October 16 from a PureVPN IP address. So, if PureVPN is approached by the FBI, the company can confirm that User X was using the same IP address at exactly the same time, and his home IP address was XXX.XX.XXX.XX. Effectively, the combined logs link one IP address to the other and the user is revealed. It’s that simple.

It is for this reason that in TorrentFreak’s annual summary of no-logging VPN providers, the very first question we ask every single company reads as follows:

Do you keep ANY logs which would allow you to match an IP-address and a time stamp to a user/users of your service? If so, what information do you hold and for how long?

Clearly, if a company says “yes we log incoming IP addresses and associated timestamps”, any claim to total user anonymity is ended right there and then.

While not completely useless (a logging service will still stop the prying eyes of ISPs and similar surveillance, while also defeating throttling and site-blocking), if you’re a whistle-blower with a job or even your life to protect, this level of protection is entirely inadequate.

The take-home points from this controversy are numerous, but perhaps the most important is for people to read and understand VPN provider logging policies.

Secondly, and just as importantly, VPN providers need to be extremely clear about the information they log. Not tracking browsing or downloading activities is all well and good, but if home IP addresses and timestamps are stored, this needs to be made clear to the customer.

Finally, VPN users should not be evil. There are plenty of good reasons to stay anonymous online but cyberstalking, death threats and ruining people’s lives are not included. Fortunately, the FBI have offline methods for catching this type of offender, and long may that continue.

PureVPN’s blog post is available here.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Netflix Expands Content Protection Team to Reduce Piracy

Post Syndicated from Ernesto original https://torrentfreak.com/netflix-expands-content-protection-team-to-reduce-piracy-171015/

There is little doubt that, in the United States and many other countries, Netflix has become the standard for watching movies on the Internet.

Despite the widespread availability, however, Netflix originals are widely pirated. Episodes from House of Cards, Narcos, and Orange is the New Black are downloaded and streamed millions of times through unauthorized platforms.

The streaming giant is obviously not happy with this situation and has ramped up its anti-piracy efforts in recent years. Since last year the company has sent out over a million takedown requests to Google alone and this volume continues to expand.

This growth coincides with an expansion of the company’s internal anti-piracy division. A new job posting shows that Netflix is expanding this team with a Copyright and Content Protection Coordinator. The ultimate goal is to reduce piracy to a fringe activity.

“The growing Global Copyright & Content Protection Group is looking to expand its team with the addition of a coordinator,” the job listing reads.

“He or she will be tasked with supporting the Netflix Global Copyright & Content Protection Group in its internal tactical take down efforts with the goal of reducing online piracy to a socially unacceptable fringe activity.”

Among other things, the new coordinator will evaluate new technological solutions to tackle piracy online.

More old-fashioned takedown efforts are also part of the job. This includes monitoring well-known content platforms, search engines and social network sites for pirated content.

“Day to day scanning of Facebook, YouTube, Twitter, Periscope, Google Search, Bing Search, VK, DailyMotion and all other platforms (including live platforms) used for piracy,” is listed as one of the main responsibilities.

Netflix’ Copyright and Content Protection Coordinator Job

The coordinator is further tasked with managing Facebook’s Rights Manager and YouTube’s Content ID system, to prevent circumvention of these piracy filters. Experience with fingerprinting technologies and other anti-piracy tools will be helpful in this regard.

Netflix doesn’t do all the copyright enforcement on its own though. The company works together with other media giants in the recently launched “Alliance for Creativity and Entertainment” that is spearheaded by the MPAA.

In addition, the company also uses the takedown services of external anti-piracy outfits to target more traditional infringement sources, such as cyberlockers and piracy streaming sites. The coordinator has to keep an eye on these as well.

“Liaise with our vendors on manual takedown requests on linking sites and hosting sites and gathering data on pirate streaming sites, cyberlockers and usenet platforms.”

The above shows that Netflix is doing its best to prevent piracy from getting out of hand. It’s definitely taking the issue more seriously than a few years ago when the company didn’t have much original content.

The switch from being merely a distribution platform to becoming a major content producer and copyright holder has changed the stakes. Netflix hasn’t won the war on piracy, it’s just getting started.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Hollywood Giants Sue Kodi-powered ‘TickBox TV’ Over Piracy

Post Syndicated from Ernesto original https://torrentfreak.com/hollywood-giants-sue-kodi-powered-tickbox-tv-over-piracy-171014/

Online streaming piracy is booming and many people use dedicated media players to bring this content to their regular TVs.

The bare hardware is not illegal and neither is media player software such as Kodi. When these devices are loaded with copyright-infringing addons, however, they turn into an unprecedented piracy threat.

It becomes even more problematic when the sellers of these devices market their products as pirate tools. This is exactly what TickBox TV does, according to Hollywood’s major movie studios, Netflix, and Amazon.

TickBox is a Georgia-based provider of set-top boxes that allow users to stream a variety of popular media. The company’s devices use the Kodi media player and come with instructions on how to add various add-ons.

In a complaint filed in a California federal court yesterday, Universal, Columbia Pictures, Disney, 20th Century Fox, Paramount Pictures, Warner Bros, Amazon, and Netflix accuse Tickbox of inducing and contributing to copyright infringement.

“TickBox sells ‘TickBox TV,’ a computer hardware device that TickBox urges its customers to use as a tool for the mass infringement of Plaintiffs’ copyrighted motion pictures and television shows,” the complaint, picked up by THR, reads.

While the device itself does not host any infringing content, users are informed where they can find it.

The movie and TV studios stress that Tickbox’s marketing highlights its infringing uses with statements such as “if you’re tired of wasting money with online streaming services like Netflix, Hulu or Amazon Prime.”

Sick of paying high monthly fees?

“TickBox promotes the use of TickBox TV for overwhelmingly, if not exclusively, infringing purposes, and that is how its customers use TickBox TV. TickBox advertises TickBox TV as a substitute for authorized and legitimate distribution channels such as cable television or video-on-demand services like Amazon Prime and Netflix,” the studios’ lawyers write.

The complaint explains in detail how TickBox works. When users first boot up their device they are prompted to download the “TickBox TV Player” software. This comes with an instruction video guiding people to infringing streams.

“The TickBox TV instructional video urges the customer to use the ‘Select Your Theme’ button on the start-up menu for downloading addons. The ‘Themes’ are curated collections of popular addons that link to unauthorized streams of motion pictures and television shows.”

“Some of the most popular addons currently distributed — which are available through TickBox TV — are titled ‘Elysium,’ ‘Bob,’ and ‘Covenant’,” the complaint adds, showing screenshots of the interface.

Covenant

The movie and TV studios, which are the founding members of the recently launched ACE anti-piracy initiative, want TickBox to stop selling their devices. In addition, they demand compensation for the damages they’ve suffered. Requesting the maximum statutory damages of $150,000 per copyright infringement, this can run into the millions.

The involvement of Amazon, albeit the content division, is notable since the online store itself sells dozens of similar streaming devices, some of which even list “infringing” addons.

The TickBox lawsuit is the first case in the United States where a group of major Hollywood players is targeting a streaming device. Earlier this year various Hollywood insiders voiced concerns about the piracy streaming epidemic and if this case goes their way, it probably won’t be the last.

A copy of the full complaint is available here (pdf)

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Popular Zer0day Torrent Tracker Taken Offline By Mass Copyright Complaint

Post Syndicated from Andy original https://torrentfreak.com/popular-zer0day-torrent-tracker-taken-offline-by-mass-copyright-complaint-171014/

In January 2016, a BitTorrent enthusiast decided to launch a stand-alone tracker, purely for fun.

The Zer0day platform, which hosts no torrents, is a tracker in the purest sense, directing traffic between peers, no matter what content is involved and no matter where people are in the world.

With this type of tracker in short supply, it was soon utilized by The Pirate Bay and the now-defunct ExtraTorrent. By August 2016, it was tracking almost four million peers and a million torrents, a considerable contribution to the BitTorrent ecosystem.

After handling many ups and downs associated with a service of this type, the tracker eventually made it to the end of 2016 intact. This year it grew further still and by the end of September was tracking an impressive 5.5 million peers spread over 1.2 million torrents. Soon after, however, the tracker disappeared from the Internet without warning.

In an effort to find out what had happened, TorrentFreak contacted Zer0day’s operator who told us a familiar story. Without any warning at all, the site’s host pulled the plug on the service, despite having been paid 180 euros for hosting just a week earlier.

“We’re hereby informing you of the termination of your dedicated server due to a breach of our terms of service,” the host informed Zer0day.

“Hosting trackers on our servers that distribute infringing and copyrighted content is prohibited. This server was found to distribute such content. Should we identify additional similar activity in your services, we will be forced to close your account.”

While hosts tend not to worry too much about what their customers are doing, this one had just received a particularly lengthy complaint. Sent by the head of anti-piracy at French collecting society SCPP, it laid out the group’s problems with the Zer0day tracker.

“SCPP has been responsible for the collective management and protection of sound recordings and music videos producers’ rights since 1985. SCPP counts more than 2,600 members including the majority of independent French producers, in addition to independent European producers, and the major international companies: Sony, Universal and Warner,” the complaint reads.

“SCPP administers a catalog of 7,200,000 sound tracks and 77,000 music videos. SCPP is empowered by its members to take legal action in order to put an end to any infringements of the producers’ rights set out in Article L335-4 of the French Intellectual Property Code…..punishable by a three-year prison sentence or a fine of €300,000.”

Noting that it works on behalf of a number of labels and distributors including BMG, Sony Music, Universal Music, Warner Music and others, SCPP listed countless dozens of albums under its protection, each allegedly tracked by the Zer0day platform.

“It has come to our attention that these music albums are illegally being communicated to the public (made available for download) by various users of the BitTorrent-Network,” the complaint reads.

Noting that Zer0day is involved in the process, the anti-piracy outfit presented dozens of hash codes relating to protected works, demanding that the site stop facilitation of infringement on each and every one of them.

“We have proof that your tracker udp://tracker.zer0day.to:1337/announce provided peers of the BitTorrent-Network with information regarding these torrents, to be specific IP Addresses of peers that were offering without authorization the full albums for download, and that this information enabled peers to download files that contain the sound recordings to which our members producers have the exclusive rights.

“These sound recordings are thus being illegally communicated to the public, and your tracker is enabling the seeders to do so.”

Rather than take the hashes down from the tracker, SCPP actually demanded that Zer0day create a permanent blacklist within 24 hours, to ensure the corresponding torrents wouldn’t be tracked again.

“You should understand that this letter constitutes a notice to you that you may be liable for the infringing activity occurring on your service. In addition, if you ignore this notice, you may also be liable for any resulting infringement,” the complaint added.

But despite all the threats, SCPP didn’t receive the response they’d demanded since the operator of the site refused to take any action.

“Obviously, ‘info hashes’ are not copyrightable nor point to specific copyrighted content, or even have any meaning. Further, I cannot verify that request strings parameters (‘info hashes’) you sent me contain copyrighted material,” he told SCPP.

“Like the website says; for content removal kindly ask the indexing site to remove the listing and the .torrent file. Also, tracker software does not have an option to block request strings parameters (‘info hashes’).”

The net effect of non-compliance with SCPP was fairly dramatic and swift. Zer0day’s host took down the whole tracker instead and currently it remains offline. Whether it reappears depends on the site’s operator finding a suitable web host, but at the moment he says he has no idea where one will appear from.

“Currently I’m searching for some virtual private server as a temporary home for the tracker,” he concludes.

As mentioned in an earlier article detailing the problems sites like Zer0day.to face, trackers aren’t absolutely essential for the functioning of BitTorrent transfers. Nevertheless, their existence certainly improves matters for file-sharers so when they go down, millions can be affected.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Coaxing 2D platforming out of Unity

Post Syndicated from Eevee original https://eev.ee/blog/2017/10/13/coaxing-2d-platforming-out-of-unity/

An anonymous donor asked a question that I can’t even begin to figure out how to answer, but they also said anything else is fine, so here’s anything else.

I’ve been avoiding writing about game physics, since I want to save it for ✨ the book I’m writing ✨, but that book will almost certainly not touch on Unity. Here, then, is a brief run through some of the brick walls I ran into while trying to convince Unity to do 2D platforming.

This is fairly high-level — there are no blocks of code or helpful diagrams. I’m just getting this out of my head because it’s interesting. If you want more gritty details, I guess you’ll have to wait for ✨ the book ✨.

The setup

I hadn’t used Unity before. I hadn’t even used a “real” physics engine before. My games so far have mostly used LÖVE, a Lua-based engine. LÖVE includes box2d bindings, but for various reasons (not all of them good), I opted to avoid them and instead write my own physics completely from scratch. (How, you ask? ✨ Book ✨!)

I was invited to work on a Unity project, Chaos Composer, that someone else had already started. It had basic movement already implemented; I taught myself Unity’s physics system by hacking on it. It’s entirely possible that none of this is actually the best way to do anything, since I was really trying to reproduce my own homegrown stuff in Unity, but it’s the best I’ve managed to come up with.

Two recurring snags were that you can’t ask Unity to do multiple physics updates in a row, and sometimes getting the information I wanted was difficult. Working with my own code spoiled me a little, since I could invoke it at any time and ask it anything I wanted; Unity, on the other hand, is someone else’s black box with a rigid interface on top.

Also, wow, Googling for a lot of this was not quite as helpful as expected. A lot of what’s out there is just the first thing that works, and often that’s pretty hacky and imposes severe limits on the game design (e.g., “this won’t work with slopes”). Basic movement and collision are the first thing you do, which seems to me like the worst time to be locking yourself out of a lot of design options. I tried very (very, very, very) hard to minimize those kinds of constraints.

Problem 1: Movement

When I showed up, movement was already working. Problem solved!

Like any good programmer, I immediately set out to un-solve it. Given a “real” physics engine like the one Unity prominently features, you have two options: ⓐ treat the player as a physics object, or ⓑ don’t. The existing code went with option ⓑ, like I’d done myself with LÖVE, and like I’d seen countless people advise. Using a physics sim makes for bad platforming.

But… why? I believed it, but I couldn’t concretely defend it. I had to know for myself. So I started a blank project, drew some physics boxes, and wrote a dozen-line player controller.
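
That controller looked something like this (a reconstruction for illustration, not the actual test code; the numbers are made up):

    // A made-up approximation of that dozen-line controller: a plain dynamic
    // Rigidbody2D with gravity left on, pushed around with forces.
    using UnityEngine;

    [RequireComponent(typeof(Rigidbody2D))]
    public class NaiveController : MonoBehaviour
    {
        public float moveForce = 40f;   // illustrative value
        Rigidbody2D body;

        void Awake()
        {
            body = GetComponent<Rigidbody2D>();
        }

        void FixedUpdate()
        {
            // Horizontal input becomes a sideways push; the simulation does the rest.
            float input = Input.GetAxisRaw("Horizontal");
            body.AddForce(Vector2.right * input * moveForce);
        }
    }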

Ah! Immediate enlightenment.

If the player was sliding down a wall, and I tried to move them into the wall, they would simply freeze in midair until I let go of the movement key. The trouble is that the physics sim works in terms of forces — moving the player involves giving them a nudge in some direction, like a giant invisible hand pushing them around the level. Surprise! If you press a real object against a real wall with your real hand, you’ll see the same effect — friction will cancel out gravity, and the object will stay in midair.

Platformer movement, as it turns out, doesn’t make any goddamn physical sense. What is air control? What are you pushing against? Nothing, really; we just have it because it’s nice to play with, because not having it is a nightmare.

I looked to see if there were any common solutions to this, and I only really found one: make all your walls frictionless.

Game development is full of hacks like this, and I… don’t like them. I can accept that minor hacks are necessary sometimes, but this one makes an early and widespread change to a fundamental system to “fix” something that was wrong in the first place. It also imposes an “invisible” requirement, something I try to avoid at all costs — if you forget to make a particular wall frictionless, you’ll never know unless you happen to try sliding down it.

And so, I swiftly returned to the existing code. It wasn’t too different from what I’d come up with for LÖVE: it applied gravity by hand, tracked the player’s velocity, computed the intended movement each frame, and moved by that amount. The interesting thing was that it used MovePosition, which schedules a movement for the next physics update and stops the movement if the player hits something solid.

It’s kind of a nice hybrid approach, actually; all the “physics” for conscious actors is done by hand, but the physics engine is still used for collision detection. It’s also used for collision rejection — if the player manages to wedge themselves several pixels into a solid object, for example, the physics engine will try to gently nudge them back out of it with no extra effort required on my part. I still haven’t figured out how to get that to work with my homegrown stuff, which is built to prevent overlap rather than to jiggle things out of it.
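
For concreteness, here is a minimal sketch of that hybrid approach (not the Chaos Composer code, just the general shape of it, with illustrative names and numbers):

    // Gravity and velocity are integrated by hand; MovePosition schedules the move
    // for the next physics pass and stops a dynamic body when it hits something solid.
    using UnityEngine;

    [RequireComponent(typeof(Rigidbody2D))]
    public class ManualMover : MonoBehaviour
    {
        public float walkSpeed = 6f;      // units per second
        public float gravity = -30f;      // units per second squared

        Rigidbody2D body;
        Vector2 velocity;

        void Awake()
        {
            body = GetComponent<Rigidbody2D>();
            body.gravityScale = 0f;       // gravity is applied by hand below
        }

        void FixedUpdate()
        {
            velocity.x = Input.GetAxisRaw("Horizontal") * walkSpeed;
            velocity.y += gravity * Time.fixedDeltaTime;

            body.MovePosition(body.position + velocity * Time.fixedDeltaTime);
        }
    }

Note how Δt shows up at every step: acceleration times Δt goes into velocity, and velocity times Δt goes into position, which keeps the units consistent in the way discussed a little further down.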

But wait, what about…

Our player is a dynamic body with rotation lock and no gravity. Why not just use a kinematic body?

I must be missing something, because I do not understand the point of kinematic bodies. I ran into this with Godot, too, which documented them the same way: as intended for use as players and other manually-moved objects. But by default, they don’t even collide with other kinematic bodies or static geometry. What? There’s a checkbox to turn this on, which I enabled, but then I found out that MovePosition doesn’t stop kinematic bodies when they hit something, so I would’ve had to cast along the intended path of movement to figure out when to stop, thus duplicating the same work the physics engine was about to do.

But that’s impossible anyway! Static geometry generally wants to be made of edge colliders, right? They don’t care about concave/convex. Imagine the player is standing on the ground near a wall and tries to move towards the wall. Both the ground and the wall are different edges from the same edge collider.

If you try to cast the player’s hitbox horizontally, parallel to the ground, you’ll only get one collision: the existing collision with the ground. Casting doesn’t distinguish between touching and hitting. And because Unity only reports one collision per collider, and because the ground will always show up first, you will never find out about the impending wall collision.

So you’re forced to either use raycasts for collision detection or decomposed polygons for world geometry, both of which are slightly worse tools for no real gain.

I ended up sticking with a dynamic body.


Oh, one other thing that doesn’t really fit anywhere else: keep track of units! If you’re adding something called “velocity” directly to something called “position”, something has gone very wrong. Acceleration is distance per time squared; velocity is distance per time; position is distance. You must multiply or divide by time to convert between them.

I never even, say, add a constant directly to position every frame; I always phrase it as velocity and multiply by Δt. It keeps the units consistent: time is always in seconds, not in tics.

Problem 2: Slopes

Ah, now we start to get off in the weeds.

A sort of pre-problem here was detecting whether we’re on a slope, which means detecting the ground. The codebase originally used a manual physics query of the area around the player’s feet to check for the ground, which seems to be somewhat common, but that can’t tell me the angle of the detected ground. (It’s also kind of error-prone, since “around the player’s feet” has to be specified by hand and may not stay correct through animations or changes in the hitbox.)

I replaced that with what I’d eventually settled on in LÖVE: detect the ground by detecting collisions, and looking at the normal of the collision. A normal is a vector that points straight out from a surface, so if you’re standing on the ground, the normal points straight up; if you’re on a 10° incline, the normal points 10° away from straight up.

Not all collisions are with the ground, of course, so I assumed something is ground if the normal pointed away from gravity. (I like this definition more than “points upwards”, because it avoids assuming anything about the direction of gravity, which leaves some interesting doors open for later on.) That’s easily detected by taking the dot product — if it’s negative, the collision was with the ground, and I now have the normal of the ground.

Actually doing this in practice was slightly tricky. With my LÖVE engine, I could cram this right into the middle of collision resolution. With Unity, not quite so much. I went through a couple iterations before I really grasped Unity’s execution order, which I guess I will have to briefly recap for this to make sense.

Unity essentially has two update cycles. It performs physics updates at fixed intervals for consistency, and updates everything else just before rendering. Within a single frame, Unity does as many fixed physics updates as it has spare time for (which might be zero, one, or more), then does a regular update, then renders. User code can implement either or both of Update, which runs during a regular update, and FixedUpdate, which runs just before Unity does a physics pass.

So my solution was:

  • At the very end of FixedUpdate, clear the actor’s “on ground” flag and ground normal.

  • During OnCollisionEnter2D and OnCollisionStay2D (which are called from within a physics pass), if there’s a collision that looks like it’s with the ground, set the “on ground” flag and ground normal. (If there are multiple ground collisions, well, good luck figuring out the best way to resolve that! At the moment I’m just taking the first and hoping for the best.)

That means there’s a brief window between the end of FixedUpdate and Unity’s physics pass during which a grounded actor might mistakenly believe it’s not on the ground, which is a bit of a shame, but there are very few good reasons for anything to be happening in that window.
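
As a sketch, the whole dance looks something like this (the field and method names are mine, and the global Physics2D.gravity stands in for whatever gravity vector the actor actually uses):

    using UnityEngine;

    public class GroundSensor : MonoBehaviour
    {
        public bool onGround;
        public Vector2 groundNormal = Vector2.up;

        void FixedUpdate()
        {
            // In the real thing this clearing happens at the very end of the actor's
            // FixedUpdate; the physics pass that follows re-sets it via the callbacks.
            onGround = false;
            groundNormal = Vector2.up;
        }

        void OnCollisionEnter2D(Collision2D collision) { CheckForGround(collision); }
        void OnCollisionStay2D(Collision2D collision)  { CheckForGround(collision); }

        void CheckForGround(Collision2D collision)
        {
            foreach (ContactPoint2D contact in collision.contacts)
            {
                // A surface counts as ground if its normal points away from gravity,
                // i.e. the dot product with gravity is negative.
                if (Vector2.Dot(contact.normal, Physics2D.gravity) < 0f)
                {
                    onGround = true;
                    groundNormal = contact.normal;
                    return;   // just take the first ground contact and hope for the best
                }
            }
        }
    }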

Okay! Now we can do slopes.

Just kidding! First we have to do sliding.

When I first looked at this code, it didn’t apply gravity while the player was on the ground. I think I may have had some problems with detecting the ground as a result, since the player was no longer pushing down against it? Either way, it seemed like a silly special case, so I made gravity always apply.

Lo! I was a fool. The player could no longer move.

Why? Because MovePosition does exactly what it promises. If the player collides with something, they’ll stop moving. Applying gravity means that the player is trying to move diagonally downwards into the ground, and so MovePosition stops them immediately.

Hence, sliding. I don’t want the player to actually try to move into the ground. I want them to move the unblocked part of that movement. For flat ground, that means the horizontal part, which is pretty much the same as discarding gravity. For sloped ground, it’s a bit more complicated!

Okay but actually it’s less complicated than you’d think. It can be done with some cross products fairly easily, but Unity makes it even easier with a couple casts. There’s a Vector3.ProjectOnPlane function that projects an arbitrary vector on a plane given by its normal — exactly the thing I want! So I apply that to the attempted movement before passing it along to MovePosition. I do the same thing with the current velocity, to prevent the player from accelerating infinitely downwards while standing on flat ground.

One other thing: I don’t actually use the detected ground normal for this. The player might be touching two ground surfaces at the same time, and I’d want to project on both of them. Instead, I use the player body’s GetContacts method, which returns contact points (and normals!) for everything the player is currently touching. I believe those contact points are tracked by the physics engine anyway, so asking for them doesn’t require any actual physics work.

(Looking at the code I have, I notice that I still only perform the slide for surfaces facing upwards — but I’d want to slide against sloped ceilings, too. Why did I do this? Maybe I should remove that.)

(Also, I’m pretty sure projecting a vector on a plane is non-commutative, which raises the question of which order the projections should happen in and what difference it makes. I don’t have a good answer.)

(I note that my LÖVE setup does something slightly different: it just tries whatever the movement ought to be, and if there’s a collision, then it projects — and tries again with the remaining movement. But I can’t ask Unity to do multiple moves in one physics update, alas.)
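
Here is roughly what the projection step can look like, assuming the manual-velocity setup sketched earlier; the attempted movement (and, separately, the velocity) gets run through this before MovePosition:

    using UnityEngine;

    public static class Sliding
    {
        static readonly ContactPoint2D[] contacts = new ContactPoint2D[16];

        public static Vector2 Slide(Rigidbody2D body, Vector2 attempted)
        {
            int count = body.GetContacts(contacts);
            for (int i = 0; i < count; i++)
            {
                Vector2 normal = contacts[i].normal;
                // Only project against ground-like surfaces (normal opposes gravity);
                // as noted above, sloped ceilings arguably deserve the same treatment.
                if (Vector2.Dot(normal, Physics2D.gravity) < 0f)
                {
                    // ProjectOnPlane works on Vector3s; the 2D vectors convert implicitly.
                    attempted = Vector3.ProjectOnPlane(attempted, normal);
                }
            }
            return attempted;
        }
    }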

Okay! Now, slopes. But actually, with the above work done, slopes are most of the way there already.

One obvious problem is that the player tries to move horizontally even when on a slope, and the easy fix is to change their movement from speed * Vector2.right to speed * new Vector2(ground.y, -ground.x) while on the ground. That’s the ground normal rotated a quarter-turn clockwise, so for flat ground it still points to the right, and in general it points rightwards along the ground. (Note that it assumes the ground normal is a unit vector, but as far as I’m aware, that’s true for all the normals Unity gives you.)

Another issue is that if the player stands motionless on a slope, gravity will cause them to slowly slide down it — because the movement from gravity will be projected onto the slope, and unlike flat ground, the result is no longer zero. For conscious actors only, I counter this by adding the opposite factor to the player’s velocity as part of adding in their walking speed. This matches how the real world works, to some extent: when you’re standing on a hill, you’re exerting some small amount of effort just to stay in place.

(Note that slope resistance is not the same as friction. Okay, yes, in the real world, virtually all resistance to movement happens as a result of friction, but bracing yourself against the ground isn’t the same as being passively resisted.)
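
In code, the grounded walking velocity might look something like this. It's my phrasing of the idea, not the project's exact bookkeeping, which depends on where in the update gravity gets applied; groundNormal is assumed to be a unit vector.

    using UnityEngine;

    public static class SlopeWalking
    {
        public static Vector2 GroundedWalkVelocity(
            float input, float walkSpeed, float gravity, Vector2 groundNormal)
        {
            // The ground normal rotated a quarter-turn clockwise: rightwards along the ground.
            Vector2 alongGround = new Vector2(groundNormal.y, -groundNormal.x);

            // Gravity's contribution to velocity this step, projected onto the slope;
            // subtracting it is the "opposite factor" that keeps a motionless player put.
            Vector2 gravityDelta = new Vector2(0f, gravity) * Time.fixedDeltaTime;
            Vector2 residualSlide = Vector3.ProjectOnPlane(gravityDelta, groundNormal);

            return input * walkSpeed * alongGround - residualSlide;
        }
    }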

From here there are a lot of things you can do, depending on how you think slopes should be handled. You could make the player unable to walk up slopes that are too steep. You could make walking down a slope faster than walking up it. You could make jumping go along the ground normal, rather than straight up. You could raise the player’s max allowed speed while running downhill. Whatever you want, really. Armed with a normal and awareness of dot products, you can do whatever you want.

But first you might want to fix a few aggravating side effects.

Problem 3: Ground adherence

I don’t know if there’s a better name for this. I rarely even see anyone talk about it, which surprises me; it seems like it should be a very common problem.

The problem is: if the player runs up a slope which then abruptly changes to flat ground, their momentum will carry them into the air. For very fast players going off the top of very steep slopes, this makes sense, but it becomes visible even for relatively gentle slopes. It was a mild nightmare in the original release of our game Lunar Depot 38, which has very “rough” ground made up of lots of shallow slopes — so the player is very frequently slightly off the ground, which meant they couldn’t jump, for seemingly no reason. (I even had code to fix this, but I disabled it because of a silly visual side effect that I never got around to fixing.)

Anyway! The reason this is a problem is that game protagonists are generally not boxes sliding around — they have legs. We don’t go flying off the top of real-world hilltops because we put our foot down until it touches the ground.

Simulating this footfall is surprisingly fiddly to get right, especially with someone else’s physics engine. It’s made somewhat easier by Cast, which casts the entire hitbox — no matter what shape it is — in a particular direction, as if it had moved, and tells you all the hypothetical collisions in order.

So I cast the player in the direction of gravity by some distance. If the cast hits something solid with a ground-like collision normal, then the player must be close to the ground, and I move them down to touch it (and set that ground as the new ground normal).

There are some wrinkles.

Wrinkle 1: I only want to do this if the player is off the ground now, but was on the ground last frame, and is not deliberately moving upwards. That latter condition means I want to skip this logic if the player jumps, for example, but also if the player is thrust upwards by a spring or abducted by a UFO or whatever. As long as external code goes through some interface and doesn’t mess with the player’s velocity directly, that shouldn’t be too hard to track.

Wrinkle 2: When does this logic run? It needs to happen after the player moves, which means after a Unity physics pass… but there’s no callback for that point in time. I ended up running it at the beginning of FixedUpdate and the beginning of Update — since I definitely want to do it before rendering happens! That means it’ll sometimes happen twice between physics updates. (I could carefully juggle a flag to skip the second run, but I… didn’t do that. Yet?)

Wrinkle 3: I can’t move the player with MovePosition! Remember, MovePosition schedules a movement, it doesn’t actually perform one; that means if it’s called twice before the physics pass, the first call is effectively ignored. I can’t easily combine the drop with the player’s regular movement, for various fiddly reasons. I ended up doing it “by hand” using transform.Translate, which I think was the “old way” to do manual movement before MovePosition existed. I’m not totally sure if it activates triggers? For that matter, I’m not sure it even notices collisions — but since I did a full-body Cast, there shouldn’t be any anyway.

Wrinkle 4: What, exactly, is “some distance”? I’ve yet to find a satisfying answer for this. It seems like it ought to be based on the player’s current speed and the slope of the ground they’re moving along, but every time I’ve done that math, I’ve gotten totally ludicrous answers that sometimes exceed the size of a tile. But maybe that’s not wrong? Play around, I guess, and think about when the effect should “break” and the player should go flying off the top of a hill.

Wrinkle 5: It’s possible that the player will launch off a slope, hit something, and then be adhered to the ground where they wouldn’t have hit it. I don’t much like this edge case, but I don’t see a way around it either.

This problem is surprisingly awkward for how simple it sounds, and the solution isn’t entirely satisfying. Oh, well; the results are much nicer than the solution. As an added bonus, this also fixes occasional problems with running down a hill and becoming detached from the ground due to precision issues or whathaveyou.
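
Put together, the footfall pass looks roughly like this, a sketch under the assumptions above, with snapDistance standing in for the unresolved "some distance":

    // Meant to be called at the start of both FixedUpdate and Update, per Wrinkle 2.
    using UnityEngine;

    [RequireComponent(typeof(Rigidbody2D))]
    public class GroundSnap : MonoBehaviour
    {
        public float snapDistance = 0.25f;   // tune per game; see Wrinkle 4
        Rigidbody2D body;
        readonly RaycastHit2D[] hits = new RaycastHit2D[8];

        void Awake()
        {
            body = GetComponent<Rigidbody2D>();
        }

        public void TrySnap(bool onGroundNow, bool onGroundLastFrame, bool movingUpwards)
        {
            // Wrinkle 1: only snap if we just left the ground and aren't deliberately rising.
            if (onGroundNow || !onGroundLastFrame || movingUpwards)
                return;

            Vector2 down = Physics2D.gravity.normalized;
            int count = body.Cast(down, hits, snapDistance);
            for (int i = 0; i < count; i++)
            {
                // Ground-like surface: its normal points away from gravity.
                if (Vector2.Dot(hits[i].normal, Physics2D.gravity) < 0f)
                {
                    // Wrinkle 3: move "by hand" rather than with MovePosition.
                    transform.Translate(down * hits[i].distance, Space.World);
                    break;
                }
            }
        }
    }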

Problem 4: One-way platforms

Ah, what a nightmare.

It took me ages just to figure out how to define one-way platforms. Only block when the player is moving downwards? Nope. Only block when the player is above the platform? Nuh-uh.

Well, okay, yes, those approaches might work for convex players and flat platforms. But what about… sloped, one-way platforms? There’s no reason you shouldn’t be able to have those. If Super Mario World can do it, surely Unity can do it almost 30 years later.

The trick is, again, to look at the collision normal. If it faces away from gravity, the player is hitting a ground-like surface, so the platform should block them. Otherwise (or if the player overlaps the platform), it shouldn’t.

Here’s the catch: Unity doesn’t have conditional collision. I can’t decide, on the fly, whether a collision should block or not. In fact, I think that by the time I get a callback like OnCollisionEnter2D, the physics pass is already over.

I could go the other way and use triggers (which are non-blocking), but then I have the opposite problem: I can’t stop the player on the fly. I could move them back to where they hit the trigger, but I envision all kinds of problems as a result. What if they were moving fast enough to activate something on the other side of the platform? What if something else moved to where I’m trying to shove them back to in the meantime? How does this interact with ground detection and listing contacts, which would rightly ignore a trigger as non-blocking?

I beat my head against this for a while, but the inability to respond to collision conditionally was a huge roadblock. It’s all the more infuriating a problem, because Unity ships with a one-way platform modifier thing. Unfortunately, it seems to have been implemented by someone who has never played a platformer. It’s literally one-way — the player is only allowed to move straight upwards through it, not in from the sides. It also tries to block the player if they’re moving downwards while inside the platform, which invokes clumsy rejection behavior. And this all seems to be built into the physics engine itself somehow, so I can’t simply copy whatever they did.

Eventually, I settled on the following. After calculating attempted movement (including sliding), just at the end of FixedUpdate, I do a Cast along the movement vector. I’m not thrilled about having to duplicate the physics engine’s own work, but I do filter to only things on a “one-way platform” physics layer, which should at least help. For each object the cast hits, I use Physics2D.IgnoreCollision to either ignore or un-ignore the collision between the player and the platform, depending on whether the collision was ground-like or not.

(A lot of people suggested turning off collision between layers, but that can’t possibly work — the player might be standing on one platform while inside another, and anyway, this should work for all actors!)

Again, wrinkles! But fewer this time. Actually, maybe just one: handling the case where the player already overlaps the platform. I can’t just check for that with e.g. OverlapCollider, because that doesn’t distinguish between overlapping and merely touching.

I came up with a fairly simple fix: if I was going to un-ignore the collision (i.e. make the platform block), and the cast distance is reported as zero (either already touching or overlapping), I simply do nothing instead. If I’m standing on the platform, I must have already set it blocking when I was approaching it from the top anyway; if I’m overlapping it, I must have already set it non-blocking to get here in the first place.

I can imagine a few cases where this might go wrong. Moving platforms, especially, are going to cause some interesting issues. But this is the best I can do with what I know, and it seems to work well enough so far.

Oh, and our player can deliberately drop down through platforms, which was easy enough to implement; I just decide the platform is always passable while some button is held down.

Problem 5: Pushers and carriers

I haven’t gotten to this yet! Oh boy, can’t wait. I implemented it in LÖVE, but my way was hilariously invasive; I’m hoping that having a physics engine that supports a handwaved “this pushes that” will help. Of course, you also have to worry about sticking to platforms, for which the recommended solution is apparently to parent the cargo to the platform, which sounds goofy to me? I guess I’ll find out when I throw myself at it later.

Overall result

I ended up with a fairly pleasant-feeling system that supports slopes and one-way platforms and whatnot, with all the same pieces as I came up with for LÖVE. The code somehow ended up as less of a mess, too, but it probably helps that I’ve been down this rabbit hole once before and kinda knew what I was aiming for this time.

Animation of a character running smoothly along the top of an irregular dinosaur skeleton

Sorry that I don’t have a big block of code for you to copy-paste into your project. I don’t think there are nearly enough narrative discussions of these fundamentals, though, so hopefully this is useful to someone. If not, well, look forward to ✨ my book, that I am writing ✨!

Predict Billboard Top 10 Hits Using RStudio, H2O and Amazon Athena

Post Syndicated from Gopal Wunnava original https://aws.amazon.com/blogs/big-data/predict-billboard-top-10-hits-using-rstudio-h2o-and-amazon-athena/

Success in the popular music industry is typically measured in terms of the number of Top 10 hits artists have to their credit. The music industry is a highly competitive multi-billion dollar business, and record labels incur various costs in exchange for a percentage of the profits from sales and concert tickets.

Predicting the success of an artist’s release in the popular music industry can be difficult. One release may be extremely popular, resulting in widespread play on TV, radio and social media, while another single may turn out quite unpopular, and therefore unprofitable. Record labels need to be selective in their decision making, and predictive analytics can help them with decision making around the type of songs and artists they need to promote.

In this walkthrough, you leverage H2O.ai, Amazon Athena, and RStudio to make predictions on whether a song might make it to the Top 10 Billboard charts. You explore the GLM, GBM, and deep learning modeling techniques using H2O’s rapid, distributed and easy-to-use open source parallel processing engine. RStudio is a popular IDE, licensed either commercially or under AGPLv3, for working with R. This is ideal if you don’t want to connect to a server via SSH and use code editors such as vi to do analytics. RStudio is available in a desktop version, or a server version that allows you to access R via a web browser. RStudio’s Notebooks feature is used to demonstrate the execution of code and output. In addition, this post showcases how you can leverage Athena for query and interactive analysis during the modeling phase. A working knowledge of statistics and machine learning would be helpful to interpret the analysis being performed in this post.

Walkthrough

Your goal is to predict whether a song will make it to the Top 10 of the Billboard charts. For this purpose, you will use multiple modeling techniques (GLM, GBM, and deep learning) and choose the model that is the best fit.

This solution involves the following steps:

  • Install and configure RStudio with Athena
  • Log in to RStudio
  • Install R packages
  • Connect to Athena
  • Create a dataset
  • Create models

Install and configure RStudio with Athena

Use the following AWS CloudFormation stack to install, configure, and connect RStudio on an Amazon EC2 instance with Athena.

Launching this stack creates all required resources and prerequisites:

  • Amazon EC2 instance with Amazon Linux (minimum size of t2.large is recommended)
  • Provisioning of the EC2 instance in an existing VPC and public subnet
  • Installation of Java 8
  • Assignment of an IAM role to the EC2 instance with the required permissions for accessing Athena and Amazon S3
  • Security group allowing access to the RStudio and SSH ports from the internet (I recommend restricting access to these ports)
  • S3 staging bucket required for Athena (referenced within RStudio as ATHENABUCKET)
  • RStudio username and password
  • Setup logs in Amazon CloudWatch Logs (if needed for additional troubleshooting)
  • Amazon EC2 Systems Manager agent, which makes it easy to manage and patch the instance

All AWS resources are created in the US-East-1 Region. To avoid cross-region data transfer fees, launch the CloudFormation stack in the same region. To check the availability of Athena in other regions, see Region Table.

Log in to RStudio

The instance security group has been automatically configured to allow incoming connections on the RStudio port 8787 from any source internet address. You can edit the security group to restrict source IP access. If you have trouble connecting, ensure that port 8787 isn’t blocked by subnet network ACLs or by your outgoing proxy/firewall.

  1. In the CloudFormation stack, choose Outputs, Value, and then open the RStudio URL. You might need to wait for a few minutes until the instance has been launched.
  2. Log in to RStudio with the username and password you provided during setup.

Install R packages

Next, install the required R packages from the RStudio console. You can download the R notebook file containing just the code.

#install pacman – a handy package manager for managing installs
if("pacman" %in% rownames(installed.packages()) == FALSE)
{install.packages("pacman")}  
library(pacman)
p_load(h2o,rJava,RJDBC,awsjavasdk)
h2o.init(nthreads = -1)
##  Connection successful!
## 
## R is connected to the H2O cluster: 
##     H2O cluster uptime:         2 hours 42 minutes 
##     H2O cluster version:        3.10.4.6 
##     H2O cluster version age:    4 months and 4 days !!! 
##     H2O cluster name:           H2O_started_from_R_rstudio_hjx881 
##     H2O cluster total nodes:    1 
##     H2O cluster total memory:   3.30 GB 
##     H2O cluster total cores:    4 
##     H2O cluster allowed cores:  4 
##     H2O cluster healthy:        TRUE 
##     H2O Connection ip:          localhost 
##     H2O Connection port:        54321 
##     H2O Connection proxy:       NA 
##     H2O Internal Security:      FALSE 
##     R Version:                  R version 3.3.3 (2017-03-06)
## Warning in h2o.clusterInfo(): 
## Your H2O cluster version is too old (4 months and 4 days)!
## Please download and install the latest version from http://h2o.ai/download/
#install aws sdk if not present (pre-requisite for using Athena with an IAM role)
if (!aws_sdk_present()) {
  install_aws_sdk()
}

load_sdk()
## NULL

Connect to Athena

Next, establish a connection to Athena from RStudio, using an IAM role associated with your EC2 instance. Use ATHENABUCKET to specify the S3 staging directory.

URL <- 'https://s3.amazonaws.com/athena-downloads/drivers/AthenaJDBC41-1.0.1.jar'
fil <- basename(URL)
#download the file into current working directory
if (!file.exists(fil)) download.file(URL, fil)
#verify that the file has been downloaded successfully
list.files()
## [1] "AthenaJDBC41-1.0.1.jar"
drv <- JDBC(driverClass="com.amazonaws.athena.jdbc.AthenaDriver", fil, identifier.quote="'")

con <- jdbcConnection <- dbConnect(drv, 'jdbc:awsathena://athena.us-east-1.amazonaws.com:443/',
                                   s3_staging_dir=Sys.getenv("ATHENABUCKET"),
                                   aws_credentials_provider_class="com.amazonaws.auth.DefaultAWSCredentialsProviderChain")

Verify the connection. The results returned depend on your specific Athena setup.

con
## <JDBCConnection>
dbListTables(con)
##  [1] "gdelt"               "wikistats"           "elb_logs_raw_native"
##  [4] "twitter"             "twitter2"            "usermovieratings"   
##  [7] "eventcodes"          "events"              "billboard"          
## [10] "billboardtop10"      "elb_logs"            "gdelthist"          
## [13] "gdeltmaster"         "twitter"             "twitter3"

Create a dataset

For this analysis, you use a sample dataset combining information from Billboard and Wikipedia with Echo Nest data in the Million Song Dataset. Upload this dataset into your own S3 bucket. The table below provides a description of the fields used in this dataset.

Field: Description
year: Year that song was released
songtitle: Title of the song
artistname: Name of the song artist
songid: Unique identifier for the song
artistid: Unique identifier for the song artist
timesignature: Variable estimating the time signature of the song
timesignature_confidence: Confidence in the estimate for the timesignature
loudness: Continuous variable indicating the average amplitude of the audio in decibels
tempo: Variable indicating the estimated beats per minute of the song
tempo_confidence: Confidence in the estimate for tempo
key: Variable with twelve levels indicating the estimated key of the song (C, C#, ..., B)
key_confidence: Confidence in the estimate for key
energy: Variable that represents the overall acoustic energy of the song, using a mix of features such as loudness
pitch: Continuous variable that indicates the pitch of the song
timbre_0_min thru timbre_11_min: Variables that indicate the minimum values over all segments for each of the twelve values in the timbre vector
timbre_0_max thru timbre_11_max: Variables that indicate the maximum values over all segments for each of the twelve values in the timbre vector
top10: Indicator for whether or not the song made it to the Top 10 of the Billboard charts (1 if it was in the top 10, and 0 if not)

Create an Athena table based on the dataset

In the Athena console, select the default database, sampledb, or create a new database.

Run the following create table statement.

create external table if not exists billboard
(
year int,
songtitle string,
artistname string,
songID string,
artistID string,
timesignature int,
timesignature_confidence double,
loudness double,
tempo double,
tempo_confidence double,
key int,
key_confidence double,
energy double,
pitch double,
timbre_0_min double,
timbre_0_max double,
timbre_1_min double,
timbre_1_max double,
timbre_2_min double,
timbre_2_max double,
timbre_3_min double,
timbre_3_max double,
timbre_4_min double,
timbre_4_max double,
timbre_5_min double,
timbre_5_max double,
timbre_6_min double,
timbre_6_max double,
timbre_7_min double,
timbre_7_max double,
timbre_8_min double,
timbre_8_max double,
timbre_9_min double,
timbre_9_max double,
timbre_10_min double,
timbre_10_max double,
timbre_11_min double,
timbre_11_max double,
Top10 int
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://aws-bigdata-blog/artifacts/predict-billboard/data'
;

Inspect the table definition for the ‘billboard’ table that you have created. If you chose a database other than sampledb, replace that value with your choice.

dbGetQuery(con, "show create table sampledb.billboard")
##                                      createtab_stmt
## 1       CREATE EXTERNAL TABLE `sampledb.billboard`(
## 2                                       `year` int,
## 3                               `songtitle` string,
## 4                              `artistname` string,
## 5                                  `songid` string,
## 6                                `artistid` string,
## 7                              `timesignature` int,
## 8                `timesignature_confidence` double,
## 9                                `loudness` double,
## 10                                  `tempo` double,
## 11                       `tempo_confidence` double,
## 12                                       `key` int,
## 13                         `key_confidence` double,
## 14                                 `energy` double,
## 15                                  `pitch` double,
## 16                           `timbre_0_min` double,
## 17                           `timbre_0_max` double,
## 18                           `timbre_1_min` double,
## 19                           `timbre_1_max` double,
## 20                           `timbre_2_min` double,
## 21                           `timbre_2_max` double,
## 22                           `timbre_3_min` double,
## 23                           `timbre_3_max` double,
## 24                           `timbre_4_min` double,
## 25                           `timbre_4_max` double,
## 26                           `timbre_5_min` double,
## 27                           `timbre_5_max` double,
## 28                           `timbre_6_min` double,
## 29                           `timbre_6_max` double,
## 30                           `timbre_7_min` double,
## 31                           `timbre_7_max` double,
## 32                           `timbre_8_min` double,
## 33                           `timbre_8_max` double,
## 34                           `timbre_9_min` double,
## 35                           `timbre_9_max` double,
## 36                          `timbre_10_min` double,
## 37                          `timbre_10_max` double,
## 38                          `timbre_11_min` double,
## 39                          `timbre_11_max` double,
## 40                                     `top10` int)
## 41                             ROW FORMAT DELIMITED 
## 42                         FIELDS TERMINATED BY ',' 
## 43                            STORED AS INPUTFORMAT 
## 44       'org.apache.hadoop.mapred.TextInputFormat' 
## 45                                     OUTPUTFORMAT 
## 46  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
## 47                                        LOCATION
## 48    's3://aws-bigdata-blog/artifacts/predict-billboard/data'
## 49                                  TBLPROPERTIES (
## 50            'transient_lastDdlTime'='1505484133')

Run a sample query

Next, run a sample query to obtain a list of all songs from Janet Jackson that made it to the Billboard Top 10 charts.

dbGetQuery(con, " SELECT songtitle,artistname,top10   FROM sampledb.billboard WHERE lower(artistname) =     'janet jackson' AND top10 = 1")
##                       songtitle    artistname top10
## 1                       Runaway Janet Jackson     1
## 2               Because Of Love Janet Jackson     1
## 3                         Again Janet Jackson     1
## 4                            If Janet Jackson     1
## 5  Love Will Never Do (Without You) Janet Jackson 1
## 6                     Black Cat Janet Jackson     1
## 7               Come Back To Me Janet Jackson     1
## 8                       Alright Janet Jackson     1
## 9                      Escapade Janet Jackson     1
## 10                Rhythm Nation Janet Jackson     1

Determine how many songs in this dataset are specifically from the year 2010.

dbGetQuery(con, " SELECT count(*)   FROM sampledb.billboard WHERE year = 2010")
##   _col0
## 1   373

The sample dataset provides certain song properties of interest that can be analyzed to gauge the impact to the song’s overall popularity. Look at one such property, timesignature, and determine the value that is the most frequent among songs in the database. Timesignature is a measure of the number of beats and the type of note involved.

Running the query directly may result in an error, as shown in the commented lines below. This error is a result of trying to retrieve a large result set over a JDBC connection, which can cause out-of-memory issues at the client level. To address this, reduce the fetch size and run again.

#t<-dbGetQuery(con, " SELECT timesignature FROM sampledb.billboard")
#Note:  Running the preceding query results in the following error: 
#Error in .jcall(rp, "I", "fetch", stride, block): java.sql.SQLException: The requested #fetchSize is more than the allowed value in Athena. Please reduce the fetchSize and try #again. Refer to the Athena documentation for valid fetchSize values.
# Use the dbSendQuery function, reduce the fetch size, and run again
r <- dbSendQuery(con, " SELECT timesignature     FROM sampledb.billboard")
dftimesignature<- fetch(r, n=-1, block=100)
dbClearResult(r)
## [1] TRUE
table(dftimesignature)
## dftimesignature
##    0    1    3    4    5    7 
##   10  143  503 6787  112   19
nrow(dftimesignature)
## [1] 7574

From the results, observe that 6787 songs have a timesignature of 4.

Next, determine the song with the highest tempo.

dbGetQuery(con, " SELECT songtitle,artistname,tempo   FROM sampledb.billboard WHERE tempo = (SELECT max(tempo) FROM sampledb.billboard) ")
##                   songtitle      artistname   tempo
## 1 Wanna Be Startin' Somethin' Michael Jackson 244.307

Create the training dataset

Your model needs to be trained such that it can learn and make accurate predictions. Split the data into training and test datasets, and create the training dataset first. This dataset contains all observations from the year 2009 and earlier. You may face the same JDBC connection issue pointed out earlier, so this query again uses dbSendQuery with a reduced fetch size.

#BillboardTrain <- dbGetQuery(con, "SELECT * FROM sampledb.billboard WHERE year <= 2009")
#Running the preceding query results in the following error:-
#Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set for ", : Unable to retrieve #JDBC result set for SELECT * FROM sampledb.billboard WHERE year <= 2009 (Internal error)
#Follow the same approach as before to address this issue.

r <- dbSendQuery(con, "SELECT * FROM sampledb.billboard WHERE year <= 2009")
BillboardTrain <- fetch(r, n=-1, block=100)
dbClearResult(r)
## [1] TRUE
BillboardTrain[1:2,c(1:3,6:10)]
##   year           songtitle artistname timesignature
## 1 2009 The Awkward Goodbye    Athlete             3
## 2 2009        Rubik's Cube    Athlete             3
##   timesignature_confidence loudness   tempo tempo_confidence
## 1                    0.732   -6.320  89.614   0.652
## 2                    0.906   -9.541 117.742   0.542
nrow(BillboardTrain)
## [1] 7201

Create the test dataset

BillboardTest <- dbGetQuery(con, "SELECT * FROM sampledb.billboard where year = 2010")
BillboardTest[1:2,c(1:3,11:15)]
##   year              songtitle        artistname key
## 1 2010 This Is the House That Doubt Built A Day to Remember  11
## 2 2010        Sticks & Bricks A Day to Remember  10
##   key_confidence    energy pitch timbre_0_min
## 1          0.453 0.9666556 0.024        0.002
## 2          0.469 0.9847095 0.025        0.000
nrow(BillboardTest)
## [1] 373

Convert the training and test datasets into H2O dataframes

train.h2o <- as.h2o(BillboardTrain)
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=================================================================| 100%
test.h2o <- as.h2o(BillboardTest)
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=================================================================| 100%

Inspect the column names in your H2O dataframes.

colnames(train.h2o)
##  [1] "year"                     "songtitle"               
##  [3] "artistname"               "songid"                  
##  [5] "artistid"                 "timesignature"           
##  [7] "timesignature_confidence" "loudness"                
##  [9] "tempo"                    "tempo_confidence"        
## [11] "key"                      "key_confidence"          
## [13] "energy"                   "pitch"                   
## [15] "timbre_0_min"             "timbre_0_max"            
## [17] "timbre_1_min"             "timbre_1_max"            
## [19] "timbre_2_min"             "timbre_2_max"            
## [21] "timbre_3_min"             "timbre_3_max"            
## [23] "timbre_4_min"             "timbre_4_max"            
## [25] "timbre_5_min"             "timbre_5_max"            
## [27] "timbre_6_min"             "timbre_6_max"            
## [29] "timbre_7_min"             "timbre_7_max"            
## [31] "timbre_8_min"             "timbre_8_max"            
## [33] "timbre_9_min"             "timbre_9_max"            
## [35] "timbre_10_min"            "timbre_10_max"           
## [37] "timbre_11_min"            "timbre_11_max"           
## [39] "top10"

Create models

You need to designate the independent and dependent variables prior to applying your modeling algorithms. Because you’re trying to predict the ‘top10’ field, this would be your dependent variable and everything else would be independent.

Create your first model using GLM. Because GLM works best with numeric data, you create your model by dropping non-numeric variables. You only use the variables in the dataset that describe the numerical attributes of the song in the logistic regression model. You won’t use these variables:  “year”, “songtitle”, “artistname”, “songid”, or “artistid”.

y.dep <- 39
x.indep <- c(6:38)
x.indep
##  [1]  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
## [24] 29 30 31 32 33 34 35 36 37 38

Create Model 1: All numeric variables

Create Model 1 with the training dataset, using GLM as the modeling algorithm and H2O’s built-in h2o.glm function.

modelh1 <- h2o.glm( y = y.dep, x = x.indep, training_frame = train.h2o, family = "binomial")
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=====                                                            |   8%
  |                                                                       
  |=================================================================| 100%

Measure the performance of Model 1, using H2O’s built-in performance function.

h2o.performance(model=modelh1,newdata=test.h2o)
## H2OBinomialMetrics: glm
## 
## MSE:  0.09924684
## RMSE:  0.3150347
## LogLoss:  0.3220267
## Mean Per-Class Error:  0.2380168
## AUC:  0.8431394
## Gini:  0.6862787
## R^2:  0.254663
## Null Deviance:  326.0801
## Residual Deviance:  240.2319
## AIC:  308.2319
## 
## Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
##          0   1    Error     Rate
## 0      255  59 0.187898  =59/314
## 1       17  42 0.288136   =17/59
## Totals 272 101 0.203753  =76/373
## 
## Maximum Metrics: Maximum metrics at their respective thresholds
##                         metric threshold    value idx
## 1                       max f1  0.192772 0.525000 100
## 2                       max f2  0.124912 0.650510 155
## 3                 max f0point5  0.416258 0.612903  23
## 4                 max accuracy  0.416258 0.879357  23
## 5                max precision  0.813396 1.000000   0
## 6                   max recall  0.037579 1.000000 282
## 7              max specificity  0.813396 1.000000   0
## 8             max absolute_mcc  0.416258 0.455251  23
## 9   max min_per_class_accuracy  0.161402 0.738854 125
## 10 max mean_per_class_accuracy  0.124912 0.765006 155
## 
## Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or ` 
h2o.auc(h2o.performance(modelh1,test.h2o)) 
## [1] 0.8431394

The AUC metric provides insight into how well the classifier is able to separate the two classes. In this case, the value of 0.8431394 indicates that the classification is good. (A value of 0.5 indicates a worthless test, while a value of 1.0 indicates a perfect test.)

Next, inspect the coefficients of the variables in the dataset.

dfmodelh1 <- as.data.frame(h2o.varimp(modelh1))
dfmodelh1
##                       names coefficients sign
## 1              timbre_0_max  1.290938663  NEG
## 2                  loudness  1.262941934  POS
## 3                     pitch  0.616995941  NEG
## 4              timbre_1_min  0.422323735  POS
## 5              timbre_6_min  0.349016024  NEG
## 6                    energy  0.348092062  NEG
## 7             timbre_11_min  0.307331997  NEG
## 8              timbre_3_max  0.302225619  NEG
## 9             timbre_11_max  0.243632060  POS
## 10             timbre_4_min  0.224233951  POS
## 11             timbre_4_max  0.204134342  POS
## 12             timbre_5_min  0.199149324  NEG
## 13             timbre_0_min  0.195147119  POS
## 14 timesignature_confidence  0.179973904  POS
## 15         tempo_confidence  0.144242598  POS
## 16            timbre_10_max  0.137644568  POS
## 17             timbre_7_min  0.126995955  NEG
## 18            timbre_10_min  0.123851179  POS
## 19             timbre_7_max  0.100031481  NEG
## 20             timbre_2_min  0.096127636  NEG
## 21           key_confidence  0.083115820  POS
## 22             timbre_6_max  0.073712419  POS
## 23            timesignature  0.067241917  POS
## 24             timbre_8_min  0.061301881  POS
## 25             timbre_8_max  0.060041698  POS
## 26                      key  0.056158445  POS
## 27             timbre_3_min  0.050825116  POS
## 28             timbre_9_max  0.033733561  POS
## 29             timbre_2_max  0.030939072  POS
## 30             timbre_9_min  0.020708113  POS
## 31             timbre_1_max  0.014228818  NEG
## 32                    tempo  0.008199861  POS
## 33             timbre_5_max  0.004837870  POS
## 34                                    NA <NA>

Typically, songs with heavier instrumentation tend to be louder (have higher values in the variable “loudness”) and more energetic (have higher values in the variable “energy”). This knowledge is helpful for interpreting the modeling results.

You can make the following observations from the results:

  • The coefficient estimates for the confidence values associated with the time signature, key, and tempo variables are positive. This suggests that higher confidence leads to a higher predicted probability of a Top 10 hit.
  • The coefficient estimate for loudness is positive, meaning that mainstream listeners prefer louder songs with heavier instrumentation.
  • The coefficient estimate for energy is negative, meaning that mainstream listeners prefer songs that are less energetic, which are those songs with light instrumentation.

These coefficients lead to contradictory conclusions for Model 1. This could be due to multicollinearity issues. Inspect the correlation between the variables “loudness” and “energy” in the training set.

cor(train.h2o$loudness,train.h2o$energy)
## [1] 0.7399067

This number indicates that these two variables are highly correlated, and Model 1 does indeed suffer from multicollinearity. Typically, a value between -1.0 and -0.5 or between 0.5 and 1.0 indicates strong correlation, and a value between -0.1 and 0.1 indicates weak correlation. To avoid this correlation issue, omit one of these two variables and re-create the models.
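
Before deciding which variable to drop, you could also scan a few other numeric predictors for strong pairwise correlations. The following is a minimal sketch using the BillboardTrain data frame created earlier; the columns chosen (and the helper name num_cols) are illustrative, not part of the original walkthrough.

#Sketch: pairwise correlations among a few numeric predictors (illustrative selection)
num_cols <- c("loudness", "energy", "pitch", "tempo")
round(cor(BillboardTrain[, num_cols]), 3)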

You build two variations of the original model:

  • Model 2, in which you keep “energy” and omit “loudness”
  • Model 3, in which you keep “loudness” and omit “energy”

You compare these two models and choose the model with a better fit for this use case.

Create Model 2: Keep energy and omit loudness

colnames(train.h2o)
##  [1] "year"                     "songtitle"               
##  [3] "artistname"               "songid"                  
##  [5] "artistid"                 "timesignature"           
##  [7] "timesignature_confidence" "loudness"                
##  [9] "tempo"                    "tempo_confidence"        
## [11] "key"                      "key_confidence"          
## [13] "energy"                   "pitch"                   
## [15] "timbre_0_min"             "timbre_0_max"            
## [17] "timbre_1_min"             "timbre_1_max"            
## [19] "timbre_2_min"             "timbre_2_max"            
## [21] "timbre_3_min"             "timbre_3_max"            
## [23] "timbre_4_min"             "timbre_4_max"            
## [25] "timbre_5_min"             "timbre_5_max"            
## [27] "timbre_6_min"             "timbre_6_max"            
## [29] "timbre_7_min"             "timbre_7_max"            
## [31] "timbre_8_min"             "timbre_8_max"            
## [33] "timbre_9_min"             "timbre_9_max"            
## [35] "timbre_10_min"            "timbre_10_max"           
## [37] "timbre_11_min"            "timbre_11_max"           
## [39] "top10"
y.dep <- 39
x.indep <- c(6:7,9:38)
x.indep
##  [1]  6  7  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
## [24] 30 31 32 33 34 35 36 37 38
modelh2 <- h2o.glm( y = y.dep, x = x.indep, training_frame = train.h2o, family = "binomial")
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=======                                                          |  10%
  |                                                                       
  |=================================================================| 100%

Measure the performance of Model 2.

h2o.performance(model=modelh2,newdata=test.h2o)
## H2OBinomialMetrics: glm
## 
## MSE:  0.09922606
## RMSE:  0.3150017
## LogLoss:  0.3228213
## Mean Per-Class Error:  0.2490554
## AUC:  0.8431933
## Gini:  0.6863867
## R^2:  0.2548191
## Null Deviance:  326.0801
## Residual Deviance:  240.8247
## AIC:  306.8247
## 
## Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
##          0  1    Error     Rate
## 0      280 34 0.108280  =34/314
## 1       23 36 0.389831   =23/59
## Totals 303 70 0.152815  =57/373
## 
## Maximum Metrics: Maximum metrics at their respective thresholds
##                         metric threshold    value idx
## 1                       max f1  0.254391 0.558140  69
## 2                       max f2  0.113031 0.647208 157
## 3                 max f0point5  0.413999 0.596026  22
## 4                 max accuracy  0.446250 0.876676  18
## 5                max precision  0.811739 1.000000   0
## 6                   max recall  0.037682 1.000000 283
## 7              max specificity  0.811739 1.000000   0
## 8             max absolute_mcc  0.254391 0.469060  69
## 9   max min_per_class_accuracy  0.141051 0.716561 131
## 10 max mean_per_class_accuracy  0.113031 0.761821 157
## 
## Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
dfmodelh2 <- as.data.frame(h2o.varimp(modelh2))
dfmodelh2
##                       names coefficients sign
## 1                     pitch  0.700331511  NEG
## 2              timbre_1_min  0.510270513  POS
## 3              timbre_0_max  0.402059546  NEG
## 4              timbre_6_min  0.333316236  NEG
## 5             timbre_11_min  0.331647383  NEG
## 6              timbre_3_max  0.252425901  NEG
## 7             timbre_11_max  0.227500308  POS
## 8              timbre_4_max  0.210663865  POS
## 9              timbre_0_min  0.208516163  POS
## 10             timbre_5_min  0.202748055  NEG
## 11             timbre_4_min  0.197246582  POS
## 12            timbre_10_max  0.172729619  POS
## 13         tempo_confidence  0.167523934  POS
## 14 timesignature_confidence  0.167398830  POS
## 15             timbre_7_min  0.142450727  NEG
## 16             timbre_8_max  0.093377516  POS
## 17            timbre_10_min  0.090333426  POS
## 18            timesignature  0.085851625  POS
## 19             timbre_7_max  0.083948442  NEG
## 20           key_confidence  0.079657073  POS
## 21             timbre_6_max  0.076426046  POS
## 22             timbre_2_min  0.071957831  NEG
## 23             timbre_9_max  0.071393189  POS
## 24             timbre_8_min  0.070225578  POS
## 25                      key  0.061394702  POS
## 26             timbre_3_min  0.048384697  POS
## 27             timbre_1_max  0.044721121  NEG
## 28                   energy  0.039698433  POS
## 29             timbre_5_max  0.039469064  POS
## 30             timbre_2_max  0.018461133  POS
## 31                    tempo  0.013279926  POS
## 32             timbre_9_min  0.005282143  NEG
## 33                                    NA <NA>

h2o.auc(h2o.performance(modelh2,test.h2o)) 
## [1] 0.8431933

You can make the following observations:

  • The AUC metric is 0.8431933.
  • Inspecting the coefficient of the variable energy, Model 2 suggests that songs with high energy levels tend to be more popular. This is in line with expectations.
  • However, because H2O orders the variables by importance, you can see that energy appears near the bottom of the list and is not significant in this model.

You can conclude that Model 2 is not ideal for this use case, as energy is not significant.

Create Model 3: Keep loudness but omit energy

colnames(train.h2o)
##  [1] "year"                     "songtitle"               
##  [3] "artistname"               "songid"                  
##  [5] "artistid"                 "timesignature"           
##  [7] "timesignature_confidence" "loudness"                
##  [9] "tempo"                    "tempo_confidence"        
## [11] "key"                      "key_confidence"          
## [13] "energy"                   "pitch"                   
## [15] "timbre_0_min"             "timbre_0_max"            
## [17] "timbre_1_min"             "timbre_1_max"            
## [19] "timbre_2_min"             "timbre_2_max"            
## [21] "timbre_3_min"             "timbre_3_max"            
## [23] "timbre_4_min"             "timbre_4_max"            
## [25] "timbre_5_min"             "timbre_5_max"            
## [27] "timbre_6_min"             "timbre_6_max"            
## [29] "timbre_7_min"             "timbre_7_max"            
## [31] "timbre_8_min"             "timbre_8_max"            
## [33] "timbre_9_min"             "timbre_9_max"            
## [35] "timbre_10_min"            "timbre_10_max"           
## [37] "timbre_11_min"            "timbre_11_max"           
## [39] "top10"
y.dep <- 39
x.indep <- c(6:12,14:38)
x.indep
##  [1]  6  7  8  9 10 11 12 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
## [24] 30 31 32 33 34 35 36 37 38
modelh3 <- h2o.glm( y = y.dep, x = x.indep, training_frame = train.h2o, family = "binomial")
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |========                                                         |  12%
  |                                                                       
  |=================================================================| 100%
perfh3<-h2o.performance(model=modelh3,newdata=test.h2o)
perfh3
## H2OBinomialMetrics: glm
## 
## MSE:  0.0978859
## RMSE:  0.3128672
## LogLoss:  0.3178367
## Mean Per-Class Error:  0.264925
## AUC:  0.8492389
## Gini:  0.6984778
## R^2:  0.2648836
## Null Deviance:  326.0801
## Residual Deviance:  237.1062
## AIC:  303.1062
## 
## Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
##          0  1    Error     Rate
## 0      286 28 0.089172  =28/314
## 1       26 33 0.440678   =26/59
## Totals 312 61 0.144772  =54/373
## 
## Maximum Metrics: Maximum metrics at their respective thresholds
##                         metric threshold    value idx
## 1                       max f1  0.273799 0.550000  60
## 2                       max f2  0.125503 0.663265 155
## 3                 max f0point5  0.435479 0.628931  24
## 4                 max accuracy  0.435479 0.882038  24
## 5                max precision  0.821606 1.000000   0
## 6                   max recall  0.038328 1.000000 280
## 7              max specificity  0.821606 1.000000   0
## 8             max absolute_mcc  0.435479 0.471426  24
## 9   max min_per_class_accuracy  0.173693 0.745763 120
## 10 max mean_per_class_accuracy  0.125503 0.775073 155
## 
## Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
dfmodelh3 <- as.data.frame(h2o.varimp(modelh3))
dfmodelh3
##                       names coefficients sign
## 1              timbre_0_max 1.216621e+00  NEG
## 2                  loudness 9.780973e-01  POS
## 3                     pitch 7.249788e-01  NEG
## 4              timbre_1_min 3.891197e-01  POS
## 5              timbre_6_min 3.689193e-01  NEG
## 6             timbre_11_min 3.086673e-01  NEG
## 7              timbre_3_max 3.025593e-01  NEG
## 8             timbre_11_max 2.459081e-01  POS
## 9              timbre_4_min 2.379749e-01  POS
## 10             timbre_4_max 2.157627e-01  POS
## 11             timbre_0_min 1.859531e-01  POS
## 12             timbre_5_min 1.846128e-01  NEG
## 13 timesignature_confidence 1.729658e-01  POS
## 14             timbre_7_min 1.431871e-01  NEG
## 15            timbre_10_max 1.366703e-01  POS
## 16            timbre_10_min 1.215954e-01  POS
## 17         tempo_confidence 1.183698e-01  POS
## 18             timbre_2_min 1.019149e-01  NEG
## 19           key_confidence 9.109701e-02  POS
## 20             timbre_7_max 8.987908e-02  NEG
## 21             timbre_6_max 6.935132e-02  POS
## 22             timbre_8_max 6.878241e-02  POS
## 23            timesignature 6.120105e-02  POS
## 24                      key 5.814805e-02  POS
## 25             timbre_8_min 5.759228e-02  POS
## 26             timbre_1_max 2.930285e-02  NEG
## 27             timbre_9_max 2.843755e-02  POS
## 28             timbre_3_min 2.380245e-02  POS
## 29             timbre_2_max 1.917035e-02  POS
## 30             timbre_5_max 1.715813e-02  POS
## 31                    tempo 1.364418e-02  NEG
## 32             timbre_9_min 8.463143e-05  NEG
## 33                                    NA <NA>
h2o.sensitivity(perfh3,0.5)
## Warning in h2o.find_row_by_threshold(object, t): Could not find exact
## threshold: 0.5 for this set of metrics; using closest threshold found:
## 0.501855569251422. Run `h2o.predict` and apply your desired threshold on a
## probability column.
## [[1]]
## [1] 0.2033898
h2o.auc(perfh3)
## [1] 0.8492389

You can make the following observations:

  • The AUC metric is 0.8492389.
  • From the confusion matrix, the model correctly predicts that 33 songs will be Top 10 hits (true positives). However, it also produces 28 false positives (songs that the model predicted would be Top 10 hits, but ended up not being Top 10 hits) and misses 26 actual Top 10 hits (false negatives).
  • Loudness has a positive coefficient estimate, meaning that this model predicts that songs with heavier instrumentation tend to be more popular. This is the same conclusion from Model 2.
  • Loudness is significant in this model.

Overall, Model 3 predicts a higher number of top 10 hits with an accuracy rate that is acceptable. To choose the best fit for production runs, record labels should consider the following factors:

  • Desired model accuracy at a given threshold
  • Number of correct predictions for top10 hits
  • Tolerable number of false positives or false negatives

Next, make predictions using Model 3 on the test dataset.

predict.regh <- h2o.predict(modelh3, test.h2o)
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |=================================================================| 100%
print(predict.regh)
##   predict        p0          p1
## 1       0 0.9654739 0.034526052
## 2       0 0.9654748 0.034525236
## 3       0 0.9635547 0.036445318
## 4       0 0.9343579 0.065642149
## 5       0 0.9978334 0.002166601
## 6       0 0.9779949 0.022005078
## 
## [373 rows x 3 columns]
predict.regh$predict
##   predict
## 1       0
## 2       0
## 3       0
## 4       0
## 5       0
## 6       0
## 
## [373 rows x 1 column]
dpr<-as.data.frame(predict.regh)
#Rename the predicted column 
colnames(dpr)[colnames(dpr) == 'predict'] <- 'predict_top10'
table(dpr$predict_top10)
## 
##   0   1 
## 312  61

The first set of output results specifies the probabilities associated with each predicted observation.  For example, observation 1 is 96.54739% likely to not be a Top 10 hit, and 3.4526052% likely to be a Top 10 hit (predict=1 indicates Top 10 hit and predict=0 indicates not a Top 10 hit).  The second set of results lists the actual predictions made.  From the third set of results, this model predicts that 61 songs will be Top 10 hits.
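
The predict column shown above is based on the model's F1-optimal threshold. If the record label wants a more or less conservative shortlist, one option is to apply a custom cutoff to the p1 probability column, as the earlier warning messages suggest. This is a sketch only; the 0.3 cutoff and the column name predict_top10_custom are arbitrary illustrations, not recommended values.

#Sketch: apply a custom probability threshold (0.3 is arbitrary) to the p1 column
dpr$predict_top10_custom <- ifelse(dpr$p1 > 0.3, 1, 0)
table(dpr$predict_top10_custom)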

Compute the baseline accuracy by assuming that the baseline predicts the most frequent outcome, which is that most songs are not Top 10 hits.

table(BillboardTest$top10)
## 
##   0   1 
## 314  59

Now observe that the baseline model would get 314 observations correct, and 59 wrong, for an accuracy of 314/(314+59) = 0.8418231.
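
As a quick check, here is a sketch that reproduces this baseline figure, along with the Model 3 accuracy quoted below, directly from the confusion-matrix counts shown earlier.

#Baseline: always predict "not a Top 10 hit"
314 / (314 + 59)
## [1] 0.8418231
#Model 3 at its F1-optimal threshold: 286 true negatives + 33 true positives out of 373
(286 + 33) / 373
## [1] 0.8552279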

It seems that Model 3, with an accuracy of 0.8552, provides you with a small improvement over the baseline model. But is this model useful for record labels?

View the two models from an investment perspective:

  • A production company is interested in investing in songs that are more likely to make it to the Top 10. The company’s objective is to minimize the risk of financial losses attributed to investing in songs that end up unpopular.
  • How many songs does Model 3 correctly predict as a Top 10 hit in 2010? Looking at the confusion matrix, you see that it predicts 33 Top 10 hits correctly at its optimal threshold, which is more than half of the 59 songs that actually reached the Top 10 that year.
  • It will be more useful to the record label if you can provide the production company with a list of songs that are highly likely to end up in the Top 10.
  • The baseline model is not useful, as it simply does not label any song as a hit.

Considering the three models built so far, you can conclude that Model 3 proves to be the best investment choice for the record label.

GBM model

H2O provides you with the ability to explore other learning models, such as GBM and deep learning. Explore building a model using the GBM technique, using the built-in h2o.gbm function.

Before you do this, you need to convert the target variable to a factor for multinomial classification techniques.

train.h2o$top10=as.factor(train.h2o$top10)
gbm.modelh <- h2o.gbm(y=y.dep, x=x.indep, training_frame = train.h2o, ntrees = 500, max_depth = 4, learn_rate = 0.01, seed = 1122,distribution="multinomial")
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |===                                                              |   5%
  |                                                                       
  |=====                                                            |   7%
  |                                                                       
  |======                                                           |   9%
  |                                                                       
  |=======                                                          |  10%
  |                                                                       
  |======================                                           |  33%
  |                                                                       
  |=====================================                            |  56%
  |                                                                       
  |====================================================             |  79%
  |                                                                       
  |================================================================ |  98%
  |                                                                       
  |=================================================================| 100%
perf.gbmh<-h2o.performance(gbm.modelh,test.h2o)
perf.gbmh
## H2OBinomialMetrics: gbm
## 
## MSE:  0.09860778
## RMSE:  0.3140188
## LogLoss:  0.3206876
## Mean Per-Class Error:  0.2120263
## AUC:  0.8630573
## Gini:  0.7261146
## 
## Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
##          0  1    Error     Rate
## 0      266 48 0.152866  =48/314
## 1       16 43 0.271186   =16/59
## Totals 282 91 0.171582  =64/373
## 
## Maximum Metrics: Maximum metrics at their respective thresholds
##                       metric threshold    value idx
## 1                     max f1  0.189757 0.573333  90
## 2                     max f2  0.130895 0.693717 145
## 3               max f0point5  0.327346 0.598802  26
## 4               max accuracy  0.442757 0.876676  14
## 5              max precision  0.802184 1.000000   0
## 6                 max recall  0.049990 1.000000 284
## 7            max specificity  0.802184 1.000000   0
## 8           max absolute_mcc  0.169135 0.496486 104
## 9 max min_per_class_accuracy  0.169135 0.796610 104
## 10 max mean_per_class_accuracy  0.169135 0.805948 104
## 
## Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `
h2o.sensitivity(perf.gbmh,0.5)
## Warning in h2o.find_row_by_threshold(object, t): Could not find exact
## threshold: 0.5 for this set of metrics; using closest threshold found:
## 0.501205344484314. Run `h2o.predict` and apply your desired threshold on a
## probability column.
## [[1]]
## [1] 0.1355932
h2o.auc(perf.gbmh)
## [1] 0.8630573

This model correctly predicts 43 top 10 hits, which is 10 more than the number predicted by Model 3. Moreover, the AUC metric is higher than the one obtained from Model 3.

As seen above, H2O’s API provides the ability to obtain key statistical measures required to analyze the models easily, using several built-in functions. The record label can experiment with different parameters to arrive at the model that predicts the maximum number of Top 10 hits at the desired level of accuracy and threshold.
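
One way to run that experimentation is to loop over a few candidate values and compare the resulting test AUC. The sketch below varies only the learning rate and reuses the functions introduced earlier; the values tried are arbitrary examples, and in practice you would tune against a validation frame or cross-validation rather than the test set.

#Sketch: compare test AUC for a few GBM learning rates (values are illustrative)
for (lr in c(0.01, 0.05, 0.1)) {
  m <- h2o.gbm(y = y.dep, x = x.indep, training_frame = train.h2o,
               ntrees = 500, max_depth = 4, learn_rate = lr,
               seed = 1122, distribution = "multinomial")
  perf <- h2o.performance(model = m, newdata = test.h2o)
  cat("learn_rate =", lr, "-> test AUC =", h2o.auc(perf), "\n")
}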

H2O also allows you to experiment with deep learning models. Deep learning models have the ability to learn features implicitly, but can be more expensive computationally.

Now, create a deep learning model with the h2o.deeplearning function, using the same training and test datasets created before. The time taken to run this model depends on the type of EC2 instance chosen for this purpose.  For models that require more computation, consider using accelerated computing instances such as the P2 instance type.

system.time(
  dlearning.modelh <- h2o.deeplearning(y = y.dep,
                                      x = x.indep,
                                      training_frame = train.h2o,
                                      epoch = 250,
                                      hidden = c(250,250),
                                      activation = "Rectifier",
                                      seed = 1122,
                                      distribution="multinomial"
  )
)
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |===                                                              |   4%
  |                                                                       
  |=====                                                            |   8%
  |                                                                       
  |========                                                         |  12%
  |                                                                       
  |==========                                                       |  16%
  |                                                                       
  |=============                                                    |  20%
  |                                                                       
  |================                                                 |  24%
  |                                                                       
  |==================                                               |  28%
  |                                                                       
  |=====================                                            |  32%
  |                                                                       
  |=======================                                          |  36%
  |                                                                       
  |==========================                                       |  40%
  |                                                                       
  |=============================                                    |  44%
  |                                                                       
  |===============================                                  |  48%
  |                                                                       
  |==================================                               |  52%
  |                                                                       
  |====================================                             |  56%
  |                                                                       
  |=======================================                          |  60%
  |                                                                       
  |==========================================                       |  64%
  |                                                                       
  |============================================                     |  68%
  |                                                                       
  |===============================================                  |  72%
  |                                                                       
  |=================================================                |  76%
  |                                                                       
  |====================================================             |  80%
  |                                                                       
  |=======================================================          |  84%
  |                                                                       
  |=========================================================        |  88%
  |                                                                       
  |============================================================     |  92%
  |                                                                       
  |==============================================================   |  96%
  |                                                                       
  |=================================================================| 100%
##    user  system elapsed 
##   1.216   0.020 166.508
perf.dl<-h2o.performance(model=dlearning.modelh,newdata=test.h2o)
perf.dl
## H2OBinomialMetrics: deeplearning
## 
## MSE:  0.1678359
## RMSE:  0.4096778
## LogLoss:  1.86509
## Mean Per-Class Error:  0.3433013
## AUC:  0.7568822
## Gini:  0.5137644
## 
## Confusion Matrix (vertical: actual; across: predicted) for F1-optimal threshold:
##          0  1    Error     Rate
## 0      290 24 0.076433  =24/314
## 1       36 23 0.610169   =36/59
## Totals 326 47 0.160858  =60/373
## 
## Maximum Metrics: Maximum metrics at their respective thresholds
##                       metric threshold    value idx
## 1                     max f1  0.826267 0.433962  46
## 2                     max f2  0.000000 0.588235 239
## 3               max f0point5  0.999929 0.511811  16
## 4               max accuracy  0.999999 0.865952  10
## 5              max precision  1.000000 1.000000   0
## 6                 max recall  0.000000 1.000000 326
## 7            max specificity  1.000000 1.000000   0
## 8           max absolute_mcc  0.999929 0.363219  16
## 9 max min_per_class_accuracy  0.000004 0.662420 145
## 10 max mean_per_class_accuracy  0.000000 0.685334 224
## 
## Gains/Lift Table: Extract with `h2o.gainsLift(<model>, <data>)` or `h2o.gainsLift(<model>, valid=<T/F>, xval=<T/F>)`
h2o.sensitivity(perf.dl,0.5)
## Warning in h2o.find_row_by_threshold(object, t): Could not find exact
## threshold: 0.5 for this set of metrics; using closest threshold found:
## 0.496293348880151. Run `h2o.predict` and apply your desired threshold on a
## probability column.
## [[1]]
## [1] 0.3898305
h2o.auc(perf.dl)
## [1] 0.7568822

The AUC metric for this model is 0.7568822, which is lower than what you got from the earlier models. I recommend further experimentation using different hyperparameters, such as the learning rate, the number of epochs, or the number of hidden layers.
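
As one example of that experimentation, here is a sketch of a smaller network trained for fewer epochs; the layer sizes, epoch count, and the model name dlearning.modelh2 are illustrative choices, not tuned values from the original walkthrough.

#Sketch: a smaller, shorter-running network (illustrative settings)
dlearning.modelh2 <- h2o.deeplearning(y = y.dep,
                                      x = x.indep,
                                      training_frame = train.h2o,
                                      epoch = 100,
                                      hidden = c(100,100),
                                      activation = "Rectifier",
                                      seed = 1122,
                                      distribution = "multinomial")
h2o.auc(h2o.performance(model = dlearning.modelh2, newdata = test.h2o))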

H2O’s built-in functions provide many key statistical measures that can help measure model performance. Here are some of these key terms.

Sensitivity: The proportion of actual positives that are correctly identified. It is also called the true positive rate, or recall.

Specificity: The proportion of actual negatives that are correctly identified. It is also called the true negative rate.

Threshold: The cutoff point that maximizes specificity and sensitivity. While the model may not provide its highest accuracy at this point, it is not biased towards positives or negatives.

Precision: The fraction of retrieved documents that are relevant to the information need; in classification terms, how many of the instances classified as positive actually are positive.

AUC: Provides insight into how well the classifier is able to separate the two classes, which is particularly useful when the sample distribution is highly skewed and a model tends to overfit to a single class. A common grading scale is:

0.9 – 1.0 = excellent (A)
0.8 – 0.9 = good (B)
0.7 – 0.8 = fair (C)
0.6 – 0.7 = poor (D)
0.5 – 0.6 = fail (F)
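
To tie these definitions back to the deep learning output above, here is a small sketch that recomputes sensitivity, specificity, and precision by hand from the confusion matrix printed earlier (at the F1-optimal threshold); the four counts are copied directly from that output.

# Counts from the confusion matrix above (vertical: actual; across: predicted)
TN <- 290; FP <- 24   # actual 0: predicted 0 / predicted 1
FN <- 36;  TP <- 23   # actual 1: predicted 0 / predicted 1

TP / (TP + FN)   # sensitivity (recall):  23/59   ~ 0.3898
TN / (TN + FP)   # specificity:           290/314 ~ 0.9236
TP / (TP + FP)   # precision:             23/47   ~ 0.4894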

Here’s a summary of the metrics generated from H2O’s built-in functions for the three models that produced useful results.

Metric             Model 3                 GBM Model               Deep Learning Model
Accuracy (max)     0.882038 (t=0.435479)   0.876676 (t=0.442757)   0.865952 (t=0.999999)
Precision (max)    1.0 (t=0.821606)        1.0 (t=0.802184)        1.0 (t=1.0)
Recall (max)       1.0                     1.0                     1.0 (t=0)
Specificity (max)  1.0                     1.0                     1.0 (t=1)
Sensitivity        0.2033898               0.1355932               0.3898305 (t=0.5)
AUC                0.8492389               0.8630573               0.7568822

Note: ‘t’ denotes threshold.

Your options at this point could be narrowed down to Model 3 and the GBM model, based on the AUC and accuracy metrics observed earlier.  If the slightly lower accuracy of the GBM model is deemed acceptable, the record label can choose to go to production with the GBM model, as it can predict a higher number of Top 10 hits.  The AUC metric for the GBM model is also higher than that of Model 3.
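
If you want to make that comparison programmatically, a minimal sketch follows. It assumes that perf.model3 and perf.gbm are the H2OBinomialMetrics objects for Model 3 and the GBM model, created earlier in the post in the same way perf.dl was created above; the names are placeholders.

# Sketch only: rank the three models by AUC.
aucs <- c(model3       = h2o.auc(perf.model3),
          gbm          = h2o.auc(perf.gbm),
          deeplearning = h2o.auc(perf.dl))
sort(aucs, decreasing = TRUE)   # the GBM model ranks highest by AUC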

Record labels can experiment with different learning techniques and parameters before arriving at a model that proves to be the best fit for their business. Because deep learning models can be computationally expensive, record labels can choose more powerful EC2 instances on AWS to run their experiments faster.

Conclusion

In this post, I showed how the popular music industry can use analytics to predict the type of songs that make the Top 10 Billboard charts. By running H2O’s scalable machine learning platform on AWS, data scientists can easily experiment with multiple modeling techniques and interactively query the data using Amazon Athena, without having to manage the underlying infrastructure. This helps record labels make critical decisions on the type of artists and songs to promote in a timely fashion, thereby increasing sales and revenue.

If you have questions or suggestions, please comment below.


Additional Reading

Learn how to build and explore a simple GEOINT application using SparkR.


About the Authors

Gopal Wunnava is a Partner Solution Architect with the AWS GSI Team. He works with partners and customers on big data engagements, and is passionate about building analytical solutions that drive business capabilities and decision making. In his spare time, he loves all things related to sports and movies, and is fond of old classics like Asterix and Obelix comics and Hitchcock movies.

Bob Strahan, a Senior Consultant with AWS Professional Services, contributed to this post.

The Evil Within 2 Used Denuvo, Then Dumped it Before Launch

Post Syndicated from Andy original https://torrentfreak.com/the-evil-within-2-used-denuvo-then-dumped-it-before-launch-171013/

At the end of September we reported on a nightmare scenario for videogame anti-tamper technology Denuvo.

With cracking groups chipping away at the system for the past few months, progressing in leaps and bounds, the race to the bottom was almost complete. Although the technology aims to hold off pirates for the first few lucrative weeks and months after launch, the Denuvo-protected Total War: Warhammer 2 fell to pirates in a matter of hours.

In the less than two weeks that have passed since, things haven’t improved much. By most measurements, in fact, the situation appears to have gotten worse.

On Wednesday, action role-playing game Middle Earth: Shadow of War was cracked a day after launch. While this didn’t beat the record set by Warhammer 2, the scene was given an unexpected gift.

Instead of the crack appearing courtesy of scene groups STEAMPUNKS or CPY, which has largely been the tradition thus far this year, old favorite CODEX stepped up to the mark with their own efforts. This means there are now close to half a dozen entities with the ability to defeat Denuvo, which isn’t a good look for the anti-piracy outfit.

A CODEX crack for Denuvo, from nowhere

Needless to say, this development was met with absolute glee by pirates, who forgave the additional day taken to crack the game in order to welcome CODEX into the anti-Denuvo club. But while this is bad news for the anti-tamper technology, there could be a worse enemy looming on the horizon – a loss of confidence.

This Tuesday, DSO Gaming reported that it had received a review copy of Bethesda’s then-upcoming survival horror game, The Evil Within 2. The site, which is often a reliable source for Denuvo-related news, confirmed that the code was indeed protected by Denuvo.

“Another upcoming title that will be using Denuvo is The Evil Within 2,” the site reported. “Bethesda has provided us with a review code for The Evil Within 2. As such, we can confirm that Denuvo is present in it.”

As you read this, October 13, 2017, The Evil Within 2 is enjoying its official worldwide launch. Early yesterday afternoon, however, the title leaked early onto the Internet, courtesy of cracking group CODEX.

At first glance, it looked like CODEX had cracked Denuvo before the game’s official launch, but once the dust had settled the reality was somewhat different. For reasons best known to developer Bethesda, Denuvo was completely absent from the title. As shown by the title’s NFO (information) file, the only protection present was that provided by Steam.

Denuvo? What Denuvo?

This raises a number of scenarios, none of them good for Denuvo.

One possibility is that Bethesda never intended to use Denuvo on the final release. Exactly why, we’ll likely never know, but that theory doesn’t really gel with the company including it in the review code supplied to DSO Gaming earlier this week.

The other proposition is that Bethesda witnessed the fiasco around Denuvo’s ‘protection’ in recent days and decided not to invest in something that wasn’t going to provide value for money.

Of course, these theories are going to be pretty difficult to confirm. Denuvo are a pretty confident bunch when things are going their way but they go suspiciously quiet when the tide is turning. Equally, developers tend to keep quiet about their anti-piracy strategies too.

The bottom line though is that if the protection really works and turns in valuable cash, why wouldn’t Bethesda use it as they have done on previous titles including Doom and Prey?

With that question apparently answering itself at the moment, all eyes now turn to Denuvo. Although it has a history of being one of the most successful anti-piracy systems overall, it has taken a massive battering in recent times. Will it recover? Only time will tell but at the moment things couldn’t get much worse.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.