Tag Archives: NPR

[$] Unprivileged filesystem mounts, 2018 edition

Post Syndicated from corbet original https://lwn.net/Articles/755593/rss

The advent of user namespaces and container technology has made it possible
to extend more root-like powers to unprivileged users in a (we hope) safe
way. One remaining sticking point is the mounting of filesystems, which
has long been fraught with security problems. Work has been proceeding to
allow such mounts for years, and it has gotten a little closer with the
posting of a patch series intended for the 4.18 kernel. But, as an
unrelated discussion has made clear, truly safe unprivileged filesystem
mounting is still a rather distant prospect — at least, if one wants to do
it in the kernel.

Measuring the throughput for Amazon MQ using the JMS Benchmark

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/measuring-the-throughput-for-amazon-mq-using-the-jms-benchmark/

This post is courtesy of Alan Protasio, Software Development Engineer, Amazon Web Services

Just like compute and storage, messaging is a fundamental building block of enterprise applications. Message brokers (aka “message-oriented middleware”) enable different software systems, often written in different languages, on different platforms, running in different locations, to communicate and exchange information. Mission-critical applications, such as CRM and ERP, rely on message brokers to work.

A common performance consideration for customers deploying a message broker in a production environment is the throughput of the system, measured as messages per second. This is important to know so that application environments (hosts, threads, memory, etc.) can be configured correctly.

In this post, we demonstrate how to measure the throughput for Amazon MQ, a new managed message broker service for ActiveMQ, using JMS Benchmark. It should take 15–20 minutes to set up the environment and about an hour to run the benchmark. We also provide some tips on how to configure Amazon MQ for optimal throughput.

Benchmarking throughput for Amazon MQ

ActiveMQ can be used for a number of use cases. These range from simple fire-and-forget tasks (that is, asynchronous processing) and low-latency request-reply patterns to buffering requests before they are persisted to a database.

The throughput of Amazon MQ is largely dependent on the use case. For example, if you have non-critical workloads such as gathering click events for a non-business-critical portal, you can use ActiveMQ in a non-persistent mode and get extremely high throughput with Amazon MQ.

On the flip side, if you have a critical workload where durability is extremely important (meaning that you can’t lose a message), then you are bound by the I/O capacity of your underlying persistence store. We recommend using mq.m4.large for the best results. The mq.t2.micro instance type is intended for product evaluation. Performance is limited, due to the lower memory and burstable CPU performance.

Tip: To improve your throughput with Amazon MQ, make sure that you have consumers processing messages as fast as (or faster than) your producers are pushing messages.

Because it’s impossible to talk about how the broker (ActiveMQ) behaves for each and every use case, we walk through how to set up your own benchmark for Amazon MQ using our favorite open-source benchmarking tool: JMS Benchmark. We are fans of the JMS Benchmark suite because it’s easy to set up and deploy, and comes with a built-in visualizer of the results.

Non-Persistent Scenarios – Queue latency as you scale producer throughput

JMS Benchmark nonpersistent scenarios

Getting started

At the time of publication, you can create an mq.m4.large single-instance broker for testing for $0.30 per hour (US pricing).

This walkthrough covers the following tasks:

  1. Create and configure the broker.
  2. Create an EC2 instance to run your benchmark.
  3. Configure the security groups.
  4. Run the benchmark.

Step 1 – Create and configure the broker
Create and configure the broker using Tutorial: Creating and Configuring an Amazon MQ Broker.

Step 2 – Create an EC2 instance to run your benchmark
Launch the EC2 instance using Step 1: Launch an Instance. We recommend choosing the m5.large instance type.

Step 3 – Configure the security groups
Make sure that all the security groups are correctly configured to let the traffic flow between the EC2 instance and your broker.

  1. Sign in to the Amazon MQ console.
  2. From the broker list, choose the name of your broker (for example, MyBroker).
  3. In the Details section, under Security and network, choose the name of your security group or choose the expand icon.
  4. From the security group list, choose your security group.
  5. At the bottom of the page, choose Inbound, Edit.
  6. In the Edit inbound rules dialog box, add a rule to allow traffic between your instance and the broker:
    • Choose Add Rule.
    • For Type, choose Custom TCP.
    • For Port Range, type the ActiveMQ SSL port (61617).
    • For Source, leave Custom selected and then type the security group of your EC2 instance.
    • Choose Save.

Your broker can now accept the connection from your EC2 instance.
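
If you prefer to script this step rather than click through the console, the same inbound rule can be added programmatically. The following is a minimal sketch using boto3; the region and the two security group IDs are placeholders that you would replace with the groups attached to your broker and your EC2 instance.

import boto3

ec2 = boto3.client('ec2', region_name='<your region>')

broker_sg_id = '<broker security group id>'      # security group attached to your Amazon MQ broker
instance_sg_id = '<instance security group id>'  # security group attached to your EC2 benchmark instance

# Allow inbound traffic on the ActiveMQ SSL port (61617) from the EC2 instance's security group.
ec2.authorize_security_group_ingress(
    GroupId=broker_sg_id,
    IpPermissions=[{
        'IpProtocol': 'tcp',
        'FromPort': 61617,
        'ToPort': 61617,
        'UserIdGroupPairs': [{'GroupId': instance_sg_id}],
    }],
)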

Step 4 – Run the benchmark
Connect to your EC2 instance using SSH and run the following commands:

$ cd ~
$ curl -L https://github.com/alanprot/jms-benchmark/archive/master.zip -o master.zip
$ unzip master.zip
$ cd jms-benchmark-master
$ chmod a+x bin/*
$ env \
  SERVER_SETUP=false \
  SERVER_ADDRESS={activemq-endpoint} \
  ACTIVEMQ_TRANSPORT=ssl \
  ACTIVEMQ_PORT=61617 \
  ACTIVEMQ_USERNAME={activemq-user} \
  ACTIVEMQ_PASSWORD={activemq-password} \
  ./bin/benchmark-activemq

After the benchmark finishes, you can find the results in the ~/reports directory. As you may notice, the performance of ActiveMQ varies based on the number of consumers, producers, destinations, and message size.

Amazon MQ architecture

The last thing that's important to know, so that you can better understand the benchmark results, is how Amazon MQ is architected.

Amazon MQ is architected to be highly available (HA) and durable. For HA, we recommend using the multi-AZ option. After a message is sent to Amazon MQ in persistent mode, the message is written to the highly durable message store that replicates the data across multiple nodes in multiple Availability Zones. Because of this replication, for some use cases you may see a reduction in throughput as you migrate to Amazon MQ. Customers have told us they appreciate the benefits of message replication as it helps protect durability even in the face of the loss of an Availability Zone.

Conclusion

We hope this gives you an idea of how Amazon MQ performs. We encourage you to run tests to simulate your own use cases.

To learn more, see the Amazon MQ website. You can try Amazon MQ for free with the AWS Free Tier, which includes up to 750 hours of a single-instance mq.t2.micro broker and up to 1 GB of storage per month for one year.

The Benefits of Side Projects

Post Syndicated from Bozho original https://techblog.bozho.net/the-benefits-of-side-projects/

Side projects are the things you do at home, after work, for your own “entertainment”, or to satisfy your desire to learn new stuff, in case your workplace doesn’t give you that opportunity (or at least not enough of it). Side projects are also a way to build stuff that you think is valuable but not necessarily “commercialisable”. Many side projects are open-sourced sooner or later and some of them contribute to the pool of tools at other people’s disposal.

I’ve outlined one recommendation about side projects before – do them with technologies that are new to you, so that you learn important things that will keep you better positioned in the software world.

But there are more benefits than that – serendipitous benefits, for example. And I’d like to tell some personal stories about that. I’ll focus on a few examples from my list of side projects to show how, through a sort-of butterfly effect, they helped shape my career.

The computoser project, however cool algorithmic music composition is, didn’t manage to have much of a long-term impact. But it did teach me something apart from niche music theory – how to read a large number of scientific papers (mostly computer science) and understand them without being formally trained in the particular field. We’ll see how that was useful later.

Then there was the “State alerts” project – a website that scraped content from public institutions in my country (legislation, legislation proposals, decisions by regulators, new tenders, etc.) and made it searchable and “subscribable” – so that you get notified when a keyword of interest is mentioned in newly proposed legislation, for example. (I obviously subscribed to “information technologies” and “electronic”.)

And that project turned out to have a significant impact on the following years. First, I chose a new technology to write it with – Scala. That turned out to be of great use when I started working at TomTom: on my third day I was transferred to a Scala project, which was way cooler and much more complex than the one I was originally hired for. It was a bit ironic, as my colleagues had just read a few weeks earlier that I “don’t like Scala”, but nevertheless, that was one of the most interesting projects I’ve worked on, and it went on for two years. Had I not known Scala, I’d probably have left TomTom much earlier (as the other project was restructured a few times), and I would not have learned many of the scalability, architecture, and AWS lessons that I did learn there.

But the very same project had an even more important follow-up. Because of its “civic hacking” flavour, I was invited to join an informal group of developers (later formalized as an NGO) who create tools that are useful for society (something like MySociety.org). That group gathered regularly, discussed both tools and policies, and at some point we put together a list of policy priorities that we wanted to lobby policy makers for. One of them was open source for the government, another was open data. As a result of our interaction with an interim government, we donated the official open data portal of my country, which is functioning to this day.

As a result of that, a few months later we got a proposal from the deputy prime minister’s office to “elect” one member of the group as an advisor to the cabinet. We decided that could be me, so I went for it and became an advisor to the deputy prime minister. The job was unlike anything one could imagine, and it was challenging and fascinating. We managed to pass legislation, including requirements for open source in custom government projects, eID, and open data. And all of that would not have been possible without my little side project.

As for my latest side project, LogSentinel – it became my current startup company. And not without help from the previous two mentioned above – the computer science paper reading was of great use when I was navigating the crypto papers landscape, and from the government job I not only gained invaluable legal knowledge, but I also “got” a co-founder.

Some other side projects died without much fanfare, and that’s fine. But the ones above shaped my “story” in a way that would not have been possible otherwise.

And I agree that such a serendipitous chain of events could have happened without side projects – I could have gotten these opportunities by meeting someone at a bar (unlikely, but who knows). But we, as software engineers, are capable of tilting chance in our favour by using our skills. Side projects are our “extracurricular activities”, and they often lead to unpredictable but rather positive chains of events. They are rarely the only factor, but they are certainly great at unlocking potential.

The post The Benefits of Side Projects appeared first on Bozho's tech blog.

The US Is Unprepared for Election-Related Hacking in 2018

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/05/the_us_is_unpre.html

This survey and report is not surprising:

The survey of nearly forty Republican and Democratic campaign operatives, administered through November and December 2017, revealed that American political campaign staff — primarily working at the state and congressional levels — are not only unprepared for possible cyber attacks, but remain generally unconcerned about the threat. The survey sample was relatively small, but nevertheless the survey provides a first look at how campaign managers and staff are responding to the threat.

The overwhelming majority of those surveyed do not want to devote campaign resources to cybersecurity or to hire personnel to address cybersecurity issues. Even though campaign managers recognize there is a high probability that campaign and personal emails are at risk of being hacked, they are more concerned about fundraising and press coverage than they are about cybersecurity. Less than half of those surveyed said they had taken steps to make their data secure and most were unsure if they wanted to spend any money on this protection.

Security is never something we actually want. Security is something we need in order to avoid what we don’t want. It’s also more abstract, concerned with hypothetical future possibilities. Of course it’s lower on the priorities list than fundraising and press coverage. They’re more tangible, and they’re more immediate.

This is all to the attackers’ advantage.

Analyze data in Amazon DynamoDB using Amazon SageMaker for real-time prediction

Post Syndicated from YongSeong Lee original https://aws.amazon.com/blogs/big-data/analyze-data-in-amazon-dynamodb-using-amazon-sagemaker-for-real-time-prediction/

Many companies across the globe use Amazon DynamoDB to store and query historical user-interaction data. DynamoDB is a fast NoSQL database used by applications that need consistent, single-digit millisecond latency.

Often, customers want to turn their valuable data in DynamoDB into insights by analyzing a copy of their table stored in Amazon S3. Doing this separates their analytical queries from their low-latency critical paths. This data can be the primary source for understanding customers’ past behavior, predicting future behavior, and generating downstream business value. Customers often turn to DynamoDB because of its great scalability and high availability. After a successful launch, many customers want to use the data in DynamoDB to predict future behaviors or provide personalized recommendations.

DynamoDB is a good fit for low-latency reads and writes, but it’s not practical to scan all the data in a DynamoDB table to train a model. In this post, I demonstrate how you can use DynamoDB table data copied to Amazon S3 by AWS Data Pipeline to predict customer behavior. I also demonstrate how you can use this data to provide personalized recommendations for customers using Amazon SageMaker. You can also run ad hoc queries against the data using Amazon Athena. DynamoDB recently released on-demand backups, which create full table backups with no performance impact. However, that feature is not suitable for our purposes in this post, so I chose AWS Data Pipeline instead to create managed backups that are accessible from other services.

To do this, I describe how to read the DynamoDB backup file format in Data Pipeline. I also describe how to convert the objects in S3 to a CSV format that Amazon SageMaker can read. In addition, I show how to schedule regular exports and transformations using Data Pipeline. The sample data used in this post is from Bank Marketing Data Set of UCI.

The solution that I describe provides the following benefits:

  • Separates analytical queries from production traffic on your DynamoDB table, preserving your DynamoDB read capacity units (RCUs) for important production requests
  • Automatically updates your model to get real-time predictions
  • Optimizes for performance (so it doesn’t compete with DynamoDB RCUs after the export) and for cost (using data you already have)
  • Makes it easier for developers of all skill levels to use Amazon SageMaker

All code and data set in this post are available in this .zip file.

Solution architecture

The following diagram shows the overall architecture of the solution.

The steps that data follows through the architecture are as follows:

  1. Data Pipeline regularly copies the full contents of a DynamoDB table as JSON into an S3 bucket.
  2. Exported JSON files are converted to comma-separated value (CSV) format to use as a data source for Amazon SageMaker.
  3. Amazon SageMaker renews the model artifact and updates the endpoint.
  4. The converted CSV is available for ad hoc queries with Amazon Athena.
  5. Data Pipeline controls this flow and repeats the cycle based on the schedule defined by customer requirements.

Building the auto-updating model

This section discusses details about how to read the DynamoDB exported data in Data Pipeline and build automated workflows for real-time prediction with a regularly updated model.

Download sample scripts and data

Before you begin, take the following steps:

  1. Download sample scripts in this .zip file.
  2. Unzip the src.zip file.
  3. Find the automation_script.sh file and edit it for your environment. For example, you need to replace 's3://<your bucket>/<datasource path>/' with your own S3 path to the data source for Amazon SageMaker. In the script, replace any text enclosed in angle brackets (< and >) with your own values.
  4. Upload the json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar file to your S3 path so that the ADD jar command in Apache Hive can refer to it.

For this solution, the banking.csv file should be imported into a DynamoDB table.

Export a DynamoDB table

To export the DynamoDB table to S3, open the Data Pipeline console and choose the Export DynamoDB table to S3 template. In this template, Data Pipeline creates an Amazon EMR cluster and performs an export in the EMRActivity activity. Set proper intervals for backups according to your business requirements.

One core node (m3.xlarge) provides the default capacity for the EMR cluster and should be suitable for the solution in this post. Leave the option to resize the cluster before running enabled in the TableBackupActivity activity so that Data Pipeline can scale the cluster to match the table size. The conversion to CSV format and the model renewal both happen on this EMR cluster.

For a more in-depth look at how to export data from DynamoDB, see Export Data from DynamoDB in the Data Pipeline documentation.

Add the script to an existing pipeline

After you export your DynamoDB table, you add an additional EMR step to EMRActivity by following these steps:

  1. Open the Data Pipeline console and choose the ID for the pipeline that you want to add the script to.
  2. For Actions, choose Edit.
  3. In the editing console, choose the Activities category and add an EMR step using the custom script downloaded in the previous section, as shown below.

Paste the following command into the new step after the data upload step:

s3://#{myDDBRegion}.elasticmapreduce/libs/script-runner/script-runner.jar,s3://<your bucket name>/automation_script.sh,#{output.directoryPath},#{myDDBRegion}

The element #{output.directoryPath} references the S3 path where the data pipeline exports DynamoDB data as JSON. The path should be passed to the script as an argument.

The bash script has two goals, converting data formats and renewing the Amazon SageMaker model. Subsequent sections discuss the contents of the automation script.

Automation script: Convert JSON data to CSV with Hive

We use Apache Hive to transform the data into a new format. The Hive QL script to create an external table and transform the data is included in the custom script that you added to the Data Pipeline definition.

When you run the Hive scripts, do so with the -e option. Also, define the Hive table with the 'org.openx.data.jsonserde.JsonSerDe' row format to parse and read JSON format. The SQL creates a Hive EXTERNAL table, and it reads the DynamoDB backup data on the S3 path passed to it by Data Pipeline.

Note: You should create the table with the “EXTERNAL” keyword to avoid the backup data being accidentally deleted from S3 if you drop the table.

The full automation script for the conversion follows. Add your own bucket name and data source path in the angle-bracket placeholders.

#!/bin/bash
hive -e "
ADD jar s3://<your bucket name>/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar ; 
DROP TABLE IF EXISTS blog_backup_data ;
CREATE EXTERNAL TABLE blog_backup_data (
 customer_id map<string,string>,
 age map<string,string>, job map<string,string>, 
 marital map<string,string>,education map<string,string>, 
 default map<string,string>, housing map<string,string>,
 loan map<string,string>, contact map<string,string>, 
 month map<string,string>, day_of_week map<string,string>, 
 duration map<string,string>, campaign map<string,string>,
 pdays map<string,string>, previous map<string,string>, 
 poutcome map<string,string>, emp_var_rate map<string,string>, 
 cons_price_idx map<string,string>, cons_conf_idx map<string,string>,
 euribor3m map<string,string>, nr_employed map<string,string>, 
 y map<string,string> ) 
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' 
LOCATION '$1/';

INSERT OVERWRITE DIRECTORY 's3://<your bucket name>/<datasource path>/' 
SELECT concat( customer_id['s'],',', 
 age['n'],',', job['s'],',', 
 marital['s'],',', education['s'],',', default['s'],',', 
 housing['s'],',', loan['s'],',', contact['s'],',', 
 month['s'],',', day_of_week['s'],',', duration['n'],',', 
 campaign['n'],',',pdays['n'],',',previous['n'],',', 
 poutcome['s'],',', emp_var_rate['n'],',', cons_price_idx['n'],',',
 cons_conf_idx['n'],',', euribor3m['n'],',', nr_employed['n'],',', y['n'] ) 
FROM blog_backup_data
WHERE customer_id['s'] > 0 ;
"

After creating the external table, you use the INSERT OVERWRITE DIRECTORY ... SELECT command to write the CSV data to the S3 path that you designated as the data source for Amazon SageMaker.

Depending on your requirements, you can eliminate or process columns in the SELECT clause in this step to optimize data analysis. For example, you might remove some columns that have unpredictable correlations with the target value, because keeping the wrong columns might expose your model to “overfitting” during training, which weakens your predictions. In this post, the customer_id column is removed. More information about overfitting can be found in the topic Model Fit: Underfitting vs. Overfitting in the Amazon ML documentation.

Automation script: Renew the Amazon SageMaker model

After the CSV data is replaced and ready to use, create a new model artifact for Amazon SageMaker with the updated dataset on S3. To renew the model artifact, you must create a new training job. Training jobs can be run using the AWS SDK (for example, the Amazon SageMaker boto3 client), the Amazon SageMaker Python SDK (installable with the “pip install sagemaker” command), or the AWS CLI for Amazon SageMaker, which is the approach described in this post.

In addition, consider how to renew your existing model smoothly and without service impact, because your model is called by applications in real time. To do this, create a new endpoint configuration first, and then update the current endpoint with the newly created configuration.

#!/bin/bash
## Define variable 
REGION=$2
DTTIME=`date +%Y-%m-%d-%H-%M-%S`
ROLE="<your AmazonSageMaker-ExecutionRole>" 


# Select containers image based on region.  
case "$REGION" in
"us-west-2" )
    IMAGE="174872318107.dkr.ecr.us-west-2.amazonaws.com/linear-learner:latest"
    ;;
"us-east-1" )
    IMAGE="382416733822.dkr.ecr.us-east-1.amazonaws.com/linear-learner:latest" 
    ;;
"us-east-2" )
    IMAGE="404615174143.dkr.ecr.us-east-2.amazonaws.com/linear-learner:latest" 
    ;;
"eu-west-1" )
    IMAGE="438346466558.dkr.ecr.eu-west-1.amazonaws.com/linear-learner:latest" 
    ;;
 *)
    echo "Invalid Region Name"
    exit 1 ;  
esac

# Start training job and creating model artifact 
TRAINING_JOB_NAME=TRAIN-${DTTIME} 
S3OUTPUT="s3://<your bucket name>/model/" 
INSTANCETYPE="ml.m4.xlarge"
INSTANCECOUNT=1
VOLUMESIZE=5 
aws sagemaker create-training-job --training-job-name ${TRAINING_JOB_NAME} --region ${REGION}  --algorithm-specification TrainingImage=${IMAGE},TrainingInputMode=File --role-arn ${ROLE}  --input-data-config '[{ "ChannelName": "train", "DataSource": { "S3DataSource": { "S3DataType": "S3Prefix", "S3Uri": "s3://<your bucket name>/<datasource path>/", "S3DataDistributionType": "FullyReplicated" } }, "ContentType": "text/csv", "CompressionType": "None" , "RecordWrapperType": "None"  }]'  --output-data-config S3OutputPath=${S3OUTPUT} --resource-config  InstanceType=${INSTANCETYPE},InstanceCount=${INSTANCECOUNT},VolumeSizeInGB=${VOLUMESIZE} --stopping-condition MaxRuntimeInSeconds=120 --hyper-parameters feature_dim=20,predictor_type=binary_classifier  

# Wait until job completed 
aws sagemaker wait training-job-completed-or-stopped --training-job-name ${TRAINING_JOB_NAME}  --region ${REGION}

# Get newly created model artifact and create model
MODELARTIFACT=`aws sagemaker describe-training-job --training-job-name ${TRAINING_JOB_NAME} --region ${REGION}  --query 'ModelArtifacts.S3ModelArtifacts' --output text `
MODELNAME=MODEL-${DTTIME}
aws sagemaker create-model --region ${REGION} --model-name ${MODELNAME}  --primary-container Image=${IMAGE},ModelDataUrl=${MODELARTIFACT}  --execution-role-arn ${ROLE}

# create a new endpoint configuration 
CONFIGNAME=CONFIG-${DTTIME}
aws sagemaker  create-endpoint-config --region ${REGION} --endpoint-config-name ${CONFIGNAME}  --production-variants  VariantName=Users,ModelName=${MODELNAME},InitialInstanceCount=1,InstanceType=ml.m4.xlarge

# create or update the endpoint
STATUS=`aws sagemaker describe-endpoint --endpoint-name  ServiceEndpoint --query 'EndpointStatus' --output text --region ${REGION} `
if [[ "$STATUS" != "InService" ]] ;
then
    aws sagemaker  create-endpoint --endpoint-name  ServiceEndpoint  --endpoint-config-name ${CONFIGNAME} --region ${REGION}    
else
    aws sagemaker  update-endpoint --endpoint-name  ServiceEndpoint  --endpoint-config-name ${CONFIGNAME} --region ${REGION}
fi

Grant permission

Before you execute the script, you must grant proper permission to Data Pipeline. Data Pipeline uses the DataPipelineDefaultResourceRole role by default. I added the following policy to DataPipelineDefaultResourceRole to allow Data Pipeline to create, delete, and update the Amazon SageMaker model and data source in the script.

{
 "Version": "2012-10-17",
 "Statement": [
 {
 "Effect": "Allow",
 "Action": [
 "sagemaker:CreateTrainingJob",
 "sagemaker:DescribeTrainingJob",
 "sagemaker:CreateModel",
 "sagemaker:CreateEndpointConfig",
 "sagemaker:DescribeEndpoint",
 "sagemaker:CreateEndpoint",
 "sagemaker:UpdateEndpoint",
 "iam:PassRole"
 ],
 "Resource": "*"
 }
 ]
}

Use real-time prediction

After you deploy a model into production using Amazon SageMaker hosting services, your client applications use this API to get inferences from the model hosted at the specified endpoint. This approach is useful for interactive web, mobile, or desktop applications.

Following, I provide a simple Python code example that queries the Amazon SageMaker endpoint by its name (“ServiceEndpoint”) and uses the response for real-time prediction.

=== Python sample for real-time prediction ===

#!/usr/bin/env python
import boto3
import json 

client = boto3.client('sagemaker-runtime', region_name ='<your region>' )
new_customer_info = '34,10,2,4,1,2,1,1,6,3,190,1,3,4,3,-1.7,94.055,-39.8,0.715,4991.6'
response = client.invoke_endpoint(
    EndpointName='ServiceEndpoint',
    Body=new_customer_info, 
    ContentType='text/csv'
)
result = json.loads(response['Body'].read().decode())
print(result)
--- output(response) ---
{u'predictions': [{u'score': 0.7528127431869507, u'predicted_label': 1.0}]}

Solution summary

The solution takes the following steps:

  1. Data Pipeline exports DynamoDB table data into S3. The original JSON data should be kept to recover the table in the rare event that this is needed. Data Pipeline then converts JSON to CSV so that Amazon SageMaker can read the data. Note: You should select only meaningful attributes when you convert to CSV. For example, if you judge that the “campaign” attribute is not correlated, you can eliminate this attribute from the CSV.
  2. Train the Amazon SageMaker model with the new data source.
  3. When a new customer comes to your site, you can judge how likely it is for this customer to subscribe to your new product based on the prediction score provided by Amazon SageMaker.
  4. If the new user subscribes to your new product, your application must update the attribute “y” to the value 1 (for yes), as shown in the sketch after this list. This updated data is provided for the next model renewal as a new data source, and it serves to improve the accuracy of your prediction. With each new entry, your application can become smarter and deliver better predictions.
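
The update in step 4 is a single DynamoDB UpdateItem call. The following is a minimal sketch using boto3; the table name and the partition key attribute are assumptions made for illustration, so adjust them to match your own schema.

import boto3

dynamodb = boto3.resource('dynamodb', region_name='<your region>')
table = dynamodb.Table('<your table name>')  # the table that holds the banking data

def mark_as_subscribed(customer_id):
    # Set the target attribute 'y' to 1 after the customer subscribes to the product.
    table.update_item(
        Key={'customer_id': customer_id},          # assumed partition key name
        UpdateExpression='SET #y = :subscribed',
        ExpressionAttributeNames={'#y': 'y'},      # alias the attribute name to stay clear of reserved words
        ExpressionAttributeValues={':subscribed': 1},
    )

mark_as_subscribed('12345')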

Running ad hoc queries using Amazon Athena

Amazon Athena is a serverless query service that makes it easy to analyze large amounts of data stored in Amazon S3 using standard SQL. Athena is useful for examining data and collecting statistics or informative summaries about data. You can also use the powerful analytic functions of Presto, as described in the topic Aggregate Functions of Presto in the Presto documentation.

With the Data Pipeline scheduled activity, recent CSV data is always located in S3 so that you can run ad hoc queries against the data using Amazon Athena. I show this with example SQL statements following. For an in-depth description of this process, see the post Interactive SQL Queries for Data in Amazon S3 on the AWS News Blog. 

Creating an Amazon Athena table and running queries

You can simply create an EXTERNAL table for the CSV data on S3 in the Amazon Athena console.

=== Table Creation ===
CREATE EXTERNAL TABLE datasource (
 age int, 
 job string, 
 marital string , 
 education string, 
 default string, 
 housing string, 
 loan string, 
 contact string, 
 month string, 
 day_of_week string, 
 duration int, 
 campaign int, 
 pdays int , 
 previous int , 
 poutcome string, 
 emp_var_rate double, 
 cons_price_idx double,
 cons_conf_idx double, 
 euribor3m double, 
 nr_employed double, 
 y int 
)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ',' ESCAPED BY '\\' LINES TERMINATED BY '\n' 
LOCATION 's3://<your bucket name>/<datasource path>/';

The following query calculates the correlation coefficient between the target attribute and other attributes using Amazon Athena.

=== Sample Query ===

SELECT corr(age,y) AS correlation_age_and_target, 
 corr(duration,y) AS correlation_duration_and_target, 
 corr(campaign,y) AS correlation_campaign_and_target,
 corr(contact,y) AS correlation_contact_and_target
FROM ( SELECT age , duration , campaign , y , 
 CASE WHEN contact = 'telephone' THEN 1 ELSE 0 END AS contact 
 FROM datasource 
 ) datasource ;
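
You can also run the same ad hoc queries programmatically, for example from a scheduled job or a notebook. The following is a minimal sketch using the boto3 Athena client; the database name and the S3 output location are assumptions for illustration.

import time
import boto3

athena = boto3.client('athena', region_name='<your region>')

query = "SELECT corr(age, y) AS correlation_age_and_target FROM datasource"

# Start the query; Athena writes the results to the S3 location that you specify.
execution = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={'Database': 'default'},  # assumed database name
    ResultConfiguration={'OutputLocation': 's3://<your bucket name>/athena-results/'},
)
query_id = execution['QueryExecutionId']

# Poll until the query finishes, then print the result rows.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)['QueryExecution']['Status']['State']
    if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
        break
    time.sleep(1)

if state == 'SUCCEEDED':
    rows = athena.get_query_results(QueryExecutionId=query_id)['ResultSet']['Rows']
    for row in rows:
        print([col.get('VarCharValue') for col in row['Data']])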

Conclusion

In this post, I introduced an example of how to analyze DynamoDB data by using a copy of the table in Amazon S3, which helps to conserve DynamoDB table read capacity. You can then use the analyzed data as a new data source to train an Amazon SageMaker model for accurate real-time prediction. In addition, you can run ad hoc queries against the data on S3 using Amazon Athena. I also showed how to automate these procedures by using Data Pipeline.

You can adapt this example to your specific use case at hand, and hopefully this post helps you accelerate your development. You can find more examples and use cases for Amazon SageMaker in the video AWS 2017: Introducing Amazon SageMaker on the AWS website.

 


Additional Reading

If you found this post useful, be sure to check out Serving Real-Time Machine Learning Predictions on Amazon EMR and Analyzing Data in S3 using Amazon Athena.

 


About the Author

Yong Seong Lee is a Cloud Support Engineer for AWS Big Data Services. He is interested in every technology related to data/databases and helping customers who have difficulties in using AWS services. His motto is “Enjoy life, be curious and have maximum experience.”

 

 

Sci-Hub ‘Pirate Bay For Science’ Security Certs Revoked by Comodo

Post Syndicated from Andy original https://torrentfreak.com/sci-hub-pirate-bay-for-science-security-certs-revoked-by-comodo-ca-180503/

Sci-Hub is often referred to as the “Pirate Bay of Science”. Like its namesake, it offers masses of unlicensed content for free, mostly against the wishes of copyright holders.

While The Pirate Bay will index almost anything, Sci-Hub is dedicated to distributing tens of millions of academic papers and articles, something which has turned it into a target for publishing giants like Elsevier.

Sci-Hub and its Kazakhstan-born founder Alexandra Elbakyan have been under sustained attack for several years but more recently have been fending off an unprecedented barrage of legal action initiated by the American Chemical Society (ACS), a leading source of academic publications in the field of chemistry.

After winning a default judgment for $4.8 million in copyright infringement damages last year, ACS was further granted a broad injunction.

It required various third-party services (including domain registries, hosting companies and search engines) to stop facilitating access to the site. This plunged Sci-Hub into a game of domain whac-a-mole, one that continues to this day.

Determined to head Sci-Hub off at the pass, ACS obtained additional authority to tackle the evasive site and any new domains it may register in the future.

While Sci-Hub has been hopping around domains for a while, this week a new development appeared on the horizon. Visitors to some of the site’s domains were greeted with errors indicating that the domains’ security certificates had been revoked.

Tests conducted by TorrentFreak revealed clear revocations on Sci-Hub.hk and Sci-Hub.nz, both of which returned the error ‘NET::ERR_CERT_REVOKED’.

Certificate revoked

These certificates were first issued and then revoked by Comodo CA, the world’s largest certification authority. TF contacted the company who confirmed that it had been forced to take action against Sci-Hub.

“In response to a court order against Sci-Hub, Comodo CA has revoked four certificates for the site,” Jonathan Skinner, Director, Global Channel Programs at Comodo CA informed TorrentFreak.

“By policy Comodo CA obeys court orders and the law to the full extent of its ability.”

Comodo refused to confirm any additional details, including whether these revocations were anything to do with the current ACS injunction. However, Susan R. Morrissey, Director of Communications at ACS, told TorrentFreak that the revocations were indeed part of ACS’ legal action against Sci-Hub.

“[T]he action is related to our continuing efforts to protect ACS’ intellectual property,” Morrissey confirmed.

Sci-Hub operates multiple domains (an up-to-date list is usually available on Wikipedia) that can be switched at any time. At the time of writing, the domain sci-hub.ga returns ‘ERR_SSL_VERSION_OR_CIPHER_MISMATCH’ while the .CN and .GS variants both have Comodo certificates that expired last year.

When TF first approached Comodo earlier this week, Sci-Hub’s certificates with the company hadn’t been completely wiped out. For example, the domain https://sci-hub.tw operated perfectly, with an active and non-revoked Comodo certificate.

Still in the game…but not for long

By Wednesday, however, the domain was returning the now-familiar “revoked” message.

These domain issues are the latest technical problems to hit Sci-Hub as a result of the ACS injunction. In February, Cloudflare terminated service to several of the site’s domains.

“Cloudflare will terminate your service for the following domains sci-hub.la, sci-hub.tv, and sci-hub.tw by disabling our authoritative DNS in 24 hours,” Cloudflare told Sci-Hub.

While ACS has certainly caused problems for Sci-Hub, the platform is extremely resilient and remains online.

The domains https://sci-hub.is and https://sci-hub.nu are fully operational with certificates issued by Let’s Encrypt, a free and open certificate authority supported by the likes of Mozilla, EFF, Chrome, Private Internet Access, and other prominent tech companies.

It’s unclear whether these certificates will be targeted in the future but Sci-Hub doesn’t appear to be in the mood to back down.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Google launches the gVisor container runtime

Post Syndicated from corbet original https://lwn.net/Articles/753324/rss

Google has announced
the open-sourcing of gVisor, a sandboxed container runtime.
gVisor is more lightweight than a VM while maintaining a similar
level of isolation. The core of gVisor is a kernel that runs as a normal,
unprivileged process that supports most Linux system calls. This kernel is
written in Go, which was chosen for its memory- and type-safety. Just like
within a VM, an application running in a gVisor sandbox gets its own kernel
and set of virtualized devices, distinct from the host and other
sandboxes.

User Authentication Best Practices Checklist

Post Syndicated from Bozho original https://techblog.bozho.net/user-authentication-best-practices-checklist/

User authentication is functionality that every web application shares. We should have perfected it a long time ago, having implemented it so many times. And yet there are so many mistakes made all the time.

Part of the reason for that is that the list of things that can go wrong is long. You can store passwords incorrectly, you can have vulnerable password reset functionality, you can expose your session to a CSRF attack, your session can be hijacked, etc. So I’ll try to compile a list of best practices regarding user authentication. The OWASP top 10 is always something you should read, every year. But that might not be enough.

So, let’s start. I’ll try to be concise, but I’ll include as many of the related pitfalls as I can cover – e.g. what could go wrong with the user session after login:

  • Store passwords with bcrypt/scrypt/PBKDF2. No MD5 or SHA, as they are not good for password storing. Long salt (per user) is mandatory (the aforementioned algorithms have it built in). If you don’t and someone gets hold of your database, they’ll be able to extract the passwords of all your users. And then try these passwords on other websites.
  • Use HTTPS. Period. (Otherwise user credentials can leak through unprotected networks). Force HTTPS if user opens a plain-text version.
  • Mark cookies as secure. Makes cookie theft harder.
  • Use CSRF protection (e.g. CSRF one-time tokens that are verified with each request). Frameworks have such functionality built-in.
  • Disallow framing (X-Frame-Options: DENY). Otherwise your website may be included in another website in a hidden iframe and “abused” through javascript.
  • Have a same-origin policy
  • Logout – let your users logout by deleting all cookies and invalidating the session. This makes usage of shared computers safer (yes, users should ideally use private browsing sessions, but not all of them are that savvy)
  • Session expiry – don’t have forever-lasting sessions. If the user closes your website, their session should expire after a while. “A while” may still be a big number depending on the service provided. For ajax-heavy website you can have regular ajax-polling that keeps the session alive while the page stays open.
  • Remember me – implementing “remember me” (on this machine) functionality is actually hard due to the risks of a stolen persistent cookie. Spring-security uses this approach, which I think should be followed if you wish to implement more persistent logins.
  • Forgotten password flow – the forgotten password flow should rely on sending a one-time (or expiring) link to the user and asking for a new password when it’s opened. Auth0 explains it in this post and Postmark gives some best practices. How the link is formed is a separate discussion and there are several approaches. Store a password-reset token in the user profile table and then send it as a parameter in the link. Or store nothing in the database and send a few params: userId:expiresTimestamp:hmac(userId+expiresTimestamp). That way you have expiring links (rather than one-time links). The HMAC relies on a secret key, so the links can’t be spoofed. (A minimal sketch of this scheme follows the list.) It seems there’s no consensus, as the OWASP guide has a slightly different approach.
  • One-time login links – this is an option used by Slack, which sends one-time login links instead of asking users for passwords. It relies on the fact that your email is well guarded and you have access to it all the time. If your service is not accessed too often, you can use that approach instead of (rather than in addition to) passwords.
  • Limit login attempts – brute-force through the web UI should not be possible; therefore you should block login attempts if there are too many of them. One approach is to block them based on IP address, the other is to block them based on the account attempted (Spring example here). Which one is better – I don’t know; both can actually be combined. Instead of fully blocking the attempts, you may add a captcha after, say, the 5th attempt. But don’t add the captcha for the first attempt – it is bad user experience.
  • Don’t leak information through error messages – you shouldn’t allow attackers to figure out if an email is registered or not. If an email is not found, upon login report just “Incorrect credentials”. On passwords reset, it may be something like “If your email is registered, you should have received a password reset email”. This is often at odds with usability – people don’t often remember the email they used to register, and the ability to check a number of them before getting in might be important. So this rule is not absolute, though it’s desirable, especially for more critical systems.
  • Make sure you use JWT only if it’s really necessary and be careful of the pitfalls.
  • Consider using a 3rd party authentication – OpenID Connect, OAuth by Google/Facebook/Twitter (but be careful with OAuth flaws as well). There’s an associated risk with relying on a 3rd party identity provider, and you still have to manage cookies, logout, etc., but some of the authentication aspects are simplified.
  • For high-risk or sensitive applications use 2-factor authentication. There’s a caveat with Google Authenticator though – if you lose your phone, you lose your accounts (unless there’s a manual process to restore it). That’s why Authy seems like a good solution for storing 2FA keys.
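
As a rough illustration of the stateless expiring-link approach mentioned in the forgotten-password bullet above, here is a minimal Python sketch. It is not a drop-in implementation: the secret key handling and the token layout are assumptions, and in a real application you would also want to tie the token to something that changes when the password does (such as the current password hash), so that a used link cannot be replayed.

import hashlib
import hmac
import time

SECRET_KEY = b'replace-with-a-long-random-server-side-secret'  # assumption: loaded from secure configuration

def create_reset_token(user_id, valid_for_seconds=3600):
    # Build an expiring token of the form userId:expiresTimestamp:hmac(userId+expiresTimestamp).
    expires = int(time.time()) + valid_for_seconds
    payload = '{}:{}'.format(user_id, expires).encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return '{}:{}:{}'.format(user_id, expires, signature)

def verify_reset_token(token):
    # Return the user id if the token is authentic and not expired, otherwise None.
    try:
        user_id, expires, signature = token.rsplit(':', 2)
    except ValueError:
        return None
    payload = '{}:{}'.format(user_id, expires).encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):  # constant-time comparison to avoid timing leaks
        return None
    if int(expires) < time.time():
        return None
    return user_id

# The token would be embedded in the emailed link, e.g. https://example.com/reset?token=<token>
token = create_reset_token('42')
print(verify_reset_token(token))  # prints '42' while the link is still valid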

I’m sure I’m missing something. And you see it’s complicated. Sadly we’re still at the point where the most common functionality – authenticating users – is so tricky and cumbersome, that you almost always get at least some of it wrong.

The post User Authentication Best Practices Checklist appeared first on Bozho's tech blog.

New – Machine Learning Inference at the Edge Using AWS Greengrass

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-machine-learning-inference-at-the-edge-using-aws-greengrass/

What happens when you combine the Internet of Things, Machine Learning, and Edge Computing? Before I tell you, let’s review each one and discuss what AWS has to offer.

Internet of Things (IoT) – Devices that connect the physical world and the digital one. The devices, often equipped with one or more types of sensors, can be found in factories, vehicles, mines, fields, homes, and so forth. Important AWS services include AWS IoT Core, AWS IoT Analytics, AWS IoT Device Management, and Amazon FreeRTOS, along with others that you can find on the AWS IoT page.

Machine Learning (ML) – Systems that can be trained using an at-scale dataset and statistical algorithms, and used to make inferences from fresh data. At Amazon we use machine learning to drive the recommendations that you see when you shop, to optimize the paths in our fulfillment centers, fly drones, and much more. We support leading open source machine learning frameworks such as TensorFlow and MXNet, and make ML accessible and easy to use through Amazon SageMaker. We also provide Amazon Rekognition for images and for video, Amazon Lex for chatbots, and a wide array of language services for text analysis, translation, speech recognition, and text to speech.

Edge Computing – The power to have compute resources and decision-making capabilities in disparate locations, often with intermittent or no connectivity to the cloud. AWS Greengrass builds on AWS IoT, giving you the ability to run Lambda functions and keep device state in sync even when not connected to the Internet.

ML Inference at the Edge
Today I would like to toss all three of these important new technologies into a blender! You can now perform Machine Learning inference at the edge using AWS Greengrass. This allows you to use the power of the AWS cloud (including fast, powerful instances equipped with GPUs) to build, train, and test your ML models before deploying them to small, low-powered, intermittently-connected IoT devices running in those factories, vehicles, mines, fields, and homes that I mentioned.

Here are a few of the many ways that you can put Greengrass ML Inference to use:

Precision Farming – With an ever-growing world population and unpredictable weather that can affect crop yields, the opportunity to use technology to increase yields is immense. Intelligent devices that are literally in the field can process images of soil, plants, pests, and crops, taking local corrective action and sending status reports to the cloud.

Physical Security – Smart devices (including the AWS DeepLens) can process images and scenes locally, looking for objects, watching for changes, and even detecting faces. When something of interest or concern arises, the device can pass the image or the video to the cloud and use Amazon Rekognition to take a closer look.

Industrial Maintenance – Smart, local monitoring can increase operational efficiency and reduce unplanned downtime. The monitors can run inference operations on power consumption, noise levels, and vibration to flag anomalies, predict failures, and detect faulty equipment.

Greengrass ML Inference Overview
There are several different aspects to this new AWS feature. Let’s take a look at each one:

Machine Learning Models – Precompiled TensorFlow and MXNet libraries, optimized for production use on the NVIDIA Jetson TX2 and Intel Atom devices, and development use on 32-bit Raspberry Pi devices. The optimized libraries can take advantage of GPU and FPGA hardware accelerators at the edge in order to provide fast, local inferences.

Model Building and Training – The ability to use Amazon SageMaker and other cloud-based ML tools to build, train, and test your models before deploying them to your IoT devices. To learn more about SageMaker, read Amazon SageMaker – Accelerated Machine Learning.

Model Deployment – SageMaker models can (if you give them the proper IAM permissions) be referenced directly from your Greengrass groups. You can also make use of models stored in S3 buckets. You can add a new machine learning resource to a group with a couple of clicks:

These new features are available now and you can start using them today! To learn more read Perform Machine Learning Inference.

Jeff;

 

Austria: hard times for public-service media

Post Syndicated from nellyo original https://nellyo.wordpress.com/2018/03/23/orf-2/

Austria’s new government is taking steps to strengthen its position in the media.

The public broadcaster with the largest audience in Austria – up to 4 million viewers in a population of 8.7 million – is financed mainly through a levy that the government wants to abolish. Various ministers have stated that they do not approve of ORF’s funding model. The vice-chancellor is the most direct, calling ORF a place where lies become news. Terms such as fake news and lügenpresse (“lying press”) are being applied to critical reporting, mirroring the way those terms are used by the current US administration.

ORF representatives see the attacks as part of the government’s attempts to gain greater political influence over the media sector. At the same time, media minister Blümel has publicly stated several times that the government intends to strengthen private radio and television broadcasters.

A broader picture of the worrying trends in Austria is available from www.indexoncensorship.org.

timeShift(GrafanaBuzz, 1w) Issue 35

Post Syndicated from Blogs on Grafana Labs Blog original https://grafana.com/blog/2018/02/23/timeshiftgrafanabuzz-1w-issue-35/

Welcome to TimeShift

This week’s timeShift will be abridged, as we’re busy putting the final touches on GrafanaCon EU. As I write this, we have 3 Angel tickets remaining and have surpassed 350 registered attendees. 100% of proceeds from these Angel tickets will go to the EFF (Electronic Frontier Foundation), a nonprofit that defends digital privacy and free speech; a cause we’re very passionate about. You can snag these last tickets here.

Amazon GameLift FleetIQ and Spot Instances – Save up to 90% On Game Server Hosting

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-gamelift-fleetiq-and-spot-instances-save-up-to-90-on-game-server-hosting/

Amazon GameLift is a scalable, cloud-based runtime environment for session-based multiplayer games. You simply upload a build of your game, tell Amazon GameLift which type of EC2 instances you’d like to host it on, and sit back while Amazon GameLift takes care of setting up sessions and maintaining a suitably-sized fleet of EC2 instances. This automatic scaling allows you to accommodate demand that varies over time without having to keep compute resources in reserve during quiet periods.

Use Spot Instances
Last week we added a new feature to further decrease your per-player, per-hour costs when you host your game on Amazon GameLift. Before that launch, Amazon GameLift instances were always launched in On-Demand form. Instances of this type are always billed at fixed prices, as detailed on the Amazon GameLift Pricing page.

You can now make use of Amazon GameLift Spot Instances in your GameLift fleets. These instances represent unused capacity and have prices that rise and fall over time. While your results will vary, you may see savings of up to 90% when compared to On-Demand Instances.

While you can use Spot Instances as a simple money-saving tool, there are other interesting use cases as well. Every game has a life cycle, along with a cadre of loyal players who want to keep on playing until you finally unplug and decommission the servers. You could create an Amazon GameLift fleet comprised of low-cost Spot Instances and keep that beloved game up and running as long as possible without breaking the bank. Behind the scenes, an Amazon GameLift Queue will make use of both Spot and On-Demand Instances, balancing price and availability in an attempt to give you the best possible service at the lowest price.

As I mentioned earlier, Spot Instances represent capacity that is not in use by On-Demand Instances. When this capacity decreases, existing Spot Instances could be interrupted with two minutes of notification and then terminated. Fortunately, there’s a lot of capacity and terminations are, statistically speaking, quite rare. To reduce the frequency even further, Amazon GameLift Queues now include a new feature that we call FleetIQ.

FleetIQ is powered by historical pricing and termination data for Spot Instances. This data, in combination with a very conservative strategy for choosing instance types, further reduces the odds that any particular game will be notified and then interrupted. The onProcessTerminate callback in your game’s server process will be activated if the underlying Spot Instance is about to be interrupted. At that point you have two minutes to close out the game, save any logs, free up any resources, and otherwise wrap things up. While you are doing this, you can call GetTerminationTime to see how much time remains.

Creating a Fleet
To take advantage of Spot Instances and FleetIQ, you can use the Amazon GameLift console or API to set up Queues with multiple fleets of Spot and On-Demand Instances. By adding more fleets into each Queue, you give FleetIQ more options to improve latency, interruption rate, and cost. To start a new game session on an instance, FleetIQ first selects the region with the lowest latency for each player, then chooses the fleet with the lowest interruption rate and cost.

Let’s walk through the process. I’ll create a fleet of On-Demand Instances and a fleet of Spot Instances, in that order:

And:

I take a quick break while the fleets are validated and activated:

Then I create a queue for my game. I select the fleets as the destinations for the queue:

If I am building a game that will have a global user base, I can create fleets in additional AWS Regions and use a player latency policy so that game sessions will be created in a suitable region:

To learn more about how to use this feature, take a look at the Spot Fleet Integration Guide.

Now Available
You can use Amazon GameLift Spot Instance fleets to host your session-based games now! Take a look, give it a try, and let me know what you think.

If you are planning to attend GDC this year, be sure to swing by booth 1001. Check out our GDC 2018 site for more information on our dev day talks, classroom sessions, and in-booth demos.

Jeff;

 

EFF Urges US Copyright Office To Reject Proactive ‘Piracy’ Filters

Post Syndicated from Andy original https://torrentfreak.com/eff-urges-us-copyright-office-to-reject-proactive-piracy-filters-180213/

Faced with millions of individuals consuming unlicensed audiovisual content from a variety of sources, entertainment industry groups have been seeking solutions closer to the roots of the problem.

As widespread site-blocking attempts to tackle ‘pirate’ sites in the background, greater attention has turned to legal platforms that host both licensed and unlicensed content.

Under current legislation, these sites and services can do business relatively comfortably due to the so-called safe harbor provisions of the US Digital Millennium Copyright Act (DMCA) and the European Union Copyright Directive (EUCD).

Both sets of legislation ensure that Internet platforms can avoid being held liable for the actions of others provided they themselves address infringement when they are made aware of specific problems. If a video hosting site has a copy of an unlicensed movie uploaded by a user, for example, it must be removed within a reasonable timeframe upon request from the copyright holder.

However, in both the US and EU there is mounting pressure to make it more difficult for online services to achieve ‘safe harbor’ protections.

Entertainment industry groups believe that platforms use the law to turn a blind eye to infringing content uploaded by users, content that is often monetized before being taken down. With this in mind, copyright holders on both sides of the Atlantic are pressing for more proactive regimes, ones that will see Internet platforms install filtering mechanisms to spot and discard infringing content before it can reach the public.

While such a system would be welcomed by rightsholders, Internet companies are fearful of a future in which they could be held more liable for the infringements of others. They’re supported by the EFF, who yesterday presented a petition to the US Copyright Office urging caution over potential changes to the DMCA.

“As Internet users, website owners, and online entrepreneurs, we urge you to preserve and strengthen the Digital Millennium Copyright Act safe harbors for Internet service providers,” the EFF writes.

“The DMCA safe harbors are key to keeping the Internet open to all. They allow anyone to launch a website, app, or other service without fear of crippling liability for copyright infringement by users.”

It is clear that pressure to introduce mandatory filtering is a concern to the EFF. Filters are blunt instruments that cannot fathom the intricacies of fair use and are liable to stifle free speech and stymie innovation, they argue.

“Major media and entertainment companies and their surrogates want Congress to replace today’s DMCA with a new law that would require websites and Internet services to use automated filtering to enforce copyrights.

“Systems like these, no matter how sophisticated, cannot accurately determine the copyright status of a work, nor whether a use is licensed, a fair use, or otherwise non-infringing. Simply put, automated filters censor lawful and important speech,” the EFF warns.

While its introduction was voluntary and doesn’t affect the company’s safe harbor protections, YouTube already has its own content filtering system in place.

ContentID is able to detect the nature of some content uploaded by users and give copyright holders a chance to remove or monetize it. The company says that the majority of copyright disputes are now handled by ContentID but the system is not perfect and mistakes are regularly flagged by users and mentioned in the media.

However, ContentID was also very expensive to implement so expecting smaller companies to deploy something similar on much more limited budgets could be a burden too far, the EFF warns.

“What’s more, even deeply flawed filters are prohibitively expensive for all but the largest Internet services. Requiring all websites to implement filtering would reinforce the market power wielded by today’s large Internet services and allow them to stifle competition. We urge you to preserve effective, usable DMCA safe harbors, and encourage Congress to do the same,” the EFF notes.

The same arguments, for and against, are currently raging in Europe where the EU Commission proposed mandatory upload filtering in 2016. Since then, opposition to the proposals has been fierce, with warnings of potential human rights breaches and conflicts with existing copyright law.

Back in the US, there are additional requirements for a provider to qualify for safe harbor, including having a named designated agent tasked with receiving copyright infringement notifications. This person’s name must be listed on a platform’s website and submitted to the US Copyright Office, which maintains a centralized online directory of designated agents’ contact information.

Under new rules, agents must be re-registered with the Copyright Office every three years, despite that not being a requirement under the DMCA. The EFF is concerned that by simply failing to re-register an agent, an otherwise responsible website could lose its safe harbor protections, even if the agent’s details have remained the same.

“We’re concerned that the new requirement will particularly disadvantage small and nonprofit websites. We ask you to reconsider this rule,” the EFF concludes.

The EFF’s letter to the Copyright Office can be found here.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons

Jailed Streaming Site Operator Hit With Fresh $3m Damages Lawsuit

Post Syndicated from Andy original https://torrentfreak.com/jailed-streaming-site-operator-hit-with-fresh-3m-damages-lawsuit-180207/

After being founded more than half a decade ago, Swefilmer grew to become Sweden’s most popular movie and TV show streaming site. It was only a question of time before authorities stepped in to bring the show to an end.

In 2015, a Swedish operator of the site in his early twenties was raided by local police. A second man, Turkish and in his late twenties, was later arrested in Germany.

The pair, who hadn’t met in person, appeared before the Varberg District Court in January 2017, accused of making more than $1.5m from their activities between November 2013 and June 2015.

The prosecutor described Swefilmer as “organized crime”, painting the then 26-year-old as the main brains behind the site and the 23-year-old as playing a much smaller role. The former was said to have led a luxury lifestyle after benefiting from $1.5m in advertising revenue.

The sentences eventually handed down matched the defendants’ alleged level of participation. While the younger man received probation and community service, the Turk was sentenced to serve three years in prison and ordered to forfeit $1.59m.

Very quickly it became clear there would be an appeal, with plaintiffs represented by anti-piracy outfit RightsAlliance complaining that their 10m krona ($1.25m) claim for damages over the unlawful distribution of local movie Johan Falk: Kodnamn: Lisa had been ruled out by the Court.

With the appeal hearing now just a couple of weeks away, Swedish outlet Breakit is reporting that media giant Bonnier Broadcasting has launched an action of its own against the now 27-year-old former operator of Swefilmer.

According to the publication, Bonnier’s pay-TV company C More, which distributes for Fox, MGM, Paramount, Universal, Sony and Warner, is set to demand around 24m krona ($3.01m) via anti-piracy outfit RightsAlliance.

“This is about organized crime and grossly criminal individuals who earned huge sums on our and others’ content. We want to take every opportunity to take advantage of our rights,” says Johan Gustafsson, Head of Corporate Communications at Bonnier Broadcasting.

C More reportedly filed its lawsuit at the Stockholm District Court on January 30, 2018. At its core are four local movies said to have been uploaded and made available via Swefilmer.

“C More would probably never even have granted a license to [the operator] to make or allow others to make the films available to the public in a similar way as [the operator] did, but if that had happened, the fee would not be less than 5,000,000 krona ($628,350) per film or a total of 20,000,000 krona ($2,513,400),” C More’s claim reads.

Speaking with Breakit, lawyer Ansgar Firsching said he couldn’t say much about C More’s claims against his client.

“I am very surprised that two weeks before the main hearing [C More] comes in with this requirement. If you open another front, we have two trials that are partly about the same thing,” he said.

Firsching said he couldn’t elaborate at this stage but expects his client to deny the claim for damages. C More sees things differently.

“Many people live under the illusion that sites like Swefilmer are driven by idealistic teens in their parents’ basements, which is completely wrong. This is about organized crime where our content is used to generate millions and millions in revenue,” the company notes.

The appeal in the main case is set to go ahead February 20th.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN discounts, offers and coupons