Tag Archives: Behavior

Monitoring your Amazon SNS message filtering activity with Amazon CloudWatch

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/monitoring-your-amazon-sns-message-filtering-activity-with-amazon-cloudwatch/

This post is courtesy of Otavio Ferreira, Manager, Amazon SNS, AWS Messaging.

Amazon SNS message filtering provides a set of string and numeric matching operators that allow each subscription to receive only the messages of interest. Hence, SNS message filtering can simplify your pub/sub messaging architecture by offloading the message filtering logic from your subscriber systems, as well as the message routing logic from your publisher systems.

After you set the subscription attribute that defines a filter policy, the subscribing endpoint receives only the messages that carry attributes matching this filter policy. Other messages published to the topic are filtered out for this subscription. In this way, the native integration between SNS and Amazon CloudWatch provides visibility into the number of messages delivered, as well as the number of messages filtered out.

CloudWatch metrics are captured automatically for you. To get started with SNS message filtering, see Filtering Messages with Amazon SNS.

Message Filtering Metrics

The following six CloudWatch metrics are relevant to understanding your SNS message filtering activity:

  • NumberOfMessagesPublished – Inbound traffic to SNS. This metric tracks all the messages that have been published to the topic.
  • NumberOfNotificationsDelivered – Outbound traffic from SNS. This metric tracks all the messages that have been successfully delivered to endpoints subscribed to the topic. A delivery takes place either when the incoming message attributes match a subscription filter policy, or when the subscription has no filter policy at all, which results in a catch-all behavior.
  • NumberOfNotificationsFilteredOut – This metric tracks all the messages that were filtered out because they carried attributes that didn’t match the subscription filter policy.
  • NumberOfNotificationsFilteredOut-NoMessageAttributes – This metric tracks all the messages that were filtered out because they didn’t carry any attributes at all and, consequently, didn’t match the subscription filter policy.
  • NumberOfNotificationsFilteredOut-InvalidAttributes – This metric keeps track of messages that were filtered out because they carried invalid or malformed attributes and, thus, didn’t match the subscription filter policy.
  • NumberOfNotificationsFailed – This last metric tracks all the messages that failed to be delivered to subscribing endpoints, regardless of whether a filter policy had been set for the endpoint. This metric is emitted after the message delivery retry policy is exhausted, and SNS stops attempting to deliver the message. At that moment, the subscribing endpoint is likely no longer reachable. For example, the subscribing SQS queue or Lambda function has been deleted by its owner. You may want to closely monitor this metric to address message delivery issues quickly.

Message filtering graphs

Through the AWS Management Console, you can compose graphs to display your SNS message filtering activity. The graph shows the number of messages published, delivered, and filtered out within the timeframe you specify (1h, 3h, 12h, 1d, 3d, 1w, or custom).

SNS message filtering for CloudWatch Metrics

To compose an SNS message filtering graph with CloudWatch:

  1. Open the CloudWatch console.
  2. Choose Metrics, SNS, All Metrics, and Topic Metrics.
  3. Select all metrics to add to the graph, such as:
    • NumberOfMessagesPublished
    • NumberOfNotificationsDelivered
    • NumberOfNotificationsFilteredOut
  4. Choose Graphed metrics.
  5. In the Statistic column, switch from Average to Sum.
  6. Title your graph with a descriptive name, such as “SNS Message Filtering”

After you have your graph set up, you may want to copy the graph link for bookmarking, emailing, or sharing with co-workers. You may also want to add your graph to a CloudWatch dashboard for easy access in the future. Both actions are available to you on the Actions menu, which is found above the graph.

Summary

SNS message filtering defines how SNS topics behave in terms of message delivery. By using CloudWatch metrics, you gain visibility into the number of messages published, delivered, and filtered out. This enables you to validate the operation of filter policies and more easily troubleshoot during development phases.

SNS message filtering can be implemented easily with existing AWS SDKs by applying message and subscription attributes across all SNS supported protocols (Amazon SQS, AWS Lambda, HTTP, SMS, email, and mobile push). CloudWatch metrics for SNS message filtering is available now, in all AWS Regions.

For information about pricing, see the CloudWatch pricing page.

For more information, see:

Security and Human Behavior (SHB 2018)

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/05/security_and_hu_7.html

I’m at Carnegie Mellon University, at the eleventh Workshop on Security and Human Behavior.

SHB is a small invitational gathering of people studying various aspects of the human side of security, organized each year by Alessandro Acquisti, Ross Anderson, and myself. The 50 or so people in the room include psychologists, economists, computer security researchers, sociologists, political scientists, neuroscientists, designers, lawyers, philosophers, anthropologists, business school professors, and a smattering of others. It’s not just an interdisciplinary event; most of the people here are individually interdisciplinary.

The goal is to maximize discussion and interaction. We do that by putting everyone on panels, and limiting talks to 7-10 minutes. The rest of the time is left to open discussion. Four hour-and-a-half panels per day over two days equals eight panels; six people per panel means that 48 people get to speak. We also have lunches, dinners, and receptions — all designed so people from different disciplines talk to each other.

I invariably find this to be the most intellectually stimulating conference of my year. It influences my thinking in many different, and sometimes surprising, ways.

This year’s program is here. This page lists the participants and includes links to some of their work. As he does every year, Ross Anderson is liveblogging the talks. (Ross also maintains a good webpage of psychology and security resources.)

Here are my posts on the first, second, third, fourth, fifth, sixth, seventh, eighth, ninth, and tenth SHB workshops. Follow those links to find summaries, papers, and occasionally audio recordings of the various workshops.

Next year, I’ll be hosting the event at Harvard.

Solving Complex Ordering Challenges with Amazon SQS FIFO Queues

Post Syndicated from Christie Gifrin original https://aws.amazon.com/blogs/compute/solving-complex-ordering-challenges-with-amazon-sqs-fifo-queues/

Contributed by Shea Lutton, AWS Cloud Infrastructure Architect

Amazon Simple Queue Service (Amazon SQS) is a fully managed queuing service that helps decouple applications, distributed systems, and microservices to increase fault tolerance. SQS queues come in two distinct types:

  • Standard SQS queues are able to scale to enormous throughput with at-least-once delivery.
  • FIFO queues are designed to guarantee that messages are processed exactly once in the exact order that they are received and have a default rate of 300 transactions per second.

As customers explore SQS FIFO queues, they often have questions about how the behavior works when messages arrive and are consumed. This post walks through some common situations to identify the exact behavior that you can expect. It also covers the behavior of message groups in depth and explains why message groups are key to understanding how FIFO queues work.

The simple case

Suppose that you run a major auction platform where people buy and sell a wide range of products. Your platform requires that transactions from buyers and sellers get processed in exactly the order received. Here’s how a FIFO queue helps you keep all your transactions in one straight flow.

A seller currently is holding an auction for a laptop, and three different bids are received for the same price. Ties are awarded to the first bidder at that price so it is important to track which arrived first. Your auction platform receives the three bids and sends them to a FIFO queue before they are processed.

Now observe how messages leave the queue. When your consumer asks for a batch of up to 10 messages, SQS starts filling the batch with the oldest message (bid A1). It keeps filling until either the batch is full or the queue is empty. In this case, the batch contains the three messages and the queue is now empty. After a batch has left the queue, SQS considers that batch of messages to be “in-flight” until the consumer either deletes them or the batch’s visibility timer expires.

 

When you have a single consumer, this is easy to envision. The consumer gets a batch of messages (now in-flight), does its processing, and deletes the messages. That consumer is then ready to ask for the next batch of messages.

The critical thing to keep in mind is that SQS won’t release the next batch of messages until the first batch has been deleted. By adding more messages to the queue, you can see more interesting behaviors. Imagine that a burst of 11 bids is sent to your FIFO queue, with two bids for Auction A arriving last.

The FIFO queue now has at least two batches of messages in it. When your single consumer requests the first batch of 10 messages, it receives a batch starting with B1 and ending with A1. Later, after the first batch has been deleted, the consumer can get the second batch of messages containing the final A2 message from the queue.

Adding complexity with multiple message groups

A new challenge arises. Your auction platform is getting busier and your dev team added a number of new features. The combination of increased messages and extra processing time for the new features means that a single consumer is too slow. The solution is to scale to have more consumers and process messages in parallel.

To work in parallel, your team realized that only the messages related to a single auction must be kept in order. All transactions for Auction A need to be kept in order and so do all transactions for Auction B. But the two auctions are independent and it does not matter which auctions transactions are processed first.

FIFO can handle that case with a feature called message groups. Each transaction related to Auction A is placed by your producer into message group A, and so on. In the diagram below, Auction A and Auction B each received three bid transactions, with bid B1 arriving first. The FIFO queue always keeps transactions within a message group in the order in which they arrived.

How is this any different than earlier examples? The consumer now gets the messages ordered by message groups, all the B group messages followed by all the A group messages. Multiple message groups create the possibility of using multiple consumers, which I explain in a moment. If FIFO can’t fill up a batch of messages with a single message group, FIFO can place more than one message group in a batch of messages. But whenever possible, the queue gives you a full batch of messages from the same group.

The order of messages leaving a FIFO queue is governed by three rules:

  1. Return the oldest message where no other message in the same message group is currently in-flight.
  2. Return as many messages from the same message group as possible.
  3. If a message batch is still not full, go back to rule 1.

To see this behavior, add a second consumer and insert many more messages into the queue. For simplicity, the delete message action has been omitted in these diagrams but it is assumed that all messages in a batch are processed successfully by the consumer and the batch is properly deleted immediately after.

In this example, there are 11 Group A and 11 Group B transactions arriving in interleaved order and a second consumer has been added. Consumer 1 asks for a group of 10 messages and receives 10 Group A messages. Consumer 2 then asks for 10 messages but SQS knows that Group A is in flight, so it releases 10 Group B messages. The two consumers are now processing two batches of messages in parallel, speeding up throughput and then deleting their batches. When Consumer 1 requests the next batch of messages, it receives the remaining two messages, one from Group A and one from Group B.

Consider this nuanced detail from the example above. What would happen if Consumer 1 was on a faster server and processed its first batch of messages before Consumer 2 could mark its messages for deletion? See if you can predict the behavior before looking at the answer.

If Consumer 2 has not deleted its Group B messages yet when Consumer 1 asks for the next batch, then the FIFO queue considers Group B to still be in flight. It does not release any more Group B messages. Consumer 1 gets only the remaining Group A message. Later, after Consumer 2 has deleted its first batch, the remaining Group B message is released.

Conclusion

I hope this post answered your questions about how Amazon SQS FIFO queues work and why message groups are helpful. If you’re interested in exploring SQS FIFO queues further, here are a few ideas to get you started:

[$] Using user-space tracepoints with BPF

Post Syndicated from corbet original https://lwn.net/Articles/753601/rss

Much has been written on LWN about dynamically instrumenting kernel
code. These features are also available to user-space code with a
special kind of probe known as a User Statically-Defined Tracing
(USDT) probe. These probes provide a low-overhead way of
instrumenting user-space code and provide a convenient way to debug applications
running in production. In this final article of the BPF and BCC series
we’ll look at where USDT probes come from and how you can use them to
understand the behavior of your own applications.

The intersection of Customer Engagement and Data Science

Post Syndicated from Brent Meyer original https://aws.amazon.com/blogs/messaging-and-targeting/the-intersection-of-customer-engagement-and-data-science/

On the Messaging and Targeting team, we’re constantly inspired by the new and novel ways that customers use our services. For example, last year we took an in-depth look at a customer who built a fully featured email marketing platform based on Amazon SES and other AWS Services.

This week, our friends on the AWS Machine Learning team published a blog post that brings together the worlds of data science and customer engagement. Their solution uses Amazon SageMaker (a platform for building and deploying machine learning models) to create a system that makes purchasing predictions based on customers’ past behaviors. It then uses Amazon Pinpoint to send campaigns to customers based on these predictions.

The blog post is an interesting read that includes a primer on the process of creating a useful Machine Learning solution. It then goes in-depth, discussing the real-world considerations that are involved in implementing the solution.

Take a look at their post, Amazon Pinpoint campaigns driven by machine learning on Amazon SageMaker, on the AWS Machine Learning Blog.

Analyze data in Amazon DynamoDB using Amazon SageMaker for real-time prediction

Post Syndicated from YongSeong Lee original https://aws.amazon.com/blogs/big-data/analyze-data-in-amazon-dynamodb-using-amazon-sagemaker-for-real-time-prediction/

Many companies across the globe use Amazon DynamoDB to store and query historical user-interaction data. DynamoDB is a fast NoSQL database used by applications that need consistent, single-digit millisecond latency.

Often, customers want to turn their valuable data in DynamoDB into insights by analyzing a copy of their table stored in Amazon S3. Doing this separates their analytical queries from their low-latency critical paths. This data can be the primary source for understanding customers’ past behavior, predicting future behavior, and generating downstream business value. Customers often turn to DynamoDB because of its great scalability and high availability. After a successful launch, many customers want to use the data in DynamoDB to predict future behaviors or provide personalized recommendations.

DynamoDB is a good fit for low-latency reads and writes, but it’s not practical to scan all data in a DynamoDB database to train a model. In this post, I demonstrate how you can use DynamoDB table data copied to Amazon S3 by AWS Data Pipeline to predict customer behavior. I also demonstrate how you can use this data to provide personalized recommendations for customers using Amazon SageMaker. You can also run ad hoc queries using Amazon Athena against the data. DynamoDB recently released on-demand backups to create full table backups with no performance impact. However, it’s not suitable for our purposes in this post, so I chose AWS Data Pipeline instead to create managed backups are accessible from other services.

To do this, I describe how to read the DynamoDB backup file format in Data Pipeline. I also describe how to convert the objects in S3 to a CSV format that Amazon SageMaker can read. In addition, I show how to schedule regular exports and transformations using Data Pipeline. The sample data used in this post is from Bank Marketing Data Set of UCI.

The solution that I describe provides the following benefits:

  • Separates analytical queries from production traffic on your DynamoDB table, preserving your DynamoDB read capacity units (RCUs) for important production requests
  • Automatically updates your model to get real-time predictions
  • Optimizes for performance (so it doesn’t compete with DynamoDB RCUs after the export) and for cost (using data you already have)
  • Makes it easier for developers of all skill levels to use Amazon SageMaker

All code and data set in this post are available in this .zip file.

Solution architecture

The following diagram shows the overall architecture of the solution.

The steps that data follows through the architecture are as follows:

  1. Data Pipeline regularly copies the full contents of a DynamoDB table as JSON into an S3
  2. Exported JSON files are converted to comma-separated value (CSV) format to use as a data source for Amazon SageMaker.
  3. Amazon SageMaker renews the model artifact and update the endpoint.
  4. The converted CSV is available for ad hoc queries with Amazon Athena.
  5. Data Pipeline controls this flow and repeats the cycle based on the schedule defined by customer requirements.

Building the auto-updating model

This section discusses details about how to read the DynamoDB exported data in Data Pipeline and build automated workflows for real-time prediction with a regularly updated model.

Download sample scripts and data

Before you begin, take the following steps:

  1. Download sample scripts in this .zip file.
  2. Unzip the src.zip file.
  3. Find the automation_script.sh file and edit it for your environment. For example, you need to replace 's3://<your bucket>/<datasource path>/' with your own S3 path to the data source for Amazon ML. In the script, the text enclosed by angle brackets—< and >—should be replaced with your own path.
  4. Upload the json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar file to your S3 path so that the ADD jar command in Apache Hive can refer to it.

For this solution, the banking.csv  should be imported into a DynamoDB table.

Export a DynamoDB table

To export the DynamoDB table to S3, open the Data Pipeline console and choose the Export DynamoDB table to S3 template. In this template, Data Pipeline creates an Amazon EMR cluster and performs an export in the EMRActivity activity. Set proper intervals for backups according to your business requirements.

One core node(m3.xlarge) provides the default capacity for the EMR cluster and should be suitable for the solution in this post. Leave the option to resize the cluster before running enabled in the TableBackupActivity activity to let Data Pipeline scale the cluster to match the table size. The process of converting to CSV format and renewing models happens in this EMR cluster.

For a more in-depth look at how to export data from DynamoDB, see Export Data from DynamoDB in the Data Pipeline documentation.

Add the script to an existing pipeline

After you export your DynamoDB table, you add an additional EMR step to EMRActivity by following these steps:

  1. Open the Data Pipeline console and choose the ID for the pipeline that you want to add the script to.
  2. For Actions, choose Edit.
  3. In the editing console, choose the Activities category and add an EMR step using the custom script downloaded in the previous section, as shown below.

Paste the following command into the new step after the data ­­upload step:

s3://#{myDDBRegion}.elasticmapreduce/libs/script-runner/script-runner.jar,s3://<your bucket name>/automation_script.sh,#{output.directoryPath},#{myDDBRegion}

The element #{output.directoryPath} references the S3 path where the data pipeline exports DynamoDB data as JSON. The path should be passed to the script as an argument.

The bash script has two goals, converting data formats and renewing the Amazon SageMaker model. Subsequent sections discuss the contents of the automation script.

Automation script: Convert JSON data to CSV with Hive

We use Apache Hive to transform the data into a new format. The Hive QL script to create an external table and transform the data is included in the custom script that you added to the Data Pipeline definition.

When you run the Hive scripts, do so with the -e option. Also, define the Hive table with the 'org.openx.data.jsonserde.JsonSerDe' row format to parse and read JSON format. The SQL creates a Hive EXTERNAL table, and it reads the DynamoDB backup data on the S3 path passed to it by Data Pipeline.

Note: You should create the table with the “EXTERNAL” keyword to avoid the backup data being accidentally deleted from S3 if you drop the table.

The full automation script for converting follows. Add your own bucket name and data source path in the highlighted areas.

#!/bin/bash
hive -e "
ADD jar s3://<your bucket name>/json-serde-1.3.6-SNAPSHOT-jar-with-dependencies.jar ; 
DROP TABLE IF EXISTS blog_backup_data ;
CREATE EXTERNAL TABLE blog_backup_data (
 customer_id map<string,string>,
 age map<string,string>, job map<string,string>, 
 marital map<string,string>,education map<string,string>, 
 default map<string,string>, housing map<string,string>,
 loan map<string,string>, contact map<string,string>, 
 month map<string,string>, day_of_week map<string,string>, 
 duration map<string,string>, campaign map<string,string>,
 pdays map<string,string>, previous map<string,string>, 
 poutcome map<string,string>, emp_var_rate map<string,string>, 
 cons_price_idx map<string,string>, cons_conf_idx map<string,string>,
 euribor3m map<string,string>, nr_employed map<string,string>, 
 y map<string,string> ) 
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' 
LOCATION '$1/';

INSERT OVERWRITE DIRECTORY 's3://<your bucket name>/<datasource path>/' 
SELECT concat( customer_id['s'],',', 
 age['n'],',', job['s'],',', 
 marital['s'],',', education['s'],',', default['s'],',', 
 housing['s'],',', loan['s'],',', contact['s'],',', 
 month['s'],',', day_of_week['s'],',', duration['n'],',', 
 campaign['n'],',',pdays['n'],',',previous['n'],',', 
 poutcome['s'],',', emp_var_rate['n'],',', cons_price_idx['n'],',',
 cons_conf_idx['n'],',', euribor3m['n'],',', nr_employed['n'],',', y['n'] ) 
FROM blog_backup_data
WHERE customer_id['s'] > 0 ; 

After creating an external table, you need to read data. You then use the INSERT OVERWRITE DIRECTORY ~ SELECT command to write CSV data to the S3 path that you designated as the data source for Amazon SageMaker.

Depending on your requirements, you can eliminate or process the columns in the SELECT clause in this step to optimize data analysis. For example, you might remove some columns that have unpredictable correlations with the target value because keeping the wrong columns might expose your model to “overfitting” during the training. In this post, customer_id  columns is removed. Overfitting can make your prediction weak. More information about overfitting can be found in the topic Model Fit: Underfitting vs. Overfitting in the Amazon ML documentation.

Automation script: Renew the Amazon SageMaker model

After the CSV data is replaced and ready to use, create a new model artifact for Amazon SageMaker with the updated dataset on S3.  For renewing model artifact, you must create a new training job.  Training jobs can be run using the AWS SDK ( for example, Amazon SageMaker boto3 ) or the Amazon SageMaker Python SDK that can be installed with “pip install sagemaker” command as well as the AWS CLI for Amazon SageMaker described in this post.

In addition, consider how to smoothly renew your existing model without service impact, because your model is called by applications in real time. To do this, you need to create a new endpoint configuration first and update a current endpoint with the endpoint configuration that is just created.

#!/bin/bash
## Define variable 
REGION=$2
DTTIME=`date +%Y-%m-%d-%H-%M-%S`
ROLE="<your AmazonSageMaker-ExecutionRole>" 


# Select containers image based on region.  
case "$REGION" in
"us-west-2" )
    IMAGE="174872318107.dkr.ecr.us-west-2.amazonaws.com/linear-learner:latest"
    ;;
"us-east-1" )
    IMAGE="382416733822.dkr.ecr.us-east-1.amazonaws.com/linear-learner:latest" 
    ;;
"us-east-2" )
    IMAGE="404615174143.dkr.ecr.us-east-2.amazonaws.com/linear-learner:latest" 
    ;;
"eu-west-1" )
    IMAGE="438346466558.dkr.ecr.eu-west-1.amazonaws.com/linear-learner:latest" 
    ;;
 *)
    echo "Invalid Region Name"
    exit 1 ;  
esac

# Start training job and creating model artifact 
TRAINING_JOB_NAME=TRAIN-${DTTIME} 
S3OUTPUT="s3://<your bucket name>/model/" 
INSTANCETYPE="ml.m4.xlarge"
INSTANCECOUNT=1
VOLUMESIZE=5 
aws sagemaker create-training-job --training-job-name ${TRAINING_JOB_NAME} --region ${REGION}  --algorithm-specification TrainingImage=${IMAGE},TrainingInputMode=File --role-arn ${ROLE}  --input-data-config '[{ "ChannelName": "train", "DataSource": { "S3DataSource": { "S3DataType": "S3Prefix", "S3Uri": "s3://<your bucket name>/<datasource path>/", "S3DataDistributionType": "FullyReplicated" } }, "ContentType": "text/csv", "CompressionType": "None" , "RecordWrapperType": "None"  }]'  --output-data-config S3OutputPath=${S3OUTPUT} --resource-config  InstanceType=${INSTANCETYPE},InstanceCount=${INSTANCECOUNT},VolumeSizeInGB=${VOLUMESIZE} --stopping-condition MaxRuntimeInSeconds=120 --hyper-parameters feature_dim=20,predictor_type=binary_classifier  

# Wait until job completed 
aws sagemaker wait training-job-completed-or-stopped --training-job-name ${TRAINING_JOB_NAME}  --region ${REGION}

# Get newly created model artifact and create model
MODELARTIFACT=`aws sagemaker describe-training-job --training-job-name ${TRAINING_JOB_NAME} --region ${REGION}  --query 'ModelArtifacts.S3ModelArtifacts' --output text `
MODELNAME=MODEL-${DTTIME}
aws sagemaker create-model --region ${REGION} --model-name ${MODELNAME}  --primary-container Image=${IMAGE},ModelDataUrl=${MODELARTIFACT}  --execution-role-arn ${ROLE}

# create a new endpoint configuration 
CONFIGNAME=CONFIG-${DTTIME}
aws sagemaker  create-endpoint-config --region ${REGION} --endpoint-config-name ${CONFIGNAME}  --production-variants  VariantName=Users,ModelName=${MODELNAME},InitialInstanceCount=1,InstanceType=ml.m4.xlarge

# create or update the endpoint
STATUS=`aws sagemaker describe-endpoint --endpoint-name  ServiceEndpoint --query 'EndpointStatus' --output text --region ${REGION} `
if [[ $STATUS -ne "InService" ]] ;
then
    aws sagemaker  create-endpoint --endpoint-name  ServiceEndpoint  --endpoint-config-name ${CONFIGNAME} --region ${REGION}    
else
    aws sagemaker  update-endpoint --endpoint-name  ServiceEndpoint  --endpoint-config-name ${CONFIGNAME} --region ${REGION}
fi

Grant permission

Before you execute the script, you must grant proper permission to Data Pipeline. Data Pipeline uses the DataPipelineDefaultResourceRole role by default. I added the following policy to DataPipelineDefaultResourceRole to allow Data Pipeline to create, delete, and update the Amazon SageMaker model and data source in the script.

{
 "Version": "2012-10-17",
 "Statement": [
 {
 "Effect": "Allow",
 "Action": [
 "sagemaker:CreateTrainingJob",
 "sagemaker:DescribeTrainingJob",
 "sagemaker:CreateModel",
 "sagemaker:CreateEndpointConfig",
 "sagemaker:DescribeEndpoint",
 "sagemaker:CreateEndpoint",
 "sagemaker:UpdateEndpoint",
 "iam:PassRole"
 ],
 "Resource": "*"
 }
 ]
}

Use real-time prediction

After you deploy a model into production using Amazon SageMaker hosting services, your client applications use this API to get inferences from the model hosted at the specified endpoint. This approach is useful for interactive web, mobile, or desktop applications.

Following, I provide a simple Python code example that queries against Amazon SageMaker endpoint URL with its name (“ServiceEndpoint”) and then uses them for real-time prediction.

=== Python sample for real-time prediction ===

#!/usr/bin/env python
import boto3
import json 

client = boto3.client('sagemaker-runtime', region_name ='<your region>' )
new_customer_info = '34,10,2,4,1,2,1,1,6,3,190,1,3,4,3,-1.7,94.055,-39.8,0.715,4991.6'
response = client.invoke_endpoint(
    EndpointName='ServiceEndpoint',
    Body=new_customer_info, 
    ContentType='text/csv'
)
result = json.loads(response['Body'].read().decode())
print(result)
--- output(response) ---
{u'predictions': [{u'score': 0.7528127431869507, u'predicted_label': 1.0}]}

Solution summary

The solution takes the following steps:

  1. Data Pipeline exports DynamoDB table data into S3. The original JSON data should be kept to recover the table in the rare event that this is needed. Data Pipeline then converts JSON to CSV so that Amazon SageMaker can read the data.Note: You should select only meaningful attributes when you convert CSV. For example, if you judge that the “campaign” attribute is not correlated, you can eliminate this attribute from the CSV.
  2. Train the Amazon SageMaker model with the new data source.
  3. When a new customer comes to your site, you can judge how likely it is for this customer to subscribe to your new product based on “predictedScores” provided by Amazon SageMaker.
  4. If the new user subscribes your new product, your application must update the attribute “y” to the value 1 (for yes). This updated data is provided for the next model renewal as a new data source. It serves to improve the accuracy of your prediction. With each new entry, your application can become smarter and deliver better predictions.

Running ad hoc queries using Amazon Athena

Amazon Athena is a serverless query service that makes it easy to analyze large amounts of data stored in Amazon S3 using standard SQL. Athena is useful for examining data and collecting statistics or informative summaries about data. You can also use the powerful analytic functions of Presto, as described in the topic Aggregate Functions of Presto in the Presto documentation.

With the Data Pipeline scheduled activity, recent CSV data is always located in S3 so that you can run ad hoc queries against the data using Amazon Athena. I show this with example SQL statements following. For an in-depth description of this process, see the post Interactive SQL Queries for Data in Amazon S3 on the AWS News Blog. 

Creating an Amazon Athena table and running it

Simply, you can create an EXTERNAL table for the CSV data on S3 in Amazon Athena Management Console.

=== Table Creation ===
CREATE EXTERNAL TABLE datasource (
 age int, 
 job string, 
 marital string , 
 education string, 
 default string, 
 housing string, 
 loan string, 
 contact string, 
 month string, 
 day_of_week string, 
 duration int, 
 campaign int, 
 pdays int , 
 previous int , 
 poutcome string, 
 emp_var_rate double, 
 cons_price_idx double,
 cons_conf_idx double, 
 euribor3m double, 
 nr_employed double, 
 y int 
)
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ',' ESCAPED BY '\\' LINES TERMINATED BY '\n' 
LOCATION 's3://<your bucket name>/<datasource path>/';

The following query calculates the correlation coefficient between the target attribute and other attributes using Amazon Athena.

=== Sample Query ===

SELECT corr(age,y) AS correlation_age_and_target, 
 corr(duration,y) AS correlation_duration_and_target, 
 corr(campaign,y) AS correlation_campaign_and_target,
 corr(contact,y) AS correlation_contact_and_target
FROM ( SELECT age , duration , campaign , y , 
 CASE WHEN contact = 'telephone' THEN 1 ELSE 0 END AS contact 
 FROM datasource 
 ) datasource ;

Conclusion

In this post, I introduce an example of how to analyze data in DynamoDB by using table data in Amazon S3 to optimize DynamoDB table read capacity. You can then use the analyzed data as a new data source to train an Amazon SageMaker model for accurate real-time prediction. In addition, you can run ad hoc queries against the data on S3 using Amazon Athena. I also present how to automate these procedures by using Data Pipeline.

You can adapt this example to your specific use case at hand, and hopefully this post helps you accelerate your development. You can find more examples and use cases for Amazon SageMaker in the video AWS 2017: Introducing Amazon SageMaker on the AWS website.

 


Additional Reading

If you found this post useful, be sure to check out Serving Real-Time Machine Learning Predictions on Amazon EMR and Analyzing Data in S3 using Amazon Athena.

 


About the Author

Yong Seong Lee is a Cloud Support Engineer for AWS Big Data Services. He is interested in every technology related to data/databases and helping customers who have difficulties in using AWS services. His motto is “Enjoy life, be curious and have maximum experience.”

 

 

[$] Three sessions on memory control groups

Post Syndicated from corbet original https://lwn.net/Articles/753162/rss

Memory control groups allow the system administrator to impose memory-use
limits on the members of control groups. In many ways, these limits behave
like the overall limit on available memory, but there are also some
differences. The behavior of the memory controller also changed with the
advent of the version-2 control-group API, creating problems for at least
one significant user. Three sessions held in the memory-management track of
the Linux Storage, Filesystem, and Memory-Management Summit explored some
of these problems.

IoT Inspector Tool from Princeton

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/05/iot_inspector_t.html

Researchers at Princeton University have released IoT Inspector, a tool that analyzes the security and privacy of IoT devices by examining the data they send across the Internet. They’ve already used the tool to study a bunch of different IoT devices. From their blog post:

Finding #3: Many IoT Devices Contact a Large and Diverse Set of Third Parties

In many cases, consumers expect that their devices contact manufacturers’ servers, but communication with other third-party destinations may not be a behavior that consumers expect.

We have found that many IoT devices communicate with third-party services, of which consumers are typically unaware. We have found many instances of third-party communications in our analyses of IoT device network traffic. Some examples include:

  • Samsung Smart TV. During the first minute after power-on, the TV talks to Google Play, Double Click, Netflix, FandangoNOW, Spotify, CBS, MSNBC, NFL, Deezer, and Facebook­even though we did not sign in or create accounts with any of them.
  • Amcrest WiFi Security Camera. The camera actively communicates with cellphonepush.quickddns.com using HTTPS. QuickDDNS is a Dynamic DNS service provider operated by Dahua. Dahua is also a security camera manufacturer, although Amcrest’s website makes no references to Dahua. Amcrest customer service informed us that Dahua was the original equipment manufacturer.

  • Halo Smoke Detector. The smart smoke detector communicates with broker.xively.com. Xively offers an MQTT service, which allows manufacturers to communicate with their devices.

  • Geeni Light Bulb. The Geeni smart bulb communicates with gw.tuyaus.com, which is operated by TuYa, a China-based company that also offers an MQTT service.

We also looked at a number of other devices, such as Samsung Smart Camera and TP-Link Smart Plug, and found communications with third parties ranging from NTP pools (time servers) to video storage services.

Their first two findings are that “Many IoT devices lack basic encryption and authentication” and that “User behavior can be inferred from encrypted IoT device traffic.” No surprises there.

Boingboing post.

Related: IoT Hall of Shame.

Friday Squid Blogging: Bizarre Contorted Squid

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/04/friday_squid_bl_622.html

This bizarre contorted squid might be a new species, or a previously known species exhibiting a new behavior. No one knows.

As usual, you can also use this squid post to talk about the security stories in the news that I haven’t covered.

Read my blog posting guidelines here.

Secure Build with AWS CodeBuild and LayeredInsight

Post Syndicated from Asif Khan original https://aws.amazon.com/blogs/devops/secure-build-with-aws-codebuild-and-layeredinsight/

This post is written by Asif Awan, Chief Technology Officer of Layered InsightSubin Mathew – Software Development Manager for AWS CodeBuild, and Asif Khan – Solutions Architect

Enterprises adopt containers because they recognize the benefits: speed, agility, portability, and high compute density. They understand how accelerating application delivery and deployment pipelines makes it possible to rapidly slipstream new features to customers. Although the benefits are indisputable, this acceleration raises concerns about security and corporate compliance with software governance. In this blog post, I provide a solution that shows how Layered Insight, the pioneer and global leader in container-native application protection, can be used with seamless application build and delivery pipelines like those available in AWS CodeBuild to address these concerns.

Layered Insight solutions

Layered Insight enables organizations to unify DevOps and SecOps by providing complete visibility and control of containerized applications. Using the industry’s first embedded security approach, Layered Insight solves the challenges of container performance and protection by providing accurate insight into container images, adaptive analysis of running containers, and automated enforcement of container behavior.

 

AWS CodeBuild

AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy. With CodeBuild, you don’t need to provision, manage, and scale your own build servers. CodeBuild scales continuously and processes multiple builds concurrently, so your builds are not left waiting in a queue. You can get started quickly by using prepackaged build environments, or you can create custom build environments that use your own build tools.

 

Problem Definition

Security and compliance concerns span the lifecycle of application containers. Common concerns include:

Visibility into the container images. You need to verify the software composition information of the container image to determine whether known vulnerabilities associated with any of the software packages and libraries are included in the container image.

Governance of container images is critical because only certain open source packages/libraries, of specific versions, should be included in the container images. You need support for mechanisms for blacklisting all container images that include a certain version of a software package/library, or only allowing open source software that come with a specific type of license (such as Apache, MIT, GPL, and so on). You need to be able to address challenges such as:

·       Defining the process for image compliance policies at the enterprise, department, and group levels.

·       Preventing the images that fail the compliance checks from being deployed in critical environments, such as staging, pre-prod, and production.

Visibility into running container instances is critical, including:

·       CPU and memory utilization.

·       Security of the build environment.

·       All activities (system, network, storage, and application layer) of the application code running in each container instance.

Protection of running container instances that is:

·       Zero-touch to the developers (not an SDK-based approach).

·       Zero touch to the DevOps team and doesn’t limit the portability of the containerized application.

·       This protection must retain the option to switch to a different container stack or orchestration layer, or even to a different Container as a Service (CaaS ).

·       And it must be a fully automated solution to SecOps, so that the SecOps team doesn’t have to manually analyze and define detailed blacklist and whitelist policies.

 

Solution Details

In AWS CodeCommit, we have three projects:
●     “Democode” is a simple Java application, with one buildspec to build the app into a Docker container (run by build-demo-image CodeBuild project), and another to instrument said container (instrument-image CodeBuild project). The resulting container is stored in ECR repo javatestasjavatest:20180415-layered. This instrumented container is running in AWS Fargate cluster demo-java-appand can be seen in the Layered Insight runtime console as the javatestapplication in us-east-1.
●     aws-codebuild-docker-imagesis a clone of the official aws-codebuild-docker-images repo on GitHub . This CodeCommit project is used by the build-python-builder CodeBuild project to build the python 3.3.6 codebuild image and is stored at the codebuild-python ECR repo. We then manually instructed the Layered Insight console to instrument the image.
●     scan-java-imagecontains just a buildspec.yml file. This file is used by the scan-java-image CodeBuild project to instruct Layered Assessment to perform a vulnerability scan of the javatest container image built previously, and then run the scan results through a compliance policy that states there should be no medium vulnerabilities. This build fails — but in this case that is a success: the scan completes successfully, but compliance fails as there are medium-level issues found in the scan.

This build is performed using the instrumented version of the Python 3.3.6 CodeBuild image, so the activity of the processes running within the build are recorded each time within the LI console.

Build container image

Create or use a CodeCommit project with your application. To build this image and store it in Amazon Elastic Container Registry (Amazon ECR), add a buildspec file to the project and build a container image and create a CodeBuild project.

Scan container image

Once the image is built, create a new buildspec in the same project or a new one that looks similar to below (update ECR URL as necessary):

version: 0.2
phases:
  pre_build:
    commands:
      - echo Pulling down LI Scan API client scripts
      - git clone https://github.com/LayeredInsight/scan-api-example-python.git
      - echo Setting up LI Scan API client
      - cd scan-api-example-python
      - pip install layint_scan_api
      - pip install -r requirements.txt
  build:
    commands:
      - echo Scanning container started on `date`
      - IMAGEID=$(./li_add_image --name <aws-region>.amazonaws.com/javatest:20180415)
      - ./li_wait_for_scan -v --imageid $IMAGEID
      - ./li_run_image_compliance -v --imageid $IMAGEID --policyid PB15260f1acb6b2aa5b597e9d22feffb538256a01fbb4e5a95

Add the buildspec file to the git repo, push it, and then build a CodeBuild project using with the instrumented Python 3.3.6 CodeBuild image at <aws-region>.amazonaws.com/codebuild-python:3.3.6-layered. Set the following environment variables in the CodeBuild project:
●     LI_APPLICATIONNAME – name of the build to display
●     LI_LOCATION – location of the build project to display
●     LI_API_KEY – ApiKey:<key-name>:<api-key>
●     LI_API_HOST – location of the Layered Insight API service

Instrument container image

Next, to instrument the new container image:

  1. In the Layered Insight runtime console, ensure that the ECR registry and credentials are defined (click the Setup icon and the ‘+’ sign on the top right of the screen to add a new container registry). Note the name given to the registry in the console, as this needs to be referenced in the li_add_imagecommand in the script, below.
  2. Next, add a new buildspec (with a new name) to the CodeCommit project, such as the one shown below. This code will download the Layered Insight runtime client, and use it to instruct the Layered Insight service to instrument the image that was just built:
    version: 0.2
    phases:
    pre_build:
    commands:
    echo Pulling down LI API Runtime client scripts
    git clone https://github.com/LayeredInsight/runtime-api-example-python
    echo Setting up LI API client
    cd runtime-api-example-python
    pip install layint-runtime-api
    pip install -r requirements.txt
    build:
    commands:
    echo Instrumentation started on `date`
    ./li_add_image --registry "Javatest ECR" --name IMAGE_NAME:TAG --description "IMAGE DESCRIPTION" --policy "Default Policy" --instrument --wait --verbose
  3. Commit and push the new buildspec file.
  4. Going back to CodeBuild, create a new project, with the same CodeCommit repo, but this time select the new buildspec file. Use a Python 3.3.6 builder – either the AWS or LI Instrumented version.
  5. Click Continue
  6. Click Save
  7. Run the build, again on the master branch.
  8. If everything runs successfully, a new image should appear in the ECR registry with a -layered suffix. This is the instrumented image.

Run instrumented container image

When the instrumented container is now run — in ECS, Fargate, or elsewhere — it will log data back to the Layered Insight runtime console. It’s appearance in the console can be modified by setting the LI_APPLICATIONNAME and LI_LOCATION environment variables when running the container.

Conclusion

In the above blog we have provided you steps needed to embed governance and runtime security in your build pipelines running on AWS CodeBuild using Layered Insight.

 

 

 

Congratulations to Oracle on MySQL 8.0

Post Syndicated from Michael "Monty" Widenius original http://monty-says.blogspot.com/2018/04/congratulations-to-oracle-on-mysql-80.html

Last week, Oracle announced the general availability of MySQL 8.0. This is good news for database users, as it means Oracle is still developing MySQL.

I decide to celebrate the event by doing a quick test of MySQL 8.0. Here follows a step-by-step description of my first experience with MySQL 8.0.
Note that I did the following without reading the release notes, as is what I have done with every MySQL / MariaDB release up to date; In this case it was not the right thing to do.

I pulled MySQL 8.0 from [email protected]:mysql/mysql-server.git
I was pleasantly surprised that ‘cmake . ; make‘ worked without without any compiler warnings! I even checked the used compiler options and noticed that MySQL was compiled with -Wall + several other warning flags. Good job MySQL team!

I did have a little trouble finding the mysqld binary as Oracle had moved it to ‘runtime_output_directory’; Unexpected, but no big thing.

Now it’s was time to install MySQL 8.0.

I did know that MySQL 8.0 has removed mysql_install_db, so I had to use the mysqld binary directly to install the default databases:
(I have specified datadir=/my/data3 in the /tmp/my.cnf file)

> cd runtime_output_directory
> mkdir /my/data3
> ./mysqld –defaults-file=/tmp/my.cnf –install

2018-04-22T12:38:18.332967Z 1 [ERROR] [MY-011011] [Server] Failed to find valid data directory.
2018-04-22T12:38:18.333109Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2018-04-22T12:38:18.333135Z 0 [ERROR] [MY-010119] [Server] Aborting

A quick look in mysqld –help –verbose output showed that the right command option is –-initialize. My bad, lets try again,

> ./mysqld –defaults-file=/tmp/my.cnf –initialize

2018-04-22T12:39:31.910509Z 0 [ERROR] [MY-010457] [Server] –initialize specified but the data directory has files in it. Aborting.
2018-04-22T12:39:31.910578Z 0 [ERROR] [MY-010119] [Server] Aborting

Now I used the right options, but still didn’t work.
I took a quick look around:

> ls /my/data3/
binlog.index

So even if the mysqld noticed that the data3 directory was wrong, it still wrote things into it.  This even if I didn’t have –log-binlog enabled in the my.cnf file. Strange, but easy to fix:

> rm /my/data3/binlog.index
> ./mysqld –defaults-file=/tmp/my.cnf –initialize

2018-04-22T12:40:45.633637Z 0 [ERROR] [MY-011071] [Server] unknown variable ‘max-tmp-tables=100’
2018-04-22T12:40:45.633657Z 0 [Warning] [MY-010952] [Server] The privilege system failed to initialize correctly. If you have upgraded your server, make sure you’re executing mysql_upgrade to correct the issue.
2018-04-22T12:40:45.633663Z 0 [ERROR] [MY-010119] [Server] Aborting

The warning about the privilege system confused me a bit, but I ignored it for the time being and removed from my configuration files the variables that MySQL 8.0 doesn’t support anymore. I couldn’t find a list of the removed variables anywhere so this was done with the trial and error method.

> ./mysqld –defaults-file=/tmp/my.cnf

2018-04-22T12:42:56.626583Z 0 [ERROR] [MY-010735] [Server] Can’t open the mysql.plugin table. Please run mysql_upgrade to create it.
2018-04-22T12:42:56.827685Z 0 [Warning] [MY-010015] [Repl] Gtid table is not ready to be used. Table ‘mysql.gtid_executed’ cannot be opened.
2018-04-22T12:42:56.838501Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2018-04-22T12:42:56.848375Z 0 [Warning] [MY-010441] [Server] Failed to open optimizer cost constant tables
2018-04-22T12:42:56.848863Z 0 [ERROR] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we’re sending the information to the error-log instead: MY-001146 – Table ‘mysql.component’ doesn’t exist
2018-04-22T12:42:56.848916Z 0 [Warning] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we’re sending the information to the error-log instead: MY-003543 – The mysql.component table is missing or has an incorrect definition.
….
2018-04-22T12:42:56.854141Z 0 [System] [MY-010931] [Server] /home/my/mysql-8.0/runtime_output_directory/mysqld: ready for connections. Version: ‘8.0.11’ socket: ‘/tmp/mysql.sock’ port: 3306 Source distribution.

I figured out that if there is a single wrong variable in the configuration file, running mysqld –initialize will leave the database in an inconsistent state. NOT GOOD! I am happy I didn’t try this in a production system!

Time to start over from the beginning:

> rm -r /my/data3/*
> ./mysqld –defaults-file=/tmp/my.cnf –initialize

2018-04-22T12:44:45.548960Z 5 [Note] [MY-010454] [Server] A temporary password is generated for [email protected]: px)NaaSp?6um
2018-04-22T12:44:51.221751Z 0 [System] [MY-013170] [Server] /home/my/mysql-8.0/runtime_output_directory/mysqld (mysqld 8.0.11) initializing of server has completed

Success!

I wonder why the temporary password is so complex; It could easily have been something that one could easily remember without decreasing security, it’s temporary after all. No big deal, one can always paste it from the logs. (Side note: MariaDB uses socket authentication on many system and thus doesn’t need temporary installation passwords).

Now lets start the MySQL server for real to do some testing:

> ./mysqld –defaults-file=/tmp/my.cnf

2018-04-22T12:45:43.683484Z 0 [System] [MY-010931] [Server] /home/my/mysql-8.0/runtime_output_directory/mysqld: ready for connections. Version: ‘8.0.11’ socket: ‘/tmp/mysql.sock’ port: 3306 Source distribution.

And the lets start the client:

> ./client/mysql –socket=/tmp/mysql.sock –user=root –password=”px)NaaSp?6um”
ERROR 2059 (HY000): Plugin caching_sha2_password could not be loaded: /usr/local/mysql/lib/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory

Apparently MySQL 8.0 doesn’t work with old MySQL / MariaDB clients by default 🙁

I was testing this in a system with MariaDB installed, like all modern Linux system today, and didn’t want to use the MySQL clients or libraries.

I decided to try to fix this by changing the authentication to the native (original) MySQL authentication method.

> mysqld –skip-grant-tables

> ./client/mysql –socket=/tmp/mysql.sock –user=root
ERROR 1045 (28000): Access denied for user ‘root’@’localhost’ (using password: NO)

Apparently –skip-grant-tables is not good enough anymore. Let’s try again with:

> mysqld –skip-grant-tables –default_authentication_plugin=mysql_native_password

> ./client/mysql –socket=/tmp/mysql.sock –user=root mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 8.0.11 Source distribution

Great, we are getting somewhere, now lets fix “root”  to work with the old authenticaion:

MySQL [mysql]> update mysql.user set plugin=”mysql_native_password”,authentication_string=password(“test”) where user=”root”;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘(“test”) where user=”root”‘ at line 1

A quick look in the MySQL 8.0 release notes told me that the PASSWORD() function is removed in 8.0. Why???? I don’t know how one in MySQL 8.0 is supposed to generate passwords compatible with old installations of MySQL. One could of course start an old MySQL or MariaDB version, execute the password() function and copy the result.

I decided to fix this the easy way and use an empty password:

(Update:: I later discovered that the right way would have been to use: FLUSH PRIVILEGES;  ALTER USER’ root’@’localhost’ identified by ‘test’  ; I however dislike this syntax as it has the password in clear text which is easy to grab and the command can’t be used to easily update the mysql.user table. One must also disable the –skip-grant mode to do use this)

MySQL [mysql]> update mysql.user set plugin=”mysql_native_password”,authentication_string=”” where user=”root”;
Query OK, 1 row affected (0.077 sec)
Rows matched: 1 Changed: 1 Warnings: 0
 
I restarted mysqld:
> mysqld –default_authentication_plugin=mysql_native_password

> ./client/mysql –user=root –password=”” mysql
ERROR 1862 (HY000): Your password has expired. To log in you must change it using a client that supports expired passwords.

Ouch, forgot that. Lets try again:

> mysqld –skip-grant-tables –default_authentication_plugin=mysql_native_password

> ./client/mysql –user=root –password=”” mysql
MySQL [mysql]> update mysql.user set password_expired=”N” where user=”root”;

Now restart and test worked:

> ./mysqld –default_authentication_plugin=mysql_native_password

>./client/mysql –user=root –password=”” mysql

Finally I had a working account that I can use to create other users!

When looking at mysqld –help –verbose again. I noticed the option:

–initialize-insecure
Create the default database and exit. Create a super user
with empty password.

I decided to check if this would have made things easier:

> rm -r /my/data3/*
> ./mysqld –defaults-file=/tmp/my.cnf –initialize-insecure

2018-04-22T13:18:06.629548Z 5 [Warning] [MY-010453] [Server] [email protected] is created with an empty password ! Please consider switching off the –initialize-insecure option.

Hm. Don’t understand the warning as–initialize-insecure is not an option that one would use more than one time and thus nothing one would ‘switch off’.

> ./mysqld –defaults-file=/tmp/my.cnf

> ./client/mysql –user=root –password=”” mysql
ERROR 2059 (HY000): Plugin caching_sha2_password could not be loaded: /usr/local/mysql/lib/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory

Back to the beginning 🙁

To get things to work with old clients, one has to initialize the database with:
> ./mysqld –defaults-file=/tmp/my.cnf –initialize-insecure –default_authentication_plugin=mysql_native_password

Now I finally had MySQL 8.0 up and running and thought I would take it up for a spin by running the “standard” MySQL/MariaDB sql-bench test suite. This was removed in MySQL 5.7, but as I happened to have MariaDB 10.3 installed, I decided to run it from there.

sql-bench is a single threaded benchmark that measures the “raw” speed for some common operations. It gives you the ‘maximum’ performance for a single query. Its different from other benchmarks that measures the maximum throughput when you have a lot of users, but sql-bench still tells you a lot about what kind of performance to expect from the database.

I tried first to be clever and create the “test” database, that I needed for sql-bench, with
> mkdir /my/data3/test

but when I tried to run the benchmark, MySQL 8.0 complained that the test database didn’t exist.

MySQL 8.0 has gone away from the original concept of MySQL where the user can easily
create directories and copy databases into the database directory. This may have serious
implication for anyone doing backup of databases and/or trying to restore a backup with normal OS commands.

I created the ‘test’ database with mysqladmin and then tried to run sql-bench:

> ./run-all-tests –user=root

The first run failed in test-ATIS:

Can’t execute command ‘create table class_of_service (class_code char(2) NOT NULL,rank tinyint(2) NOT NULL,class_description char(80) NOT NULL,PRIMARY KEY (class_code))’
Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘rank tinyint(2) NOT NULL,class_description char(80) NOT NULL,PRIMARY KEY (class_’ at line 1

This happened because ‘rank‘ is now a reserved word in MySQL 8.0. This is also reserved in ANSI SQL, but I don’t know of any other database that has failed to run test-ATIS before. I have in the past run it against Oracle, PostgreSQL, Mimer, MSSQL etc without any problems.

MariaDB also has ‘rank’ as a keyword in 10.2 and 10.3 but one can still use it as an identifier.

I fixed test-ATIS and then managed to run all tests on MySQL 8.0.

I did run the test both with MySQL 8.0 and MariaDB 10.3 with the InnoDB storage engine and by having identical values for all InnoDB variables, table-definition-cache and table-open-cache. I turned off performance schema for both databases. All test are run with a user with an empty password (to keep things comparable and because it’s was too complex to generate a password in MySQL 8.0)

The result are as follows
Results per test in seconds:

Operation         |MariaDB|MySQL-8|

———————————–
ATIS              | 153.00| 228.00|
alter-table       |  92.00| 792.00|
big-tables        | 990.00|2079.00|
connect           | 186.00| 227.00|
create            | 575.00|4465.00|
insert            |4552.00|8458.00|
select            | 333.00| 412.00|
table-elimination |1900.00|3916.00|
wisconsin         | 272.00| 590.00|
———————————–

This is of course just a first view of the performance of MySQL 8.0 in a single user environment. Some reflections about the results:

  • Alter-table test is slower (as expected) in 8.0 as some of the alter tests benefits of the instant add column in MariaDB 10.3.
  • connect test is also better for MariaDB as we put a lot of efforts to speed this up in MariaDB 10.2
  • table-elimination shows an optimization in MariaDB for the  Anchor table model, which MySQL doesn’t have.
  • CREATE and DROP TABLE is almost 8 times slower in MySQL 8.0 than in MariaDB 10.3. I assume this is the cost of ‘atomic DDL’. This may also cause performance problems for any thread using the data dictionary when another thread is creating/dropping tables.
  • When looking at the individual test results, MySQL 8.0 was slower in almost every test, in many significantly slower.
  • The only test where MySQL was faster was “update_with_key_prefix”. I checked this and noticed that there was a bug in the test and the columns was updated to it’s original value (which should be instant with any storage engine). This is an old bug that MySQL has found and fixed and that we have not been aware of in the test or in MariaDB.
  • While writing this, I noticed that MySQL 8.0 is now using utf8mb4 as the default character set instead of latin1. This may affect some of the benchmarks slightly (not much as most tests works with numbers and Oracle claims that utf8mb4 is only 20% slower than latin1), but needs to be verified.
  • Oracle claims that MySQL 8.0 is much faster on multi user benchmarks. The above test indicates that they may have done this by sacrificing single user performance.
  •  We need to do more and many different benchmarks to better understand exactly what is going on. Stay tuned!

Short summary of my first run with MySQL 8.0:

  • Using the new caching_sha2_password authentication as default for new installation is likely to cause a lot of problems for users. No old application will be able to use MySQL 8.0, installed with default options, without moving to MySQL’s client libraries. While working on this blog I saw MySQL users complain on IRC that not even MySQL Workbench can authenticate with MySQL 8.0. This is the first time in MySQL’s history where such an incompatible change has ever been done!
  • Atomic DDL is a good thing (We plan to have this in MariaDB 10.4), but it should not have such a drastic impact on performance. I am also a bit skeptical of MySQL 8.0 having just one copy of the data dictionary as if this gets corrupted you will lose all your data. (Single point of failure)
  • MySQL 8.0 has several new reserved words and has removed a lot of variables, which makes upgrades hard. Before upgrading to MySQL 8.0 one has to check all one’s databases and applications to ensure that there are no conflicts.
  • As my test above shows, if you have a single deprecated variable in your configuration files, the installation of MySQL will abort and can leave the database in inconsistent state. I did of course my tests by installing into an empty data dictionary, but one can assume that some of the problems may also happen when upgrading an old installation.

Conclusions:
In many ways, MySQL 8.0 has caught up with some earlier versions of MariaDB. For instance, in MariaDB 10.0, we introduced roles (four years ago). In MariaDB 10.1, we introduced encrypted redo/undo logs (three years ago). In MariaDB 10.2, we introduced window functions and CTEs (a year ago). However, some catch-up of MariaDB Server 10.2 features still remains for MySQL (such as check constraints, binlog compression, and log-based rollback).

MySQL 8.0 has a few new interesting features (mostly Atomic DDL and JSON TABLE functions), but at the same time MySQL has strayed away from some of the fundamental corner stone principles of MySQL:

From the start of the first version of MySQL in 1995, all development has been focused around 3 core principles:

  • Ease of use
  • Performance
  • Stability

With MySQL 8.0, Oracle has sacrifices 2 of 3 of these.

In addition (as part of ease of use), while I was working on MySQL, we did our best to ensure that the following should hold:

  • Upgrades should be trivial
  • Things should be kept compatible, if possible (don’t remove features/options/functions that are used)
  • Minimize reserved words, don’t remove server variables
  • One should be able to use normal OS commands to create and drop databases, copy and move tables around within the same system or between different systems. With 8.0 and data dictionary taking backups of specific tables will be hard, even if the server is not running.
  • mysqldump should always be usable backups and to move to new releases
  • Old clients and application should be able to use ‘any’ MySQL server version unchanged. (Some Oracle client libraries, like C++, by default only supports the new X protocol and can thus not be used with older MySQL or any MariaDB version)

We plan to add a data dictionary to MariaDB 10.4 or MariaDB 10.5, but in a way to not sacrifice any of the above principles!

The competition between MySQL and MariaDB is not just about a tactical arms race on features. It’s about design philosophy, or strategic vision, if you will.

This shows in two main ways: our respective view of the Storage Engine structure, and of the top-level direction of the roadmap.

On the Storage Engine side, MySQL is converging on InnoDB, even for clustering and partitioning. In doing so, they are abandoning the advantages of multiple ways of storing data. By contrast, MariaDB sees lots of value in the Storage Engine architecture: MariaDB Server 10.3 will see the general availability of MyRocks (for write-intensive workloads) and Spider (for scalable workloads). On top of that, we have ColumnStore for analytical workloads. One can use the CONNECT engine to join with other databases. The use of different storage engines for different workloads and different hardware is a competitive differentiator, now more than ever.

On the roadmap side, MySQL is carefully steering clear of features that close the gap between MySQL and Oracle. MariaDB has no such constraints. With MariaDB 10.3, we are introducing PL/SQL compatibility (Oracle’s stored procedures) and AS OF (built-in system versioned tables with point-in-time querying). For both of those features, MariaDB is the first Open Source database doing so. I don’t except Oracle to provide any of the above features in MySQL!

Also on the roadmap side, MySQL is not working with the ecosystem in extending the functionality. In 2017, MariaDB accepted more code contributions in one year, than MySQL has done during its entire lifetime, and the rate is increasing!

I am sure that the experience I had with testing MySQL 8.0 would have been significantly better if MySQL would have an open development model where the community could easily participate in developing and testing MySQL continuously. Most of the confusing error messages and strange behavior would have been found and fixed long before the GA release.

Before upgrading to MySQL 8.0 please read https://dev.mysql.com/doc/refman/8.0/en/upgrading-from-previous-series.html to see what problems you can run into! Don’t expect that old installations or applications will work out of the box without testing as a lot of features and options has been removed (query cache, partition of myisam tables etc)! You probably also have to revise your backup methods, especially if you want to ever restore just a few tables. (With 8.0, I don’t know how this can be easily done).

According to the MySQL 8.0 release notes, one can’t use mysqldump to copy a database to MySQL 8.0. One has to first to move to a MySQL 5.7 GA version (with mysqldump, as recommended by Oracle) and then to MySQL 8.0 with in-place update. I assume this means that all old mysqldump backups are useless for MySQL 8.0?

MySQL 8.0 seams to be a one way street to an unknown future. Up to MySQL 5.7 it has been trivial to move to MariaDB and one could always move back to MySQL with mysqldump. All MySQL client libraries has worked with MariaDB and all MariaDB client libraries has worked with MySQL. With MySQL 8.0 this has changed in the wrong direction.

As long as you are using MySQL 5.7 and below you have choices for your future, after MySQL 8.0 you have very little choice. But don’t despair, as MariaDB will always be able to load a mysqldump file and it’s very easy to upgrade your old MySQL installation to MariaDB 🙂

I wish you good luck to try MySQL 8.0 (and also the upcoming MariaDB 10.3)!

[$] Rewiring x86 system-call dispatch

Post Syndicated from corbet original https://lwn.net/Articles/752422/rss

Each kernel development cycle includes a vast number of changes that are
not intended to change visible behavior and which, as a result, go
unnoticed by most users and developers. One such change in 4.17 is a
rewiring of how system-call implementations are invoked within the kernel.
The change is interesting, though, and provides an opportunity to look at
the macro magic that handles system-call definitions.

Let’s stop talking about password strength

Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/04/lets-stop-talking-about-password.html

Picture from EFF — CC-BY license

Near the top of most security recommendations is to use “strong passwords”. We need to stop doing this.

Yes, weak passwords can be a problem. If a website gets hacked, weak passwords are easier to crack. It’s not that this is wrong advice.

On the other hand, it’s not particularly good advice, either. It’s far down the list of important advice that people need to remember. “Weak passwords” are nowhere near the risk of “password reuse”. When your Facebook or email account gets hacked, it’s because you used the same password across many websites, not because you used a weak password.

Important websites, where the strength of your password matters, already take care of the problem. They use strong, salted hashes on the backend to protect the password. On the frontend, they force passwords to be a certain length and a certain complexity. Maybe the better advice is to not trust any website that doesn’t enforce stronger passwords (minimum of 8 characters consisting of both letters and non-letters).

To some extent, this “strong password” advice has become obsolete. A decade ago, websites had poor protection (MD5 hashes) and no enforcement of complexity, so it was up to the user to choose strong passwords. Now that important websites have changed their behavior, such as using bcrypt, there is less onus on the user.

But the real issue here is that “strong password” advice reflects the evil, authoritarian impulses of the infosec community. Instead of measuring insecurity in terms of costs vs. benefits, risks vs. rewards, we insist that it’s an issue of moral weakness. We pretend that flaws happen because people are greedy, lazy, and ignorant. We pretend that security is its own goal, a benefit we should achieve, rather than a cost we must endure.

We like giving moral advice because it’s easy: just be “stronger”. Discussing “password reuse” is more complicated, forcing us discuss password managers, writing down passwords on paper, that it’s okay to reuse passwords for crappy websites you don’t care about, and so on.

What I’m trying to say is that the moral weakness here is us. Rather then give pertinent advice we give lazy advice. We give the advice that victim shames them for being weak while pretending that we are strong.

So stop telling people to use strong passwords. It’s crass advice on your part and largely unhelpful for your audience, distracting them from the more important things.

COPPA Compliance

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/04/coppa_complianc.html

Interesting research: “‘Won’t Somebody Think of the Children?’ Examining COPPA Compliance at Scale“:

Abstract: We present a scalable dynamic analysis framework that allows for the automatic evaluation of the privacy behaviors of Android apps. We use our system to analyze mobile apps’ compliance with the Children’s Online Privacy Protection Act (COPPA), one of the few stringent privacy laws in the U.S. Based on our automated analysis of 5,855 of the most popular free children’s apps, we found that a majority are potentially in violation of COPPA, mainly due to their use of third-party SDKs. While many of these SDKs offer configuration options to respect COPPA by disabling tracking and behavioral advertising, our data suggest that a majority of apps either do not make use of these options or incorrectly propagate them across mediation SDKs. Worse, we observed that 19% of children’s apps collect identifiers or other personally identifiable information (PII) via SDKs whose terms of service outright prohibit their use in child-directed apps. Finally, we show that efforts by Google to limit tracking through the use of a resettable advertising ID have had little success: of the 3,454 apps that share the resettable ID with advertisers, 66% transmit other, non-resettable, persistent identifiers as well, negating any intended privacy-preserving properties of the advertising ID.

Node.js 8.10 runtime now available in AWS Lambda

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/node-js-8-10-runtime-now-available-in-aws-lambda/

This post courtesy of Ed Lima, AWS Solutions Architect

We are excited to announce that you can now develop your AWS Lambda functions using the Node.js 8.10 runtime, which is the current Long Term Support (LTS) version of Node.js. Start using this new version today by specifying a runtime parameter value of nodejs8.10 when creating or updating functions.

Supporting async/await

The Lambda programming model for Node.js 8.10 now supports defining a function handler using the async/await pattern.

Asynchronous or non-blocking calls are an inherent and important part of applications, as user and human interfaces are asynchronous by nature. If you decide to have a coffee with a friend, you usually order the coffee then start or continue a conversation with your friend while the coffee is getting ready. You don’t wait for the coffee to be ready before you start talking. These activities are asynchronous, because you can start one and then move to the next without waiting for completion. Otherwise, you’d delay (or block) the start of the next activity.

Asynchronous calls used to be handled in Node.js using callbacks. That presented problems when they were nested within other callbacks in multiple levels, making the code difficult to maintain and understand.

Promises were implemented to try to solve issues caused by “callback hell.” They allow asynchronous operations to call their own methods and handle what happens when a call is successful or when it fails. As your requirements become more complicated, even promises become harder to work with and may still end up complicating your code.

Async/await is the new way of handling asynchronous operations in Node.js, and makes for simpler, easier, and cleaner code for non-blocking calls. It still uses promises but a callback is returned directly from the asynchronous function, just as if it were a synchronous blocking function.

Take for instance the following Lambda function to get the current account settings, using the Node.js 6.10 runtime:

let AWS = require('aws-sdk');
let lambda = new AWS.Lambda();

exports.handler = (event, context, callback) => {
    let getAccountSettingsPromise = lambda.getAccountSettings().promise();
    getAccountSettingsPromise.then(
        (data) => {
            callback(null, data);
        },
        (err) => {
            console.log(err);
            callback(err);
        }
    );
};

With the new Node.js 8.10 runtime, there are new handler types that can be declared with the “async” keyword or can return a promise directly.

This is how the same function looks like using async/await with Node.js 8.10:

let AWS = require('aws-sdk');
let lambda = new AWS.Lambda();

exports.handler = async (event) => {
    return await lambda.getAccountSettings().promise() ;
};

Alternatively, you could have the handler return a promise directly:

let AWS = require('aws-sdk');
let lambda = new AWS.Lambda();

exports.handler = (event) => {
    return new Promise((resolve, reject) => {
        lambda.getAccountSettings(event)
        .then((data) => {
            resolve data;
        })
        .catch(reject);
     });
};

The new handler types are alternatives to the callback pattern, which is still fully supported.

All three functions return the same results. However, in the new runtime with async/await, all callbacks in the code are gone, which makes it easier to read. This is especially true for those less familiar with promises.

{
    "AccountLimit":{
        "TotalCodeSize":80530636800,
        "CodeSizeUnzipped":262144000,
        "CodeSizeZipped":52428800, 
        "ConcurrentExecutions":1000,
        "UnreservedConcurrentExecutions":1000
    },
    "AccountUsage":{
        "TotalCodeSize":52234461,
        "FunctionCount":53
    }
}

Another great advantage of async/await is better error handling. You can use a try/catch block inside the scope of an async function. Even though the function awaits an asynchronous operation, any errors end up in the catch block.

You can improve your previous Node.js 8.10 function with this trusted try/catch error handling pattern:

let AWS = require('aws-sdk');
let lambda = new AWS.Lambda();
let data;

exports.handler = async (event) => {
    try {
        data = await lambda.getAccountSettings().promise();
    }
    catch (err) {
        console.log(err);
        return err;
    }
    return data;
};

While you now have a similar number of lines in both runtimes, the code is cleaner and more readable with async/await. It makes the asynchronous calls look more synchronous. However, it is important to notice that the code is still executed the same way as if it were using a callback or promise-based API.

Backward compatibility

You may port your existing Node.js 4.3 and 6.10 functions over to Node.js 8.10 by updating the runtime. Node.js 8.10 does include numerous breaking changes from previous Node versions.

Make sure to review the API changes between Node.js 4.3, 6.10, and Node.js 8.10 to see if there are other changes that might affect your code. We recommend testing that your Lambda function passes internal validation for its behavior when upgrading to the new runtime version.

You can use Lambda versions/aliases to safely test that your function runs as expected on Node 8.10, before routing production traffic to it.

New node features

You can now get better performance when compared to the previous LTS version 6.x (up to 20%). The new V8 6.0 engine comes with Turbofan and the Ignition pipeline, which leads to lower memory consumption and faster startup time across Node.js applications.

HTTP/2, which is subject to future changes, allows developers to use the new protocol to speed application development and undo many of HTTP/1.1 workarounds to make applications faster, simpler, and more powerful.

For more information, see the AWS Lambda Developer Guide.

Hope you enjoy and… go build with Node.js 8.10!

A geometric Rust adventure

Post Syndicated from Eevee original https://eev.ee/blog/2018/03/30/a-geometric-rust-adventure/

Hi. Yes. Sorry. I’ve been trying to write this post for ages, but I’ve also been working on a huge writing project, and apparently I have a very limited amount of writing mana at my disposal. I think this is supposed to be a Patreon reward from January. My bad. I hope it’s super great to make up for the wait!

I recently ported some math code from C++ to Rust in an attempt to do a cool thing with Doom. Here is my story.

The problem

I presented it recently as a conundrum (spoilers: I solved it!), but most of those details are unimportant.

The short version is: I have some shapes. I want to find their intersection.

Really, I want more than that: I want to drop them all on a canvas, intersect everything with everything, and pluck out all the resulting polygons. The input is a set of cookie cutters, and I want to press them all down on the same sheet of dough and figure out what all the resulting contiguous pieces are. And I want to know which cookie cutter(s) each piece came from.

But intersection is a good start.

Example of the goal.  Given two squares that overlap at their corners, I want to find the small overlap piece, plus the two L-shaped pieces left over from each square

I’m carefully referring to the input as shapes rather than polygons, because each one could be a completely arbitrary collection of lines. Obviously there’s not much you can do with shapes that aren’t even closed, but at the very least, I need to handle concavity and multiple disconnected polygons that together are considered a single input.

This is a non-trivial problem with a lot of edge cases, and offhand I don’t know how to solve it robustly. I’m not too eager to go figure it out from scratch, so I went hunting for something I could build from.

(Infuriatingly enough, I can just dump all the shapes out in an SVG file and any SVG viewer can immediately solve the problem, but that doesn’t quite help me. Though I have had a few people suggest I just rasterize the whole damn problem, and after all this, I’m starting to think they may have a point.)

Alas, I couldn’t find a Rust library for doing this. I had a hard time finding any library for doing this that wasn’t a massive fully-featured geometry engine. (I could’ve used that, but I wanted to avoid non-Rust dependencies if possible, since distributing software is already enough of a nightmare.)

A Twitter follower directed me towards a paper that described how to do very nearly what I wanted and nothing else: “A simple algorithm for Boolean operations on polygons” by F. Martínez (2013). Being an academic paper, it’s trapped in paywall hell; sorry about that. (And as I understand it, none of the money you’d pay to get the paper would even go to the authors? Is that right? What a horrible and predatory system for discovering and disseminating knowledge.)

The paper isn’t especially long, but it does describe an awful lot of subtle details and is mostly written in terms of its own reference implementation. Rather than write my own implementation based solely on the paper, I decided to try porting the reference implementation from C++ to Rust.

And so I fell down the rabbit hole.

The basic algorithm

Thankfully, the author has published the sample code on his own website, if you want to follow along. (It’s the bottom link; the same author has, confusingly, published two papers on the same topic with similar titles, four years apart.)

If not, let me describe the algorithm and how the code is generally laid out. The algorithm itself is based on a sweep line, where a vertical line passes across the plane and ✨ does stuff ✨ as it encounters various objects. This implementation has no physical line; instead, it keeps track of which segments from the original polygon would be intersecting the sweep line, which is all we really care about.

A vertical line is passing rightwards over a couple intersecting shapes.  The line current intersects two of the shapes' sides, and these two sides are the "sweep list"

The code is all bundled inside a class with only a single public method, run, because… that’s… more object-oriented, I guess. There are several helper methods, and state is stored in some attributes. A rough outline of run is:

  1. Run through all the line segments in both input polygons. For each one, generate two SweepEvents (one for each endpoint) and add them to a std::deque for storage.

    Add pointers to the two SweepEvents to a std::priority_queue, the event queue. This queue uses a custom comparator to order the events from left to right, so the top element is always the leftmost endpoint.

  2. Loop over the event queue (where an “event” means the sweep line passed over the left or right end of a segment). Encountering a left endpoint means the sweep line is newly touching that segment, so add it to a std::set called the sweep list. An important point is that std::set is ordered, and the sweep list uses a comparator that keeps segments in order vertically.

    Encountering a right endpoint means the sweep line is leaving a segment, so that segment is removed from the sweep list.

  3. When a segment is added to the sweep list, it may have up to two neighbors: the segment above it and the segment below it. Call possibleIntersection to check whether it intersects either of those neighbors. (This is nearly sufficient to find all intersections, which is neat.)

  4. If possibleIntersection detects an intersection, it will split each segment into two pieces then and there. The old segment is shortened in-place to become the left part, and a new segment is created for the right part. The new endpoints at the point of intersection are added to the event queue.

  5. Some bookkeeping is done along the way to track which original polygons each segment is inside, and eventually the segments are reconstructed into new polygons.

Hopefully that’s enough to follow along. It took me an inordinately long time to tease this out. The comments aren’t especially helpful.

1
    std::deque<SweepEvent> eventHolder;    // It holds the events generated during the computation of the boolean operation

Syntax and basic semantics

The first step was to get something that rustc could at least parse, which meant translating C++ syntax to Rust syntax.

This was surprisingly straightforward! C++ classes become Rust structs. (There was no inheritance here, thankfully.) All the method declarations go away. Method implementations only need to be indented and wrapped in impl.

I did encounter some unnecessarily obtuse uses of the ternary operator:

1
(prevprev != sl.begin()) ? --prevprev : prevprev = sl.end();

Rust doesn’t have a ternary — you can use a regular if block as an expression — so I expanded these out.

C++ switch blocks become Rust match blocks, but otherwise function basically the same. Rust’s enums are scoped (hallelujah), so I had to explicitly spell out where enum values came from.

The only really annoying part was changing function signatures; C++ types don’t look much at all like Rust types, save for the use of angle brackets. Rust also doesn’t pass by implicit reference, so I needed to sprinkle a few &s around.

I would’ve had a much harder time here if this code had relied on any remotely esoteric C++ functionality, but thankfully it stuck to pretty vanilla features.

Language conventions

This is a geometry problem, so the sample code unsurprisingly has its own home-grown point type. Rather than port that type to Rust, I opted to use the popular euclid crate. Not only is it code I didn’t have to write, but it already does several things that the C++ code was doing by hand inline, like dot products and cross products. And all I had to do was add one line to Cargo.toml to use it! I have no idea how anyone writes C or C++ without a package manager.

The C++ code used getters, i.e. point.x (). I’m not a huge fan of getters, though I do still appreciate the need for them in lowish-level systems languages where you want to future-proof your API and the language wants to keep a clear distinction between attribute access and method calls. But this is a point, which is nothing more than two of the same numeric type glued together; what possible future logic might you add to an accessor? The euclid authors appear to side with me and leave the coordinates as public fields, so I took great joy in removing all the superfluous parentheses.

Polygons are represented with a Polygon class, which has some number of Contours. A contour is a single contiguous loop. Something you’d usually think of as a polygon would only have one, but a shape with a hole would have two: one for the outside, one for the inside. The weird part of this arrangement was that Polygon implemented nearly the entire STL container interface, then waffled between using it and not using it throughout the rest of the code. Rust lets anything in the same module access non-public fields, so I just skipped all that and used polygon.contours directly. Hell, I think I made contours public.

Finally, the SweepEvent type has a pol field that’s declared as an enum PolygonType (either SUBJECT or CLIPPING, to indicate which of the two inputs it is), but then some other code uses the same field as a numeric index into a polygon’s contours. Boy I sure do love static typing where everything’s a goddamn integer. I wanted to extend the algorithm to work on arbitrarily many input polygons anyway, so I scrapped the enum and this became a usize.


Then I got to all the uses of STL. I have only a passing familiarity with the C++ standard library, and this code actually made modest use of it, which caused some fun days-long misunderstandings.

As mentioned, the SweepEvents are stored in a std::deque, which is never read from. It took me a little thinking to realize that the deque was being used as an arena: it’s the canonical home for the structs so pointers to them can be tossed around freely. (It can’t be a std::vector, because that could reallocate and invalidate all the pointers; std::deque is probably a doubly-linked list, and guarantees no reallocation.)

Rust’s standard library does have a doubly-linked list type, but I knew I’d run into ownership hell here later anyway, so I think I replaced it with a Rust Vec to start with. It won’t compile either way, so whatever. We’ll get back to this in a moment.

The list of segments currently intersecting the sweep line is stored in a std::set. That type is explicitly ordered, which I’m very glad I knew already. Rust has two set types, HashSet and BTreeSet; unsurprisingly, the former is unordered and the latter is ordered. Dropping in BTreeSet and fixing some method names got me 90% of the way there.

Which brought me to the other 90%. See, the C++ code also relies on finding nodes adjacent to the node that was just inserted, via STL iterators.

1
2
3
next = prev = se->posSL = it = sl.insert(se).first;
(prev != sl.begin()) ? --prev : prev = sl.end();
++next;

I freely admit I’m bad at C++, but this seems like something that could’ve used… I don’t know, 1 comment. Or variable names more than two letters long. What it actually does is:

  1. Add the current sweep event (se) to the sweep list (sl), which returns a pair whose first element is an iterator pointing at the just-inserted event.

  2. Copies that iterator to several other variables, including prev and next.

  3. If the event was inserted at the beginning of the sweep list, set prev to the sweep list’s end iterator, which in C++ is a legal-but-invalid iterator meaning “the space after the end” or something. This is checked for in later code, to see if there is a previous event to look at. Otherwise, decrement prev, so it’s now pointing at the event immediately before the inserted one.

  4. Increment next normally. If the inserted event is last, then this will bump next to the end iterator anyway.

In other words, I need to get the previous and next elements from a BTreeSet. Rust does have bidirectional iterators, which BTreeSet supports… but BTreeSet::insert only returns a bool telling me whether or not anything was inserted, not the position. I came up with this:

1
2
3
let mut maybe_below = active_segments.range(..segment).last().map(|v| *v);
let mut maybe_above = active_segments.range(segment..).next().map(|v| *v);
active_segments.insert(segment);

The range method returns an iterator over a subset of the tree. The .. syntax makes a range (where the right endpoint is exclusive), so ..segment finds the part of the tree before the new segment, and segment.. finds the part of the tree after it. (The latter would start with the segment itself, except I haven’t inserted it yet, so it’s not actually there.)

Then the standard next() and last() methods on bidirectional iterators find me the element I actually want. But the iterator might be empty, so they both return an Option. Also, iterators tend to return references to their contents, but in this case the contents are already references, and I don’t want a double reference, so the map call dereferences one layer — but only if the Option contains a value. Phew!

This is slightly less efficient than the C++ code, since it has to look up where segment goes three times rather than just one. I might be able to get it down to two with some more clever finagling of the iterator, but microsopic performance considerations were a low priority here.

Finally, the event queue uses a std::priority_queue to keep events in a desired order and efficiently pop the next one off the top.

Except priority queues act like heaps, where the greatest (i.e., last) item is made accessible.

Sorting out sorting

C++ comparison functions return true to indicate that the first argument is less than the second argument. Sweep events occur from left to right. You generally implement sorts so that the first thing comes, erm, first.

But sweep events go in a priority queue, and priority queues surface the last item, not the first. This C++ code handled this minor wrinkle by implementing its comparison backwards.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
struct SweepEventComp : public std::binary_function<SweepEvent, SweepEvent, bool> { // for sorting sweep events
// Compare two sweep events
// Return true means that e1 is placed at the event queue after e2, i.e,, e1 is processed by the algorithm after e2
bool operator() (const SweepEvent* e1, const SweepEvent* e2)
{
    if (e1->point.x () > e2->point.x ()) // Different x-coordinate
        return true;
    if (e2->point.x () > e1->point.x ()) // Different x-coordinate
        return false;
    if (e1->point.y () != e2->point.y ()) // Different points, but same x-coordinate. The event with lower y-coordinate is processed first
        return e1->point.y () > e2->point.y ();
    if (e1->left != e2->left) // Same point, but one is a left endpoint and the other a right endpoint. The right endpoint is processed first
        return e1->left;
    // Same point, both events are left endpoints or both are right endpoints.
    if (signedArea (e1->point, e1->otherEvent->point, e2->otherEvent->point) != 0) // not collinear
        return e1->above (e2->otherEvent->point); // the event associate to the bottom segment is processed first
    return e1->pol > e2->pol;
}
};

Maybe it’s just me, but I had a hell of a time just figuring out what problem this was even trying to solve. I still have to reread it several times whenever I look at it, to make sure I’m getting the right things backwards.

Making this even more ridiculous is that there’s a second implementation of this same sort, with the same name, in another file — and that one’s implemented forwards. And doesn’t use a tiebreaker. I don’t entirely understand how this even compiles, but it does!

I painstakingly translated this forwards to Rust. Unlike the STL, Rust doesn’t take custom comparators for its containers, so I had to implement ordering on the types themselves (which makes sense, anyway). I wrapped everything in the priority queue in a Reverse, which does what it sounds like.

I’m fairly pleased with Rust’s ordering model. Most of the work is done in Ord, a trait with a cmp() method returning an Ordering (one of Less, Equal, and Greater). No magic numbers, no need to implement all six ordering methods! It’s incredible. Ordering even has some handy methods on it, so the usual case of “order by this, then by this” can be written as:

1
2
return self.point().x.cmp(&other.point().x)
    .then(self.point().y.cmp(&other.point().y));

Well. Just kidding! It’s not quite that easy. You see, the points here are composed of floats, and floats have the fun property that not all of them are comparable. Specifically, NaN is not less than, greater than, or equal to anything else, including itself. So IEEE 754 float ordering cannot be expressed with Ord. Unless you want to just make up an answer for NaN, but Rust doesn’t tend to do that.

Rust’s float types thus implement the weaker PartialOrd, whose method returns an Option<Ordering> instead. That makes the above example slightly uglier:

1
2
return self.point().x.partial_cmp(&other.point().x).unwrap()
    .then(self.point().y.partial_cmp(&other.point().y).unwrap())

Also, since I use unwrap() here, this code will panic and take the whole program down if the points are infinite or NaN. Don’t do that.

This caused some minor inconveniences in other places; for example, the general-purpose cmp::min() doesn’t work on floats, because it requires an Ord-erable type. Thankfully there’s a f64::min(), which handles a NaN by returning the other argument.

(Cool story: for the longest time I had this code using f32s. I’m used to translating int to “32 bits”, and apparently that instinct kicked in for floats as well, even floats spelled double.)

The only other sorting adventure was this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
// Due to overlapping edges the resultEvents array can be not wholly sorted
bool sorted = false;
while (!sorted) {
    sorted = true;
    for (unsigned int i = 0; i < resultEvents.size (); ++i) {
        if (i + 1 < resultEvents.size () && sec (resultEvents[i], resultEvents[i+1])) {
            std::swap (resultEvents[i], resultEvents[i+1]);
            sorted = false;
        }
    }
}

(I originally misread this comment as saying “the array cannot be wholly sorted” and had no idea why that would be the case, or why the author would then immediately attempt to bubble sort it.)

I’m still not sure why this uses an ad-hoc sort instead of std::sort. But I’m used to taking for granted that general-purpose sorting implementations are tuned to work well for almost-sorted data, like Python’s. Maybe C++ is untrustworthy here, for some reason. I replaced it with a call to .sort() and all seemed fine.

Phew! We’re getting there. Finally, my code appears to type-check.

But now I see storm clouds gathering on the horizon.

Ownership hell

I have a problem. I somehow run into this problem every single time I use Rust. The solutions are never especially satisfying, and all the hacks I might use if forced to write C++ turn out to be unsound, which is even more annoying because rustc is just sitting there with this smug “I told you so expression” and—

The problem is ownership, which Rust is fundamentally built on. Any given value must have exactly one owner, and Rust must be able to statically convince itself that:

  1. No reference to a value outlives that value.
  2. If a mutable reference to a value exists, no other references to that value exist at the same time.

This is the core of Rust. It guarantees at compile time that you cannot lose pointers to allocated memory, you cannot double-free, you cannot have dangling pointers.

It also completely thwarts a lot of approaches you might be inclined to take if you come from managed languages (where who cares, the GC will take care of it) or C++ (where you just throw pointers everywhere and hope for the best apparently).

For example, pointer loops are impossible. Rust’s understanding of ownership and lifetimes is hierarchical, and it simply cannot express loops. (Rust’s own doubly-linked list type uses raw pointers and unsafe code under the hood, where “unsafe” is an escape hatch for the usual ownership rules. Since I only recently realized that pointers to the inside of a mutable Vec are a bad idea, I figure I should probably not be writing unsafe code myself.)

This throws a few wrenches in the works.

Problem the first: pointer loops

I immediately ran into trouble with the SweepEvent struct itself. A SweepEvent pulls double duty: it represents one endpoint of a segment, but each left endpoint also handles bookkeeping for the segment itself — which means that most of the fields on a right endpoint are unused. Also, and more importantly, each SweepEvent has a pointer to the corresponding SweepEvent at the other end of the same segment. So a pair of SweepEvents point to each other.

Rust frowns upon this. In retrospect, I think I could’ve kept it working, but I also think I’m wrong about that.

My first step was to wrench SweepEvent apart. I moved all of the segment-stuff (which is virtually all of it) into a single SweepSegment type, and then populated the event queue with a SweepEndpoint tuple struct, similar to:

1
2
3
4
5
6
enum SegmentEnd {
    Left,
    Right,
}

struct SweepEndpoint<'a>(&'a SweepSegment, SegmentEnd);

This makes SweepEndpoint essentially a tuple with a name. The 'a is a lifetime and says, more or less, that a SweepEndpoint cannot outlive the SweepSegment it references. Makes sense.

Problem solved! I no longer have mutually referential pointers. But I do still have pointers (well, references), and they have to point to something.

Problem the second: where’s all the data

Which brings me to the problem I always run into with Rust. I have a bucket of things, and I need to refer to some of them multiple times.

I tried half a dozen different approaches here and don’t clearly remember all of them, but I think my core problem went as follows. I translated the C++ class to a Rust struct with some methods hanging off of it. A simplified version might look like this.

1
2
3
4
struct Algorithm {
    arena: LinkedList<SweepSegment>,
    event_queue: BinaryHeap<SweepEndpoint>,
}

Ah, hang on — SweepEndpoint needs to be annotated with a lifetime, so Rust can enforce that those endpoints don’t live longer than the segments they refer to. No problem?

1
2
3
4
struct Algorithm<'a> {
    arena: LinkedList<SweepSegment>,
    event_queue: BinaryHeap<SweepEndpoint<'a>>,
}

Okay! Now for some methods.

1
2
3
4
5
6
7
8
fn run(&mut self) {
    self.arena.push_back(SweepSegment{ data: 5 });
    self.event_queue.push(SweepEndpoint(self.arena.back().unwrap(), SegmentEnd::Left));
    self.event_queue.push(SweepEndpoint(self.arena.back().unwrap(), SegmentEnd::Right));
    for event in &self.event_queue {
        println!("{:?}", event)
    }
}

Aaand… this doesn’t work. Rust “cannot infer an appropriate lifetime for autoref due to conflicting requirements”. The trouble is that self.arena.back() takes a reference to self.arena, and then I put that reference in the event queue. But I promised that everything in the event queue has lifetime 'a, and I don’t actually know how long self lives here; I only know that it can’t outlive 'a, because that would invalidate the references it holds.

A little random guessing let me to change &mut self to &'a mut self — which is fine because the entire impl block this lives in is already parameterized by 'a — and that makes this compile! Hooray! I think that’s because I’m saying self itself has exactly the same lifetime as the references it holds onto, which is true, since it’s referring to itself.

Let’s get a little more ambitious and try having two segments.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
fn run(&'a mut self) {
    self.arena.push_back(SweepSegment{ data: 5 });
    self.event_queue.push(SweepEndpoint(self.arena.back().unwrap(), SegmentEnd::Left));
    self.event_queue.push(SweepEndpoint(self.arena.back().unwrap(), SegmentEnd::Right));
    self.arena.push_back(SweepSegment{ data: 17 });
    self.event_queue.push(SweepEndpoint(self.arena.back().unwrap(), SegmentEnd::Left));
    self.event_queue.push(SweepEndpoint(self.arena.back().unwrap(), SegmentEnd::Right));
    for event in &self.event_queue {
        println!("{:?}", event)
    }
}

Whoops! Rust complains that I’m trying to mutate self.arena while other stuff is referring to it. And, yes, that’s true — I have references to it in the event queue, and Rust is preventing me from potentially deleting everything from the queue when references to it still exist. I’m not actually deleting anything here, of course (though I could be if this were a Vec!), but Rust’s type system can’t encode that (and I dread the thought of a type system that can).

I struggled with this for a while, and rapidly encountered another complete showstopper:

1
2
3
4
5
6
fn run(&'a mut self) {
    self.mutate_something();
    self.mutate_something();
}

fn mutate_something(&'a mut self) {}

Rust objects that I’m trying to borrow self mutably, twice — once for the first call, once for the second.

But why? A borrow is supposed to end automatically once it’s no longer used, right? Maybe if I throw some braces around it for scope… nope, that doesn’t help either.

It’s true that borrows usually end automatically, but here I have explicitly told Rust that mutate_something() should borrow with the lifetime 'a, which is the same as the lifetime in run(). So the first call explicitly borrows self for at least the rest of the method. Removing the lifetime from mutate_something() does fix this error, but if that method tries to add new segments, I’m back to the original problem.

Oh no. The mutation in the C++ code is several calls deep. Porting it directly seems nearly impossible.

The typical solution here — at least, the first thing people suggest to me on Twitter — is to wrap basically everything everywhere in Rc<RefCell<T>>, which gives you something that’s reference-counted (avoiding questions of ownership) and defers borrow checks until runtime (avoiding questions of mutable borrows). But that seems pretty heavy-handed here — not only does RefCell add .borrow() noise anywhere you actually want to interact with the underlying value, but do I really need to refcount these tiny structs that only hold a handful of floats each?

I set out to find a middle ground.

Solution, kind of

I really, really didn’t want to perform serious surgery on this code just to get it to build. I still didn’t know if it worked at all, and now I had to rearrange it without being able to check if I was breaking it further. (This isn’t Rust’s fault; it’s a natural problem with porting between fairly different paradigms.)

So I kind of hacked it into working with minimal changes, producing a grotesque abomination which I’m ashamed to link to. Here’s how!

First, I got rid of the class. It turns out this makes lifetime juggling much easier right off the bat. I’m pretty sure Rust considers everything in a struct to be destroyed simultaneously (though in practice it guarantees it’ll destroy fields in order), which doesn’t leave much wiggle room. Locals within a function, on the other hand, can each have their own distinct lifetimes, which solves the problem of expressing that the borrows won’t outlive the arena.

Speaking of the arena, I solved the mutability problem there by switching to… an arena! The typed-arena crate (a port of a type used within Rust itself, I think) is an allocator — you give it a value, and it gives you back a reference, and the reference is guaranteed to be valid for as long as the arena exists. The method that does this is sneaky and takes &self rather than &mut self, so Rust doesn’t know you’re mutating the arena and won’t complain. (One drawback is that the arena will never free anything you give to it, but that’s not a big problem here.)


My next problem was with mutation. The main loop repeatedly calls possibleIntersection with pairs of segments, which can split either or both segment. Rust definitely doesn’t like that — I’d have to pass in two &muts, both of which are mutable references into the same arena, and I’d have a bunch of immutable references into that arena in the sweep list and elsewhere. This isn’t going to fly.

This is kind of a shame, and is one place where Rust seems a little overzealous. Something like this seems like it ought to be perfectly valid:

1
2
3
4
let mut v = vec![1u32, 2u32];
let a = &mut v[0];
let b = &mut v[1];
// do stuff with a, b

The trouble is, Rust only knows the type signature, which here is something like index_mut(&'a mut self, index: usize) -> &'a T. Nothing about that says that you’re borrowing distinct elements rather than some core part of the type — and, in fact, the above code is only safe because you’re borrowing distinct elements. In the general case, Rust can’t possibly know that. It seems obvious enough from the different indexes, but nothing about the type system even says that different indexes have to return different values. And what if one were borrowed as &mut v[1] and the other were borrowed with v.iter_mut().next().unwrap()?

Anyway, this is exactly where people start to turn to RefCell — if you’re very sure you know better than Rust, then a RefCell will skirt the borrow checker while still enforcing at runtime that you don’t have more than one mutable borrow at a time.

But half the lines in this algorithm examine the endpoints of a segment! I don’t want to wrap the whole thing in a RefCell, or I’ll have to say this everywhere:

1
if segment1.borrow().point.x < segment2.borrow().point.x { ... }

Gross.

But wait — this code only mutates the points themselves in one place. When a segment is split, the original segment becomes the left half, and a new segment is created to be the right half. There’s no compelling need for this; it saves an allocation for the left half, but it’s not critical to the algorithm.

Thus, I settled on a compromise. My segment type now looks like this:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
struct SegmentPacket {
    // a bunch of flags and whatnot used in the algorithm
}
struct SweepSegment {
    left_point: MapPoint,
    right_point: MapPoint,
    faces_outwards: bool,
    index: usize,
    order: usize,
    packet: RefCell<SegmentPacket>,
}

I do still need to call .borrow() or .borrow_mut() to get at the stuff in the “packet”, but that’s far less common, so there’s less noise overall. And I don’t need to wrap it in Rc because it’s part of a type that’s allocated in the arena and passed around only via references.


This still leaves me with the problem of how to actually perform the splits.

I’m not especially happy with what I came up with, I don’t know if I can defend it, and I suspect I could do much better. I changed possibleIntersection so that rather than performing splits, it returns the points at which each segment needs splitting, in the form (usize, Option<MapPoint>, Option<MapPoint>). (The usize is used as a flag for calling code and oughta be an enum, but, isn’t yet.)

Now the top-level function is responsible for all arena management, and all is well.

Except, er. possibleIntersection is called multiple times, and I don’t want to copy-paste a dozen lines of split code after each call. I tried putting just that code in its own function, which had the world’s most godawful signature, and that didn’t work because… uh… hm. I can’t remember why, exactly! Should’ve written that down.

I tried a local closure next, but closures capture their environment by reference, so now I had references to a bunch of locals for as long as the closure existed, which meant I couldn’t mutate those locals. Argh. (This seems a little silly to me, since the closure’s references cannot possibly be used for anything if the closure isn’t being called, but maybe I’m missing something. Or maybe this is just a limitation of lifetimes.)

Increasingly desperate, I tried using a macro. But… macros are hygienic, which means that any new name you use inside a macro is different from any name outside that macro. The macro thus could not see any of my locals. Usually that’s good, but here I explicitly wanted the macro to mess with my locals.

I was just about to give up and go live as a hermit in a cabin in the woods, when I discovered something quite incredible. You can define local macros! If you define a macro inside a function, then it can see any locals defined earlier in that function. Perfect!

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
macro_rules! _split_segment (
    ($seg:expr, $pt:expr) => (
        {
            let pt = $pt;
            let seg = $seg;
            // ... waaay too much code ...
        }
    );
);

loop {
    // ...
    // This is possibleIntersection, renamed because Rust rightfully complains about camelCase
    let cross = handle_intersections(Some(segment), maybe_above);
    if let Some(pt) = cross.1 {
        segment = _split_segment!(segment, pt);
    }
    if let Some(pt) = cross.2 {
        maybe_above = Some(_split_segment!(maybe_above.unwrap(), pt));
    }
    // ...
}

(This doesn’t actually quite match the original algorithm, which has one case where a segment can be split twice. I realized that I could just do the left-most split, and a later iteration would perform the other split. I sure hope that’s right, anyway.)

It’s a bit ugly, and I ran into a whole lot of implicit behavior from the C++ code that I had to fix — for example, the segment is sometimes mutated just before it’s split, purely as a shortcut for mutating the left part of the split. But it finally compiles! And runs! And kinda worked, a bit!

Aftermath

I still had a lot of work to do.

For one, this code was designed for intersecting two shapes, not mass-intersecting a big pile of shapes. The basic algorithm doesn’t care about how many polygons you start with — all it sees is segments — but the code for constructing the return value needed some heavy modification.

The biggest change by far? The original code traced each segment once, expecting the result to be only a single shape. I had to change that to trace each side of each segment once, since the vast bulk of the output consists of shapes which share a side. This violated a few assumptions, which I had to hack around.

I also ran into a couple very bad edge cases, spent ages debugging them, then found out that the original algorithm had a subtle workaround that I’d commented out because it was awkward to port but didn’t seem to do anything. Whoops!

The worst was a precision error, where a vertical line could be split on a point not quite actually on the line, which wreaked all kinds of havoc. I worked around that with some tasteful rounding, which is highly dubious but makes the output more appealing to my squishy human brain. (I might switch to the original workaround, but I really dislike that even simple cases can spit out points at 1500.0000000000003. The whole thing is parameterized over the coordinate type, so maybe I could throw a rational type in there and cross my fingers?)

All that done, I finally, finally, after a couple months of intermittent progress, got what I wanted!

This is Doom 2’s MAP01. The black area to the left of center is where the player starts. Gray areas indicate where the player can walk from there, with lighter shades indicating more distant areas, where “distance” is measured by the minimum number of line crossings. Red areas can’t be reached at all.

(Note: large playable chunks of the map, including the exit room, are red. That’s because those areas are behind doors, and this code doesn’t understand doors yet.)

(Also note: The big crescent in the lower-right is also black because I was lazy and looked for the player’s starting sector by checking the bbox, and that sector’s bbox happens to match.)

The code that generated this had to go out of its way to delete all the unreachable zones around solid walls. I think I could modify the algorithm to do that on the fly pretty easily, which would probably speed it up a bit too. Downside is that the algorithm would then be pretty specifically tied to this problem, and not usable for any other kind of polygon intersection, which I would think could come up elsewhere? The modifications would be pretty minor, though, so maybe I could confine them to a closure or something.

Some final observations

It runs surprisingly slowly. Like, multiple seconds. Unless I add --release, which speeds it up by a factor of… some number with multiple digits. Wahoo. Debug mode has a high price, especially with a lot of calls in play.

The current state of this code is on GitHub. Please don’t look at it. I’m very sorry.

Honestly, most of my anguish came not from Rust, but from the original code relying on lots of fairly subtle behavior without bothering to explain what it was doing or even hint that anything unusual was going on. God, I hate C++.

I don’t know if the Rust community can learn from this. I don’t know if I even learned from this. Let’s all just quietly forget about it.

Now I just need to figure this one out…

Facebook and Cambridge Analytica

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/03/facebook_and_ca.html

In the wake of the Cambridge Analytica scandal, news articles and commentators have focused on what Facebook knows about us. A lot, it turns out. It collects data from our posts, our likes, our photos, things we type and delete without posting, and things we do while not on Facebook and even when we’re offline. It buys data about us from others. And it can infer even more: our sexual orientation, political beliefs, relationship status, drug use, and other personality traits — even if we didn’t take the personality test that Cambridge Analytica developed.

But for every article about Facebook’s creepy stalker behavior, thousands of other companies are breathing a collective sigh of relief that it’s Facebook and not them in the spotlight. Because while Facebook is one of the biggest players in this space, there are thousands of other companies that spy on and manipulate us for profit.

Harvard Business School professor Shoshana Zuboff calls it “surveillance capitalism.” And as creepy as Facebook is turning out to be, the entire industry is far creepier. It has existed in secret far too long, and it’s up to lawmakers to force these companies into the public spotlight, where we can all decide if this is how we want society to operate and — if not — what to do about it.

There are 2,500 to 4,000 data brokers in the United States whose business is buying and selling our personal data. Last year, Equifax was in the news when hackers stole personal information on 150 million people, including Social Security numbers, birth dates, addresses, and driver’s license numbers.

You certainly didn’t give it permission to collect any of that information. Equifax is one of those thousands of data brokers, most of them you’ve never heard of, selling your personal information without your knowledge or consent to pretty much anyone who will pay for it.

Surveillance capitalism takes this one step further. Companies like Facebook and Google offer you free services in exchange for your data. Google’s surveillance isn’t in the news, but it’s startlingly intimate. We never lie to our search engines. Our interests and curiosities, hopes and fears, desires and sexual proclivities, are all collected and saved. Add to that the websites we visit that Google tracks through its advertising network, our Gmail accounts, our movements via Google Maps, and what it can collect from our smartphones.

That phone is probably the most intimate surveillance device ever invented. It tracks our location continuously, so it knows where we live, where we work, and where we spend our time. It’s the first and last thing we check in a day, so it knows when we wake up and when we go to sleep. We all have one, so it knows who we sleep with. Uber used just some of that information to detect one-night stands; your smartphone provider and any app you allow to collect location data knows a lot more.

Surveillance capitalism drives much of the internet. It’s behind most of the “free” services, and many of the paid ones as well. Its goal is psychological manipulation, in the form of personalized advertising to persuade you to buy something or do something, like vote for a candidate. And while the individualized profile-driven manipulation exposed by Cambridge Analytica feels abhorrent, it’s really no different from what every company wants in the end. This is why all your personal information is collected, and this is why it is so valuable. Companies that can understand it can use it against you.

None of this is new. The media has been reporting on surveillance capitalism for years. In 2015, I wrote a book about it. Back in 2010, the Wall Street Journal published an award-winning two-year series about how people are tracked both online and offline, titled “What They Know.”

Surveillance capitalism is deeply embedded in our increasingly computerized society, and if the extent of it came to light there would be broad demands for limits and regulation. But because this industry can largely operate in secret, only occasionally exposed after a data breach or investigative report, we remain mostly ignorant of its reach.

This might change soon. In 2016, the European Union passed the comprehensive General Data Protection Regulation, or GDPR. The details of the law are far too complex to explain here, but some of the things it mandates are that personal data of EU citizens can only be collected and saved for “specific, explicit, and legitimate purposes,” and only with explicit consent of the user. Consent can’t be buried in the terms and conditions, nor can it be assumed unless the user opts in. This law will take effect in May, and companies worldwide are bracing for its enforcement.

Because pretty much all surveillance capitalism companies collect data on Europeans, this will expose the industry like nothing else. Here’s just one example. In preparation for this law, PayPal quietly published a list of over 600 companies it might share your personal data with. What will it be like when every company has to publish this sort of information, and explicitly explain how it’s using your personal data? We’re about to find out.

In the wake of this scandal, even Mark Zuckerberg said that his industry probably should be regulated, although he’s certainly not wishing for the sorts of comprehensive regulation the GDPR is bringing to Europe.

He’s right. Surveillance capitalism has operated without constraints for far too long. And advances in both big data analysis and artificial intelligence will make tomorrow’s applications far creepier than today’s. Regulation is the only answer.

The first step to any regulation is transparency. Who has our data? Is it accurate? What are they doing with it? Who are they selling it to? How are they securing it? Can we delete it? I don’t see any hope of Congress passing a GDPR-like data protection law anytime soon, but it’s not too far-fetched to demand laws requiring these companies to be more transparent in what they’re doing.

One of the responses to the Cambridge Analytica scandal is that people are deleting their Facebook accounts. It’s hard to do right, and doesn’t do anything about the data that Facebook collects about people who don’t use Facebook. But it’s a start. The market can put pressure on these companies to reduce their spying on us, but it can only do that if we force the industry out of its secret shadows.

This essay previously appeared on CNN.com.

EDITED TO ADD (4/2): Slashdot thread.

Israeli Security Attacks AMD by Publishing Zero-Day Exploits

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/03/israeli_securit.html

Last week, the Israeli security company CTS Labs published a series of exploits against AMD chips. The publication came with the flashy website, detailed whitepaper, cool vulnerability names — RYZENFALL, MASTERKEY, FALLOUT, and CHIMERA — and logos we’ve come to expect from these sorts of things. What’s new is that the company only gave AMD a day’s notice, which breaks with every norm about responsible disclosure. CTS Labs didn’t release details of the exploits, only high-level descriptions of the vulnerabilities, but it is probably still enough for others to reproduce their results. This is incredibly irresponsible of the company.

Moreover, the vulnerabilities are kind of meh. Nicholas Weaver explains:

In order to use any of the four vulnerabilities, an attacker must already have almost complete control over the machine. For most purposes, if the attacker already has this access, we would generally say they’ve already won. But these days, modern computers at least attempt to protect against a rogue operating system by having separate secure subprocessors. CTS Labs discovered the vulnerabilities when they looked at AMD’s implementation of the secure subprocessor to see if an attacker, having already taken control of the host operating system, could bypass these last lines of defense.

In a “Clarification,” CTS Labs kind of agrees:

The vulnerabilities described in amdflaws.com could give an attacker that has already gained initial foothold into one or more computers in the enterprise a significant advantage against IT and security teams.

The only thing the attacker would need after the initial local compromise is local admin privileges and an affected machine. To clarify misunderstandings — there is no need for physical access, no digital signatures, no additional vulnerability to reflash an unsigned BIOS. Buy a computer from the store, run the exploits as admin — and they will work (on the affected models as described on the site).

The weirdest thing about this story is that CTS Labs describes one of the vulnerabilities, Chimera, as a backdoor. Although it doesn’t t come out and say that this was deliberately planted by someone, it does make the point that the chips were designed in Taiwan. This is an incredible accusation, and honestly needs more evidence before we can evaluate it.

The upshot of all of this is that CTS Labs played this for maximum publicity: over-hyping its results and minimizing AMD’s ability to respond. And it may have an ulterior motive:

But CTS’s website touting AMD’s flaws also contained a disclaimer that threw some shadows on the company’s motives: “Although we have a good faith belief in our analysis and believe it to be objective and unbiased, you are advised that we may have, either directly or indirectly, an economic interest in the performance of the securities of the companies whose products are the subject of our reports,” reads one line. WIRED asked in a follow-up email to CTS whether the company holds any financial positions designed to profit from the release of its AMD research specifically. CTS didn’t respond.

We all need to demand better behavior from security researchers. I know that any publicity is good publicity, but I am pleased to see the stories critical of CTS Labs outnumbering the stories praising it.

EDITED TO ADD (3/21): AMD responds:

AMD’s response today agrees that all four bug families are real and are found in the various components identified by CTS. The company says that it is developing firmware updates for the three PSP flaws. These fixes, to be made available in “coming weeks,” will be installed through system firmware updates. The firmware updates will also mitigate, in some unspecified way, the Chimera issue, with AMD saying that it’s working with ASMedia, the third-party hardware company that developed Promontory for AMD, to develop suitable protections. In its report, CTS wrote that, while one CTS attack vector was a firmware bug (and hence in principle correctable), the other was a hardware flaw. If true, there may be no effective way of solving it.

Response here.

Message Filtering Operators for Numeric Matching, Prefix Matching, and Blacklisting in Amazon SNS

Post Syndicated from Christie Gifrin original https://aws.amazon.com/blogs/compute/message-filtering-operators-for-numeric-matching-prefix-matching-and-blacklisting-in-amazon-sns/

This blog was contributed by Otavio Ferreira, Software Development Manager for Amazon SNS

Message filtering simplifies the overall pub/sub messaging architecture by offloading message filtering logic from subscribers, as well as message routing logic from publishers. The initial launch of message filtering provided a basic operator that was based on exact string comparison. For more information, see Simplify Your Pub/Sub Messaging with Amazon SNS Message Filtering.

Today, AWS is announcing an additional set of filtering operators that bring even more power and flexibility to your pub/sub messaging use cases.

Message filtering operators

Amazon SNS now supports both numeric and string matching. Specifically, string matching operators allow for exact, prefix, and “anything-but” comparisons, while numeric matching operators allow for exact and range comparisons, as outlined below. Numeric matching operators work for values between -10e9 and +10e9 inclusive, with five digits of accuracy right of the decimal point.

  • Exact matching on string values (Whitelisting): Subscription filter policy   {"sport": ["rugby"]} matches message attribute {"sport": "rugby"} only.
  • Anything-but matching on string values (Blacklisting): Subscription filter policy {"sport": [{"anything-but": "rugby"}]} matches message attributes such as {"sport": "baseball"} and {"sport": "basketball"} and {"sport": "football"} but not {"sport": "rugby"}
  • Prefix matching on string values: Subscription filter policy {"sport": [{"prefix": "bas"}]} matches message attributes such as {"sport": "baseball"} and {"sport": "basketball"}
  • Exact matching on numeric values: Subscription filter policy {"balance": [{"numeric": ["=", 301.5]}]} matches message attributes {"balance": 301.500} and {"balance": 3.015e2}
  • Range matching on numeric values: Subscription filter policy {"balance": [{"numeric": ["<", 0]}]} matches negative numbers only, and {"balance": [{"numeric": [">", 0, "<=", 150]}]} matches any positive number up to 150.

As usual, you may apply the “AND” logic by appending multiple keys in the subscription filter policy, and the “OR” logic by appending multiple values for the same key, as follows:

  • AND logic: Subscription filter policy {"sport": ["rugby"], "language": ["English"]} matches only messages that carry both attributes {"sport": "rugby"} and {"language": "English"}
  • OR logic: Subscription filter policy {"sport": ["rugby", "football"]} matches messages that carry either the attribute {"sport": "rugby"} or {"sport": "football"}

Message filtering operators in action

Here’s how this new set of filtering operators works. The following example is based on a pharmaceutical company that develops, produces, and markets a variety of prescription drugs, with research labs located in Asia Pacific and Europe. The company built an internal procurement system to manage the purchasing of lab supplies (for example, chemicals and utensils), office supplies (for example, paper, folders, and markers) and tech supplies (for example, laptops, monitors, and printers) from global suppliers.

This distributed system is composed of the four following subsystems:

  • A requisition system that presents the catalog of products from suppliers, and takes orders from buyers
  • An approval system for orders targeted to Asia Pacific labs
  • Another approval system for orders targeted to European labs
  • A fulfillment system that integrates with shipping partners

As shown in the following diagram, the company leverages AWS messaging services to integrate these distributed systems.

  • Firstly, an SNS topic named “Orders” was created to take all orders placed by buyers on the requisition system.
  • Secondly, two Amazon SQS queues, named “Lab-Orders-AP” and “Lab-Orders-EU” (for Asia Pacific and Europe respectively), were created to backlog orders that are up for review on the approval systems.
  • Lastly, an SQS queue named “Common-Orders” was created to backlog orders that aren’t related to lab supplies, which can already be picked up by shipping partners on the fulfillment system.

The company also uses AWS Lambda functions to automatically process lab supply orders that don’t require approval or which are invalid.

In this example, because different types of orders have been published to the SNS topic, the subscribing endpoints have had to set advanced filter policies on their SNS subscriptions, to have SNS automatically filter out orders they can’t deal with.

As depicted in the above diagram, the following five filter policies have been created:

  • The SNS subscription that points to the SQS queue “Lab-Orders-AP” sets a filter policy that matches lab supply orders, with a total value greater than $1,000, and that target Asia Pacific labs only. These more expensive transactions require an approver to review orders placed by buyers.
  • The SNS subscription that points to the SQS queue “Lab-Orders-EU” sets a filter policy that matches lab supply orders, also with a total value greater than $1,000, but that target European labs instead.
  • The SNS subscription that points to the Lambda function “Lab-Preapproved” sets a filter policy that only matches lab supply orders that aren’t as expensive, up to $1,000, regardless of their target lab location. These orders simply don’t require approval and can be automatically processed.
  • The SNS subscription that points to the Lambda function “Lab-Cancelled” sets a filter policy that only matches lab supply orders with total value of $0 (zero), regardless of their target lab location. These orders carry no actual items, obviously need neither approval nor fulfillment, and as such can be automatically canceled.
  • The SNS subscription that points to the SQS queue “Common-Orders” sets a filter policy that blacklists lab supply orders. Hence, this policy matches only office and tech supply orders, which have a more streamlined fulfillment process, and require no approval, regardless of price or target location.

After the company finished building this advanced pub/sub architecture, they were then able to launch their internal procurement system and allow buyers to begin placing orders. The diagram above shows six example orders published to the SNS topic. Each order contains message attributes that describe the order, and cause them to be filtered in a different manner, as follows:

  • Message #1 is a lab supply order, with a total value of $15,700 and targeting a research lab in Singapore. Because the value is greater than $1,000, and the location “Asia-Pacific-Southeast” matches the prefix “Asia-Pacific-“, this message matches the first SNS subscription and is delivered to SQS queue “Lab-Orders-AP”.
  • Message #2 is a lab supply order, with a total value of $1,833 and targeting a research lab in Ireland. Because the value is greater than $1,000, and the location “Europe-West” matches the prefix “Europe-“, this message matches the second SNS subscription and is delivered to SQS queue “Lab-Orders-EU”.
  • Message #3 is a lab supply order, with a total value of $415. Because the value is greater than $0 and less than $1,000, this message matches the third SNS subscription and is delivered to Lambda function “Lab-Preapproved”.
  • Message #4 is a lab supply order, but with a total value of $0. Therefore, it only matches the fourth SNS subscription, and is delivered to Lambda function “Lab-Cancelled”.
  • Messages #5 and #6 aren’t lab supply orders actually; one is an office supply order, and the other is a tech supply order. Therefore, they only match the fifth SNS subscription, and are both delivered to SQS queue “Common-Orders”.

Although each message only matched a single subscription, each was tested against the filter policy of every subscription in the topic. Hence, depending on which attributes are set on the incoming message, the message might actually match multiple subscriptions, and multiple deliveries will take place. Also, it is important to bear in mind that subscriptions with no filter policies catch every single message published to the topic, as a blank filter policy equates to a catch-all behavior.

Summary

Amazon SNS allows for both string and numeric filtering operators. As explained in this post, string operators allow for exact, prefix, and “anything-but” comparisons, while numeric operators allow for exact and range comparisons. These advanced filtering operators bring even more power and flexibility to your pub/sub messaging functionality and also allow you to simplify your architecture further by removing even more logic from your subscribers.

Message filtering can be implemented easily with existing AWS SDKs by applying message and subscription attributes across all SNS supported protocols (Amazon SQS, AWS Lambda, HTTP, SMS, email, and mobile push). SNS filtering operators for numeric matching, prefix matching, and blacklisting are available now in all AWS Regions, for no extra charge.

To experiment with these new filtering operators yourself, and continue learning, try the 10-minute Tutorial Filter Messages Published to Topics. For more information, see Filtering Messages with Amazon SNS in the SNS documentation.