Tag Archives: MICROS

The FAA Is Arguing for Security by Obscurity

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/06/the_faa_is_argu.html

In a proposed rule, the FAA argues that software in an Embraer S.A. Model ERJ 190-300 airplane is secure because it’s proprietary:

In addition, the operating systems for current airplane systems are usually and historically proprietary. Therefore, they are not as susceptible to corruption from worms, viruses, and other malicious actions as are more-widely used commercial operating systems, such as Microsoft Windows, because access to the design details of these proprietary operating systems is limited to the system developer and airplane integrator. Some systems installed on the Embraer Model ERJ 190-300 airplane will use operating systems that are widely used and commercially available from third-party software suppliers. The security vulnerabilities of these operating systems may be more widely known than are the vulnerabilities of proprietary operating systems that the avionics manufacturers currently use.

Longtime readers will immediately recognize the “security by obscurity” argument. Its main problem is that it’s fragile. The information is likely less obscure than you think, and even if it is truly obscure, once it’s published you’ve just lost all your security.

This is me from 2014, 2004, and 2002.

The comment period for this proposed rule is ongoing. If you comment, please be polite — they’re more likely to listen to you.

Synchronizing Amazon S3 Buckets Using AWS Step Functions

Post Syndicated from Andy Katz original https://aws.amazon.com/blogs/compute/synchronizing-amazon-s3-buckets-using-aws-step-functions/

Constantin Gonzalez is a Principal Solutions Architect at AWS

In my free time, I run a small blog that uses Amazon S3 to host static content and Amazon CloudFront to distribute it world-wide. I use a home-grown, static website generator to create and upload my blog content onto S3.

My blog uses two S3 buckets: one for staging and testing, and one for production. As a website owner, I want to update the production bucket with all changes from the staging bucket in a reliable and efficient way, without having to create and populate a new bucket from scratch. Therefore, to synchronize files between these two buckets, I use AWS Lambda and AWS Step Functions.

In this post, I show how you can use Step Functions to build a scalable synchronization engine for S3 buckets and learn some common patterns for designing Step Functions state machines while you do so.

Step Functions overview

Step Functions makes it easy to coordinate the components of distributed applications and microservices using visual workflows. Building applications from individual components that each perform a discrete function lets you scale and change applications quickly.

While this particular example focuses on synchronizing objects between two S3 buckets, it can be generalized to any other use case that involves coordinated processing of any number of objects in S3 buckets, or other, similar data processing patterns.

Bucket replication options

Before I dive into the details on how this particular example works, take a look at some alternatives for copying or replicating data between two Amazon S3 buckets:

  • The AWS CLI provides customers with a powerful aws s3 sync command that can synchronize the contents of one bucket with another.
  • S3DistCP is a powerful tool for users of Amazon EMR that can efficiently load, save, or copy large amounts of data between S3 buckets and HDFS.
  • The S3 cross-region replication functionality enables automatic, asynchronous copying of objects across buckets in different AWS regions.

In this use case, you are looking for a slightly different bucket synchronization solution that:

  • Works within the same region
  • Is more scalable than a CLI approach running on a single machine
  • Doesn’t require managing any servers
  • Uses a more finely grained cost model than the hourly-based Amazon EMR approach

You need a scalable, serverless, and customizable bucket synchronization utility.

Solution architecture

Your solution needs to do three things:

  1. Copy all objects from a source bucket into a destination bucket, but leave out objects that are already present, for efficiency.
  2. Delete all "orphaned" objects from the destination bucket that aren’t present on the source bucket, because you don’t want obsolete objects lying around.
  3. Keep track of all objects for #1 and #2, regardless of how many objects there are.

In the beginning, you read in the source and destination buckets as parameters and perform basic parameter validation. Then, you operate two separate, independent loops, one for copying missing objects and one for deleting obsolete objects. Each loop is a sequence of Step Functions states that read in chunks of S3 object lists and use the continuation token to decide in a choice state whether to continue the loop or not.

This solution is based on the following architecture that uses Step Functions, Lambda, and two S3 buckets:

As you can see, this setup involves no servers, just two main building blocks:

  • Step Functions manages the overall flow of synchronizing the objects from the source bucket with the destination bucket.
  • A set of Lambda functions carry out the individual steps necessary to perform the work, such as validating input, getting lists of objects from source and destination buckets, copying or deleting objects in batches, and so on.

To understand the synchronization flow in more detail, look at the Step Functions state machine diagram for this example.

Walkthrough

Here’s a detailed discussion of how this works.

To follow along, use the code in the sync-buckets-state-machine GitHub repo. The code comes with a ready-to-run deployment script in Python that takes care of all the IAM roles, policies, Lambda functions, and of course the Step Functions state machine deployment using AWS CloudFormation, as well as instructions on how to use it.

Fine print: Use at your own risk

Before I start, here are some disclaimers:

  • Educational purposes only.

    The following example and code are intended for educational purposes only. Make sure that you customize, test, and review it on your own before using any of this in production.

  • S3 object deletion.

    In particular, using the code included below may delete objects on S3 in order to perform synchronization. Make sure that you have backups of your data. Also consider using the Amazon S3 Versioning feature to protect yourself against unintended data modification or deletion.

Step Functions execution starts with an initial set of parameters that contain the source and destination bucket names in JSON:

{
    "source":       "my-source-bucket-name",
    "destination":  "my-destination-bucket-name"
}

Armed with this data, Step Functions execution proceeds as follows.

Step 1: Detect the bucket region

First, you need to know the regions where your buckets reside. In this case, take advantage of the Step Functions Parallel state. This allows you to use a Lambda function get_bucket_location.py inside two different, parallel branches of task states:

  • FindRegionForSourceBucket
  • FindRegionForDestinationBucket

Each task state receives one bucket name as an input parameter, then detects the region of that bucket. The output of these functions is collected in a result array containing one element per parallel branch.
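
For illustration, a get_bucket_location-style Lambda handler might look roughly like the following Python (boto3) sketch. The event shape and key names are assumptions for illustration, not the actual code from the repo.

# Hypothetical sketch of a get_bucket_location-style handler; event and key names assumed.
import boto3

s3 = boto3.client('s3')

def handler(event, context):
    bucket = event['bucket']
    # get_bucket_location returns None for buckets in us-east-1, so normalize that case.
    region = s3.get_bucket_location(Bucket=bucket)['LocationConstraint'] or 'us-east-1'
    return {'bucket': bucket, 'region': region}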

Step 2: Combine the parallel states

The output of a parallel state is a list with all the individual branches’ outputs. To combine them into a single structure, use a Lambda function called combine_dicts.py in its own CombineRegionOutputs task state. The function combines the two outputs from step 1 into a single JSON dict that provides you with the necessary region information for each bucket.
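
A combine_dicts-style handler can be tiny. Here is a rough Python sketch; the actual function in the repo may differ:

# Hypothetical sketch of a combine_dicts-style handler.
def handler(event, context):
    # The Parallel state hands over a list with one output dict per branch; merge them.
    combined = {}
    for branch_output in event:
        combined.update(branch_output)
    return combined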

Step 3: Validate the input

In this walkthrough, you only support buckets that reside in the same region, so you need to decide if the input is valid or if the user has given you two buckets in different regions. To find out, use a Lambda function called validate_input.py in the ValidateInput task state that tests if the two regions from the previous step are equal. The output is a Boolean.
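
A validate_input-style handler then only needs to compare the two regions. A minimal Python sketch, with assumed key names, could look like this:

# Hypothetical sketch of a validate_input-style handler; key names are assumptions.
def handler(event, context):
    # This walkthrough only supports buckets that reside in the same region.
    return event['sourceRegion'] == event['destinationRegion']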

Step 4: Branch the workflow

Use another type of Step Functions state, a Choice state, which branches into a Failure state if the comparison in step 3 yields false, or proceeds with the remaining steps if the comparison was successful.

Step 5: Execute in parallel

The actual work happens in another Parallel state. Both branches of this state are very similar to each other and re-use some of the Lambda function code.

Each parallel branch implements a looping pattern across the following steps:

  1. Use a Pass state to inject either the string value "source" (InjectSourceBucket) or "destination" (InjectDestinationBucket) into the listBucket attribute of the state document.

    The next step uses either the source or the destination bucket, depending on the branch, while executing the same, generic Lambda function. You don’t need two Lambda functions that differ only slightly. This step illustrates how to use Pass states as a way of injecting constant parameters into your state machine and as a way of controlling step behavior while re-using common step execution code.

  2. The next step UpdateSourceKeyList/UpdateDestinationKeyList lists objects in the given bucket.

    Remember that the previous step injected either "source" or "destination" into the state document’s listBucket attribute. This step uses the same list_bucket.py Lambda function to list objects in an S3 bucket; the listBucket attribute of its input decides which bucket to list. In the left branch of the main parallel state, the list of source objects is used to work through copying missing objects. The right branch uses the list of destination objects to check whether each one has a corresponding object in the source bucket and to eliminate any orphaned objects; orphans have no source object with the same S3 key. (A rough sketch of this listing pattern appears after this list.)

  3. This step performs the actual work. In the left branch, the CopySourceKeys step uses the copy_keys.py Lambda function to go through the list of source objects provided by the previous step, then copies any missing object into the destination bucket. Its sister step in the other branch, DeleteOrphanedKeys, uses its destination bucket key list to test whether each object from the destination bucket has a corresponding source object, then deletes any orphaned objects.

  4. The S3 ListObjects API action is designed to be scalable across many objects in a bucket. Therefore, it returns object lists in chunks of configurable size, along with a continuation token. If the API result has a continuation token, it means that there are more objects in this list. You can work from token to token to continue getting object list chunks, until you get no more continuation tokens.
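
To make the looping pattern concrete, here is a rough Python (boto3) sketch of a list_bucket-style handler that returns one chunk of keys plus a continuation token. The state-document key names are assumptions for illustration:

# Hypothetical sketch of a list_bucket-style handler; state-document key names assumed.
import boto3

s3 = boto3.client('s3')

def handler(event, context):
    bucket = event[event['listBucket']]           # 'source' or 'destination'
    kwargs = {'Bucket': bucket, 'MaxKeys': 1000}  # chunk size is tunable
    if event.get('continuationToken'):
        kwargs['ContinuationToken'] = event['continuationToken']
    response = s3.list_objects_v2(**kwargs)
    event['keys'] = [obj['Key'] for obj in response.get('Contents', [])]
    # Present only when more objects remain; the Choice state in step 6 tests for this.
    event['continuationToken'] = response.get('NextContinuationToken')
    return event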

By breaking down large amounts of work into chunks, you can make sure each chunk is completed within the timeframe allocated for the Lambda function, and within the maximum input/output data size for a Step Functions state.

This approach comes with a slight tradeoff: the more objects you process at one time in a given chunk, the faster you are done, because there is less overhead for managing individual chunks. On the other hand, if you process too many objects within the same chunk, you risk exceeding the time and space limits of the processing Lambda function or the Step Functions state, and the work cannot be completed.

In this particular case, use a Lambda function that maximizes the number of objects listed from the S3 bucket that can be stored in the input/output state data. This is currently up to 32,768 bytes, assuming (based on some experimentation) that the execution of the COPY/DELETE requests in the processing states can always complete in time.

A more sophisticated approach would use the Step Functions retry/catch state attributes to detect when a time limit has been hit and adjust the list size accordingly.

Step 6: Test for completion

Because the presence of a continuation token in the S3 ListObjects output signals that you are not done processing all objects yet, use a Choice state to test for its presence. If a continuation token exists, it branches into the UpdateSourceKeyList step, which uses the token to get to the next chunk of objects. If there is no token, you’re done. The state machine then branches into the FinishCopyBranch/FinishDeleteBranch state.

By using Choice states like this, you can create loops just like in the old days, when you didn’t have for statements and used branches in assembly code instead!

Step 7: Success!

Finally, you’re done, and can step into your final Success state.

Lessons learned

When implementing this use case with Step Functions and Lambda, I learned the following things:

  • Sometimes, it is necessary to manipulate the JSON state of a Step Functions state machine with just a few lines of code that hardly seem to warrant their own Lambda function. This is OK, and the cost is actually pretty low given Lambda’s 100-millisecond billing granularity. The upside is that functions like these help make the data more palatable for the following steps or for Choice states. An example is the combine_dicts.py function.
  • Pass states can be useful beyond debugging and tracing: they can inject arbitrary values into your state JSON and guide generic Lambda functions into doing specific things.
  • Choice states are your friend because you can build while-loops with them. This allows you to reliably grind through large amounts of data with the patience of an engine that currently supports execution times of up to 1 year.

    Currently, there is an execution history limit of 25,000 events. Each Lambda task state execution takes up 5 events, while each choice state takes 2 events, for a total of 7 events per loop. This means you can loop about 3,500 times with this state machine. For even more scalability, you can split up work across multiple Step Functions executions through object key sharding or similar approaches.

  • It’s not necessary to spend a lot of time coding exception handling within your Lambda functions. You can delegate all exception handling to Step Functions and instead simplify your functions as much as possible.

  • Step Functions is a great replacement for shell scripts. This could have been a shell script, but then I would have had to worry about where to execute it reliably, how to scale it if it went beyond a few thousand objects, and so on. Think of Step Functions and Lambda as tools for scripting at a cloud level, beyond the boundaries of servers or containers. "Serverless" here also means "boundary-less".

Summary

This approach gives you scalability by breaking down any number of S3 objects into chunks, then using Step Functions to control logic to work through these objects in a scalable, serverless, and fully managed way.

To take a look at the code or tweak it for your own needs, use the code in the sync-buckets-state-machine GitHub repo.

To see more examples, please visit the Step Functions Getting Started page.

Enjoy!

DynamoDB Accelerator (DAX) Now Generally Available

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/dynamodb-accelerator-dax-now-generally-available/

Earlier this year I told you about Amazon DynamoDB Accelerator (DAX), a fully-managed caching service that sits in front of (logically speaking) your Amazon DynamoDB tables. DAX returns cached responses in microseconds, making it a great fit for eventually-consistent read-intensive workloads. DAX supports the DynamoDB API, and is seamless and easy to use. As a managed service, you simply create your DAX cluster and use it as the target for your existing reads and writes. You don’t have to worry about patching, cluster maintenance, replication, or fault management.

Now Generally Available
Today I am pleased to announce that DAX is now generally available. We have expanded DAX into additional AWS Regions and used the preview time to fine-tune performance and availability:

Now in Five Regions – DAX is now available in the US East (Northern Virginia), EU (Ireland), US West (Oregon), Asia Pacific (Tokyo), and US West (Northern California) Regions.

In Production – Our preview customers report that they are using DAX in production, that they love how easy it was to add DAX to their applications, and that their apps are now running 10x faster.

Getting Started with DAX
As I outlined in my earlier post, it is easy to use DAX to accelerate your existing DynamoDB applications. You simply create a DAX cluster in the desired region, update your application to reference the DAX SDK for Java (the calls are the same; this is a drop-in replacement), and configure the SDK to use your cluster’s endpoint. As a read-through/write-through cache, DAX seamlessly handles all of the DynamoDB read/write APIs.

We are working on SDK support for other languages, and I will share additional information as it becomes available.

DAX Pricing
You pay for each node in the cluster (see the DynamoDB Pricing page for more information) on a per-hour basis, with prices starting at $0.269 per hour in the US East (Northern Virginia) and US West (Oregon) regions. With DAX, each of the nodes in your cluster serves as a read target and as a failover target for high availability. The DAX SDK is cluster aware and will issue round-robin requests to all nodes in the cluster so that you get to make full use of the cluster’s cache resources.

Because DAX can easily handle sudden spikes in read traffic, you may be able to reduce the amount of provisioned throughput for your tables, resulting in an overall cost savings while still returning results in microseconds.

Jeff;

 

Is Continuing to Patch Windows XP a Mistake?

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2017/06/is_continuing_t.html

Last week, Microsoft issued a security patch for Windows XP, a 16-year-old operating system that Microsoft officially no longer supports. Last month, Microsoft issued a Windows XP patch for the vulnerability used in WannaCry.

Is this a good idea? This 2014 essay argues that it’s not:

The zero-day flaw and its exploitation is unfortunate, and Microsoft is likely smarting from government calls for people to stop using Internet Explorer. The company had three ways it could respond. It could have done nothing — stuck to its guns, maintained that the end of support means the end of support, and encouraged people to move to a different platform. It could also have relented entirely, extended Windows XP’s support life cycle for another few years and waited for attrition to shrink Windows XP’s userbase to irrelevant levels. Or it could have claimed that this case is somehow “special,” releasing a patch while still claiming that Windows XP isn’t supported.

None of these options is perfect. A hard-line approach to the end-of-life means that there are people being exploited that Microsoft refuses to help. A complete about-turn means that Windows XP will take even longer to flush out of the market, making it a continued headache for developers and administrators alike.

But the option Microsoft took is the worst of all worlds. It undermines efforts by IT staff to ditch the ancient operating system and undermines Microsoft’s assertion that Windows XP isn’t supported, while doing nothing to meaningfully improve the security of Windows XP users. The upside? It buys those users at best a few extra days of improved security. It’s hard to say how that was possibly worth it.

This is a hard trade-off, and it’s going to get much worse with the Internet of Things. Here’s me:

The security of our computers and phones also comes from the fact that we replace them regularly. We buy new laptops every few years. We get new phones even more frequently. This isn’t true for all of the embedded IoT systems. They last for years, even decades. We might buy a new DVR every five or ten years. We replace our refrigerator every 25 years. We replace our thermostat approximately never. Already the banking industry is dealing with the security problems of Windows 95 embedded in ATMs. This same problem is going to occur all over the Internet of Things.

At least Microsoft has security engineers on staff that can write a patch for Windows XP. There will be no one able to write patches for your 16-year-old thermostat and refrigerator, even assuming those devices can accept security patches.

Building Loosely Coupled, Scalable, C# Applications with Amazon SQS and Amazon SNS

Post Syndicated from Tara Van Unen original https://aws.amazon.com/blogs/compute/building-loosely-coupled-scalable-c-applications-with-amazon-sqs-and-amazon-sns/

 
Stephen Liedig, Solutions Architect

 

One of the many challenges professional software architects and developers face is how to make cloud-native applications scalable, fault-tolerant, and highly available.

Fundamental to your project success is understanding the importance of making systems highly cohesive and loosely coupled. That means considering the multi-dimensional facets of system coupling to support the distributed nature of the applications that you are building for the cloud.

By that, I mean addressing not only application-level coupling (managing incoming and outgoing dependencies), but also considering the impacts of platform, spatial, and temporal coupling of your systems. Platform coupling relates to the interoperability, or lack thereof, of heterogeneous system components. Spatial coupling deals with managing components at a network topology or protocol level. Temporal, or runtime, coupling refers to the ability of a component within your system to do any kind of meaningful work while it is performing a synchronous, blocking operation.

The AWS messaging services, Amazon SQS and Amazon SNS, help you deal with these forms of coupling by providing mechanisms for:

  • Reliable, durable, and fault-tolerant delivery of messages between application components
  • Logical decomposition of systems and increased autonomy of components
  • Creating unidirectional, non-blocking operations, temporarily decoupling system components at runtime
  • Decreasing the dependencies that components have on each other through standard communication and network channels

Following on from the recent post, Building Scalable Applications and Microservices: Adding Messaging to Your Toolbox, in this post I look at some of the ways you can introduce SQS and SNS into your architectures to decouple your components, and show how you can implement them using C#.

Walkthrough

To illustrate some of these concepts, consider a web application that processes customer orders. As good architects and developers, you have followed best practices and made your application scalable and highly available. Your solution included implementing load balancing, dynamic scaling across multiple Availability Zones, and persisting orders in a Multi-AZ Amazon RDS database instance, as in the following diagram.


In this example, the application is responsible for handling and persisting the order data, as well as dealing with increases in traffic for popular items.

One potential point of vulnerability in the order processing workflow is in saving the order in the database. The business expects that every order has been persisted into the database. However, any potential deadlock, race condition, or network issue could cause the persistence of the order to fail. Then, the order is lost with no recourse to restore the order.

With good logging capability, you may be able to identify when an error occurred and which customer’s order failed. However, this wouldn’t allow you to “restore” the transaction, and by that stage, your customer is no longer your customer.

As illustrated in the following diagram, introducing an SQS queue helps improve your ordering application. Using the queue isolates the processing logic into its own component and runs it in a separate process from the web application. This, in turn, allows the system to be more resilient to spikes in traffic, while allowing work to be performed only as fast as necessary in order to manage costs.


In addition, you now have a mechanism for persisting orders as messages (with the queue acting as a temporary database), and have moved the scope of your transaction with your database further down the stack. In the event of an application exception or transaction failure, this ensures that the order processing can be retried or redirected to the Amazon SQS Dead Letter Queue (DLQ) for re-processing at a later stage. (See the recent post, Using Amazon SQS Dead-Letter Queues to Control Message Failure, for more information on dead-letter queues.)

Scaling the order processing nodes

This change allows you now to scale the web application frontend independently from the processing nodes. The frontend application can continue to scale based on metrics such as CPU usage, or the number of requests hitting the load balancer. Processing nodes can scale based on the number of orders in the queue. Here is an example of scale-in and scale-out alarms that you would associate with the scaling policy.

Scale-out Alarm

aws cloudwatch put-metric-alarm --alarm-name AddCapacityToCustomerOrderQueue --metric-name ApproximateNumberOfMessagesVisible --namespace "AWS/SQS" \
--statistic Average --period 300 --threshold 3 --comparison-operator GreaterThanOrEqualToThreshold --dimensions Name=QueueName,Value=customer-orders \
--evaluation-periods 2 --alarm-actions <arn of the scale-out autoscaling policy>

Scale-in Alarm

aws cloudwatch put-metric-alarm --alarm-name RemoveCapacityFromCustomerOrderQueue --metric-name ApproximateNumberOfMessagesVisible --namespace "AWS/SQS" \
 --statistic Average --period 300 --threshold 1 --comparison-operator LessThanOrEqualToThreshold --dimensions Name=QueueName,Value=customer-orders \
 --evaluation-periods 2 --alarm-actions <arn of the scale-in autoscaling policy>

In the above example, use the ApproximateNumberOfMessagesVisible metric to discover the queue length and drive the scaling policy of the Auto Scaling group. Another useful metric is ApproximateAgeOfOldestMessage, when applications have time-sensitive messages and developers need to ensure that messages are processed within a specific time period.

Scaling the order processing implementation

On top of scaling at an infrastructure level using Auto Scaling, make sure to take advantage of the processing power of your Amazon EC2 instances by using as many of the available threads as possible. There are several ways to implement this. In this post, we build a Windows service that uses the BackgroundWorker class to process the messages from the queue.

Here’s a closer look at the implementation. In the first section of the consuming application, use a loop to continually poll the queue for new messages, and construct a ReceiveMessageRequest variable.

public static void PollQueue()
{
    while (_running)
    {
        Task<ReceiveMessageResponse> receiveMessageResponse;

        // Pull messages off the queue
        using (var sqs = new AmazonSQSClient())
        {
            const int maxMessages = 10;  // 1-10

            //Receiving a message
            var receiveMessageRequest = new ReceiveMessageRequest
            {
                // Get URL from Configuration
                QueueUrl = _queueUrl, 
                // The maximum number of messages to return. 
                // Fewer messages might be returned. 
                MaxNumberOfMessages = maxMessages, 
                // A list of attributes that need to be returned with message.
                AttributeNames = new List<string> { "All" },
                // Enable long polling. 
                // Time to wait for message to arrive on queue.
                WaitTimeSeconds = 5 
            };

            receiveMessageResponse = sqs.ReceiveMessageAsync(receiveMessageRequest);
        }

The WaitTimeSeconds property of the ReceiveMessageRequest specifies the duration (in seconds) that the call waits for a message to arrive in the queue before returning a response to the calling application. There are a few benefits to using long polling:

  • It reduces the number of empty responses by allowing SQS to wait until a message is available in the queue before sending a response.
  • It eliminates false empty responses by querying all (rather than a limited number) of the servers.
  • It returns messages as soon as any message becomes available.

For more information, see Amazon SQS Long Polling.

After you have returned messages from the queue, you can start to process them by looping through each message in the response and invoking a new BackgroundWorker thread.

// Process messages
if (receiveMessageResponse.Result.Messages != null)
{
    foreach (var message in receiveMessageResponse.Result.Messages)
    {
        Console.WriteLine("Received SQS message, starting worker thread");

        // Create background worker to process message
        BackgroundWorker worker = new BackgroundWorker();
        worker.DoWork += (obj, e) => ProcessMessage(message);
        worker.RunWorkerAsync();
    }
}
else
{
    Console.WriteLine("No messages on queue");
}

The event handler, ProcessMessage, is where you implement the business logic for processing orders. It is important to have a good understanding of how long a typical transaction takes, so you can set a message VisibilityTimeout that is long enough to complete your operation. If order processing takes longer than the specified timeout period, the message becomes visible on the queue again. Other nodes may then pick it up and process the same order twice, leading to unintended consequences.

Handling Duplicate Messages

In order to manage duplicate messages, seek to make your processing application idempotent. In mathematics, idempotent describes a function that produces the same result if it is applied to itself:

f(x) = f(f(x))

No matter how many times you process the same message, the end result is the same (definition from Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions, Hohpe and Wolf, 2004).

There are several strategies you could apply to achieve this:

  • Create messages that have inherent idempotent characteristics. That is, they are non-transactional in nature and are unique at a specified point in time. Rather than saying “place new order for Customer A,” which adds a duplicate order to the customer, use “place order <orderid> on <timestamp> for Customer A,” which creates a single order no matter how often it is persisted.
  • Deliver your messages via an Amazon SQS FIFO queue, which provides the benefits of message sequencing, but also mechanisms for content-based deduplication. You can deduplicate using the MessageDeduplicationId property on the SendMessage request or by enabling content-based deduplication on the queue, which generates a hash for MessageDeduplicationId, based on the content of the message, not the attributes.
var sendMessageRequest = new SendMessageRequest
{
    QueueUrl = _queueUrl,
    MessageBody = JsonConvert.SerializeObject(order),
    MessageGroupId = Guid.NewGuid().ToString("N"),
    MessageDeduplicationId = Guid.NewGuid().ToString("N")
};
  • If using SQS FIFO queues is not an option, keep a log of all message attributes processed for a specified period of time, as an alternative to message deduplication on the receiving end. Verifying the existence of the message in the log before processing it adds computational overhead to your processing. This can be minimized through low-latency persistence solutions such as Amazon DynamoDB; a rough sketch of this approach follows. Bear in mind that this solution is dependent on the successful, distributed transaction of the message and the message log.
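
As a rough illustration of this message-log approach, shown here as a Python (boto3) sketch for brevity even though this post’s examples are in C#, a DynamoDB conditional write lets you record and test a message ID in a single call. The table name and key schema are assumptions:

# Hypothetical sketch: deduplicate by conditionally recording each message ID.
import boto3
from botocore.exceptions import ClientError

table = boto3.resource('dynamodb').Table('processed-messages')  # assumed table name

def already_processed(message_id):
    """Record the message ID; return True if it has been seen before."""
    try:
        table.put_item(
            Item={'MessageId': message_id},                      # assumed key schema
            ConditionExpression='attribute_not_exists(MessageId)'
        )
        return False
    except ClientError as e:
        if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
            return True
        raise

A time-to-live attribute on the table is one way to bound the log to the “specified period of time” mentioned above.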

Handling exceptions

Because of the distributed nature of SQS queues, messages are not deleted automatically after they are received. Therefore, you must explicitly delete each message from the queue after processing it, using the message ReceiptHandle property (see the following code example).

However, if at any stage you have an exception, avoid handling it as you normally would. The intention is to make sure that the message ends back on the queue, so that you can gracefully deal with intermittent failures. Instead, log the exception to capture diagnostic information, and swallow it.

By not explicitly deleting the message from the queue, you can take advantage of the VisibilityTimeout behavior described earlier. Gracefully handle the message processing failure and make the unprocessed message available to other nodes to process.

In the event that subsequent retries fail, SQS automatically moves the message to the configured DLQ after the configured number of receives has been reached. You can further investigate why the order process failed. Most importantly, the order has not been lost, and your customer is still your customer.

private static void ProcessMessage(Message message)
{
    using (var sqs = new AmazonSQSClient())
    {
        try
        {
            Console.WriteLine("Processing message id: {0}", message.MessageId);

            // Implement messaging processing here
            // Ensure no downstream resource contention (parallel processing)
            // <your order processing logic in here…>
            Console.WriteLine("{0} Thread {1}: {2}", DateTime.Now.ToString("s"), Thread.CurrentThread.ManagedThreadId, message.MessageId);
            
            // Delete the message off the queue. 
            // Receipt handle is the identifier you must provide 
            // when deleting the message.
            var deleteRequest = new DeleteMessageRequest(_queueUrl, message.ReceiptHandle);
            sqs.DeleteMessageAsync(deleteRequest);
            Console.WriteLine("Processed message id: {0}", message.MessageId);

        }
        catch (Exception ex)
        {
            // Do nothing.
            // Swallow exception, message will return to the queue when 
            // visibility timeout has been exceeded.
            Console.WriteLine("Could not process message due to error. Exception: {0}", ex.Message);
        }
    }
}

Using SQS to adapt to changing business requirements

One of the benefits of introducing a message queue is that you can accommodate new business requirements without dramatically affecting your application.

If, for example, the business decided that all orders placed over $5000 are to be handled as a priority, you could introduce a new “priority order” queue. The way the orders are processed does not change. The only significant change to the processing application is to ensure that messages from the “priority order” queue are processed before the “standard order” queue.

The following diagram shows how this logic could be isolated in an “order dispatcher,” whose only purpose is to route order messages to the appropriate queue based on whether the order exceeds $5000. Nothing on the web application or the processing nodes changes other than the target queue to which the order is sent. The rates at which orders are processed can be achieved by modifying the poll rates and scalability settings that I have already discussed.

Extending the design pattern with Amazon SNS

Amazon SNS supports reliable publish-subscribe (pub-sub) scenarios and push notifications to known endpoints across a wide variety of protocols. It eliminates the need to periodically check or poll for new information and updates. SNS supports:

  • Reliable storage of messages for immediate or delayed processing
  • Publish / subscribe – direct, broadcast, targeted “push” messaging
  • Multiple subscriber protocols: Amazon SQS, HTTP, HTTPS, email, SMS, mobile push, AWS Lambda

With these capabilities, you can provide parallel asynchronous processing of orders in the system and extend it to support any number of different business use cases without affecting the production environment. This is commonly referred to as a “fanout” scenario.

Rather than your web application pushing orders to a queue for processing, send a notification via SNS. The SNS messages are sent to a topic and then replicated and pushed to multiple SQS queues and Lambda functions for processing.
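
The change on the producer side is small. Here is a rough sketch of publishing an order to a topic, again in Python (boto3) for brevity; the topic ARN is a placeholder:

# Hypothetical sketch: publish an order to an SNS topic for fan-out.
import json
import boto3

sns = boto3.client('sns')

def publish_order(order):
    # Every subscriber (SQS queues, Lambda functions, and so on) receives a copy.
    sns.publish(
        TopicArn='arn:aws:sns:us-east-1:123456789012:customer-orders',  # placeholder ARN
        Message=json.dumps(order)
    )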

As the diagram above shows, you have the development team consuming “live” data as they work on the next version of the processing application, or potentially using the messages to troubleshoot issues in production.

Marketing is consuming all order information, via a Lambda function that has subscribed to the SNS topic, inserting the records into an Amazon Redshift warehouse for analysis.

All of this, of course, is happening without affecting your order processing application.

Summary

While I haven’t dived deep into the specifics of each service, I have discussed how these services can be applied at an architectural level to build loosely coupled systems that facilitate multiple business use cases. I’ve also shown you how to use infrastructure and application-level scaling techniques, so you can get the most out of your EC2 instances.

One of the many benefits of using these managed services is how quickly and easily you can implement powerful messaging capabilities in your systems, and lower the capital and operational costs of managing your own messaging middleware.

Using Amazon SQS and Amazon SNS together can provide you with a powerful mechanism for decoupling application components. This should be part of design considerations as you architect for the cloud.

For more information, see the Amazon SQS Developer Guide and Amazon SNS Developer Guide. You’ll find tutorials on all the concepts covered in this post, and more. To get started, visit the AWS Management Console or use the SDK of your choice.

Happy messaging!

Sync vs. Backup vs. Storage

Post Syndicated from Yev original https://www.backblaze.com/blog/sync-vs-backup-vs-storage/

Cloud Sync vs. Cloud Backup vs. Cloud Storage

Google recently announced Backup and Sync, a new feature for Google Drive that allows users to select folders on their computer that they want to back up to their Google Drive account (note: these files count against your Google Drive storage limit). Whenever new backup services are announced, we get a lot of questions, so I thought we should take a minute to review the differences among cloud-based services.

What is the Cloud? Sync Vs Backup Vs Storage

There is still a lot of confusion in the space about what exactly the “cloud” is and how different services interact with it. When folks use a syncing and sharing service like Dropbox, Box, Google Drive, OneDrive or any of the others, they often assume those are acting as a cloud backup solution as well. Adding to the confusion, cloud storage services are often the backend for backup and sync services as well as standalone services. To help sort this out, we’ll define some of the terms below as they apply to a traditional computer set-up with a bunch of apps and data.

Cloud Sync (ex. Dropbox, iCloud Drive, OneDrive, Box, Google Drive) – these services sync folders on your computer to folders on other machines or to the cloud, allowing users to work from a folder or directory across devices. Typically these services have tiered pricing, meaning you pay for the amount of data you store with the service. Some of these services even have a rollback feature in case of data loss, though of course only files that are in the synced folders are available to be recovered.

Cloud Backup (ex. Backblaze Cloud Backup, Mozy, Carbonite) – these services work in the background automatically. The user does not need to take any action like setting up specific folders. Backup services typically back up any new or changed data on your computer to another location. Before the cloud took off, that location was primarily a CD or an external hard drive – but as cloud storage became more readily available it became the most popular storage medium. Typically these services have fixed pricing, and if there is a system crash or data loss, all backed up data is available for restore. In addition, these services have rollback features in case there is data loss / accidental file deletion.

Cloud Storage (ex. Backblaze B2, Amazon S3, Microsoft Azure) – these services are where many online backup and syncing and sharing services store data. Cloud storage providers typically serve as the endpoint for data storage. These services typically provide APIs, CLIs, and access points for individuals and developers to tie in their cloud storage offerings directly. These services are priced “per GB” meaning you pay for the amount of storage that you use. Since these services are designed for high-availability and durability, data can live solely on these services – though we still recommend having multiple copies of your data, just in case.

What Should You Use?

Backblaze strongly believes in a 3-2-1 Backup Strategy. A 3-2-1 strategy means having at least 3 total copies of your data, 2 of which are local but on different mediums (e.g. an external hard drive in addition to your computer’s local drive), and at least 1 copy offsite. The best setup is data on your computer, a copy on a hard drive that lives somewhere not inside your computer, and another copy with a cloud backup provider. Backblaze Cloud Backup is a great complement to other services, like Time Machine, Dropbox, and even the free tiers of cloud storage services.

What is The Difference Between Cloud Sync and Backup?

Let’s take a look at some sync setups that we see fairly frequently.

Example 1) Users have one folder on their computer that is designated for Dropbox, Google Drive, OneDrive, or one of the other syncing/sharing services. Users save or place data into those directories when they want them to appear on other devices. Often these users are using the free-tier of those syncing and sharing services and only have a few GB of data uploaded in them.

Example 2) Users are paying for extended storage for Dropbox, Google Drive, OneDrive, etc… and use those folders as the “Documents” folder – essentially working out of those directories. Files in that folder are available across devices, however, files outside of that folder (e.g. living on the computer’s desktop or anywhere else) are not synced or stored by the service.

What both examples are missing, however, is a backup of the photos, movies, videos, and the rest of the data on the computer. That’s where cloud backup providers excel: they automatically back up user data with little or no set-up, and no need for dragging and dropping files. Backblaze actually scans your hard drive to find all the data, regardless of where it might be hiding. The result is that all the user’s data is kept in the Backblaze cloud, and the portion of the data that is synced is also kept in that provider’s cloud, giving the user another layer of redundancy. Best of all, Backblaze will actually back up your Dropbox, iCloud Drive, Google Drive, and OneDrive folders.

Data Recovery

The most important feature to think about is how easy it is to get your data back from each of these services. With sync and share services, retrieving a lot of data, especially if you are in a high-data tier, can be cumbersome and take a while. Generally, the sync and share services only allow customers to download files over the Internet. If you are trying to download more than a couple of gigabytes of data, the process can take time and can be fraught with errors.

With cloud storage services, you can usually only retrieve data over the Internet as well, and you pay for both the storage and the egress of the data, so retrieving a large amount of data can be both expensive and time consuming.

Cloud backup services will enable you to download files over the internet too and can also suffer from long download times. At Backblaze we never want our customers to feel like we’re holding their data hostage, which is why we have a lot of restore options, including our Restore Return Refund policy, which allows people to restore their data via a USB Hard Drive, and then return that drive to us for a refund. Cloud sync providers do not provide this capability.

One popular data recovery use case we’ve seen when a person has a lot of data to restore is to download just the files that are needed immediately, and then order a USB Hard Drive restore for the remaining files that are not as time sensitive. The user gets all their files back in a few days, and their network is spared the download charges.

The bottom line is that all of these services have merit for different use-cases. Have questions about which is best for you? Sound off in the comments below!

The post Sync vs. Backup vs. Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

New – Managed Device Authentication for Amazon WorkSpaces

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-managed-device-authentication-for-amazon-workspaces/

Amazon WorkSpaces allows you to access a virtual desktop in the cloud from the web and from a wide variety of desktop and mobile devices. This flexibility makes WorkSpaces ideal for environments where users have the ability to use their existing devices (often known as BYOD, or Bring Your Own Device). In these environments, organizations sometimes need the ability to manage the devices which can access WorkSpaces. For example, they may have to regulate access based on the client device operating system, version, or patch level in order to help meet compliance or security policy requirements.

Managed Device Authentication
Today we are launching device authentication for WorkSpaces. You can now use digital certificates to manage client access from Apple OSX and Microsoft Windows. You can also choose to allow or block access from iOS, Android, Chrome OS, web, and zero client devices. You can implement policies to control which device types you want to allow and which ones you want to block, with control all the way down to the patch level. Access policies are set for each WorkSpaces directory. After you have set the policies, requests to connect to WorkSpaces from a client device are assessed and either blocked or allowed. In order to make use of this feature, you will need to distribute certificates to your client devices using Microsoft System Center Configuration Manager or a mobile device management (MDM) tool.

Here’s how you set your access control options from the WorkSpaces Console:

Here’s what happens if a client is not authorized to connect:

 

Available Today
This feature is now available in all Regions where WorkSpaces is available.

Jeff;

 

BPI Breaks Record After Sending 310 Million Google Takedowns

Post Syndicated from Andy original https://torrentfreak.com/bpi-breaks-record-after-sending-310-million-google-takedowns-170619/

A little over a year ago during March 2016, music industry group BPI reached an important milestone. After years of sending takedown notices to Google, the group burst through the 200 million URL barrier.

The fact that it took BPI several years to reach its 200 million milestone made the surpassing of the quarter billion milestone a few months later even more remarkable. In October 2016, the group sent its 250 millionth takedown to Google, a figure that nearly doubled when accounting for notices sent to Microsoft’s Bing.

But despite the volumes, the battle hadn’t been won, let alone the war. The BPI’s takedown machine continued to run at a remarkable rate, churning out millions more notices per week.

As a result, yet another new milestone was reached this month when the BPI smashed through the 300 million URL barrier. Then, days later, a further 10 million were added, with the latter couple of million added during the time it took to put this piece together.

BPI takedown notices, as reported by Google

While demanding that Google places greater emphasis on its de-ranking of ‘pirate’ sites, the BPI has called again and again for a “notice and stay down” regime, to ensure that content taken down by the search engine doesn’t simply reappear under a new URL. It’s a position BPI maintains today.

“The battle would be a whole lot easier if intermediaries played fair,” a BPI spokesperson informs TF.

“They need to take more proactive responsibility to reduce infringing content that appears on their platform, and, where we expressly notify infringing content to them, to ensure that they do not only take it down, but also keep it down.”

The long-standing suggestion is that the volume of takedown notices sent would reduce if a “take down, stay down” regime was implemented. The BPI says it’s difficult to present a precise figure but infringing content has a tendency to reappear, both in search engines and on hosting sites.

“Google rejects repeat notices for the same URL. But illegal content reappears as it is re-indexed by Google. As to the sites that actually host the content, the vast majority of notices sent to them could be avoided if they implemented take-down & stay-down,” BPI says.

The fact that the BPI has added 60 million more takedowns since the quarter-billion milestone a few months ago is quite remarkable, particularly since there appears to be little slowdown from month to month. However, the numbers have grown so huge that 310 million now feels a lot like 250 million, with just a few added on top for good measure.

That an extra 60 million takedowns can almost be dismissed as a handful is an indication of just how massive the issue is online. While pirates always welcome an abundance of links to juicy content, it’s no surprise that groups like the BPI are seeking more comprehensive and sustainable solutions.

Previously, it was hoped that the Digital Economy Bill would provide some relief, hopefully via government intervention and the imposition of a search engine Code of Practice. In the event, however, all pressure on search engines was removed from the legislation after a separate voluntary agreement was reached.

All parties agreed that the voluntary code should come into effect two weeks ago on June 1 so it seems likely that some effects should be noticeable in the near future. But the BPI says it’s still early days and there’s more work to be done.

“BPI has been working productively with search engines since the voluntary code was agreed to understand how search engines approach the problem, but also what changes can and have been made and how results can be improved,” the group explains.

“The first stage is to benchmark where we are and to assess the impact of the changes search engines have made so far. This will hopefully be completed soon, then we will have better information of the current picture and from that we hope to work together to continue to improve search for rights owners and consumers.”

With more takedown notices in the pipeline not yet publicly reported by Google, the BPI informs TF that it has now notified the search giant of 315 million links to illegal content.

“That’s an astonishing number. More than 1 in 10 of the entire world’s notices to Google come from BPI. This year alone, one in every three notices sent to Google from BPI is for independent record label repertoire,” BPI concludes.

While it’s clear that groups like BPI have developed systems to cope with the huge numbers of takedown notices required in today’s environment, it’s clear that few rightsholders are happy with the status quo. With that in mind, the fight will continue, until search engines are forced into compromise. Considering the implications, that could only appear on a very distant horizon.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and ANONYMOUS VPN services.

Mira, tiny robot of joyful delight

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/mira-robot-alonso-martinez/

The staff of Pi Towers are currently melting into puddles while making ‘Aaaawwwwwww’ noises as Mira, the adorable little Pi-controlled robot made by Pixar 3D artist Alonso Martinez, steals their hearts.

Mira the robot playing peek-a-boo

If you want to get updates on Mira’s progress, sign up for the mailing list! http://eepurl.com/bteigD Mira is a desk companion that makes your life better one smile at a time. This project explores human robot interactivity and emotional intelligence. Currently Mira uses face tracking to interact with the users and loves playing the game “peek-a-boo”.

Introducing Mira

Honestly, I can’t type words – I am but a puddle! If I could type at all, I would only produce a stream of affectionate fragments. Imagine walking into a room full of kittens. What you would sound like is what I’d type.

No! I can do this. I’m a professional. I write for a living! I can…

SHE BLINKS OHMYAAAARGH!!!


Weebl & Bob meets South Park’s Ike Broflovski in an adorable 3D-printed bundle of ‘Aaawwwww’

Introducing Mira (I promise I can do this)

Right. I’ve had a nap and a drink. I’ve composed myself. I am up for this challenge. As long as I don’t look directly at her, I’ll be fine!

Here I go.

As one of the many über-talented 3D artists at Pixar, Alonso Martinez knows a thing or two about bringing adorable-looking characters to life on screen. However, his work left him wondering:

In movies you see really amazing things happening but you actually can’t interact with them – what would it be like if you could interact with characters?

So with the help of his friends Aaron Nathan and Vijay Sundaram, Alonso set out to bring the concept of animation to the physical world by building a “character” that reacts to her environment. His experiments with robotics started with Gertie, a ball-like robot reminiscent of his time spent animating bouncing balls when he was learning his trade. From there, he moved on to Mira.


Many, many of the views of this Tested YouTube video have come from me. So many.

Mira swivels to follow a person’s face, plays games such as peekaboo, shows surprise when you finger-shoot her, and giggles when you give her a kiss.

Mira’s inner workings

To get Mira to turn her head in three dimensions, Alonso took inspiration from the Microsoft Sidewinder Pro joystick he had as a kid. He purchased one on eBay, took it apart to understand how it works, and replicated its mechanism for Mira’s Raspberry Pi-powered innards.


Alonso used the smallest components he could find so that they would fit inside Mira’s tiny body.

Mira’s axis of 3D-printed parts moves via tiny Power HD DSM44 servos, while a camera and OpenCV handle face-tracking, and a single NeoPixel provides a range of colours to indicate her emotions. As for the blinking eyes? Two OLED screens boasting acrylic domes fit within the few millimeters between all the other moving parts.

More on Mira, including her history and how she works, can be found in this wonderful video released by Tested this week.

Pixar Artist’s 3D-Printed Animated Robots!

We’re gushing with grins and delight at the sight of these adorable animated robots created by artist Alonso Martinez. Sean chats with Alonso to learn how he designed and engineered his family of robots, using processes like 3D printing, mold-making, and silicone casting. They’re amazing!

You can also sign up for Alonso’s newsletter here to stay up-to-date about this little robot. Hopefully one of these newsletters will explain how to buy or build your own Mira, as I for one am desperate to see her adorable little face on my desk every day for the rest of my life.

The post Mira, tiny robot of joyful delight appeared first on Raspberry Pi.

Notes on open-sourcing abandoned code

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/06/notes-on-open-sourcing-abandoned-code.html

Some people want a law that compels companies to release their source code for “abandoned software”, in the name of cybersecurity, so that customers who bought it can continue to patch bugs long after the seller has stopped supporting the product. This is a bad policy, for a number of reasons.

Code is Speech

First of all, code is speech. That was the argument why Phil Zimmerman could print the source code to PGP in a book, ship it overseas, and then have somebody scan the code back into a computer. Compelled speech is a violation of free speech. That was one of the arguments in the Apple vs. FBI case, where the FBI demanded that Apple write code for them, compelling speech.

Compelling the opening of previously closed source is compelled speech.

There might still be legal arguments that get away with it. After all, the state already compels some speech, such as warning labels, where it serves a narrow, legitimate government interest. So the courts may allow it. Also, like many free-speech issues (e.g. the legality of hate speech), people may legitimately disagree with the courts about what “is” legal and what “should” be legal.

But here’s the thing. What rights “should” be protected changes depending on what side you are on. Whether something deserves the protection of “free speech” depends upon whether the speaker is “us” or the speaker is “them”. If it’s “them”, then you’ll find all sorts of reasons why their speech is a special case, and why it doesn’t deserve protection.

That’s what’s happening here. The legitimate government purpose of “product safety” looms large, while “code is speech” doesn’t, because they hate closed-source code, and hate Microsoft in particular. The open-source community has been strong on “code is speech” when it applies to them, but weak when it applies to closed source.

Define abandoned

What, precisely, does ‘abandoned’ mean? Consider Windows 3.1. Microsoft hasn’t sold it for decades. Yet, it’s not precisely abandoned either, because they still sell modern versions of Windows. Being forced to show even 30-year-old source code would give competitors a significant advantage in creating Windows-compatible code like WINE.

When code is truly abandoned, such as when the vendor has gone out of business, chances are good they don’t have the original source code anyway. Thus, in order for this policy to have any effect, you’d have to force vendors to give a third-party escrow service a copy of their code whenever they release a new version of their product.

All the source code

And that is surprisingly hard and costly. Most companies do not precisely know what source code their products are based upon. Yes, technically, all the code is in that ZIP file they gave to the escrow service, but it doesn’t build. Essential build steps are missing, so that source code won’t compile. It’s like the dependency hell that many open-source products experience, such as downloading and installing two different versions of Python at different times during the build. Except, it’s a hundred times worse.

Oftentimes, building closed-source software itself requires an obscure version of a closed-source tool that has itself been abandoned by its original vendor. You often can’t even define what the source code is. For example, engine control units (ECUs) are written as Matlab code that compiles down to C, which is then integrated with other C code, all of which is then compiled (using a special compiler) into the final binary. Unless you have all these closed-source products, some of which are no longer sold, the source code to the ECU will not help you patch bugs.

For small startups running fast, such as off Kickstarter, forcing them to escrow code that actually builds would force upon them an undue burden, harming innovation.

Binary patch and reversing

Then there is the issue of why you need the source code in the first place. Here’s the deal with binary exploits like buffer-overflows: if you know enough to exploit it, you know enough to patch it. Just append to the end of the program some binary code that verifies the input, then replace the spot where the vulnerability happens with a jump instruction to that new code.
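To make the idea concrete, here is a toy Python sketch (my own illustration, not anything from a real patching tool). It pretends file offsets and runtime addresses are the same thing and ignores the instructions it overwrites, both of which a real binary patch would have to handle:

import struct

def patch_with_jump(blob: bytes, vuln_offset: int, fix_code: bytes) -> bytes:
    """Append fix_code to the end of the image and overwrite the vulnerable
    spot with a 5-byte x86 relative JMP (opcode 0xE9) into the appended code."""
    patched = bytearray(blob)
    target = len(patched)             # the fix lives at the end of the file
    patched += fix_code
    rel = target - (vuln_offset + 5)  # rel32 is measured from the next instruction
    patched[vuln_offset:vuln_offset + 5] = b"\xe9" + struct.pack("<i", rel)
    return bytes(patched)

# Hypothetical usage; the offset and fix bytes are placeholders.
# patched = patch_with_jump(open("victim.exe", "rb").read(), 0x1234, b"\x90\x90\xc3")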

I know this is possible and fairly trivial because I’ve done it myself. Indeed, one of the reasons Microsoft requires signed kernel components is specifically because they got tired of me patching the live kernel this way (and they almost sued me for reverse engineering their code in violation of their EULA).

Given the aforementioned difficulties in building software, this would be the easier option for third parties trying to fix bugs. The only reason closed-source companies don’t do this already is because they need to fix their products permanently anyway, which involves checking in the change into their source control systems and rebuilding.

Conclusion

So what we see here is that there is no compelling benefit to forcing vendors to release code for “abandoned” products, while at the same time, there are significant costs involved, not the least of which is a violation of the principle that “code is speech”.

It doesn’t exist as a serious proposal. It only exists as a way to support open-source advocacy and security advocacy. Both would gladly stomp on your rights and drive up costs in order to achieve their higher moral goal.


Bonus: so let’s say you decide that “Windows XP” has been abandoned, which is exactly the intent of proponents. You think what would happen is that we (the open-source community) would then be able to continue to support WinXP and patch bugs.

But what we’d see instead is a lot more copies of WinXP floating around, with vulnerabilities, as people decided to use it instead of paying hundreds of dollars for a new Windows 10 license.

Indeed, part of the reason for Microsoft abandoning WinXP is that it’s riddled with flaws that can’t practically be fixed, whereas the new features of Win10 fundamentally fix them. Getting rid of SMBv1 is just one of many examples.

Latency Distribution Graph in AWS X-Ray

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/latency-distribution-graph-in-aws-x-ray/

We’re continuing to iterate on the AWS X-Ray service based on customer feedback and today we’re excited to release a set of tools to help you quickly dive deep on latencies in your applications. Visual Node and Edge latency distribution graphs are shown in a handy new “Service Details” side bar in your X-Ray Service Map.

The X-Ray service graph gives you a visual representation of services and their interactions over a period of time that you select. The nodes represent services and the edges between the nodes represent calls between the services. The nodes and edges each have a set of statistics associated with them. While the visualizations provided in the service map are useful for estimating the average latency in an application, they don’t help you dive deep on specific issues. Most of the time, issues occur at the statistical outliers. To address this, X-Ray computes histograms like the one above to help you solve those 99th-percentile bugs.

To see a Response Distribution for a Node just click on it in the service graph. You can also click on the edges between the nodes to see the Response Distribution from the viewpoint of the calling service.

The team had a few interesting problems to solve while building out this feature and I wanted to share a bit of that with you now! Given the large number of traces an app can produce, it’s not a great idea (for your browser) to plot every single trace client-side. Instead, most plotting libraries, when dealing with many points, use approximations and bucketing to get a network- and performance-friendly histogram. If you’ve used monitoring software in the past, you’ve probably seen that as you zoom in on the data you get higher fidelity. The interesting thing about the latencies coming in from X-Ray is that they vary by several orders of magnitude.

If the latencies were all distributed strictly between 0 and 1 second, you could easily just create 10 buckets of 100 milliseconds each. If your apps are anything like mine, there’s a lot of interesting stuff happening in the outliers, so it’s beneficial to have more fidelity at 1% and 99% than it is at 50%. The problem with fixed bucket sizes is that they’re not necessarily giving you an accurate summary of the data. So X-Ray, for now, uses dynamic bucket sizing based on the t-digest algorithm by Ted Dunning and Otmar Ertl. One of the distinct advantages of this algorithm over other approximation algorithms is its accuracy and precision at the extremes (where most errors typically are).
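As a rough illustration of why the tails matter, here is my own sketch using the third-party tdigest package from PyPI (not X-Ray’s internal implementation) and synthetic latencies:

import random
from tdigest import TDigest  # pip install tdigest

# Synthetic latencies spanning several orders of magnitude, like real traces do.
latencies = [random.lognormvariate(0, 1.5) for _ in range(100_000)]

digest = TDigest()
digest.batch_update(latencies)

# The digest stays small, yet the extreme percentiles, where the bugs hide,
# are estimated with good precision.
for p in (50, 99, 99.9):
    print(f"p{p}: {digest.percentile(p):.3f}s")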

An additional advantage of X-Ray over other monitoring software is the ability to measure two perspectives of latency simultaneously. Developers almost always have some view into the server side latency from their application logs but with X-Ray you can examine latency from the view of each of the clients, services, and microservices that you’re interacting with. You can even dive deeper by adding additional restrictions and queries on your selection. You can identify the specific users and clients that are having issues at that 99th percentile.

This info has already been available in API calls to GetServiceGraph as ResponseTimeHistogram but now we’re exposing it in the console as well to make it easier for customers to consume. For more information check out the documentation here.
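If you want the raw numbers rather than the console view, a short boto3 call pulls the same data; this is just a sketch, so check the GetServiceGraph reference for the exact response shape:

import datetime
import boto3

xray = boto3.client("xray")

end = datetime.datetime.utcnow()
start = end - datetime.timedelta(hours=1)

# Fetch the service graph for the last hour and peek at the histogram data.
graph = xray.get_service_graph(StartTime=start, EndTime=end)
for service in graph.get("Services", []):
    buckets = service.get("ResponseTimeHistogram", [])
    print(service.get("Name"), "-", len(buckets), "histogram entries")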

Randall

Teaching tech

Post Syndicated from Eevee original https://eev.ee/blog/2017/06/10/teaching-tech/

A sponsored post from Manishearth:

I would kinda like to hear about any thoughts you have on technical teaching or technical writing. Pedagogy is something I care about. But I don’t know how much you do, so feel free to ignore this suggestion 🙂

Good news: I care enough that I’m trying to write a sorta-kinda-teaching book!

Ironically, one of the biggest problems I’ve had with writing the introduction to that book is that I keep accidentally rambling on for pages about problems and difficulties with teaching technical subjects. So maybe this is a good chance to get it out of my system.

Phaser

I recently tried out a new thing. It was Phaser, but this isn’t a dig on them in particular, just a convenient example fresh in my mind. If anything, they’re better than most.

As you can see from Phaser’s website, it appears to have tons of documentation. Two of the six headings are “LEARN” and “EXAMPLES”, which seems very promising. And indeed, Phaser offers:

  • Several getting-started walkthroughs
  • Possibly hundreds of examples
  • A news feed that regularly links to third-party tutorials
  • Thorough API docs

Perfect. Beautiful. Surely, a dream.

Well, almost.

The examples are all microscopic, usually focused around a single tiny feature — many of them could be explained just as well with one line of code. There are a few example games, but they’re short aimless demos. None of them are complete games, and there’s no showcase either. Games sometimes pop up in the news feed, but most of them don’t include source code, so they’re not useful for learning from.

Likewise, the API docs are just API docs, leading to the sorts of problems you might imagine. For example, in a few places there’s a mention of a preUpdate stage that (naturally) happens before update. You might rightfully wonder what kinds of things happen in preUpdate — and more importantly, what should you put there, and why?

Let’s check the API docs for Phaser.Group.preUpdate:

The core preUpdate – as called by World.

Okay, that didn’t help too much, but let’s check what Phaser.World has to say:

The core preUpdate – as called by World.

Ah. Hm. It turns out World is a subclass of Group and inherits this method — and thus its unaltered docstring — from Group.

I did eventually find some brief docs attached to Phaser.Stage (but only by grepping the source code). It mentions what the framework uses preUpdate for, but not why, and not when I might want to use it too.


The trouble here is that there’s no narrative documentation — nothing explaining how the library is put together and how I’m supposed to use it. I get handed some brief primers and a massive reference, but nothing in between. It’s like buying an O’Reilly book and finding out it only has one chapter followed by a 500-page glossary.

API docs are great if you know specifically what you’re looking for, but they don’t explain the best way to approach higher-level problems, and they don’t offer much guidance on how to mesh nicely with the design of a framework or big library. Phaser does a decent chunk of stuff for you, off in the background somewhere, so it gives the strong impression that it expects you to build around it in a particular way… but it never tells you what that way is.

Tutorials

Ah, but this is what tutorials are for, right?

I confess I recoil whenever I hear the word “tutorial”. It conjures an image of a uniquely useless sort of post, which goes something like this:

  1. Look at this cool thing I made! I’ll teach you how to do it too.

  2. Press all of these buttons in this order. Here’s a screenshot, which looks nothing like what you have, because I’ve customized the hell out of everything.

  3. You did it!

The author is often less than forthcoming about why they made any of the decisions they did, where you might want to try something else, or what might go wrong (and how to fix it).

And this is to be expected! Writing out any of that stuff requires far more extensive knowledge than you need just to do the thing in the first place, and you need to do a good bit of introspection to sort out something coherent to say.

In other words, teaching is hard. It’s a skill, and it takes practice, and most people blogging are not experts at it. Including me!


With Phaser, I noticed that several of the third-party tutorials I tried to look at were 404s — sometimes less than a year after they were linked on the site. Pretty major downside to relying on the community for teaching resources.

But I also notice that… um…

Okay, look. I really am not trying to rag on this author. I’m not. They tried to share their knowledge with the world, and that’s a good thing, something worthy of praise. I’m glad they did it! I hope it helps someone.

But for the sake of example, here is the most recent entry in Phaser’s list of community tutorials. I have to link it, because it’s such a perfect example. Consider:

  • The post itself is a bulleted list of explanation followed by a single contiguous 250 lines of source code. (Not that there’s anything wrong with bulleted lists, mind you.) That code contains zero comments and zero blank lines.

  • This is only part two in what I think is a series aimed at beginners, yet the title and much of the prose focus on object pooling, a performance hack that’s easy to add later and that’s almost certainly unnecessary for a game this simple. There is no explanation of why this is done; the prose only says you’ll understand why it’s critical once you add a lot more game objects.

  • It turns out I only have two things to say here so I don’t know why I made this a bulleted list.

In short, it’s not really a guided explanation; it’s “look what I did”.

And that’s fine, and it can still be interesting. I’m not sure English is even this person’s first language, so I’m hardly going to criticize them for not writing a novel about platforming.

The trouble is that I doubt a beginner would walk away from this feeling very enlightened. They might be closer to having the game they wanted, so there’s still value in it, but it feels closer to having someone else do it for them. And an awful lot of tutorials I’ve seen — particularly of the “post on some blog” form (which I’m aware is the genre of thing I’m writing right now) — look similar.

This isn’t some huge social problem; it’s just people writing on their blog and contributing to the corpus of written knowledge. It does become a bit stickier when a large project relies on these community tutorials as its main set of teaching aids.


Again, I’m not ragging on Phaser here. I had a slightly frustrating experience with it, coming in knowing what I wanted but unable to find a description of the semantics anywhere, but I do sympathize. Teaching is hard, writing documentation is hard, and programmers would usually rather program than do either of those things. For free projects that run on volunteer work, and in an industry where anything other than programming is a little undervalued, getting good docs written can be tricky.

(Then again, Phaser sells books and plugins, so maybe they could hire a documentation writer. Or maybe the whole point is for you to buy the books?)

Some pretty good docs

Python has pretty good documentation. It introduces the language with a tutorial, then documents everything else in both a library and language reference.

This sounds an awful lot like Phaser’s setup, but there’s some considerable depth in the Python docs. The tutorial is highly narrative and walks through quite a few corners of the language, stopping to mention common pitfalls and possible use cases. I clicked an arbitrary heading and found a pleasant, informative read that somehow avoids being bewilderingly dense.

The API docs also take on a narrative tone — even something as humble as the collections module offers numerous examples, use cases, patterns, recipes, and hints of interesting ways you might extend the existing types.

I’m being a little vague and hand-wavey here, but it’s hard to give specific examples without just quoting two pages of Python documentation. Hopefully you can see right away what I mean if you just take a look at them. They’re good docs, Bront.

I’ve likewise always enjoyed the SQLAlchemy documentation, which follows much the same structure as the main Python documentation. SQLAlchemy is a database abstraction layer plus ORM, so it can do a lot of subtly intertwined stuff, and the complexity of the docs reflects this. Figuring out how to do very advanced things correctly, in particular, can be challenging. But for the most part it does a very thorough job of introducing you to a large library with a particular philosophy and how to best work alongside it.

I softly contrast this with, say, the Perl documentation.

It’s gotten better since I first learned Perl, but Perl’s docs are still a bit of a strange beast. They exist as a flat collection of manpage-like documents with terse names like perlootut. The documentation is certainly thorough, but much of it has a strange… allocation of detail.

For example, perllol — the explanation of how to make a list of lists, which somehow merits its own separate documentation — offers no fewer than nine similar variations of the same code for reading a file into a nested list of the words on each line. Where Python offers examples for a variety of different problems, Perl shows you a lot of subtly different ways to do the same basic thing.

A similar problem is that Perl’s docs sometimes offer far too much context; consider the references tutorial, which starts by explaining that references are a powerful “new” feature in Perl 5 (first released in 1994). It then explains why you might want to nest data structures… from a Perl 4 perspective, thus explaining why Perl 5 is so much better.

Some stuff I’ve tried

I don’t claim to be a great teacher. I like to talk about stuff I find interesting, and I try to do it in ways that are accessible to people who aren’t lugging around the mountain of context I already have. This being just some blog, it’s hard to tell how well that works, but I do my best.

I also know that I learn best when I can understand what’s going on, rather than just seeing surface-level cause and effect. Of course, with complex subjects, it’s hard to develop an understanding before you’ve seen the cause and effect a few times, so there’s a balancing act between showing examples and trying to provide an explanation. Too many concrete examples feel like rote memorization; too much abstract theory feels disconnected from anything tangible.

The attempt I’m most pleased with is probably my post on Perlin noise. It covers a fairly specific subject, which made it much easier. It builds up one step at a time from scratch, with visualizations at every point. It offers some interpretations of what’s going on. It clearly explains some possible extensions to the idea, but distinguishes those from the core concept.

It is a little math-heavy, I grant you, but that was hard to avoid with a fundamentally mathematical topic. I had to be economical with the background information, so I let the math be a little dense in places.

But the best part about it by far is that I learned a lot about Perlin noise in the process of writing it. In several places I realized I couldn’t explain what was going on in a satisfying way, so I had to dig deeper into it before I could write about it. Perhaps there’s a good guideline hidden in there: don’t try to teach as much as you know?

I’m also fairly happy with my series on making Doom maps, though they meander into tangents a little more often. It’s hard to talk about something like Doom without meandering, since it’s a convoluted ecosystem that’s grown organically over the course of 24 years and has at least three ways of doing anything.


And finally there’s the book I’m trying to write, which is sort of about game development.

One of my biggest grievances with game development teaching in particular is how often it leaves out important touches. Very few guides will tell you how to make a title screen or menu, how to handle death, how to get a Mario-style variable jump height. They’ll show you how to build a clearly unfinished demo game, then leave you to your own devices.

I realized that the only reliable way to show how to build a game is to build a real game, then write about it. So the book is laid out as a narrative of how I wrote my first few games, complete with stumbling blocks and dead ends and tiny bits of polish.

I have no idea how well this will work, or whether recapping my own mistakes will be interesting or distracting for a beginner, but it ought to be an interesting experiment.

An Open Letter To Microsoft: A 64-bit OS is Better Than a 32-bit OS

Post Syndicated from Brian Wilson original https://www.backblaze.com/blog/64-bit-os-vs-32-bit-os/

Windows 32 Bit vs. 64 Bit

Editor’s Note: Our co-founder & CTO, Brian Wilson, was working on a few minor performance enhancements and bug fixes (Inherit Backup State is a lot faster now). We got a version of this note from him late one night and thought it was worth sharing.

There are a few absolutes in life – death, taxes, and that a 64-bit OS is better than a 32-bit OS. Moving over to a 64-bit OS allows your laptop to run BOTH the old compatible 32-bit processes and also the new 64-bit processes. In other words, there is zero downside (and there are gigantic upsides).

32-Bit vs. 64-Bit

The main gigantic upside of a 64-bit process is the ability to support more than 2 GBytes of RAM (pedantic people will say “4 GBytes”… but there are technicalities I don’t want to get into here). Since only 1.6% of Backblaze customers have 2 GBytes or less of RAM, the other 98.4% desperately need 64-bit support, period, end of story. And remember, there is no downside.
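If you’re not sure what you’re running, a couple of lines of Python will tell you; this is a generic check, nothing to do with the Backblaze client itself:

import platform
import struct
import sys

print("interpreter pointer size:", struct.calcsize("P") * 8, "bit")
print("can address more than 4 GB:", sys.maxsize > 2**32)
print("machine architecture:", platform.machine())  # e.g. AMD64 on 64-bit Windows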

Because there is zero downside, Apple shipped 64-bit OS support the first time it could. Apple did not give customers the option of “turning off all 64-bit programs.” Apple first shipped 64-bit support in OS X 10.6 Snow Leopard in 2009 (which also had 32-bit support, so there was zero downside to the decision).

This was so successful that Apple shipped all future Operating Systems configured to support both 64-bit and 32-bit processes. All of them. Customers no longer had an option to turn off 64-bit support.

As a result, less than 2/10ths of 1% of Backblaze Mac customers are running a computer that is so old that it can only run 32-bit programs. Despite those microscopic numbers we still loyally support this segment of our customers by providing a 32-bit only version of Backblaze’s backup client.

Apple vs. Microsoft

But let’s contrast the Apple approach with that of Microsoft. Microsoft offers a 64-bit OS in Windows 10 that runs all 64-bit and all 32-bit programs. This is a valid choice of an Operating System. The problem is Microsoft ALSO gives customers the option to install 32-bit Windows 10 which will not run 64-bit programs. That’s crazy.

Another advantage of the 64-bit version of Windows is security. There are a variety of security features such as ASLR (Address Space Layout Randomization) that work best in 64-bits. The 32-bit version is inherently less secure.

By choosing 32-bit Windows 10 a customer is literally choosing a lower performance, LOWER SECURITY, Operating System that is artificially hobbled to not run all software.

When one of our customers running 32-bit Windows 10 contacts Backblaze support, it is almost always a customer that did not realize the choice they were making when they installed 32-bit Windows 10. They did not have the information to understand what they are giving up. For example, we have seen customers that have purchased 8 GB of RAM, yet they had installed 32-bit Windows 10. Simply by their OS “choice”, they disabled about 3/4ths of the RAM that they paid for!

Let’s put some numbers around it: Approximately 4.3% of Backblaze customers with Windows machines are running a 32-bit version of Windows, compared with just 2/10ths of 1% of our Apple customers. The Apple customers did not choose incorrectly; they just have not upgraded their operating system in the last 9 years. If we assume the same rate of “legitimate older computers not upgraded yet” for Microsoft users, that means 4.1% of the Microsoft users made a fairly large mistake when they chose their Microsoft Operating System version.

Now some people would blame the customer because after all they made the OS selection. Microsoft offers the correct choice, which is 64-bit Windows 10. In fact, 95.7% of Backblaze customers running Windows made the correct choice. My issue is that Microsoft shouldn’t offer the 32-bit version at all.

And again, for the fifth time, you will not lose any 32-bit capabilities as the 64-bit operating system runs BOTH 32-bit applications and 64-bit applications. You only lose capabilities if you choose the 32-bit only Operating System.

This is how bad it is -> When Microsoft released Windows Vista in 2007 it was 64-bit and also ran all 32-bit programs flawlessly. So at that time I was baffled why Microsoft ALSO released Windows Vista in 32-bit only mode – a version that refused to run any 64-bit binaries. Then, again in Windows 7, they did the same thing and I thought I was losing my mind. And again with Windows 8! By Windows 10, I realized Microsoft may never stop doing this. No matter how much damage they cause, no matter what happens.

You might be asking -> why do I care? Why does Brian want Microsoft to stop shipping an Operating System that is likely only chosen by mistake? My problem is this: Backblaze, like any good technology vendor, wants to be easy to use and friendly. In this case, that means we need to quietly, invisibly, continue to support BOTH the 32-bit and the 64-bit versions of every Microsoft OS they release. And we’ll probably need to do this for at least 5 years AFTER Microsoft officially retires the 32-bit only version of their operating system.

Supporting both versions is complicated. The more data our customers have, the more momentarily RAM intensive some functions (like inheriting backup state) can be. The more data you have the bigger the problem. Backblaze customers who accidentally chose to disable 64-bit operations are then going to have problems. It means we have to explain to some customers that their operating system is the root cause of many performance issues in their technical lives. This is never a pleasant conversation.

I know this will probably fall on deaf ears, but Microsoft, for the sake of your customers and third party application developers like Backblaze, please stop shipping Operating Systems that disable 64-bit support. It is causing all of us a bunch of headaches we do not need.

The post An Open Letter To Microsoft: A 64-bit OS is Better Than a 32-bit OS appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Using Amazon SQS Dead-Letter Queues to Control Message Failure

Post Syndicated from Tara Van Unen original https://aws.amazon.com/blogs/compute/using-amazon-sqs-dead-letter-queues-to-control-message-failure/


Michael G. Khmelnitsky, Senior Programmer Writer

 

Sometimes, messages can’t be processed because of a variety of possible issues, such as erroneous conditions within the producer or consumer application. For example, if a user places an order within a certain number of minutes of creating an account, the producer might pass a message with an empty string instead of a customer identifier. Occasionally, producers and consumers might fail to interpret aspects of the protocol that they use to communicate, causing message corruption or loss. Also, hardware errors on the consumer side might corrupt the message payload. For these reasons, messages that can’t be processed in a timely manner are delivered to a dead-letter queue.

The recent post Building Scalable Applications and Microservices: Adding Messaging to Your Toolbox gives an overview of messaging in the microservice architecture of modern applications. This post explains how and when you should use dead-letter queues to gain better control over message handling in your applications. It also offers some resources for configuring a dead-letter queue in Amazon Simple Queue Service (SQS).

What are the benefits of dead-letter queues?

The main task of a dead-letter queue is handling message failure. A dead-letter queue lets you set aside and isolate messages that can’t be processed correctly to determine why their processing didn’t succeed. Setting up a dead-letter queue allows you to do the following:

  • Configure an alarm for any messages delivered to a dead-letter queue.
  • Examine logs for exceptions that might have caused messages to be delivered to a dead-letter queue.
  • Analyze the contents of messages delivered to a dead-letter queue to diagnose software or the producer’s or consumer’s hardware issues.
  • Determine whether you have given your consumer sufficient time to process messages.

How do high-throughput, unordered queues handle message failure?

High-throughput, unordered queues (sometimes called standard or storage queues) keep processing messages until the expiration of the retention period. This helps ensure continuous processing of messages, which minimizes the chances of your queue being blocked by messages that can’t be processed. It also ensures fast recovery for your queue.

In a system that processes thousands of messages, having a large number of messages that the consumer repeatedly fails to acknowledge and delete might increase costs and place extra load on the hardware. Instead of trying to process failing messages until they expire, it is better to move them to a dead-letter queue after a few processing attempts.

Note: This queue type often allows a high number of in-flight messages. If the majority of your messages can’t be consumed and aren’t sent to a dead-letter queue, your rate of processing valid messages can slow down. Thus, to maintain the efficiency of your queue, you must ensure that your application handles message processing correctly.

How do FIFO queues handle message failure?

FIFO (first-in-first-out) queues (sometimes called service bus queues) help ensure exactly-once processing by consuming messages in sequence from a message group. Thus, although the consumer can continue to retrieve ordered messages from another message group, the first message group remains unavailable until the message blocking the queue is processed successfully.

Note: This queue type often allows a lower number of in-flight messages. Thus, to help ensure that your FIFO queue doesn’t get blocked by a message, you must ensure that your application handles message processing correctly.

When should I use a dead-letter queue?

  • Do use dead-letter queues with high-throughput, unordered queues. You should always take advantage of dead-letter queues when your applications don’t depend on ordering. Dead-letter queues can help you troubleshoot incorrect message transmission operations. Note: Even when you use dead-letter queues, you should continue to monitor your queues and retry sending messages that fail for transient reasons.
  • Do use dead-letter queues to decrease the number of messages and to reduce the possibility of exposing your system to poison-pill messages (messages that can be received but can’t be processed).
  • Don’t use a dead-letter queue with high-throughput, unordered queues when you want to be able to keep retrying the transmission of a message indefinitely. For example, don’t use a dead-letter queue if your program must wait for a dependent process to become active or available.
  • Don’t use a dead-letter queue with a FIFO queue if you don’t want to break the exact order of messages or operations. For example, don’t use a dead-letter queue with instructions in an Edit Decision List (EDL) for a video editing suite, where changing the order of edits changes the context of subsequent edits.

How do I get started with dead-letter queues in Amazon SQS?

Amazon SQS is a fully managed service that offers reliable, highly scalable hosted queues for exchanging messages between applications or microservices. Amazon SQS moves data between distributed application components and helps you decouple these components. It supports both standard queues and FIFO queues. To configure a queue as a dead-letter queue, you can use the AWS Management Console or the Amazon SQS SetQueueAttributes API action.
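For example, here is a minimal boto3 sketch that attaches a dead-letter queue to an existing standard queue via a redrive policy; the queue URL and ARN are placeholders:

import json
import boto3

sqs = boto3.client("sqs")

# Placeholder identifiers; substitute your own queue URL and DLQ ARN.
source_queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/my-source-queue"
dlq_arn = "arn:aws:sqs:us-east-1:123456789012:my-dead-letter-queue"

# After 5 failed receives, a message is moved to the dead-letter queue
# instead of being retried until the retention period expires.
sqs.set_queue_attributes(
    QueueUrl=source_queue_url,
    Attributes={
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": "5",
        })
    },
)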

To get started with dead-letter queues in Amazon SQS, see the following topics in the Amazon SQS Developer Guide:

To start working with dead-letter queues programmatically, see the following resources:

How to Deploy Local Administrator Password Solution with AWS Microsoft AD

Post Syndicated from Dragos Madarasan original https://aws.amazon.com/blogs/security/how-to-deploy-local-administrator-password-solution-with-aws-microsoft-ad/

Local Administrator Password Solution (LAPS) from Microsoft simplifies password management by allowing organizations to use Active Directory (AD) to store unique passwords for computers. Typically, an organization might reuse the same local administrator password across the computers in an AD domain. However, this approach represents a security risk because it can be exploited during lateral escalation attacks. LAPS solves this problem by creating unique, randomized passwords for the Administrator account on each computer and storing it encrypted in AD.

Deploying LAPS with AWS Microsoft AD requires the following steps:

  1. Install the LAPS binaries on instances joined to your AWS Microsoft AD domain. The binaries add additional client-side extension (CSE) functionality to the Group Policy client.
  2. Extend the AWS Microsoft AD schema. LAPS requires new AD attributes to store an encrypted password and its expiration time.
  3. Configure AD permissions and delegate the ability to retrieve the local administrator password for IT staff in your organization.
  4. Configure Group Policy on instances joined to your AWS Microsoft AD domain to enable LAPS. This configures the Group Policy client to process LAPS settings and uses the binaries installed in Step 1.

The following diagram illustrates the setup that I will be using throughout this post and the associated tasks to set up LAPS. Note that the AWS Directory Service directory is deployed across multiple Availability Zones, and monitoring automatically detects and replaces domain controllers that fail.

Diagram illustrating this blog post's solution

In this blog post, I explain the prerequisites to set up Local Administrator Password Solution, demonstrate the steps involved to update the AD schema on your AWS Microsoft AD domain, show how to delegate permissions to IT staff and configure LAPS via Group Policy, and demonstrate how to retrieve the password using the graphical user interface or with Windows PowerShell.

This post assumes you are familiar with Lightweight Directory Access Protocol Data Interchange Format (LDIF) files and AWS Microsoft AD. If you need more of an introduction to Directory Service and AWS Microsoft AD, see How to Move More Custom Applications to the AWS Cloud with AWS Directory Service, which introduces working with schema changes in AWS Microsoft AD.

Prerequisites

In order to implement LAPS, you must use AWS Directory Service for Microsoft Active Directory (Enterprise Edition), also known as AWS Microsoft AD. Any instance on which you want to configure LAPS must be joined to your AWS Microsoft AD domain. You also need a Management instance on which you install the LAPS management tools.

In this post, I use an AWS Microsoft AD domain called example.com that I have launched in the EU (London) region. To see the regions in which Directory Service is available, see AWS Regions and Endpoints.

Screenshot showing the AWS Microsoft AD domain example.com used in this blog post

In addition, you must have at least two instances launched in the same region as the AWS Microsoft AD domain. To join the instances to your AWS Microsoft AD domain, you have two options:

  1. Use the Amazon EC2 Systems Manager (SSM) domain join feature. To learn more about how to set up domain join for EC2 instances, see joining a Windows Instance to an AWS Directory Service Domain.
  2. Manually configure the DNS server addresses in the Internet Protocol version 4 (TCP/IPv4) settings of the network card to use the AWS Microsoft AD DNS addresses (172.31.9.64 and 172.31.16.191, for this blog post) and perform a manual domain join.

For the purpose of this post, my two instances are:

  1. A Management instance on which I will install the management tools that I have tagged as Management.
  2. A Web Server instance on which I will be deploying the LAPS binary.

Screenshot showing the two EC2 instances used in this post

Implementing the solution

 

1. Install the LAPS binaries on instances joined to your AWS Microsoft AD domain by using EC2 Run Command

LAPS binaries come in the form of an MSI installer and can be downloaded from the Microsoft Download Center. You can install the LAPS binaries manually, with an automation service such as EC2 Run Command, or with your existing software deployment solution.

For this post, I will deploy the LAPS binaries on my Web Server instance (i-0b7563d0f89d3453a) by using EC2 Run Command:

  1. While signed in to the AWS Management Console, choose EC2. In the Systems Manager Services section of the navigation pane, choose Run Command.
  2. Choose Run a command, and from the Command document list, choose AWS-InstallApplication.
  3. From Target instances, choose the instance on which you want to deploy the LAPS binaries. In my case, I will be selecting the instance tagged as Web Server. If you do not see any instances listed, make sure you have met the prerequisites for Amazon EC2 Systems Manager (SSM) by reviewing the Systems Manager Prerequisites.
  4. For Action, choose Install, and then stipulate the following values:
    • Parameters: /quiet
    • Source: https://download.microsoft.com/download/C/7/A/C7AAD914-A8A6-4904-88A1-29E657445D03/LAPS.x64.msi
    • Source Hash: f63ebbc45e2d080630bd62a195cd225de734131a56bb7b453c84336e37abd766
    • Comment: LAPS deployment

Leave the other options with the default values and choose Run. The AWS Management Console will return a Command ID, which will initially have a status of In Progress. It should take less than 5 minutes to download and install the binaries, after which the Command ID will update its status to Success.

Status showing the binaries have been installed successfully

If the Command ID runs for more than 5 minutes or returns an error, it might indicate a problem with the installer. To troubleshoot, review the steps in Troubleshooting Systems Manager Run Command.

To verify the binaries have been installed successfully, open Control Panel and review the recently installed applications in Programs and Features.

Screenshot of Control Panel that confirms LAPS has been installed successfully

You should see an entry for Local Administrator Password Solution with a version of 6.2.0.0 or newer.
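If you prefer to script this deployment instead of clicking through the console, the same AWS-InstallApplication document can be invoked with boto3. The sketch below mirrors the values used above; the parameter names reflect my reading of the document schema, so verify them against AWS-InstallApplication in your region:

import boto3

ssm = boto3.client("ssm")

response = ssm.send_command(
    InstanceIds=["i-0b7563d0f89d3453a"],  # the Web Server instance
    DocumentName="AWS-InstallApplication",
    Parameters={
        "action": ["Install"],
        "parameters": ["/quiet"],
        "source": ["https://download.microsoft.com/download/C/7/A/C7AAD914-A8A6-4904-88A1-29E657445D03/LAPS.x64.msi"],
        "sourceHash": ["f63ebbc45e2d080630bd62a195cd225de734131a56bb7b453c84336e37abd766"],
    },
    Comment="LAPS deployment",
)
print(response["Command"]["CommandId"])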

2. Extend the AWS Microsoft AD schema

In the previous section, I used EC2 Run Command to install the LAPS binaries on an EC2 instance. Now, I am ready to extend the schema in an AWS Microsoft AD domain. Extending the schema is a requirement because LAPS relies on new AD attributes to store the encrypted password and its expiration time.

In an on-premises AD environment, you would update the schema by running the Update-AdmPwdADSchema Windows PowerShell cmdlet with schema administrator credentials. Because AWS Microsoft AD is a managed service, I do not have permissions to update the schema directly. Instead, I will update the AD schema from the Directory Service console by importing an LDIF file. If you are unfamiliar with schema updates or LDIF files, see How to Move More Custom Applications to the AWS Cloud with AWS Directory Service.

To make things easier for you, I am providing you with a sample LDIF file that contains the required AD schema changes. Using Notepad or a similar text editor, open the SchemaChanges-0517.ldif file and update the values of dc=example,dc=com with your own AWS Microsoft AD domain and suffix.

After I update the LDIF file with my AWS Microsoft AD details, I import it by using the AWS Management Console:

  1. On the Directory Service console, select your Microsoft AD directory from the list of directories by choosing its identifier (it will look something like d-534373570ea).
  2. On the Directory details page, choose the Schema extensions tab and choose Upload and update schema.
    Screenshot showing the "Upload and update schema" option
  3. When prompted for the LDIF file that contains the changes, choose the sample LDIF file.
  4. In the background, the LDIF file is validated for errors and a backup of the directory is created for recovery purposes. Updating the schema might take a few minutes, and the status will change to Updating Schema.

Screenshot showing the schema updates in progress
When the process has completed, the status of Completed will be displayed, as shown in the following screenshot.

Screenshot showing the process has completed

If the LDIF file contains errors or the schema extension fails, the Directory Service console will generate an error code and additional debug information. To help troubleshoot error messages, see Schema Extension Errors.

The sample LDIF file triggers AWS Microsoft AD to perform the following actions:

  1. Create the ms-Mcs-AdmPwd attribute, which stores the encrypted password.
  2. Create the ms-Mcs-AdmPwdExpirationTime attribute, which stores the time of the password’s expiration.
  3. Add both attributes to the Computer class.
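If you would rather script the schema update than use the console, the Directory Service API exposes a StartSchemaExtension action. The boto3 sketch below is illustrative only: the directory ID and file name are placeholders, and you should confirm the parameter names against the current API reference before relying on it.

import boto3

ds = boto3.client("ds")

# Placeholder directory ID and LDIF path; substitute your own values.
with open("SchemaChanges-0517.ldif") as f:
    ldif = f.read()

response = ds.start_schema_extension(
    DirectoryId="d-534373570ea",
    CreateSnapshotBeforeSchemaExtension=True,
    LdifContent=ldif,
    Description="LAPS attributes",
)
print(response["SchemaExtensionId"])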

3. Configure AD permissions

In the previous section, I updated the AWS Microsoft AD schema with the required attributes for LAPS. I am now ready to configure the permissions for administrators to retrieve the password and for computer accounts to update their password attribute.

As part of configuring AD permissions, I grant computers the ability to update their own password attribute and specify which security groups have permissions to retrieve the password from AD. As part of this process, I run Windows PowerShell cmdlets that are not installed by default on Windows Server.

Note: To learn more about Windows PowerShell and the concept of a cmdlet (pronounced “command-let”), go to Getting Started with Windows PowerShell.

Before getting started, I need to set up the required tools for LAPS on my Management instance, which must be joined to the AWS Microsoft AD domain. I will be using the same LAPS installer that I downloaded from the Microsoft LAPS website. In my Management instance, I have manually run the installer by clicking the LAPS.x64.msi file. On the Custom Setup page of the installer, under Management Tools, for each option I have selected Install on local hard drive.

Screenshot showing the required management tools

In the preceding screenshot, the features are:

  • The fat client UI – A simple user interface for retrieving the password (I will use it at the end of this post).
  • The Windows PowerShell module – Needed to run the commands in the next sections.
  • The GPO Editor templates – Used to configure Group Policy objects.

The next step is to grant computers in the Computers OU the permission to update their own attributes. While connected to my Management instance, I go to the Start menu and type PowerShell. In the list of results, right-click Windows PowerShell and choose Run as administrator and then Yes when prompted by User Account Control.

In the Windows PowerShell prompt, I type the following command.

Import-module AdmPwd.PS

Set-AdmPwdComputerSelfPermission -OrgUnit "OU=Computers,OU=MyMicrosoftAD,DC=example,DC=com"

To grant the administrator group called Admins the permission to retrieve the computer password, I run the following command in the Windows PowerShell prompt I previously started.

Import-module AdmPwd.PS

Set-AdmPwdReadPasswordPermission -OrgUnit "OU=Computers,OU=MyMicrosoftAD,DC=example,DC=com" -AllowedPrincipals "Admins"

4. Configure Group Policy to enable LAPS

In the previous section, I deployed the LAPS management tools on my management instance, granted the computer accounts the permission to self-update their local administrator password attribute, and granted my Admins group permissions to retrieve the password.

Note: The following section addresses the Group Policy Management Console and Group Policy objects. If you are unfamiliar with or wish to learn more about these concepts, go to Get Started Using the GPMC and Group Policy for Beginners.

I am now ready to enable LAPS via Group Policy:

  1. On my Management instance (i-03b2c5d5b1113c7ac), I have installed the Group Policy Management Console (GPMC) by running the following command in Windows PowerShell.
Install-WindowsFeature -Name GPMC
  2. Next, I have opened the GPMC and created a new Group Policy object (GPO) called LAPS GPO.
  3. In the Group Policy Management Editor, I navigate to Computer Configuration > Policies > Administrative Templates > LAPS. I have configured the settings using the values in the following table.

Setting | State | Options
Password Settings | Enabled | Complexity: large letters, small letters, numbers, specials
Do not allow password expiration time longer than required by policy | Enabled | N/A
Enable local admin password management | Enabled | N/A

  4. Next, I need to link the GPO to an organizational unit (OU) in which my machine accounts sit. In your environment, I recommend testing the new settings on a test OU and then deploying the GPO to production OUs.

Note: If you choose to create a new test organizational unit, you must create it in the OU that AWS Microsoft AD delegates to you to manage. For example, if your AWS Microsoft AD directory name were example.com, the test OU path would be example.com/example/Computers/Test.

  5. To test that LAPS works, I need to make sure the computer has received the new policy by forcing a Group Policy update. While connected to the Web Server instance (i-0b7563d0f89d3453a) using Remote Desktop, I open an elevated administrative command prompt and run the following command: gpupdate /force. I can check whether the policy is applied by running the command: gpresult /r | findstr /c:"LAPS GPO", where LAPS GPO is the name of the GPO created in the second step.
  6. Back on my Management instance, I can then launch the LAPS interface from the Start menu and use it to retrieve the password (as shown in the following screenshot). Alternatively, I can run the Get-ADComputer Windows PowerShell cmdlet to retrieve the password.
Get-ADComputer [YourComputerName] -Properties ms-Mcs-AdmPwd | select name, ms-Mcs-AdmPwd

Screenshot of the LAPS UI, which you can use to retrieve the password

Summary

In this blog post, I demonstrated how you can deploy LAPS with an AWS Microsoft AD directory. I then showed how to install the LAPS binaries by using EC2 Run Command. Using the sample LDIF file I provided, I showed you how to extend the schema, which is a requirement because LAPS relies on new AD attributes to store the encrypted password and its expiration time. Finally, I showed how to complete the LAPS setup by configuring the necessary AD permissions and creating the GPO that starts the LAPS password change.

If you have comments about this post, submit them in the “Comments” section below. If you have questions about or issues implementing this solution, please start a new thread on the Directory Service forum.

– Dragos

Event: AWS Serverless Roadshow – Hands-on Workshops

Post Syndicated from Tara Walker original https://aws.amazon.com/blogs/aws/event-aws-serverless-roadshow-hands-on-workshops/

Surely, some of you have contemplated how you would survive the possible Zombie apocalypse or how you would build your exciting new startup to disrupt the transportation industry when Unicorn haven is uncovered. Well, there is no need to worry; I know just the thing to get you prepared to handle both of those scenarios: the AWS Serverless Computing Workshop Roadshow.

With the roadshow’s serverless workshops, you can get hands-on experience building serverless applications and microservices so you can rebuild what remains of our great civilization after a widespread viral infection causes human corpses to reanimate around the world in the AWS Zombie Microservices Workshop. In addition, you can give your startup a jump on the competition with the Wild Rydes workshop and revolutionize the transportation industry, just in time for a pilot’s crash landing that leads the way to the discovery of abundant Unicorn pastures on the outskirts of Themyscira, the island of female Amazonian warriors also known as Paradise Island.

These free, guided hands-on workshops will introduce the basics of building serverless applications and microservices for common and uncommon scenarios using services like AWS Lambda, Amazon API Gateway, Amazon DynamoDB, Amazon S3, Amazon Kinesis, AWS Step Functions, and more. Let me share some advice before you decide to tackle Zombies and mount Unicorns – don’t forget to bring your laptop to the workshop and make sure you have an AWS account established and available for use for the event.

Check out the schedule below and get prepared today by registering for an upcoming workshop in a city near you. Remember, these workshops are completely free, so participation is on a first come, first served basis. Register and get there early; we need Zombie hunters and Unicorn riders across the globe. Learn more about AWS Serverless Computing Workshops here and register for your city using the links below.

Event Location Date
Wild Rydes New York Thursday, June 8
Wild Rydes Austin Thursday, June 22
Wild Rydes Santa Monica Thursday, July 20
Zombie Apocalypse Chicago Thursday, July 20
Wild Rydes Atlanta Tuesday, September 12
Zombie Apocalypse Dallas Tuesday, September 19

 

I look forward to fighting zombies and riding unicorns with you all.

Tara

How The Intercept Outed Reality Winner

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/06/how-intercept-outed-reality-winner.html

Today, The Intercept released documents on election tampering from an NSA leaker. Later, the arrest warrant request for an NSA contractor named “Reality Winner” was published, showing how they tracked her down because she had printed out the documents and sent them to The Intercept. The document posted by The Intercept isn’t the original PDF file, but a PDF containing pictures of the printed version that were later scanned back in.

As the warrant says, she confessed while interviewed by the FBI. Had she not confessed, the documents still contained enough evidence to convict her: the printed document was digitally watermarked.

The problem is that most new printers print nearly invisible yellow dots that record exactly when and where any document is printed. Because the NSA logs all printing jobs on its printers, it can use this to match up precisely who printed the document.

In this post, I show how.

You can download the document from the original article here. You can then open it in a PDF viewer, such as the normal “Preview” app on macOS. Zoom into some whitespace on the document, and take a screenshot of it. On macOS, hit [Command-Shift-3] to take a screenshot of the whole screen (or [Command-Shift-4] to grab just a selection). There are yellow dots in this image, but you can barely see them, especially if your screen is dirty.

We need to highlight the yellow dots. Open the screenshot in an image editor, such as the free “Paintbrush” program for macOS. Now use the option to “Invert Colors” in the image, to get something like this. You should see a roughly rectangular checkerboard pattern in the whitespace.

It’s upside down, so we need to rotate it 180 degrees, or flip-horizontal and flip-vertical:
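The same two steps can be scripted with Pillow if you’d rather not click around in an editor; the file names here are placeholders:

from PIL import Image, ImageOps

# Invert the screenshot so the faint yellow dots become visible,
# then flip it the right way up.
img = Image.open("screenshot.png").convert("RGB")
upright = ImageOps.invert(img).rotate(180)
upright.save("dots.png")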

Now we go to the EFF page and manually click on the pattern so that their tool can decode the meaning:

This produces the following result:

The document leaked by the Intercept was from a printer with model number 54, serial number 29535218. The document was printed on May 9, 2017 at 6:20. The NSA almost certainly has a record of who used the printer at that time.

The situation is similar to how Vice outed the location of John McAfee, by publishing JPEG photographs of him with the EXIF GPS coordinates still hidden in the file. Or it’s how PDFs are often redacted by adding a black bar on top of an image, leaving the underlying contents still in the file for people to read, such as in this NYTimes accident with a Snowden document. Or how opening a Microsoft Office document, then accidentally saving it, leaves fingerprints identifying you behind, as repeatedly happened with the Wikileaks election leaks. These sorts of failures are common with leaks. To avoid this yellow-dot problem, use a black-and-white printer, a black-and-white scanner, or convert to black-and-white with an image editor.

Copiers/printers have two features put in there by the government to be evil to you. The first is that scanners/copiers (when using scanner feature) recognize a barely visible pattern on currency, so that they can’t be used to counterfeit money, as shown on this $20 below:

The second is that when they print things out, they include these invisible dots, so documents can be tracked. In other words, the dots on bills prevent them from being scanned in, and the dots produced by printers help the government track what was printed out.

Yes, this code the government forces into our printers is a violation of our 3rd Amendment rights.


While I was writing up this post, these tweets appeared first:


Comments:
https://news.ycombinator.com/item?id=14494818

Some non-lessons from WannaCry

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/06/some-non-lessons-from-wannacry.html

This piece by Bruce Schneier needs debunking. I thought I’d list the things wrong with it.

The NSA 0day debate

Schneier’s description of the problem is deceptive:

When the US government discovers a vulnerability in a piece of software, however, it decides between two competing equities. It can keep it secret and use it offensively, to gather foreign intelligence, help execute search warrants, or deliver malware. Or it can alert the software vendor and see that the vulnerability is patched, protecting the country — and, for that matter, the world — from similar attacks by foreign governments and cybercriminals. It’s an either-or choice.

The government doesn’t “discover” vulnerabilities accidentally. Instead, when the NSA has a need for something specific, it acquires the 0day, either through internal research or (more often) buying from independent researchers.

The value of something is what you are willing to pay for it. If the NSA comes across a vulnerability accidentally, then the value to them is nearly zero. Obviously such vulns should be disclosed and fixed. Conversely, if the NSA is willing to pay $1 million to acquire a specific vuln for imminent use against a target, the offensive value is much greater than the fix value.

What Schneier is doing is deliberately confusing the two, combining the policy for accidentally found vulns with that for deliberately acquired vulns.

The above paragraph should read instead:

When the government discovers a vulnerability accidentally, it then decides to alert the software vendor to get it patched. When the government decides it needs a vuln for a specific offensive use, it acquires one that meets its needs, uses it, and keeps it secret. After spending so much money acquiring an offensive vuln, it would obviously be stupid to change this decision and not use it offensively.

Hoarding vulns

Schneier also says the NSA is "hoarding" vulns. The word has a couple of inaccurate connotations.
One connotation is that the NSA is putting them on a heap inside a vault, not using them. The opposite is true: the NSA only acquires vulns for which it has an active need. It uses pretty much all the vulns it acquires. That can be seen in the ShadowBroker dump: all the vulns listed are extremely useful to attackers, especially ETERNALBLUE. Efficiency is important to the NSA; your efficiency is your basis for promotion, and there are other people who make their careers finding waste in the NSA. If you are hoarding vulns and not using them, you'll quickly get ejected from the NSA.
Another connotation is that the NSA is somehow keeping the vulns away from vendors. That's like saying I'm hoarding naked selfies of myself: yes, technically I'm keeping them away from you, but it's not like they ever belonged to you in the first place. The same is true of the NSA. Had it never acquired the ETERNALBLUE 0day, the vuln never would've been researched, never found.

The VEP

Schneier describes the "Vulnerabilities Equities Process", or "VEP", a process that is supposed to manage the vulnerabilities the government gets.

There’s no evidence the VEP process has ever been used, at least not with 0days acquired by the NSA. The VEP allows exceptions for important vulns, and all the NSA vulns are important, so all are excepted from the process. Since the NSA is in charge of the VEP, of course, this is at the sole discretion of the NSA. Thus, the entire point of the VEP process goes away.

Moreover, it can’t work in many cases. The vulns acquired by the NSA often come with clauses that mean they can’t be shared.

New classes of vulns

One reason sellers forbid 0days from being shared is because they use new classes of vulnerabilities, such that sharing one 0day would effectively ruin a whole set of vulnerabilities. Schneier pooh-poohs this because he doesn't see new classes of vulns in the ShadowBroker set.
This is wrong for two reasons. The first is that the ShadowBroker 0days are incomplete. There are no iOS exploits, for example, and we know that iOS is a big target of the NSA.
Secondly, I’m not sure we’ve sufficiently analyzed the ShadowBroker exploits yet to realize there may be a new class of vuln. It’s easy to miss the fact that a single bug we see in the dump may actually be a whole new class of vulnerability. In the past, it’s often been the case that a new class was named only after finding many examples.
In any case, Schneier misses the point by denying that new classes of vulns exist. He should instead use the point to prove the value of disclosure: instead of playing whack-a-mole, fixing bugs one at a time, vendors would be able to fix whole classes of bugs at once.

Rediscovery

Schneier cites two studies that looked at how often vulnerabilities get rediscovered. In other words, he’s trying to measure the likelihood that some other government will find the bug and use it against us.
These studies are weak, scarcely better than anecdotal evidence. Schneier's own study seems almost unrelated to the problem, and the Rand study cannot be replicated, as it relies upon private data. Also, there is little differentiation between important bugs (like SMB/MSRPC exploits and full-chain iOS exploits) and lesser bugs.
Whether from the Rand study or from anecdotes, we have good reason to believe that the longer an 0day exists, the less likely it’ll be rediscovered. Schneier argues that vulns should only be used for 6 months before being disclosed to a vendor. Anecdotes suggest otherwise, that if it hasn’t been rediscovered in the first year, it likely won’t ever be.
The Rand study was overwhelmingly clear on the issue that 0days are dramatically more likely to become obsolete than be rediscovered. The latest update to iOS will break an 0day, rather than somebody else rediscovering it. Win10 adoption will break older SMB exploits faster than rediscovery.
In any case, this post is about ETERNALBLUE specifically. What we learned from this specific bug is that it was used for at least 5 years without anybody else rediscovering it (before it was leaked). Chances are good it never would've been rediscovered, just made obsolete by Win10.

Notification is notification

All disclosure has the potential of leading to worms like WannaCry. The Conficker worm of 2008, for example, was written after Microsoft patched the underlying vulnerability.
Thus, had the NSA disclosed the bug in the normal way, chances are good it still would’ve been used for worming ransomware.
Yes, WannaCry had a head-start because ShadowBrokers published a working exploit, but this doesn’t appear to have made a difference. The Blaster worm (the first worm to compromise millions of computers) took roughly the same amount of time to create, and almost no details were made public about the vulnerability, other than the fact it was patched. (I know from personal experience — we used diff to find what changed in the patch in order to reverse engineer the 0day).
In other words, the damage the NSA is responsible for isn’t really the damage that came after it was patched — that was likely to happen anyway, as it does with normal vuln disclosure. Instead, the only damage the NSA can truly be held responsible for is the damage ahead of time, such as the months (years?) the ShadowBrokers possessed the exploits before they were patched.

Disclosed doesn’t mean fixed

One thing we’ve learned from 30 years of disclosure is that vendors ignore bugs.
We’ve gotten to the state where a few big companies like Microsoft and Apple will actually fix bugs, but the vast majority of vendors won’t. Even Microsoft and Apple have been known to sit on tricky bugs for over a year before fixing them.
And the only reason Microsoft and Apple have gotten to this state is because we, the community, bullied them into it. When we disclose bugs to them, we give them a deadline by which we make the bug public, whether or not it's been fixed.
The same goes for the NSA. If they quietly disclose bugs to vendors, in general, they won’t be fixed unless the NSA also makes the bug public within a certain time frame. Either Schneier has to argue that the NSA should do such public full-disclosures, or argue that disclosures won’t always lead to fixes.

Replacement SMB/MSRPC

The ETERNALBLUE vuln is so valuable to the NSA that it’s almost certainly seeking a replacement.
Again, I’m trying to debunk the impression Schneier tries to form that somehow the NSA stumbled upon ETERNALBLUE by accident to begin with. The opposite is true: remote exploits for the SMB (port 445) or MSRPC (port 135) services are some of the most valuable vulns, and the NSA will work hard to acquire them.

That it was leaked

The only issue here is that the 0day leaked. If the NSA can't keep its weaponized toys secret, then maybe it shouldn't have them.
Instead of processing this new piece of information, which is important, Schneier takes this opportunity to just re-hash the old inaccurate and deceptive VEP debate.

Conclusion

Except for a tiny number of people working for the NSA, none of us really know what's going on with 0days inside government. Schneier's comments seem more off-base than most. Like all activists, he deliberately uses language to deceive rather than explain (like "discover" instead of "acquire"). Like all activists, he seems obsessed with the VEP, even though, as far as anybody can tell, it's not used for NSA-acquired vulns. He deliberately ignores things he should be an expert in, such as how all patches/disclosures sometimes lead to worms/exploits, and how not all disclosure leads to fixes.

How to track that annoying pop-up

Post Syndicated from Robert Graham original http://blog.erratasec.com/2017/06/how-to-track-that-annoying-pop-up.html

In a recent update to their Office suite on Windows, Microsoft introduced a bug that causes a black window to pop up on the screen for a fraction of a second every hour. This leads many to fear their system has been infected by a virus. I thought I'd document how to track this down.

The short answer is to use Mark Russinovich's "sysinternals.com" tools. He's the Windows internals guru at Microsoft and has been maintaining a suite of tools that are critical for Windows system maintenance and security. Copy all the tools from https://live.sysinternals.com. You can also copy them over Microsoft Windows Networking (SMB) from \\live.sysinternals.com\tools.

Of these tools, what we want is something that looks at "processes". There are several tools that do this, but they focus on processes that are currently running. What we want is something that monitors process creation.

The tool for that is "sysmon.exe". It can monitor not only process creation but also a large number of other system events that a techie can use to see what the system has been doing, and whether it's infected with a virus.

Sysmon has a fairly complicated configuration file, and if you enabled everything, you'd soon be overwhelmed with events. @SwiftOnSecurity has published a configuration file they use in real-world environments that cuts down on the noise and focuses on the events that really matter. It enables monitoring of "process creation" but filters out known good processes that might fill up your logs. Grab the file here and save it to the same directory where you saved Sysmon:

https://raw.githubusercontent.com/SwiftOnSecurity/sysmon-config/master/sysmonconfig-export.xml

Once you've done that, activate the Sysmon monitoring service with this configuration file by running the following command as Administrator. (Right-click the Command Prompt icon and select More/Run as Administrator.)

sysmon.exe -accepteula -i sysmonconfig-export.xml
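
As an optional sanity check (not part of the original steps), you can confirm the service got installed; the service name is normally "Sysmon", or "Sysmon64" if you ran the 64-bit binary:

# From an elevated PowerShell prompt; sc.exe (not plain "sc") avoids PowerShell's Set-Content alias
sc.exe query sysmon
# Or use the cmdlet; the "Sysmon" name is an assumption based on the 32-bit sysmon.exe used above
Get-Service Sysmon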

Now sit back and relax until that popup happens again. Right after it does, go into the "Event Viewer" application (click the Windows menu and type "Event Viewer", or run 'eventvwr.exe'). Now you have to find where the Sysmon events are located, since there are so many things that log events.

The Sysmon events are under the path:

Applications and Services Logs\Microsoft\Windows\Sysmon\Operational

When you open that up, you should see the top event is the one we are looking for. Actually, the very top event is launching the process “eventvwr.exe”, but the next one down is our event. It looks like this:
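
If you'd rather query from the command line than click through Event Viewer, the same process-creation events (Sysmon logs these as event ID 1) can be pulled with PowerShell. This is just an alternative view of the same log, a sketch rather than a required step:

# Show the 10 most recent Sysmon process-creation events (event ID 1);
# run from an elevated PowerShell prompt
Get-WinEvent -FilterHashtable @{ LogName = 'Microsoft-Windows-Sysmon/Operational'; Id = 1 } -MaxEvents 10 |
    Format-List TimeCreated, Message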

Drilling down into the details, we find that the offending thing causing those annoying popups is "officebackgroundtask.exe" in Office.

We can see it’s started by the “Schedule” service. This means we can go look at it with “autoruns.exe”, another Sysinternals tool that looks at all the things configured to automatically start when you start/login to your computer.

They show up pink, which [update] is how Autoruns indicates "unsigned" programs (Microsoft's programs should normally always be signed, so this should be suspicious). I'm assuming the suspicious thing is that they run in the user's context, rather than the system context, creating popup screens.

Autoruns allows you to do a bunch of things. You can click the [X] box to disable it from running in the future. You can [right-click] to upload it to VirusTotal and check whether it's a known virus.

You can also double-click to open the Task Scheduler and see the specific configuration. You can see here that this thing is scheduled to run every hour:
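
You can also poke at the task from PowerShell instead of the Task Scheduler GUI. The name filter below is a guess based on the executable name above, so adjust it to whatever Autoruns showed on your machine:

# Find matching Office background tasks and show last/next run times
Get-ScheduledTask | Where-Object { $_.TaskName -like '*OfficeBackgroundTask*' } |
    Get-ScheduledTaskInfo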

Conclusion

So the conclusions are these:
To solve this particular problem of identifying what's causing a process to flash a screen occasionally, use Sysmon.
To solve general problems like this, use the Sysinternals suite of tools.
I hadn't been using @SwiftOnSecurity's Sysmon configuration before, but I am now, just to monitor the security of my computers. I should probably install something to move a copy of the logs off the system; a crude first step is sketched below.
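
For example, a crude way to do that by hand (a sketch, not something I actually run) is to export the Sysmon log to a file and then copy that file somewhere else:

# Export the Sysmon operational log to an .evtx file (run as Administrator);
# C:\logs is an assumed destination folder that must already exist
wevtutil epl Microsoft-Windows-Sysmon/Operational C:\logs\sysmon-backup.evtx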
