Tag Archives: Amazon SQS

How to Create an AMI Builder with AWS CodeBuild and HashiCorp Packer – Part 2

Post Syndicated from Heitor Lessa original https://aws.amazon.com/blogs/devops/how-to-create-an-ami-builder-with-aws-codebuild-and-hashicorp-packer-part-2/

Written by AWS Solutions Architects Jason Barto and Heitor Lessa

 
In Part 1 of this post, we described how AWS CodeBuild, AWS CodeCommit, and HashiCorp Packer can be used to build an Amazon Machine Image (AMI) from the latest version of Amazon Linux. In this post, we show how to use AWS CodePipeline, AWS CloudFormation, and Amazon CloudWatch Events to continuously ship new AMIs. We use Ansible by Red Hat to harden the OS on the AMIs through a well-known set of security controls outlined by the Center for Internet Security in its CIS Amazon Linux Benchmark.

You’ll find the source code for this post in our GitHub repo.

At the end of this post, we will have the following architecture:

Requirements

 
To follow along, you will need Git and a text editor. Make sure Git is configured to work with AWS CodeCommit, as described in Part 1.

Technologies

 
In addition to the services and products used in Part 1 of this post, we also use these AWS services and third-party software:

AWS CloudFormation gives developers and systems administrators an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion.

Amazon CloudWatch Events enables you to react selectively to events in the cloud and in your applications. Specifically, you can create CloudWatch Events rules that match event patterns, and take actions in response to those patterns.

AWS CodePipeline is a continuous integration and continuous delivery service for fast and reliable application and infrastructure updates. AWS CodePipeline builds, tests, and deploys your code every time there is a code change, based on release process models you define.

Amazon SNS is a fast, flexible, fully managed push notification service that lets you send individual messages or to fan out messages to large numbers of recipients. Amazon SNS makes it simple and cost-effective to send push notifications to mobile device users or email recipients. The service can even send messages to other distributed services.

Ansible is a simple IT automation system that handles configuration management, application deployment, cloud provisioning, ad-hoc task-execution, and multinode orchestration.

Getting Started

 
We use CloudFormation to bootstrap the following infrastructure:

Component Purpose
AWS CodeCommit repository Git repository where the AMI builder code is stored.
S3 bucket Build artifact repository used by AWS CodePipeline and AWS CodeBuild.
AWS CodeBuild project Executes the AWS CodeBuild instructions contained in the build specification file.
AWS CodePipeline pipeline Orchestrates the AMI build process, triggered by new changes in the AWS CodeCommit repository.
SNS topic Notifies subscribed email addresses when an AMI build is complete.
CloudWatch Events rule Defines how the AMI builder should send a custom event to notify an SNS topic.
Region AMI Builder Launch Template
N. Virginia (us-east-1)
Ireland (eu-west-1)

After launching the CloudFormation template linked here, we will have a pipeline in the AWS CodePipeline console. (Failed at this stage simply means we don’t have any data in our newly created AWS CodeCommit Git repository.)

Next, we will clone the newly created AWS CodeCommit repository.

If this is your first time connecting to a AWS CodeCommit repository, please see instructions in our documentation on Setup steps for HTTPS Connections to AWS CodeCommit Repositories.

To clone the AWS CodeCommit repository (console)

  1. From the AWS Management Console, open the AWS CloudFormation console.
  2. Choose the AMI-Builder-Blogpost stack, and then choose Output.
  3. Make a note of the Git repository URL.
  4. Use git to clone the repository.

For example: git clone https://git-codecommit.eu-west-1.amazonaws.com/v1/repos/AMI-Builder_repo

To clone the AWS CodeCommit repository (CLI)

# Retrieve CodeCommit repo URL
git_repo=$(aws cloudformation describe-stacks --query 'Stacks[0].Outputs[?OutputKey==`GitRepository`].OutputValue' --output text --stack-name "AMI-Builder-Blogpost")

# Clone repository locally
git clone ${git_repo}

Bootstrap the Repo with the AMI Builder Structure

 
Now that our infrastructure is ready, download all the files and templates required to build the AMI.

Your local Git repo should have the following structure:

.
├── ami_builder_event.json
├── ansible
├── buildspec.yml
├── cloudformation
├── packer_cis.json

Next, push these changes to AWS CodeCommit, and then let AWS CodePipeline orchestrate the creation of the AMI:

git add .
git commit -m "My first AMI"
git push origin master

AWS CodeBuild Implementation Details

 
While we wait for the AMI to be created, let’s see what’s changed in our AWS CodeBuild buildspec.yml file:

...
phases:
  ...
  build:
    commands:
      ...
      - ./packer build -color=false packer_cis.json | tee build.log
  post_build:
    commands:
      - egrep "${AWS_REGION}\:\sami\-" build.log | cut -d' ' -f2 > ami_id.txt
      # Packer doesn't return non-zero status; we must do that if Packer build failed
      - test -s ami_id.txt || exit 1
      - sed -i.bak "s/<<AMI-ID>>/$(cat ami_id.txt)/g" ami_builder_event.json
      - aws events put-events --entries file://ami_builder_event.json
      ...
artifacts:
  files:
    - ami_builder_event.json
    - build.log
  discard-paths: yes

In the build phase, we capture Packer output into a file named build.log. In the post_build phase, we take the following actions:

  1. Look up the AMI ID created by Packer and save its findings to a temporary file (ami_id.txt).
  2. Forcefully make AWS CodeBuild to fail if the AMI ID (ami_id.txt) is not found. This is required because Packer doesn’t fail if something goes wrong during the AMI creation process. We have to tell AWS CodeBuild to stop by informing it that an error occurred.
  3. If an AMI ID is found, we update the ami_builder_event.json file and then notify CloudWatch Events that the AMI creation process is complete.
  4. CloudWatch Events publishes a message to an SNS topic. Anyone subscribed to the topic will be notified in email that an AMI has been created.

Lastly, the new artifacts phase instructs AWS CodeBuild to upload files built during the build process (ami_builder_event.json and build.log) to the S3 bucket specified in the Outputs section of the CloudFormation template. These artifacts can then be used as an input artifact in any later stage in AWS CodePipeline.

For information about customizing the artifacts sequence of the buildspec.yml, see the Build Specification Reference for AWS CodeBuild.

CloudWatch Events Implementation Details

 
CloudWatch Events allow you to extend the AMI builder to not only send email after the AMI has been created, but to hook up any of the supported targets to react to the AMI builder event. This event publication means you can decouple from Packer actions you might take after AMI completion and plug in other actions, as you see fit.

For more information about targets in CloudWatch Events, see the CloudWatch Events API Reference.

In this case, CloudWatch Events should receive the following event, match it with a rule we created through CloudFormation, and publish a message to SNS so that you can receive an email.

Example CloudWatch custom event

[
        {
            "Source": "com.ami.builder",
            "DetailType": "AmiBuilder",
            "Detail": "{ \"AmiStatus\": \"Created\"}",
            "Resources": [ "ami-12cd5guf" ]
        }
]

Cloudwatch Events rule

{
  "detail-type": [
    "AmiBuilder"
  ],
  "source": [
    "com.ami.builder"
  ],
  "detail": {
    "AmiStatus": [
      "Created"
    ]
  }
}

Example SNS message sent in email

{
    "version": "0",
    "id": "f8bdede0-b9d7...",
    "detail-type": "AmiBuilder",
    "source": "com.ami.builder",
    "account": "<<aws_account_number>>",
    "time": "2017-04-28T17:56:40Z",
    "region": "eu-west-1",
    "resources": ["ami-112cd5guf "],
    "detail": {
        "AmiStatus": "Created"
    }
}

Packer Implementation Details

 
In addition to the build specification file, there are differences between the current version of the HashiCorp Packer template (packer_cis.json) and the one used in Part 1.

Variables

  "variables": {
    "vpc": "{{env `BUILD_VPC_ID`}}",
    "subnet": "{{env `BUILD_SUBNET_ID`}}",
         “ami_name”: “Prod-CIS-Latest-AMZN-{{isotime \”02-Jan-06 03_04_05\”}}”
  },
  • ami_name: Prefixes a name used by Packer to tag resources during the Builders sequence.
  • vpc and subnet: Environment variables defined by the CloudFormation stack parameters.

We no longer assume a default VPC is present and instead use the VPC and subnet specified in the CloudFormation parameters. CloudFormation configures the AWS CodeBuild project to use these values as environment variables. They are made available throughout the build process.

That allows for more flexibility should you need to change which VPC and subnet will be used by Packer to launch temporary resources.

Builders

  "builders": [{
    ...
    "ami_name": “{{user `ami_name`| clean_ami_name}}”,
    "tags": {
      "Name": “{{user `ami_name`}}”,
    },
    "run_tags": {
      "Name": “{{user `ami_name`}}",
    },
    "run_volume_tags": {
      "Name": “{{user `ami_name`}}",
    },
    "snapshot_tags": {
      "Name": “{{user `ami_name`}}",
    },
    ...
    "vpc_id": "{{user `vpc` }}",
    "subnet_id": "{{user `subnet` }}"
  }],

We now have new properties (*_tag) and a new function (clean_ami_name) and launch temporary resources in a VPC and subnet specified in the environment variables. AMI names can only contain a certain set of ASCII characters. If the input in project deviates from the expected characters (for example, includes whitespace or slashes), Packer’s clean_ami_name function will fix it.

For more information, see functions on the HashiCorp Packer website.

Provisioners

  "provisioners": [
    {
        "type": "shell",
        "inline": [
            "sudo pip install ansible"
        ]
    }, 
    {
        "type": "ansible-local",
        "playbook_file": "ansible/playbook.yaml",
        "role_paths": [
            "ansible/roles/common"
        ],
        "playbook_dir": "ansible",
        "galaxy_file": "ansible/requirements.yaml"
    },
    {
      "type": "shell",
      "inline": [
        "rm .ssh/authorized_keys ; sudo rm /root/.ssh/authorized_keys"
      ]
    }

We used shell provisioner to apply OS patches in Part 1. Now, we use shell to install Ansible on the target machine and ansible-local to import, install, and execute Ansible roles to make our target machine conform to our standards.

Packer uses shell to remove temporary keys before it creates an AMI from the target and temporary EC2 instance.

Ansible Implementation Details

 
Ansible provides OS patching through a custom Common role that can be easily customized for other tasks.

CIS Benchmark and Cloudwatch Logs are implemented through two Ansible third-party roles that are defined in ansible/requirements.yaml as seen in the Packer template.

The Ansible provisioner uses Ansible Galaxy to download these roles onto the target machine and execute them as instructed by ansible/playbook.yaml.

For information about how these components are organized, see the Playbook Roles and Include Statements in the Ansible documentation.

The following Ansible playbook (ansible</playbook.yaml) controls the execution order and custom properties:

---
- hosts: localhost
  connection: local
  gather_facts: true    # gather OS info that is made available for tasks/roles
  become: yes           # majority of CIS tasks require root
  vars:
    # CIS Controls whitepaper:  http://bit.ly/2mGAmUc
    # AWS CIS Whitepaper:       http://bit.ly/2m2Ovrh
    cis_level_1_exclusions:
    # 3.4.2 and 3.4.3 effectively blocks access to all ports to the machine
    ## This can break automation; ignoring it as there are stronger mechanisms than that
      - 3.4.2 
      - 3.4.3
    # CloudWatch Logs will be used instead of Rsyslog/Syslog-ng
    ## Same would be true if any other software doesn't support Rsyslog/Syslog-ng mechanisms
      - 4.2.1.4
      - 4.2.2.4
      - 4.2.2.5
    # Autofs is not installed in newer versions, let's ignore
      - 1.1.19
    # Cloudwatch Logs role configuration
    logs:
      - file: /var/log/messages
        group_name: "system_logs"
  roles:
    - common
    - anthcourtney.cis-amazon-linux
    - dharrisio.aws-cloudwatch-logs-agent

Both third-party Ansible roles can be easily configured through variables (vars). We use Ansible playbook variables to exclude CIS controls that don’t apply to our case and to instruct the CloudWatch Logs agent to stream the /var/log/messages log file to CloudWatch Logs.

If you need to add more OS or application logs, you can easily duplicate the playbook and make changes. The CloudWatch Logs agent will ship configured log messages to CloudWatch Logs.

For more information about parameters you can use to further customize third-party roles, download Ansible roles for the Cloudwatch Logs Agent and CIS Amazon Linux from the Galaxy website.

Committing Changes

 
Now that Ansible and CloudWatch Events are configured as a part of the build process, commiting any changes to the AWS CodeComit Git Repository will triger a new AMI build process that can be followed through the AWS CodePipeline console.

When the build is complete, an email will be sent to the email address you provided as a part of the CloudFormation stack deployment. The email serves as notification that an AMI has been built and is ready for use.

Summary

 
We used AWS CodeCommit, AWS CodePipeline, AWS CodeBuild, Packer, and Ansible to build a pipeline that continuously builds new, hardened CIS AMIs. We used Amazon SNS so that email addresses subscribed to a SNS topic are notified upon completion of the AMI build.

By treating our AMI creation process as code, we can iterate and track changes over time. In this way, it’s no different from a software development workflow. With that in mind, software patches, OS configuration, and logs that need to be shipped to a central location are only a git commit away.

Next Steps

 
Here are some ideas to extend this AMI builder:

  • Hook up a Lambda function in Cloudwatch Events to update EC2 Auto Scaling configuration upon completion of the AMI build.
  • Use AWS CodePipeline parallel steps to build multiple Packer images.
  • Add a commit ID as a tag for the AMI you created.
  • Create a scheduled Lambda function through Cloudwatch Events to clean up old AMIs based on timestamp (name or additional tag).
  • Implement Windows support for the AMI builder.
  • Create a cross-account or cross-region AMI build.

Cloudwatch Events allow the AMI builder to decouple AMI configuration and creation so that you can easily add your own logic using targets (AWS Lambda, Amazon SQS, Amazon SNS) to add events or recycle EC2 instances with the new AMI.

If you have questions or other feedback, feel free to leave it in the comments or contribute to the AMI Builder repo on GitHub.

Building Loosely Coupled, Scalable, C# Applications with Amazon SQS and Amazon SNS

Post Syndicated from Tara Van Unen original https://aws.amazon.com/blogs/compute/building-loosely-coupled-scalable-c-applications-with-amazon-sqs-and-amazon-sns/

 
Stephen Liedig, Solutions Architect

 

One of the many challenges professional software architects and developers face is how to make cloud-native applications scalable, fault-tolerant, and highly available.

Fundamental to your project success is understanding the importance of making systems highly cohesive and loosely coupled. That means considering the multi-dimensional facets of system coupling to support the distributed nature of the applications that you are building for the cloud.

By that, I mean addressing not only the application-level coupling (managing incoming and outgoing dependencies), but also considering the impacts of of platform, spatial, and temporal coupling of your systems. Platform coupling relates to the interoperability, or lack thereof, of heterogeneous systems components. Spatial coupling deals with managing components at a network topology level or protocol level. Temporal, or runtime coupling, refers to the ability of a component within your system to do any kind of meaningful work while it is performing a synchronous, blocking operation.

The AWS messaging services, Amazon SQS and Amazon SNS, help you deal with these forms of coupling by providing mechanisms for:

  • Reliable, durable, and fault-tolerant delivery of messages between application components
  • Logical decomposition of systems and increased autonomy of components
  • Creating unidirectional, non-blocking operations, temporarily decoupling system components at runtime
  • Decreasing the dependencies that components have on each other through standard communication and network channels

Following on the recent topic, Building Scalable Applications and Microservices: Adding Messaging to Your Toolbox, in this post, I look at some of the ways you can introduce SQS and SNS into your architectures to decouple your components, and show how you can implement them using C#.

Walkthrough

To illustrate some of these concepts, consider a web application that processes customer orders. As good architects and developers, you have followed best practices and made your application scalable and highly available. Your solution included implementing load balancing, dynamic scaling across multiple Availability Zones, and persisting orders in a Multi-AZ Amazon RDS database instance, as in the following diagram.


In this example, the application is responsible for handling and persisting the order data, as well as dealing with increases in traffic for popular items.

One potential point of vulnerability in the order processing workflow is in saving the order in the database. The business expects that every order has been persisted into the database. However, any potential deadlock, race condition, or network issue could cause the persistence of the order to fail. Then, the order is lost with no recourse to restore the order.

With good logging capability, you may be able to identify when an error occurred and which customer’s order failed. This wouldn’t allow you to “restore” the transaction, and by that stage, your customer is no longer your customer.

As illustrated in the following diagram, introducing an SQS queue helps improve your ordering application. Using the queue isolates the processing logic into its own component and runs it in a separate process from the web application. This, in turn, allows the system to be more resilient to spikes in traffic, while allowing work to be performed only as fast as necessary in order to manage costs.


In addition, you now have a mechanism for persisting orders as messages (with the queue acting as a temporary database), and have moved the scope of your transaction with your database further down the stack. In the event of an application exception or transaction failure, this ensures that the order processing can be retired or redirected to the Amazon SQS Dead Letter Queue (DLQ), for re-processing at a later stage. (See the recent post, Using Amazon SQS Dead-Letter Queues to Control Message Failure, for more information on dead-letter queues.)

Scaling the order processing nodes

This change allows you now to scale the web application frontend independently from the processing nodes. The frontend application can continue to scale based on metrics such as CPU usage, or the number of requests hitting the load balancer. Processing nodes can scale based on the number of orders in the queue. Here is an example of scale-in and scale-out alarms that you would associate with the scaling policy.

Scale-out Alarm

aws cloudwatch put-metric-alarm --alarm-name AddCapacityToCustomerOrderQueue --metric-name ApproximateNumberOfMessagesVisible --namespace "AWS/SQS" 
--statistic Average --period 300 --threshold 3 --comparison-operator GreaterThanOrEqualToThreshold --dimensions Name=QueueName,Value=customer-orders
--evaluation-periods 2 --alarm-actions <arn of the scale-out autoscaling policy>

Scale-in Alarm

aws cloudwatch put-metric-alarm --alarm-name RemoveCapacityFromCustomerOrderQueue --metric-name ApproximateNumberOfMessagesVisible --namespace "AWS/SQS" 
 --statistic Average --period 300 --threshold 1 --comparison-operator LessThanOrEqualToThreshold --dimensions Name=QueueName,Value=customer-orders
 --evaluation-periods 2 --alarm-actions <arn of the scale-in autoscaling policy>

In the above example, use the ApproximateNumberOfMessagesVisible metric to discover the queue length and drive the scaling policy of the Auto Scaling group. Another useful metric is ApproximateAgeOfOldestMessage, when applications have time-sensitive messages and developers need to ensure that messages are processed within a specific time period.

Scaling the order processing implementation

On top of scaling at an infrastructure level using Auto Scaling, make sure to take advantage of the processing power of your Amazon EC2 instances by using as many of the available threads as possible. There are several ways to implement this. In this post, we build a Windows service that uses the BackgroundWorker class to process the messages from the queue.

Here’s a closer look at the implementation. In the first section of the consuming application, use a loop to continually poll the queue for new messages, and construct a ReceiveMessageRequest variable.

public static void PollQueue()
{
    while (_running)
    {
        Task<ReceiveMessageResponse> receiveMessageResponse;

        // Pull messages off the queue
        using (var sqs = new AmazonSQSClient())
        {
            const int maxMessages = 10;  // 1-10

            //Receiving a message
            var receiveMessageRequest = new ReceiveMessageRequest
            {
                // Get URL from Configuration
                QueueUrl = _queueUrl, 
                // The maximum number of messages to return. 
                // Fewer messages might be returned. 
                MaxNumberOfMessages = maxMessages, 
                // A list of attributes that need to be returned with message.
                AttributeNames = new List<string> { "All" },
                // Enable long polling. 
                // Time to wait for message to arrive on queue.
                WaitTimeSeconds = 5 
            };

            receiveMessageResponse = sqs.ReceiveMessageAsync(receiveMessageRequest);
        }

The WaitTimeSeconds property of the ReceiveMessageRequest specifies the duration (in seconds) that the call waits for a message to arrive in the queue before returning a response to the calling application. There are a few benefits to using long polling:

  • It reduces the number of empty responses by allowing SQS to wait until a message is available in the queue before sending a response.
  • It eliminates false empty responses by querying all (rather than a limited number) of the servers.
  • It returns messages as soon any message becomes available.

For more information, see Amazon SQS Long Polling.

After you have returned messages from the queue, you can start to process them by looping through each message in the response and invoking a new BackgroundWorker thread.

// Process messages
if (receiveMessageResponse.Result.Messages != null)
{
    foreach (var message in receiveMessageResponse.Result.Messages)
    {
        Console.WriteLine("Received SQS message, starting worker thread");

        // Create background worker to process message
        BackgroundWorker worker = new BackgroundWorker();
        worker.DoWork += (obj, e) => ProcessMessage(message);
        worker.RunWorkerAsync();
    }
}
else
{
    Console.WriteLine("No messages on queue");
}

The event handler, ProcessMessage, is where you implement business logic for processing orders. It is important to have a good understanding of how long a typical transaction takes so you can set a message VisibilityTimeout that is long enough to complete your operation. If order processing takes longer than the specified timeout period, the message becomes visible on the queue. Other nodes may pick it and process the same order twice, leading to unintended consequences.

Handling Duplicate Messages

In order to manage duplicate messages, seek to make your processing application idempotent. In mathematics, idempotent describes a function that produces the same result if it is applied to itself:

f(x) = f(f(x))

No matter how many times you process the same message, the end result is the same (definition from Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions, Hohpe and Wolf, 2004).

There are several strategies you could apply to achieve this:

  • Create messages that have inherent idempotent characteristics. That is, they are non-transactional in nature and are unique at a specified point in time. Rather than saying “place new order for Customer A,” which adds a duplicate order to the customer, use “place order <orderid> on <timestamp> for Customer A,” which creates a single order no matter how often it is persisted.
  • Deliver your messages via an Amazon SQS FIFO queue, which provides the benefits of message sequencing, but also mechanisms for content-based deduplication. You can deduplicate using the MessageDeduplicationId property on the SendMessage request or by enabling content-based deduplication on the queue, which generates a hash for MessageDeduplicationId, based on the content of the message, not the attributes.
var sendMessageRequest = new SendMessageRequest
{
    QueueUrl = _queueUrl,
    MessageBody = JsonConvert.SerializeObject(order),
    MessageGroupId = Guid.NewGuid().ToString("N"),
    MessageDeduplicationId = Guid.NewGuid().ToString("N")
};
  • If using SQS FIFO queues is not an option, keep a message log of all messages attributes processed for a specified period of time, as an alternative to message deduplication on the receiving end. Verifying the existence of the message in the log before processing the message adds additional computational overhead to your processing. This can be minimized through low latency persistence solutions such as Amazon DynamoDB. Bear in mind that this solution is dependent on the successful, distributed transaction of the message and the message log.

Handling exceptions

Because of the distributed nature of SQS queues, it does not automatically delete the message. Therefore, you must explicitly delete the message from the queue after processing it, using the message ReceiptHandle property (see the following code example).

However, if at any stage you have an exception, avoid handling it as you normally would. The intention is to make sure that the message ends back on the queue, so that you can gracefully deal with intermittent failures. Instead, log the exception to capture diagnostic information, and swallow it.

By not explicitly deleting the message from the queue, you can take advantage of the VisibilityTimeout behavior described earlier. Gracefully handle the message processing failure and make the unprocessed message available to other nodes to process.

In the event that subsequent retries fail, SQS automatically moves the message to the configured DLQ after the configured number of receives has been reached. You can further investigate why the order process failed. Most importantly, the order has not been lost, and your customer is still your customer.

private static void ProcessMessage(Message message)
{
    using (var sqs = new AmazonSQSClient())
    {
        try
        {
            Console.WriteLine("Processing message id: {0}", message.MessageId);

            // Implement messaging processing here
            // Ensure no downstream resource contention (parallel processing)
            // <your order processing logic in here…>
            Console.WriteLine("{0} Thread {1}: {2}", DateTime.Now.ToString("s"), Thread.CurrentThread.ManagedThreadId, message.MessageId);
            
            // Delete the message off the queue. 
            // Receipt handle is the identifier you must provide 
            // when deleting the message.
            var deleteRequest = new DeleteMessageRequest(_queueName, message.ReceiptHandle);
            sqs.DeleteMessageAsync(deleteRequest);
            Console.WriteLine("Processed message id: {0}", message.MessageId);

        }
        catch (Exception ex)
        {
            // Do nothing.
            // Swallow exception, message will return to the queue when 
            // visibility timeout has been exceeded.
            Console.WriteLine("Could not process message due to error. Exception: {0}", ex.Message);
        }
    }
}

Using SQS to adapt to changing business requirements

One of the benefits of introducing a message queue is that you can accommodate new business requirements without dramatically affecting your application.

If, for example, the business decided that all orders placed over $5000 are to be handled as a priority, you could introduce a new “priority order” queue. The way the orders are processed does not change. The only significant change to the processing application is to ensure that messages from the “priority order” queue are processed before the “standard order” queue.

The following diagram shows how this logic could be isolated in an “order dispatcher,” whose only purpose is to route order messages to the appropriate queue based on whether the order exceeds $5000. Nothing on the web application or the processing nodes changes other than the target queue to which the order is sent. The rates at which orders are processed can be achieved by modifying the poll rates and scalability settings that I have already discussed.

Extending the design pattern with Amazon SNS

Amazon SNS supports reliable publish-subscribe (pub-sub) scenarios and push notifications to known endpoints across a wide variety of protocols. It eliminates the need to periodically check or poll for new information and updates. SNS supports:

  • Reliable storage of messages for immediate or delayed processing
  • Publish / subscribe – direct, broadcast, targeted “push” messaging
  • Multiple subscriber protocols
  • Amazon SQS, HTTP, HTTPS, email, SMS, mobile push, AWS Lambda

With these capabilities, you can provide parallel asynchronous processing of orders in the system and extend it to support any number of different business use cases without affecting the production environment. This is commonly referred to as a “fanout” scenario.

Rather than your web application pushing orders to a queue for processing, send a notification via SNS. The SNS messages are sent to a topic and then replicated and pushed to multiple SQS queues and Lambda functions for processing.

As the diagram above shows, you have the development team consuming “live” data as they work on the next version of the processing application, or potentially using the messages to troubleshoot issues in production.

Marketing is consuming all order information, via a Lambda function that has subscribed to the SNS topic, inserting the records into an Amazon Redshift warehouse for analysis.

All of this, of course, is happening without affecting your order processing application.

Summary

While I haven’t dived deep into the specifics of each service, I have discussed how these services can be applied at an architectural level to build loosely coupled systems that facilitate multiple business use cases. I’ve also shown you how to use infrastructure and application-level scaling techniques, so you can get the most out of your EC2 instances.

One of the many benefits of using these managed services is how quickly and easily you can implement powerful messaging capabilities in your systems, and lower the capital and operational costs of managing your own messaging middleware.

Using Amazon SQS and Amazon SNS together can provide you with a powerful mechanism for decoupling application components. This should be part of design considerations as you architect for the cloud.

For more information, see the Amazon SQS Developer Guide and Amazon SNS Developer Guide. You’ll find tutorials on all the concepts covered in this post, and more. To can get started using the AWS console or SDK of your choice visit:

Happy messaging!

Using Amazon SQS Dead-Letter Queues to Control Message Failure

Post Syndicated from Tara Van Unen original https://aws.amazon.com/blogs/compute/using-amazon-sqs-dead-letter-queues-to-control-message-failure/


Michael G. Khmelnitsky, Senior Programmer Writer

 

Sometimes, messages can’t be processed because of a variety of possible issues, such as erroneous conditions within the producer or consumer application. For example, if a user places an order within a certain number of minutes of creating an account, the producer might pass a message with an empty string instead of a customer identifier. Occasionally, producers and consumers might fail to interpret aspects of the protocol that they use to communicate, causing message corruption or loss. Also, the consumer’s hardware errors might corrupt message payload. For these reasons, messages that can’t be processed in a timely manner are delivered to a dead-letter queue.

The recent post Building Scalable Applications and Microservices: Adding Messaging to Your Toolbox gives an overview of messaging in the microservice architecture of modern applications. This post explains how and when you should use dead-letter queues to gain better control over message handling in your applications. It also offers some resources for configuring a dead-letter queue in Amazon Simple Queue Service (SQS).

What are the benefits of dead-letter queues?

The main task of a dead-letter queue is handling message failure. A dead-letter queue lets you set aside and isolate messages that can’t be processed correctly to determine why their processing didn’t succeed. Setting up a dead-letter queue allows you to do the following:

  • Configure an alarm for any messages delivered to a dead-letter queue.
  • Examine logs for exceptions that might have caused messages to be delivered to a dead-letter queue.
  • Analyze the contents of messages delivered to a dead-letter queue to diagnose software or the producer’s or consumer’s hardware issues.
  • Determine whether you have given your consumer sufficient time to process messages.

How do high-throughput, unordered queues handle message failure?

High-throughput, unordered queues (sometimes called standard or storage queues) keep processing messages until the expiration of the retention period. This helps ensure continuous processing of messages, which minimizes the chances of your queue being blocked by messages that can’t be processed. It also ensures fast recovery for your queue.

In a system that processes thousands of messages, having a large number of messages that the consumer repeatedly fails to acknowledge and delete might increase costs and place extra load on the hardware. Instead of trying to process failing messages until they expire, it is better to move them to a dead-letter queue after a few processing attempts.

Note: This queue type often allows a high number of in-flight messages. If the majority of your messages can’t be consumed and aren’t sent to a dead-letter queue, your rate of processing valid messages can slow down. Thus, to maintain the efficiency of your queue, you must ensure that your application handles message processing correctly.

How do FIFO queues handle message failure?

FIFO (first-in-first-out) queues (sometimes called service bus queues) help ensure exactly-once processing by consuming messages in sequence from a message group. Thus, although the consumer can continue to retrieve ordered messages from another message group, the first message group remains unavailable until the message blocking the queue is processed successfully.

Note: This queue type often allows a lower number of in-flight messages. Thus, to help ensure that your FIFO queue doesn’t get blocked by a message, you must ensure that your application handles message processing correctly.

When should I use a dead-letter queue?

  • Do use dead-letter queues with high-throughput, unordered queues. You should always take advantage of dead-letter queues when your applications don’t depend on ordering. Dead-letter queues can help you troubleshoot incorrect message transmission operations. Note: Even when you use dead-letter queues, you should continue to monitor your queues and retry sending messages that fail for transient reasons.
  • Do use dead-letter queues to decrease the number of messages and to reduce the possibility of exposing your system to poison-pill messages (messages that can be received but can’t be processed).
  • Don’t use a dead-letter queue with high-throughput, unordered queues when you want to be able to keep retrying the transmission of a message indefinitely. For example, don’t use a dead-letter queue if your program must wait for a dependent process to become active or available.
  • Don’t use a dead-letter queue with a FIFO queue if you don’t want to break the exact order of messages or operations. For example, don’t use a dead-letter queue with instructions in an Edit Decision List (EDL) for a video editing suite, where changing the order of edits changes the context of subsequent edits.

How do I get started with dead-letter queues in Amazon SQS?

Amazon SQS is a fully managed service that offers reliable, highly scalable hosted queues for exchanging messages between applications or microservices. Amazon SQS moves data between distributed application components and helps you decouple these components. It supports both standard queues and FIFO queues. To configure a queue as a dead-letter queue, you can use the AWS Management Console or the Amazon SQS SetQueueAttributes API action.

To get started with dead-letter queues in Amazon SQS, see the following topics in the Amazon SQS Developer Guide:

To start working with dead-letter queues programmatically, see the following resources:

Amazon Simple Queue Service Introduces Server-Side Encryption for Queues

Post Syndicated from Craig Liebendorfer original https://aws.amazon.com/blogs/security/amazon-simple-queue-service-introduces-server-side-encryption-for-queues/

SQS + SSE image

You can now use Amazon Simple Queue Service (SQS) to exchange sensitive data between applications using server-side encryption (SSE). SQS is a fully managed message queuing service for reliably communicating between distributed software components and microservices at any scale. You can use SQS to take advantage of the scale, cost, and operational benefits of a managed messaging service. The addition of SSE allows you to transmit sensitive data with the increased security of using encrypted queues.

SQS SSE uses the 256-bit Advanced Encryption Standard (AES-256 GCM algorithm) to encrypt each message body by using a unique key. AWS Key Management Service (KMS) provides encryption key management. In addition, KMS works with AWS CloudTrail to provide logs of all encryption key usage to help meet your regulatory and compliance needs.

SQS SSE is now available in the US West (Oregon) and US East (Ohio) Regions, with more AWS Regions to follow. There is no additional charge for using encrypted queues, but there is a charge to use KMS. For more information about KMS and CloudTrail pricing, see AWS Key Management Service Pricing.

To learn more, see the full post on the AWS Blog.

– Craig

Four HIPAA Eligible Services Recently Added to the AWS Business Associate Agreement

Post Syndicated from Chad Woolf original https://aws.amazon.com/blogs/security/four-hipaa-eligible-services-recently-added-to-the-aws-business-associate-agreement/

HIPAA logo

We are pleased to announce that the following four AWS services have been added in recent weeks to the AWS Business Associate Agreement (BAA):

As with all HIPAA Eligible Services covered under the BAA, Protected Health Information (PHI) must be encrypted while at rest or in transit. See Architecting for HIPAA Security and Compliance on Amazon Web Services, which explains how you can configure each AWS HIPAA Eligible Service to store, process, and transmit PHI.

For more details, see the full AWS Blog post.

– Chad

Roundup of AWS HIPAA Eligible Service Announcements

Post Syndicated from Ana Visneski original https://aws.amazon.com/blogs/aws/roundup-of-aws-hipaa-eligible-service-announcements/

At AWS we have had a number of HIPAA eligible service announcements. Patrick Combes, the Healthcare and Life Sciences Global Technical Leader at AWS, and Aaron Friedman, a Healthcare and Life Sciences Partner Solutions Architect at AWS, have written this post to tell you all about it.

-Ana


We are pleased to announce that the following AWS services have been added to the BAA in recent weeks: Amazon API Gateway, AWS Direct Connect, AWS Database Migration Service, and Amazon SQS. All four of these services facilitate moving data into and through AWS, and we are excited to see how customers will be using these services to advance their solutions in healthcare. While we know the use cases for each of these services are vast, we wanted to highlight some ways that customers might use these services with Protected Health Information (PHI).

As with all HIPAA-eligible services covered under the AWS Business Associate Addendum (BAA), PHI must be encrypted while at-rest or in-transit. We encourage you to reference our HIPAA whitepaper, which details how you might configure each of AWS’ HIPAA-eligible services to store, process, and transmit PHI. And of course, for any portion of your application that does not touch PHI, you can use any of our 90+ services to deliver the best possible experience to your users. You can find some ideas on architecting for HIPAA on our website.

Amazon API Gateway
Amazon API Gateway is a web service that makes it easy for developers to create, publish, monitor, and secure APIs at any scale. With PHI now able to securely transit API Gateway, applications such as patient/provider directories, patient dashboards, medical device reports/telemetry, HL7 message processing and more can securely accept and deliver information to any number and type of applications running within AWS or client presentation layers.

One particular area we are excited to see how our customers leverage Amazon API Gateway is with the exchange of healthcare information. The Fast Healthcare Interoperability Resources (FHIR) specification will likely become the next-generation standard for how health information is shared between entities. With strong support for RESTful architectures, FHIR can be easily codified within an API on Amazon API Gateway. For more information on FHIR, our AWS Healthcare Competency partner, Datica, has an excellent primer.

AWS Direct Connect
Some of our healthcare and life sciences customers, such as Johnson & Johnson, leverage hybrid architectures and need to connect their on-premises infrastructure to the AWS Cloud. Using AWS Direct Connect, you can establish private connectivity between AWS and your datacenter, office, or colocation environment, which in many cases can reduce your network costs, increase bandwidth throughput, and provide a more consistent network experience than Internet-based connections.

In addition to a hybrid-architecture strategy, AWS Direct Connect can assist with the secure migration of data to AWS, which is the first step to using the wide array of our HIPAA-eligible services to store and process PHI, such as Amazon S3 and Amazon EMR. Additionally, you can connect to third-party/externally-hosted applications or partner-provided solutions as well as securely and reliably connect end users to those same healthcare applications, such as a cloud-based Electronic Medical Record system.

AWS Database Migration Service (DMS)
To date, customers have migrated over 20,000 databases to AWS through the AWS Database Migration Service. Customers often use DMS as part of their cloud migration strategy, and now it can be used to securely and easily migrate your core databases containing PHI to the AWS Cloud. As your source database remains fully operational during the migration with DMS, you minimize downtime for these business-critical applications as you migrate your databases to AWS. This service can now be utilized to securely transfer such items as patient directories, payment/transaction record databases, revenue management databases and more into AWS.

Amazon Simple Queue Service (SQS)
Amazon Simple Queue Service (SQS) is a message queueing service for reliably communicating among distributed software components and microservices at any scale. One way that we envision customers using SQS with PHI is to buffer requests between application components that pass HL7 or FHIR messages to other parts of their application. You can leverage features like SQS FIFO to ensure your messages containing PHI are passed in the order they are received and delivered in the order they are received, and available until a consumer processes and deletes it. This is important for applications with patient record updates or processing payment information in a hospital.

Let’s get building!
We are beyond excited to see how our customers will use our newly HIPAA-eligible services as part of their healthcare applications. What are you most excited for? Leave a comment below.

AWS Hot Startups – March 2017

Post Syndicated from Ana Visneski original https://aws.amazon.com/blogs/aws/aws-hot-startups-march-2017/

As the madness of March rounds up, take a break from all the basketball and check out the cool startups Tina Barr brings you for this month!

-Ana


The arrival of spring brings five new startups this month:

  • Amino Apps – providing social networks for hundreds of thousands of communities.
  • Appboy – empowering brands to strengthen customer relationships.
  • Arterys – revolutionizing the medical imaging industry.
  • Protenus – protecting patient data for healthcare organizations.
  • Syapse – improving targeted cancer care with shared data from across the country.

In case you missed them, check out February’s hot startups here.

Amino Apps (New York, NY)
Amino Logo
Amino Apps was founded on the belief that interest-based communities were underdeveloped and outdated, particularly when it came to mobile. CEO Ben Anderson and CTO Yin Wang created the app to give users access to hundreds of thousands of communities, each of them a complete social network dedicated to a single topic. Some of the largest communities have over 1 million members and are built around topics like popular TV shows, video games, sports, and an endless number of hobbies and other interests. Amino hosts communities from around the world and is currently available in six languages with many more on the way.

Navigating the Amino app is easy. Simply download the app (iOS or Android), sign up with a valid email address, choose a profile picture, and start exploring. Users can search for communities and join any that fit their interests. Each community has chatrooms, multimedia content, quizzes, and a seamless commenting system. If a community doesn’t exist yet, users can create it in minutes using the Amino Creator and Manager app (ACM). The largest user-generated communities are turned into their own apps, which gives communities their own piece of real estate on members’ phones, as well as in app stores.

Amino’s vast global network of hundreds of thousands of communities is run on AWS services. Every day users generate, share, and engage with an enormous amount of content across hundreds of mobile applications. By leveraging AWS services including Amazon EC2, Amazon RDS, Amazon S3, Amazon SQS, and Amazon CloudFront, Amino can continue to provide new features to their users while scaling their service capacity to keep up with user growth.

Interested in joining Amino? Check out their jobs page here.

Appboy (New York, NY)
In 2011, Bill Magnuson, Jon Hyman, and Mark Ghermezian saw a unique opportunity to strengthen and humanize relationships between brands and their customers through technology. The trio created Appboy to empower brands to build long-term relationships with their customers and today they are the leading lifecycle engagement platform for marketing, growth, and engagement teams. The team recognized that as rapid mobile growth became undeniable, many brands were becoming frustrated with the lack of compelling and seamless cross-channel experiences offered by existing marketing clouds. Many of today’s top mobile apps and enterprise companies trust Appboy to take their marketing to the next level. Appboy manages user profiles for nearly 700 million monthly active users, and is used to power more than 10 billion personalized messages monthly across a multitude of channels and devices.

Appboy creates a holistic user profile that offers a single view of each customer. That user profile in turn powers contextual cross-channel messaging, lifecycle engagement automation, and robust campaign insights and optimization opportunities. Appboy offers solutions that allow brands to create push notifications, targeted emails, in-app and in-browser messages, news feed cards, and webhooks to enhance the user experience and increase customer engagement. The company prides itself on its interoperability, connecting to a variety of complimentary marketing tools and technologies so brands can build the perfect stack to enable their strategies and experiments in real time.

AWS makes it easy for Appboy to dynamically size all of their service components and automatically scale up and down as needed. They use an array of services including Elastic Load Balancing, AWS Lambda, Amazon CloudWatch, Auto Scaling groups, and Amazon S3 to help scale capacity and better deal with unpredictable customer loads.

To keep up with the latest marketing trends and tactics, visit the Appboy digital magazine, Relate. Appboy was also recently featured in the #StartupsOnAir video series where they gave insight into their AWS usage.

Arterys (San Francisco, CA)
Getting test results back from a physician can often be a time consuming and tedious process. Clinicians typically employ a variety of techniques to manually measure medical images and then make their assessments. Arterys founders Fabien Beckers, John Axerio-Cilies, Albert Hsiao, and Shreyas Vasanawala realized that much more computation and advanced analytics were needed to harness all of the valuable information in medical images, especially those generated by MRI and CT scanners. Clinicians were often skipping measurements and making assessments based mostly on qualitative data. Their solution was to start a cloud/AI software company focused on accelerating data-driven medicine with advanced software products for post-processing of medical images.

Arterys’ products provide timely, accurate, and consistent quantification of images, improve speed to results, and improve the quality of the information offered to the treating physician. This allows for much better tracking of a patient’s condition, and thus better decisions about their care. Advanced analytics, such as deep learning and distributed cloud computing, are used to process images. The first Arterys product can contour cardiac anatomy as accurately as experts, but takes only 15-20 seconds instead of the 45-60 minutes required to do it manually. Their computing cloud platform is also fully HIPAA compliant.

Arterys relies on a variety of AWS services to process their medical images. Using deep learning and other advanced analytic tools, Arterys is able to render images without latency over a web browser using AWS G2 instances. They use Amazon EC2 extensively for all of their compute needs, including inference and rendering, and Amazon S3 is used to archive images that aren’t needed immediately, as well as manage costs. Arterys also employs Amazon Route 53, AWS CloudTrail, and Amazon EC2 Container Service.

Check out this quick video about the technology that Arterys is creating. They were also recently featured in the #StartupsOnAir video series and offered a quick demo of their product.

Protenus (Baltimore, MD)
Protenus Logo
Protenus founders Nick Culbertson and Robert Lord were medical students at Johns Hopkins Medical School when they saw first-hand how Electronic Health Record (EHR) systems could be used to improve patient care and share clinical data more efficiently. With increased efficiency came a huge issue – an onslaught of serious security and privacy concerns. Over the past two years, 140 million medical records have been breached, meaning that approximately 1 in 3 Americans have had their health data compromised. Health records contain a repository of sensitive information and a breach of that data can cause major havoc in a patient’s life – namely identity theft, prescription fraud, Medicare/Medicaid fraud, and improper performance of medical procedures. Using their experience and knowledge from former careers in the intelligence community and involvement in a leading hedge fund, Nick and Robert developed the prototype and algorithms that launched Protenus.

Today, Protenus offers a number of solutions that detect breaches and misuse of patient data for healthcare organizations nationwide. Using advanced analytics and AI, Protenus’ health data insights platform understands appropriate vs. inappropriate use of patient data in the EHR. It also protects privacy, aids compliance with HIPAA regulations, and ensures trust for patients and providers alike.

Protenus built and operates its SaaS offering atop Amazon EC2, where Dedicated Hosts and encrypted Amazon EBS volume are used to ensure compliance with HIPAA regulation for the storage of Protected Health Information. They use Elastic Load Balancing and Amazon Route 53 for DNS, enabling unique, secure client specific access points to their Protenus instance.

To learn more about threats to patient data, read Hospitals’ Biggest Threat to Patient Data is Hiding in Plain Sight on the Protenus blog. Also be sure to check out their recent video in the #StartupsOnAir series for more insight into their product.

Syapse (Palo Alto, CA)
Syapse provides a comprehensive software solution that enables clinicians to treat patients with precision medicine for targeted cancer therapies — treatments that are designed and chosen using genetic or molecular profiling. Existing hospital IT doesn’t support the robust infrastructure and clinical workflows required to treat patients with precision medicine at scale, but Syapse centralizes and organizes patient data to clinicians at the point of care. Syapse offers a variety of solutions for oncologists that allow them to access the full scope of patient data longitudinally, view recommended treatments or clinical trials for similar patients, and track outcomes over time. These solutions are helping health systems across the country to improve patient outcomes by offering the most innovative care to cancer patients.

Leading health systems such as Stanford Health Care, Providence St. Joseph Health, and Intermountain Healthcare are using Syapse to improve patient outcomes, streamline clinical workflows, and scale their precision medicine programs. A group of experts known as the Molecular Tumor Board (MTB) reviews complex cases and evaluates patient data, documents notes, and disseminates treatment recommendations to the treating physician. Syapse also provides reports that give health system staff insight into their institution’s oncology care, which can be used toward quality improvement, business goals, and understanding variables in the oncology service line.

Syapse uses Amazon Virtual Private Cloud, Amazon EC2 Dedicated Instances, and Amazon Elastic Block Store to build a high-performance, scalable, and HIPAA-compliant data platform that enables health systems to make precision medicine part of routine cancer care for patients throughout the country.

Be sure to check out the Syapse blog to learn more and also their recent video on the #StartupsOnAir video series where they discuss their product, HIPAA compliance, and more about how they are using AWS.

Thank you for checking out another month of awesome hot startups!

-Tina Barr

 

A Serverless Authentication System by Jumia

Post Syndicated from Bryan Liston original https://aws.amazon.com/blogs/compute/a-serverless-authentication-system-by-jumia/

Jumia Group is an ecosystem of nine different companies operating in 22 different countries in Africa. Jumia employs 3000 people and serves 15 million users/month.

Want to secure and centralize millions of user accounts across Africa? Shut down your servers! Jumia Group unified and centralized customer authentication on nine digital services platforms, operating in 22 (and counting) countries in Africa, totaling over 120 customer and merchant facing applications. All were unified into a custom Jumia Central Authentication System (JCAS), built in a timely fashion and designed using a serverless architecture.

In this post, we give you our solution overview. For the full technical implementation, see the Jumia Central Authentication System post on the Jumia Tech blog.

The challenge

A group-level initiative was started to centralize authentication for all Jumia users in all countries for all companies. But it was impossible to unify the operational databases for the different companies. Each company had its own user database with its own technological implementation. Each company alone had yet to unify the logins for their own countries. The effects of deduplicating all user accounts were yet to be determined but were considered to be large. Finally, there was no team responsible for manage this new project, given that a new dedicated infrastructure would be needed.

With these factors in mind, we decided to design this project as a serverless architecture to eliminate the need for infrastructure and a dedicated team. AWS was immediately considered as the best option for its level of
service, intelligent pricing model, and excellent serverless services.

The goal was simple. For all user accounts on all Jumia Group websites:

  • Merge duplicates that might exist in different companies and countries
  • Create a unique login (username and password)
  • Enrich the user profile in this centralized system
  • Share the profile with all Jumia companies

Requirements

We had the following initial requirements while designing this solution on the AWS platform:

  • Secure by design
  • Highly available via multimaster replication
  • Single login for all platforms/countries
  • Minimal performance impact
  • No admin overhead

Chosen technologies

We chose the following AWS services and technologies to implement our solution.

Amazon API Gateway

Amazon API Gateway is a fully managed service, making it really simple to set up an API. It integrates directly with AWS Lambda, which was chosen as our endpoint. It can be easily replicated to other regions, using Swagger import/export.

AWS Lambda

AWS Lambda is the base of serverless computing, allowing you to run code without worrying about infrastructure. All our code runs on Lambda functions using Python; some functions are
called from the API Gateway, others are scheduled like cron jobs.

Amazon DynamoDB

Amazon DynamoDB
is a highly scalable, NoSQL database with a good API and a clean pricing model. It has great scalability as well as high availability, and fits the serverless model we aimed for with JCAS.

AWS KMS

AWS KMS was chosen as a key manager to perform envelope encryption. It’s a simple and secure key manager with multiple encryption functionalities.

Envelope encryption

Oversimplifying envelope encryption is when you encrypt a key rather than the data itself. That encrypted key, which was used to encrypt the data, may now be stored with the data itself on your persistence layer since it doesn’t decrypt the data if compromised. For more information, see How Envelope Encryption Works with Supported AWS Services.

Envelope encryption was chosen given that master keys have a 4 KB limit for data to be encrypted or decrypted.

Amazon SQS

Amazon SQS is an inexpensive queuing service with dynamic scaling, 14-day retention availability, and easy management. It’s the perfect choice for our needs, as we use queuing systems only as a
fallback when saving data to remote regions fails. All the features needed for those fallback cases were covered.

JWT

We also use JSON web tokens for encoding and signing communications between JCAS and company servers. It’s another layer of security for data in transit.

Data structure

Our database design was pretty straightforward. DynamoDB records can be accessed by the primary key and allow access using secondary indexes as well. We created a UUID for each user on JCAS, which is used as a primary key
on DynamoDB.

Most data must be encrypted so we use a single field for that data. On the other hand, there’s data that needs to be stored in separate fields as they need to be accessed from the code without decryption for lookups or basic checks. This indexed data was also stored, as a hash or plain, outside the main ciphered blob store.

We used a field for each searchable data piece:

  • Id (primary key)
  • main phone (indexed)
  • main email (indexed)
  • account status
  • document timestamp

To hold the encrypted data we use a data dictionary with two main dictionaries, info with the user’s encrypted data and keys with the key for each AWS KMS region to decrypt the info blob.
Passwords are stored in two dictionaries, old_hashes contains the legacy hashes from the origin systems and secret holds the user’s JCAS password.

Here’s an example:
data_at_rest

Security

Security is like an onion, it needs to be layered. That’s what we did when designing this solution. Our design makes all of our data unreadable at each layer of this solution, while easing our compliance needs.

A field called data stores all personal information from customers. It’s encrypted using AES256-CBC with a key generated and managed by AWS KMS. A new data key is used for each transaction. For communication between companies and the API, we use API Keys, TLS and JWT, in the body to ensure that the post is signed and verified.

Data flow

Our second requirement on JCAS was system availability, so we designed some data pipelines. These pipelines allow multi-region replication, which evades collision using idempotent operations on all endpoints. The only technology added to the stack was Amazon SQS. On SQS queues, we place all the items we aren’t able to replicate at the time of the client’s request.

JCAS may have inconsistencies between regions, caused by network or region availability; by using SQS, we have workers that synchronize all regions as soon as possible.

Example with two regions

We have three Lambda functions and two SQS queues, where:

1) A trigger Lambda function is called for the DynamoDB stream. Upon changes to the user’s table, it tries to write directly to another DynamoDB table in the second region, falling back to writing to two SQS queues.

fig1

2) A scheduled Lambda function (cron-style) checks a SQS queue for items and tries writing them to the DynamoDB table that potentially failed.

fig2

3) A cron-style Lambda function checks the SQS queue, calling KMS for any items, and fixes the issue.

fig3

The following diagram shows the full infrastrucure (for clarity, this diagram leaves out KMS recovery).

CAS Full Diagram

Results

Upon going live, we noticed a minor impact in our response times. Note the brown legend in the images below.

lat_1

This was a cold start, as the infrastructure started to get hot, response time started to converge. On 27 April, we were almost at the initial 500ms.
lat_2

It kept steady on values before JCAS went live (≈500ms).
lat_3

As of the writing of this post, our response time kept improving (dev changed the method name and we changed the subdomain name).
lat_4

Customer login used to take ≈500ms and it still takes ≈500ms with JCAS. Those times have improved as other components changed inside our code.

Lessons learned

  • Instead of the standard cross-region replication, creating your own DynamoDB cross-region replication might be a better fit for you, if your data and applications allow it.
  • Take some time to tweak the Lambda runtime memory. Remember that it’s billed per 100ms, so it saves you money if you have it run near a round number.
  • KMS takes away the problem of key management with great security by design. It really simplifies your life.
  • Always check the timestamp before operating on data. If it’s invalid, save the money by skipping further KMS, DynamoDB, and Lambda calls. You’ll love your systems even more.
  • Embrace the serverless paradigm, even if it looks complicated at first. It will save your life further down the road when your traffic bursts or you want to find an engineer who knows your whole system.

Next steps

We are going to leverage everything we’ve done so far to implement SSO in Jumia. For a future project, we are already testing OpenID connect with DynamoDB as a backend.

Conclusion

We did a mindset revolution in many ways.
Not only we went completely serverless as we started storing critical info on the cloud.
On top of this we also architectured a system where all user data is decoupled between local systems and our central auth silo.
Managing all of these critical systems became far more predictable and less cumbersome than we thought possible.
For us this is the proof that good and simple designs are the best features to look out for when sketching new systems.

If we were to do this again, we would do it in exactly the same way.

Daniel Loureiro (SecOps) & Tiago Caxias (DBA) – Jumia Group

AWS Achieves FedRAMP Authorization for New Services in the AWS GovCloud (US) Region

Post Syndicated from Chad Woolf original https://aws.amazon.com/blogs/security/aws-achieves-fedramp-authorization-for-a-wide-array-of-services/

Today, we’re pleased to announce an array of AWS services that are available in the AWS GovCloud (US) Region and have achieved Federal Risk and Authorization Management Program (FedRAMP) High authorizations. The FedRAMP Joint Authorization Board (JAB) has issued Provisional Authority to Operate (P-ATO) approvals, which are effective immediately. If you are a federal or commercial customer, you can use these services to process and store your critical workloads in the AWS GovCloud (US) Region’s authorization boundary with data up to the high impact level.

The services newly available in the AWS GovCloud (US) Region include database, storage, data warehouse, security, and configuration automation solutions that will help you increase your ability to manage data in the cloud. For example, with AWS CloudFormation, you can deploy AWS resources by automating configuration processes. AWS Key Management Service (KMS) enables you to create and control the encryption keys used to secure your data. Amazon Redshift enables you to analyze all your data cost effectively by using existing business intelligence tools to automate common administrative tasks for managing, monitoring, and scaling your data warehouse.

Our federal and commercial customers can now leverage our FedRAMP P-ATO to access the following services:

  • CloudFormation – CloudFormation gives developers and systems administrators an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion. You can use sample templates in CloudFormation, or create your own templates to describe the AWS resources and any associated dependencies or run-time parameters required to run your application.
  • Amazon DynamoDBAmazon DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent, single-digit-millisecond latency at any scale. It is a fully managed cloud database and supports both document and key-value store models.
  • Amazon EMRAmazon EMR provides a managed Hadoop framework that makes it efficient and cost effective to process vast amounts of data across dynamically scalable Amazon EC2 instances. You can also run other popular distributed frameworks such as Apache Spark, HBase, Presto, and Flink in EMR, and interact with data in other AWS data stores such as Amazon S3 and DynamoDB.
  • Amazon GlacierAmazon Glacier is a secure, durable, and low-cost cloud storage service for data archiving and long-term backup. Customers can reliably store large or small amounts of data for as little as $0.004 per gigabyte per month, a significant savings compared to on-premises solutions.
  • KMS – KMS is a managed service that makes it easier for you to create and control the encryption keys used to encrypt your data, and uses Hardware Security Modules (HSMs) to protect the security of your keys. KMS is integrated with other AWS services to help you protect the data you store with these services. For example, KMS is integrated with CloudTrail to provide you with logs of all key usage and help you meet your regulatory and compliance needs.
  • Redshift – Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost effective to analyze all your data by using your existing business intelligence tools.
  • Amazon Simple Notification Service (SNS)Amazon SNS is a fast, flexible, fully managed push notification service that lets you send individual messages or “fan out” messages to large numbers of recipients. SNS makes it simple and cost effective to send push notifications to mobile device users and email recipients or even send messages to other distributed services.
  • Amazon Simple Queue Service (SQS)Amazon SQS is a fully-managed message queuing service for reliably communicating among distributed software components and microservices—at any scale. Using SQS, you can send, store, and receive messages between software components at any volume, without losing messages or requiring other services to be always available.
  • Amazon Simple Workflow Service (SWF)Amazon SWF helps developers build, run, and scale background jobs that have parallel or sequential steps. SWF is a fully managed state tracker and task coordinator in the cloud.

AWS works closely with the FedRAMP Program Management Office (PMO), National Institute of Standards and Technology (NIST), and other federal regulatory and compliance bodies to ensure that we provide you with the cutting-edge technology you need in a secure and compliant fashion. We are working with our authorizing officials to continue to expand the scope of our authorized services, and we are fully committed to ensuring that AWS GovCloud (US) continues to offer government customers the most comprehensive mix of functionality and security.

– Chad

AWS Lambda – A Look Back at 2016

Post Syndicated from Tara Walker original https://aws.amazon.com/blogs/aws/aws-lambda-a-look-back-at-2016/

2016 was an exciting year for AWS Lambda, Amazon API Gateway and serverless compute technology, to say the least. But just in case you have been hiding away and haven’t heard of serverless computing with AWS Lambda and Amazon API Gateway, let me introduce these great services to you.  AWS Lambda lets you run code without provisioning or managing servers, making it a serverless compute service that is event-driven and allows developers to bring their functions to the cloud easily for virtually any type of application or backend.  Amazon API Gateway helps you quickly build highly scalable, secure, and robust APIs at scale and provides the ability to maintain and monitor created APIs.

With the momentum of serverless in 2016, of course, the year had to end with a bang as the AWS team launched some powerful service features at re:Invent to make it even easier to build serverless solutions.  These features include:

Since Jeff has already introduced most of the aforementioned new service features for building distributed applications and microservices like Step Functions, let’s walk-through the last four new features not yet discussed using a common serverless use case example: Real-time Stream Processing.  In our walk-through of the stream processing use case, we will implement a Dead Letter Queue for notifications of errors that may come from the Lambda function processing a stream of data, we will take an existing Lambda function written in Node.js to process the stream and rewrite it using the C# language.  We then will build an example of the monetization of a Lambda backed API using API Gateway’s integration with AWS Marketplace.  This will be exciting, so let’s get started.

During the AWS Developer Days in San Francisco and Austin, I presented an example of leveraging AWS Lambda for real-time stream processing by building a demo showcasing a streaming solution with Twitter Streaming APIs. I will build upon this example to demonstrate the power of Dead Letter Queues (DLQ), C# Support, API Gateway Monetization features, and the open source template for API Gateway Developer Portal.  In the demo, a console or web application streams tweets gathered from the Twitter Streaming API that has the keywords ‘awscloud’ and/or ‘serverless’.  Those tweets are sent real-time to Amazon Kinesis Streams where Lambda detects the new records and processes the stream batch by writing the tweets to the NoSQL database, Amazon DynamoDB.

Now that we understand the real-time streaming process demo’s workflow, let’s take a deeper look at the Lambda function that processes the batch records from Kinesis.  First, you will notice below that the Lambda function, DevDayStreamProcessor, has an event source or trigger that is a Kinesis stream named DevDay2016Stream with a Batch size of 100.  Our Lambda function will poll the stream periodically for new records and automatically read and process batches of records, in this case, the tweets detected on the stream.

Now we will examine our Lambda function code which is written in Node.js 4.3. The section of the Lambda function shown below loops through the batch of tweet records from our Kinesis stream, parses each record, and writes desired tweet information into an array of JSON data. The array of the JSON tweet items is passed to the function, ddbItemsWrite which is outside of our Lambda handler.

'use strict';
console.log('Loading function');

var timestamp;
var twitterID;
var tweetData;
var ddbParams;
var itemNum = 0;
var dataItemsBatch = [];
var dbBatch = [];
var AWS = require('aws-sdk');
var ddbTable = 'TwitterStream';
var dynamoDBClient = new AWS.DynamoDB.DocumentClient();

exports.handler = (event, context, callback) => {
    var counter = 0; 
    
    event.Records.forEach((record) => {
        // Kinesis data is base64 encoded so decode here
        console.log("Base 64 record: " + JSON.stringify(record, null, 2));
        const payload = new Buffer(record.kinesis.data, 'base64').toString('ascii');
        console.log('Decoded payload:', payload);
        
        var data = payload.replace(/[\u0000-\u0019]+/g," "); 
        try
        {  tweetData = JSON.parse(data);   }
        catch(err)
        {  callback(err, err.stack);   }
        
        timestamp = "" + new Date().getTime();
        twitterID = tweetData.id.toString();
        itemNum = itemNum+1;
               
         var ddbItem = {
                PutRequest: { 
                    Item: { 
                        TwitterID: twitterID,
                        TwitterUser: tweetData.username.toString(),
                        TwitterUserPic: tweetData.pic,
                        TwitterTime: new Date(tweetData.time.replace(/( \+)/, ' UTC$1')).toLocaleString(), 
                        Tweet: tweetData.text,
                        TweetTopic: tweetData.topic,
                        Tags: (tweetData.hashtags) ? tweetData.hashtags : " ",
                        Location: (tweetData.loc) ? tweetData.loc : " ",
                        Country: (tweetData.country) ? tweetData.country : " ",
                        TimeStamp: timestamp,
                        RecordNum: itemNum
                    }
                }
            };
            
            dataItemsBatch.push(ddbItem);
            counter++;
});
    
    var twitterItems = {}; 
    twitterItems[ddbTable] = dataItemsBatch; 
    ddbItemsWrite(twitterItems, 0, context, callback); 

};

The ddbItemsWrite function shown below will take the array of JSON tweet records processed from the Kinesis stream, and write the records multiple items at a time to our DynamoDB table using batch operations. This function leverages the DynamoDB best practice of retrying unprocessed items by implementing an exponential backoff algorithm to prevent write request failures due to throttling on the individual tables.

 function ddbItemsWrite(items, retries, ddbContext, ddbCallback) 
    { 
        dynamoDBClient.batchWrite({ RequestItems: items }, function(err, data) 
            { 
                if (err) 
                { 
                    console.log('DDB call failed: ' + err, err.stack); 
                    ddbCallback(err, err.stack); 
                } 
                else 
                { 
                    if(Object.keys(data.UnprocessedItems).length) 
                    { 
                        console.log('Unprocessed items remain, retrying.'); 
                        var delay = Math.min(Math.pow(2, retries) * 100, ddbContext.getRemainingTimeInMillis() - 200); 
                        setTimeout(function() {ddbItemsWrite(data.UnprocessedItems, retries + 1, ddbContext, ddbCallback)}, delay); 
                    } 
                    else 
                    { 
                         ddbCallback(null, "Success");
                         console.log("Completed Successfully");
                    } 
                } 
            } 
        );
    }

Currently, this Lambda function works as expected and will successfully process tweets captured in Kinesis from the Twitter Streaming API, however, this function has a flaw that will cause an error to occur when processing batch write requests to our DynamoDB table.  In the Lambda function, the current code does not take into account that the DynamoDB batchWrite function should be comprised of no more than 25 write (put) requests per single call to this function up to 16 MB of data. Therefore, without changing the code appropriately to have the ddbItemsWrite function to handle batches of 25 or have the handler function put items in the array in groups of 25 requests before sending to the ddbItemsWrite function; there will be a validation exception thrown when the batch of tweets items sent is greater than 25.  This is a great example of a bug that is not easily detected in small-scale testing scenarios yet will cause failures under production load.

 

Dead Letter Queues

Now that we are aware of an event that will cause the ddbItemsWrite Lambda function to throw an exception and/or an event that will fail while processing records, we have a first-rate scenario for leveraging Dead Letter Queues (DLQ).

Since AWS Lambda DLQ functionality is only available for asynchronous event sources like Amazon S3, Amazon SNS, AWS IoT or direct asynchronous invocations, and not for streaming event sources such as Amazon Kinesis or Amazon DynamoDB streams; our first step is to break this Lambda function into two functions.  The first Lambda function will handle the processing of the Kinesis stream, and the second Lambda function will take the data processed by the first function and write the tweet information to DynamoDB.  We will then setup our DLQ on the second Lambda function for the error that will occur on writing the batch of tweets to DynamoDB as noted above.

We have two options when setting up a target for our DLQ; Amazon SNS topic or an Amazon SQS queue.  In this walk-through, we will opt for using an Amazon SQS queue.  Therefore, my first step in using DLQ is to create a SQS Standard queue.  A Standard queue type is a queue which has high transactions throughput, a message will be delivered at least once, but another copy of the message may also be delivered, and it is possible that messages might be delivered in an order different from which they were sent.  You can learn more about creating SQS queues and queue type in the Amazon SQS documentation.

Once my queue, StreamDemoDLQ, is created, I will grab the ARN from the Details tab of this selected queue. If I am not using the console to designate the DLQ resource for this function, I will need the ARN for the queue for my Lambda function to identify this SQS queue as the DLQ target for error and event failure notifications. Additionally, I will use the ARN to add permissions to my Lambda execution role policy in order to access this SQS queue.

I will now return to my Lambda function and select the Configuration tab and expand the Advanced settings section. I will select SQS in the DLQ Resource field and select my StreamDemoDLQ queue in the SQS Queue field dropdown.

Remember, the execution role for the Lambda function must explicitly provide sqs:SendMessage access permissions to in order to successfully send messages to your SQS DLQ.  Therefore, I ensured that my Lambda role, lambda_kinesis_role, has the following IAM policy for SQS permissions.

 

We have now successfully configured a Dead Letter Queue for our Lambda function using Amazon SQS. To learn more about Dead Letter Queues in Lambda, read the Troubleshooting and Monitoring section of the AWS Lambda Developer Guide and check out the AWS Compute Blog post on Dead Letter Queues.

C# Support

As I mentioned earlier, another very exciting feature added to Lambda during AWS re:Invent was the support for the C# language via the open source .NET Core 1.0 platform.  Since the Lambda console does not offer editing for compiled languages yet, in order to author a C# Lambda function you can use tooling in Visual Studio with the AWS Toolkit, Yeoman, and/or the .NET CLI.  To deploy Lambda functions written in C#, you can use the Lambda plugin in the AWS ToolKit for Visual Studio or create a deployment package with the .NET Core command line.

A C# Lambda function handler should be defined as an instance or static method in a class. There are two handler function parameters; the first is the input type which is the event data and second is the Lambda context object of type ILambdaContext. The event data input object types for AWS Services include the following:

  • Amazon.Lambda.APIGatewayEvents
  • Amazon.Lambda.CognitoEvents
  • Amazon.Lambda.ConfigEvents
  • Amazon.Lambda.DynamoDBEvents
  • Amazon.Lambda.KinesisEvents
  • Amazon.Lambda.S3Events
  • Amazon.Lambda.SNSEvents

Now that we have discussed more detail around C# Support in Lambda, let’s rewrite our DevDayStreamProcessor lambda function with the C# language. For this example, I will use Visual Studio IDE to write the Lambda function, and additionally take advantage of the AWS Lambda Visual Studio plugin to deploy the function. Remember in order to use the AWS Toolkit for Visual Studio with Lambda, you will need to have Visual Studio 2015 Update 3 version and NET Core tools. You can read more about installing Visual Studio 2015 Update 3 and .NET Core here.

To create the C# function using Visual Studio, I start a New Project, select AWS Lambda Project (.NET Core) and name it ServerlessStreamProcessor.

What’s really cool about taking advantage of the AWS Toolkit for Visual Studio to author this function, is that inside of Visual Studio I can use Lambda blueprints to get started in a similar way that I would in using the Lambda console.  Therefore in order to replicate the DevDayStreamProcessor in C#, I will select the Simple Kinesis Function blueprint.

It should be noted that when writing Lambda functions in C#, there is no need to mark the class declaration nor the target handler function as a Lambda function. Additionally, when writing CloudWatch logs you can use the standard C# Console class WriteLine function or use the ILambdaContext LogLine function found as a part of the ILambdaContext interface. With the template for accessing the Kinesis stream in place, I finish writing the C# Lambda function, ServerlessStreamProcessor, utilizing the same variable names as in the Node.js code in DevDayStreamProcessor. Please note the C# Lambda handler function below.

using System.Collections.Generic;
using Amazon.Lambda.Core;
using Amazon.Lambda.KinesisEvents;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DataModel;
using Newtonsoft.Json.Linq;

// Assembly attribute to enable the Lambda function's JSON input to be converted into a .NET class.
[assembly: LambdaSerializerAttribute(typeof(Amazon.Lambda.Serialization.Json.JsonSerializer))]

namespace ServerlessStreamProcessor
{
    public class LambdaTwitterStream
    {
        string twitterID, timeStamp;
        int itemNum = 0;
        
        private static AmazonDynamoDBClient dynamoDBClient = new AmazonDynamoDBClient();
        List<TwitterItem> dataItemsBatch = new List<TwitterItem>();
        
        public void FunctionHandler(KinesisEvent kinesisEvent, ILambdaContext context)
        {
            DynamoDBContext dbContext = new DynamoDBContext(dynamoDBClient);
            context.Logger.LogLine($"Beginning to process {kinesisEvent.Records.Count} records...");
            
            foreach (var record in kinesisEvent.Records)
            {
                context.Logger.LogLine($"Event ID: {record.EventId}");
                context.Logger.LogLine($"Event Name: {record.EventName}");

                // Kinesis data is base64 encoded so decode here
                string tweetData = GetRecordContents(record.Kinesis);
                context.Logger.LogLine($"Decoded Payload: {tweetData}");
                tweetData = @"" + tweetData;
                JObject twitterObj = JObject.Parse(tweetData);
                
                twitterID = twitterObj["id"].ToString();
                timeStamp = DateTime.Now.Millisecond.ToString();
                itemNum++;
                context.Logger.LogLine(timeStamp);
                context.Logger.LogLine($"Twitter ID is: {twitterID}");
                context.Logger.LogLine(itemNum.ToString());

                TwitterItem ddbItem = new TwitterItem()
                { 
                    TwitterID = twitterID,
                    TwitterUser = twitterObj["username"].ToString(),
                    TwitterUserPic = twitterObj["pic"].ToString(),
                    TwitterTime = DateTime.Parse(twitterObj["time"].ToString()).ToUniversalTime().ToString(),
                    Tweet = twitterObj["text"].ToString(),
                    TweetTopic = twitterObj["topic"].ToString(),
                    Tags = twitterObj["hashtags"] != null ? twitterObj["hashtags"].ToString() : String.Empty,
                    Location = twitterObj["loc"] != null ? twitterObj["loc"].ToString() : String.Empty,
                    Country = twitterObj["country"] != null ? twitterObj["country"].ToString() : String.Empty,
                    TimeStamp =  timeStamp,
                    RecordNum = itemNum
                };
                
                dataItemsBatch.Add(ddbItem);
            }

            context.Logger.LogLine(JObject.FromObject(dataItemsBatch).ToString());
            ddbItemsWrite(dataItemsBatch, 0, dbContext, context);
            context.Logger.LogLine("Success - Completed Successfully");
            context.Logger.LogLine("Stream processing complete.");
        }

There are only a few differences that should be noted between our Kinesis stream processor written in C# and our original Node.js code.  Since the input parameter type supported by default in C# Lambda functions is the System.IO.Stream type, the Kinesis base64 string is decoded by using a StreamReader with ASCII encoding in a blueprint provided function, GetRecordContents.

 

private string GetRecordContents(KinesisEvent.Record streamRecord)
{
    using (var reader = new StreamReader(streamRecord.Data, Encoding.ASCII))
    {
        return reader.ReadToEnd();
    }
}

The other thing to note is that in order to write the tweet data to the DynamoDB Table, I added the AWS .NET SDK NuGet package for DynamoDB; AWSSDK.DynamoDBv2 to the Lambda function project via the NuGet package manager within Visual Studio.  I also created a .NET data object, TwitterItem, to map to the data being stored in the DynamoDB table. Using the AWS .NET SDK higher level programming interface, object persistence model for DynamoDB, I created a collection of TwitterItem objects to be written via the BatchWrite object class in our ddbItemsWrite C# function.

private async void ddbItemsWrite(List<TwitterItem> items, int retries, DynamoDBContext ddbContext, ILambdaContext context)
{
BatchWrite<TwitterItem> twitterStreamBatchWrite = ddbContext.CreateBatchWrite<TwitterItem>();
        
        try
        {
            twitterStreamBatchWrite.AddPutItems(items);   
            await twitterStreamBatchWrite.ExecuteAsync();
        }
        catch (Exception ex)
        {
            context.Logger.LogLine($"DDB call failed: {ex.Source} ");
            context.Logger.LogLine($"Exception: {ex.Message}");
            context.Logger.LogLine($"Exception Stacktrace: {ex.StackTrace}");
        }      
}

Another benefit of using AWS Toolkit for Visual Studio to author my C# Lambda function is that I can deploy my Lambda function directly to AWS with a single click.  Selecting my project name in the Solution Explorer and performing a right-click, I get a menu option, Publish to AWS Lambda, which brings up a menu for information to include about my Lambda function for deployment to AWS.

It is important to note that the handler function signature follows the nomenclature of Assembly :: Namespace :: ClassName :: Method, therefore, the signature of our C# Lambda function shown here is: ServerlessStreamProcessor :: ServerlessStreamProcessor.LambdaTwitterStream :: FunctionHandler.  We provide this information to the Upload to AWS Lambda dialog box and select Next to assign a role for the function.

Upon completion, you can test in the Lambda console or in Visual Studio with AWS toolkit provided plugin (shown below) using the sample data of the triggering event source for an iterative approach to developing the Lambda function.

You can learn more about authoring AWS Lambda functions using the C# Language in the AWS Lambda developer guide or by reading the post announcing C# Support on the Compute Blog.

API Gateway Monetization and Developer Portal

If you have been following the microservices momentum, you may be aware of an architectural pattern that calls for using smart endpoints and/or using an API gateway via REST APIs to manage access and exposure of individual services that make up a microservices solution.  Amazon API Gateway enables creation and management of RESTful APIs to expose AWS Lambda functions, external HTTP endpoints, as well as, other AWS services.  In addition, Amazon API Gateway allows clients and external developers to have access to a deployed APIs by via HTTP protocol or a platform/language targeted SDK.

With the introduction of SaaS Subscriptions on AWS Marketplace and the API Gateway integration with the AWS Marketplace, you can now monetize your APIs by allowing customers to directly consume the APIs you create with API Gateway in the AWS Marketplace.  AWS customers can subscribe and be billed for the APIs published on the marketplace with their existing AWS account.  With the integration of API Gateway with the AWS Marketplace, the process to get started is easy on the AWS Marketplace.

To get started, you must ensure that you have enabled the Usage Plan feature in Amazon API Gateway.

Once enabled the next step is to create a Usage Plan, enable throttling (if desired) with targeted rate and burst request thresholds, and finally enable quotas (if you choose) by providing targeted request quota per a set timeframe.

Next, we would choose our APIs and related stage(s) that we wish to be associated with the usage plan. Please note that this is an optional step as you can opt not associate a specific API with your usage plan.

All that is left to do is add or create an API key for the usage plan.  Again, it should be noted that this is also an optional step in creating your usage plan.

Now that we have our usage plan, StreamingPlan, we are ready for the next step in preparation for selling our API on the marketplace. You have the option to create multiple usage plans with varying APIs and limits, and sell these plans as differentiated API products on AWS Marketplace.

In order to enable customers to buy our new API product, however, the AWS Marketplace requires that each API product has an external developer portal to handle subscription requests, provide API information details and ability for the management of usage.

This customer need for an external developer portal for the marketplace birthed the new open source API Gateway developer portal serverless web application implementation.  The goal of the API Gateway developer portal project was to allow customers to follow a few easy steps to create a serverless web application that lists a catalog of your APIs built with API Gateway while allowing for developer signups.

The API Gateway developer portal was built upon AWS Serverless Express; an open source library published by AWS which aids you in utilizing AWS Lambda and Amazon API Gateway in building web applications/services with the Node.js Express framework.  Additionally, the API Gateway developer portal application uses an AWS SAM (Serverless Application Model) template to deploy its serverless resources.  AWS SAM is a simplified CloudFormation template and specification that allows easier management and deployment of serverless applications on AWS.

To build your developer portal using the API Gateway portal, you would start by cloning the aws-api-gateway-developer-portal project from GitHub.

Assuming you have the latest version of the AWS CLI and Node.js installed, you would setup the developer portal by running “npm run setup” on the command line for Mac and Linux OS users. For Windows users, you would run “npm run win-setup” on the command line setup the developer portal.

The result is a functional sample developer portal website running on S3 that you can customize in order to create your own developer portal for your APIs.

The frontend of the sample developer portal website is built with the React JavaScript library, and the backend is an AWS Lambda function running using the aws-serverless-express library. Additionally, a Lambda function with a SNS event source was created as a listener for notification when customers subscribe or unsubscribe to your API via the AWS Marketplace console.  You can learn more about the steps to build, customize, and deploy your API Gateway developer portal web application with this reference project by visiting the AWS Compute blog post which discusses the architecture and implementation in more detail.

 

The next key step in monetizing our API is establishing an account on the AWS Marketplace.  If an account is not already established, registering is simply verifying that you meet the requirement prerequisites provided in the AWS Marketplace Seller Guide and completing a seller registration form on the AWS Marketplace Management Portal.  You can see a snapshot of the start of the seller registration form below.

To list the API, you would fill a product load form describing the API, establish the pricing for the API, and provide t\he IDs of AWS Accounts that will test the API subscription process.  Completing this form would also require you to submit the URL for your API developer portal.

When your seller registration is complete, you will be supplied an AWS Marketplace product code.  You will need to associate your marketplace product code with your API usage plan.  In order to complete this step, you would simply log into the API Gateway console and go to your API usage plan. Go to the Marketplace tab and enter your product code. This tells API Gateway to send measurement data to AWS Marketplace when your API is used.

With your Amazon API Gateway managed API packaged into a usage plan, the accompanying API developer portal created, seller account registration completed, and product code associated with API usage plan; we are now ready to monetize our API on the AWS Marketplace.

Learn more about monetizing your APIs created with API Gateway by checking out the related blog post and reviewing the API Gateway developer guide documentation.

Summary

As you can see, the AWS teams were busy in 2016 working to make the customer experience easier for creating and deploying serverless architectures, as well as, providing mechanisms for customers to generate and monetize their API Gateway managed APIs.

Visit the product documentation for AWS Lambda and Amazon API Gateway to learn more about these services and all the newly released features.

Tara

Month in Review: December 2016

Post Syndicated from Derek Young original https://aws.amazon.com/blogs/big-data/month-in-review-december-2016/

Another month of big data solutions on the Big Data Blog.

Take a look at our summaries below and learn, comment, and share. Thank you for reading!

Implementing Authorization and Auditing using Apache Ranger on Amazon EMR
Apache Ranger is a framework to enable, monitor, and manage comprehensive data security across the Hadoop platform. Features include centralized security administration, fine-grained authorization across many Hadoop components (Hadoop, Hive, HBase, Storm, Knox, Solr, Kafka, and YARN) and central auditing. In this post, walk through the steps to enable authorization and audit for Amazon EMR clusters using Apache Ranger.

Amazon Redshift Engineering’s Advanced Table Design Playbook
Amazon Redshift is a fully managed, petabyte scale, massively parallel data warehouse that offers simple operations and high performance. In practice, the best way to improve query performance by orders of magnitude is by tuning Amazon Redshift tables to better meet your workload requirements. This five-part blog series will guide you through applying distribution styles, sort keys, and compression encodings and configuring tables for data durability and recovery purposes.

Interactive Analysis of Genomic Datasets Using Amazon Athena
In this post, learn to prepare genomic data for analysis with Amazon Athena. We’ll demonstrate how Athena is well-adapted to address common genomics query paradigms using the Thousand Genomes dataset hosted on Amazon S3, a seminal genomics study. Although this post is focused on genomic analysis, similar approaches can be applied to any discipline where large-scale, interactive analysis is required.

Joining and Enriching Streaming Data on Amazon Kinesis
In this blog post, learn three approaches for joining and enriching streaming data on Amazon Kinesis Streams by using Amazon Kinesis Analytics, AWS Lambda, and Amazon DynamoDB.

Using SaltStack to Run Commands in Parallel on Amazon EMR
SaltStack is an open source project for automation and configuration management. It started as a remote execution engine designed to scale to many machines while delivering high-speed execution. You can now use the new bootstrap action that installs SaltStack on Amazon EMR. It provides a basic configuration that enables selective targeting of the nodes based on instance roles, instance groups, and other parameters.

Building an Event-Based Analytics Pipeline for Amazon Game Studios’ Breakaway
Amazon Game Studios’ new title Breakaway is an online 4v4 team battle sport that delivers fast action, teamwork, and competition. In this post, learn the technical details of how the Breakaway team uses AWS to collect, process, and analyze gameplay telemetry to answer questions about arena design.

Respond to State Changes on Amazon EMR Clusters with Amazon CloudWatch Events
With new support for Amazon EMR in Amazon CloudWatch Events, you can be notified quickly and programmatically respond to state changes in your EMR clusters. Additionally, these events are also displayed in the Amazon EMR console. CloudWatch Events allows you to create filters and rules to match these events and route them to Amazon SNS topics, AWS Lambda functions, Amazon SQS queues, streams in Amazon Kinesis Streams, or built-in targets.

Run Jupyter Notebook and JupyterHub on Amazon EMR
Data scientists who run Jupyter and JupyterHub on Amazon EMR can use Python, R, Julia, and Scala to process, analyze, and visualize big data stored in Amazon S3. Jupyter notebooks can be saved to S3 automatically, so users can shut down and launch new EMR clusters, as needed. See how EMR makes it easy to spin up clusters with different sizes and CPU/memory configurations to suit different workloads and budgets.

Derive Insights from IoT in Minutes using AWS IoT, Amazon Kinesis Firehose, Amazon Athena, and Amazon QuickSight
In this post, see how you can build a business intelligence capability for streaming IoT device data using AWS serverless and managed services. You can be up and running in minutes―starting small, but able to easily grow to millions of devices and billions of messages.

Serving Real-Time Machine Learning Predictions on Amazon EMR
The typical progression for creating and using a trained model for recommendations falls into two general areas: training the model and hosting the model. Model training has become a well-known standard practice. In this post, we highlight one way to host those recommendations using Amazon EMR with JobServer

Powering Amazon Redshift Analytics with Apache Spark and Amazon Machine Learning
In this post, learn to generate a predictive model for flight delays that can be used to help pick the flight least likely to add to your travel stress. To accomplish this, you’ll use Apache Spark running on Amazon EMR for extracting, transforming, and loading (ETL) the data, Amazon Redshift for analysis, and Amazon Machine Learning for creating predictive models.

FROM THE ARCHIVE

Running sparklyr – RStudio’s R Interface to Spark on Amazon EMR
Sparklyr is an R interface to Spark that allows users to use Spark as the backend for dplyr, one of the most popular data manipulation packages. Sparklyr provides interfaces to Spark packages and also allows users to query data in Spark using SQL and develop extensions for the full Spark API. This short post shows you how to run RStudio and sparklyr on EMR.


Want to learn more about Big Data or Streaming Data? Check out our Big Data and Streaming data educational pages.

Leave a comment below to let us know what big data topics you’d like to see next on the AWS Big Data Blog.

Respond to State Changes on Amazon EMR Clusters with Amazon CloudWatch Events

Post Syndicated from Jonathan Fritz original https://aws.amazon.com/blogs/big-data/respond-to-state-changes-on-amazon-emr-clusters-with-amazon-cloudwatch-events/

Jonathan Fritz is a Senior Product Manager for Amazon EMR

Customers can take advantage of the Amazon EMR API to create and terminate EMR clusters, scale clusters using Auto Scaling or manual resizing, and submit and run Apache Spark, Apache Hive, or Apache Pig workloads. These decisions are often triggered from cluster state-related information.

Previously, you could use the “describe” and “list” set of API operations to find the relevant information about your EMR clusters and associated instance groups, steps, and Auto Scaling policies. However, programmatic applications that check resource state changes and post notifications or take actions are forced to poll these API operations, which provides a slower end-to-end reaction time and additional management overhead than if you were able to use an event-driven architecture.

With new support for Amazon EMR in Amazon CloudWatch Events, you can be notified quickly and programmatically respond to state changes in your EMR clusters. Additionally, these events are also displayed in the Amazon EMR console, on the Cluster Details page in the Events section.

There are four new EMR event types:

  • Cluster State Change
  • Instance Group State Change
  • Step State Change
  • Auto Scaling State Change

CloudWatch Events allows you to create filters and rules to match these events and route them to Amazon SNS topics, AWS Lambda functions, Amazon SQS queues, streams in Amazon Kinesis Streams, or built-in targets. You then have the ability to programmatically act on these events, including sending emails and SMS messages, running retry logic in Lambda, or tracking the state of running steps. For more information about the sample events generated for each event type, see the CloudWatch Events documentation.

The following is an example using the CloudWatch Events console to route EMR step failure events to Lambda for automated retry logic and to SNS to push a notification to an email alias:

Cloudwatch_1

You can create rules for EMR event types in the CloudWatch Events console, AWS CLI, or the AWS SDKs using the CloudWatch Events API.

If you have any questions or would like to share an interesting use case about events and notifications with EMR, please leave a comment below.


Related

Dynamically Scale Applications on Amazon EMR with Auto Scaling

AutoScaling_SocialMedia

 

Robust Serverless Application Design with AWS Lambda Dead Letter Queues

Post Syndicated from Bryan Liston original https://aws.amazon.com/blogs/compute/robust-serverless-application-design-with-aws-lambda-dlq/

Gene Ting
Gene Ting, Solutions Architect

AWS Lambda is a serverless, event-driven compute service that allows developers to bring their functions to the cloud easily. A key challenge that Lambda developers often face is to create solutions that handle exceptions and failures gracefully. Some examples include:

  • Notifying operations support when a function fails with context
  • Sending jobs that have timed out to a handler that can either notify operations of a critical failure or rebalance jobs

Now, with the release of Lambda Dead Letter Queues, Lambda functions can be configured to notify when the function fails, with context on what the failure was.

In this post, we show how you can configure your function to deliver notification to an Amazon SQS queue or Amazon SNS topic, and how you can create a process to automatically generate more meaningful notifications when your Lambda functions fail.

Introducing Lambda Dead Letter Queues

Dead-letter queues are a powerful concept, which help software developers find software issue patterns in their asynchronous processing components. The way it works is simple—when your messaging component receives a message and detects a fatal or unhandled error while processing the message, it sends information about the message that failed to another location, such as another queue or another notification system. SQS provides dead letter queues today, sending messages that couldn’t be handled to a different queue for further investigation.

AWS Lambda Dead Letter Queues builds upon the concept by enabling Lambda functions to be configured with an SQS queue or SNS topic as a destination to which the Lambda service can send information about an asynchronous request when processing fails. The Lambda service sends information about the failed request when the request will no longer be retried. Supported invocations include:

  • An event type invocation from a custom application
  • Any AWS event source that’s not a DynamoDB table, Amazon Kinesis stream, or API Gateway resource request integration

Take the typical beginner use case for learning about serverless applications on AWS: creating thumbnails from images dropped onto an S3 bucket. The transcoding Lambda function can be configured to send any transcoding failures to an SNS topic, which triggers a Lambda function for further investigation.

deadletterqueues_1.jpeg
Now, you can set up a dead letter queue for an existing Lambda function and test out the feature.

Configuring a DLQ target for a Lambda function

First, make sure that the execution role for the Lambda function is allowed to publish to the SNS topic. For this demo, use the sns-lambda-test topic. An example is provided below:

{
   "Version":"2012-10-17",
   "Statement":[{
      "Effect":"Allow",
      "Action":"sns:Publish",
      "Resource":"arn:aws:sns:us-west-2:123456789012:sns-lambda-test"
      }
   ]
}

If an SQS queue is the intended target, you need a comparable policy that allows the appropriate SendMessage action to the queue.

Next, choose an existing Lambda function against which to configure a dead-letter queue. For this example, choose a predeployed function, such as CreateThumbnail.

deadletterqueues_2.png

Select the function, choose Configuration, expand the **Advanced settings **section in the middle of the page, and scroll to the DLQ Resource form. Choose SNS and for SNS Topic name, enter sns-lambda-test.
deadletterqueues_3.png
That’s it—the function is now configured and ready for testing.

Processing failure notifications

One easy way to test the handler for your dead letter queue is to submit an event that is known to fail for the Lambda function. In this example, you can simply drop a text file pretending to be an image to the S3 bucket, to be recognized by the image thumbnail creator as a non-image file, and have the handler exit with an error message.

When Lambda sends an error notification to an SNS topic, three additional message attributes are attached to the notification in the MessageAttributes object:

  • RequestID – The request ID.
  • ErrorCode – The HTTP response code that would have been given if the handler was synchronously called.
  • ErrorMessage – The error message given back by the Lambda runtime. In the example above, it is the error message from the handler.

In addition to these attributes, the body of the event is held in the Message attribute of the Sns object. If you use an SQS queue instead, the additional attributes are in the MessageAttributes object and the event body is held in the Body attribute of the message.

Handling timeouts

One of the most common failures to occur in Lambda functions is a timeout. In this scenario, the Lambda function executes until it’s been forcefully terminated by the Lambda runtime, which sends an error message indicating that the function has timed out, as in the following example error message:

"ErrorMessage": {

"Type": "String", 

"Value": "2016-11-29T04:27:36.789Z b4797725-b5eb-11e6-acb2-17876a085622 Task timed out after 300.00 seconds" 

}

An error handler can simply parse for the string Task timed out after in the Value attribute, and act accordingly, such as breaking the request into multiple Lambda invocations, or sending to a different queue that spins up EC2 instances in an Auto Scaling group for handling larger jobs.

Handling critical failures

Another scenario that you may need to handle is when critical failures occur. Some examples of a critical failure are:

  • A misconfiguration of the Lambda handler
  • A system crash, such as an out-of-memory error

In either case, there’s very little that can be handled gracefully in application logic. These kinds of errors can be forwarded to operations support for root cause analysis or break glass fixes.

In the case of a system crash, your dead letter queue receives an error message similar to the following:

"RequestID": { "Type": "String", "Value": "6502cad0-b641-11e6-bd4e-279609143c53" }, 

"ErrorCode": { "Type": "String", "Value": "200" },

"ErrorMessage": { "Type": "String", "Value": "Process exited before completing request" }

For this example, the Lambda handler was forced to crash with an out-of-memory error, which can be found by searching in the Lambda handler’s log stream by the given RequestID.

In the case of a misconfiguration, your dead letter queue receives an error message along the following lines:

"ErrorMessage": { "Type": "String", "Value": "Cannot find module '/var/task/index'" }

In this example, the Lambda handler was misconfigured to load a non-existent index.js module.

Monitoring Lambda functions configured with dead letter queues

Lambda functions with a configured dead letter queue also come with their own CloudWatch metric called “DeadLetterErrors”. The metric is incremented whenever the dead letter message payload can’t be sent to the dead letter queue at any time.

Conclusion

With the launch of Dead Letter Queues, Lambda function developers can now create much simpler functions by focusing only on the business logic, and leverage the AWS Lambda infrastructure to delegate error handling elsewhere in a more graceful manner.

For more information, read about Dead Letter Queues in the AWS Lambda Developer Guide. Happy coding everyone, and have fun creating awesome serverless applications!

New for Amazon Simple Queue Service – FIFO Queues with Exactly-Once Processing & Deduplication

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-for-amazon-simple-queue-service-fifo-queues-with-exactly-once-delivery-deduplication/

As the very first member of the AWS family of services, Amazon Simple Queue Service (SQS) has certainly withstood the test of time!  Back in 2004, we described it as a “reliable, highly scalable hosted queue for buffering messages between distributed application components.” Over the years, we have added many features including a dead letter queue, 256 KB payloads, SNS integration, long polling, batch operations, a delay queue, timers, CloudWatch metrics, and message attributes.

New FIFO Queues
Today we are making SQS even more powerful and flexible with support for FIFO (first-in, first-out) queues. We are rolling out this new type of queue in two regions now, and plan to make it available in many others in early 2017.

These queues are designed to guarantee that messages are processed exactly once, in the order that they are sent, and without duplicates. We expect that FIFO queues will be of particular value to our financial services and e-commerce customers,  and to those who use messages to update database tables. Many of these customers have systems that depend on receiving messages in the order that they were sent.

FIFO ordering means that, if you send message A, wait for a successful response, and then send message B, message B will be enqueued after message A, and then delivered accordingly. This ordering does not apply if you make multiple SendMessage calls in parallel. It does apply to the individual messages within a call to SendMessageBatch, and across multiple consecutive calls to SendMessageBatch.

Exactly-once processing applies to both single-consumer and multiple-consumer scenarios. If you use FIFO queues in a multiple-consumer environment, you can configure your queue to make messages visible to other consumers only after the current message has been deleted or the visibility timeout expires. In this scenario, at most one consumer will actively process messages; the other consumers will be waiting until the first consumer finishes or fails.

Duplicate messages can sometimes occur when a networking issue outside of SQS prevents the message sender from learning the status of an action and causes the sender to retry the call. FIFO queues use multiple strategies to detect and eliminate duplicate messages. In addition to content-based deduplication, you can include a MessageDeduplicationId when you call SendMessage for a FIFO queue. The ID can be up to 128 characters long, and, if present, takes higher precedence than content-based deduplication.

When you call SendMessage for a FIFO queue, you can now include a MessageGroupId. Messages that belong to the same group (as indicated by the ID) are processed in order, allowing you to create and process multiple, ordered streams within a single queue and to use multiple consumers while keeping data from multiple groups distinct and ordered.

You can create standard queues (the original queue type) or the new FIFO queues using the CreateQueue function, the create-queue command, or the AWS Management Console. The same API functions apply to both types of queues, but you cannot convert one queue type into the other.

Although the same API calls apply to both queue types, the newest AWS SDKs and SQS clients provide some additional functionality. This includes automatic, idempotent retries of failed ReceiveMessage calls.

Individual FIFO queues can handle up to 300 send, receive, or delete requests per second.

Some SQS Resources
Here are some resources to help you to learn more about SQS and the new FIFO queues:

If you’re coming to Las Vegas for AWS re:Invent and would like to hear more about how AWS customer Capital One is making use of SQS and FIFO queues, register and plan to attend ENT-217, Migrating Enterprise Messaging to the Cloud on Wednesday, November 30 at 3:30 PM.

Available Now
FIFO queues are available now in all the US East (Ohio) and US West (Oregon) regions and you can start using them today. If you are running in US East (Northern Virginia) and want to give them a try, you can create them in US East (Ohio) and take advantage of the no-cost, low-latency connectivity between the regions.

As part of today’s launch, we are also reducing the price for standard queues by 20%. For the updated pricing, take a look at the SQS Pricing page.

Jeff;

 

AWS Hot Startups – November 2016 – AwareLabs, Doctor On Demand, Starling Bank, and VigLink

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-hot-startups-november-2016-awarelabs-doctor-on-demand-starling-bank-and-viglink/

Tina is back with another impressive set of startups!

Jeff;


This month we are featuring four hot AWS-powered startups:

  • AwareLabs – Helping small businesses build smart websites
  • Doctor On Demand – Delivering fast, easy, and cost-effective access to top healthcare providers.
  • Starling Bank – Mobile banking for the next generation.
  • VigLink – Powering content-driven commerce.

Make sure to also check out October’s Hot Startups if you missed it!

AwareLabs (Phoenix/Charlotte)
AwareLabs is a small, three person startup focused on helping business owners engage their customers through dozens of integrated applications. The startup was born in November 2011 and began as a website building guide that helped hundreds of entrepreneurs within its first few weeks. Early on, founder Paul Kenjora recognized that small businesses were being slowed down by existing business solutions, and in 2013 he took on the task of creating a business centric website builder. After attending an AWS seminar, Paul realized that small teams could design and deploy massive infrastructure just as well as heavily funded, high-tech companies. Previously, only big companies or heavy investment allowed for that type of scale. With the help of AwareLabs, small businesses with limited time and budgets can build the smart websites they need.

The AwareLabs team relies on AWS to achieve what was previously impossible with a team of their size. They’ve been able to raise less capital, move faster, and deliver a solution customers love. AwareLabs leverages Amazon EC2 extensively for everything from running client websites, to maintaining their own secure code repository. Amazon S3 has also been a game changer in offloading the burden of data storage and reliability. This was the single biggest factor in letting the AwareLabs development team focus on client-facing features instead of infrastructure issues. Amazon SES and Amazon SNS freed their developers to deliver integrated one-click newsletters with intelligent bounce reduction, which was very well received by clients. Finally, AWS has helped AwareLabs be profitable, which is huge for any startup!

Be sure to check out AwareLabs for your professional website needs!

Doctor On Demand (San Francisco)
Doctor On Demand was built to address the growing problem that many of those in the U.S. face – lack of access to healthcare providers. The average wait time to see a physician is three weeks, and in rural areas, it can be even longer. It takes an average of 25 days to see a psychiatrist or psychologist and nearly half of all patients with mental health issues go without treatment. With Doctor On Demand, patients can see a board-certified physician or psychologist in a matter of minutes directly from their smartphone, tablet, or computer. They can also have video visits with providers at any time of day – no matter where they are. Patients simply download the Doctor On Demand app (iOS and Android) or visit www.doctorondemand.com, provide a summary of the reason for their visit, and are connected to a licensed provider in their state. Services are delivered through hundreds of employers and work with dozens of major health plans.

From the very beginning, AWS has allowed Doctor on Demand to operate securely in the healthcare space. They utilize Amazon EC2, Amazon S3, Amazon CloudFront, Amazon CloudWatch, and AWS Trusted Advisor. With these services they are able to build compliant security and privacy controls, ‘simple’ fault tolerance, and easily setup a disaster recovery site (utilizing multiple AWS Regions). The company says the best part about working with AWS is that they are able to get everything they need on a startup budget.

Check out the Doctor On Demand blog to keep up with the latest news!

Starling Bank (UK)
Starling Bank is on a mission to shake up financial services.  In the way that TV was radically changed by Netflix, music by the likes of Spotify, and social media by Snapchat – this is what Starling aims to do for banking. Founded in 2014 by Anne Boden, Starling uses the latest technology to make the traditional current account obsolete. Having assembled a team of engineers, artists, and economists, the build of the bank is nearing completion. They will be launching their app in early 2017.

Many next generation banks continue to stick to the traditional bank model that was built on technology from the 1960s and 70s. Instead of providing a range of products that are sold and cross-sold to unwilling customers, Starling will empower their users through seamless access to a mobile marketplace of financial services and products that best meet their needs at any given time. Customers can enjoy the security and protection of a licensed and regulated bank while also getting access to insights, data, and services that empower them to make decisions about their money.

Starling Bank uses AWS to provision and scale a secure infrastructure automatically and on demand. They primarily use Amazon CloudFormation and Amazon EC2, but also make use of Amazon S3, Amazon RDS, and Amazon Lambda.

Sign up here to be one of Starling’s first customers!

VigLink (San Francisco)
Oliver Roup, founder and CEO of VigLink, was first introduced to affiliate marketing as a student at Harvard Business School. His interest in the complex ecosystem prompted him to write a crawler to identify existing product links to Amazon. Roup found that less than half of those links were enrolled in the associates program. It was at this moment that he determined there was a real business opportunity at hand, and VigLink was born.

Over the last seven years, the company has grown into not only a content monetization platform, but a platform that provides publishers and merchants with insights into their ecommerce business. At its core, VigLink identifies commercial product mentions within a publisher’s content and automatically transforms them into revenue generating hyperlinks whose destinations can be determined in real-time, advertiser-bid auctions. Since its founding in 2009, VigLink has been backed by top investors including Google Ventures, Emergence Capital Partners, and RRE. Check out a recent interview with Roup and a tour of VigLink’s offices here!

Since the company’s start, VigLink has utilized AWS extensively. The flexibility to be able to respond to demand elastically without capital costs or hardware maintenance has been game-changing. They use numerous services including Amazon EC2, Amazon S3, Amazon SQS, Amazon RDS, and Amazon Redshift. While continuing to scale, VigLink has recently been able to cut costs by 15% using tools such as AWS Cost Explorer.

Take a behind-the-scenes look at VigLink in this short video.

Tina Barr