All posts by Danilo Poccia

AWS Cloud Development Kit (CDK) – TypeScript and Python are Now Generally Available

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-cloud-development-kit-cdk-typescript-and-python-are-now-generally-available/

Managing your Infrastructure as Code provides great benefits and is often a stepping stone to successfully applying DevOps practices. Instead of relying on manually performed steps, both administrators and developers can use configuration files to automate the provisioning of the compute, storage, network, and application services their applications require.

For example, defining your Infrastructure as Code makes it possible to:

  • Keep infrastructure and application code in the same repository
  • Make infrastructure changes repeatable and predictable across different environments, AWS accounts, and AWS regions
  • Replicate production in a staging environment to enable continuous testing
  • Replicate production in a performance test environment that you use just for the time required to run a stress test
  • Release infrastructure changes using the same tools as code changes, so that deployments include infrastructure updates
  • Apply software development best practices to infrastructure management, such as code reviews, or deploying small changes frequently

Configuration files used to manage your infrastructure are traditionally implemented as YAML or JSON text files, but this approach gives up most of the advantages of modern programming languages. With YAML in particular, it can be very difficult to detect a file that was truncated while being transferred to another system, or a line that went missing when copying and pasting from one template to another.

Wouldn’t it be better if you could use the expressive power of your favorite programming language to define your cloud infrastructure? For this reason, last year we introduced the AWS Cloud Development Kit (CDK) in developer preview: an extensible, open-source software development framework to model and provision your cloud infrastructure using familiar programming languages.

I am super excited to share that the AWS CDK for TypeScript and Python is generally available today!

With the AWS CDK you can design, compose, and share your own custom components that incorporate your unique requirements. For example, you can create a component that sets up your own standard VPC, with its associated routing and security configurations, or a standard CI/CD pipeline for your microservices using tools such as AWS CodeBuild and CodePipeline.

Personally, I really like that with the AWS CDK you can build your application, including the infrastructure, in your IDE, using the same programming language and with the autocompletion and parameter suggestions that modern IDEs have built in, without having to mentally switch between one tool or technology and another. The AWS CDK makes it really fun to quickly code up your AWS infrastructure, configure it, and tie it together with your application code!

How the AWS CDK works
Everything in the AWS CDK is a construct. You can think of constructs as cloud components that can represent architectures of any complexity: a single resource, such as an S3 bucket or an SNS topic, a static website, or even a complex, multi-stack application that spans multiple AWS accounts and regions. To foster reusability, constructs can include other constructs. You compose constructs into stacks, which you can deploy into an AWS environment, and apps, collections of one or more stacks.
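
To make the relationship between apps, stacks, and constructs concrete, here is a minimal sketch in Python (the same structure applies in TypeScript). The bucket and topic are just placeholder constructs, and the module names follow the aws_cdk packages used in the Python example later in this post.

from aws_cdk import core
from aws_cdk import aws_s3 as s3
from aws_cdk import aws_sns as sns

class MyStack(core.Stack):
    def __init__(self, scope, id, **kwargs):
        super().__init__(scope, id, **kwargs)
        # Constructs composed inside the stack; each construct can contain other constructs.
        s3.Bucket(self, "MyBucket", versioned=True)
        sns.Topic(self, "MyTopic")

app = core.App()          # an app is a collection of one or more stacks
MyStack(app, "MyStack")   # a stack is deployed into a single AWS environment
app.synth()               # synthesizes the CloudFormation template(s) for the app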

How to use the AWS CDK
We continuously add new features based on the feedback of our customers. That means that when creating an AWS resource, you often have to specify many options and dependencies. For example, if you create a VPC you have to think about how many Availability Zones (AZs) to use and how to configure subnets to give private and public access to the resources that will be deployed in the VPC.

To make it easy to define the state of AWS resources, the AWS Construct Library exposes the full richness of many AWS services with sensible defaults that you can customize as needed. In the case above, the VPC construct by default creates public and private subnets for all the AZs in the VPC, using three AZs if not specified otherwise.

For creating and managing CDK apps, you can use the AWS CDK Command Line Interface (CLI), a command-line tool that requires Node.js and can be installed quickly with:

npm install -g aws-cdk

After that, you can use the CDK CLI with different commands:

  • cdk init to initialize a new CDK project in the current directory, in one of the supported programming languages
  • cdk synth to print the CloudFormation template for this app
  • cdk deploy to deploy the app in your AWS Account
  • cdk diff to compare what is in the project files with what has been deployed

Just run cdk to see more of the available commands and options.

You can easily include the CDK CLI in your deployment automation workflow, for example using Jenkins or AWS CodeBuild.

Let’s use the AWS CDK to build two sample projects, using different programming languages.

An example in TypeScript
For the first project I am using TypeScript to define the infrastructure:

cdk init app --language=typescript

Here’s a simplified view of what I want to build, without going into the details of the public/private subnets in the VPC. There is an online frontend, writing messages to a queue, and an asynchronous backend, consuming messages from that queue:

Inside the stack, the following TypeScript code defines the resources I need, and their relations:

  • First I define the VPC and an Amazon ECS cluster in that VPC. By using the defaults provided by the AWS Construct Library, I don’t need to specify any parameter here.
  • Then I use an ECS pattern that in a few lines of code sets up an Amazon SQS queue and an ECS service running on AWS Fargate to consume the messages in that queue.
  • The ECS pattern library provides higher-level ECS constructs which follow common architectural patterns, such as load balanced services, queue processing, and scheduled tasks.
  • A Lambda function receives the name of the queue created by the ECS pattern as an environment variable and is granted permission to send messages to that queue.
  • The code of the Lambda function and the Docker image are passed as assets. Assets allow you to bundle files or directories from your project and use them with Lambda or ECS.
  • Finally, an Amazon API Gateway endpoint provides an HTTPS REST interface to the function.
const myVpc = new ec2.Vpc(this, "MyVPC");

const myCluster = new ecs.Cluster(this, "MyCluster", {
  vpc: myVpc
});

const myQueueProcessingService = new ecs_patterns.QueueProcessingFargateService(
  this, "MyQueueProcessingService", {
    cluster: myCluster,
    memoryLimitMiB: 512,
    image: ecs.ContainerImage.fromAsset("my-queue-consumer")
  });

const myFunction = new lambda.Function(
  this, "MyFrontendFunction", {
    runtime: lambda.Runtime.NODEJS_10_X,
    timeout: Duration.seconds(3),
    handler: "index.handler",
    code: lambda.Code.asset("my-front-end"),
    environment: {
      QUEUE_NAME: myQueueProcessingService.sqsQueue.queueName
    }
  });

myQueueProcessingService.sqsQueue.grantSendMessages(myFunction);

const myApi = new apigateway.LambdaRestApi(
  this, "MyFrontendApi", {
    handler: myFunction
  });

I find this code very readable and easier to maintain than the corresponding JSON or YAML. By the way, cdk synth in this case outputs more than 800 lines of plain CloudFormation YAML.

An example in Python
For the second project I am using Python:

cdk init app --language=python

I want to build a Lambda function that is executed every 10 minutes:

When you initialize a CDK project in Python, a virtualenv is set up for you. You can activate the virtualenv and install your project requirements with:

source .env/bin/activate

pip install -r requirements.txt

Note that Python autocompletion may not work with some editors, like Visual Studio Code, if you don’t start the editor from an active virtualenv.

Inside the stack, here’s the Python code defining the Lambda function and the CloudWatch Events rule that invokes the function periodically, with the function as its target:

myFunction = aws_lambda.Function(
    self, "MyPeriodicFunction",
    code=aws_lambda.Code.asset("src"),
    handler="index.main",
    timeout=core.Duration.seconds(30),
    runtime=aws_lambda.Runtime.PYTHON_3_7,
)

myRule = aws_events.Rule(
    self, "MyRule",
    schedule=aws_events.Schedule.rate(core.Duration.minutes(10)),
)
myRule.add_target(aws_events_targets.LambdaFunction(myFunction))

Again, this is easy to understand even if you don’t know the details of the AWS CDK. For example, durations include the time unit, so you don’t have to wonder whether they are expressed in seconds, milliseconds, or days. The output of cdk synth in this case is more than 90 lines of plain CloudFormation YAML.

Available Now
There is no charge for using the AWS CDK; you pay only for the AWS resources it deploys.

To quickly get hands-on with the CDK, start with this awesome step-by-step online tutorial!

More examples of CDK projects, using different programming languages, are available in this repository:

https://github.com/aws-samples/aws-cdk-examples

You can find more information on writing your own constructs here.

The AWS CDK is open source and we welcome your contribution to make it an even better tool:

https://github.com/awslabs/aws-cdk

Check out our source code on GitHub, start building your infrastructure today using TypeScript or Python, or try different languages in developer preview, such as C# and Java, and give us your feedback!

Amazon Aurora PostgreSQL Serverless – Now Generally Available

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-aurora-postgresql-serverless-now-generally-available/

The database is usually the most critical part of a software architecture and managing databases, especially relational ones, has never been easy. For this reason, we created Amazon Aurora Serverless, an auto-scaling version of Amazon Aurora that automatically starts up, shuts down and scales up or down based on your application workload.

The MySQL-compatible edition of Aurora Serverless has been available for some time now. I am pleased to announce that the PostgreSQL-compatible edition of Aurora Serverless is generally available today.

Before moving on to the details, let me take the opportunity to congratulate the Amazon Aurora development team, which has just won the 2019 Association for Computing Machinery’s (ACM) Special Interest Group on Management of Data (SIGMOD) Systems Award!

When you create a database with Aurora Serverless, you set the minimum and maximum capacity. Your client applications transparently connect to a proxy fleet that routes the workload to a pool of resources that are automatically scaled. Scaling is very fast because resources are “warm” and ready to be added to serve your requests.

 

There is no change with Aurora Serverless in how Aurora manages storage. The storage layer is independent from the compute resources used by the database, and there is no need to provision storage in advance. The minimum storage is 10 GB and, based on database usage, Aurora storage automatically grows, up to 64 TB, in 10 GB increments with no impact on database performance.

Creating an Aurora Serverless PostgreSQL Database
Let’s start an Aurora Serverless PostgreSQL database and see the automatic scalability at work. From the Amazon RDS console, I choose to create a database using Amazon Aurora as the engine. Currently, Aurora Serverless is compatible with PostgreSQL version 10.5. When I select that version, the serverless option becomes available.

I give the new DB cluster an identifier, choose my master username, and let Amazon RDS generate a password for me. I will be able to retrieve my credentials during database creation.

I can now select the minimum and maximum capacity for my database, in terms of Aurora Capacity Units (ACUs), and in the additional scaling configuration I choose to pause compute capacity after 5 minutes of inactivity. Based on my settings, Aurora Serverless automatically creates scaling rules for thresholds for CPU utilization, connections, and available memory.
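
The same configuration can be scripted. Here is a minimal sketch using boto3 and the RDS create_db_cluster API; the cluster identifier, credentials, and capacity values below are placeholders.

import boto3

rds = boto3.client("rds")

rds.create_db_cluster(
    DBClusterIdentifier="my-serverless-postgres",
    Engine="aurora-postgresql",
    EngineMode="serverless",
    MasterUsername="master",
    MasterUserPassword="choose-a-strong-password",
    ScalingConfiguration={
        "MinCapacity": 2,              # minimum Aurora Capacity Units (ACUs)
        "MaxCapacity": 16,             # maximum ACUs
        "AutoPause": True,             # pause compute capacity when idle
        "SecondsUntilAutoPause": 300,  # after 5 minutes of inactivity
    },
)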

Testing Some Load on the Database
To generate some load on the database I am using sysbench on an EC2 instance. There are a couple of Lua scripts bundled with sysbench that can help generate an online transaction processing (OLTP) workload:

  • The first script, parallel_prepare.lua, generates 100,000 rows per table for 24 tables.
  • The second script, oltp.lua, generates a workload against that data using 64 worker threads.

By using those scripts, I start generating load on my database cluster. As you can see from this graph, taken from the RDS console monitoring tab, the serverless database capacity grows and shrinks to follow my requirements. The metric shown on this graph is the number of ACUs used by the database cluster. First it scales up to accommodate the sysbench workload. When I stop the load generator, it scales down and then pauses.

Available Now
Aurora Serverless PostgreSQL is available now in US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland), and Asia Pacific (Tokyo). With Aurora Serverless, you pay on a per-second basis for the database capacity you use when the database is active, plus the usual Aurora storage costs.

For more information on Amazon Aurora, I recommend this great post explaining why and how it was created:

Amazon Aurora ascendant: How we designed a cloud-native relational database

It’s never been so easy to use a relational database in production. I am so excited to see what you are going to use it for!

Amazon Managed Streaming for Apache Kafka (MSK) – Now Generally Available

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-managed-streaming-for-apache-kafka-msk-now-generally-available/

I am always amazed at how our customers are using streaming data. For example, Thomson Reuters, one of the world’s most trusted news organizations for businesses and professionals, built a solution to capture, analyze, and visualize analytics data to help product teams continuously improve the user experience. Supercell, the social game company providing games such as Hay Day, Clash of Clans, and Boom Beach, is delivering in-game data in real-time, handling 45 billion events per day.

Since we launched Amazon Kinesis at re:Invent 2013, we have continually expanded the ways in which customers work with streaming data on AWS. Some of the available tools are:

  • Kinesis Data Streams, to capture, store, and process data streams with your own applications.
  • Kinesis Data Firehose, to transform and collect data into destinations such as Amazon S3, Amazon Elasticsearch Service, and Amazon Redshift.
  • Kinesis Data Analytics, to continuously analyze data using SQL or Java (via Apache Flink applications), for example to detect anomalies or for time series aggregation.
  • Kinesis Video Streams, to simplify processing of media streams.

At re:Invent 2018, we introduced in open preview Amazon Managed Streaming for Apache Kafka (MSK), a fully managed service that makes it easy to build and run applications that use Apache Kafka to process streaming data.

I am excited to announce that Amazon MSK is generally available today!

How it works

Apache Kafka (Kafka) is an open-source platform that enables customers to capture streaming data such as click stream events, transactions, IoT events, and application and machine logs, and to build applications that perform real-time analytics, run continuous transformations, and distribute this data to data lakes and databases in real time. You can use Kafka as a streaming data store to decouple applications producing streaming data (producers) from those consuming streaming data (consumers).

While Kafka is a popular enterprise data streaming and messaging framework, it can be difficult to set up, scale, and manage in production. Amazon MSK takes care of these management tasks and makes it easy to set up, configure, and run Kafka, along with Apache ZooKeeper, in an environment that follows best practices for high availability and security.

Your MSK clusters always run within an Amazon VPC managed by the MSK service. Your MSK resources are made available to your own VPC, subnet, and security group through elastic network interfaces (ENIs) which will appear in your account, as described in the following architectural diagram:

Customers can create a cluster in minutes, use AWS Identity and Access Management (IAM) to control cluster actions, authorize clients using TLS private certificate authorities fully managed by AWS Certificate Manager (ACM), encrypt data in-transit using TLS, and encrypt data at rest using AWS Key Management Service (KMS) encryption keys.

Amazon MSK continuously monitors server health and automatically replaces servers when they fail, automates server patching, and operates highly available ZooKeeper nodes as part of the service at no additional cost. Key Kafka performance metrics are published in the console and in Amazon CloudWatch. Amazon MSK is fully compatible with Kafka versions 1.1.1 and 2.1.0, so you can continue to run your applications, use Kafka’s admin tools, and use Kafka-compatible tools and frameworks without having to change your code.

Based on customer feedback during the open preview, Amazon MSK added many features such as:

  • Encryption in-transit via TLS between clients and brokers, and between brokers
  • Mutual TLS authentication using ACM private certificate authorities
  • Support for Kafka version 2.1.0
  • 99.9% availability SLA
  • HIPAA eligible
  • Cluster-wide storage scale up
  • Integration with AWS CloudTrail for MSK API logging
  • Cluster tagging and tag-based IAM policy application
  • Defining custom, cluster-wide configurations for topics and brokers

AWS CloudFormation support is coming in the next few weeks.

Creating a cluster

Let’s create a cluster using the AWS management console. I give the cluster a name, select the VPC I want to use the cluster from, and the Kafka version.

I then choose the Availability Zones (AZs) and the corresponding subnets to use in the VPC. In the next step, I select how many Kafka brokers to deploy in each AZ. More brokers allow you to scale the throughput of a cluster by allocating partitions to different brokers.

I can add tags to search and filter my resources, apply IAM policies to the Amazon MSK API, and track my costs. For storage, I leave the default storage volume size per broker.

I choose to use encryption within the cluster and to allow both TLS and plaintext traffic between clients and brokers. For data at rest, I use the AWS-managed customer master key (CMK), but you can select a CMK in your account, using KMS, for further control. You can use private TLS certificates to authenticate the identity of clients that connect to your cluster. This feature uses Private Certificate Authorities (CAs) from ACM. For now, I leave this option unchecked.

In the advanced settings, I leave the default values. For example, I could have chosen a different instance type for my brokers here. Some of these settings can be updated using the AWS CLI.

I create the cluster and monitor the status from the cluster summary, including the Amazon Resource Name (ARN) that I can use when interacting via CLI or SDKs.
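
Cluster creation can also be scripted. Here is a minimal sketch using boto3; the cluster name, subnet IDs, and security group ID below are placeholders.

import boto3

kafka = boto3.client("kafka")

response = kafka.create_cluster(
    ClusterName="my-msk-cluster",
    KafkaVersion="2.1.0",
    NumberOfBrokerNodes=3,  # total number of brokers, spread across the selected AZs
    BrokerNodeGroupInfo={
        "InstanceType": "kafka.m5.large",
        "ClientSubnets": ["subnet-11111111", "subnet-22222222", "subnet-33333333"],
        "SecurityGroups": ["sg-12345678"],
    },
)
print(response["ClusterArn"])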

When the status is active, the client information section provides specific details to connect to the cluster, such as:

  • The bootstrap servers I can use with Kafka tools to connect to the cluster.
  • The ZooKeeper connect list of hosts and ports.

I can get similar information using the AWS CLI:

  • aws kafka list-clusters to see the ARNs of your clusters in a specific region
  • aws kafka get-bootstrap-brokers --cluster-arn <ClusterArn> to get the Kafka bootstrap servers
  • aws kafka describe-cluster --cluster-arn <ClusterArn> to see more details on the cluster, including the Zookeeper connect string
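
The same information is available from the AWS SDKs. For example, here is a minimal sketch using boto3:

import boto3

kafka = boto3.client("kafka")

# List the clusters in the current region and print their connection details.
for cluster in kafka.list_clusters()["ClusterInfoList"]:
    print(cluster["ClusterName"], cluster["State"])
    print("ZooKeeper:", cluster["ZookeeperConnectString"])
    brokers = kafka.get_bootstrap_brokers(ClusterArn=cluster["ClusterArn"])
    # Depending on the encryption settings, a plaintext and/or a TLS broker list is returned.
    print("Bootstrap brokers:", brokers.get("BootstrapBrokerString") or brokers.get("BootstrapBrokerStringTls"))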

Quick demo of using Kafka

To start using Kafka, I create two EC2 instances in the same VPC: one will be a producer and one a consumer. To set them up as client machines, I download and extract the Kafka tools from the Apache website or any mirror. Kafka requires Java 8 to run, so I install Amazon Corretto 8.

On the producer instance, in the Kafka directory, I create a topic to send data from the producer to the consumer:

bin/kafka-topics.sh --create --zookeeper <ZookeeperConnectString> \
--replication-factor 3 --partitions 1 --topic MyTopic

Then I start a console-based producer:

bin/kafka-console-producer.sh --broker-list <BootstrapBrokerString> \
--topic MyTopic

On the consumer instance, in the Kafka directory, I start a console-based consumer:

bin/kafka-console-consumer.sh --bootstrap-server <BootstrapBrokerString> \
--topic MyTopic --from-beginning

Here’s a recording of a quick demo where I create the topic and then send messages from a producer (top terminal) to a consumer of that topic (bottom terminal):

Pricing and availability

Pricing is per Kafka broker-hour and per provisioned storage-hour. There is no cost for the Zookeeper nodes used by your clusters. AWS data transfer rates apply for data transfer in and out of MSK. You will not be charged for data transfer within the cluster in a region, including data transfer between brokers and data transfer between brokers and ZooKeeper nodes.

You can migrate your existing Kafka cluster to MSK using tools like MirrorMaker (which comes with open source Kafka) to replicate data from your clusters into an MSK cluster.

Upstream compatibility is a core tenet of Amazon MSK. Our code changes to the Kafka platform are released back to open source.

Amazon MSK is available in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney), EU (Frankfurt), EU (Ireland), EU (Paris), and EU (London).

I look forward to seeing how you are going to use Amazon MSK to simplify building and migrating streaming applications to the cloud!

New for AWS Lambda – Use Any Programming Language and Share Common Components

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-aws-lambda-use-any-programming-language-and-share-common-components/

I remember the excitement when AWS Lambda was announced in 2014! Four years on, customers are using Lambda functions for many different use cases. For example, iRobot is using AWS Lambda to provide compute services for their Roomba robotic vacuum cleaners, Fannie Mae to run Monte Carlo simulations for millions of mortgages, and Bustle to serve billions of requests for their digital content.

Today, we are introducing two new features that are going to make serverless development even easier:

  • Lambda Layers, a way to centrally manage code and data that is shared across multiple functions.
  • Lambda Runtime API, a simple interface to use any programming language, or a specific language version, for developing your functions.

These two features can be used together: runtimes can be shared as layers so that developers can pick them up and use their favorite programming language when authoring Lambda functions.

Let’s see how they work more in detail.

Lambda Layers

When building serverless applications, it is quite common to have code that is shared across Lambda functions. It can be custom code used by more than one function, or a standard library that you add to simplify the implementation of your business logic.

Previously, you had to package and deploy this shared code together with every function using it. Now, you can put common components in a ZIP file and upload it as a Lambda Layer. Your function code doesn’t need to change and can reference the libraries in the layer as it normally would.
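
For example, here is a minimal sketch of publishing a layer and referencing it from a function using boto3; the file name, layer name, function name, and runtime below are placeholders.

import boto3

lambda_client = boto3.client("lambda")

# Publish the ZIP file containing the shared code as a new layer version.
with open("my-shared-libs.zip", "rb") as f:
    layer = lambda_client.publish_layer_version(
        LayerName="my-shared-libs",
        Content={"ZipFile": f.read()},
        CompatibleRuntimes=["python3.7"],
    )

# Reference the new layer version from an existing function.
lambda_client.update_function_configuration(
    FunctionName="my-function",
    Layers=[layer["LayerVersionArn"]],
)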

Layers can be versioned to manage updates; each version is immutable. When a version is deleted or permissions to use it are revoked, functions that used it previously will continue to work, but you won’t be able to create new ones.

In the configuration of a function, you can reference up to five layers, one of which can optionally be a runtime. When the function is invoked, layers are installed in /opt in the order you provided. Order is important because layers are all extracted under the same path, so each layer can potentially overwrite the previous one. This approach can be used to customize the environment. For example, the first layer can be a runtime and the second layer adds specific versions of the libraries you need.

The overall, uncompressed size of function and layers is subject to the usual unzipped deployment package size limit.

Layers can be used within an AWS account, shared between accounts, or shared publicly with the broad developer community.

There are many advantages when using layers. For example, you can use Lambda Layers to:

  • Enforce separation of concerns, between dependencies and your custom business logic.
  • Make your function code smaller and more focused on what you want to build.
  • Speed up deployments, because less code must be packaged and uploaded, and dependencies can be reused.

Based on our customer feedback, and to provide an example of how to use Lambda Layers, we are publishing a public layer which includes NumPy and SciPy, two popular scientific libraries for Python. This prebuilt and optimized layer can help you start very quickly with data processing and machine learning applications.

In addition to that, you can find layers for application monitoring, security, and management from partners such as Datadog, Epsagon, IOpipe, NodeSource, Thundra, Protego, PureSec, Twistlock, Serverless, and Stackery.

Using Lambda Layers

In the Lambda console I can now manage my own layers:

I don’t want to create a new layer now but use an existing one in a function. I create a new Python function and, in the function configuration, I can see that there are no referenced layers. I choose to add a layer:

From the list of layers compatible with the runtime of my function, I select the one with NumPy and SciPy, using the latest available version:

After I add the layer, I click Save to update the function configuration. If you’re using more than one layer, you can adjust here the order in which they are merged with the function code.

To use the layer in my function, I just have to import the features I need from NumPy and SciPy:

import numpy as np
from scipy.spatial import ConvexHull

def lambda_handler(event, context):

    print("\nUsing NumPy\n")

    print("random matrix_a =")
    matrix_a = np.random.randint(10, size=(4, 4))
    print(matrix_a)

    print("random matrix_b =")
    matrix_b = np.random.randint(10, size=(4, 4))
    print(matrix_b)

    print("matrix_a * matrix_b = ")
    print(matrix_a.dot(matrix_b))
    print("\nUsing SciPy\n")

    num_points = 10
    print(num_points, "random points:")
    points = np.random.rand(num_points, 2)
    for i, point in enumerate(points):
        print(i, '->', point)

    hull = ConvexHull(points)
    print("The smallest convex set containing all",
        num_points, "points has", len(hull.simplices),
        "sides,\nconnecting points:")
    for simplex in hull.simplices:
        print(simplex[0], '<->', simplex[1])

I run the function, and looking at the logs, I can see some interesting results.

First, I am using NumPy to perform matrix multiplication (matrices and vectors are often used to represent the inputs, outputs, and weights of neural networks):

random matrix_a =
[[8 4 3 8]
[1 7 3 0]
[2 5 9 3]
[6 6 8 9]]
random matrix_b =
[[2 4 7 7]
[7 0 0 6]
[5 0 1 0]
[4 9 8 6]]
matrix_a * matrix_b = 
[[ 91 104 123 128]
[ 66 4 10 49]
[ 96 35 47 62]
[130 105 122 132]]

Then, I use SciPy advanced spatial algorithms to compute something quite hard to build by myself: finding the smallest “convex set” containing a list of points on a plane. For example, this can be used in a Lambda function receiving events from multiple geographic locations (corresponding to buildings, customer locations, or devices) to visually “group” similar events together in an efficient way:

10 random points:
0 -> [0.07854072 0.91912467]
1 -> [0.11845307 0.20851106]
2 -> [0.3774705 0.62954561]
3 -> [0.09845837 0.74598477]
4 -> [0.32892855 0.4151341 ]
5 -> [0.00170082 0.44584693]
6 -> [0.34196204 0.3541194 ]
7 -> [0.84802508 0.98776034]
8 -> [0.7234202 0.81249389]
9 -> [0.52648981 0.8835746 ]
The smallest convex set containing all 10 points has 6 sides,
connecting points:
1 <-> 5
0 <-> 5
0 <-> 7
6 <-> 1
8 <-> 7
8 <-> 6

When I was building this example, there was no need to install or package dependencies. I could quickly iterate on the code of the function. Deployments were very fast because I didn’t have to include large libraries or modules.

To visualize the output of SciPy, it was easy for me to create an additional layer to import matplotlib, a plotting library. Adding a few lines of code at the end of the previous function, I can now upload to Amazon Simple Storage Service (S3) an image that shows how the “convex set” wraps all the points:

    # Note: these lines assume additional imports (io, boto3, and matplotlib.pyplot as plt)
    # and that S3_BUCKET_NAME and S3_KEY are defined elsewhere in the function code.
    plt.plot(points[:,0], points[:,1], 'o')
    for simplex in hull.simplices:
        plt.plot(points[simplex, 0], points[simplex, 1], 'k-')
        
    img_data = io.BytesIO()
    plt.savefig(img_data, format='png')
    img_data.seek(0)

    s3 = boto3.resource('s3')
    bucket = s3.Bucket(S3_BUCKET_NAME)
    bucket.put_object(Body=img_data, ContentType='image/png', Key=S3_KEY)
    
    plt.close()

Lambda Runtime API

You can now select a custom runtime when creating or updating a function:

With this selection, the function must include (in its code or in a layer) an executable file called bootstrap, responsible for the communication between your code (that can use any programming language) and the Lambda environment.

The runtime bootstrap uses a simple HTTP-based interface to get the event payload for a new invocation and to return the response from the function. Information about the interface endpoint and the function handler is shared as environment variables.

For the execution of your code, you can use anything that can run in the Lambda execution environment. For example, you can bring an interpreter for the programming language of your choice.
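
To give an idea of what a bootstrap does, here is a minimal, simplified sketch of the runtime loop in Python, based on the documented Runtime API paths. A real bootstrap would load the handler named by the _HANDLER environment variable and also report initialization and invocation errors back to the API.

import json
import os
import urllib.request

RUNTIME_API = os.environ["AWS_LAMBDA_RUNTIME_API"]
BASE_URL = "http://" + RUNTIME_API + "/2018-06-01/runtime/invocation"

def handler(event):
    # Placeholder: dispatch to the real function handler here.
    return {"echo": event}

while True:
    # Long-poll the Runtime API for the next invocation event.
    with urllib.request.urlopen(BASE_URL + "/next") as response:
        request_id = response.headers["Lambda-Runtime-Aws-Request-Id"]
        event = json.loads(response.read())

    # Post the handler result back as the invocation response.
    result = json.dumps(handler(event)).encode("utf-8")
    urllib.request.urlopen(
        urllib.request.Request(BASE_URL + "/" + request_id + "/response", data=result)
    )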

You only need to know how the Runtime API works if you want to manage or publish your own runtimes. As a developer, you can quickly use runtimes that are shared with you as layers.

We are making these open source runtimes available today:

  • C++ (AWS)
  • Rust (AWS)

We are also working with our partners to provide more open source runtimes:

  • Erlang (Alert Logic)
  • Elixir (Alert Logic)
  • Cobol (Blu Age)
  • N|Solid (NodeSource)
  • PHP (Stackery)

The Runtime API is the future of how we’ll support new languages in Lambda. For example, this is how we built support for the Ruby language.

Available Now

You can use runtimes and layers in all regions where Lambda is available, via the console or the AWS Command Line Interface (CLI). You can also use the AWS Serverless Application Model (SAM) and the SAM CLI to test, deploy and manage serverless applications using these new features.

There is no additional cost for using runtimes and layers. The storage of your layers counts towards the AWS Lambda function storage limit per region.

To learn more about using the Runtime API and Lambda Layers, don’t miss our webinar on December 11, hosted by Principal Developer Advocate Chris Munns.

I am so excited by these new features. Please let me know what you are going to build next!

New – AWS Toolkits for PyCharm, IntelliJ (Preview), and Visual Studio Code (Preview)

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-aws-toolkits-for-pycharm-intellij-preview-and-visual-studio-code-preview/

Software developers have their own preferred tools. Some use powerful editors, others Integrated Development Environments (IDEs) that are tailored for specific languages and platforms. In 2014 I created my first AWS Lambda function using the editor in the Lambda console. Now, you can choose from a rich set of tools to build and deploy serverless applications. For example, the editor in the Lambda console was greatly enhanced last year when AWS Cloud9 was released. For .NET applications, you can use the AWS Toolkit for Visual Studio and the AWS Tools for Visual Studio Team Services.

AWS Toolkits for PyCharm, IntelliJ, and Visual Studio Code

Today, we are announcing the general availability of the AWS Toolkit for PyCharm. We are also announcing the developer preview of the AWS Toolkits for IntelliJ and Visual Studio Code, which are under active development in GitHub. These open source toolkits will enable you to easily develop serverless applications, including a full create, step-through debug, and deploy experience in the IDE and language of your choice, be it Python, Java, Node.js, or .NET.

For example, using the AWS Toolkit for PyCharm you can:

These toolkits are distributed under the open source Apache License, Version 2.0.

Installation

Some features use the AWS Serverless Application Model (SAM) CLI. You can find installation instructions for your system here.

The AWS Toolkit for PyCharm is available via the IDEA Plugin Repository. To install it, in the Settings/Preferences dialog, click Plugins, search for “AWS Toolkit”, use the checkbox to enable it, and click the Install button. You will need to restart your IDE for the changes to take effect.

The AWS Toolkits for IntelliJ and Visual Studio Code are currently in developer preview and under active development. You are welcome to build and install them from the GitHub repositories:

Building a Serverless application with PyCharm

After installing the AWS SAM CLI and the AWS Toolkit, I create a new project in PyCharm and choose SAM on the left to create a serverless application using the AWS Serverless Application Model. I call my project hello-world in the Location field. Expanding More Settings, I choose which SAM template to use as the starting point for my project. For this walkthrough, I select the “AWS SAM Hello World” template.

In PyCharm you can use credentials and profiles from your AWS Command Line Interface (CLI) configuration. You can change AWS region quickly if you have multiple environments.
The AWS Explorer shows Lambda functions and AWS CloudFormation stacks in the selected AWS region. Starting from a CloudFormation stack, you can see which Lambda functions are part of it.

The function handler is in the app.py file. After I open the file, I click on the Lambda icon on the left of the function declaration to have the option to run the function locally or start a local step-by-step debugging session.

First, I run the function locally. I can configure the payload of the event provided as input for the local invocation, starting from the event templates provided for most services, such as Amazon API Gateway, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), and so on. You can use a file for the payload, or select the share checkbox to make it available to other team members. The function is executed locally, but here you can choose the credentials and the region to be used if the function calls other AWS services, such as Amazon Simple Storage Service (S3) or Amazon DynamoDB.

A local container is used to emulate the Lambda execution environment. This function is implementing a basic web API, and I can check that the result is in the format expected by the API Gateway.

After that, I want to get more information on what my code is doing. I set a breakpoint and start a local debugging session. I use the same input event as before. Again, you can choose the credentials and region for the AWS services used by the function.

I step over the HTTP request in the code to inspect the response in the Variables tab. Here you have access to all local variables, including the event and the context provided in input to the function.

After that, I resume the program to reach the end of the debugging session.

Now I am confident enough to deploy the serverless application by right-clicking on the project (or the SAM template file). I can create a new CloudFormation stack or update an existing one. For now, I create a new stack called hello-world-prod. For example, you can have one stack for production and one for testing. I select an S3 bucket in the region to store the package used for the deployment. If your template has parameters, here you can set the values used by this deployment.

After a few minutes, the stack creation is complete and I can run the function in the cloud with a right-click in the AWS Explorer. There is also an option to jump to the source code of the function.

As expected, the result of the remote invocation is the same as the local execution. My serverless application is in production!

Using these toolkits, developers can test locally to find problems before deployment, change the code of their application or the resources they need in the SAM template, and update an existing stack, quickly iterating until they reach their goal. For example, they can add an S3 bucket to store images or documents, add a DynamoDB table to store their users, or change the permissions used by their functions.

I am really excited by how much faster and easier it is to build your ideas on AWS. Now you can use your preferred environment to accelerate even further. I look forward to seeing what you will do with these new tools!

Amazon Forecast – Time Series Forecasting Made Easy

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-forecast-time-series-forecasting-made-easy/

The capacity to foresee the future would be an incredible superpower. At AWS, we can’t give you that, but we can help you use machine learning to forecast time series in a few steps.

The goal of time series forecasting is to predict future values of time-dependent data such as weekly sales, daily inventory levels, or hourly website traffic. Companies today use everything from simple spreadsheets to complex financial planning software to attempt to accurately forecast future business outcomes such as product demand, resource needs, or financial performance.

These tools build forecasts by looking at a historical series of data, which is called time series data. For example, such tools may try to predict the future sales of a raincoat by looking only at its previous sales data with the underlying assumption that the future is determined by the past.

This approach can struggle to produce accurate forecasts for large sets of data that have irregular trends. Also, it fails to easily combine data series that change over time (such as price, discounts, web traffic) with relevant independent variables like product features and store locations.

Introducing Amazon Forecast

Amazon has been solving time-series forecasting challenges across multiple areas including retail, supply chain, and server capacity for over two decades. Using machine learning techniques we have learned from this experience, today we are introducing Amazon Forecast, a fully managed deep learning service for time-series forecasting. Amazon Forecast packages our years of experience in building and operating scalable, highly accurate forecasting technology into an easy-to-use and fully-managed service.

You can use Amazon Forecast to generate predictions on time-series data to estimate:

  • Operational metrics, such as web traffic to servers, AWS usage, or IoT sensor metrics.
  • Business metrics, such as sales, profits, and expenses.
  • Resource requirements, such as the quantity of energy or bandwidth needed to meet a specific demand.
  • The amount of raw goods, services, or other inputs needed by a manufacturing process.
  • Retail demand considering the impact of price discounts, marketing promotions, and other campaigns.

Amazon Forecast is designed with these three main benefits in mind:

  • Accuracy, using deep neural nets and traditional statistical methods for forecasting. Amazon Forecast can learn from your data automatically and pick the best algorithms to train a model designed for your data. When you have many related time series, forecasts made using the Amazon Forecast deep learning algorithms, such as DeepAR and MQ-RNN, tend to be more accurate than forecasts made with traditional methods, such as exponential smoothing.
  • End-to-end management, automating the entire forecasting workflow from data upload to data processing, model training, dataset updates, and forecasting. Enterprise systems can directly consume your forecasts as an API.
  • Usability, in the console you can look up and visualize forecasts for any time series at different granularities. You can also see metrics for the accuracy of your predictor’s forecasts. Developers with no machine learning expertise can use the Amazon Forecast APIs, AWS Command Line Interface (CLI), or the console to import training data into one or more Amazon Forecast datasets, train models, and deploy the models to generate forecasts.

Using Amazon Forecast

When creating forecasting projects in Amazon Forecast, you primarily work with the following resources:

  • Dataset, to upload your data. Amazon Forecast algorithms use the datasets to train models.
  • Dataset Group, a container for one or more datasets, to use multiple datasets for model training.
  • Predictor, a result of training models. To create a predictor you provide a dataset group and a recipe (which provides an algorithm) or let Amazon Forecast decide which forecasting model works best. The algorithm trains a model using the data in the datasets.
  • Forecast, using a predictor you can run inference to generate forecasts.

You can use Amazon Forecast with the AWS console, CLI, and SDKs. For example, you can use the AWS SDK for Python to train a model or get a forecast in a Jupyter notebook, or the AWS SDK for Java to add forecasting capabilities to an existing business application.
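
As an illustration, here is a minimal sketch using the AWS SDK for Python (boto3); the forecast ARN and item id are hypothetical placeholders, and the exact API surface may evolve while the service is in preview.

import boto3

forecast = boto3.client("forecast")            # manage datasets, predictors, and forecasts
forecastquery = boto3.client("forecastquery")  # query the values of a generated forecast

# List the predictors trained in this account and region.
for predictor in forecast.list_predictors()["Predictors"]:
    print(predictor["PredictorArn"], predictor["Status"])

# Query a generated forecast for a single item (placeholder ARN and item id).
response = forecastquery.query_forecast(
    ForecastArn="arn:aws:forecast:us-east-1:123456789012:forecast/my_forecast",
    Filters={"item_id": "item_001"},
)
print(response["Forecast"]["Predictions"])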

Pricing and Availability

With Amazon Forecast, you pay only for what you use. There are three different types of costs in Amazon Forecast:

  • Generated forecast: A forecast is a prediction of future values for a single variable over any time horizon. Forecasts are billed in units of 1,000 (rounded up to the nearest thousand).
  • Data storage: Costs for each GB of data stored and used to train your models.
  • Training hours: Costs for each hour of training required for a custom model based on data provided by customers.

As part of the AWS Free Tier, for the first two months after first using Amazon Forecast, you have no charge for:

  • Generated forecasts: Up to 10K time series forecasts per month
  • Data storage: Up to 10GB per month
  • Training hours: Up to 10 hours per month

Amazon Forecast is available in preview in the following regions: US East (Northern Virginia), US West (Oregon).

It has never been so easy to do time-series forecasts with high accuracy. I really look forward to seeing what our customers are going to build with this!

Amazon DynamoDB On-Demand – No Capacity Planning and Pay-Per-Request Pricing

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-dynamodb-on-demand-no-capacity-planning-and-pay-per-request-pricing/

Just a few years ago, creating a database that could support your business at any scale while providing consistent low latency was a daunting task. That changed for me in 2012 while reading Werner Vogels’ blog post announcing Amazon DynamoDB (it was a few months before I joined AWS). DynamoDB was built on the principles in the original Dynamo paper that Amazon published in 2007. Over the years, lots of new features have been introduced to further simplify how AWS customers use databases. You can now create fully managed, multi-region, multi-master database tables with features such as encryption at rest, point-in-time recovery, in-memory caching, and a 99.99% uptime service level agreement (SLA).

Amazon DynamoDB on-demand

Today we are introducing Amazon DynamoDB on-demand, a flexible new billing option for DynamoDB capable of serving thousands of requests per second without capacity planning. DynamoDB on-demand offers simple pay-per-request pricing for read and write requests so that you only pay for what you use, making it easy to balance costs and performance. For tables using on-demand mode, DynamoDB instantly accommodates customers’ workloads as they ramp up or down to any previously observed traffic level. If the level of traffic hits a new peak, DynamoDB adapts rapidly to accommodate the workload.

In the DynamoDB console, you can choose the on-demand read/write capacity mode when creating a new table, or change it later in the Capacity tab.

Tables using on-demand mode support all DynamoDB features (such as encryption at rest, point-in-time recovery, global tables, and so on) with the exception of auto scaling, which is not applicable with this mode.

Indexes created on a table using on-demand mode inherit the same scalability and billing model. You don’t need to specify throughput capacity settings for indexes, and you pay by their use. If you don’t have read/write traffic to a table using on-demand mode and its indexes, you only pay for the data storage.

DynamoDB on-demand is useful if your application traffic is difficult to predict and control, your workload has large spikes of short duration, or if your average table utilization is well below the peak. For example:

  • New applications, or applications whose database workload is complex to forecast
  • Developers working on serverless stacks with pay-per-use pricing
  • SaaS providers and independent software vendors (ISVs) who want the simplicity and resource isolation of deploying a table per subscriber

You can change a table from provisioned capacity to on-demand once per day. You can go from on-demand capacity to provisioned as often as you want.

A quick performance test

Let’s test some load on a newly created DynamoDB table using on-demand mode!

I created two serverless applications:

  • The first application creates a REST API on top of a DynamoDB table using an AWS Lambda function and Amazon API Gateway. Using this API, you can read, add, update, and delete items in the table using HTTP methods such as GET, POST, PUT, and DELETE.
  • The second application starts 1,000 Lambda functions in parallel to generate load on the API endpoint, using random HTTP methods and random data for the items.

Each load-generating function runs 100 concurrent requests, and when they have all terminated it starts another 100, and so on, for one minute. There is no ramp-up period; load generation starts immediately at full speed!

As you can see in the metrics tab for this table in the DynamoDB console, I reached a peak of almost 5,000 requests per second very quickly and without any throttling.

The scaling of the serverless stack, from API Gateway to the Lambda function and the DynamoDB table, was fully managed. I didn’t have to plan for the right throughput, and I could focus on the application logic I was building.

With DynamoDB on-demand you pay only for what you use. For example, in the US East (N. Virginia) region, you are charged $1.25 per million write request units and $0.25 per million read request units, plus the usual data storage costs.

You can use the AWS Command Line Interface (CLI), AWS SDKs, and AWS CloudFormation to create a table using on-demand mode or to change the read/write capacity mode of an existing table.
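
For example, here is a minimal boto3 sketch; the table names and key schema below are placeholders.

import boto3

dynamodb = boto3.client("dynamodb")

# Create a new table using on-demand (pay-per-request) mode.
dynamodb.create_table(
    TableName="my-on-demand-table",
    AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
)

# Switch an existing table from provisioned capacity to on-demand mode.
dynamodb.update_table(
    TableName="my-existing-table",
    BillingMode="PAY_PER_REQUEST",
)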

Available now

DynamoDB on-demand is available globally in all commercial regions.

I am really excited by the new possibilities for developers, ISVs and SaaS providers, and I look forward to seeing what you build with pay-per-request billing.

New – Amazon Kinesis Data Analytics for Java

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-amazon-kinesis-data-analytics-for-java/

Customers are using Amazon Kinesis to collect, process, and analyze real-time streaming data. In this way, they can react quickly to new information from their business, their infrastructure, or their customers. For example, Epic Games ingests more than 1.5 million game events per second for its popular online game, Fortnite.

With Amazon Kinesis Data Analytics you can process data in real-time using standard SQL. While SQL provides an easy way to quickly query large volumes of streaming data without learning new frameworks or languages, many customers also want to build more sophisticated data processing applications using general-purpose programming languages.

Using Java with Amazon Kinesis Data Analytics

Today, we are introducing support for Java in Amazon Kinesis Data Analytics. Now, developers can use their own Java code to create powerful real-time applications that process streaming data like continuously transforming and loading data into their data lakes, generating metrics to feed real-time gaming leaderboards, applying machine learning models to data streams from connected devices, and more.

To use this new functionality, developers build applications using open source libraries that include built-in operators for common data processing functions, allowing applications to organize, transform, aggregate, and analyze data at any scale. Both libraries are open source, and you can run them anywhere:

  • Apache Flink, an open source framework and engine for processing data streams.
  • AWS SDK for Java, providing Java APIs for many AWS services.

Developers can use these Java libraries within their Integrated Development Environment (IDE) of choice. Using these libraries, the following AWS services can be integrated with as little as one line of code:

  • Streaming Data Sources: Amazon Kinesis Data Streams
  • Streaming Destinations: Amazon S3, Amazon DynamoDB, Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose

In addition to the pre-built AWS integrations, the Java libraries include more connectors to tools such as Cassandra, Elasticsearch, RabbitMQ, and Redis, as well as the ability to build custom integrations.

Building a Kinesis Data Streams Java Application

I prepared a simple Java application that implements the “mandatory” word count example for data processing. I send some paragraphs of text as input and, every five seconds, I get as output the number of times each word is used.

First, I create two Kinesis Data Streams:

  • TextInputStream, where I am going to send my input records
  • WordCountOutputStream, where I am going to read the output of the Java application

 

Here is the code of the word-count Java application. To read and write from Kinesis Data Streams, I am using the Kinesis Connector from the Apache Flink project.

public class StreamingJob {

    private static final String region = "us-east-1";
    private static final String inputStreamName = "TextInputStream";
    private static final String outputStreamName = "WordCountOutputStream";

    private static DataStream<String> createSourceFromStaticConfig(
            StreamExecutionEnvironment env) {
        Properties inputProperties = new Properties();
        inputProperties.setProperty(ConsumerConfigConstants.AWS_REGION, region);
        inputProperties.setProperty(ConsumerConfigConstants.STREAM_INITIAL_POSITION,
            "LATEST");

        return env.addSource(new FlinkKinesisConsumer<>(inputStreamName,
            new SimpleStringSchema(), inputProperties));
    }

    private static FlinkKinesisProducer<String> createSinkFromStaticConfig() {
        Properties outputProperties = new Properties();
        outputProperties.setProperty(ConsumerConfigConstants.AWS_REGION, region);

        FlinkKinesisProducer<String> sink = new FlinkKinesisProducer<>(new
            SimpleStringSchema(), outputProperties);
        sink.setDefaultStream(outputStreamName);
        sink.setDefaultPartition("0");
        return sink;
    }

    public static void main(String[] args) throws Exception {

        final StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> input = createSourceFromStaticConfig(env);

        input.flatMap(new Tokenizer())
             .keyBy(0)
             .timeWindow(Time.seconds(5))
             .sum(1)
             .map(new MapFunction<Tuple2<String, Integer>, String>() {
                 @Override
                 public String map(Tuple2<String, Integer> value) throws Exception {
                     return value.f0 + "," + value.f1.toString();
                }
             })
             .addSink(createSinkFromStaticConfig());

        env.execute("Word Count");
    }

    public static final class Tokenizer
        implements FlatMapFunction<String, Tuple2<String, Integer>> {

        @Override
        public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
            String[] tokens = value.toLowerCase().split("\\W+");
            for (String token : tokens) {
                if (token.length() > 0) {
                    out.collect(new Tuple2<>(token, 1));
                }
            }
        }
    }
    
}

The most important part of the application is the manipulation of the input object, where I apply a few DataStream Transformations:

  1. I start with a DataStream containing the String objects coming from the input stream.
  2. I use a Tokenizer in a FlatMap to split each sentence into “words”, each word followed by the number “1”.
  3. I apply the KeyBy operator to logically partition the stream with respect to the “word”.
  4. I use a 5-second tumbling window.
  5. I aggregate within the window, summing up the number “1” for each word to count its occurrences.
  6. I use a simple Map for each record to join the word and the number into a comma-separated values (CSV) String that I send to the output stream.

One of the most powerful operators shown here is the KeyBy operator. It enables you to re-organize a particular stream by a specified key in real-time. This type of re-keying enables further downstream operations like aggregations, counts, and much more. This enables you to set up streaming map-reduce on different keys within the same application.

I build the Java application using Maven and upload the output JAR to an Amazon Simple Storage Service (S3) bucket in the region where I want to deploy the application. In the Kinesis Data Analytics console, I create a new application and select “Flink” as the runtime:

I then configure the application to use the code on my S3 bucket. The console updates the IAM role for the application to have permissions to read the code.

You can optionally add key/value properties to the configuration of the application. You can read those properties from within the application, to provide customization at deployment time.

For monitoring, I leave the default metrics. I enable logging to Amazon CloudWatch, for errors only.

Don’t forget to add permissions to the IAM role created by the console to allow the Kinesis Analytics application to read and write from the streams used for input and output, TextInputStream and WordCountOutputStream in my case.

I can now start the application with the “Run” button, and when it is running, I use a script that I prepared to put some text (I am using a description of the Amazon Kinesis platform) in the input stream:

$ python put_records.py TextInputStream
Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data...

The behavior of my application is summarized in the console in the Application Graph, a visual representation of the data flow consisting of operators and intermediate results (complex applications, using multiple streams, have a much more interesting graph):

To read the output stream, I am using a Lambda function written in Python. I am using the one provided with the Kinesis Record Aggregation & Deaggregation Modules for AWS Lambda, which provides automatic “de-aggregation” of records aggregated by the Amazon Kinesis Producer Library (KPL).
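
Here is a minimal sketch of such a consumer function, assuming the deaggregate_records helper from that project is packaged with the function; it simply prints each “word,count” record produced by the Flink application.

import base64
from aws_kinesis_agg.deaggregator import deaggregate_records

def lambda_handler(event, context):
    # Expand KPL-aggregated records into individual user records.
    for record in deaggregate_records(event["Records"]):
        payload = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        # Each payload is a "word,count" value emitted by the word-count application.
        print(payload)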

As expected, in the CloudWatch Logs console I get the list of the words and the number of times they were used, updated every 5 seconds by the Lambda function:

Pricing and Availability

With Amazon Kinesis Data Analytics for Java, you pay only for what you use. Pricing is similar to Amazon Kinesis Data Analytics for SQL, but there are a few differences.

For Java applications, you are charged a single additional Amazon Kinesis Processing Unit (KPU) per application, used for application orchestration. Java applications are also charged for running application storage and durable application backups. Running application storage is used for Amazon Kinesis Data Analytics’ stateful processing capabilities and is charged per GB-month. Durable application backups are optional and provide a point-in-time recovery point for applications, charged per GB-month.

For example, pricing is $0.11 per KPU hour in US East (N. Virginia), and you are charged for running application storage ($0.10 per GB-month) and durable application backups ($0.023 per GB-month).
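As a rough, back-of-the-envelope example based on those numbers (my assumptions, not an official quote): an application that needs one KPU for its own processing is billed for two KPUs, including the additional orchestration one, which comes to about 2 × 730 hours × $0.11 ≈ $160.60 per month in US East (N. Virginia), plus running application storage (for example, 50 GB-month × $0.10 = $5.00) and any optional durable application backups.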

Available Now

Amazon Kinesis Data Analytics for Java is available now in US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland).

I have only scratched the surface of the stream processing capabilities that Java support brings to Amazon Kinesis Data Analytics. I think this is a powerful tool that can enable new use cases. Let me know what you are going to build with it!

New – Amazon DynamoDB Transactions

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-amazon-dynamodb-transactions/

Over the years, customers have used Amazon DynamoDB for lots of different use cases, from building microservices and mobile backends to implementing gaming and Internet of Things (IoT) solutions. For example, Capital One uses DynamoDB to reduce the latency of their mobile applications by moving their mainframe transactions to a serverless architecture. Tinder migrated user data to DynamoDB with zero downtime, to get the scalability they need to support their global user base.

Developers sometimes need to implement business logic that requires multiple, all-or-nothing operations across one or more tables. This requirement can add unnecessary complexity to their implementation. Today, we are making these use cases easier to build on DynamoDB with native support for transactions!

Introducing Amazon DynamoDB Transactions

DynamoDB transactions provide developers with atomicity, consistency, isolation, and durability (ACID) across one or more tables within a single AWS account and region. You can use transactions when building applications that require coordinated inserts, deletes, or updates to multiple items as part of a single logical business operation. DynamoDB is the only non-relational database that supports transactions across multiple partitions and tables.

Transactions bring the scale, performance, and enterprise benefits of DynamoDB to a broader set of workloads. Many use cases are easier and faster to implement using transactions, for example:

  • Processing financial transactions
  • Fulfilling and managing orders
  • Building multiplayer game engines
  • Coordinating actions across distributed components and services

Two new DynamoDB operations have been introduced for handling transactions:

  • TransactWriteItems, a batch operation that contains a write set, with one or more PutItem, UpdateItem, and DeleteItem operations. TransactWriteItems can optionally check for prerequisite conditions that must be satisfied before making updates. These conditions may involve the same or different items than those in the write set. If any condition is not met, the transaction is rejected.
  • TransactGetItems, a batch operation that contains a read set, with one or more GetItem operations. If a TransactGetItems request is issued on an item that is part of an active write transaction, the read transaction is canceled. To get the previously committed value, you can use a standard read.

Each transaction can include up to 10 unique items or up to 4 MB of data, including conditions.
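As a sketch of the read side, using the same AWS SDK for JavaScript in Node.js style as the write example later in this post (itemId and playerId are placeholders defined elsewhere), a TransactGetItems call could look like this:

const AWS = require('aws-sdk');
const dynamoDb = new AWS.DynamoDB();

// Read two items from different tables as a single, isolated snapshot
const data = await dynamoDb.transactGetItems({
    TransactItems: [
        { Get: { TableName: 'items',   Key: { id: { S: itemId } } } },
        { Get: { TableName: 'players', Key: { id: { S: playerId } } } }
    ]
}).promise();

// Responses are returned in the same order as the requested items
const item   = data.Responses[0].Item;
const player = data.Responses[1].Item;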

With this new feature, DynamoDB offers multiple read and write options to meet different application requirements, providing huge flexibility to developers implementing complex, data-driven business logic:

  • Three options for reads—eventual consistency, strong consistency, and transactional.
  • Two for writes—standard and transactional.

For example, imagine you are building a game where players can buy items with virtual coins:

  • In the players table, each player has a number of coins and an inventory of purchased items.
  • In the items table, each item has a price and is marked as available (or not) with a Boolean value.

To purchase an item, you can now implement a single atomic transaction:

  1. First, check that the item is available and the player has the necessary coins.
  2. If those conditions are satisfied, the item is marked as not available and owned by the player.
  3. The purchased item is then added to the player inventory list.

In JavaScript, using the AWS SDK for JavaScript in Node.js, you would have code similar to this:

// Using the AWS SDK for JavaScript (v2); itemId, playerId, and
// itemPrice are defined elsewhere in the application.
const AWS = require('aws-sdk');
const dynamoDb = new AWS.DynamoDB();

const data = await dynamoDb.transactWriteItems({
    TransactItems: [
        {
            // Mark the item as no longer available and owned by the player,
            // only if it is still available
            Update: {
                TableName: 'items',
                Key: { id: { S: itemId } },
                ConditionExpression: 'available = :true',
                UpdateExpression: 'set available = :false, ' +
                    'ownedBy = :player',
                ExpressionAttributeValues: {
                    ':true': { BOOL: true },
                    ':false': { BOOL: false },
                    ':player': { S: playerId }
                }
            }
        },
        {
            // Subtract the price from the player's coins and add the item to
            // the inventory, only if the player has enough coins
            Update: {
                TableName: 'players',
                Key: { id: { S: playerId } },
                ConditionExpression: 'coins >= :price',
                UpdateExpression: 'set coins = coins - :price, ' +
                    'inventory = list_append(inventory, :items)',
                ExpressionAttributeValues: {
                    ':items': { L: [{ S: itemId }] },
                    ':price': { N: itemPrice.toString() }
                }
            }
        }
    ]
}).promise();

Using Transactions

Transactions are enabled for all single-region DynamoDB tables and are disabled on global tables by default. You can choose to enable transactions on global tables by request, but replication across regions is asynchronous and eventually consistent. You may observe partially completed transactions during replication to other regions. Additionally, simultaneous writes to the same item in different regions are not guaranteed to be serially isolated.

Items are not locked during a transaction. DynamoDB transactions provide serializable isolation. If an item is modified outside of a transaction while the transaction is in progress, the transaction is canceled and an exception is thrown with details about which item or items caused it to be canceled.
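For example, with the AWS SDK for JavaScript you could handle a canceled transaction like this (a sketch, where params stands for the same parameters used in the TransactWriteItems example above):

try {
    await dynamoDb.transactWriteItems(params).promise();
} catch (err) {
    if (err.code === 'TransactionCanceledException') {
        // The error details include the cancellation reasons, such as a failed
        // condition or a conflict with a concurrent transaction
        console.error('Transaction canceled:', err.message);
    } else {
        throw err; // not a cancellation, rethrow
    }
}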

When creating an AWS Identity and Access Management (IAM) policy, there are no new permissions for TransactGetItems and TransactWriteItems. The existing DynamoDB UpdateItem, PutItem, DeleteItem, and GetItem actions also authorize the use of those operations within transactions. For example, if an IAM user has only PutItem permission, they can send a transaction with one or more puts, but if they add a delete to the write set, it will be rejected because they do not have DeleteItem permission.

For any committed operation that was part of a transaction, DynamoDB Streams adds a new field, transaction-id, as a universally unique identifier (UUID) for the transaction. The in-order and exactly-once semantics of DynamoDB Streams guarantee that all updates of a TransactWriteItems request will eventually be propagated through streams in an order that is consistent with the transaction serialization order.

Pricing, Monitoring, and Availability

There is no additional cost to enable transactions for DynamoDB tables. You only pay for the reads or writes that are part of your transaction. DynamoDB performs two underlying reads or writes of every item in the transaction, one to prepare the transaction and one to commit the transaction. The two underlying read/write operations are visible in your CloudWatch metrics. You should plan your costs, capacity, and performance needs assuming each transactional read performs two reads and each transactional write performs two writes.
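To give a rough idea with provisioned capacity: a transactional write of an item up to 1 KB consumes 2 write capacity units instead of the 1 unit of a standard write, and a transactional read of an item up to 4 KB consumes 2 read capacity units, so a TransactWriteItems request with 5 small items consumes about 10 write capacity units.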

DynamoDB transactions are available globally in all commercial regions.

I am really intrigued by these new capabilities. Please let me know what you are going to use them for!