Tag Archives: AWS SDK for Java

S3 URI Parsing is now available in AWS SDK for Java 2.x

Post Syndicated from David Ho original https://aws.amazon.com/blogs/devops/s3-uri-parsing-is-now-available-in-aws-sdk-for-java-2-x/

The AWS SDK for Java team is pleased to announce the general availability of Amazon Simple Storage Service (Amazon S3) URI parsing in the AWS SDK for Java 2.x. You can now parse path-style and virtual-hosted-style S3 URIs to easily retrieve the bucket, key, region, style, and query parameters. The new parseUri() API and S3Uri class provide the highly-requested parsing features that many customers miss from the AWS SDK for Java 1.x. Please note that Amazon S3 AccessPoints and Amazon S3 on Outposts URI parsing are not supported.

Motivation

Users often need to extract important components like bucket and key from stored S3 URIs to use in S3Client operations. The new parsing APIs allow users to conveniently do so, bypassing the need for manual parsing or storing the components separately.

Getting Started

To begin, first add the dependency for S3 to your project.

<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>s3</artifactId>
    <version>${s3.version}</version>
</dependency>

Next, instantiate S3Client and S3Utilities objects.

S3Client s3Client = S3Client.create();
S3Utilities s3Utilities = s3Client.utilities();

Parsing an S3 URI

To parse your S3 URI, call parseUri() from S3Utilities, passing in the URI. This will return a parsed S3Uri object. If you have a String of the URI, you’ll need to convert it into an URI object first.

String url = "https://s3.us-west-1.amazonaws.com/myBucket/resources/doc.txt?versionId=abc123&partNumber=77&partNumber=88";
URI uri = URI.create(url);
S3Uri s3Uri = s3Utilities.parseUri(uri);

With the S3Uri, you can call the appropriate getter methods to retrieve the bucket, key, region, style, and query parameters. If the bucket, key, or region is not specified in the URI, an empty Optional will be returned. If query parameters are not specified in the URI, an empty map will be returned. If the field is encoded in the URI, it will be returned decoded.

Region region = s3Uri.region().orElse(null); // Region.US_WEST_1
String bucket = s3Uri.bucket().orElse(null); // "myBucket"
String key = s3Uri.key().orElse(null); // "resources/doc.txt"
boolean isPathStyle = s3Uri.isPathStyle(); // true

Retrieving query parameters

There are several APIs for retrieving the query parameters. You can return a Map<String, List<String>> of the query parameters. Alternatively, you can specify a query parameter to return the first value for the given query, or return the list of values for the given query.

Map<String, List<String>> queryParams = s3Uri.rawQueryParameters(); // {versionId=["abc123"], partNumber=["77", "88"]}
String versionId = s3Uri.firstMatchingRawQueryParameter("versionId").orElse(null); // "abc123"
String partNumber = s3Uri.firstMatchingRawQueryParameter("partNumber").orElse(null); // "77"
List<String> partNumbers = s3Uri.firstMatchingRawQueryParameters("partNumber"); // ["77", "88"]

Caveats

Special Characters

If you work with object keys or query parameters with reserved or unsafe characters, they must be URL-encoded, e.g., replace whitespace " " with "%20".

Valid:
"https://s3.us-west-1.amazonaws.com/myBucket/object%20key?query=%5Bbrackets%5D"

Invalid:
"https://s3.us-west-1.amazonaws.com/myBucket/object key?query=[brackets]"

Virtual-hosted-style URIs

If you work with virtual-hosted-style URIs with bucket names that contain a dot, i.e., ".", the dot must not be URL-encoded.

Valid:
"https://my.Bucket.s3.us-west-1.amazonaws.com/key"

Invalid:
"https://my%2EBucket.s3.us-west-1.amazonaws.com/key"

Conclusion

In this post, I discussed parsing S3 URIs in the AWS SDK for Java 2.x and provided code examples for retrieving the bucket, key, region, style, and query parameters. To learn more about how to set up and begin using the feature, visit our Developer Guide. If you are curious about how it is implemented, check out the source code on GitHub. As always, the AWS SDK for Java team welcomes bug reports, feature requests, and pull requests on the aws-sdk-java-v2 GitHub repository.

Week in Review – February 13, 2023

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/week-in-review-february-13-2023/

AWS announced 32 capabilities since we published the last Week in Review blog post a week ago. I also read a couple of other news and blog posts.

Here is my summary.

The VPC section of the AWS Management Console now allows you to visualize your VPC resources, such as the relationships between a VPC and its subnets, routing tables, and gateways. This visualization was available at VPC creation time only, and now you can go back to it using the Resource Map tab in the console. You can read the details in Channy’s blog post.

CloudTrail Lake now gives you the ability to ingest activity events from non-AWS sources. This lets you immutably store and then process activity events without regard to their origin–AWS, on-premises servers, and so forth. All of this power is available to you with a single API call: PutAuditEvents. We launched AWS CloudTrail Lake about a year ago. It is a managed organization-scale data lake that aggregates, immutably stores, and allows querying of events recorded by CloudTrail. You can use it for auditing, security investigation, and troubleshooting. Again, my colleague Channy wrote a post with the details.

There are three new Amazon CloudWatch metrics for asynchronous AWS Lambda function invocations: AsyncEventsReceived, AsyncEventAge, and AsyncEventsDropped. These metrics provide visibility for asynchronous Lambda function invocations. They help you to identify the root cause of processing issues such as throttling, concurrency limit, function errors, processing latency because of retries, or missing events. You can learn more and have access to a sample application in this blog post.

Amazon Simple Notification Service (Amazon SNS) now supports AWS X-Ray to visualize, analyze, and debug applications. Developers can now trace messages going through Amazon SNS, making it easier to understand or debug microservices or serverless applications.

Amazon EC2 Mac instances now support replacing root volumes for quick instance restoration. Stopping and starting EC2 Mac instances trigger a scrubbing workflow that can take up to one hour to complete. Now you can swap the root volume of the instance with an EBS snapshot or an AMI. It helps to reset your instance to a previous known state in 10–15 minutes only. This significantly speeds up your CI and CD pipelines.

Amazon Polly launches two new Japanese NTTS voices. Neural Text To Speech (NTTS) produces the most natural and human-like text-to-speech voices possible. You can try these voices in the Polly section of the AWS Management Console. With this addition, according to my count, you can now choose among 52 NTTS voices in 28 languages or language variants (French from France or from Quebec, for example).

The AWS SDK for Java now includes the AWS CRT HTTP Client. The HTTP client is the center-piece powering our SDKs. Every single AWS API call triggers a network call to our API endpoints. It is therefore important to use a low-footprint and low-latency HTTP client library in our SDKs. AWS created a common HTTP client for all SDKs using the C programming language. We also offer 11 wrappers for 11 programming languages, from C++ to Swift. When you develop in Java, you now have the option to use this common HTTP client. It provides up to 76 percent cold start time reduction on AWS Lambda functions and up to 14 percent less memory usage compared to the Netty-based HTTP client provided by default. My colleague Zoe has more details in her blog post.

X in Y Jeff started this section a while ago to list the expansion of new services and capabilities to additional Regions. I noticed 10 Regional expansions this week:

Other AWS News
This week, I also noticed these AWS news items:

My colleague Mai-Lan shared some impressive customer stories and metrics related to the use and scale of Amazon S3 Glacier. Check it out to learn how to put your cold data to work.

Space is the final (edge) frontier. I read this blog post published on avionweek.com. It explains how AWS helps to deploy AIML models on observation satellites to analyze image quality before sending them to earth, saving up to 40 percent satellite bandwidth. Interestingly, the main cause for unusable satellite images is…clouds.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

AWS re:Invent recaps in your area. During the re:Invent week, we had lots of new announcements, and in the next weeks, you can find in your area a recap of all these launches. All the events are posted on this site, so check it regularly to find an event nearby.

AWS re:Invent keynotes, leadership sessions, and breakout sessions are available on demand. I recommend that you check the playlists and find the talks about your favorite topics in one collection.

AWS Summits season will restart in Q2 2023. The dates and locations will be announced here. Paris and Sidney are kicking off the season on April 4th. You can register today to attend these in-person, free events (Paris, Sidney).

Stay Informed
That was my selection for this week! To better keep up with all of this news, do not forget to check out the following resources:

— seb
This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Reducing Java cold starts on AWS Lambda functions with SnapStart

Post Syndicated from Eric Johnson original https://aws.amazon.com/blogs/compute/reducing-java-cold-starts-on-aws-lambda-functions-with-snapstart/

Written by Mark Sailes, Senior Serverless Solutions Architect, AWS.

At AWS re:Invent 2022, AWS announced SnapStart for AWS Lambda functions running on Java Corretto 11. This feature enables customers to achieve up to 10x faster function startup performance for Java functions, at no additional cost, and typically with minimal or no code changes.

Overview

Today, for Lambda’s function invocations, the largest contributor to startup latency is the time spent initializing a function. This includes loading the function’s code and initializing dependencies. For interactive workloads that are sensitive to start-up latencies, this can cause suboptimal end user experience.

To address this challenge, customers either provision resources ahead of time, or spend effort building relatively complex performance optimizations, such as compiling with GraalVM native-image. Although these workarounds help reduce the startup latency, users must spend time on some heavy lifting instead of focusing on delivering business value. SnapStart addresses this concern directly for Java-based Lambda functions.

How SnapStart works

With SnapStart, when a customer publishes a function version, the Lambda service initializes the function’s code. It takes an encrypted snapshot of the initialized execution environment, and persists the snapshot in a tiered cache for low latency access.

When the function is first invoked and then scaled, Lambda resumes the execution environment from the persisted snapshot instead of initializing from scratch. This results in a lower startup latency.

Lambda function lifecycle

Lambda function lifecycle

A function version activated with SnapStart transitions to an inactive state if it remains idle for 14 days, after which Lambda deletes the snapshot. When you try to invoke a function version that is inactive, the invocation fails. Lambda sends a SnapStartNotReadyException and begins initializing a new snapshot in the background, during which the function version remains in Pending state. Wait until the function reaches the Active state, and then invoke it again. To learn more about this process and the function states, read the documentation.

Using SnapStart

Application frameworks such as Spring give developers an enormous productivity gain by reducing the amount of boilerplate code they write to accomplish common tasks. When first created, frameworks didn’t have to consider startup time because they run on application servers, which run for long periods of time. The startup time is minimal compared to the running duration. You often only restart them when there is an application version change.

If the functionality that these frameworks bring is implemented at runtime, then they often contribute to latency in startup time. SnapStart allows you to use frameworks like Spring and not compromise tail latency.

To demonstrate SnapStart, I use a sample application that saves records into Amazon DynamoDB. This Spring Boot application (TODO Link) uses a REST controller to handle CRUD requests. This sample includes infrastructure as code to deploy the application using the AWS Serverless Application Model (AWS SAM). You must install the AWS SAM CLI to deploy this example.

To deploy:

  1. Clone the git repository and change to project directory:
    git clone https://github.com/aws-samples/serverless-patterns.git
    cd serverless-patterns/apigw-lambda-snapstart
  2. Use the AWS SAM CLI to build the application:
    sam build
  3. Use the AWS SAM CLI to deploy the resources to your AWS account:
    sam deploy -g

This project deploys with SnapStart already enabled. To enable or disable this functionality in the AWS Management Console:

  1. Navigate to your Lambda function.
  2. Select the Configuration tab.
  3. Choose Edit and change the SnapStart attribute to PublishedVersions.
  4. Choose Save.

    Lambda Console confoguration

    Lambda Console confoguration

  5. Select the Versions tab and choose Publish new.
  6. Choose Publish.

Once you’ve enabled SnapStart, Lambda publishes all subsequent versions with snapshots. The time to run your publish version depends on your init code. You can run init up to 15 minutes with this feature.

Considerations

Stale credentials

Using SnapStart and restoring from a snapshot often changes how you create functions. With on-demand functions, you might access one time data in the init phase, and then reuse it during future invokes. If this data is ephemeral, a database password for example, then there might be a time between fetching the secret and using it, that the password has changed leading to an error. You must write code to handle this error case.

With SnapStart, if you follow the same approach, your database password is persisted in an encrypted snapshot. All future execution environments have the same state. This can be days, weeks, or longer after the snapshot is taken. This makes it more likely that your function has the incorrect password stored. To improve this, you could move the functionality to fetch the password to the post-snapshot hook. With each approach, it is important to understand your application’s needs and handle errors when they occur.

Demo application architecture

Demo application architecture

A second challenge in sharing the initial state is with randomness and uniqueness. If random seeds are stored in the snapshot during the initialization phase, then it may cause random numbers to be predictable.

Cryptography

AWS has changed the managed runtime to help customers handle the effects of uniqueness and randomness when restoring functions.

Lambda has already incorporated updates to Amazon Linux 2 and one of the commonly used cryptographic libraries, OpenSSL (1.0.2), to make them resilient to snapshot operations. AWS has also validated that Java runtime’s built-in RNG java.security.SecureRandom maintains uniqueness when resuming from a snapshot.

Software that always gets random numbers from the operating system (for example, from /dev/random or /dev/urandom) is already resilient to snapshot operations. It does not need updates to restore uniqueness. However, customers who prefer to implement uniqueness using custom code for their Lambda functions must verify that their code restores uniqueness when using SnapStart.

For more details, read Starting up faster with AWS Lambda SnapStart and refer to Lambda documentation on SnapStart uniqueness.

Runtime hooks

These pre- and post-hooks give developers a way to react to the snapshotting process.

For example, a function that must always preload large amounts of data from Amazon S3 should do this before Lambda takes the snapshot. This embeds the data in the snapshot so that it does not need fetching repeatedly. However, in some cases, you may not want to keep ephemeral data. A password to a database may be rotated frequently and cause unnecessary errors. I discuss this in greater detail in a later section.

The Java managed runtime uses the open-source Coordinated Restore at Checkpoint (CRaC) project to provide hook support. The managed Java runtime contains a customized CRaC context implementation that calls your Lambda function’s runtime hooks before completing snapshot creation and after restoring the execution environment from a snapshot.

The following function example shows how you can create a function handler with runtime hooks. The handler implements the CRaC Resource and the Lambda RequestHandler interface.

...
import org.crac.Resource;
import org.crac.Core;
...

public class HelloHandler implements RequestHandler<String, String>, Resource {

    public HelloHandler() {
        Core.getGlobalContext().register(this);
    }

    public String handleRequest(String name, Context context) throws IOException {
        System.out.println("Handler execution");
        return "Hello " + name;
    }

    @Override
    public void beforeCheckpoint(org.crac.Context<? extends Resource> context) throws Exception {
        System.out.println("Before Checkpoint");
    }

    @Override
    public void afterRestore(org.crac.Context<? extends Resource> context) throws Exception {
        System.out.println("After Restore");
    }
}

For the classes required to write runtime hooks, add the following dependency to your project:

Maven

<dependency>
  <groupId>io.github.crac</groupId>
  <artifactId>org-crac</artifactId>
  <version>0.1.3</version>
</dependency>

Gradle

implementation 'io.github.crac:org-crac:0.1.3'

Priming

SnapStart and runtime hooks give you new ways to build your Lambda functions for low startup latency. You can use the pre-snapshot hook to make your Java application as ready as possible for the first invoke. Do as much as possible within your function before the snapshot is taken. This is called priming.

When you upload your zip file of Java code to Lambda, the zip contains .class files of bytecode. This can be run on any machine with a JVM. When the JVM executes your bytecode, it is initially interpreted, then compiled into native machine code. This compilation stage is relatively CPU intensive and happens just in time (JIT Compiler).

You can use the before snapshot hook to run code paths before the snapshot is taken. The JVM compiles these code paths and the optimization is kept for future restores. For example, if you have a function that integrates with DynamoDB, you can make a read operation in your before snapshot hook.

This means that your function code, the AWS SDK for Java, and any other libraries used in that action are compiled and kept within the snapshot. The JVM then won’t need to compile this code when your function is invoked, meaning your latency is less the first time an execution environment is invoked.

Priming requires that you understand your application code and the consequences of executing it. The sample application includes a before snapshot hook, which primes the application by making a read operation from DynamoDB. (TODO Link)

Metrics

The following chart reflects invoking the sample application Lambda function 100 times per second for 10 minutes. This test is based on this function, both with and without SnapStart.

 

p50 p99.9
On-demand 7.87ms 5,114ms
SnapStart 7.87ms 488ms

Conclusion

This blog shows how SnapStart reduces startup (cold-start) latencies times for Java-based Lambda functions. You can configure SnapStart using AWS SDK, AWS CloudFormation, AWS SAM, and CDK.

To learn more, see Configuring function options in the AWS documentation. This functionality may require some minimal code changes. In most cases, the existing code is already compatible with SnapStart. You can now bring your latency-sensitive Java-based workloads to Lambda and run with improved tail latencies.

This feature allows developers to use the on-demand model in Lambda with low-latency response times, without incurring extra cost. To read more about how to use SnapStart with partner frameworks, find out more from Quarkus and Micronaut. To read more about this and other features, visit Serverless Land.