Tag Archives: Top Posts*

Improve Build Performance and Save Time Using Local Caching in AWS CodeBuild

Post Syndicated from Kausalya Rani Krishna Samy original https://aws.amazon.com/blogs/devops/improve-build-performance-and-save-time-using-local-caching-in-aws-codebuild/

AWS CodeBuild now supports local caching, which makes it possible for you to persist intermediate build artifacts locally on the build host so that they are available for reuse in subsequent build runs.

Your build project can use one of two types of caching: Amazon S3 or local. In this blog post, we will discuss how to use the local caching feature.

Local caching stores a cache on a build host. The cache is available to that build host only for a limited time and until another build is complete. For example, when you are dealing with large Java projects, compilation might take a long time. You can speed up subsequent builds by using local caching. This is a good option for large intermediate build artifacts because the cache is immediately available on the build host.

Local caching increases build performance for:

  • Projects with a large, monolithic source code repository.
  • Projects that generate and reuse many intermediate build artifacts.
  • Projects that build large Docker images.
  • Projects with many source dependencies.

To use local caching

1. Open AWS CodeBuild console at https://console.aws.amazon.com/codesuite/codebuild/home.

2. Choose Create project.

3. In Project configuration, enter a name and description for the build project.

4. In Source, for Source provider, choose the source code provider type. In this example, we use an AWS CodeCommit repository name.

5. For Environment image, choose Managed image or Custom image, as appropriate. For environment type, choose Linux or Windows Server. Specify a runtime, runtime version, and service role for your project.

6. Configure the buildspec file for your project.

7. In Artifacts, expand Additional Configuration. For Cache type, choose Local, as shown here.

Local caching supports the following caching modes:

Source cache mode caches Git metadata for primary and secondary sources. After the cache is created, subsequent builds pull only the change between commits. This mode is a good choice for projects with a clean working directory and a source that is a large Git repository. If you choose this option and your project does not use a Git repository (GitHub, GitHub Enterprise, or Bitbucket), the option is ignored. No changes are required in the buildspec file.

Docker layer cache mode caches existing Docker layers. This mode is a good choice for projects that build or pull large Docker images. It can prevent the performance issues caused by pulling large Docker images down from the network.

Note

  • You can use a Docker layer cache in the Linux environment only.
  • The privileged flag must be set so that your project has the required Docker permissions
  • You should consider the security implications before you use a Docker layer cache.

Custom cache mode caches directories you specify in the buildspec file. This mode is a good choice if your build scenario is not suited to one of the other two local cache modes. If you use a custom cache:

  • Only directories can be specified for caching. You cannot specify individual files.
  • Symlinks are used to reference cached directories.
  • Cached directories are linked to your build before it downloads its project sources. Cached items are overridden if a source item has the same name. Directories are specified using cache paths in the buildspec file.

To use source cache mode

In the build project configuration, under Artifacts, expand Additional Configuration. For Cache type, choose Local. Select Source cache, as shown here.

To use Docker layer cache mode

In the build project configuration, under Artifacts, expand Additional Configuration. For Cache type, choose Local. Select Docker layer cache, as shown here.

Under Privileged, select Enable this flag if you want to build Docker images or want your builds to get elevated privileges. This grants elevated privileges to the Docker process running on the build host.

To use custom cache mode

In your buildspec file, specify the cache path, as shown here.

In the build project configuration, under Artifacts, expand Additional Configuration. For Cache type, choose Local. Select Custom cache, as shown here.


version: 0.2
phases:
  pre_build:
    commands:
      - echo "Enter pre_build commands"
  build:
    commands:
      - echo "Enter build commands"
      
cache:
  paths:
    - '/root/.m2/**/*'
    - '/root/.npm/**/*'
    - 'build/**/*'

Conclusion

We hope you find the information in this post helpful. If you have feedback, please leave it in the Comments section below. If you have questions, start a new thread on the AWS CodeBuild forum or contact AWS Support.

 

 

 

 

 

Behind the Scenes & Under the Carpet – The CenturyLink Network that Powered AWS re:Invent 2018

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/behind-the-scenes-under-the-carpet-the-centurylink-network-that-powered-aws-reinvent-2018/

If you are a long-time reader, you may have already figured out that I am fascinated by the behind-the-scenes and beneath-the-streets activities that enable and power so much of our modern world. For example, late last year I told you how The AWS Cloud Goes Underground at re:Invent and shared some information about the communication and network infrastructure that was used to provide top-notch connectivity to re:Invent attendees and to those watching the keynotes and live streams from afar.

Today, with re:Invent 2018 in the rear-view mirror (and planning for next year already underway), I would like to tell you how 5-time re:Invent Network Services Provider CenturyLink designed and built a redundant, resilient network that used AWS Direct Connect to provide 180 Gbps of bandwidth and supported over 81,000 devices connected across eight venues. Above the ground, we worked closely with ShowNets to connect their custom network and WiFi deployment in each venue to the infrastructure provided by CenturyLink.

The 2018 re:Invent Network
This year, the network included diverse routes to multiple AWS regions, with a brand-new multi-node metro fiber ring that encompassed the Sands Expo, Wynn Resort, Circus Circus, Mirage, Vdara, Bellagio, Aria, and MGM Grand facilities. Redundant 10 Gbps connections to each venue and to multiple AWS Direct Connect locations were used to ensure high availability. The network was provisioned using CenturyLink Cloud Connect Dynamic Connections.

Here’s a network diagram (courtesy of CenturyLink) that shows the metro fiber ring and the connectivity:

The network did its job, and supported keynotes, live streams, breakout sessions, hands-on labs, hackathons, workshops, and certification exams. Here are the final numbers, as measured on-site at re:Invent 2018:

  • Live Streams – Over 60K views from over 100 countries.
  • Peak Data Transfer – 9.5 Gbps across six 10 Gbps connections.
  • Total Data Transfer – 160 TB.

Thanks again to our Managed Service Partner for building and running the robust network that supported our customers, partners, and employees at re:Invent!

Jeff;

Amazon Polly Update – Time-Driven Prosody and Asynchronous Synthesis

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-polly-update-time-driven-prosody-and-asynchronous-synthesis/

I hope that you are enjoying the Polly-powered audio that is available for the newest posts on this blog, including the DeepLens Challenge and the Storage Gateway Recap. As part of my blogging process, I now listen to the synthesized speech for my draft blog posts in order to get a better sense for how they flow.

Today we are launching two new features for Amazon Polly:

Time-Driven Prosody – You can now specify the desired duration for the synthesized speech that corresponds to part or all of the input text.

Asynchronous Synthesis – You can now process large blocks of text and store the synthesized speech in Amazon S3 with a single call.

Both of these features are available now and you can start using them today. Let’s take a closer look!

Time-Driven Prosody
Imagine that you are creating a multi-lingual version of a video or a self-running presentation. You write the script, record the video in one language, and then use Amazon Translate and Amazon Polly to create audio tracks in other languages. In order to keep each language in sync with the visual content, you need to exercise fine-grained control over the duration of each segment. That’s where this new feature comes in. You can now specify the maximum desired duration for any desired segments, counting on Polly to adjust the speech rate in order to limit the length of each segment.

The preceding paragraph generates 19 seconds of audio if I use Amazon Polly’s Joanna voice with no other options:

<speak>
  In order to keep each language in sync with the visual content, 
  you need to exercise fine-grained control over the duration of
  each segment. That's where this new feature comes in. You can 
  now specify the maximum desired duration for any desired segments, 
  counting on Polly to adjust the speech rate in order to limit 
  the length of each segment.
</speak>

I can use a <prosody> tag to limit the length to 15 seconds:

<speak>
  <prosody amazon:max-duration="15s">
    In order to keep each language in sync with the visual content, 
    you need to exercise fine-grained control over the duration of
    each segment. That's where this new feature comes in. You can 
    now specify the maximum desired duration for any desired segments, 
    counting on Polly to adjust the speech rate in order to limit 
    the length of each segment.
 </prosody>
</speak>

I can control the duration at a more fine-grained level by using multiple <prosody> tags:

  <prosody amazon:max-duration="10s">
    In order to keep each language in sync with the visual content, 
    you need to exercise fine-grained control over the duration of
    each segment. 
  </prosody>
  <prosody amazon:max-duration="7s">
    That's where this new feature comes in. You can now specify 
    the maximum desired duration for any desired segments, 
    counting on Polly to adjust the speech rate in order to limit 
    the length of each segment.
 </prosody>

The Spanish equivalent (courtesy of Amazon Translate) of my English text is somewhat longer and the speed-up is apparent:

<speak>
  <prosody amazon:max-duration="15s">
    Para mantener cada idioma sincronizado con el contenido
    visual, es necesario ejercer un control detallado sobre
    la duración de cada segmento. Ahí es donde entra esta 
    nueva característica. Ahora puede especificar la 
    duración máxima deseada para los segmentos deseados, 
    contando con que Polly ajuste la velocidad de voz para 
    limitar la longitud de cada segmento.
 </prosody>
</speak>

The text inside of each time-limited <prosody> tag is limited to 1500 characters and nesting is not allowed (the inner tag will be ignored). In order to ensure that the audio remains comprehensible, Polly will speed up the audio by a maximum of 5x.

Asynchronous Synthesis
This feature makes it easier for you to use Polly to generate speech for long-form content such as articles or book chapters by allowing you to process up to 100,000 characters of text at a time using asynchronous requests. The synthesized speech is delivered to the S3 bucket of your choice, with failure notifications optionally routed to the Amazon Simple Notification Service (SNS) topic of your choice. The generated audio can be up to 6 hours long, and is typically ready within minutes. In addition to 100,000 characters of text, each request can include an additional 100,000 characters of Speech Synthesis Markup Language (SSML) markup.

Each asynchronous request creates a new speech synthesis task. You can initiate and manage tasks from the Polly Console, CLI (start-speech-synthesis-task), or API (StartSpeechSynthesisTask).

To test this feature I created a plain-text version of my thoroughly obsolete AWS book and inserted some SSML tags, turning it in to valid XML along the way. Then I open the Polly Console, click Text-to-Speech, paste the XML, and click Synthesize to S3:

I enter the name of my S3 bucket (which must be in region where I plan to create the task), and click Synthesize to proceed:

My task is created:

And I can see it in the list of tasks:

I receive an email when the synthesis is complete:

And the file is in my bucket as expected:

I did not spend a lot of time on the markup, but the results are impressive:

Interestingly enough, most of that chapter is still relevant. The rest of the book has been overtaken by history, and is best left there! Perhaps I’ll write another one sometime.

Anyway, as you can see (and hear) the asynchronous speech synthesis is powerful and easy to use. Give it a shot, build something cool, and tell me about it.

Jeff;

 

 

Amazon EC2 Instance Update – Faster Processors and More Memory

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-instance-update-faster-processors-and-more-memory/

Last month I told you about the Nitro system and explained how it will allow us to broaden the selection of EC2 instances and to pick up the pace as we do so, with an ever-broadening selection of compute, storage, memory, and networking options. This will allow us to give you access to the latest technology very quickly, giving you the ability to choose the instance type that is the best match for your applications.

Today, I would like to tell you about three new instance types that are in the works and that will be available soon:

Z1d – Compute-intensive instances running at up to 4.0 GHz, powered by sustained all-core Turbo Boost. They are ideal for Electronic Design Automation (EDA) and relational database workloads, and are also a great fit for several kinds of HPC workloads.

R5 – Memory-optimized instances running at up to 3.1 GHz powered by sustained all-core Turbo Boost, with up to 50% more vCPUs and 60% more memory than R4 instances.

R5d – Memory-optimized instances equipped with local NVMe storage (up to 3.6 TB for the largest R5d instance), and will be available in the same sizes and with the same specs as the R5 instances.

We are also planning to launch R5 Bare Metal, R5d Bare Metal, and Z1d Bare Metal instances. As is the case with the existing i3.metal instances, you will be able to access low-level hardware features and to run applications that are not licensed or supported in virtualized environments.

Z1d Instances
The Z1d instances are designed for applications that can benefit from extremely high per-core performance. These include:

Electronic Design Automation – As chips become smaller and denser, the amount of compute power needed to design and verify the chips increases non-linearly. Semiconductor customers deploy jobs that span thousands of cores; having access to faster cores reduces turnaround time for each job and can also lead to a measurable reduction in software licensing costs.

HPC – In the financial services world, jobs that run analyses or compute risks also benefit from faster cores. Manufacturing organizations can run their Finite Element Analysis (FEA) and simulation jobs to completion more quickly.

Relational Database – CPU-bound workloads that run on a database that “features” high per-core license fees will enjoy both cost and performance benefits.

Z1d instances use custom Intel® Xeon® Scalable Processors running at up to 4.0 GHz, powered by sustained all-core Turbo Boost. They will be available in 6 sizes, with up to 48 vCPUs, 384 GiB of memory, and 1.8 TB of local NVMe storage. On the network side, they feature ENA networking that will deliver up to 25 Gbps of bandwidth, and are EBS-Optimized by default for up to 14 Gbps of bandwidth. As usual, you can launch them in a Cluster Placement Group to increase throughput and reduce latency. Here are the sizes and specs:

Instance NamevCPUsMemoryLocal StorageEBS-Optimized BandwidthNetwork Bandwidth
z1d.large216 GiB1 x 75 GB NVMe SSDUp to 2.333 GbpsUp to 10 Gbps
z1d.xlarge432 GiB1 x 150 GB NVMe SSDUp to 2.333 GbpsUp to 10 Gbps
z1d.2xlarge864 GiB1 x 300 GB NVMe SSD2.333 GbpsUp to 10 Gbps
z1d.3xlarge1296 GiB1 x 450 GB NVMe SSD3.5 GbpsUp to 10 Gbps
z1d.6xlarge24192 GiB1 x 900 GB NVMe SSD7.0 Gbps10 Gbps
z1d.12xlarge48384 GiB2 x 900 GB NVMe SSD14.0 Gbps25 Gbps

The instances are HVM and VPC-only, and you will need to use an AMI with the appropriate ENA and NVMe drivers. Any AMI that runs on C5 or M5 instances will also run on Z1d instances.

R5 Instances
Building on the earlier generations of memory-intensive instance types (M2, CR1, R3, and R4), the R5 instances are designed to support high-performance databases, distributed in-memory caches, in-memory analytics, and big data analytics. They use custom Intel® Xeon® Platinum 8000 Series (Skylake-SP) processors running at up to 3.1 GHz, again powered by sustained all-core Turbo Boost. The instances will be available in 6 sizes, with up to 96 vCPUs and 768 GiB of memory. Like the Z1d instances, they feature ENA networking and are EBS-Optimized by default, and can be launched in Placement Groups. Here are the sizes and specs:

Instance NamevCPUsMemoryEBS-Optimized BandwidthNetwork Bandwidth
r5.large216 GiBUp to 3.5 GbpsUp to 10 Gbps
r5.xlarge432 GiBUp to 3.5 GbpsUp to 10 Gbps
r5.2xlarge864 GiBUp to 3.5 GbpsUp to 10 Gbps
r5.4xlarge16128 GiB3.5 GbpsUp to 10 Gbps
r5.12xlarge48384 GiB7.0 Gbps10 Gbps
r5.24xlarge96768 GiB14.0 Gbps25 Gbps

Once again, the instances are HVM and VPC-only, and you will need to use an AMI with the appropriate ENA and NVMe drivers.

Learn More
The new EC2 instances announced today highlight our plan to continue innovating in order to better meet your needs! I’ll share additional information as soon as it is available.

Jeff;

 

 

New – EC2 Compute Instances for AWS Snowball Edge

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-ec2-compute-instances-for-aws-snowball-edge/

I love factories and never miss an opportunity to take a tour. Over the years, I have been lucky enough to watch as raw materials and sub-assemblies are turned into cars, locomotives, memory chips, articulated buses, and more. I’m always impressed by the speed, precision, repeat-ability, and the desire to automate every possible step. On one recent tour, the IT manager told me that he wanted to be able to set up and centrally manage the global collection of on-premises industrialized PCs that monitor their machinery as easily and as efficiently as he does their EC2 instances and other cloud resources.

Today we are making that manager’s dream a reality, with the introduction of EC2 instances that run on AWS Snowball Edge devices! These ruggedized devices, with 100 TB of local storage, can be used to collect and process data in hostile environments with limited or non-existent Internet connections before shipping the processed data back to AWS for storage, aggregation, and detailed analysis. Here are the instance specs:

Instance NamevCPUsMemory
sbe1.small11 GiB
sbe1.medium12 GiB
sbe1.large24 GiB
sbe1.xlarge48 GiB
sbe1.2xlarge816 GiB
sbe1.4xlarge1632 GiB

Each Snowball Edge device is powered by an Intel® Xeon® D processor running at 1.8 GHz, and supports any combination instances that consume up to 24 vCPUs and 32 GiB of memory. You can build and test AMIs (Amazon Machine Images) in the cloud and then preload them onto the device as part of the ordering process (I’ll show you how in just a minute). You can use the EC2-compatible endpoint exposed by each device to programmatically start, stop, resume, and terminate instances. This allows you to use the existing CLI commands and to build tools and scripts to manage fleets of devices. It also allows you to take advantage of your existing EC2 skills and knowledge, and to put them to good use in a new environment.

There are three main setup steps:

  1. Creating a suitable AMI.
  2. Ordering a Snowball Edge Device.
  3. Connecting and Configuring the Device.

Let’s take an in-depth look at the first two steps. Time was tight and I didn’t have time to get hands-on experience with an actual device, so the third step will have to wait for another time.

Creating a Suitable AMI
I have the ability to choose up to 10 AMIs that will be preloaded onto the device. The AMIs must be owned by my AWS account, and must be based on one of the following Marketplace AMIs:

These AMIs have been tested for use on Snowball Edge devices and can be used as a starting point for customization. We will be adding additional options over time, so let us know what you need.

I decided to start with the newest Ubuntu AMI, and launch it on an M5 instance, taking care to specify the SSH keypair that I will eventually use to connect to the instance from my terminal client:

After my instance launches, I connect to it, customize it as desired for use on my device, and then return to the EC2 Console to create an AMI. I select the running instance, choose Create Image from the Actions menu, specify the details, and click Create Image:

The size of the root volume will determine how much of the device’s SSD storage is allocated to the instance when it launches. A total of one TB of space is available for all running instances, so keep your local file storage needs in mind as your analyze your use case and set up your AMIs. Also, Snowball Edge devices cannot make use of additional EBS volumes, so don’t bother including them in your AMI. My AMI is ready within minutes (To learn more about how to create AMIs, read Creating an Amazon EBS-Backed Linux AMI):

Now I am ready to order my first device!

Ordering a Snowball Edge Device
The ordering procedure lets me designate a shipping address and specify how I would like my Snowball device to be configured. I open the AWS Snowball Console and click Create job:

I specify the job type (they all support EC2 compute instances):

Then I select my shipping address, entering a new one if necessary (come and visit me):

Next, I define my job. I give it a name (SJ1), select the 100 TB device, and pick the S3 bucket that will receive data when the device is returned to AWS:

Now comes the fun part! I click Enable compute with EC2 and select the AMIs to be loaded on the Snowball Edge:

I click Add an AMI and find the one that I created earlier:

I can add up to ten AMIs to my job, but will stop at one for this post:

Next, I set up my IAM role and configure encryption:

Then I configure the optional SNS notifications. I can choose to receive notification for a wide variety of job status values:

My job is almost ready! I review the settings and click Create job to create it:

Connecting and Configuring the Device
After I create the job, I wait until my Snowball Edge device arrives. I connect it to my network, power it on, and then unlock it using my manifest and device code, as detailed in Unlock the Snowball Edge. Then I configure my EC2 CLI to use the EC2 endpoint on the device and launch an instance. Since I configured my AMI for SSH access, I can connect to it as if it were an EC2 instance in the cloud.

Things to Know
Here are a couple of things to keep in mind:

Long-Term Usage – You can keep the Snowball Edge devices on your premises and hard at work for as long as you would like. You’ll be billed for a one-time setup fee for each job; after 10 days you will pay an additional, per-day fee for each device. If you want to keep a device for an extended period of time, you can also pay upfront as part of a one or three year commitment.

Dev/Test – You should be able to do much of your development and testing on an EC2 instance running in the cloud; some of our early users are working in this way as part of a “Digital Twin” strategy.

S3 Access – Each Snowball Edge device includes an S3-compatible endpoint that you can access from your on-device code. You can also make use of existing S3 tools and applications.

Now Available
You can start ordering devices today and make use of this exciting new AWS feature right away.

Jeff;

 

 

Amazon Translate Adds Support for Japanese, Russian, Italian, Traditional Chinese, Turkish, and Czech

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-translate-adds-support-for-new-languages/

Today I’m excited to announce that Amazon Translate has added support for Japanese, Russian, Italian, Traditional Chinese, Turkish, and Czech! Amazon Translate is a translation API that delivers fast, high-quality, and affordable language translation. Translate was originally released in preview at AWS re:Invent in 2017 and my colleague, Tara, wrote about the service in depth.

Since the initial preview, we’ve continued to iterate on customer feedback by adding features like automatic source language inference via Amazon Comprehend, Amazon CloudWatch metrics, and larger pieces of text in each TranslateText. In April, we made the service generally available and have continued to collect feature requests and feedback from customers.

Working with Amazon Translate

You can use the API explorer in the Amazon Translate Console to try out the new languages immediately.

You could also use any of the SDKs as well. I wrote a quick Python sample below.

import boto3
translate = boto3.client("translate")
lang_flag_pairs = [("ja", "🇯🇵"), ("ru", "🇷🇺"),
                   ("it", "🇮🇹"), ("zh-TW", "🇹🇼"),
                   ("tr", "🇹🇷"), ("cs", "🇨🇿")]
for lang, flag in language_flag_pairs:
    print(flag)
    print(translate.translate_text(
        Text="Hello, World!",
        SourceLanguageCode="en",
        TargetLanguageCode=lang
    )['TranslatedText'])

Which gave me a cool result:

🇯🇵
ハローワールド!
🇷🇺
Привет, Мир!
🇮🇹
Ciao, Mondo!
🇹🇼
你好,世界杯!
🇹🇷
Merhaba, Dünya!
🇨🇿
Ahoj, světe!

You can check out the examples page in the documentation for more interesting ways to use Amazon Translate.

I had the chance to chat with a few of our customers about how they’re using Amazon Translate. I found out that they’re using it to enable a lot of innovative use cases and customers are building real-time chat translation in games, capturing and translating tweets for analytics, and as you can see on this blog – translating blog posts and creating spoken versions of them with Polly.

Additional Info

Amazon Translate is available in US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland) and AWS GovCloud (US). It has a useful monthly free tier of 2 million characters for the first 12 months, and $15 per million characters after that.

The team wanted me to let you know that they’re hard at work on some additional functionality and that they’re hoping, later this year, to add support for Danish, Dutch, Finnish, Hebrew, Polish, and Swedish.

As always, let us know if you have any feedback on this service in the comments or on Twitter!

Randall

AWS Certificate Manager Launches Private Certificate Authority

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/aws-certificate-manager-launches-private-certificate-authority/

Today we’re launching a new feature for AWS Certificate Manager (ACM), Private Certificate Authority (CA). This new service allows ACM to act as a private subordinate CA. Previously, if a customer wanted to use private certificates, they needed specialized infrastructure and security expertise that could be expensive to maintain and operate. ACM Private CA builds on ACM’s existing certificate capabilities to help you easily and securely manage the lifecycle of your private certificates with pay as you go pricing. This enables developers to provision certificates in just a few simple API calls while administrators have a central CA management console and fine grained access control through granular IAM policies. ACM Private CA keys are stored securely in AWS managed hardware security modules (HSMs) that adhere to FIPS 140-2 Level 3 security standards. ACM Private CA automatically maintains certificate revocation lists (CRLs) in Amazon Simple Storage Service (S3) and lets administrators generate audit reports of certificate creation with the API or console. This service is packed full of features so let’s jump in and provision a CA.

Provisioning a Private Certificate Authority (CA)

First, I’ll navigate to the ACM console in my region and select the new Private CAs section in the sidebar. From there I’ll click Get Started to start the CA wizard. For now, I only have the option to provision a subordinate CA so we’ll select that and use my super secure desktop as the root CA and click Next. This isn’t what I would do in a production setting but it will work for testing out our private CA.

Now, I’ll configure the CA with some common details. The most important thing here is the Common Name which I’ll set as secure.internal to represent my internal domain.

Now I need to choose my key algorithm. You should choose the best algorithm for your needs but know that ACM has a limitation today that it can only manage certificates that chain up to to RSA CAs. For now, I’ll go with RSA 2048 bit and click Next.

In this next screen, I’m able to configure my certificate revocation list (CRL). CRLs are essential for notifying clients in the case that a certificate has been compromised before certificate expiration. ACM will maintain the revocation list for me and I have the option of routing my S3 bucket to a custome domain. In this case I’ll create a new S3 bucket to store my CRL in and click Next.

Finally, I’ll review all the details to make sure I didn’t make any typos and click Confirm and create.

A few seconds later and I’m greeted with a fancy screen saying I successfully provisioned a certificate authority. Hooray! I’m not done yet though. I still need to activate my CA by creating a certificate signing request (CSR) and signing that with my root CA. I’ll click Get started to begin that process.

Now I’ll copy the CSR or download it to a server or desktop that has access to my root CA (or potentially another subordinate – so long as it chains to a trusted root for my clients).

Now I can use a tool like openssl to sign my cert and generate the certificate chain.


$openssl ca -config openssl_root.cnf -extensions v3_intermediate_ca -days 3650 -notext -md sha256 -in csr/CSR.pem -out certs/subordinate_cert.pem
Using configuration from openssl_root.cnf
Enter pass phrase for /Users/randhunt/dev/amzn/ca/private/root_private_key.pem:
Check that the request matches the signature
Signature ok
The Subject's Distinguished Name is as follows
stateOrProvinceName   :ASN.1 12:'Washington'
localityName          :ASN.1 12:'Seattle'
organizationName      :ASN.1 12:'Amazon'
organizationalUnitName:ASN.1 12:'Engineering'
commonName            :ASN.1 12:'secure.internal'
Certificate is to be certified until Mar 31 06:05:30 2028 GMT (3650 days)
Sign the certificate? [y/n]:y


1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated

After that I’ll copy my subordinate_cert.pem and certificate chain back into the console. and click Next.

Finally, I’ll review all the information and click Confirm and import. I should see a screen like the one below that shows my CA has been activated successfully.

Now that I have a private CA we can provision private certificates by hopping back to the ACM console and creating a new certificate. After clicking create a new certificate I’ll select the radio button Request a private certificate then I’ll click Request a certificate.

From there it’s just similar to provisioning a normal certificate in ACM.

Now I have a private certificate that I can bind to my ELBs, CloudFront Distributions, API Gateways, and more. I can also export the certificate for use on embedded devices or outside of ACM managed environments.

Available Now
ACM Private CA is a service in and of itself and it is packed full of features that won’t fit into a blog post. I strongly encourage the interested readers to go through the developer guide and familiarize themselves with certificate based security. ACM Private CA is available in in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt) and EU (Ireland). Private CAs cost $400 per month (prorated) for each private CA. You are not charged for certificates created and maintained in ACM but you are charged for certificates where you have access to the private key (exported or created outside of ACM). The pricing per certificate is tiered starting at $0.75 per certificate for the first 1000 certificates and going down to $0.001 per certificate after 10,000 certificates.

I’m excited to see administrators and developers take advantage of this new service. As always please let us know what you think of this service on Twitter or in the comments below.

Randall

AWS Secrets Manager: Store, Distribute, and Rotate Credentials Securely

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/aws-secrets-manager-store-distribute-and-rotate-credentials-securely/

Today we’re launching AWS Secrets Manager which makes it easy to store and retrieve your secrets via API or the AWS Command Line Interface (CLI) and rotate your credentials with built-in or custom AWS Lambda functions. Managing application secrets like database credentials, passwords, or API Keys is easy when you’re working locally with one machine and one application. As you grow and scale to many distributed microservices, it becomes a daunting task to securely store, distribute, rotate, and consume secrets. Previously, customers needed to provision and maintain additional infrastructure solely for secrets management which could incur costs and introduce unneeded complexity into systems.

AWS Secrets Manager

Imagine that I have an application that takes incoming tweets from Twitter and stores them in an Amazon Aurora database. Previously, I would have had to request a username and password from my database administrator and embed those credentials in environment variables or, in my race to production, even in the application itself. I would also need to have our social media manager create the Twitter API credentials and figure out how to store those. This is a fairly manual process, involving multiple people, that I have to restart every time I want to rotate these credentials. With Secrets Manager my database administrator can provide the credentials in secrets manager once and subsequently rely on a Secrets Manager provided Lambda function to automatically update and rotate those credentials. My social media manager can put the Twitter API keys in Secrets Manager which I can then access with a simple API call and I can even rotate these programmatically with a custom lambda function calling out to the Twitter API. My secrets are encrypted with the KMS key of my choice, and each of these administrators can explicitly grant access to these secrets with with granular IAM policies for individual roles or users.

Let’s take a look at how I would store a secret using the AWS Secrets Manager console. First, I’ll click Store a new secret to get to the new secrets wizard. For my RDS Aurora instance it’s straightforward to simply select the instance and provide the initial username and password to connect to the database.

Next, I’ll fill in a quick description and a name to access my secret by. You can use whatever naming scheme you want here.

Next, we’ll configure rotation to use the Secrets Manager-provided Lambda function to rotate our password every 10 days.

Finally, we’ll review all the details and check out our sample code for storing and retrieving our secret!

Finally I can review the secrets in the console.

Now, if I needed to access these secrets I’d simply call the API.

import json
import boto3
secrets = boto3.client("secretsmanager")
rds = json.dumps(secrets.get_secrets_value("prod/TwitterApp/Database")['SecretString'])
print(rds)

Which would give me the following values:


{'engine': 'mysql',
 'host': 'twitterapp2.abcdefg.us-east-1.rds.amazonaws.com',
 'password': '-)Kw>THISISAFAKEPASSWORD:lg{&sad+Canr',
 'port': 3306,
 'username': 'ranman'}

More than passwords

AWS Secrets Manager works for more than just passwords. I can store OAuth credentials, binary data, and more. Let’s look at storing my Twitter OAuth application keys.

Now, I can define the rotation for these third-party OAuth credentials with a custom AWS Lambda function that can call out to Twitter whenever we need to rotate our credentials.

Custom Rotation

One of the niftiest features of AWS Secrets Manager is custom AWS Lambda functions for credential rotation. This allows you to define completely custom workflows for credentials. Secrets Manager will call your lambda with a payload that includes a Step which specifies which step of the rotation you’re in, a SecretId which specifies which secret the rotation is for, and importantly a ClientRequestToken which is used to ensure idempotency in any changes to the underlying secret.

When you’re rotating secrets you go through a few different steps:

  1. createSecret
  2. setSecret
  3. testSecret
  4. finishSecret

The advantage of these steps is that you can add any kind of approval steps you want for each phase of the rotation. For more details on custom rotation check out the documentation.

Available Now
AWS Secrets Manager is available today in US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), and South America (São Paulo). Secrets are priced at $0.40 per month per secret and $0.05 per 10,000 API calls. I’m looking forward to seeing more users adopt rotating credentials to secure their applications!

Randall

Amazon S3 Update: New Storage Class and General Availability of S3 Select

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-s3-update-new-storage-class-general-availability-of-s3-select/

I’ve got two big pieces of news for anyone who stores and retrieves data in Amazon Simple Storage Service (S3):

New S3 One Zone-IA Storage Class – This new storage class is 20% less expensive than the existing Standard-IA storage class. It is designed to be used to store data that does not need the extra level of protection provided by geographic redundancy.

General Availability of S3 Select – This unique retrieval option lets you retrieve subsets of data from S3 objects using simple SQL expressions, with the possibility for a 400% performance improvement in the process.

Let’s take a look at both!

S3 One Zone-IA (Infrequent Access) Storage Class
This new storage class stores data in a single AWS Availability Zone and is designed to provide eleven 9’s (99.99999999%) of data durability, just like the other S3 storage classes. Unlike those other classes, it is not designed to be resilient to the physical loss of an AZ due to major event such as an earthquake or a flood, and data could be lost in the unlikely event that an AZ is destroyed. S3 One Zone-IA storage gives you a lower cost option for secondary backups of on-premises data and for data that can be easily re-created. You can also use it as the target of S3 Cross-Region Replication from another AWS region.

You can specify the use of S3 One Zone-IA storage when you upload a new object to S3:

You can also make use of it as part of an S3 lifecycle rule:

You can set up a lifecycle rule that moves previous versions of an object to S3 One Zone-IA after 30 or more days:

And you can modify the storage class of an existing object:

You can also manage storage classes using the S3 API, CLI, and CloudFormation templates.

The S3 One Zone-IA storage class can be used in all public AWS regions. As I noted earlier, pricing is 20% lower than for the S3 Standard-IA storage class (see the S3 Pricing page for more info). There’s a 30 day minimum retention period, and a 128 KB minimum object size.

General Availability of S3 Select
Randall wrote a detailed introduction to S3 Select last year and showed you how you can use it to retrieve selected data from within S3 objects. During the preview we added support for server-side encryption and the ability to run queries from the S3 Console.

I used a CSV file of airport codes to exercise the new console functionality:

This file contains listings for over 9100 airports, so it makes for useful test data but it definitely does not test the limits of S3 Select in any way. I select the file, open the More menu, and choose Select from:

The console sets the file format and compression according to the file name and the encryption status. I set delimiter and click Show file preview to verify that my settings are correct. Then I click Next to proceed:

I type SQL expressions in the SQL editor and click Run SQL to issue the query:

Or:

I can also issue queries from the AWS SDKs. I initiate the select operation:

s3 = boto3.client('s3', region_name='us-west-2')

r = s3.select_object_content(
        Bucket='jbarr-us-west-2',
        Key='sample-data/airportCodes.csv',
        ExpressionType='SQL',
        Expression="select * from s3object s where s.\"Country (Name)\" like '%United States%'",
        InputSerialization = {'CSV': {"FileHeaderInfo": "Use"}},
        OutputSerialization = {'CSV': {}},
)

And then I process the stream of results:

for event in r['Payload']:
    if 'Records' in event:
        records = event['Records']['Payload'].decode('utf-8')
        print(records)
    elif 'Stats' in event:
        statsDetails = event['Stats']['Details']
        print("Stats details bytesScanned: ")
        print(statsDetails['BytesScanned'])
        print("Stats details bytesProcessed: ")
        print(statsDetails['BytesProcessed'])

S3 Select is available in all public regions and you can start using it today. Pricing is based on the amount of data scanned and the amount of data returned.

Jeff;

Amazon Transcribe Now Generally Available

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-transcribe-now-generally-available/


At AWS re:Invent 2017 we launched Amazon Transcribe in private preview. Today we’re excited to make Amazon Transcribe generally available for all developers. Amazon Transcribe is an automatic speech recognition service (ASR) that makes it easy for developers to add speech to text capabilities to their applications. We’ve iterated on customer feedback in the preview to make a number of enhancements to Amazon Transcribe.

New Amazon Transcribe Features in GA

To start off we’ve made the SampleRate parameter optional which means you only need to know the file type of your media and the input language. We’ve added two new features – the ability to differentiate multiple speakers in the audio to provide more intelligible transcripts (“who spoke when”), and a custom vocabulary to improve the accuracy of speech recognition for product names, industry-specific terminology, or names of individuals. To refresh our memories on how Amazon Transcribe works lets look at a quick example. I’ll convert this audio in my S3 bucket.

import boto3
transcribe = boto3.client("transcribe")
transcribe.start_transcription_job(
    TranscriptionJobName="TranscribeDemo",
    LanguageCode="en-US",
    MediaFormat="mp3",
    Media={"MediaFileUri": "https://s3.amazonaws.com/randhunt-transcribe-demo-us-east-1/out.mp3"}
)

This will output JSON similar to this (I’ve stripped out most of the response) with indidivudal speakers identified:

{
  "jobName": "reinvent",
  "accountId": "1234",
  "results": {
    "transcripts": [
      {
        "transcript": "Hi, everybody, i'm randall ..."
      }
    ],
    "speaker_labels": {
      "speakers": 2,
      "segments": [
        {
          "start_time": "0.000000",
          "speaker_label": "spk_0",
          "end_time": "0.010",
          "items": []
        },
        {
          "start_time": "0.010000",
          "speaker_label": "spk_1",
          "end_time": "4.990",
          "items": [
            {
              "start_time": "1.000",
              "speaker_label": "spk_1",
              "end_time": "1.190"
            },
            {
              "start_time": "1.190",
              "speaker_label": "spk_1",
              "end_time": "1.700"
            }
          ]
        }
      ]
    },
    "items": [
      {
        "start_time": "1.000",
        "end_time": "1.190",
        "alternatives": [
          {
            "confidence": "0.9971",
            "content": "Hi"
          }
        ],
        "type": "pronunciation"
      },
      {
        "alternatives": [
          {
            "content": ","
          }
        ],
        "type": "punctuation"
      },
      {
        "start_time": "1.190",
        "end_time": "1.700",
        "alternatives": [
          {
            "confidence": "1.0000",
            "content": "everybody"
          }
        ],
        "type": "pronunciation"
      }
    ]
  },
  "status": "COMPLETED"
}

Custom Vocabulary

Now if I needed to have a more complex technical discussion with a colleague I could create a custom vocabulary. A custom vocabulary is specified as an array of strings passed to the CreateVocabulary API and you can include your custom vocabulary in a transcription job by passing in the name as part of the Settings in a StartTranscriptionJob API call. An individual vocabulary can be as large as 50KB and each phrase must be less than 256 characters. If I wanted to transcribe the recordings of my highschool AP Biology class I could create a custom vocabulary in Python like this:

import boto3
transcribe = boto3.client("transcribe")
transcribe.create_vocabulary(
LanguageCode="en-US",
VocabularyName="APBiology"
Phrases=[
    "endoplasmic-reticulum",
    "organelle",
    "cisternae",
    "eukaryotic",
    "ribosomes",
    "hepatocyes",
    "cell-membrane"
]
)

I can refer to this vocabulary later on by the name APBiology and update it programatically based on any errors I may find in the transcriptions.

Available Now

Amazon Transcribe is available now in US East (N. Virginia), US West (Oregon), US East (Ohio) and EU (Ireland). Transcribe’s free tier gives you 60 minutes of transcription for free per month for the first 12 months with a pay-as-you-go model of $0.0004 per second of transcribed audio after that, with a minimum charge of 15 seconds.

When combined with other tools and services I think transcribe opens up a entirely new opportunities for application development. I’m excited to see what technologies developers build with this new service.

Randall

Amazon Translate Now Generally Available

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-translate-now-generally-available/


Today we’re excited to make Amazon Translate generally available. Late last year at AWS re:Invent my colleague Tara Walker wrote about a preview of a new AI service, Amazon Translate. Starting today you can access Amazon Translate in US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland) with a 2 million character monthly free tier for the first 12 months and $15 per million characters after that. There are a number of new features available in GA: automatic source language inference, Amazon CloudWatch support, and up to 5000 characters in a single TranslateText call. Let’s take a quick look at the service in general availability.

Amazon Translate New Features

Since Tara’s post already covered the basics of the service I want to point out some of the new features of the service released today. Let’s start with a code sample:

import boto3
translate = boto3.client("translate")
resp = translate.translate_text(
    Text="🇫🇷Je suis très excité pour Amazon Traduire🇫🇷",
    SourceLanguageCode="auto",
    TargetLanguageCode="en"
)
print(resp['TranslatedText'])

Since I have specified my source language as auto, Amazon Translate will call Amazon Comprehend on my behalf to determine the source language used in this text. If you couldn’t guess it, we’re writing some French and the output is 🇫🇷I'm very excited about Amazon Translate 🇫🇷. You’ll notice that our emojis are preserved in the output text which is definitely a bonus feature for Millennials like me.

The Translate console is a great way to get started and see some sample response.

Translate is extremely easy to use in AWS Lambda functions which allows you to use it with almost any AWS service. There are a number of examples in the Translate documentation showing how to do everything from translate a web page to a Amazon DynamoDB table. Paired with other ML services like Amazon Comprehend and [transcribe] you can build everything from closed captioning to real-time chat translation to a robust text analysis pipeline for call centers transcriptions and other textual data.

New Languages Coming Soon

Today, Amazon Translate allows you to translate text to or from English, to any of the following languages: Arabic, Chinese (Simplified), French, German, Portuguese, and Spanish. We’ve announced support for additional languages coming soon: Japanese (go JAWSUG), Russian, Italian, Chinese (Traditional), Turkish, and Czech.

Amazon Translate can also be used to increase professional translator efficiency, and reduce costs and turnaround times for their clients. We’ve already partnered with a number of Language Service Providers (LSPs) to offer their customers end-to-end translation services at a lower cost by allowing Amazon Translate to produce a high-quality draft translation that’s then edited by the LSP for a guaranteed human quality result.

I’m excited to see what applications our customers are able to build with high quality machine translation just one API call away.

Randall

Amazon SageMaker Now Supports Additional Instance Types, Local Mode, Open Sourced Containers, MXNet and Tensorflow Updates

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/amazon-sagemaker-roundup-sf/

Amazon SageMaker continues to iterate quickly and release new features on behalf of customers. Starting today, SageMaker adds support for many new instance types, local testing with the SDK, and Apache MXNet 1.1.0 and Tensorflow 1.6.0. Let’s take a quick look at each of these updates.

New Instance Types

Amazon SageMaker customers now have additional options for right-sizing their workloads for notebooks, training, and hosting. Notebook instances now support almost all T2, M4, P2, and P3 instance types with the exception of t2.micro, t2.small, and m4.large instances. Model training now supports nearly all M4, M5, C4, C5, P2, and P3 instances with the exception of m4.large, c4.large, and c5.large instances. Finally, model hosting now supports nearly all T2, M4, M5, C4, C5, P2, and P3 instances with the exception of m4.large instances. Many customers can take advantage of the newest P3, C5, and M5 instances to get the best price/performance for their workloads. Customers also take advantage of the burstable compute model on T2 instances for endpoints or notebooks that are used less frequently.

Open Sourced Containers, Local Mode, and TensorFlow 1.6.0 and MXNet 1.1.0

Today Amazon SageMaker has open sourced the MXNet and Tensorflow deep learning containers that power the MXNet and Tensorflow estimators in the SageMaker SDK. The ability to write Python scripts that conform to simple interface is still one of my favorite SageMaker features and now those containers can be additionally customized to include any additional libraries. You can download these containers locally to iterate and experiment which can accelerate your debugging cycle. When you’re ready go from local testing to production training and hosting you just change one line of code.

These containers launch with support for Tensorflow 1.6.0 and MXNet 1.1.0 as well. Tensorflow has a number of new 1.6.0 features including support for CUDA 9.0, cuDNN 7, and AVX instructions which allows for significant speedups in many training applications. MXNet 1.1.0 adds a number of new features including a Text API mxnet.text with support for text processing, indexing, glossaries, and more. Two of the really cool pre-trained embeddings included are GloVe and fastText.
<

Available Now
All of the features mentioned above are available today. As always please let us know on Twitter or in the comments below if you have any questions or if you’re building something interesting. Now, if you’ll excuse me I’m going to go experiment with some of those new MXNet APIs!

Randall